JeVoisBase  1.16
JeVois Smart Embedded Machine Vision Toolkit Base Modules
TensorFlowEasy.C
// ///////////////////////////////////////////////////////////////////////////////////////////////////////////////////
//
// JeVois Smart Embedded Machine Vision Toolkit - Copyright (C) 2016 by Laurent Itti, the University of Southern
// California (USC), and iLab at USC. See http://iLab.usc.edu and http://jevois.org for information about this project.
//
// This file is part of the JeVois Smart Embedded Machine Vision Toolkit. This program is free software; you can
// redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software
// Foundation, version 2. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY;
// without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public
// License for more details. You should have received a copy of the GNU General Public License along with this program;
// if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
//
// Contact information: Laurent Itti - 3641 Watt Way, HNB-07A - Los Angeles, CA 90089-2520 - USA.
// Tel: +1 213 740 3527 - itti@pollux.usc.edu - http://iLab.usc.edu - http://jevois.org
// ///////////////////////////////////////////////////////////////////////////////////////////////////////////////////
/*! \file */

#include <jevois/Core/Module.H>
#include <jevois/Debug/Timer.H>
#include <jevois/Image/RawImageOps.H>
#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <jevoisbase/Components/ObjectRecognition/TensorFlow.H>

// icon from tensorflow youtube

static jevois::ParameterCategory const ParamCateg("TensorFlow Easy Options");

//! Parameter \relates TensorFlowEasy
JEVOIS_DECLARE_PARAMETER(foa, cv::Size, "Width and height (in pixels) of the fixed, central focus of attention. "
                         "This is the size of the central image crop that is taken in each frame and fed to the "
                         "deep neural network. If the foa size does not fit within the camera input frame size, "
                         "it will be shrunk to fit. To avoid spending CPU resources on rescaling the selected "
                         "image region, it is best to use here the size that the deep network expects as input.",
                         cv::Size(128, 128), ParamCateg);

//! Identify objects using TensorFlow deep neural network
/*! TensorFlow is a popular neural network framework. This module identifies the object in a square region in the center
    of the camera field of view using a deep convolutional neural network.

    The deep network analyzes the image by filtering it using many different filter kernels, and several stacked passes
    (network layers). This essentially amounts to detecting the presence of both simple and complex parts of known
    objects in the image (e.g., from detecting edges in lower layers of the network to detecting car wheels or even
    whole cars in higher layers). The last layer of the network is reduced to a vector with one entry per known kind of
    object (object class). This module returns the class names of the top scoring candidates in the output vector, if
    any have scored above a minimum confidence threshold. When nothing is recognized with sufficiently high confidence,
    there is no output.
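
The selection of top-scoring candidates described above can be illustrated with a minimal, standalone sketch. It is
not the code used by this module: the `Scored` struct and the `topScoring` function are hypothetical names
introduced for illustration, and the sketch assumes that the scores and the labels are given as two parallel arrays.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical result type for this sketch (not the JeVois ObjReco type).
struct Scored { std::string category; float score; };

// Keep the entries that score above thresh, order them by decreasing score,
// and return at most top of them. Returns an empty vector when nothing passes.
std::vector<Scored> topScoring(std::vector<float> const & scores, std::vector<std::string> const & labels,
                               float thresh, std::size_t top)
{
  std::vector<Scored> out;
  for (std::size_t i = 0; i < scores.size() && i < labels.size(); ++i)
    if (scores[i] > thresh) out.push_back({ labels[i], scores[i] });

  std::stable_sort(out.begin(), out.end(),
                   [](Scored const & a, Scored const & b) { return a.score > b.score; });

  if (out.size() > top) out.resize(top);
  return out;
}
```

With scores {10, 80, 55, 5}, a threshold of 20 and top set to 2, the sketch returns the entries scoring 80 and 55, in
that order. An empty result corresponds to the "no output" case.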

    \youtube{TRk8rCuUVEE}

    This module runs a TensorFlow network and shows the top-scoring results. In this module, we run the deep network on
    every video frame, so framerate will vary depending on network complexity (see below). Point your camera towards
    some interesting object, make the object fit within the grey box shown in the video (which will be fed to the neural
    network), keep it stable, and TensorFlow will tell you what it thinks this object is.

    Note that by default this module runs different flavors of MobileNets trained on the ImageNet dataset. There are
    1000 different kinds of objects (object classes) that these networks can recognize (too long to list here). The
    input layer of these networks is 299x299, 224x224, 192x192, 160x160, or 128x128 pixels by default, depending on the
    network used. The networks provided on the JeVois microSD image have been trained on large clusters of GPUs, using
    1.2 million training images from the ImageNet dataset.

    For more information about MobileNets, see
    https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet_v1.md

    For more information about the ImageNet dataset used for training, see
    http://www.image-net.org/challenges/LSVRC/2012/

    Sometimes this module will make mistakes! The performance of MobileNets is about 40% to 70% correct (top-1
    accuracy) on the test set, depending on network size (bigger networks are more accurate but slower).

    Neural network size and speed
    -----------------------------

    This module takes a central image region of size given by the \p foa parameter. If necessary, this image region is
    then rescaled to match the deep network's expected input size. The network input size varies depending on which
    network is used; for example, mobilenet_v1_0.25_128_quant expects 128x128 input images, while mobilenet_v1_1.0_224
    expects 224x224. Note that there is a CPU cost to rescaling, so, for best performance, you should match the \p foa
    size to the network's input size.

    For example:

    - mobilenet_v1_0.25_128_quant (network size 128x128) runs at about 8ms/prediction (125 frames/s).
    - mobilenet_v1_0.5_128_quant (network size 128x128) runs at about 18ms/prediction (55 frames/s).
    - mobilenet_v1_0.25_224_quant (network size 224x224) runs at about 24ms/prediction (41 frames/s).
    - mobilenet_v1_1.0_224_quant (network size 224x224) runs at about 139ms/prediction (7 frames/s).
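
The way the central crop window is derived from the foa size can be sketched as a small standalone function that
mirrors the arithmetic in the process() functions of this module. The `CropWindow` struct and the `centralCrop`
name are hypothetical and exist only for this illustration. Offsets are rounded down to even values, presumably
because YUYV stores pixels in pairs that share chroma samples.

```cpp
#include <algorithm>

// Hypothetical helper type for this sketch: top-left corner and size of the crop.
struct CropWindow { int x, y, w, h; };

CropWindow centralCrop(int imgw, int imgh, int foaw, int foah)
{
  // Shrink the window so that it fits within the input frame:
  if (foaw > imgw) { foaw = imgw; foah = std::min(foah, foaw); }
  if (foah > imgh) { foah = imgh; foaw = std::min(foaw, foah); }

  // Center the window, clearing the lowest bit so that both offsets are even:
  int const offx = ((imgw - foaw) / 2) & ~1;
  int const offy = ((imgh - foah) / 2) & ~1;

  return { offx, offy, foaw, foah };
}
```

For a 320x240 frame and a 128x128 foa this gives a window at (96, 56). A 300x300 foa on the same frame is shrunk to
240x240 at (40, 0), after which the crop still has to be rescaled to the network input size.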

    To easily select one of the available networks, see <B>JEVOIS:/modules/JeVois/TensorFlowEasy/params.cfg</B> on the
    microSD card of your JeVois camera.

    Serial messages
    ---------------

    When detections are found with confidence scores above \p thresh, a message containing up to \p top category:score
    pairs will be sent per video frame. Exact message format depends on the current \p serstyle setting and is described
    in \ref UserSerialStyle. For example, when \p serstyle is \b Detail, this module sends:

    \verbatim
    DO category:score category:score ... category:score
    \endverbatim

    where \a category is a category name (from \p namefile) and \a score is the confidence score from 0.0 to 100.0 that
    this category was recognized. The pairs are in order of decreasing score.
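
On the receiving side, a message in the Detail format shown above could be split into pairs roughly as follows.
This is a host-side sketch only, not part of JeVois: the `parseDetailMessage` name is hypothetical, and the sketch
assumes that category names contain no whitespace and that each score parses as a decimal number.

```cpp
#include <cstddef>
#include <exception>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Parse "DO category:score category:score ..." into (category, score) pairs.
// Returns an empty vector if the line does not start with the "DO" token.
std::vector<std::pair<std::string, float>> parseDetailMessage(std::string const & line)
{
  std::vector<std::pair<std::string, float>> out;
  std::istringstream is(line);
  std::string tok;
  if (!(is >> tok) || tok != "DO") return out;

  while (is >> tok)
  {
    // Split at the last ':' in case a category name itself contains one:
    std::size_t const pos = tok.rfind(':');
    if (pos == std::string::npos || pos + 1 >= tok.size()) continue;

    try { out.emplace_back(tok.substr(0, pos), std::stof(tok.substr(pos + 1))); }
    catch (std::exception const &) { } // skip tokens whose score is not a number
  }
  return out;
}
```

Because the pairs arrive in order of decreasing score, the first element of the returned vector is the best match.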

    See \ref UserSerialStyle for more on standardized serial messages, and \ref coordhelpers for more info on
    standardized coordinates.

    More networks
    -------------

    Search the web for models in TFLite format that were built for the TensorFlow 1.x series. For example, see
    https://tfhub.dev/s?module-type=image-classification

    To add a new model to your microSD card:
    - create a directory for it under <b>JEVOIS:/share/tensorflow</b>
    - put your .tflite in there as \b model.tflite
    - put a list of labels as a plain text file, one label per line, in your directory as \b labels.txt
    - edit params.cfg for this module (best done in JeVois Inventor) to add a new entry for your network, and to
      comment out the default entry.
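
To make the labels file format concrete, here is a minimal sketch of reading a plain text file with one label per
line. It is not the loader used by this module: the `readLabels` name is hypothetical, and the sketch assumes that
line number i (counting from zero) names output i of the network, so blank lines are kept rather than skipped.

```cpp
#include <istream>
#include <sstream> // std::istringstream, handy for trying this without a file
#include <string>
#include <vector>

// Read one label per line; entry i is assumed to name network output i.
std::vector<std::string> readLabels(std::istream & in)
{
  std::vector<std::string> labels;
  std::string line;
  while (std::getline(in, line))
  {
    // Tolerate Windows line endings in files edited on a host computer:
    if (!line.empty() && line.back() == '\r') line.pop_back();
    labels.push_back(line);
  }
  return labels;
}
```

The function accepts any input stream, so it can be given a std::ifstream opened on labels.txt or a
std::istringstream holding test data. The number of labels read should match the size of the network's output vector.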

    Using your own network
    ----------------------

    For a step-by-step tutorial, see [Training custom TensorFlow networks for
    JeVois](http://jevois.org/tutorials/UserTensorFlowTraining.html).

    This module supports RGB or grayscale inputs, byte or float32. You should create and train your network using fast
    GPUs, and then follow the instructions here to convert your trained network to TFLite format:

    https://www.tensorflow.org/mobile/tflite/

    Then you just need to create a directory under <b>JEVOIS:/share/tensorflow/</b> with the name of your network, and,
    in there, two files, \b labels.txt with the category labels, and \b model.tflite with your model converted to
    TensorFlow Lite (flatbuffer format). Finally, edit <B>JEVOIS:/modules/JeVois/TensorFlowEasy/params.cfg</B> to
    select your new network when the module is launched.


    @author Laurent Itti

    @displayname TensorFlow Easy
    @videomapping NONE 0 0 0.0 YUYV 320 240 60.0 JeVois TensorFlowEasy
    @videomapping YUYV 320 308 30.0 YUYV 320 240 30.0 JeVois TensorFlowEasy
    @videomapping YUYV 640 548 30.0 YUYV 640 480 30.0 JeVois TensorFlowEasy
    @videomapping YUYV 1280 1092 7.0 YUYV 1280 1024 7.0 JeVois TensorFlowEasy
    @email itti\@usc.edu
    @address University of Southern California, HNB-07A, 3641 Watt Way, Los Angeles, CA 90089-2520, USA
    @copyright Copyright (C) 2018 by Laurent Itti, iLab and the University of Southern California
    @mainurl http://jevois.org
    @supporturl http://jevois.org/doc
    @otherurl http://iLab.usc.edu
    @license GPL v3
    @distribution Unrestricted
    @restrictions None
    \ingroup modules */
class TensorFlowEasy : public jevois::StdModule,
                       public jevois::Parameter<foa>
{
  public:
    // ####################################################################################################
    //! Constructor
    // ####################################################################################################
    TensorFlowEasy(std::string const & instance) : jevois::StdModule(instance)
    {
      itsTensorFlow = addSubComponent<TensorFlow>("tf");
    }

    // ####################################################################################################
    //! Virtual destructor for safe inheritance
    // ####################################################################################################
    virtual ~TensorFlowEasy()
    { }

    // ####################################################################################################
    //! Processing function, no video output
    // ####################################################################################################
    virtual void process(jevois::InputFrame && inframe) override
    {
      // Wait for next available camera image:
      jevois::RawImage const inimg = inframe.get();
      unsigned int const w = inimg.width, h = inimg.height;

      // Adjust foa size if needed so it fits within the input frame:
      cv::Size foasiz = foa::get(); int foaw = foasiz.width, foah = foasiz.height;
      if (foaw > w) { foaw = w; foah = std::min(foah, foaw); }
      if (foah > h) { foah = h; foaw = std::min(foaw, foah); }

      // Take a central crop of the input, with size given by foa parameter:
      int const offx = ((w - foaw) / 2) & (~1);
      int const offy = ((h - foah) / 2) & (~1);

      cv::Mat cvimg = jevois::rawimage::cvImage(inimg);
      cv::Mat crop = cvimg(cv::Rect(offx, offy, foaw, foah));

      // Convert crop to RGB for predictions:
      cv::Mat rgbroi; cv::cvtColor(crop, rgbroi, cv::COLOR_YUV2RGB_YUYV);

      // Let camera know we are done processing the input image:
      inframe.done();

      // Launch the predictions, will throw if network is not ready (still loading):
      itsResults.clear();
      try
      {
        int netinw, netinh, netinc; itsTensorFlow->getInDims(netinw, netinh, netinc);

        // Scale the ROI if needed:
        cv::Mat scaledroi = jevois::rescaleCv(rgbroi, cv::Size(netinw, netinh));

        // Predict:
        float const ptime = itsTensorFlow->predict(scaledroi, itsResults);
        LINFO("Predicted in " << ptime << "ms");

        // Send serial results:
        sendSerialObjReco(itsResults);
      }
      catch (std::logic_error const & e) { } // network still loading
    }

    // ####################################################################################################
    //! Processing function with video output to USB
    // ####################################################################################################
    virtual void process(jevois::InputFrame && inframe, jevois::OutputFrame && outframe) override
    {
      static jevois::Timer timer("processing", 30, LOG_DEBUG);

      // Wait for next available camera image:
      jevois::RawImage const inimg = inframe.get();

      timer.start();

      // We only handle one specific pixel format, but any image size in this module:
      unsigned int const w = inimg.width, h = inimg.height;
      inimg.require("input", w, h, V4L2_PIX_FMT_YUYV);

      // Compute central crop window size from foa parameter:
      cv::Size foasiz = foa::get(); int foaw = foasiz.width, foah = foasiz.height;
      if (foaw > w) { foaw = w; foah = std::min(foah, foaw); }
      if (foah > h) { foah = h; foaw = std::min(foaw, foah); }
      int const offx = ((w - foaw) / 2) & (~1);
      int const offy = ((h - foah) / 2) & (~1);

      // While we process it, start a thread to wait for out frame and paste the input into it:
      jevois::RawImage outimg;
      auto paste_fut = jevois::async([&]() {
          outimg = outframe.get();
          outimg.require("output", outimg.width, h + 68, V4L2_PIX_FMT_YUYV);

          // Paste the current input image:
          jevois::rawimage::paste(inimg, outimg, 0, 0);
          jevois::rawimage::writeText(outimg, "JeVois TensorFlow Easy - input", 3, 3, jevois::yuyv::White);

          // Draw a grey rectangle for the FOA:
          jevois::rawimage::drawRect(outimg, offx, offy, foaw, foah, 2, jevois::yuyv::MedGrey);

          // Blank out the bottom of the frame:
          jevois::rawimage::drawFilledRect(outimg, 0, h, w, outimg.height - h, jevois::yuyv::Black);
        });

      // Take a central crop of the input, with size given by foa parameter:
      cv::Mat cvimg = jevois::rawimage::cvImage(inimg);
      cv::Mat crop = cvimg(cv::Rect(offx, offy, foaw, foah));

      // Convert crop to RGB for predictions:
      cv::Mat rgbroi; cv::cvtColor(crop, rgbroi, cv::COLOR_YUV2RGB_YUYV);

      // Let camera know we are done processing the input image:
      paste_fut.get(); inframe.done();

      // Launch the predictions, will throw if network is not ready:
      itsResults.clear();
      try
      {
        int netinw, netinh, netinc; itsTensorFlow->getInDims(netinw, netinh, netinc);

        // Scale the ROI if needed:
        cv::Mat scaledroi = jevois::rescaleCv(rgbroi, cv::Size(netinw, netinh));

        // Predict:
        float const ptime = itsTensorFlow->predict(scaledroi, itsResults);

        // Draw some text messages:
        jevois::rawimage::writeText(outimg, "Predict",
                                    w - 7 * 6 - 2, h + 16, jevois::yuyv::White);
        jevois::rawimage::writeText(outimg, "time:",
                                    w - 7 * 6 - 2, h + 28, jevois::yuyv::White);
        jevois::rawimage::writeText(outimg, std::to_string(int(ptime)) + "ms",
                                    w - 7 * 6 - 2, h + 40, jevois::yuyv::White);

        // Send serial results:
        sendSerialObjReco(itsResults);
      }
      catch (std::logic_error const & e)
      {
        // network still loading:
        jevois::rawimage::writeText(outimg, "Loading network -", 3, h + 4, jevois::yuyv::White);
        jevois::rawimage::writeText(outimg, "please wait...", 3, h + 16, jevois::yuyv::White);
      }

      // Then write the names and scores of the detections:
      int y = h + 4; if (y + itsTensorFlow->top::get() * 12 > outimg.height - 2) y = 16;

      for (auto const & p : itsResults)
      {
        jevois::rawimage::writeText(outimg, jevois::sformat("%s: %.2F", p.category.c_str(), p.score),
                                    3, y, jevois::yuyv::White);
        y += 12;
      }

      // Show processing fps:
      std::string const & fpscpu = timer.stop();
      jevois::rawimage::writeText(outimg, fpscpu, 3, h - 13, jevois::yuyv::White);

      // Send the output image with our processing results to the host over USB:
      outframe.send();
    }

    // ####################################################################################################
  protected:
    std::shared_ptr<TensorFlow> itsTensorFlow;
    std::vector<jevois::ObjReco> itsResults;
};

// Allow the module to be loaded as a shared object (.so) file:
JEVOIS_REGISTER_MODULE(TensorFlowEasy);