TensorFlowEasy.C
// ///////////////////////////////////////////////////////////////////////////////////////////////////////////////////
//
// JeVois Smart Embedded Machine Vision Toolkit - Copyright (C) 2016 by Laurent Itti, the University of Southern
// California (USC), and iLab at USC. See http://iLab.usc.edu and http://jevois.org for information about this project.
//
// This file is part of the JeVois Smart Embedded Machine Vision Toolkit. This program is free software; you can
// redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software
// Foundation, version 2. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY;
// without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public
// License for more details. You should have received a copy of the GNU General Public License along with this program;
// if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
//
// Contact information: Laurent Itti - 3641 Watt Way, HNB-07A - Los Angeles, CA 90089-2520 - USA.
// Tel: +1 213 740 3527 - itti@pollux.usc.edu - http://iLab.usc.edu - http://jevois.org
// ///////////////////////////////////////////////////////////////////////////////////////////////////////////////////
/*! \file */

#include <jevois/Core/Module.H>
#include <jevois/Debug/Timer.H>
#include <jevoisbase/Components/ObjectDetection/TensorFlow.H>
#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>

// icon from tensorflow youtube

static jevois::ParameterCategory const ParamCateg("TensorFlow Easy Options");

//! Parameter \relates TensorFlowEasy
JEVOIS_DECLARE_PARAMETER(foa, cv::Size, "Width and height (in pixels) of the fixed, central focus of attention. "
                         "This is the size of the central image crop that is taken in each frame and fed to the "
                         "deep neural network. If the foa size does not fit within the camera input frame size, "
                         "it will be shrunk to fit. To avoid spending CPU resources on rescaling the selected "
                         "image region, it is best to use here the size that the deep network expects as input.",
                         cv::Size(128, 128), ParamCateg);

//! Identify objects using TensorFlow deep neural network
/*! TensorFlow is a popular neural network framework. This module identifies the object in a square region in the
    center of the camera field of view using a deep convolutional neural network.

    The deep network analyzes the image by filtering it using many different filter kernels, and several stacked
    passes (network layers). This essentially amounts to detecting the presence of both simple and complex parts of
    known objects in the image (e.g., from detecting edges in lower layers of the network to detecting car wheels or
    even whole cars in higher layers). The last layer of the network is reduced to a vector with one entry per known
    kind of object (object class). This module returns the class names of the top-scoring candidates in the output
    vector, if any have scored above a minimum confidence threshold. When nothing is recognized with sufficiently
    high confidence, there is no output.

    \youtube{TRk8rCuUVEE}

    This module runs a TensorFlow network and shows the top-scoring results. In this module, we run the deep network
    on every video frame, so framerate will vary depending on network complexity (see below). Point your camera
    towards some interesting object, make the object fit within the grey box shown in the video (which will be fed
    to the neural network), keep it stable, and TensorFlow will tell you what it thinks this object is.

    Note that by default this module runs different flavors of MobileNets trained on the ImageNet dataset. There are
    1000 different kinds of objects (object classes) that these networks can recognize (too long to list here). The
    input layer of these networks is 299x299, 224x224, 192x192, 160x160, or 128x128 pixels by default, depending on
    the network used. The networks provided on the JeVois microSD image have been trained on large clusters of GPUs,
    using 1.2 million training images from the ImageNet dataset.

    For more information about MobileNets, see
    https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet_v1.md

    For more information about the ImageNet dataset used for training, see
    http://www.image-net.org/challenges/LSVRC/2012/

    Sometimes this module will make mistakes! The performance of MobileNets is about 40% to 70% correct (top-1
    accuracy) on the test set, depending on network size (bigger networks are more accurate but slower).

    Neural network size and speed
    -----------------------------

    This module takes a central image region of size given by the \p foa parameter. If necessary, this image region
    is then rescaled to match the deep network's expected input size. The network input size varies depending on
    which network is used; for example, mobilenet_v1_0.25_128_quant expects 128x128 input images, while
    mobilenet_v1_1.0_224 expects 224x224. Note that there is a CPU cost to rescaling, so, for best performance, you
    should match the \p foa size to the network's input size.

    For example:

    - mobilenet_v1_0.25_128_quant (network size 128x128), runs at about 8ms/prediction (125 frames/s).
    - mobilenet_v1_0.5_128_quant (network size 128x128), runs at about 18ms/prediction (55 frames/s).
    - mobilenet_v1_0.25_224_quant (network size 224x224), runs at about 24ms/prediction (41 frames/s).
    - mobilenet_v1_1.0_224_quant (network size 224x224), runs at about 139ms/prediction (7 frames/s).

    To easily select one of the available networks, see <B>JEVOIS:/modules/JeVois/TensorFlowEasy/params.cfg</B> on
    the microSD card of your JeVois camera.
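
    As an illustration only, an entry in that file might look like the sketch below (this assumes the underlying
    TensorFlow component exposes a \p netdir parameter naming a directory under <b>JEVOIS:/share/tensorflow/</b>;
    check the params.cfg on your microSD for the exact parameter names). Remember to also set \p foa to the matching
    input size (here, 128x128) to avoid rescaling costs:

    \verbatim
    # Hypothetical example: select the fastest (but least accurate) provided network:
    netdir=mobilenet_v1_0.25_128_quant
    \endverbatim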

    Serial messages
    ---------------

    When detections are found with confidence scores above \p thresh, a message containing up to \p top
    category:score pairs will be sent per video frame. The exact message format depends on the current \p serstyle
    setting and is described in \ref UserSerialStyle. For example, when \p serstyle is \b Detail, this module sends:

    \verbatim
    DO category:score category:score ... category:score
    \endverbatim

    where \a category is a category name (from \p namefile) and \a score is the confidence score from 0.0 to 100.0
    that this category was recognized. The pairs are in order of decreasing score.
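
    For instance, a frame in which two classes score above \p thresh might yield a message like this (hypothetical
    category names and scores, for illustration only):

    \verbatim
    DO coffee_mug:64.5 cup:22.1
    \endverbatim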

    See \ref UserSerialStyle for more on standardized serial messages, and \ref coordhelpers for more info on
    standardized coordinates.
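
    On the host side, such a message is easy to split into category:score pairs. Here is a minimal, self-contained
    C++ sketch (an illustration only, not part of this module; it assumes a \b Detail message as shown above has
    already been read from the serial port into a string):

    \code
    #include <iostream>
    #include <sstream>
    #include <string>

    int main()
    {
      std::string const line = "DO coffee_mug:64.5 cup:22.1"; // hypothetical received message
      std::istringstream iss(line); std::string tok; iss >> tok; // first token is the message keyword
      if (tok == "DO")
        while (iss >> tok)
        {
          size_t const colon = tok.rfind(':'); // the score follows the last ':'
          if (colon == std::string::npos) continue;
          std::cout << tok.substr(0, colon) << " scored " << std::stof(tok.substr(colon + 1)) << '\n';
        }
      return 0;
    }
    \endcode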

    More networks
    -------------

    Search the web for models in TFLite format built for the TensorFlow 1.x series. For example, see
    https://tfhub.dev/s?module-type=image-classification

    To add a new model to your microSD card:
    - create a directory for it under <b>JEVOIS:/share/tensorflow</b>
    - put your .tflite in there as \b model.tflite
    - put a list of labels as a plain text file, one label per line, in your directory as \b labels.txt (see the
      example below)
    - edit params.cfg for this module (best done in JeVois Inventor) to add a new entry for your network, and to
      comment out the default entry.
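
    The labels file is nothing more than one category name per line, in the class order that your network outputs.
    A hypothetical 3-class labels.txt might look like:

    \verbatim
    background
    coffee_mug
    stapler
    \endverbatim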

    Using your own network
    ----------------------

    For a step-by-step tutorial, see [Training custom TensorFlow networks for
    JeVois](http://jevois.org/tutorials/UserTensorFlowTraining.html).

    This module supports RGB or grayscale inputs, byte or float32. You should create and train your network using
    fast GPUs, and then follow the instructions here to convert your trained network to TFLite format:

    https://www.tensorflow.org/lite/

    Then you just need to create a directory under <b>JEVOIS:/share/tensorflow/</b> with the name of your network,
    and, in there, two files: \b labels.txt with the category labels, and \b model.tflite with your model converted
    to TensorFlow Lite (flatbuffer format). Finally, edit <B>JEVOIS:/modules/JeVois/TensorFlowEasy/params.cfg</B> to
    select your new network when the module is launched.
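
    For example, for a hypothetical network called \b MyNet, you would end up with these two files on the microSD
    (names for illustration only):

    \verbatim
    JEVOIS:/share/tensorflow/MyNet/model.tflite
    JEVOIS:/share/tensorflow/MyNet/labels.txt
    \endverbatim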


    @author Laurent Itti

    @displayname TensorFlow Easy
    @videomapping NONE 0 0 0.0 YUYV 320 240 60.0 JeVois TensorFlowEasy
    @videomapping YUYV 320 308 30.0 YUYV 320 240 30.0 JeVois TensorFlowEasy
    @videomapping YUYV 640 548 30.0 YUYV 640 480 30.0 JeVois TensorFlowEasy
    @videomapping YUYV 1280 1092 7.0 YUYV 1280 1024 7.0 JeVois TensorFlowEasy
    @email itti\@usc.edu
    @address University of Southern California, HNB-07A, 3641 Watt Way, Los Angeles, CA 90089-2520, USA
    @copyright Copyright (C) 2018 by Laurent Itti, iLab and the University of Southern California
    @mainurl http://jevois.org
    @supporturl http://jevois.org/doc
    @otherurl http://iLab.usc.edu
    @license GPL v3
    @distribution Unrestricted
    @restrictions None
    \ingroup modules */
class TensorFlowEasy : public jevois::StdModule,
                       public jevois::Parameter<foa>
{
  public:
    // ####################################################################################################
    //! Constructor
    // ####################################################################################################
    TensorFlowEasy(std::string const & instance) : jevois::StdModule(instance)
    {
      itsTensorFlow = addSubComponent<TensorFlow>("tf");
    }

    // ####################################################################################################
    //! Virtual destructor for safe inheritance
    // ####################################################################################################
    virtual ~TensorFlowEasy()
    { }

    // ####################################################################################################
    //! Processing function, no video output
    // ####################################################################################################
    virtual void process(jevois::InputFrame && inframe) override
    {
      // Wait for next available camera image:
      jevois::RawImage const inimg = inframe.get();
      int const w = inimg.width, h = inimg.height;

      // Adjust foa size if needed so it fits within the input frame:
      cv::Size foasiz = foa::get(); int foaw = foasiz.width, foah = foasiz.height;
      if (foaw > w) { foaw = w; foah = std::min(foah, foaw); }
      if (foah > h) { foah = h; foaw = std::min(foaw, foah); }

      // Take a central crop of the input, with size given by foa parameter. Offsets are forced to even values so
      // that the crop stays aligned with the 2-pixel-wide YUYV macropixels:
      int const offx = ((w - foaw) / 2) & (~1);
      int const offy = ((h - foah) / 2) & (~1);

      cv::Mat cvimg = jevois::rawimage::cvImage(inimg);
      cv::Mat crop = cvimg(cv::Rect(offx, offy, foaw, foah));

      // Convert crop to RGB for predictions:
      cv::Mat rgbroi; cv::cvtColor(crop, rgbroi, cv::COLOR_YUV2RGB_YUYV);

      // Let camera know we are done processing the input image:
      inframe.done();

      // Launch the predictions, will throw if network is not ready (still loading):
      itsResults.clear();
      try
      {
        int netinw, netinh, netinc; itsTensorFlow->getInDims(netinw, netinh, netinc);

        // Scale the ROI if needed:
        cv::Mat scaledroi = jevois::rescaleCv(rgbroi, cv::Size(netinw, netinh));

        // Predict:
        float const ptime = itsTensorFlow->predict(scaledroi, itsResults);
        LINFO("Predicted in " << ptime << "ms");

        // Send serial results:
        sendSerialObjReco(itsResults);
      }
      catch (std::logic_error const & e) { } // network still loading
    }

    // ####################################################################################################
    //! Processing function with video output to USB
    // ####################################################################################################
    virtual void process(jevois::InputFrame && inframe, jevois::OutputFrame && outframe) override
    {
      static jevois::Timer timer("processing", 30, LOG_DEBUG);

      // Wait for next available camera image:
      jevois::RawImage const inimg = inframe.get();

      timer.start();

      // We only handle one specific pixel format, but any image size in this module:
      int const w = inimg.width, h = inimg.height;
      inimg.require("input", w, h, V4L2_PIX_FMT_YUYV);

      // Compute central crop window size from foa parameter:
      cv::Size foasiz = foa::get(); int foaw = foasiz.width, foah = foasiz.height;
      if (foaw > w) { foaw = w; foah = std::min(foah, foaw); }
      if (foah > h) { foah = h; foaw = std::min(foaw, foah); }
      int const offx = ((w - foaw) / 2) & (~1);
      int const offy = ((h - foah) / 2) & (~1);

      // While we process it, start a thread to wait for out frame and paste the input into it:
      jevois::RawImage outimg;
      auto paste_fut = jevois::async([&]() {
          outimg = outframe.get();
          outimg.require("output", outimg.width, h + 68, V4L2_PIX_FMT_YUYV);

          // Paste the current input image:
          jevois::rawimage::paste(inimg, outimg, 0, 0);
          jevois::rawimage::writeText(outimg, "JeVois TensorFlow Easy - input", 3, 3, jevois::yuyv::White);

          // Draw a grey rectangle for the FOA:
          jevois::rawimage::drawRect(outimg, offx, offy, foaw, foah, 2, jevois::yuyv::MedGrey);

          // Blank out the bottom of the frame:
          jevois::rawimage::drawFilledRect(outimg, 0, h, outimg.width, outimg.height - h, jevois::yuyv::Black);
        });

      // Take a central crop of the input, with size given by foa parameter:
      cv::Mat cvimg = jevois::rawimage::cvImage(inimg);
      cv::Mat crop = cvimg(cv::Rect(offx, offy, foaw, foah));

      // Convert crop to RGB for predictions:
      cv::Mat rgbroi; cv::cvtColor(crop, rgbroi, cv::COLOR_YUV2RGB_YUYV);

      // Let camera know we are done processing the input image:
      paste_fut.get(); inframe.done();

      // Launch the predictions, will throw if network is not ready:
      itsResults.clear();
      try
      {
        int netinw, netinh, netinc; itsTensorFlow->getInDims(netinw, netinh, netinc);

        // Scale the ROI if needed:
        cv::Mat scaledroi = jevois::rescaleCv(rgbroi, cv::Size(netinw, netinh));

        // Predict:
        float const ptime = itsTensorFlow->predict(scaledroi, itsResults);

        // Draw some text messages:
        jevois::rawimage::writeText(outimg, "Predict", w - 7 * 6 - 2, h + 16, jevois::yuyv::White);
        jevois::rawimage::writeText(outimg, "time:", w - 7 * 6 - 2, h + 28, jevois::yuyv::White);
        jevois::rawimage::writeText(outimg, std::to_string(int(ptime)) + "ms", w - 7 * 6 - 2, h + 40,
                                    jevois::yuyv::White);

        // Send serial results:
        sendSerialObjReco(itsResults);
      }
      catch (std::logic_error const & e)
      {
        // network still loading:
        jevois::rawimage::writeText(outimg, "Loading network -", 3, h + 4, jevois::yuyv::White);
        jevois::rawimage::writeText(outimg, "please wait...", 3, h + 16, jevois::yuyv::White);
      }

      // Then write the names and scores of the detections:
      int y = h + 4; if (y + itsTensorFlow->top::get() * 12 > outimg.height - 2) y = 16;

      for (auto const & p : itsResults)
      {
        jevois::rawimage::writeText(outimg, jevois::sformat("%s: %.2F", p.category.c_str(), p.score),
                                    3, y, jevois::yuyv::White);
        y += 12;
      }

      // Show processing fps:
      std::string const & fpscpu = timer.stop();
      jevois::rawimage::writeText(outimg, fpscpu, 3, h - 13, jevois::yuyv::White);

      // Send the output image with our processing results to the host over USB:
      outframe.send();
    }

    // ####################################################################################################
  protected:
    std::shared_ptr<TensorFlow> itsTensorFlow;
    std::vector<jevois::ObjReco> itsResults;
};

// Allow the module to be loaded as a shared object (.so) file:
JEVOIS_REGISTER_MODULE(TensorFlowEasy);