JeVoisBase  1.20
JeVois Smart Embedded Machine Vision Toolkit Base Modules
DarknetSingle.C
1 // ///////////////////////////////////////////////////////////////////////////////////////////////////////////////////
2 //
3 // JeVois Smart Embedded Machine Vision Toolkit - Copyright (C) 2016 by Laurent Itti, the University of Southern
4 // California (USC), and iLab at USC. See http://iLab.usc.edu and http://jevois.org for information about this project.
5 //
6 // This file is part of the JeVois Smart Embedded Machine Vision Toolkit. This program is free software; you can
7 // redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software
8 // Foundation, version 2. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY;
9 // without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public
10 // License for more details. You should have received a copy of the GNU General Public License along with this program;
11 // if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
12 //
13 // Contact information: Laurent Itti - 3641 Watt Way, HNB-07A - Los Angeles, CA 90089-2520 - USA.
14 // Tel: +1 213 740 3527 - itti@pollux.usc.edu - http://iLab.usc.edu - http://jevois.org
15 // ///////////////////////////////////////////////////////////////////////////////////////////////////////////////////
16 /*! \file */
17 
18 #include <jevois/Core/Module.H>
19 #include <jevois/Debug/Timer.H>
20 #include <jevois/Image/RawImageOps.H>
21 #include <opencv2/core/core.hpp>
22 #include <opencv2/imgproc/imgproc.hpp>
23 #include <jevoisbase/Components/ObjectRecognition/Darknet.H>
24 
25 // icon from https://pjreddie.com/darknet/
26 
27 //! Identify objects using Darknet deep neural network
28 /*! Darknet is a popular neural network framework. This module identifies the object in a square region in the center
29  of the camera field of view using a deep convolutional neural network.
30 
31  The deep network analyzes the image by filtering it using many different filter kernels, and several stacked passes
32  (network layers). This essentially amounts to detecting the presence of both simple and complex parts of known
33  objects in the image (e.g., from detecting edges in lower layers of the network to detecting car wheels or even
34  whole cars in higher layers). The last layer of the network is reduced to a vector with one entry per known kind of
35  object (object class). This module returns the class names of the top scoring candidates in the output vector, if
36  any have scored above a minimum confidence threshold. When nothing is recognized with sufficiently high confidence,
37  there is no output.
38 
39  Darknet is a great alternative to popular neural network frameworks like Caffe, TensorFlow, MXNet, PyTorch, Theano,
40  etc., as it features: 1) a small footprint, which is great for small embedded systems; 2) hardware acceleration using
41  ARM NEON instructions; 3) support for large GPUs when compiled on servers, which is useful for training the neural
42  networks on big servers and then copying the trained weights directly to JeVois for use with live video.
43 
44  See https://pjreddie.com/darknet for more details about darknet.
45 
46  \youtube{d5CfljT5kec}
47 
48  This module runs a Darknet network and shows the top-scoring results. The network is currently a bit slow, hence it
49  is only run once in a while. Point your camera towards some interesting object, make the object fit in the picture
50  shown at right (which will be fed to the neural network), keep it stable, and wait for Darknet to tell you what it
51  found. The framerate figures shown at the bottom left of the display reflect the speed at which each new video frame
52  from the camera is processed, but in this module this just amounts to converting the image to RGB, sending it to the
53  neural network for processing in a separate thread, and creating the demo display. Actual network inference speed
54  (time taken to compute the predictions on one image) is shown at the bottom right. See below for how to trade off
55  speed and accuracy.
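
 The launch-then-poll pattern described above can be sketched with standard C++ futures. This is only an
 illustration under stated assumptions, not the module's own code: slowPredict() and PredictionPoller are
 hypothetical names, the 50ms delay is made up, and std::async is used here in place of jevois::async.

```cpp
#include <chrono>
#include <future>
#include <thread>

// Hypothetical stand-in for a slow network inference; returns the time taken in ms.
inline float slowPredict()
{
  std::this_thread::sleep_for(std::chrono::milliseconds(50));
  return 50.0F;
}

// Hypothetical helper: call poll() once per video frame.
class PredictionPoller
{
  public:
    // Returns true and sets ptime only on the frame where a prediction has just finished.
    bool poll(float & ptime)
    {
      if (itsFut.valid() == false)
      {
        // Not predicting yet: launch a new prediction in a separate thread
        itsFut = std::async(std::launch::async, slowPredict);
        return false;
      }

      // Still predicting: wait a few ms at most so the per-frame loop is not stalled
      if (itsFut.wait_for(std::chrono::milliseconds(5)) != std::future_status::ready) return false;

      // Finished: get() returns the result and invalidates the future, so the next call launches again
      ptime = itsFut.get();
      return true;
    }

  private:
    std::future<float> itsFut;
};
```

 With this structure, every frame costs at most a few milliseconds of waiting, while the prediction itself
 may span many frames.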
56 
57  Note that by default this module runs the Imagenet1k tiny Darknet (it can also run the slightly slower but a bit
58  more accurate Darknet Reference network; see parameters). There are 1000 different kinds of objects (object classes)
59  that these networks can recognize (too long to list here). The input layer of these two networks is 224x224 pixels
60  by default. This module takes a crop at the center of the video image, with size determined by the network input
61  size. With the default network parameters, this module hence requires at least 320x240 camera sensor resolution. The
62  networks provided on the JeVois microSD image have been trained on large clusters of GPUs, typically using 1.2
63  million training images from the ImageNet dataset.
64 
65  Sometimes this module will make mistakes! The performance of darknet-tiny is about 58.7% correct (top-1
66  accuracy) on the test set, and Darknet Reference is about 61.1% correct on the test set, using the default 224x224
67  network input layer size.
68 
69  Neural network size and speed
70  -----------------------------
71 
72  When using a video mapping with USB output, the network is automatically resized to a square size that is the
73  difference between the USB output video width and the camera sensor input width (e.g., when USB video mode is
74  544x240 and camera sensor mode is 320x240, the network will be resized to 224x224 since 224=544-320).
75 
76  The network size directly affects both speed and accuracy. Larger networks run slower but are more accurate.
77 
78  For example:
79 
80  - with USB output 544x240 (network size 224x224), this module runs at about 450ms/prediction.
81  - with USB output 448x240 (network size 128x128), this module runs at about 180ms/prediction.
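
 The size arithmetic above can be written out as a small helper. This is a sketch only: networkSize() is a
 hypothetical function that does not exist in JeVois, and it signals errors with exceptions for illustration.

```cpp
#include <stdexcept>

// Hypothetical helper: side length of the square network input for a given video mapping,
// computed as USB output width minus camera sensor width.
inline int networkSize(int usbWidth, int camWidth, int camHeight)
{
  if (usbWidth <= camWidth)
    throw std::invalid_argument("USB output must be wider than camera input");

  int const size = usbWidth - camWidth;

  // The square network input must fit inside the camera frame
  if (size > camWidth || size > camHeight)
    throw std::invalid_argument("Network input must fit within camera frame");

  return size;
}
```

 For the two mappings listed above, networkSize(544, 320, 240) gives 224 and networkSize(448, 320, 240)
 gives 128.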
82 
83  When using a video mapping with no USB output, the network is not resized (since we would not know what to resize it
84  to). You can still change its native size by changing the network's config file, for example, change the width and
85  height fields in <b>JEVOIS:/share/darknet/single/cfg/tiny.cfg</b>.
86 
87  Note that network dims must always be such that they fit inside the camera input image.
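
 The placement of the crop window inside the camera image can be sketched as follows. CropRect and
 centralCrop() are hypothetical names; the clamping and the even-offset rounding mirror the arithmetic in the
 process() function without USB output. Even horizontal offsets matter because YUYV stores pixels in pairs
 that share chroma samples.

```cpp
// Hypothetical plain struct describing a crop window in pixels.
struct CropRect { int x, y, w, h; };

// Hypothetical helper: center a netw x neth window inside an imgw x imgh image.
inline CropRect centralCrop(int imgw, int imgh, int netw, int neth)
{
  // Clamp so that the crop never exceeds the image
  if (netw > imgw) netw = imgw;
  if (neth > imgh) neth = imgh;

  // Center the window, then clear the lowest bit so that both offsets are even
  int const offx = ((imgw - netw) / 2) & (~1);
  int const offy = ((imgh - neth) / 2) & (~1);

  return CropRect{ offx, offy, netw, neth };
}
```

 For a 320x240 camera image and a 224x224 network, this places the crop at offset (48, 8).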
88 
89  Serial messages
90  ---------------
91 
92  When detections are found with confidence scores above \p thresh, a message containing up to \p top category:score
93  pairs will be sent per video frame. Exact message format depends on the current \p serstyle setting and is described
94  in \ref UserSerialStyle. For example, when \p serstyle is \b Detail, this module sends:
95 
96  \verbatim
97  DO category:score category:score ... category:score
98  \endverbatim
99 
100  where \a category is a category name (from \p namefile) and \a score is the confidence score from 0.0 to 100.0 that
101  this category was recognized. The pairs are in order of decreasing score.
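
 On the host side, a message in the format shown above could be decoded along these lines. This is a sketch
 under assumptions: parseDetailMessage() is a hypothetical function, it assumes that category names contain
 no whitespace, and it only handles the layout shown above, not the other \p serstyle variants.

```cpp
#include <exception>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical host-side parser for a line of the form "DO category:score category:score ...".
// Returns the (category, score) pairs in the order received; returns an empty vector for other messages.
inline std::vector<std::pair<std::string, float>> parseDetailMessage(std::string const & msg)
{
  std::vector<std::pair<std::string, float>> results;
  std::istringstream iss(msg);
  std::string token;

  // The first whitespace-separated token must be the "DO" header
  if (!(iss >> token) || token != "DO") return results;

  while (iss >> token)
  {
    // Split at the last ':' so that category names containing ':' are still handled
    auto const pos = token.rfind(':');
    if (pos == std::string::npos || pos == 0 || pos + 1 == token.size()) continue; // malformed pair

    try { results.emplace_back(token.substr(0, pos), std::stof(token.substr(pos + 1))); }
    catch (std::exception const &) { } // score was not a number: skip this pair
  }

  return results;
}
```

 For example, parsing the line "DO dog:87.5 cat:12.3" yields two pairs, with "dog" first since pairs are sent
 in order of decreasing score.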
102 
103  See \ref UserSerialStyle for more on standardized serial messages, and \ref coordhelpers for more info on
104  standardized coordinates.
105 
106  @author Laurent Itti
107 
108  @displayname Darknet Single
109  @videomapping NONE 0 0 0.0 YUYV 320 240 2.1 JeVois DarknetSingle
110  @videomapping YUYV 544 240 15.0 YUYV 320 240 15.0 JeVois DarknetSingle
111  @videomapping YUYV 448 240 15.0 YUYV 320 240 15.0 JeVois DarknetSingle
112  @email itti\@usc.edu
113  @address University of Southern California, HNB-07A, 3641 Watt Way, Los Angeles, CA 90089-2520, USA
114  @copyright Copyright (C) 2017 by Laurent Itti, iLab and the University of Southern California
115  @mainurl http://jevois.org
116  @supporturl http://jevois.org/doc
117  @otherurl http://iLab.usc.edu
118  @license GPL v3
119  @distribution Unrestricted
120  @restrictions None
121  \ingroup modules */
122 class DarknetSingle : public jevois::StdModule
123 {
124  public:
125  // ####################################################################################################
126  //! Constructor
127  // ####################################################################################################
128  DarknetSingle(std::string const & instance) : jevois::StdModule(instance)
129  {
130  itsDarknet = addSubComponent<Darknet>("darknet");
131  }
132 
133  // ####################################################################################################
134  //! Virtual destructor for safe inheritance
135  // ####################################################################################################
136  virtual ~DarknetSingle()
137  { }
138 
139  // ####################################################################################################
140  //! Un-initialization
141  // ####################################################################################################
142  virtual void postUninit() override
143  {
144  try { itsPredictFut.get(); } catch (...) { }
145  }
146 
147  // ####################################################################################################
148  //! Processing function, no video output
149  // ####################################################################################################
150  virtual void process(jevois::InputFrame && inframe) override
151  {
152  // Wait for next available camera image:
153  jevois::RawImage const inimg = inframe.get();
154  int const w = inimg.width, h = inimg.height;
155 
156  // Check input vs network dims, will throw if network not ready:
157  int netw, neth, netc;
158  try { itsDarknet->getInDims(netw, neth, netc); }
159  catch (std::logic_error const & e) { inframe.done(); return; }
160 
161  if (netw > w) netw = w;
162  if (neth > h) neth = h;
163 
164  // Take a central crop of the input:
165  int const offx = ((w - netw) / 2) & (~1);
166  int const offy = ((h - neth) / 2) & (~1);
167 
168  cv::Mat cvimg = jevois::rawimage::cvImage(inimg);
169  cv::Mat crop = cvimg(cv::Rect(offx, offy, netw, neth));
170 
171  // Convert crop to RGB for predictions:
172  cv::cvtColor(crop, itsCvImg, cv::COLOR_YUV2RGB_YUYV);
173 
174  // Let camera know we are done processing the input image:
175  inframe.done();
176 
177  // Launch the predictions (do not catch exceptions, we already tested for network ready in this block):
178  float const ptime = itsDarknet->predict(itsCvImg, itsResults);
179  LINFO("Predicted in " << ptime << "ms");
180 
181  // Send serial results:
182  sendSerialObjReco(itsResults);
183  }
184 
185  // ####################################################################################################
186  //! Processing function with video output to USB
187  // ####################################################################################################
188  virtual void process(jevois::InputFrame && inframe, jevois::OutputFrame && outframe) override
189  {
190  static jevois::Timer timer("processing", 30, LOG_DEBUG);
191 
192  // Wait for next available camera image:
193  jevois::RawImage const inimg = inframe.get();
194 
195  timer.start();
196 
197  // We only handle one specific pixel format, but any image size in this module:
198  int const w = inimg.width, h = inimg.height;
199  inimg.require("input", w, h, V4L2_PIX_FMT_YUYV);
200 
201  // While we process it, start a thread to wait for out frame and paste the input into it:
202  jevois::RawImage outimg;
203  auto paste_fut = jevois::async([&]() {
204  outimg = outframe.get();
205  outimg.require("output", outimg.width, outimg.height, V4L2_PIX_FMT_YUYV);
206 
207  // Paste the current input image:
208  jevois::rawimage::paste(inimg, outimg, 0, 0);
209  jevois::rawimage::writeText(outimg, "JeVois Darknet Single - input", 3, 3, jevois::yuyv::White);
210 
211  // Paste the latest prediction results, if any, otherwise a wait message:
212  cv::Mat outimgcv = jevois::rawimage::cvImage(outimg);
213  if (itsRawPrevOutputCv.empty() == false)
214  itsRawPrevOutputCv.copyTo(outimgcv(cv::Rect(w, 0, itsRawPrevOutputCv.cols, itsRawPrevOutputCv.rows)));
215  else
216  {
217  jevois::rawimage::drawFilledRect(outimg, w, 0, outimg.width - w, h, jevois::yuyv::Black);
218  jevois::rawimage::writeText(outimg, "Loading network -", w + 3, 3, jevois::yuyv::White);
219  jevois::rawimage::writeText(outimg, "please wait...", w + 3, 15, jevois::yuyv::White);
220  }
221  });
222 
223  // Decide on what to do based on itsPredictFut: if it is valid, we are still predicting, so check whether we are
224  // done and if so draw the results. Otherwise, start predicting using the current input frame:
225  if (itsPredictFut.valid())
226  {
227  // Are we finished predicting?
228  if (itsPredictFut.wait_for(std::chrono::milliseconds(5)) == std::future_status::ready)
229  {
230  // Do a get() on our future to free up the async thread and get any exception it might have thrown. In
231  // particular, it will throw a logic_error if we are still loading the network:
232  bool success = true; float ptime = 0.0F;
233  try { ptime = itsPredictFut.get(); } catch (std::logic_error const & e) { success = false; }
234 
235  // Wait for paste to finish up and let camera know we are done processing the input image:
236  paste_fut.get(); inframe.done();
237 
238  if (success)
239  {
240  int const netw = itsRawInputCv.cols, neth = itsRawInputCv.rows;
241  cv::Mat outimgcv = jevois::rawimage::cvImage(outimg);
242 
243  // Update our output image: First paste the image we have been making predictions on:
244  itsRawInputCv.copyTo(outimgcv(cv::Rect(w, 0, netw, neth)));
245  jevois::rawimage::drawFilledRect(outimg, w, neth, netw, h - neth, jevois::yuyv::Black);
246 
247  // Then draw the detections: either below the detection crop if there is room, or on top of it if not enough
248  // room below:
249  int y = neth + 3; if (y + int(itsDarknet->top::get()) * 12 > h - 21) y = 3;
250 
251  for (auto const & p : itsResults)
252  {
253  jevois::rawimage::writeText(outimg, jevois::sformat("%s: %.2F", p.category.c_str(), p.score),
254  w + 3, y, jevois::yuyv::White);
255  y += 12;
256  }
257 
258  // Send serial results:
259  sendSerialObjReco(itsResults);
260 
261  // Draw some text messages:
262  jevois::rawimage::writeText(outimg, "Predict time: " + std::to_string(int(ptime)) + "ms",
263  w + 3, h - 11, jevois::yuyv::White);
264 
265  // Finally make a copy of these new results so we can display them again while we wait for the next round:
266  itsRawPrevOutputCv = cv::Mat(h, netw, CV_8UC2);
267  outimgcv(cv::Rect(w, 0, netw, h)).copyTo(itsRawPrevOutputCv);
268 
269  } else { itsRawPrevOutputCv.release(); } // network is not ready yet
270  }
271  else
272  {
273  // Future is not ready, do nothing except drawings on this frame (done in paste_fut thread) and we will try
274  // again on the next one...
275  paste_fut.get(); inframe.done();
276  }
277  }
278  else // We are not predicting: start new predictions
279  {
280  // Wait for paste to finish up:
281  paste_fut.get();
282 
283  // In this module, we use square crops for the network, with size given by USB width - camera width:
284  if (outimg.width <= inimg.width) LFATAL("USB output image must be larger than camera input");
285  int const netw = outimg.width - inimg.width;
286  int const neth = netw; // square crop
287 
288  // Check input vs network dims:
289  if (netw > w || neth > h) LFATAL("Network input window must fit within camera frame");
290 
291  // Take a central crop of the input:
292  int const offx = ((w - netw) / 2) & (~1);
293  int const offy = ((h - neth) / 2) & (~1);
294  cv::Mat cvimg = jevois::rawimage::cvImage(inimg);
295  cv::Mat crop = cvimg(cv::Rect(offx, offy, netw, neth));
296 
297  // Convert crop to RGB for predictions:
298  cv::cvtColor(crop, itsCvImg, cv::COLOR_YUV2RGB_YUYV);
299 
300  // Also make a raw YUYV copy of the crop for later displays:
301  crop.copyTo(itsRawInputCv);
302 
303  // Let camera know we are done processing the input image:
304  inframe.done();
305 
306  // Launch the predictions; will throw if network is not ready:
307  try
308  {
309  int netinw, netinh, netinc; itsDarknet->getInDims(netinw, netinh, netinc); // will throw if not ready
310  itsPredictFut = jevois::async([&]() { return itsDarknet->predict(itsCvImg, itsResults); });
311  }
312  catch (std::logic_error const & e) { itsRawPrevOutputCv.release(); } // network is not ready yet
313  }
314 
315  // Show processing fps:
316  std::string const & fpscpu = timer.stop();
317  jevois::rawimage::writeText(outimg, fpscpu, 3, h - 13, jevois::yuyv::White);
318 
319  // Send the output image with our processing results to the host over USB:
320  outframe.send();
321  }
322 
323  // ####################################################################################################
324  protected:
325  std::shared_ptr<Darknet> itsDarknet;
326  std::vector<jevois::ObjReco> itsResults;
327  std::future<float> itsPredictFut;
328  cv::Mat itsRawInputCv;
329  cv::Mat itsCvImg;
330  cv::Mat itsRawPrevOutputCv;
331 };
332 
333 // Allow the module to be loaded as a shared object (.so) file:
334 JEVOIS_REGISTER_MODULE(DarknetSingle);