JeVoisBase  1.20
JeVois Smart Embedded Machine Vision Toolkit Base Modules
DemoSalGistFaceObj.C
1 // ///////////////////////////////////////////////////////////////////////////////////////////////////////////////////
2 //
3 // JeVois Smart Embedded Machine Vision Toolkit - Copyright (C) 2016 by Laurent Itti, the University of Southern
4 // California (USC), and iLab at USC. See http://iLab.usc.edu and http://jevois.org for information about this project.
5 //
6 // This file is part of the JeVois Smart Embedded Machine Vision Toolkit. This program is free software; you can
7 // redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software
8 // Foundation, version 2. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY;
9 // without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public
10 // License for more details. You should have received a copy of the GNU General Public License along with this program;
11 // if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
12 //
13 // Contact information: Laurent Itti - 3641 Watt Way, HNB-07A - Los Angeles, CA 90089-2520 - USA.
14 // Tel: +1 213 740 3527 - itti@pollux.usc.edu - http://iLab.usc.edu - http://jevois.org
15 // ///////////////////////////////////////////////////////////////////////////////////////////////////////////////////
16 /*! \file */
17 
18 #include <jevois/Core/Module.H>
19 
20 #include <jevois/Debug/Log.H>
21 #include <jevois/Debug/Timer.H>
22 #include <jevoisbase/Components/Saliency/Saliency.H>
23 #include <jevoisbase/Components/FaceDetection/FaceDetector.H>
24 #include <jevoisbase/Components/ObjectRecognition/ObjectRecognitionMNIST.H>
25 #include <jevoisbase/Components/Tracking/Kalman2D.H>
26 #include <jevois/Image/RawImageOps.H>
27 
28 #include <opencv2/core/core.hpp>
29 #include <opencv2/imgproc/imgproc.hpp>
30 #include <linux/videodev2.h> // for v4l2 pixel types
31 //#include <opencv2/highgui/highgui.hpp> // used for debugging only, see imshow below
32 // icon by Freepik in interface at flaticon
33 
34 //! Simple demo that combines saliency, gist, face detection, and object recognition
35 /*! Run the visual saliency algorithm to find the most interesting location in the field of view. Then extract a square
36  image region around that point. On alternating frames, either
37 
38  - attempt to detect a face in the attended region, and, if positively detected, show the face in the bottom-right
39  corner of the display. The last detected face will remain shown in the bottom-right corner of the display until a
40  new face is detected.
41 
42  - or attempt to recognize an object in the attended region, using a deep neural network. The default network is a
43  handwritten digit recognition network that replicates the original LeNet by Yann LeCun and is one of the very
44  first convolutional neural networks. The network has been trained on the standard MNIST database of handwritten
45  digits, and achieves over 99% correct recognition on the MNIST test dataset. When a digit is positively identified,
46  a picture of it appears near the last detected face towards the bottom-right corner of the display, and a text
47  string with the digit that has been identified appears to the left of the picture of the digit.
48 
49  Serial Messages
50  ---------------
51 
52  This module can send standardized serial messages as described in \ref UserSerialStyle, where all coordinates and
53  sizes are standardized using \ref coordhelpers. One message is issued on every video frame at the temporally
54  filtered attended (most salient) location (green circle in the video display):
55 
56  - Serial message type: \b 2D
57  - `id`: always \b sm (shorthand for saliency map)
58  - `x`, `y`: standardized 2D coordinates of temporally-filtered most salient point
59  - `w`, `h`: standardized size of the pink square box around each attended point
60  - `extra`: none (empty string)
61 
62  See \ref UserSerialStyle for more on standardized serial messages, and \ref coordhelpers for more info on
63  standardized coordinates.
64 
65 
66  @author Laurent Itti
67 
68  @displayname Demo Saliency + Gist + Face Detection + Object Recognition
69  @videomapping YUYV 640 312 50.0 YUYV 320 240 50.0 JeVois DemoSalGistFaceObj
70  @email itti\@usc.edu
71  @address University of Southern California, HNB-07A, 3641 Watt Way, Los Angeles, CA 90089-2520, USA
72  @copyright Copyright (C) 2016 by Laurent Itti, iLab and the University of Southern California
73  @mainurl http://jevois.org
74  @supporturl http://jevois.org/doc
75  @otherurl http://iLab.usc.edu
76  @license GPL v3
77  @distribution Unrestricted
78  @restrictions None
79  \ingroup modules */
80 class DemoSalGistFaceObj : public jevois::StdModule
81 {
82  public:
83  //! Constructor
84  DemoSalGistFaceObj(std::string const & instance) : jevois::StdModule(instance), itsScoresStr(" ")
85  {
86  itsSaliency = addSubComponent<Saliency>("saliency");
87  itsFaceDetector = addSubComponent<FaceDetector>("facedetect");
88  itsObjectRecognition = addSubComponent<ObjectRecognitionMNIST>("MNIST");
89  itsKF = addSubComponent<Kalman2D>("kalman");
90  }
91 
92  //! Virtual destructor for safe inheritance
93  virtual ~DemoSalGistFaceObj() { }
94 
95  //! Processing function
96  virtual void process(jevois::InputFrame && inframe, jevois::OutputFrame && outframe) override
97  {
98  static jevois::Timer itsProcessingTimer("Processing");
99  static cv::Mat itsLastFace(60, 60, CV_8UC2, 0x80aa); // Note that this one will contain raw YUV pixels
100  static cv::Mat itsLastObject(60, 60, CV_8UC2, 0x80aa); // Note that this one will contain raw YUV pixels
101  static std::string itsLastObjectCateg;
102  static bool doobject = false; // alternate between object and face recognition
103 
104  // Wait for next available camera image:
105  jevois::RawImage inimg = inframe.get();
106 
107  // We only handle one specific input format in this demo:
108  inimg.require("input", 320, 240, V4L2_PIX_FMT_YUYV);
109 
110  itsProcessingTimer.start();
111  int const roihw = 32; // face & object roi half width and height
112 
113  // Compute saliency, in a thread:
114  auto sal_fut = jevois::async([&](){ itsSaliency->process(inimg, true); });
115 
116  // While computing, wait for an image from our gadget driver into which we will put our results:
117  jevois::RawImage outimg = outframe.get();
118  outimg.require("output", 640, 312, V4L2_PIX_FMT_YUYV);
119 
120  // Paste the original image to the top-left corner of the display:
121  unsigned short const txtcol = jevois::yuyv::White;
122  jevois::rawimage::paste(inimg, outimg, 0, 0);
123  jevois::rawimage::writeText(outimg, "JeVois Saliency + Gist + Faces + Objects", 3, 3, txtcol);
124 
125  // Wait until saliency computation is complete:
126  sal_fut.get();
127 
128  // find most salient point:
129  int mx, my; intg32 msal;
130  itsSaliency->getSaliencyMax(mx, my, msal);
131 
132  // Scale back to original image coordinates:
133  int const smlev = itsSaliency->smscale::get();
134  int const smadj = smlev > 0 ? (1 << (smlev-1)) : 0; // half a saliency map pixel adjustment
135  int const dmx = (mx << smlev) + smadj;
136  int const dmy = (my << smlev) + smadj;
137 
138  // Compute instantaneous attended ROI (note: coords must be even to avoid flipping U/V when we later paste):
139  int const rx = std::min(int(inimg.width) - roihw, std::max(roihw, dmx));
140  int const ry = std::min(int(inimg.height) - roihw, std::max(roihw, dmy));
141 
142  // Asynchronously launch a bunch of saliency drawings and filter the attended locations
143  auto draw_fut =
144  jevois::async([&]() {
145  // Paste the various saliency results:
146  drawMap(outimg, &itsSaliency->salmap, 320, 0, 16, 20);
147  jevois::rawimage::writeText(outimg, "Saliency Map", 640 - 12*6-4, 3, txtcol);
148 
149  drawMap(outimg, &itsSaliency->color, 0, 240, 4, 18);
150  jevois::rawimage::writeText(outimg, "Color", 3, 243, txtcol);
151 
152  drawMap(outimg, &itsSaliency->intens, 80, 240, 4, 18);
153  jevois::rawimage::writeText(outimg, "Intensity", 83, 243, txtcol);
154 
155  drawMap(outimg, &itsSaliency->ori, 160, 240, 4, 18);
156  jevois::rawimage::writeText(outimg, "Orientation", 163, 243, txtcol);
157 
158  drawMap(outimg, &itsSaliency->flicker, 240, 240, 4, 18);
159  jevois::rawimage::writeText(outimg, "Flicker", 243, 243, txtcol);
160 
161  drawMap(outimg, &itsSaliency->motion, 320, 240, 4, 18);
162  jevois::rawimage::writeText(outimg, "Motion", 323, 243, txtcol);
163 
164  // Draw the gist vector:
165  drawGist(outimg, itsSaliency->gist, itsSaliency->gist_size, 400, 242, 40, 2);
166 
167  // Draw a small square at most salient location in image and in saliency map:
168  jevois::rawimage::drawFilledRect(outimg, mx * 16 + 5, my * 16 + 5, 8, 8, 0xffff);
169  jevois::rawimage::drawFilledRect(outimg, 320 + mx * 16 + 5, my * 16 + 5, 8, 8, 0xffff);
170  jevois::rawimage::drawRect(outimg, rx - roihw, ry - roihw, roihw*2, roihw*2, 0xf0f0);
171  jevois::rawimage::drawRect(outimg, rx - roihw+1, ry - roihw+1, roihw*2-2, roihw*2-2, 0xf0f0);
172 
173  // Blank out free space from 480 to 519 at the bottom, and small space above and below gist vector:
174  jevois::rawimage::drawFilledRect(outimg, 480, 240, 40, 60, 0x8000);
175  jevois::rawimage::drawRect(outimg, 400, 240, 80, 2, 0x80a0);
176  jevois::rawimage::drawRect(outimg, 400, 298, 80, 2, 0x80a0);
177  jevois::rawimage::drawFilledRect(outimg, 0, 300, 640, 12, jevois::yuyv::Black);
178 
179  // Filter the attended locations:
180  itsKF->set(dmx, dmy, inimg.width, inimg.height);
181  float kfxraw, kfyraw, kfximg, kfyimg;
182  itsKF->get(kfxraw, kfyraw, kfximg, kfyimg, inimg.width, inimg.height, 1.0F, 1.0F);
183 
184  // Draw a circle around the kalman-filtered attended location:
185  jevois::rawimage::drawCircle(outimg, int(kfximg), int(kfyimg), 20, 1, jevois::yuyv::LightGreen);
186 
187  // Send saliency info to serial port (for arduino, etc):
188  sendSerialImg2D(inimg.width, inimg.height, kfximg, kfyimg, roihw * 2, roihw * 2, "sm");
189  });
190 
191  // Extract a raw YUYV ROI around attended point:
192  cv::Mat rawimgcv = jevois::rawimage::cvImage(inimg);
193  cv::Mat rawroi = rawimgcv(cv::Rect(rx - roihw, ry - roihw, roihw * 2, roihw * 2));
194 
195  if (doobject)
196  {
197  // #################### Object recognition:
198 
199  // Prepare a color or grayscale ROI for the object recognition module:
200  auto objsz = itsObjectRecognition->insize();
201  cv::Mat objroi;
202  switch (objsz.depth_)
203  {
204  case 1: // grayscale input
205  {
206  // mnist is white letters on black background, so invert the image before we send it for recognition, as we
207  // assume here black letters on white backgrounds. We also need to provide a clean crop around the digit for
208  // the deep network to work well:
209  cv::cvtColor(rawroi, objroi, cv::COLOR_YUV2GRAY_YUYV);
210 
211  // Find the 10th percentile gray value:
212  size_t const elem = (objroi.cols * objroi.rows * 10) / 100;
213  std::vector<unsigned char> v; v.assign(objroi.datastart, objroi.dataend);
214  std::nth_element(v.begin(), v.begin() + elem, v.end());
215  unsigned char const thresh = std::min((unsigned char)(100), std::max((unsigned char)(30), v[elem]));
216 
217  // Threshold and invert the image:
218  cv::threshold(objroi, objroi, thresh, 255, cv::THRESH_BINARY_INV);
219 
220  // Find the digit and center and crop it:
221  cv::Mat pts; cv::findNonZero(objroi, pts);
222  cv::Rect r = cv::boundingRect(pts);
223  int const cx = r.x + r.width / 2;
224  int const cy = r.y + r.height / 2;
225  int const siz = std::min(roihw * 2, std::max(16, 8 + std::max(r.width, r.height))); // margin of 4 pix
226  int const tlx = std::max(0, std::min(roihw*2 - siz, cx - siz/2));
227  int const tly = std::max(0, std::min(roihw*2 - siz, cy - siz/2));
228  cv::Rect ar(tlx, tly, siz, siz);
229  cv::resize(objroi(ar), objroi, cv::Size(objsz.width_, objsz.height_), 0, 0, cv::INTER_AREA);
230  //cv::imshow("cropped roi", objroi);cv::waitKey(1);
231  }
232  break;
233 
234  case 3: // color input
235  cv::cvtColor(rawroi, objroi, cv::COLOR_YUV2RGB_YUYV);
236  cv::resize(objroi, objroi, cv::Size(objsz.width_, objsz.height_), 0, 0, cv::INTER_AREA);
237  break;
238 
239  default:
240  LFATAL("Unsupported object detection input depth " << objsz.depth_);
241  }
242 
243  // Launch object recognition on the ROI and get the recognition scores:
244  auto scores = itsObjectRecognition->process(objroi);
245 
246  // Create a string to show all scores:
247  std::ostringstream oss;
248  for (size_t i = 0; i < scores.size(); ++i)
249  oss << itsObjectRecognition->category(i) << ':' << std::fixed << std::setprecision(2) << scores[i] << ' ';
250  itsScoresStr = oss.str();
251 
252  // Check whether the highest score is very high and significantly higher than the second best:
253  float best1 = scores[0], best2 = scores[0]; size_t idx1 = 0, idx2 = 0;
254  for (size_t i = 1; i < scores.size(); ++i)
255  {
256  if (scores[i] > best1) { best2 = best1; idx2 = idx1; best1 = scores[i]; idx1 = i; }
257  else if (scores[i] > best2) { best2 = scores[i]; idx2 = i; }
258  }
259 
260  // Update our display upon each "clean" recognition:
261  if (best1 > 90.0F && best2 < 20.0F)
262  {
263  // Remember this recognized object for future displays:
264  itsLastObjectCateg = itsObjectRecognition->category(idx1);
265  itsLastObject = rawimgcv(cv::Rect(rx - 30, ry - 30, 60, 60)).clone(); // make a deep copy
266 
267  LINFO("Object recognition: best: " << itsLastObjectCateg <<" (" << best1 <<
268  "), second best: " << itsObjectRecognition->category(idx2) << " (" << best2 << ')');
269  }
270  }
271  else
272  {
273  // #################### Face detection:
274 
275  // Prepare a grey ROI from our raw YUYV roi:
276  cv::Mat grayroi; cv::cvtColor(rawroi, grayroi, cv::COLOR_YUV2GRAY_YUYV);
277  cv::equalizeHist(grayroi, grayroi);
278 
279  // Launch the face detector:
280  std::vector<cv::Rect> faces; std::vector<std::vector<cv::Rect> > eyes;
281  itsFaceDetector->process(grayroi, faces, eyes, false);
282 
283  // Draw the faces and eyes, if any:
284  if (faces.size())
285  {
286  LINFO("detected " << faces.size() << " faces");
287  // Store the attended ROI into our last ROI, fixed size 60x60 for our display:
288  itsLastFace = rawimgcv(cv::Rect(rx - 30, ry - 30, 60, 60)).clone(); // make a deep copy
289  }
290 
291  for (size_t i = 0; i < faces.size(); ++i)
292  {
293  // Draw one face:
294  cv::Rect const & f = faces[i];
295  jevois::rawimage::drawRect(outimg, f.x + rx - roihw, f.y + ry - roihw, f.width, f.height, 0xc0ff);
296 
297  // Draw the corresponding eyes:
298  for (auto const & e : eyes[i])
299  jevois::rawimage::drawRect(outimg, e.x + rx - roihw, e.y + ry - roihw, e.width, e.height, 0x40ff);
300  }
301  }
302 
303  // Let camera know we are done processing the raw YUV input image. NOTE: rawroi is now invalid:
304  inframe.done();
305 
306  // Paste our last attended and recognized face and object (or empty pics):
307  cv::Mat outimgcv(outimg.height, outimg.width, CV_8UC2, outimg.buf->data());
308  itsLastObject.copyTo(outimgcv(cv::Rect(520, 240, 60, 60)));
309  itsLastFace.copyTo(outimgcv(cv::Rect(580, 240, 60, 60)));
310 
311  // Wait until all saliency drawings are complete (since they blank out our object label area):
312  draw_fut.get();
313 
314  // Print all object scores:
315  jevois::rawimage::writeText(outimg, itsScoresStr, 2, 301, txtcol);
316 
317  // Write any positively recognized object category:
318  jevois::rawimage::writeText(outimg, itsLastObjectCateg.c_str(), 517-6*itsLastObjectCateg.length(), 263, txtcol);
319 
320  // FIXME do SVM on gist and write results here
321 
322  // Show processing fps:
323  std::string const & fpscpu = itsProcessingTimer.stop();
324  jevois::rawimage::writeText(outimg, fpscpu, 3, 240 - 13, jevois::yuyv::White);
325 
326  // Send the output image with our processing results to the host over USB:
327  outframe.send();
328 
329  // Alternate between face and object recognition:
330  doobject = ! doobject;
331  }
332 
333  protected:
334  std::shared_ptr<Saliency> itsSaliency;
335  std::shared_ptr<FaceDetector> itsFaceDetector;
336  std::shared_ptr<ObjectRecognitionBase> itsObjectRecognition;
337  std::shared_ptr<Kalman2D> itsKF;
338  std::string itsScoresStr;
339 };
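
The coordinate upscaling done in process() above (shift the saliency-map coordinate left by the pyramid level, then add half a saliency-map pixel so the point lands at the center of the corresponding image region rather than its top-left corner) can be illustrated stand-alone; `salmapToImage` is a hypothetical helper name, not part of the module:

```cpp
#include <cassert>

// Sketch of the saliency-map-to-image coordinate mapping used in process():
// with the default smscale of 4, one saliency map pixel covers a 16x16 image
// patch, and the half-pixel adjustment of 8 centers the point in that patch.
int salmapToImage(int coord, int smlev)
{
  int const smadj = smlev > 0 ? (1 << (smlev - 1)) : 0; // half a saliency map pixel
  return (coord << smlev) + smadj;
}
```

For example, saliency map location (5, 3) at the default level 4 maps to image location (88, 56), the center of the 16x16 patch whose top-left corner is (80, 48).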
340 
341 // Allow the module to be loaded as a shared object (.so) file:
342 JEVOIS_REGISTER_MODULE(DemoSalGistFaceObj);
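
On the host side, the standardized 2D message this module sends once per frame can be parsed with a few lines of code. This is a hedged sketch, assuming the Normal-verbosity wire format `N2 id x y w h` described in \ref UserSerialStyle; `Msg2D` and `parse2D` are hypothetical names, not part of the JeVois API:

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Minimal host-side decoder for a Normal-style standardized 2D message,
// e.g. "N2 sm 250 -125 100 100" (id "sm", standardized x/y/w/h):
struct Msg2D { std::string id; float x, y, w, h; };

bool parse2D(std::string const & line, Msg2D & m)
{
  std::istringstream iss(line);
  std::string tok; iss >> tok;
  if (tok != "N2") return false; // not a Normal-style 2D message
  iss >> m.id >> m.x >> m.y >> m.w >> m.h;
  return !iss.fail();
}
```

For this module, `id` is always "sm" and `w`/`h` are the standardized size of the attended region (the pink box), so a host program can track the filtered salient point by reading one such line per frame.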