JeVoisBase  1.22
JeVois Smart Embedded Machine Vision Toolkit Base Modules
DemoSalGistFaceObj.C
// ///////////////////////////////////////////////////////////////////////////////////////////////////////////////////
//
// JeVois Smart Embedded Machine Vision Toolkit - Copyright (C) 2016 by Laurent Itti, the University of Southern
// California (USC), and iLab at USC. See http://iLab.usc.edu and http://jevois.org for information about this project.
//
// This file is part of the JeVois Smart Embedded Machine Vision Toolkit. This program is free software; you can
// redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software
// Foundation, version 2. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY;
// without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public
// License for more details. You should have received a copy of the GNU General Public License along with this program;
// if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
//
// Contact information: Laurent Itti - 3641 Watt Way, HNB-07A - Los Angeles, CA 90089-2520 - USA.
// Tel: +1 213 740 3527 - itti@pollux.usc.edu - http://iLab.usc.edu - http://jevois.org
// ///////////////////////////////////////////////////////////////////////////////////////////////////////////////////
/*! \file */

#include <jevois/Core/Module.H>

#include <jevois/Debug/Log.H>
#include <jevois/Debug/Timer.H>

#include <jevoisbase/Components/Saliency/Saliency.H>
#include <jevoisbase/Components/FaceDetection/FaceDetector.H>
#include <jevoisbase/Components/ObjectRecognition/ObjectRecognitionMNIST.H>
#include <jevoisbase/Components/Tracking/Kalman2D.H>

#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <linux/videodev2.h> // for v4l2 pixel types
//#include <opencv2/highgui/highgui.hpp> // used for debugging only, see imshow below
// icon by Freepik in interface at flaticon

//! Simple demo that combines saliency, gist, face detection, and object recognition
/*! Run the visual saliency algorithm to find the most interesting location in the field of view. Then extract a square
    image region around that point. On alternating frames, either

    - attempt to detect a face in the attended region, and, if positively detected, show the face in the bottom-right
      corner of the display. The last detected face will remain shown in the bottom-right corner of the display until a
      new face is detected.

    - or attempt to recognize an object in the attended region, using a deep neural network. The default network is a
      handwritten digit recognition network that replicates the original LeNet by Yann LeCun and is one of the very
      first convolutional neural networks. The network has been trained on the standard MNIST database of handwritten
      digits, and achieves over 99% correct recognition on the MNIST test dataset. When a digit is positively
      identified, a picture of it appears near the last detected face towards the bottom-right corner of the display,
      and a text string with the digit that has been identified appears to the left of the picture of the digit.

    Serial Messages
    ---------------

    This module can send standardized serial messages as described in \ref UserSerialStyle, where all coordinates and
    sizes are standardized using \ref coordhelpers. One message is issued on every video frame at the temporally
    filtered attended (most salient) location (green circle in the video display):

    - Serial message type: \b 2D
    - `id`: always \b sm (shorthand for saliency map)
    - `x`, `y`: standardized 2D coordinates of the temporally-filtered most salient point
    - `w`, `h`: standardized size of the pink square box around each attended point
    - `extra`: none (empty string)

    See \ref UserSerialStyle for more on standardized serial messages, and \ref coordhelpers for more info on
    standardized coordinates.
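
    For example, assuming the Normal serial style described in \ref UserSerialStyle (message form `N2 id x y w h`,
    with coordinates standardized to the -1000 ... 1000 range), one frame could produce a line like the following
    (values are illustrative only and depend on scene content):

    \verbatim
    N2 sm 259 -118 400 533
    \endverbatim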


    @author Laurent Itti

    @displayname Demo Saliency + Gist + Face Detection + Object Recognition
    @videomapping YUYV 640 312 50.0 YUYV 320 240 50.0 JeVois DemoSalGistFaceObj
    @email itti\@usc.edu
    @address University of Southern California, HNB-07A, 3641 Watt Way, Los Angeles, CA 90089-2520, USA
    @copyright Copyright (C) 2016 by Laurent Itti, iLab and the University of Southern California
    @mainurl http://jevois.org
    @supporturl http://jevois.org/doc
    @otherurl http://iLab.usc.edu
    @license GPL v2
    @distribution Unrestricted
    @restrictions None
    \ingroup modules */
class DemoSalGistFaceObj : public jevois::StdModule
{
  public:
    //! Constructor
    DemoSalGistFaceObj(std::string const & instance) : jevois::StdModule(instance), itsScoresStr(" ")
    {
      itsSaliency = addSubComponent<Saliency>("saliency");
      itsFaceDetector = addSubComponent<FaceDetector>("facedetect");
      itsObjectRecognition = addSubComponent<ObjectRecognitionMNIST>("MNIST");
      itsKF = addSubComponent<Kalman2D>("kalman");
    }

    //! Virtual destructor for safe inheritance
    virtual ~DemoSalGistFaceObj() { }

    //! Processing function
    virtual void process(jevois::InputFrame && inframe, jevois::OutputFrame && outframe) override
    {
      static jevois::Timer itsProcessingTimer("Processing");
      static cv::Mat itsLastFace(60, 60, CV_8UC2, 0x80aa); // Note that this one will contain raw YUV pixels
      static cv::Mat itsLastObject(60, 60, CV_8UC2, 0x80aa); // Note that this one will contain raw YUV pixels
      static std::string itsLastObjectCateg;
      static bool doobject = false; // alternate between object and face recognition

      // Wait for next available camera image:
      jevois::RawImage inimg = inframe.get();

      // We only handle one specific input format in this demo:
      inimg.require("input", 320, 240, V4L2_PIX_FMT_YUYV);

      itsProcessingTimer.start();
      int const roihw = 32; // face & object roi half width and height

      // Compute saliency, in a thread:
      auto sal_fut = jevois::async([&](){ itsSaliency->process(inimg, true); });

      // While computing, wait for an image from our gadget driver into which we will put our results:
      jevois::RawImage outimg = outframe.get();
      outimg.require("output", 640, 312, V4L2_PIX_FMT_YUYV);

      // Paste the original image to the top-left corner of the display:
      unsigned short const txtcol = jevois::yuyv::White;
      jevois::rawimage::paste(inimg, outimg, 0, 0);
      jevois::rawimage::writeText(outimg, "JeVois Saliency + Gist + Faces + Objects", 3, 3, txtcol);

      // Wait until saliency computation is complete:
      sal_fut.get();

      // Find the most salient point:
      int mx, my; intg32 msal;
      itsSaliency->getSaliencyMax(mx, my, msal);

      // Scale back to original image coordinates:
      int const smlev = itsSaliency->smscale::get();
      int const smadj = smlev > 0 ? (1 << (smlev-1)) : 0; // half a saliency map pixel adjustment
      int const dmx = (mx << smlev) + smadj;
      int const dmy = (my << smlev) + smadj;

      // Compute instantaneous attended ROI (note: coords must be even to avoid flipping U/V when we later paste):
      int const rx = std::min(int(inimg.width) - roihw, std::max(roihw, dmx));
      int const ry = std::min(int(inimg.height) - roihw, std::max(roihw, dmy));

      // Asynchronously launch a bunch of saliency drawings and filter the attended locations:
      auto draw_fut =
        jevois::async([&]() {
            // Paste the various saliency results:
            drawMap(outimg, &itsSaliency->salmap, 320, 0, 16, 20);
            jevois::rawimage::writeText(outimg, "Saliency Map", 640 - 12*6-4, 3, txtcol);

            drawMap(outimg, &itsSaliency->color, 0, 240, 4, 18);
            jevois::rawimage::writeText(outimg, "Color", 3, 243, txtcol);

            drawMap(outimg, &itsSaliency->intens, 80, 240, 4, 18);
            jevois::rawimage::writeText(outimg, "Intensity", 83, 243, txtcol);

            drawMap(outimg, &itsSaliency->ori, 160, 240, 4, 18);
            jevois::rawimage::writeText(outimg, "Orientation", 163, 243, txtcol);

            drawMap(outimg, &itsSaliency->flicker, 240, 240, 4, 18);
            jevois::rawimage::writeText(outimg, "Flicker", 243, 243, txtcol);

            drawMap(outimg, &itsSaliency->motion, 320, 240, 4, 18);
            jevois::rawimage::writeText(outimg, "Motion", 323, 243, txtcol);

            // Draw the gist vector:
            drawGist(outimg, itsSaliency->gist, itsSaliency->gist_size, 400, 242, 40, 2);

            // Draw a small square at the most salient location in the image and in the saliency map:
            jevois::rawimage::drawFilledRect(outimg, mx * 16 + 5, my * 16 + 5, 8, 8, 0xffff);
            jevois::rawimage::drawFilledRect(outimg, 320 + mx * 16 + 5, my * 16 + 5, 8, 8, 0xffff);
            jevois::rawimage::drawRect(outimg, rx - roihw, ry - roihw, roihw*2, roihw*2, 0xf0f0);
            jevois::rawimage::drawRect(outimg, rx - roihw+1, ry - roihw+1, roihw*2-2, roihw*2-2, 0xf0f0);

            // Blank out free space from 480 to 519 at the bottom, and small space above and below gist vector:
            jevois::rawimage::drawFilledRect(outimg, 480, 240, 40, 60, 0x8000);
            jevois::rawimage::drawRect(outimg, 400, 240, 80, 2, 0x80a0);
            jevois::rawimage::drawRect(outimg, 400, 298, 80, 2, 0x80a0);

            // Filter the attended locations:
            itsKF->set(dmx, dmy, inimg.width, inimg.height);
            float kfxraw, kfyraw, kfximg, kfyimg;
            itsKF->get(kfxraw, kfyraw, kfximg, kfyimg, inimg.width, inimg.height, 1.0F, 1.0F);

            // Draw a circle around the Kalman-filtered attended location:
            jevois::rawimage::drawCircle(outimg, int(kfximg), int(kfyimg), 20, 1, jevois::yuyv::LightGreen);

            // Send saliency info to serial port (for Arduino, etc.):
            sendSerialImg2D(inimg.width, inimg.height, kfximg, kfyimg, roihw * 2, roihw * 2, "sm");
          });

      // Extract a raw YUYV ROI around the attended point:
      cv::Mat rawimgcv = jevois::rawimage::cvImage(inimg);
      cv::Mat rawroi = rawimgcv(cv::Rect(rx - roihw, ry - roihw, roihw * 2, roihw * 2));

      if (doobject)
      {
        // #################### Object recognition:

        // Prepare a color or grayscale ROI for the object recognition module:
        auto objsz = itsObjectRecognition->insize();
        cv::Mat objroi;
        switch (objsz.depth_)
        {
        case 1: // grayscale input
        {
          // MNIST is white digits on a black background, so invert the image before we send it for recognition, as we
          // assume here black digits on white backgrounds. We also need to provide a clean crop around the digit for
          // the deep network to work well:
          cv::cvtColor(rawroi, objroi, cv::COLOR_YUV2GRAY_YUYV);

          // Find the 10th percentile gray value:
          size_t const elem = (objroi.cols * objroi.rows * 10) / 100;
          std::vector<unsigned char> v; v.assign(objroi.datastart, objroi.dataend);
          std::nth_element(v.begin(), v.begin() + elem, v.end());
          unsigned char const thresh = std::min((unsigned char)(100), std::max((unsigned char)(30), v[elem]));

          // Threshold and invert the image:
          cv::threshold(objroi, objroi, thresh, 255, cv::THRESH_BINARY_INV);

          // Find the digit, center it, and crop it:
          cv::Mat pts; cv::findNonZero(objroi, pts);
          cv::Rect r = cv::boundingRect(pts);
          int const cx = r.x + r.width / 2;
          int const cy = r.y + r.height / 2;
          int const siz = std::min(roihw * 2, std::max(16, 8 + std::max(r.width, r.height))); // margin of 4 pix
          int const tlx = std::max(0, std::min(roihw*2 - siz, cx - siz/2));
          int const tly = std::max(0, std::min(roihw*2 - siz, cy - siz/2));
          cv::Rect ar(tlx, tly, siz, siz);
          cv::resize(objroi(ar), objroi, cv::Size(objsz.width_, objsz.height_), 0, 0, cv::INTER_AREA);
          //cv::imshow("cropped roi", objroi); cv::waitKey(1);
        }
        break;

        case 3: // color input
          cv::cvtColor(rawroi, objroi, cv::COLOR_YUV2RGB_YUYV);
          cv::resize(objroi, objroi, cv::Size(objsz.width_, objsz.height_), 0, 0, cv::INTER_AREA);
          break;

        default:
          LFATAL("Unsupported object detection input depth " << objsz.depth_);
        }

        // Launch object recognition on the ROI and get the recognition scores:
        auto scores = itsObjectRecognition->process(objroi);

        // Create a string to show all scores:
        std::ostringstream oss;
        for (size_t i = 0; i < scores.size(); ++i)
          oss << itsObjectRecognition->category(i) << ':' << std::fixed << std::setprecision(2) << scores[i] << ' ';
        itsScoresStr = oss.str();

        // Check whether the highest score is very high and significantly higher than the second best. Start best2
        // very low so the runner-up is tracked correctly even when scores[0] is the best:
        float best1 = scores[0], best2 = -1.0e30F; size_t idx1 = 0, idx2 = 0;
        for (size_t i = 1; i < scores.size(); ++i)
        {
          if (scores[i] > best1) { best2 = best1; idx2 = idx1; best1 = scores[i]; idx1 = i; }
          else if (scores[i] > best2) { best2 = scores[i]; idx2 = i; }
        }

        // Update our display upon each "clean" recognition:
        if (best1 > 90.0F && best2 < 20.0F)
        {
          // Remember this recognized object for future displays:
          itsLastObjectCateg = itsObjectRecognition->category(idx1);
          itsLastObject = rawimgcv(cv::Rect(rx - 30, ry - 30, 60, 60)).clone(); // make a deep copy

          LINFO("Object recognition: best: " << itsLastObjectCateg << " (" << best1 <<
                "), second best: " << itsObjectRecognition->category(idx2) << " (" << best2 << ')');
        }
      }
      else
      {
        // #################### Face detection:

        // Prepare a gray ROI from our raw YUYV ROI:
        cv::Mat grayroi; cv::cvtColor(rawroi, grayroi, cv::COLOR_YUV2GRAY_YUYV);
        cv::equalizeHist(grayroi, grayroi);

        // Launch the face detector:
        std::vector<cv::Rect> faces; std::vector<std::vector<cv::Rect> > eyes;
        itsFaceDetector->process(grayroi, faces, eyes, false);

        // Draw the faces and eyes, if any:
        if (faces.size())
        {
          LINFO("detected " << faces.size() << " faces");
          // Store the attended ROI into our last ROI, fixed size 60x60 for our display:
          itsLastFace = rawimgcv(cv::Rect(rx - 30, ry - 30, 60, 60)).clone(); // make a deep copy
        }

        for (size_t i = 0; i < faces.size(); ++i)
        {
          // Draw one face:
          cv::Rect const & f = faces[i];
          jevois::rawimage::drawRect(outimg, f.x + rx - roihw, f.y + ry - roihw, f.width, f.height, 0xc0ff);

          // Draw the corresponding eyes:
          for (auto const & e : eyes[i])
            jevois::rawimage::drawRect(outimg, e.x + rx - roihw, e.y + ry - roihw, e.width, e.height, 0x40ff);
        }
      }

      // Let the camera know we are done processing the raw YUV input image. NOTE: rawroi is now invalid:
      inframe.done();

      // Paste our last attended and recognized face and object (or empty pics):
      cv::Mat outimgcv(outimg.height, outimg.width, CV_8UC2, outimg.buf->data());
      itsLastObject.copyTo(outimgcv(cv::Rect(520, 240, 60, 60)));
      itsLastFace.copyTo(outimgcv(cv::Rect(580, 240, 60, 60)));

      // Wait until all saliency drawings are complete (since they blank out our object label area):
      draw_fut.get();

      // Print all object scores:
      jevois::rawimage::writeText(outimg, itsScoresStr, 2, 301, txtcol);

      // Write any positively recognized object category:
      jevois::rawimage::writeText(outimg, itsLastObjectCateg.c_str(), 517 - 6*itsLastObjectCateg.length(), 263, txtcol);

      // FIXME do svm on gist and write results here

      // Show processing fps:
      std::string const & fpscpu = itsProcessingTimer.stop();
      jevois::rawimage::writeText(outimg, fpscpu, 3, 240 - 13, jevois::yuyv::White);

      // Send the output image with our processing results to the host over USB:
      outframe.send();

      // Alternate between face and object recognition:
      doobject = ! doobject;
    }

  protected:
    std::shared_ptr<Saliency> itsSaliency;
    std::shared_ptr<FaceDetector> itsFaceDetector;
    std::shared_ptr<ObjectRecognitionBase> itsObjectRecognition;
    std::shared_ptr<Kalman2D> itsKF;
    std::string itsScoresStr;
};

// Allow the module to be loaded as a shared object (.so) file:
JEVOIS_REGISTER_MODULE(DemoSalGistFaceObj)