Module Documentation

Run the visual saliency algorithm to find the most interesting location in the field of view. Then extract a square image region around that point. On alternating frames, either

attempt to detect a face in the attended region, and, if positively detected, show the face in the bottom-right corner of the display. The last detected face will remain shown in the bottom-right corner of the display until a new face is detected.
or attempt to recognize an object in the attended region, using a deep neural network. The default network is a handwritten digit recognition network that replicates the original LeNet by Yann LeCun, one of the very first convolutional neural networks. The network has been trained on the standard MNIST database of handwritten digits, and achieves over 99% correct recognition on the MNIST test dataset. When a digit is positively identified, a picture of it appears near the last detected face towards the bottom-right corner of the display, and a text string with the identified digit appears to the left of that picture.
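As a rough illustration of the two mechanics described above (cutting a square region around the attended point and alternating between the two tasks), here is a minimal C++17 sketch. `SquareRoi`, `squareAround`, `Task`, and `taskForFrame` are hypothetical names introduced for this example, and the choice of which frame parity runs which task is an assumption; the module's actual implementation may differ.

```cpp
#include <algorithm>

// Hypothetical helper type: a square window in pixel coordinates (top-left corner plus side length).
struct SquareRoi { int x; int y; int size; };

// Center a square of side `size` on the attended point (cx, cy), then shift it as needed
// so that it stays fully inside a width x height image.
inline SquareRoi squareAround(int cx, int cy, int size, int width, int height)
{
  size = std::min({size, width, height});                     // the square cannot exceed the image
  int const x = std::clamp(cx - size / 2, 0, width - size);   // keep left/right edges inside
  int const y = std::clamp(cy - size / 2, 0, height - size);  // keep top/bottom edges inside
  return SquareRoi{x, y, size};
}

// Hypothetical scheduler for the alternation: even frames attempt face detection,
// odd frames attempt object recognition (the parity is an assumption).
enum class Task { FaceDetection, ObjectRecognition };

inline Task taskForFrame(unsigned long frame)
{
  return (frame % 2 == 0) ? Task::FaceDetection : Task::ObjectRecognition;
}
```

Shifting the window rather than shrinking it keeps the region the same size on every frame, which is convenient when the region feeds a network with a fixed input size.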

Serial Messages

This module can send standardized serial messages as described in "Standardized serial messages formatting". All coordinates and sizes are converted using the "Helper functions to convert coordinates from camera resolution to standardized". One message is issued on every video frame for the temporally filtered attended (most salient) location (green circle in the video display):

Serial message type: 2D
id: always sm (shorthand for saliency map)
x, y: standardized 2D coordinates of temporally-filtered most salient point
w, h: standardized size of the pink square box around each attended point
extra: none (empty string)
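On the receiving side, the fields listed above could be read with a small parser. The sketch below assumes the message arrives as one whitespace-separated text line of the form `N2 sm x y w h`; that exact layout is an assumption here and should be checked against the serial message formatting documentation. `SaliencyMessage` and `parseSaliencyMessage` are hypothetical names.

```cpp
#include <optional>
#include <sstream>
#include <string>

// Hypothetical container for one 2D saliency message, in standardized units.
struct SaliencyMessage { double x; double y; double w; double h; };

// Parse a line assumed to look like "N2 sm x y w h".
// Returns std::nullopt when the line is malformed or is not a saliency ("sm") message.
inline std::optional<SaliencyMessage> parseSaliencyMessage(std::string const & line)
{
  std::istringstream in(line);
  std::string type, id;
  SaliencyMessage m{};
  if (!(in >> type >> id >> m.x >> m.y >> m.w >> m.h)) return std::nullopt;
  if (type != "N2" || id != "sm") return std::nullopt;  // assumed message type tag and id
  return m;
}
```

A caller would typically read the serial port line by line and skip any line for which the function returns an empty optional.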

See Standardized serial messages formatting for more on standardized serial messages, and Helper functions to convert coordinates from camera resolution to standardized for more info on standardized coordinates.

| Parameter | Type | Description | Default | Valid Values |
| (FaceDetector) face_cascade | std::string | File name of the face cascade | JEVOIS_SHARE_PATH /facedetector/haarcascade_frontalface_alt.xml | |
| (FaceDetector) eye_cascade | std::string | File name of the eye cascade, or empty to not detect eyes | JEVOIS_SHARE_PATH /facedetector/haarcascade_eye_tree_eyeglasses.xml | |
| (Kalman2D) usevel | bool | Use velocity tracking, in addition to position | false | |
| (Kalman2D) procnoise | float | Process noise standard deviation | 0.003F | |
| (Kalman2D) measnoise | float | Measurement noise standard deviation | 0.05F | |
| (Kalman2D) postnoise | float | A posteriori error estimate standard deviation | 0.3F | |
| (Saliency) cweight | byte | Color channel weight | 255 | |
| (Saliency) iweight | byte | Intensity channel weight | 255 | |
| (Saliency) oweight | byte | Orientation channel weight | 255 | |
| (Saliency) fweight | byte | Flicker channel weight | 255 | |
| (Saliency) mweight | byte | Motion channel weight | 255 | |
| (Saliency) centermin | size_t | Lowest (finest) of the 3 center scales | | |
| (Saliency) deltamin | size_t | Lowest (finest) of the 2 center-surround delta scales | | |
| (Saliency) smscale | size_t | Scale of the saliency map | | |
| (Saliency) mthresh | byte | Motion threshold | | |
| (Saliency) fthresh | byte | Flicker threshold | | |
| (Saliency) msflick | bool | Use multiscale flicker computation | false | |

Detailed docs: DemoSalGistFaceObj

License: GPL v3

Distribution: Unrestricted

Restrictions: None

Support URL: http://jevois.org/doc

Other URL: http://iLab.usc.edu

Address: University of Southern California, HNB-07A, 3641 Watt Way, Los Angeles, CA 90089-2520, USA