Darknet is a popular neural network framework. This module first finds the most conspicuous (salient) object in the scene, then identifies it using a deep neural network, and returns the top-scoring candidates.
This module runs a Darknet network on an image window around the most salient point and shows the top-scoring results. The network is currently a bit slow, hence it is only run once in a while. Point your camera towards some interesting object, and wait for Darknet to tell you what it found.
Note that by default this module runs the Imagenet1k tiny Darknet (it can also run the slightly slower but a bit more accurate Darknet Reference network; see parameters). There are 1000 different kinds of objects (object classes) that this network can recognize (too long to list here).
Sometimes it will make mistakes! The performance of darknet-tiny is about 58.7% correct (mean average precision) on the test set, and Darknet Reference is about 61.1% correct on the test set. This is when running these networks at 224x224 network input resolution (see parameter netin).
Neural network size and speed
When using networks that are fully convolutional (as is the case for the default networks provided with this module), one can resize the network to any desired input size. The network size directly affects both speed and accuracy: larger networks run slower but are more accurate.
This module provides two parameters that allow you to adjust this tradeoff:
foa determines the size of a region of interest that is cropped around the most salient location
netin determines the size to which that region of interest is rescaled and fed to the neural network
with netin = (224 224), this module runs at about 450ms/prediction.
with netin = (128 128), this module runs at about 180ms/prediction.
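Since the default networks are fully convolutional, inference cost is expected to scale roughly with the number of input pixels, which is broadly consistent with the timings above. A back-of-the-envelope check (plain arithmetic on the numbers quoted above, nothing module-specific):

```python
# For a fully convolutional network, runtime should scale roughly
# linearly with the number of input pixels.
area_ratio = (224 * 224) / (128 * 128)  # pixel-count ratio, ~3.06
measured_ratio = 450 / 180              # measured ms/prediction ratio, 2.5

print(f"area ratio {area_ratio:.2f} vs measured speedup {measured_ratio:.2f}")
```

The measured speedup is somewhat below the pixel-count ratio, as expected: per-frame overhead (capture, saliency computation, display) does not shrink when netin shrinks.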
Finally, note that when using video mappings with USB output, irrespective of foa and netin, the crop around the most salient image region (with size given by foa) will always be rescaled so that, when placed to the right of the input image, it fills the desired USB output dims. For example, if camera mode is 320x240 and USB output size is 544x240, then the attended and recognized object will be rescaled to 224x224 (since 224 = 544 - 320) for display purposes only. This way, one does not need to change the USB video resolution while experimenting live with different values of foa and netin.
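The display-size arithmetic described above can be sketched as a small helper (illustrative only, not part of the module's actual API; the square output matches the 544x240 example in the text):

```python
def display_crop_size(cam_w: int, usb_w: int) -> tuple:
    """Size to which the attended crop is rescaled for USB display.

    The crop is placed to the right of the camera image, so it gets the
    leftover width; it is drawn square here, matching the example where
    a 544x240 output with a 320x240 camera yields 224x224 (224 = 544 - 320).
    """
    side = usb_w - cam_w  # leftover width next to the camera image
    return (side, side)

print(display_crop_size(320, 544))  # -> (224, 224)
```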
On every frame where detection results were obtained, this module sends a message
T2 x y
where x and y are the standardized 2D coordinates of the salient region of interest in which the object was found. The T2 message is a standardized message about the location of that region. The message can be customized; see Standardized serial messages formatting.
In addition, when detections are found that are above threshold, up to top messages will be sent, one for each category candidate that scored above thresh:
DKR category score
where category is the category name (from namefile) and score is the confidence score, from 0.0 to 100.0.
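On the host side, these messages could be parsed as sketched below. This assumes one ASCII message per line on the serial port and the default formats shown above; the parser itself is illustrative and not part of JeVois:

```python
def parse_message(line: str):
    """Parse a "T2 x y" or "DKR category score" line into (type, payload).

    Returns None for lines in neither format. Category names are joined
    with spaces, since some ImageNet class names contain spaces.
    """
    tok = line.strip().split()
    if len(tok) == 3 and tok[0] == "T2":
        # Standardized 2D location of the attended region
        return ("T2", {"x": float(tok[1]), "y": float(tok[2])})
    if len(tok) >= 3 and tok[0] == "DKR":
        # Recognized category candidate with confidence in 0.0 .. 100.0
        return ("DKR", {"category": " ".join(tok[1:-1]),
                        "score": float(tok[-1])})
    return None

print(parse_message("DKR water bottle 63.7"))
```

A real host script would read lines from the serial device (e.g. with pySerial) and feed each one to this function.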
Width and height (in pixels) of the focus of attention. This is the size of the image crop that is taken around the most salient location in each frame. The foa size must fit within the camera input frame size.
Width and height (in pixels) of the neural network input layer. This is the size to which the image crop taken around the most salient location in each frame will be rescaled before feeding to the neural network.
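The interaction between foa and netin can be sketched as a crop-then-rescale step. The helper below is illustrative pseudo-pipeline code, not the module's actual implementation; the clamping keeps the foa window inside the camera frame, as required above:

```python
def foa_crop_rect(salient_x: int, salient_y: int,
                  foa_w: int, foa_h: int,
                  frame_w: int, frame_h: int):
    """Top-left corner of a foa_w x foa_h crop centered (as much as
    possible) on the salient point, clamped to stay inside the frame.

    The resulting crop would then be rescaled to the netin size before
    being fed to the network (rescaling itself not shown here).
    """
    x = min(max(salient_x - foa_w // 2, 0), frame_w - foa_w)
    y = min(max(salient_y - foa_h // 2, 0), frame_h - foa_h)
    return (x, y)

# Salient point near the frame border: the crop is shifted to fit.
print(foa_crop_rect(10, 120, 128, 128, 320, 240))  # -> (0, 56)
```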