AlexeyAB has an XNOR version of tiny yolov3 (link attached). Are you looking into a video mode for XNOR?

https://github.com/AlexeyAB/darknet/blob/master/cfg/yolov3-tiny_xnor.cfg

Do you think XNOR is a path for improved performance over NNPACK/NEON methods?
asked Oct 7, 2018 in Programmer Questions by spinoza1791 (170 points)

1 Answer

Certainly of interest!

Looking at the related commits from AlexeyAB, I see a lot of modified CUDA code but not much modified CPU code. Since we don't run CUDA, this may take some work.

Do you know whether OpenCV DNN supports XNOR? This might be an easier path than the original (or AlexeyAB) darknet code.
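
For reference, this is what that easier path would look like if the parser accepts the cfg; a minimal sketch, assuming OpenCV >= 3.4.2 for getUnconnectedOutLayersNames(), with placeholder file names. Whether the xnor=1 layers in AlexeyAB's cfg actually load is exactly the open question:

```cpp
#include <opencv2/dnn.hpp>
#include <opencv2/imgproc.hpp>
#include <vector>

// Run one forward pass through a darknet model via OpenCV DNN and
// return the raw YOLO output blobs. File names are placeholders.
std::vector<cv::Mat> runDarknetModel(const cv::Mat& frame)
{
    // readNetFromDarknet() is OpenCV's standard entry point for darknet
    // cfg/weights pairs; it will throw if it hits an unsupported layer.
    static cv::dnn::Net net = cv::dnn::readNetFromDarknet(
        "yolov3-tiny_xnor.cfg", "yolov3-tiny_xnor.weights");

    // Standard yolo preprocessing: scale to [0,1], resize to the net size.
    cv::Mat blob = cv::dnn::blobFromImage(frame, 1.0 / 255.0,
                                          cv::Size(416, 416), cv::Scalar(),
                                          /* swapRB = */ true, /* crop = */ false);
    net.setInput(blob);
    std::vector<cv::Mat> outs;
    net.forward(outs, net.getUnconnectedOutLayersNames());
    return outs;
}
```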
answered Oct 11, 2018 by JeVois (46,580 points)
Yes, if quantized/XNOR support were bundled into OpenCV DNN, that would be slick!  I have not seen it yet.

However, I know that TensorFlow Lite (mobilenet V2 quantized) can work on a Raspberry Pi (similar CPU arch to JeVois, right?): https://www.tensorflow.org/lite/rpi
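
"Work" here just means the usual TFLite inference loop; a minimal sketch, assuming a quantized uint8 mobilenet model (file name is a placeholder, and header paths moved from tensorflow/contrib/lite/ to tensorflow/lite/ across TF versions):

```cpp
#include "tensorflow/lite/interpreter.h"
#include "tensorflow/lite/kernels/register.h"
#include "tensorflow/lite/model.h"
#include <algorithm>
#include <cstdint>
#include <cstring>
#include <memory>

// Classify one 224x224x3 uint8 RGB image with a quantized mobilenet;
// returns the index of the top-scoring ImageNet class (1001 outputs).
int classify(const uint8_t* rgb224)
{
    auto model = tflite::FlatBufferModel::BuildFromFile("mobilenet_v2_quant.tflite");
    tflite::ops::builtin::BuiltinOpResolver resolver;
    std::unique_ptr<tflite::Interpreter> interpreter;
    tflite::InterpreterBuilder(*model, resolver)(&interpreter);
    interpreter->AllocateTensors();

    // Quantized model: input and output tensors are uint8, no float conversion.
    std::memcpy(interpreter->typed_input_tensor<uint8_t>(0), rgb224, 224 * 224 * 3);
    interpreter->Invoke();
    const uint8_t* scores = interpreter->typed_output_tensor<uint8_t>(0);
    return static_cast<int>(std::max_element(scores, scores + 1001) - scores);
}
```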

Also, here is an even lighter yolo version from Alexey that is meant for CPU and GPU, so CUDA would only be needed for training with Nvidia; inference can use the CPU: https://github.com/AlexeyAB/yolo2_light

I doubt quantized mobilenetv2 with TF would be any faster than tiny-yolo with NNPACK for NEON, but I haven't tested.  I am thinking a DarkFlow implementation of TF Lite would be interesting...

Here is an example of an optimized NNPACK fork (40% faster than the original; I've confirmed this on a Pi) with an interesting (slower) option to use the Pi GPU/QPU: https://github.com/shizukachan/darknet-nnpack

~Andrew
Also, I received this reply from AlexeyAB via GitHub:

Re: [AlexeyAB/darknet] Will XNOR yolo work on ARMv7? (#1751)

Alexey replied:

I didn't test XNOR on an ARM CPU. Also, XNOR currently doesn't use SIMD instructions on ARM, since the XNOR implementation is optimized for AVX2 instructions on Intel CPUs.

It should work on ARM if you use the GCC compiler with the built-in popcnt instruction: https://github.com/AlexeyAB/darknet/blob/7ee4135910624f11e80de36b236208b223f58eb4/src/gemm.c#L1640
But you should compile with OPENMP=1 AVX=0 in the Makefile.
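
To make concrete what that gemm.c line is doing, here is a minimal sketch of the XNOR + popcount inner product (an illustration of the idea, not the darknet code itself; the function name and packing layout are invented for the example). With weights and activations binarized to +/-1 and packed 64 bits per word, a dot product becomes an XNOR followed by a population count; __builtin_popcountll is the GCC builtin Alexey refers to, and on ARM it lowers to NEON vcnt sequences:

```cpp
#include <cstdint>

// Dot product of two +/-1 vectors of length 64 * nwords, each packed as
// bit vectors (bit set = +1, bit clear = -1).
// Build as Alexey suggests: make OPENMP=1 AVX=0
int xnor_dot(const uint64_t* a, const uint64_t* b, int nwords)
{
    int matches = 0;
    for (int i = 0; i < nwords; ++i)
        matches += __builtin_popcountll(~(a[i] ^ b[i]));  // bits where signs agree

    // dot = (#matches) - (#mismatches) = 2 * matches - total bits
    return 2 * matches - 64 * nwords;
}
```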
Excellent, thanks much, we should be able to make it work! Realistically, next week is shot because of ARM TechCon, but this will be next on our todo list after that. Do you have pretrained weights we could use for testing? That would significantly speed up the process.

We already have support for mobilenets v2 in TensorFlowEasy; see the commented-out entries in its params.cfg at http://jevois.org/moddoc/TensorFlowEasy/modinfo.html

But we do not have the quantized weights, and mobilenet v2 with float weights is slower than quantized mobilenet v1, which is why v1 is still the one enabled by default. But remember that these are a different kind of network (recognition only, as opposed to detection plus recognition in yolo). If you have pointers to the quantized mobilenet v2 weights for ImageNet, that would be great too; we could add them to our standard distro.
...