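The main limitation of Python is that it does not support true multithreading: in the standard CPython interpreter, the global interpreter lock (GIL) lets only one thread execute Python code at a time. So your Python code is more or less stuck on one core of JeVois, when in C++ you could easily use all 4 by launching threads with std::async().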
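To make the single-core point concrete, here is a minimal sketch that runs the same CPU-bound, pure-Python job four times, first one after the other and then in four threads. The function name busy_sum and the job size are made up for illustration, and this is plain Python rather than anything JeVois-specific. On a standard CPython build you would expect the two timings to come out roughly the same, since the threads take turns rather than run in parallel; exact numbers will vary by machine.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def busy_sum(n):
    """CPU-bound work written in pure Python: sum of squares below n."""
    total = 0
    for i in range(n):
        total += i * i
    return total


N = 200_000          # illustrative job size (assumption, tune as needed)
JOBS = [N] * 4       # four identical jobs, one per core you might hope to use

# Run the four jobs one after the other.
t0 = time.perf_counter()
serial = [busy_sum(n) for n in JOBS]
t1 = time.perf_counter()

# Run the same four jobs in four threads.
with ThreadPoolExecutor(max_workers=4) as pool:
    threaded = list(pool.map(busy_sum, JOBS))
t2 = time.perf_counter()

print(f"serial:   {(t1 - t0) * 1e3:.1f} ms")
print(f"threaded: {(t2 - t1) * 1e3:.1f} ms")
```

The results are identical either way; only the wall-clock time tells you whether the threads actually ran in parallel.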
Having said that, you should also be aware that very little machine vision code is actually implemented in Python. Even OpenCV is all C++ under the hood: when you use OpenCV from Python, you are just calling the underlying C++ implementation. For this reason, some OpenCV functions may still be highly efficient and multithreaded even when called from Python. So for pure OpenCV code there is usually not much speed difference between Python and C++. You would see big differences if, for example, you wrote a loop over all pixels of an image and did something in that loop; such a loop may run much slower in Python than in C++. But as long as you only use image-level functions (for example, adding two images), the speed difference is negligible, since the same optimized C++ code for matrix addition is executed in both cases.
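Here is a rough sketch of the pixel-loop versus image-level difference described above. It uses NumPy as a stand-in for an image-level OpenCV call (OpenCV images in Python are NumPy arrays); the function names, image size, and random test images are assumptions for illustration only. Pixel values are kept below 100 so the 8-bit sums cannot overflow.

```python
import time
import numpy as np


def add_pixelwise(a, b):
    """Add two images with an explicit Python loop over every pixel."""
    out = np.empty_like(a)
    rows, cols = a.shape
    for y in range(rows):
        for x in range(cols):
            out[y, x] = a[y, x] + b[y, x]
    return out


def add_imagelevel(a, b):
    """Add two images with one whole-array call that runs in compiled code."""
    return np.add(a, b)


# Two small synthetic grayscale images (values < 100, so sums fit in uint8).
rng = np.random.default_rng(0)
img_a = rng.integers(0, 100, size=(120, 160), dtype=np.uint8)
img_b = rng.integers(0, 100, size=(120, 160), dtype=np.uint8)

t0 = time.perf_counter()
slow = add_pixelwise(img_a, img_b)
t1 = time.perf_counter()
fast = add_imagelevel(img_a, img_b)
t2 = time.perf_counter()

print(f"pixel loop:  {(t1 - t0) * 1e3:.2f} ms")
print(f"image-level: {(t2 - t1) * 1e3:.2f} ms")
```

Both functions produce the same image; the loop version just pays the Python interpreter overhead once per pixel, while the image-level version pays it once per image.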
As for TensorFlow and the 83 fps figure: that is for the smallest quantized variant of MobileNet V1, which runs at about 83 fps. If you can cast your problem as something a CNN can solve, then you can train your own MobileNet and run it on JeVois. See here:
http://jevois.org/tutorials/UserTensorFlowTraining.html