tinyML Talks: Cracking a 600 million year old secret to fit computer vision on the edge


July 6, 2021






Timezone: PDT

Cracking a 600 million year old secret to fit computer vision on the edge

Shivy YOHANANDAN, Co-founder and Chief Technology Officer


The ultimate goal of AI IoT is to sense our surroundings and respond in real time, so that we can be more selective about how we use and manage our limited resources, which reduces both business and environmental costs. But the big problem with AI IoT is a paradox: current AI uses more energy to process IoT data than the energy it is trying to save. The main cause is expensive algorithm families like YOLO, SSD, R-CNN, and their derivatives, which account for most of the computer vision algorithms in use today!

YOLOs and SSDs do object detection (a staple of most computer vision) by shrinking the full-resolution image to 416×416 or 300×300 and then doing both localization and classification on this shrunken image. But you have now lost over 95% of the information in the original image, which is why accuracy, robustness and generalizability seem to be poor, especially when trying to scale across many IoT sensors (e.g. cameras). In addition to this inherent design flaw, these models are huge and computationally expensive, which is why everyone is trying to shrink them to fit on the edge. However, shrinking often loses even more accuracy on a model that was already inaccurate to begin with!
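To make the downscaling figure concrete, here is a minimal sketch that counts how many source pixels are discarded when a frame is resized to the detector input sizes named above. The camera resolutions (1080p and 4K) are assumptions chosen for illustration, not figures from the talk, and pixel count is only a rough proxy for information.

```python
# Assumed camera resolutions (illustrative only, not from the talk).
SOURCES = {"1080p": (1920, 1080), "4K": (3840, 2160)}
# Detector input sizes quoted in the abstract.
INPUTS = {"416x416": (416, 416), "300x300": (300, 300)}


def pixel_fraction_lost(src_w, src_h, dst_w, dst_h):
    """Fraction of source pixels with no counterpart after resizing."""
    return 1.0 - (dst_w * dst_h) / (src_w * src_h)


def loss_table():
    """Map (source name, input name) to the fraction of pixels discarded."""
    return {
        (src, dst): pixel_fraction_lost(*SOURCES[src], *INPUTS[dst])
        for src in SOURCES
        for dst in INPUTS
    }


if __name__ == "__main__":
    for (src, dst), lost in loss_table().items():
        print(f"{src} -> {dst}: {lost:.1%} of pixels discarded")
```

Under these assumptions the loss ranges from roughly 92% (1080p to 416×416) to about 99% (4K to 300×300), so the exact figure depends on the camera resolution and the input size.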

Xailient solved this problem by cracking a 600-million-year-old secret in biological vision: selective attention and salience. The secret mechanism shows us how to split object detection into two separate models: detection and classification. This results in Xailient’s detector being only 44 KB, 5000x smaller than YOLO! You can then use your own flavor of classifier to process each detected ROI one by one, except now using a crop from the original image, which preserves more information for better accuracy. So we have solved both model size and accuracy in one hit!
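The detect-then-classify split described above can be sketched in a few lines. This is a toy illustration only, not Xailient's detector or API: `toy_detector` and `toy_classifier` are hypothetical stand-ins that threshold pixel brightness, and images are plain lists of grayscale rows. The point it shows is that the detector runs on a downscaled copy while the classifier receives a crop taken from the original full-resolution image.

```python
def downscale(image, factor):
    """Nearest-neighbour downscale: keep every `factor`-th pixel."""
    return [row[::factor] for row in image[::factor]]


def toy_detector(small, threshold=128):
    """Hypothetical stand-in detector.

    Returns one box (x0, y0, x1, y1), exclusive on x1/y1, around all
    pixels at or above `threshold`, or None if there are none.
    """
    hits = [(x, y) for y, row in enumerate(small)
            for x, value in enumerate(row) if value >= threshold]
    if not hits:
        return None
    xs = [x for x, _ in hits]
    ys = [y for _, y in hits]
    return (min(xs), min(ys), max(xs) + 1, max(ys) + 1)


def scale_box(box, factor):
    """Map a box from low-resolution to original image coordinates."""
    return tuple(coord * factor for coord in box)


def crop(image, box):
    """Cut the region of interest out of the original image."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


def toy_classifier(patch):
    """Hypothetical stand-in classifier: label by mean brightness."""
    pixels = [value for row in patch for value in row]
    return "bright" if sum(pixels) / len(pixels) >= 128 else "dark"


def detect_then_classify(image, factor=4):
    """Detect on a small copy, then classify a full-resolution crop."""
    box = toy_detector(downscale(image, factor))
    if box is None:
        return []
    full_box = scale_box(box, factor)
    patch = crop(image, full_box)
    # Report the box, the crop's width and height, and the label.
    return [(full_box, len(patch[0]), len(patch), toy_classifier(patch))]
```

For a 64×64 frame with a 16×16 bright square and `factor=4`, the detector sees the object as only 4×4 pixels, while the classifier receives the full 16×16 crop. A real system would return several ROIs and use trained models for both stages.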

This allows Xailient to fit object detection on ultra-low-power devices, which is exactly what we need to break the paradox above. We have now built a platform that gives everyone easy access to this new kind of computer vision, which is much more efficient and accurate and does not even need model compression! In this talk I will share some example use cases of ultra-low-power, aware AI IoT.

Shivy YOHANANDAN, Co-founder and Chief Technology Officer


Dr. Shivy Yohanandan is the co-founder and Chief Technology Officer at Xailient, the computer vision platform that is revolutionizing Artificial Intelligence by teaching algorithms how to process images and video like humans! He holds a PhD in Artificial Intelligence and Computer Science but started his career as a Neuroscientist and Bioengineer from the University of Melbourne. Passionate about vision, Shivy spent 4 years bringing vision to the blind by helping build Australia’s first bionic eye as a Research Engineer. Then, during his PhD, he made a breakthrough in Neuroscience, discovering for the first time the precise mechanism behind a 600-million-year-old secret of how animals process vast amounts of visual information very efficiently. Recognising how inefficiently modern AI processes images and video, he mapped nature’s secret vision formulae into algorithms that now form the core of Xailient’s visual AI, which is used by companies across the world, such as Sony. Previously, Shivy worked for 3 years as a research scientist at IBM Research on AI for healthcare, including computer vision in medical imaging and building a brain-machine interface to decode brainwaves for controlling a robotic arm.

Schedule subject to change without notice.