tinyML Talks: Once-for-All: Train One Network and Specialize it for Efficient Deployment & Unsupervised collaborative learning technology at the Edge for industrial machine vendors

Date

April 28, 2020

Location

Virtual


Schedule

Timezone: PDT

Once-for-All: Train One Network and Specialize it for Efficient Deployment

Song HAN, Assistant Professor

MIT EECS

We address the challenging problem of efficient inference across many devices and resource constraints, especially on edge devices. We propose to train a once-for-all (OFA) network that supports diverse architectural settings by decoupling training and search: a specialized sub-network can be quickly obtained by selecting from the OFA network, without additional training. We also propose a novel progressive shrinking algorithm, a generalized pruning method that reduces the model across many more dimensions than conventional pruning (depth, width, kernel size, and resolution), yielding a surprisingly large number of sub-networks (> 10^19) that can fit different latency constraints. On edge devices, OFA consistently outperforms state-of-the-art NAS methods (up to 4.0% ImageNet top-1 accuracy improvement over MobileNetV3, or the same accuracy but 1.5x faster than MobileNetV3 and 2.6x faster than EfficientNet in measured latency) while reducing GPU hours and CO2 emissions by many orders of magnitude. In particular, OFA achieves a new state-of-the-art 80.0% ImageNet top-1 accuracy in the mobile setting (<600M MACs). OFA is the winning solution of the 4th Low-Power Computer Vision Challenge, in both the classification and detection tracks.
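The core idea in the abstract — train one supernet, then pick sub-networks spanning the four elastic dimensions (depth, width, kernel size, resolution) to meet a latency budget, with no retraining — can be illustrated with a minimal sketch. This is not the authors' implementation; the value sets, stage count, and the `latency_of`/`accuracy_of` predictor callbacks are illustrative assumptions standing in for OFA's measured latency tables and accuracy predictor.

```python
import random

# Illustrative elastic dimensions for the once-for-all (OFA) supernet.
# The actual OFA search space is far larger (> 10^19 sub-networks).
DEPTHS = [2, 3, 4]                        # blocks per stage
WIDTH_MULTS = [3, 4, 6]                   # expansion ratios
KERNELS = [3, 5, 7]                       # convolution kernel sizes
RESOLUTIONS = [160, 176, 192, 208, 224]   # input image sizes
NUM_STAGES = 5

def sample_subnet():
    """Randomly sample one sub-network configuration from the supernet."""
    return {
        "depths": [random.choice(DEPTHS) for _ in range(NUM_STAGES)],
        "widths": [random.choice(WIDTH_MULTS) for _ in range(NUM_STAGES)],
        "kernels": [random.choice(KERNELS) for _ in range(NUM_STAGES)],
        "resolution": random.choice(RESOLUTIONS),
    }

def search(latency_of, accuracy_of, latency_limit, n_trials=1000):
    """Select the best sampled sub-network that fits the latency budget.

    No gradient updates happen here: training and search are decoupled,
    so specialization is just configuration selection over the supernet.
    """
    best, best_acc = None, -1.0
    for _ in range(n_trials):
        cfg = sample_subnet()
        if latency_of(cfg) <= latency_limit and accuracy_of(cfg) > best_acc:
            best, best_acc = cfg, accuracy_of(cfg)
    return best
```

In the real system the random search above is replaced by a more sample-efficient strategy (e.g. evolutionary search) guided by a trained accuracy predictor and per-device latency lookup tables, but the deployment-time workflow is the same: evaluate candidate configurations, never retrain.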

Song HAN, Assistant Professor

MIT EECS

Song Han is an assistant professor in MIT EECS. He received his PhD from Stanford University. His research focuses on efficient deep learning computing. He proposed the “deep compression” technique, which can reduce neural network size by an order of magnitude without losing accuracy, and the hardware implementation “efficient inference engine,” which first exploited pruning and weight sparsity in deep learning accelerators. His recent research on hardware-aware neural architecture search and TinyML was highlighted by MIT News, Wired, and VentureBeat, and received many Low-Power Computer Vision (LPCV) contest awards. Song received Best Paper awards at ICLR’16 and FPGA’17, an Amazon Machine Learning Research Award, a SONY Faculty Award, and a Facebook Faculty Award. He was named one of MIT Technology Review’s “35 Innovators Under 35” for his contribution to the “deep compression” technique, which “lets powerful artificial intelligence (AI) programs run more efficiently on low-power mobile devices.” Song received the NSF CAREER Award for “efficient algorithms and hardware for accelerated machine learning.”

Timezone: PDT

Unsupervised collaborative learning technology at the Edge for industrial machine vendors

Alexander EROMA, Intelligence Team Lead

Octonion

An introduction to the unsupervised collaborative learning technology from Octonion SA, which allows industrial machine vendors and owners to obtain machine-efficiency insights. TinyML and TinyEdge approaches are the basic building blocks of Octonion’s technology. The presentation addresses the implementation of Octonion’s Edge pipeline, which is compatible with the Arm Cortex-M4 core.

Alexander EROMA, Intelligence Team Lead

Octonion

Alexander graduated from the Belarusian State University of Informatics and Radioelectronics as an engineer of computer systems and networks, and also holds a Master of Engineering degree in mathematical support and software of computers, complexes, and computer networks. Since 2015 he has been pursuing a Ph.D. in computer science. Alexander is the author of several scientific articles in the field of machine learning. At Octonion, he is responsible for the development of complex algorithms, high-performance code, and solution architecture.

Schedule subject to change without notice.