tinyML Summit 2019

March 20-21, 2019

Inaugural tinyML Summit

Join us for the inaugural tinyML Summit!

Venue

Google - Building 111

111 Java Drive, Sunnyvale, CA 94089

Contact us

Bette COOPER

Schedule

7:30 am to 8:30 am

Registration and Breakfast

8:30 am to 8:45 am

Welcome & Opening Remarks

Evgeni GOUSEV, Senior Director, Qualcomm Research

8:45 am to 10:30 am

Hardware and Architectures

Session Moderator: Ian BRATT, Distinguished Engineer & Fellow, Arm

From Small to Tiny: How to Co-design ML Models, Computational Precision and Circuits in the Energy-Accuracy Trade-off Space

Marian VERHELST, Associate Professor, KU Leuven

Abstract (English)

Deep narrow networks, shallow wide networks, low precision networks, and even binary networks: NN model designers have so many degrees of freedom. Yet chip and circuit architects also have many design options: from MAC-centric streaming architectures to multi-level memory hierarchies, and from variable-precision digital processing, through binary compute units, to analog or even in-memory processing. To achieve truly tiny ML, one has to make smart design decisions across this complete algorithm-architecture stack, while judging design decisions on their impact at the final system and application level metrics.

This talk will show design options at these different layers, and assess their impact. Subsequently, we will discuss how to achieve the required cross-layer trade-offs in a methodological way. This will allow us to gain insights from such co-optimization, illustrated with resulting state-of-the-art implementations. Finally, it is very important to judge all optimizations at the system and application level, which is the only one that really matters to the user. The impact on these system level metrics will be assessed in light of two applications: always-on face recognition and keyword spotting.

The Next Level of Energy-efficient Edge Computing

Simon CRASKE, Lead Embedded Architect and Fellow, ARM

Abstract (English)

Arm and its partners continue to see ever increasing demands for at-the-far-edge computation, and a drive for increased signal processing and machine learning capabilities within microcontroller-based systems. In this presentation we will introduce the recently announced Armv8.1-M architecture, and its new M-Profile Vector Extension (MVE), which forms the foundation of Arm Helium™ technology. MVE delivers single-instruction multiple-data (SIMD) capabilities, more commonly associated with application type processors, but deployed in such a way as to retain the microcontroller fundamentals of low interrupt latency, low gate count, and very high energy efficiency. Yielding up to 5x the DSP and 15x the ML computation capability of a standard microcontroller, when combined with Arm TrustZone technology, Armv8.1-M delivers on the three key technologies for the next wave of embedded applications: System-wide security, DSP for signal conditioning and ML for decision making. The addition of MVE to the M-Profile architecture and future Cortex-M products enables the unification of efficient compute and embedded control onto a single processor architecture, removing the need for software writers to comprehend disparate toolchains, and offering increased portability of computational libraries between systems; as a result, this presentation aims to expose software writers and system designers to Arm’s next level of energy-efficient edge computing.

Ultra Low Power Inference at the Very Edge of the Network

Eric FLAMAND, Cofounder and CTO, GreenWaves Technologies

Abstract (English)

Large scale deployment and operation of smart data-sensors at the very edge of the network is feasible, for cost and scalability reasons, if we assume over the air (OTA) connectivity as well as battery operation. Deep-learning based approaches enable us to efficiently solve network bandwidth issues, since the huge amount of sampled raw data can be reduced at the edge to high-quality data requiring very limited bandwidth on the network side. This reduction comes with a computational cost. In order to limit, as much as possible, the amount of energy spent to support this computational complexity, we must look at every possible source of optimization when designing a chip for mW class inference. This is the goal of Gap8: combining many hardware architectural and implementation optimizations, together with tool driven optimizations, in order to keep memory traffic and computing activity as low as possible. In this talk we will go through the various mechanisms we have been using, from core ISA extensions, lightweight vectorization, parallelization, and agile and hierarchical power management to tool assisted explicit memory management. We will demonstrate that with this approach we can run small to medium complexity networks, as well as pre- and post-processing steps, within a budget of a couple of mW.

What Can In-memory Computing Deliver, and What Are the Barriers?

Naveen VERMA, Director of Keller Center and Professor of Electrical Engineering, Princeton University

Abstract (English)

Inference based on deep-learning models is being employed pervasively in applications today. In many such applications, state-of-the-art models can easily push the platforms to their limits of performance and energy efficiency. To address this, digital acceleration has been widely exploited. But deep-learning computations exhibit critical attributes that limit the gains achievable by traditional digital acceleration. In particular, computations are dominated by high-dimensionality matrix-vector multiplications (MVMs), where the precision requirements of elements have been decreasing (from FP32 a few years ago, to INT8/4/2 now and in the future). In this scenario, in-memory computing (IMC) offers distinct advantages, which have been demonstrated through recent prototypes leading to roughly 10x higher energy efficiency and area-normalized throughput, compared to optimized digital accelerators. This arises from the structural alignment of dense 2D memory arrays with the dataflow of MVMs. While digital spatial architectures (e.g., systolic arrays) also exploit this, IMC can do so more aggressively, minimizing data movement and amortizing compute into highly-efficient, highly-parallel analog operations. But IMC also raises critical challenges at each level (the need for analog compute at the circuit level, the need for high bandwidth hardware infrastructure at the architectural level, and constrained configurability/virtualization at the software-mapping level). Recent research advances have shown remarkable promise in addressing many of these challenges, making IMC more of a reality than ever. These advances, their potential implications, and key questions remaining will be reviewed.
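As a rough illustration of the reduced-precision matrix-vector multiplications mentioned in this abstract, the sketch below quantizes a small weight matrix and input vector to INT8, accumulates in integers, and rescales the result. The function names, the symmetric per-tensor scaling scheme, and the example values are assumptions made for illustration; they are not taken from the talk.

```python
def quantize(values, num_bits=8):
    """Symmetric per-tensor quantization: returns (integers, scale)."""
    qmax = 2 ** (num_bits - 1) - 1              # 127 for INT8
    max_abs = max(abs(v) for v in values) or 1.0
    scale = max_abs / qmax
    return [round(v / scale) for v in values], scale


def int_mvm(matrix, vector, num_bits=8):
    """Matrix-vector multiply using integer multiply-accumulate only."""
    flat = [w for row in matrix for w in row]
    q_flat, w_scale = quantize(flat, num_bits)
    q_vec, x_scale = quantize(vector, num_bits)
    cols = len(vector)
    out = []
    for r in range(len(matrix)):
        q_row = q_flat[r * cols:(r + 1) * cols]
        acc = sum(qw * qx for qw, qx in zip(q_row, q_vec))  # integer accumulate
        out.append(acc * w_scale * x_scale)                  # rescale to real units
    return out


# Hypothetical example values.
weights = [[0.5, -1.0, 0.25], [1.0, 0.75, -0.5]]
inputs = [1.0, 0.5, -2.0]
exact = [sum(w * x for w, x in zip(row, inputs)) for row in weights]
approx = int_mvm(weights, inputs)
```

Comparing `approx` with `exact` shows the small error introduced by 8-bit quantization; passing a lower `num_bits` makes the accuracy cost of narrower precision visible.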

10:30 am to 11:00 am

Break

11:00 am to 12:40 pm

Systems and Algorithms

Session Moderator: Boris MURMANN, Professor of Electrical Engineering, Stanford University

Better Learning Through Specialization

William CHAPPELL, CTO, Azure Global

Abstract (English)

At the edge, where sensor data is collected and physical interactions are mediated, we often lack the infrastructure necessary to support large scale computationally intensive machine learning. For these power-starved remote scenarios, DARPA/MTO is strategically investing in future technologies that enable ultra-high efficiency computing and real-time decision making regardless of whether a sensor is connected back to the cloud or is in a more autonomous deployment. Our projects include nanowatt-class “wake-up” circuits for near-passive, high discrimination environmental unattended sensing and spectrum sharing algorithms that exploit machine learning to most efficiently divide available RF bandwidth among competing users without the need for prescribed standards. To complement these technologies, MTO is concurrently developing the next wave of specialized machine learning processors that overcome the “memory wall” limitation inherent in existing architectures, resulting in upwards of a 70X improvement in the energy*execution time metric. Together with its commercial, academic and defense industry partners, DARPA is charting a path towards effective yet efficient machine learning hardware at the edge.

Ultra-low Power Always-On Computer Vision

Edwin PARK, Principal Engineer, QUALCOMM Inc

Abstract (English)

Qualcomm’s productized Always-on Computer Vision Module provides computer vision at <1 mW, making it the lowest-power production module on the market today. This average power includes the sensor, power management, the digital components, algorithms, and SW. We reached this level of power and functionality by innovating from the ground up on all fronts. From the sensor, custom ASIC components, and architecture to the algorithms, SW, and custom trainable models, we optimized every component to minimize power and cost. These achievements enable our customers to design products with intelligent always-on vision, which was not previously available to them, with customized CV models specific to their use cases. The power envelope permits the emergence of new categories of IoT devices and the augmentation of many of the devices that you know. The talk will present Qualcomm’s chip and system, use cases enabled by this device, and many of the components of this system.

Machine Learning Accelerators for IoT² Devices

Dennis SYLVESTER, Professor, University of Michigan

Abstract (English)

Per Bell’s Law, the next class of computing systems will be the Internet of Tiny Things (IoT²), which extends current IoT devices to smaller form factors and greater ubiquity. To this end, over the past decade we have developed the Michigan Micro Mote (M³) for mm-scale wireless sensing applications. M³ is an ideal platform for implementing intelligence in IoT² devices. One main application area we have focused on is audio sensing. We have developed ultra-low power always-on wakeup devices that can detect various objects in the environment and, more recently, the lowest power human voice activity detection reported to date. This work is currently being extended to keyword spotting and natural language processing, at unprecedented power levels. These approaches harness lightweight neural networks for accurate classification within nanowatt power budgets.

Next Frontier in CMOS Imaging – Always ON Sensing

Charles CHONG, Dir of Strategic Marketing for N America, PixArt Imaging

Abstract (English)

Optical imagers have progressed tremendously over the last two decades, from film to CCDs to the now ubiquitous CMOS sensor that almost every one of us carries in our smartphones. Over the last decade, CMOS in consumer devices drove technological advancements from lower noise and pixel shrink to BSI. As a result, cameras in the consumer space benefit from higher resolution, in line with the needs of smartphones. However, there is now a new frontier that we should not overlook: the need for ultra-low power imagers for Always ON sensing. Thinking back, this is why the low-power advantage of CMOS allowed it to outpace CCDs in the first place.

The development of CMOS imaging has also brought tremendous benefits in implementing on-chip SoCs for image analytics. With the emergence of smart sensors in applications that involve detection, CMOS sensors can now be more widely used in non-“Capture and Display” applications. In the discussion, I will share my thoughts about what is sure to become a reality in Always ON sensing.

Efficient Voice Trigger Detection for Low Resource Hardware

Kisun YOU, Engineering Manager, Siri Speech at Apple

Abstract (English)

This talk will describe the architecture of an always-on keyword spotting (KWS) system for battery-powered mobile devices used to initiate an interaction with the device. An always-available voice assistant needs a carefully designed voice keyword detector to satisfy the power and computational constraints of battery powered devices. We employ a multi-stage system that uses a low-power always-on primary stage to decide when to turn on a main processor and run a more accurate (but more power-hungry) secondary detector or checker. We describe a straightforward (DNN/HMM) primary detector and explore variations that result in very useful reductions in computation (or increased accuracy for the same computation). By reducing the set of target labels from three to one per phone, and reducing the rate at which the acoustic model is operated, the compute rate can be reduced by a factor of six while maintaining the same accuracy.
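The multi-stage arrangement described in this abstract can be sketched as a simple cascade: a cheap first stage scores every frame, and the costlier second stage runs only when the first stage fires. This is a minimal sketch under stated assumptions. The function `run_cascade`, both thresholds, and the toy score pairs are hypothetical stand-ins, not the detectors or parameters from the talk.

```python
def run_cascade(frames, primary_score, secondary_score,
                primary_threshold=0.5, secondary_threshold=0.9):
    """Return (indices of detections, number of secondary-stage invocations)."""
    detections = []
    secondary_calls = 0
    for i, frame in enumerate(frames):
        # Stage 1: cheap, always-on check on every frame.
        if primary_score(frame) < primary_threshold:
            continue
        # Stage 2: costly checker, run only when stage 1 fires.
        secondary_calls += 1
        if secondary_score(frame) >= secondary_threshold:
            detections.append(i)
    return detections, secondary_calls


# Toy stand-ins for the two models (hypothetical): each frame is a pair of
# precomputed scores (primary, secondary) rather than audio.
frames = [(0.1, 0.0), (0.7, 0.4), (0.2, 0.95), (0.8, 0.97), (0.6, 0.91)]
detections, calls = run_cascade(frames, lambda f: f[0], lambda f: f[1])
```

In this toy run the second stage is invoked for three of the five frames, which is the point of the design: the expensive detector stays idle whenever the always-on stage sees nothing of interest.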

12:40 pm to 2:30 pm

Lunch, Poster Session, Demos, Networking

2:30 pm to 4:30 pm

Software and Applications

Session Moderator: Kurt KEUTZER, Full Professor, University of California, Berkeley

ELL: the Microsoft Embedded Learning Library

Byron CHANGUION, Principal Software Engineer, Microsoft

Abstract (English)

The Microsoft Embedded Learning Library (ELL) is a cross-platform open source framework, designed and developed by Microsoft Research over the past three years. ELL is part of Microsoft’s broader efforts around intelligent edge computing.

At its core, ELL is an optimizing cross-compiler toolchain for AI pipelines, geared towards small resource-constrained devices and microcontrollers. ELL takes as input an end-to-end AI pipeline, such as an audio keyword detection pipeline or a vision-based people counting pipeline. It compresses the model and generates optimized machine executable code for an embedded target platform. AI pipelines compiled with ELL run locally on the target platform, without requiring a connection to the cloud and without relying on other runtime frameworks. While ELL can generate code for any platform, it is primarily optimized for standard off-the-shelf processors, such as the ARM Cortex A-class and M-class architectures that are prevalent in single-board computers. In addition to its functionality as a compiler, the ELL project provides an online gallery of pre-trained AI models and a handful of tutorials written for makers, technology enthusiasts, and developers who aspire to build intelligent devices and AI-powered gadgets.

In this talk, the vision behind ELL will be presented, along with its design, scope, and roadmap for the future. Design considerations that led Microsoft to build an AI compiler, rather than the more conventional choice of building an AI runtime, will be addressed. In addition, this talk will explain how Microsoft was able to move past the standard academic metrics, instead using real-world criteria to guide our work and priorities.

TF-lite for tinyML

Pete WARDEN, Technical Lead, Google

Abstract (English)

TensorFlow supports a variety of microcontroller and embedded hardware. This talk will cover the goals and vision for TensorFlow Lite Micro on these platforms, and will look at the available example code. There will also be a demonstration of speech keyword recognition running on a low power device, and a discussion of the model architectures required to fit within the constraints of these systems. The extension of these techniques to low-power vision applications will be covered, along with accelerometer and other sensor data analysis. We’ll conclude with the roadmap for the future of TensorFlow on resource-limited hardware.

Visual AI in Milliwatts: A Grand Challenge and a Massive Opportunity

Jeff BIER, Founder, Embedded Vision Alliance

Abstract (English)

Visual images are uniquely rich in information. For example, from a video sequence of a person’s face and torso, algorithms can infer identity, age, gender, heart rate, respiration rate and emotional state, among other attributes. But today, nearly all visual data goes to waste: it isn’t captured by sensors, and therefore isn’t used. This is a massive missed opportunity, because by using information distilled from images, virtually every aspect of our lives, our society and our economy can be meaningfully improved – from health care to transportation to education. What is needed to enable harnessing a significant fraction of the available visual data? Clearly, we need ubiquitous cameras. But cameras that merely capture and transmit images aren’t sufficient. Transmitting images from billions of cameras to the cloud is both impractical and undesirable from the perspective of privacy and security. What’s needed, then, is smart cameras – cameras that not only capture images, but also interpret them, transmitting only metadata, which requires orders of magnitude less network bandwidth. In order for these smart cameras to become ubiquitous, they’ll need to be very inexpensive and, in many cases, they’ll need to be extremely energy efficient, so that they don’t require external power. In this presentation, Jeff Bier, founder of the Embedded Vision Alliance, will explore the current state of low-cost, low-power smart cameras and underlying technologies. He will also illustrate the size and importance of the associated application opportunities with examples of current products. Finally, Jeff will highlight key factors that are facilitating and opposing rapid progress in this space, and propose a call to action for innovators in industry and academia to accelerate progress towards practical, ubiquitous smart cameras.

Ultra-Low-Power Command Recognition for Ubiquitous Devices

Chris ROWEN, VP of Engineering, Cisco

Abstract (English)

Rapid advances in deep learning over the past 5 years have permanently changed expectations for speech as a fundamental interface for many systems. The high compute demands of the early systems have pushed most of the processing into the cloud, where cost, latency, energy, robustness and privacy are all compromised in favor of development flexibility. Speech, however, cannot become a universal interface for everyday devices until cost is measured in pennies and power in a few milliwatts. Moreover, most of today’s speech systems are tuned for the ideal environment of quiet offices and living rooms. Real-world devices must sustain high speech response accuracy even under the chaotic real-world conditions of traffic, crowds, wind, and interfering music.

In this talk I decipher the current speech recognition spectrum, from keyword detection to full natural language dictation, and highlight the emerging role for noise-robust rich command recognition. I present a new training regimen, network structure and optimized 8-bit integer implementation that recognizes large suites of device commands while running on small microcontrollers and consuming single-digit mW. Moreover, this new system achieves effective accuracy almost one order of magnitude better than general-purpose cloud-based speech recognizers for the target command sets under high-noise conditions.

Transforming Epilepsy from a Chronic Condition Towards an Acute One Using tinyML

Hans DE CLERCQ, Co-founder and CTO, Byteflies

Abstract (English)

Epilepsy is a neurological disorder in which brain activity becomes abnormal. This complex disease has >100 subclasses which all have one commonality: unprovoked, recurrent seizures. The number of seizures is the basis of any medical decision to prescribe or alter the drug for treatment and for approving drugs for their efficacy. Moreover, according to patient testimonies, the real burden of epilepsy comes from the unpredictable nature of seizure occurrences. This has a serious impact on the patients’ quality of life: educational problems, limited employability, no driving license, societal stigma, restricted recreational choices (e.g. no/supervised swimming), ever-present fear and anxiety for another seizure, pregnancy complications, etc.

Currently, long-term patient follow-up and the low sensitivity of patient reported outcome (PRO) are very apparent in this disease area. The gold standard for patient diagnosis is video-EEG where patients need to be hospitalized for up to one week, while being connected to a full set of sensors. This procedure is resource-intensive, time-consuming and is not a guarantee for seizure detection (50% of patients do not experience seizures while hospitalized).

Byteflies aims to overcome these challenges by offering a unique, unobtrusive and wearable solution for the combined and non-invasive measurement of vital parameters needed for objective seizure logging and diagnosis of the different epileptic subclasses to improve the standard of care for epilepsy patients. Moreover, the combination of this continuous daily monitoring solution and the online data analysis capabilities of tinyML enables objective assessment in epilepsy, including seizure prediction, the holy grail.

4:30 pm to 6:30 pm

Reception, Poster Session, Demos, Networking

Poster Presentations

OpenMV Cam H7 – Low Power Machine Vision w/ Python

Kwabena AGYEMAN, President & Co-Founder, OpenMV, LLC

Hybrid Ultra-low Power Edge Computing

Allessandro AIMAR, Founder & CTO, Synthara Technologies

Deep Learning on Off-the-shelf Cameras with Examples in Top View, People Detection, Facial Recognition, and Car Detection

Karim ALI, CEO, Invision AI

The Eye in IoT: a unique solution by Himax & emza that includes special CMOS sensor, custom made processor and lean ML algorithm

Elad BARAM, VP Products, Emza Visual Sense

Developer Choices and Trends: Insights from the Embedded Vision Alliance’s Computer Vision Developer Survey

Jeff BIER, Founder, Embedded Vision Alliance

Abstract (English)

The Embedded Vision Alliance conducts an annual survey of computer vision system and application developers’ technology choices and challenges. This survey focuses on product developers working in industry, and examines key trends in adoption of processors, algorithms, sensors and software tools. In this poster, we present insights from the latest version of this survey.

TinyAI: from edgeAI tools to Neuro-Spiking Architectures

Fabian CLERMIDY, Head of Computing Department, CEA

Resource-Constrained Keyword Spotting

Adam FUKS, Director of Design Engineering, NXP

ReBNet: Residual Binarized Networks

Mohammad GHASEMEZADEH, Software Engineer, Apple

Network Architecture Search for Efficient Wake-word Detection

Warren GROSS, Professor, McGill University

Memory-Optimal Direct Convolutions for Maximizing Classification Accuracy in Embedded Applications

Albert GURAL, Graduate Student, Stanford University

NeuroFabric: A Priori Sparsity for Training on the Edge

Mihailo ISAKOV, PhD Student, Department of Electrical and Computer Engineering, Boston University

Quantization for Efficient Inference in Edge Devices

Raghuraman KRISHNAMOOR, Software Engineer, Facebook

State-of-the-Art Voice UI Performance on Device

Christopher LOTT, Senior Staff Engr/Manager, Qualcomm, Inc.

Towards Further Compression of Low-Bitwidth DNNs with Permuted Diagonal Matrices

Tinoosh MOHSENIN, Associate Professor, University of Maryland Baltimore County

Energy Efficient, High Performance AI Acceleration from Edge to Cloud

Brad NOLD, Strategic Account Manager, Ansys

16-bit CNN accuracy in 5.5 mm² package FPGA Human Presence Detection @ 10mW

Hussein OSMAN, Market Segment Manager, Lattice Semiconductor

STM32Cube.AI: AI productivity boosted on STM32 MCU

Danilo PAU, Technical Director, IEEE and ST Fellow, STMicroelectronics Italia

Enhanced Speech Segregation with Low-Latency DNN processing

Niels PONTOPPIDAN, Research Manager, Augmented Hearing Science at Eriksholm Research Center

A frame-free event-based approach to low-power real-time machine vision

Christoph POSCH, CTO, PROPHESEE

Batteryless Always-On Wireless Sensing for Full-Stack IoT Insights-as-a-Service

Richard SAWYER, Vice President Software, Everactive

Ultra-Low Power Computing Hardware Architecture and Circuits for Artificial Intelligence and Machine-Learning

Mingoo SEOK, Associate Professor, Columbia University

Object Detection @ 1 mW: Achieving Always-On Computer Vision at the Edge

Ravishankar SIVALINGAM, Sr. Staff Engineer/Manager, Qualcomm

Abstract (English)

Qualcomm Research has pioneered an Always-on Computer Vision Module (CVM) product based on a holistic approach combining innovations in the system architecture, ultra-low power design, and dedicated hardware for vision algorithms running at the edge. With low end-to-end power consumption (< 1 mW), a tiny form factor, and low cost, the CVM can be integrated into a wide range of battery- and line-powered devices (IoT, mobile, VR/AR, automotive, etc.), performing various computer vision tasks. This poster presents the underlying technology behind the CVM, applications enabled by this technology, and the unique challenges associated with the customer products. It also presents the efficient computer vision algorithms that run on the CVM, such as object detection, gesture recognition, and change detection, as well as the software tools available for training and tuning.

Sensors and Machine Learning in Energy and Embedded IoT Applications

Andreas SPANIAS, Professor and Center Director, Arizona State University

Hardware Architecture Considerations for Embedded Machine Learning

Jim STEELE, Company Owner, SystemOne Tel-Communications Inc.

uTensor: A Graph to C++ Code Generator

Neil TAN, Software Engineer, ARM

RNN Compression using Hybrid Matrix Decomposition

Urmish THAKKER, Principal Engineer, SambaNova Systems Inc

Hardware-Aware Network Design via Differentiable Architecture Search and Model Adaptation

Peter VAJDA, Research and Engineering Manager, Facebook

Resource-Efficient Machine Learning Research at MICAS

Marian VERHELST, Associate Professor, KU Leuven

Neural Networks on Microcontrollers with TensorFlow & the Pelion IoT platform

Yue ZHAO, Research scientist, ARM

Demo Presentations

OpenMV Cam H7 – Low Power Machine Vision w/ Python

Kwabena AGYEMAN, President & Co-Founder, OpenMV, LLC

The Eye in IoT

Elad BARAM, VP Products, Emza Visual Sense

Abstract (English)

Based on a Himax CMOS sensor and emza algorithms. The system uses an upper-body classifier built on emza’s lean machine learning framework.

RRAM-based Nonvolatile In-Memory Computing Macro with Precision-Configurable In-Field Nonlinear Activation

Yiran CHEN, Professor, Duke University

Abstract (English)

RRAM, featuring high density and high energy efficiency, demonstrates great potential for developing neural network processors. However, the analog-digital conversion needed to support the analog computing nature of RRAM devices and the digital data transition induces excessive design overhead. In this work, we present an RRAM-based nonvolatile in-memory-computing macro, which integrates a 64Kb RRAM array for synaptic weighting and in-field nonlinear activation (IFNA). IFNA merges ADC and activation computation by leveraging its nonlinear working region, which increases density by 8.2× and eliminates the need for additional circuits to introduce nonlinearity. IFNA can be flexibly configured to support 1- to 8-bit activation precision. It speeds up computation by 2x compared with the conventional separate ACC design. Real-time testing on the MNIST and CIFAR-10 datasets on our chip prototype achieves accuracies of 94.2% and 78.5%, respectively. The chip’s power is 1.52mW with a power efficiency of 3.36 TOPS/W.
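For readers who want to relate the two closing figures, the short calculation below multiplies the reported power by the reported efficiency to get an implied throughput, and inverts the efficiency to get energy per operation. It assumes both numbers describe the same operating point, which the abstract does not state; the variable names are illustrative.

```python
power_w = 1.52e-3                 # reported chip power: 1.52 mW
efficiency_ops_per_j = 3.36e12    # 3.36 TOPS/W equals 3.36e12 operations per joule

# Assumption: power and efficiency were measured at the same operating point.
throughput_ops_per_s = power_w * efficiency_ops_per_j
energy_per_op_j = 1.0 / efficiency_ops_per_j

print(f"Implied throughput: {throughput_ops_per_s / 1e9:.2f} GOPS")
print(f"Energy per operation: {energy_per_op_j * 1e12:.3f} pJ")
```

Under that assumption, the figures correspond to roughly 5.1 GOPS and about 0.3 pJ per operation.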

Measuring the right medical data 24/7 with the Sensor Dot Technology

Hans DE CLERCQ, Co-founder and CTO, Byteflies

Abstract (English)

A wearable technology platform that measures a large set of vital signals, which today we record raw and send to our cloud for analysis. The demo showcases hardware that uses wearable technology and ML to measure and analyze epileptic seizures 24/7, and discusses the potential value of embedding algorithms into the sensors using tinyML.

Network Architecture Search for Efficient Wake-word Detection

Warren GROSS, Professor, McGill University

STM32Cube.AI: AI productivity boosted on STM32 MCU

Danilo PAU, Technical Director, IEEE and ST Fellow, STMicroelectronics Italia

From Collection To Classification: An End-to-End Software Solution for tinyML on Resource Constrained Embedded Devices

Christopher KNOROWSKI, CTO, SensiML Corp

Audio Analytic Battery-powered products with a sense of hearing beyond speech and music

Christopher MITCHELL, CEO and Founder, Audio Analytic

Abstract (English)

This demo will show acoustic event recognition operating on compact, edge-based target hardware which is capable of operating for multiple years on batteries. Recent advances in production quality piezoelectric MEMS microphones, coupled with extremely compact and robust sound recognition machine learning, and microprocessors that use transistors biased in the sub-threshold region of operation, are key to this demonstration. For the first time, this powerful combination of optimized hardware and software enables a wide range of devices to provide accurate and continuous sound recognition capabilities within a home environment for multiple years.

Video of chip performing image classification, facial recognition, voice recognition and command recognition simultaneously

Brad NOLD, Strategic Account Manager, Ansys

Abstract (English)

Demonstrating multiple model AI accelerator chips that include non-volatile memory and can be designed into Edge devices or used in data centers to support Cloud AI.

16-bit CNN accuracy in 5.5 mm² package FPGA Human Presence Detection @ 10mW

Hussein OSMAN, Market Segment Manager, Lattice Semiconductor

Enhanced speech segregation with low-latency DNN processing

Niels PONTOPPIDAN, Research Manager, Augmented Hearing Science at Eriksholm Research Center

Abstract (English)

Hearing aid users are challenged in listening situations with noise and especially speech-on-speech situations with two or more competing voices. Specifically, the task of attending to and segregating two competing voices is particularly hard, unlike for normal-hearing listeners, as shown in a small subexperiment. In the main experiment, the competing voices benefit of a DNN based stream segregation enhancement algorithm was tested on hearing-impaired listeners.

A mixture of two voices was separated using a DNN and presented to the two ears as individual streams and tested for word score. Compared to the unseparated mixture, there was a 13%-point benefit from the separation, while attending to both voices. The results indicate that DNNs have a large potential for improving stream segregation and speech intelligibility in difficult scenarios with two equally important targets without any prior selection of a primary target stream.

Event-based vision sensor/system live demo

Christoph POSCH, CTO, PROPHESEE

Object Detection @ 1 mW: Achieving Always-On Computer Vision at the Edge

Ravishankar SIVALINGAM, Sr. Staff Engineer/Manager, Qualcomm

Abstract (English)

Qualcomm Research has pioneered an Always-on Computer Vision Module (CVM) product based on a holistic approach combining innovations in the system architecture, ultra-low power design, and dedicated hardware for vision algorithms running at the edge. With low end-to-end power consumption (< 1 mW), a tiny form factor, and low cost, the CVM can be integrated into a wide range of battery- and line-powered devices (IoT, mobile, VR/AR, automotive, etc.), performing various computer vision tasks. This poster presents the underlying technology behind the CVM, applications enabled by this technology, and the unique challenges associated with the customer products. It also presents the efficient computer vision algorithms that run on the CVM, such as object detection, gesture recognition, and change detection, as well as the software tools available for training and tuning.

Live demo of short command recognition system

Chris ROWEN, VP of Engineering, Cisco

Always on speech processing for 10 keywords consuming microwatts

Hari SHANKAR, Principal Engineer, Eta Compute

Abstract (English)

Compact speech recognition applications run on small coin cells or hearing aid batteries that supply between 25 mAh and 250 mAh. Always-on speech recognition applications, especially those flexible enough for phrases or larger vocabularies, can quickly drain the batteries. Furthermore, peak currents during inference can reduce a battery’s life by a further 50%. In our demonstration, we use our TENSAI® AI processor with integrated DSP and Cortex-M3® to optimally implement speech processing on microwatts. TENSAI is a flexible solution that can be retrained for additional or different words and phrases. The net result is power consumption that enables small coin cells or hearing aid batteries to last up to 10X longer than other processing technologies.
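To put the battery capacities quoted in this abstract in perspective, the sketch below estimates an idealized lifetime as capacity divided by average current. The function name and the two average current draws are hypothetical values chosen for illustration, not measurements from the demo, and the estimate ignores peak-current and self-discharge effects.

```python
def battery_life_hours(capacity_mah, average_current_ma):
    """Idealized lifetime: capacity divided by average current draw."""
    if average_current_ma <= 0:
        raise ValueError("average current must be positive")
    return capacity_mah / average_current_ma


# Capacities from the abstract; the current draws are hypothetical.
for capacity in (25, 250):
    for current_ma in (1.0, 0.1):
        hours = battery_life_hours(capacity, current_ma)
        print(f"{capacity} mAh at {current_ma} mA: "
              f"{hours:.0f} h ({hours / 24:.1f} days)")
```

Because lifetime scales inversely with average current in this model, a tenfold reduction in average draw gives a tenfold longer runtime from the same cell.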

6:30 pm to 8:00 pm

Google

Dinner

8:00 am to 9:00 am

Breakfast / Networking

9:00 am to 10:00 am

tinyML State of Technology: Summary and Highlights of Summit Day 1

Hardware and Architectures

Ian BRATT, Distinguished Engineer & Fellow, Arm

Systems and Algorithms

Boris MURMANN, Professor of Electrical Engineering, Stanford University

Software and Applications

Kurt KEUTZER, Full Professor, University of California, Berkeley

10:00 am to 10:30 am

Break

10:30 am to 11:45 am

Two Panels and Audience Discussions

Session Moderator: Chris ROWEN, VP of Engineering, Cisco

tinyML Applications: opportunities and challenges

Moderator: Chris ROWEN, VP of Engineering, Cisco

tinyML Ecosystem development

Moderator: Chris ROWEN, VP of Engineering, Cisco

11:45 am to 12:00 pm

Break

12:00 pm to 12:15 pm

Concluding Remarks – Call to Action

12:15 pm to 1:30 pm

Lunch

Schedule subject to change without notice.

Committee

Evgeni GOUSEV

Qualcomm Research

Pete WARDEN

Google

Speakers

Jeff BIER

Embedded Vision Alliance

Byron CHANGUION

Microsoft

William CHAPPELL

Azure Global

Microsoft

Charles CHONG

PixArt Imaging

Simon CRASKE

ARM

Hans DE CLERCQ

Byteflies

Eric FLAMAND

GreenWaves Technologies

Edwin PARK

QUALCOMM Inc

Chris ROWEN

Cisco

Dennis SYLVESTER

University of Michigan

Marian VERHELST

KU Leuven

Naveen VERMA

Princeton University

Pete WARDEN

Google

Kisun YOU

Siri Speech at Apple
