tinyML Talks: Policy Pruning and Shrinking of Deep Reinforcement Learning for edge devices


December 15, 2020






Timezone: PST

Policy Pruning and Shrinking of Deep Reinforcement Learning for edge devices

Dor Livne, ML Engineer

DSP Group

The recent success of deep neural networks (DNNs) for function approximation in deep reinforcement learning (DRL) has triggered the development of DRL algorithms in various fields. Unfortunately, DNNs have high computational and memory requirements, limiting their use in systems with constrained hardware and power resources.
In recent years, pruning algorithms have successfully reduced redundancy in DNNs, but existing algorithms suffer significant performance degradation in the DRL domain. This paper introduces the Policy Pruning and Shrinking (PoPS) algorithm, the first effective solution to this problem.
As the paper will describe, PoPS has three stages. First, it uses transfer learning to capture the full information regarding the desired policy—without pruning. Next, PoPS executes a novel transfer learning-based policy pruning procedure to find an efficient pruned representation of the model. Finally, in the policy shrinking step, PoPS regenerates and trains a newly-constructed smaller dense model based on the redundancy measured by the policy pruning procedure. The policy pruning and policy shrinking steps are repeated until the algorithm can no longer detect any redundancy.
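The iterative prune-then-shrink loop described above can be sketched in a few lines of NumPy. This is a simplified illustration, not the paper's implementation: the helper names (`prune_by_magnitude`, `shrink_layer`, `pops_like_loop`), the magnitude-based pruning criterion, and the single-matrix setting are all assumptions made for clarity, and the transfer-learning (policy distillation) retraining step is only indicated by a comment.

```python
import numpy as np

def prune_by_magnitude(w, threshold):
    """Zero out weights whose magnitude falls below the threshold."""
    mask = np.abs(w) >= threshold
    return w * mask, mask

def measure_redundancy(mask):
    """Redundancy = fraction of weights that were pruned away."""
    return 1.0 - mask.mean()

def shrink_layer(w, keep_fraction):
    """Rebuild a smaller dense layer, keeping the strongest output units."""
    n_out = max(1, int(round(w.shape[1] * keep_fraction)))
    strengths = np.abs(w).sum(axis=0)          # importance of each output unit
    keep = np.sort(np.argsort(strengths)[::-1][:n_out])
    return w[:, keep]

def pops_like_loop(w, threshold=0.1, tol=0.05, max_iters=10):
    """Alternate pruning and shrinking until little redundancy remains."""
    for _ in range(max_iters):
        pruned, mask = prune_by_magnitude(w, threshold)
        redundancy = measure_redundancy(mask)
        if redundancy < tol:                   # stopping criterion
            break
        w = shrink_layer(pruned, keep_fraction=1.0 - redundancy)
        # In PoPS, the newly constructed smaller dense model would be
        # retrained here via transfer learning from the teacher policy
        # before the next pruning pass.
    return w
```

In the actual algorithm each iteration operates on a full policy network and includes a distillation-based retraining phase; the sketch only shows how the measured redundancy drives the size of the regenerated dense model.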

Dor Livne, ML Engineer

DSP Group

Dor Livne received the B.Sc. degree in electrical engineering from the Ben-Gurion University of the Negev, Beersheba, Israel, in 2019, where he is currently working toward the M.Sc. degree in electrical and computer engineering, under the supervision of Dr. Kobi Cohen. His current research topics include cognitive radio, multi-agent deep reinforcement learning, neural network compression, and decision theory.
Dor currently works at DSPG as a machine learning engineer.

Schedule subject to change without notice.