Binary Networks for Computer Vision

CVPR 2021 Workshop

Covering the latest development of novel methodologies for Binary Neural Networks and their application to Computer Vision. Bringing together a diverse group of researchers working in several related areas.

Workshop Description

An open problem in deep learning is how to develop models that are increasingly compact, lightweight and power-efficient, so that they can be effectively deployed on devices that billions of users rely on in their everyday life and work, such as cars, smartphones, tablets and TVs. One of the most prominent methods for achieving these goals is training Binary Networks, in which both the features and the weights can take only two values: ±1. Binarization can yield large gains in model compression and computational speed; however, how to train binary networks that match the accuracy of their real-valued counterparts remains an open problem. Very recent research efforts have shown that training highly accurate Binary Networks is actually feasible, opening up the path to applying these models to a wide variety of Computer Vision problems. This workshop aims to cover both the development of novel methodologies for Binary Neural Networks and their application to Computer Vision, and to bring together a diverse group of researchers working in several related areas.
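To make the compression and speed argument concrete, the following is a minimal sketch of how a dot product between two ±1 vectors can be computed with bitwise operations. It assumes sign-based binarization and a simple bit-packing scheme; the function names (binarize, pack_bits, xnor_popcount_dot) are hypothetical and chosen for illustration only, not taken from any particular BNN framework. Each value is stored in 1 bit rather than a 32-bit float, and the multiply-accumulate is replaced by an XNOR followed by a population count.

```python
def binarize(values):
    """Map real values to {-1, +1} with the sign function (0 maps to +1 by assumption)."""
    return [1 if v >= 0 else -1 for v in values]


def pack_bits(signs):
    """Pack a {-1, +1} vector into an integer: bit i is set where signs[i] == +1."""
    word = 0
    for i, s in enumerate(signs):
        if s == 1:
            word |= 1 << i
    return word


def xnor_popcount_dot(word_a, word_b, n):
    """Dot product of two packed {-1, +1} vectors of length n.

    XNOR marks the positions where the signs agree. Each agreement
    contributes +1 and each disagreement -1, so dot = 2 * agreements - n.
    """
    mask = (1 << n) - 1                    # keep only the n valid bits
    agree = ~(word_a ^ word_b) & mask      # XNOR restricted to n bits
    agreements = bin(agree).count("1")     # population count
    return 2 * agreements - n


# Illustrative values only.
weights = [0.5, -1.2, 0.3, -0.1]
features = [1.0, 0.4, -0.7, -2.0]

bw, bx = binarize(weights), binarize(features)
reference = sum(w * x for w, x in zip(bw, bx))
fast = xnor_popcount_dot(pack_bits(bw), pack_bits(bx), len(bw))
print(reference, fast)
```

Both printed values agree, since the bitwise form is an exact rewrite of the ±1 dot product. The sketch says nothing about training, which is where the accuracy gap described above arises.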

Call for Papers

Authors are welcome to submit full 8-page papers or short 2-page extended abstracts on any of the following topics:

  • Binary Neural Networks (BNNs): New methodologies (optimization and objective functions), and architectures for training.
  • Neural Architecture Search (NAS) for BNNs.
  • BNNs for Computer Vision: image classification, semantic, instance & panoptic segmentation, pose estimation, object detection, 3D vision, and video recognition.
  • BNNs for generative models: GANs, VAEs, etc.
  • Hardware implementation and on-device deployment of BNNs.
  • New methodologies and architectures for extreme quantisation.
  • Frameworks and bare-metal implementations for binary and low-bit networks.
  • On-device learning.

Important Dates

Paper submission deadline: March 29th, 2021 (11:59pm PST)
Decisions: April 14th, 2021 (11:59pm PST)
Camera ready papers due: April 20th, 2021 (11:59pm PST)
Extended abstract submission: May 17th, 2021 (11:59pm PST)
Extended abstract decisions: May 31st, 2021 (11:59pm PST)
Workshop Date: June 25th, 2021

Submission Guidelines

  • Papers included in CVPR proceedings: Submitted (full 8-page) papers must be formatted using the CVPR 2021 template and should adhere to CVPR submission guidelines. The maximum file size for submissions is 50MB. The CMT-based review process will be double-blind. These submissions will be included in the proceedings and must contain new previously unpublished material.
  • Extended abstracts NOT included in CVPR proceedings: We encourage the submission of extended abstracts (2 pages plus references) that summarize previously published or unpublished work. Extended abstracts will undergo a light single-blind review process. The template for extended abstracts can be found here.
    • Previously published work: We welcome papers previously published at CV/ML conferences, including CVPR 2021, that are within the scope of the workshop.
    • Unpublished work: We also encourage the submission of papers that summarize work in progress. The aim of this type of submission is to disseminate preliminary results or methods that fall within the overall scope of the workshop.

Please upload submissions at: cmt


The Workshop will take place on the 25th of June according to the following schedule. All times are in BST (UTC+1).

20:00 - 20:10 Opening remarks and workshop kickoff
20:10 - 20:40 Invited talk: Daniel Soudry - On depth and data limitations with extreme quantization
We examine three aspects of quantized neural nets:
  1. We derive optimal initializations and find the maximal trainable depth as a function of numerical precision (NeurIPS 2019).
  2. We suggest methods for 4-bit post-training quantization, enabling high-quality quantization with very little data (ICML 2021).
  3. We suggest a method for generating synthetic data, for fine-tuning a quantized model without any data (CVPR 2020).
20:40 - 21:10 Invited talk: Nicholas Lane - What is Next for the Efficient Machine Learning Revolution?
Mobile and embedded devices increasingly rely on deep neural networks to understand the world -- a formerly impossible feat that would have overwhelmed their system resources just a few years ago. The age of on-device artificial intelligence is upon us; but incredibly, these dramatic changes are just the beginning. Looking ahead, mobile machine learning will extend beyond just classifying categories and perceptual tasks, to roles that alter how every part of the systems stack of smart devices function. This evolutionary step in constrained-resource computing will finally produce devices that meet our expectations in how they can learn, reason and react to the real-world. In this talk, I will briefly discuss the initial breakthroughs that allowed us to reach this point, and outline the next set of open problems we must overcome to bring about this next deep transformation of mobile and embedded computing.
21:10 - 21:15 Short Break
21:15 - 21:30 Oral talk 1
21:30 - 21:45 Oral talk 2
21:45 - 22:15 Invited talk: Diana Marculescu - MilliJoules for 1000 Inferences: Machine Learning Systems ‘on the Cheap’
Machine learning (ML) applications have entered and impacted our lives unlike any other technology advance from the recent past. While the holy grail for judging the quality of an ML model has largely been serving accuracy, and only recently its resource usage, neither of these metrics translates directly to energy efficiency, runtime, or mobile device battery lifetime. This talk uncovers the need for designing efficient convolutional neural networks (CNNs) for deep learning mobile applications that operate under stringent energy and latency constraints. We show that while CNN model quantization and pruning are effective tools in bringing down the model size and resulting energy cost by up to 1000x while maintaining baseline accuracy, the interplay between bitwidth, channel count, and CNN memory footprint uncovers a non-trivial trade-off. Surprisingly, there exists a single weight bitwidth that is superior to others for a given storage constraint, even outperforming mixed-precision quantization. Our results show that even when the channel count is allowed to change, a single weight bitwidth can be sufficient for model compression, which greatly reduces the software and hardware optimization costs for CNN-based ML systems.
22:15 - 22:30 Break
22:30 - 23:00 Invited talk: Tim de Bruin - BNNs for TinyML: performance beyond accuracy
Over the past few years, there has been a lot of exciting progress in the field of Binary Neural Networks. New training methods and network architectures have enabled rapid increases in accuracy, especially on traditional computer vision benchmarks such as ImageNet -- closing the gap to higher bit-width models while delivering on the promise of increased inference efficiency. At Plumerai, we are strong believers in BNNs. We think that their reduced memory, energy, and computational needs will be especially relevant in the subfield of TinyML, where they can enable previously infeasible products. However, the TinyML field does bring a unique set of challenges: from the quality of the data coming from the low-cost sensors to extreme constraints on the model architectures imposed by the available hardware. This means that solutions developed for ImageNet do not always generalize to this domain. These challenges also extend beyond simply obtaining a high enough accuracy, as real-world performance is often more nuanced, requiring stable predictions and a good understanding of model biases. We demonstrate the effects of binarization within this domain. We start by demonstrating how binary convolutions make networks more sensitive to small changes to their inputs. We then show how changes in network architectures designed to more easily carry gradients during training cause models to pick up on different biases in their training data. We also explain how we combine our own collected data with our tiny BNNs into a tool to look at publicly available datasets, and some of the sampling biases they contain. Finally, we make the case for an increased research focus on BNNs in the TinyML domain. Given the need for the strengths of BNNs in this domain, the lower computational cost of experiments and the fact that smaller networks bring some of the remaining challenges of BNNs more clearly into focus, we believe that research into TinyML-BNNs could be especially impactful.
23:00 - 23:15 Oral talk 3
23:15 - 23:30 Oral talk 4
23:30 - 23:35 Short Break
23:35 - 00:05 Invited talk: Mohammad Rastegari and Maxwell Horton - Data-Free Model Compression
Efficiently compressing a trained neural network without using any data is very challenging. Our data-free method requires 14x-450x fewer FLOPs than comparable state-of-the-art methods. We break the problem of data-free network compression into a number of independent layer-wise compressions. We show how to efficiently generate layer-wise training data, and how to precondition the network to maintain accuracy during layer-wise compression. We show state-of-the-art performance on MobileNetV1 for data-free low-bit-width quantization. We also show state-of-the-art performance on data-free pruning of EfficientNet B0 when combining our method with end-to-end generative methods.
00:05 - 00:10 Closing remarks and conclusions

Invited Speakers

Diana Marculescu

The University of Texas at Austin

Nicholas Lane

University of Cambridge and Samsung AI


Adrian Bulat

Samsung AI

Zechun Liu


Brais Martinez

Samsung AI

Georgios Tzimiropoulos

QMUL and Samsung AI