A Programmable Approach to Model Compression
Deep neural networks frequently contain far more weights, represented at
higher precision, than are required for the specific task that they are
trained to perform. Consequently, they can often be compressed using techniques
such as weight...