Methods to optimize an LSTM network whose inputs are bound to a fixed integer range

I’m designing a feature extractor + LSTM classifier. The feature extraction process outputs a time series whose values are constrained to the integer range [1, 10].

I designed, trained, and tested the LSTM classifier in PyTorch without worrying about dtypes or model size at first (so everything is the default float32).
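
For context, the model is roughly shaped like this (layer sizes and names are placeholders, not my exact architecture):

```python
import torch
import torch.nn as nn

class LSTMClassifier(nn.Module):
    """Illustrative stand-in for my classifier; sizes are placeholders."""
    def __init__(self, num_classes: int, hidden_size: int = 64):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        # x: (batch, seq_len, 1), float32, values drawn from the integers 1..10
        out, _ = self.lstm(x)
        return self.fc(out[:, -1, :])  # classify from the last time step

model = LSTMClassifier(num_classes=5)
x = torch.randint(1, 11, (8, 100, 1)).float()  # extracted features, cast to float32
logits = model(x)  # shape (8, 5)
```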

Now that I know the feature extraction process and the classifier work, I want to optimize the LSTM network to make it faster and lighter. I would like to exploit the fact that the input has a limited integer range, since in theory I could work with far fewer input bits.

I’m kind of lost on which method to use for this: quantization-aware training (something like Brevitas or QKeras?), or just rebuilding the network with a different input dtype (sketched below)?
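
To make the second option concrete, here is roughly what I mean by changing the input dtype (a sketch, not something I have implemented): store the extracted features as uint8 and cast to float32 only at the model boundary, since nn.LSTM expects floating-point inputs. My worry is that this only shrinks input storage, while the weights and compute stay in float32:

```python
import torch

# Hypothetical "different input dtype" idea: keep the extracted features
# as uint8 (the range [1, 10] would even fit in 4 bits) instead of float32.
features_u8 = torch.randint(1, 11, (8, 100, 1), dtype=torch.uint8)

# This makes the stored/transferred inputs ~4x smaller, but nn.LSTM still
# expects floating-point tensors, so they get cast back at the model boundary
# and the actual compute stays in float32.
x = features_u8.float()
# logits = model(x)  # same float32 LSTM as before
```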

Any suggestions would be much appreciated.
