I’m designing a feature extractor + LSTM classifier. The feature extraction process outputs a time series whose values are constrained to the integer range [1, 10].
I designed, trained, and tested the LSTM classifier in PyTorch without worrying about dtypes or model size at first (so everything is the default float32).
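For context, the classifier is structured roughly like this (the layer sizes and names below are placeholders, not my real values):

```python
import torch
import torch.nn as nn

class LSTMClassifier(nn.Module):
    # Placeholder sizes: the real hidden size / number of classes differ.
    def __init__(self, input_size=1, hidden_size=64, num_classes=4):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        # x: (batch, seq_len, 1) float32 sequence of values in [1, 10]
        _, (h_n, _) = self.lstm(x)
        return self.fc(h_n[-1])
```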
Now that I know the feature extraction process and the classifier work, I want to optimize the LSTM network to make it faster and lighter, and I would like to exploit the fact that the input has such a limited integer range, since in theory it could be represented with far fewer bits.
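To make the input side concrete, this is the kind of thing I mean (the example sequence and shapes are just illustrative):

```python
import torch

# Hypothetical example sequence from the feature extractor:
# integer values in [1, 10], so 4 bits would be enough in principle.
seq_f32 = torch.randint(1, 11, (1, 50, 1)).float()  # what I feed the LSTM now
seq_i8 = seq_f32.to(torch.int8)                      # same information in 8 bits

# But the LSTM weights/activations are still float32, so just casting
# the input doesn't make the network itself smaller or faster on its own.
```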
I’m kind of lost on which method to use for this: quantization-aware training (something like Brevitas or QKeras?), or just redoing the network with a different input dtype?
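For example, would something like PyTorch's built-in post-training dynamic quantization be a reasonable first step, or is full QAT worth the extra effort here? A minimal sketch of what I mean, reusing the classifier from above:

```python
import torch
import torch.nn as nn

# Trained float32 model from the sketch above, in eval mode for inference.
model = LSTMClassifier().eval()

# Post-training dynamic quantization: weights stored as int8 and
# activations quantized on the fly; supports nn.LSTM and nn.Linear.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.LSTM, nn.Linear}, dtype=torch.qint8
)
```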
Any suggestion would be greatly appreciated.