Description:
I’m doing some realtime-ish audio processing on Android. Mic audio is (1) immediately looped back to the playback device, and (2) accumulated in a buffer that is periodically sent to a server for processing.
Issue:
Step 1 works great. Step 2 works alright, but the audio received by the server is noticeably noisier/glitchier than the looped-back audio from Step 1.
Possible Clues:
- The effect rapidly worsens as fastBufferSize increases beyond AudioRecord.getMinBufferSize, yet the looped-back audio still sounds fine (aside from a slight increase in delay). Specifically, it sounds as if the audio samples are shifted incorrectly in time, or perhaps incompletely copied; for example, a smooth ramp signal ends up sounding like a warbly ramp.
- Adjusting recordBufferSize or slowBufferSize seems to have minimal impact on audio quality.
- I’m reasonably confident that the data isn’t being corrupted in transit to the server, and that the server is faithfully reconstructing the audio.
- I am using the PCM16 audio format, but I observe similar behavior with PCM8.
Any thoughts would be greatly appreciated. Relevant code snippets below.
Current Implementation:
AudioRecord and AudioTrack are initialized:
recordBufferSize = AudioRecord.getMinBufferSize(sampleRateHz, CHANNELS, AUDIO_FORMAT);
// Originally fastBufferSize was derived from PROPERTY_OUTPUT_FRAMES_PER_BUFFER
// (which is in frames, not bytes), but that also produced glitchy audio;
// the sweet spot seems to be right around getMinBufferSize.
// String outFramesPerBuf = manager.getProperty(AudioManager.PROPERTY_OUTPUT_FRAMES_PER_BUFFER);
// fastBufferSize = Integer.parseInt(outFramesPerBuf);
// if (fastBufferSize == 0) fastBufferSize = 256; // fall back to a default
fastBufferSize = AudioRecord.getMinBufferSize(sampleRateHz, CHANNELS, AUDIO_FORMAT);
// Build the mic; NOTE: recordBufferSize should be larger than fastBufferSize
recorder = new AudioRecord.Builder()
        .setAudioSource(MediaRecorder.AudioSource.VOICE_RECOGNITION)
        .setAudioFormat(new AudioFormat.Builder()
                .setEncoding(AUDIO_FORMAT)
                .setSampleRate(sampleRateHz)
                .setChannelMask(AudioFormat.CHANNEL_IN_MONO)
                .build())
        .setBufferSizeInBytes(recordBufferSize)
        .build();
// Build the player
player = new AudioTrack.Builder()
        .setAudioAttributes(new AudioAttributes.Builder()
                .setUsage(AudioAttributes.USAGE_VOICE_COMMUNICATION)
                .setContentType(AudioAttributes.CONTENT_TYPE_SPEECH)
                .build())
        .setAudioFormat(new AudioFormat.Builder()
                .setEncoding(AUDIO_FORMAT)
                .setSampleRate(sampleRateHz)
                .setChannelMask(AudioFormat.CHANNEL_OUT_MONO)
                .build())
        .setBufferSizeInBytes(fastBufferSize)
        .build();
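For reference, the fields above are declared roughly as follows (a sketch, not my exact code — the concrete sample rate and the byte[] element type are assumptions on my part; I note that getMinBufferSize and setBufferSizeInBytes both count bytes, so byte[] buffers keep the units consistent):

```java
private static final int sampleRateHz = 16000;  // assumed; actual rate may differ
private static final int CHANNELS = AudioFormat.CHANNEL_IN_MONO;
private static final int AUDIO_FORMAT = AudioFormat.ENCODING_PCM_16BIT;

private int recordBufferSize;   // bytes, from AudioRecord.getMinBufferSize
private int fastBufferSize;     // bytes
private int slowBufferSize;     // bytes; slowBufferSize >= sampleBufferSize
private int sampleBufferSize;   // bytes sent to the server per batch

private byte[] fastAudioBuffer;            // new byte[fastBufferSize]
private volatile byte[] slowAudioBuffer;   // shared between producer and consumer
private byte[] sampleAudioBuffer;          // new byte[sampleBufferSize]
```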
A producer thread handles AudioRecord.read and AudioTrack.write, and transfers data from fastAudioBuffer into the volatile slowAudioBuffer:
read = recorder.read(fastAudioBuffer, 0, fastBufferSize);
if (read > 0) {
    synchronized (slowAudioBuffer) {
        // Shift the older audio in slowAudioBuffer toward the end
        System.arraycopy(slowAudioBuffer, 0, slowAudioBuffer, read, slowBufferSize - read);
        // Copy the newly read audio into the front of slowAudioBuffer
        System.arraycopy(fastAudioBuffer, 0, slowAudioBuffer, 0, read);
    }
    if (currentState == UserState.PASSTHROUGH) {
        player.write(slowAudioBuffer, 0, read, AudioTrack.WRITE_NON_BLOCKING);
    }
}
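To make the "shifted incorrectly in time" clue concrete, here is a self-contained toy model of the two arraycopy calls above (pushChunk is a name I invented for illustration; real buffers are much larger). Each new chunk lands at the front, so the buffer holds chunks in reverse chronological order:

```java
import java.util.Arrays;

// Toy model of the producer's two System.arraycopy calls, on small arrays.
public class ShiftSketch {
    // Mirrors the producer: shift old data back by `read`, put new chunk at the front.
    static void pushChunk(short[] slow, short[] chunk, int read) {
        System.arraycopy(slow, 0, slow, read, slow.length - read); // shift old data toward the end
        System.arraycopy(chunk, 0, slow, 0, read);                 // newest chunk at the front
    }

    public static void main(String[] args) {
        short[] slow = new short[8];
        pushChunk(slow, new short[]{1, 2}, 2);
        pushChunk(slow, new short[]{3, 4}, 2);
        pushChunk(slow, new short[]{5, 6}, 2);
        // Capture order was 1,2,3,4,5,6, but the buffer now reads newest-chunk-first:
        System.out.println(Arrays.toString(slow)); // [5, 6, 3, 4, 1, 2, 0, 0]
    }
}
```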
A consumer thread copies the front of the volatile slowAudioBuffer into sampleAudioBuffer and sends it off for processing:
synchronized (slowAudioBuffer) {
    // Copy the front of slowAudioBuffer into sampleAudioBuffer
    System.arraycopy(slowAudioBuffer, 0, sampleAudioBuffer, 0, sampleBufferSize);
}
// Process at remote server
processAudioRemote(sampleAudioBuffer);
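One thing I noticed while writing this up: nothing tracks how much new audio arrived between consumer wake-ups, so consecutive snapshots of the buffer's front can overlap (or skip data). A self-contained toy model (pushChunk/snapshot are names I made up for illustration, not my real code):

```java
import java.util.Arrays;

// Toy model of the producer prepending chunks and the consumer snapshotting the front.
public class SnapshotSketch {
    // Producer side: shift old data back, newest chunk at the front.
    static void pushChunk(int[] slow, int[] chunk) {
        System.arraycopy(slow, 0, slow, chunk.length, slow.length - chunk.length);
        System.arraycopy(chunk, 0, slow, 0, chunk.length);
    }

    // Consumer side: copy the first n entries, like the synchronized block above.
    static int[] snapshot(int[] slow, int n) {
        int[] out = new int[n];
        System.arraycopy(slow, 0, out, 0, n);
        return out;
    }

    public static void main(String[] args) {
        int[] slow = new int[6];
        pushChunk(slow, new int[]{1, 2});
        int[] a = snapshot(slow, 4);
        pushChunk(slow, new int[]{3, 4});
        int[] b = snapshot(slow, 4);
        System.out.println(Arrays.toString(a)); // [1, 2, 0, 0]
        System.out.println(Arrays.toString(b)); // [3, 4, 1, 2] -- overlaps the previous snapshot
    }
}
```

I don't know whether this overlap is audible server-side, but it shows the consumer can re-send audio it already sent, depending on producer/consumer timing.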