Detect silence in audio recording
With this solution, inspired by Visualizations with Web Audio API, you can set a minimum required decibel level and detect whether anything was recorded.
const MIN_DECIBELS = -45;

navigator.mediaDevices.getUserMedia({ audio: true })
  .then(stream => {
    const mediaRecorder = new MediaRecorder(stream);
    mediaRecorder.start();

    const audioChunks = [];
    mediaRecorder.addEventListener("dataavailable", event => {
      audioChunks.push(event.data);
    });

    // Frequency values below MIN_DECIBELS map to 0 in the byte data,
    // so any non-zero bin means the signal exceeded the threshold.
    const audioContext = new AudioContext();
    const audioStreamSource = audioContext.createMediaStreamSource(stream);
    const analyser = audioContext.createAnalyser();
    analyser.minDecibels = MIN_DECIBELS;
    audioStreamSource.connect(analyser);

    const bufferLength = analyser.frequencyBinCount;
    const domainData = new Uint8Array(bufferLength);

    let soundDetected = false;

    const detectSound = () => {
      if (soundDetected) {
        return;
      }

      analyser.getByteFrequencyData(domainData);

      for (let i = 0; i < bufferLength; i++) {
        if (domainData[i] > 0) {
          soundDetected = true;
          break;
        }
      }

      window.requestAnimationFrame(detectSound);
    };
    window.requestAnimationFrame(detectSound);

    mediaRecorder.addEventListener("stop", () => {
      const audioBlob = new Blob(audioChunks);
      const audioUrl = URL.createObjectURL(audioBlob);
      const audio = new Audio(audioUrl);
      audio.play();

      console.log({ soundDetected });
    });
  });
Detect silence when recording
How can I detect silence when a recording operation is started in Java?
Calculate the dB or RMS value for a group of sound frames and decide at what level it is considered to be 'silence'.
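As a rough sketch of that idea (the -50 dB threshold and the sample values are made up and would need tuning): compute the RMS of a frame of samples in the range -1 to 1, convert it to dBFS, and compare against a silence threshold.

```java
public class SilenceDetector {
    // Root mean square of one frame of samples in the range -1 to 1
    static double rms(double[] frame) {
        double sumOfSquares = 0;
        for (double sample : frame) {
            sumOfSquares += sample * sample;
        }
        return Math.sqrt(sumOfSquares / frame.length);
    }

    // Convert an RMS level to decibels relative to full scale (dBFS)
    static double toDecibels(double rms) {
        return 20 * Math.log10(rms);
    }

    // Treat anything quieter than the threshold as silence
    static boolean isSilence(double[] frame, double thresholdDb) {
        return toDecibels(rms(frame)) < thresholdDb;
    }

    public static void main(String[] args) {
        double[] loud = {0.5, -0.5, 0.5, -0.5};    // RMS 0.5, about -6 dBFS
        double[] quiet = {0.001, -0.001, 0.001};   // RMS 0.001, -60 dBFS
        System.out.println(isSilence(loud, -50.0));   // false
        System.out.println(isSilence(quiet, -50.0));  // true
    }
}
```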
What is PCM data?
Data in pulse-code modulation (PCM) format: the analog signal sampled at regular intervals, with each sample quantized to a digital value.
How can I calculate PCM data in Java?
I do not understand that question. But guessing it has something to do with the speech-recognition tag, I have some bad news. This might theoretically be done using the Java Speech API, but there are apparently no 'speech to text' implementations available for the API (only 'text to speech').
I have to calculate the RMS for a speech-recognition project, but I do not know how to calculate it in Java.
For a single channel represented by signal values in a double ranging from -1 to 1, you might use this method.
/** Computes the RMS volume of a group of signal sizes ranging from -1 to 1. */
public double volumeRMS(double[] raw) {
    double sum = 0d;
    if (raw.length == 0) {
        return sum;
    }
    for (int ii = 0; ii < raw.length; ii++) {
        sum += raw[ii];
    }
    // Subtract the average (DC offset) before squaring,
    // so a constant offset does not register as volume.
    double average = sum / raw.length;

    double sumMeanSquare = 0d;
    for (int ii = 0; ii < raw.length; ii++) {
        sumMeanSquare += Math.pow(raw[ii] - average, 2d);
    }
    double averageMeanSquare = sumMeanSquare / raw.length;
    return Math.sqrt(averageMeanSquare);
}
There is a byte buffer to save input values from the line; what should I do with this buffer?
If using the volumeRMS(double[]) method, convert the byte values to an array of double values ranging from -1 to 1. ;)
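Assuming the line delivers 16-bit signed little-endian samples (the actual layout depends on the AudioFormat you opened the line with), that conversion might look like:

```java
public class PcmConversion {
    /** Converts 16-bit signed little-endian PCM bytes to doubles in [-1, 1). */
    static double[] toDoubles(byte[] pcm) {
        double[] out = new double[pcm.length / 2];
        for (int i = 0; i < out.length; i++) {
            int low = pcm[2 * i] & 0xFF;        // low byte, treated as unsigned
            int high = pcm[2 * i + 1];          // high byte keeps the sign
            short sample = (short) ((high << 8) | low);
            out[i] = sample / 32768.0;          // scale to [-1, 1)
        }
        return out;
    }

    public static void main(String[] args) {
        // Two samples: 0x4000 = 16384 -> 0.5, 0xC000 = -16384 -> -0.5
        byte[] pcm = {0x00, 0x40, 0x00, (byte) 0xC0};
        for (double d : toDoubles(pcm)) {
            System.out.println(d);  // 0.5 then -0.5
        }
    }
}
```

The resulting array can be passed straight to volumeRMS(double[]).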
Can I use MediaRecorder to record audio with silence detection in Android?
No, in order to detect silence you need to inspect the audio data.
MediaRecorder just writes the audio to a file as can be seen in MediaRecorder's state machine diagram.
The solution would be to use AudioRecord and analyze the data before writing the bytes to a file. Since you want to resume recording, you'll need to keep the mic open and process the incoming audio, watching for the user speaking.
The subject of analyzing digital audio data is known as signal processing, for which you can find a number of resources:
Simplest way of detecting where audio envelopes start and stop
What dBFS threshold should I set for differentiation between silence and NOT silence
But for a first attempt, I would use AudioRecord on Android with an FFT and assume a quiet room. You'll have to come up with an appropriate trigger level (see the links above).
Be aware that processing adds some lag and you'll have to compensate, i.e. buffer 0 has silence, in buffer 1 the user starts to speak but doesn't trigger the threshold, in buffer 2 the user is speaking and the threshold is triggered. You may want to save buffer 1 in addition to buffer 2.
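A sketch of that buffering idea in plain Java (the peak-amplitude check and TRIGGER_LEVEL stand in for your real analysis; on Android the buffers would come from AudioRecord.read):

```java
import java.util.ArrayList;
import java.util.List;

public class TriggeredCapture {
    static final double TRIGGER_LEVEL = 0.1; // tune for your mic and room

    // Peak absolute amplitude of one buffer of samples in [-1, 1]
    static double amplitudeOf(double[] buffer) {
        double peak = 0;
        for (double s : buffer) {
            peak = Math.max(peak, Math.abs(s));
        }
        return peak;
    }

    /** Keeps the buffer *before* the trigger as well, to cover processing lag. */
    static List<double[]> capture(List<double[]> incoming) {
        List<double[]> recorded = new ArrayList<>();
        double[] previous = null;   // most recent quiet buffer
        boolean triggered = false;
        for (double[] buffer : incoming) {
            if (!triggered && amplitudeOf(buffer) > TRIGGER_LEVEL) {
                triggered = true;
                if (previous != null) {
                    recorded.add(previous); // keep buffer 1, where speech began
                }
            }
            if (triggered) {
                recorded.add(buffer);
            } else {
                previous = buffer;
            }
        }
        return recorded;
    }

    public static void main(String[] args) {
        List<double[]> buffers = List.of(
                new double[]{0.0, 0.0},    // buffer 0: silence, dropped
                new double[]{0.05, 0.02},  // buffer 1: speech below threshold, kept
                new double[]{0.6, 0.4});   // buffer 2: threshold triggered
        System.out.println(capture(buffers).size()); // 2
    }
}
```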
Find silence in AVAudioRecorder Session
Here I have created a function that detects silence for 5 seconds; if that condition is satisfied, you can stop recording at that point.
I used a RecordingManager NSObject class, so take the code from the function below and adapt it to yours.
Code
// Start a new recording session; stop after 5 seconds of silence
func newSessionIfSilence() {
    // Audio file to record into
    let audioFileName = getDocumentsDirectory().appendingPathComponent("\(getUniqueName()).wav")

    // Accumulates the seconds of silence detected so far
    var statusForDetection = Float()

    // Recorder settings
    let settings: [String: Any] = [
        AVFormatIDKey: Int(kAudioFormatLinearPCM),
        AVSampleRateKey: 16000,
        AVNumberOfChannelsKey: 1,
        AVLinearPCMBitDepthKey: 16,
        AVEncoderAudioQualityKey: AVAudioQuality.high.rawValue,
        AVLinearPCMIsBigEndianKey: false,
        AVLinearPCMIsFloatKey: false,
    ]

    do {
        // Start recording into the audio file
        Manager.recorder = try AVAudioRecorder(url: audioFileName, settings: settings)
        Manager.recorder?.delegate = self
        Manager.recorder?.isMeteringEnabled = true
        Manager.recorder?.prepareToRecord()
        Manager.recorder?.record()

        // Poll the metering values every 0.1 s to track loudness
        Manager.meterTimer = Timer.scheduledTimer(withTimeInterval: 0.10, repeats: true, block: { (timer: Timer) in
            // The recorder is managed by the Manager class
            if let recorder = Manager.recorder {
                recorder.updateMeters()

                // Average and peak power for channel 0, in dBFS
                Manager.recorderApc0 = recorder.averagePower(forChannel: 0)
                Manager.recorderPeak0 = recorder.peakPower(forChannel: 0)

                // Convert the peak from dBFS to a 0-1 scale, where 0 is
                // complete quiet and 1 is full volume, then smooth it
                // with a low-pass filter.
                let ALPHA: Double = 0.05
                let peakPowerForChannel = pow(Double(10), (0.05 * Double(Manager.recorderPeak0)))
                RecordingManager.lowPassResults = ALPHA * peakPowerForChannel + (1.0 - ALPHA) * RecordingManager.lowPassResults

                // Tune this threshold for your environment; the smoothed
                // level stays above it while the user is making sound.
                if RecordingManager.lowPassResults > 0.35 {
                    print("Sound detected")
                    // Sound detected, so reset the silence counter
                    statusForDetection = 0.0
                } else {
                    // The timer fires every 0.1 s, so add 0.1 s of silence
                    statusForDetection += 0.1

                    // If no sound was detected for 5 seconds, stop recording
                    if statusForDetection > 5.0 {
                        statusForDetection = 0.0
                        recorder.stop()
                    }
                }
            }
        })
    } catch {
        // Recording failed with an error
        print("Error Handling: \(error.localizedDescription)")
        self.finishRecording(success: false)
    }
}
Detect & Record Audio in Python - trim beginning silence
The issue was due to a quick, inaudible flash of sound when the mic was started. Adding del r[0:8000] before r = normalize(r) eliminated the first moments of audio containing the inaudible sound data, solving my issue.