Decibel Values at Specific Points in Wav File

How do I attenuate a WAV file by a given decibel value?

I think you want to convert from decibel to gain.

The equations for audio are:

decibel to gain:

  gain = 10 ^ (attenuation in dB / 20)

(use a negative dB value to attenuate: -6 dB gives a gain of about 0.5)

or in C:

  gain = powf(10, attenuation / 20.0f);

The equation to convert from gain to dB is:

  attenuation_in_db = 20 * log10 (gain)
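
The two conversions above can be sketched in Python as follows. This is only a minimal illustration: the function names are made up for this example, and the samples are assumed to be floats already normalized to the range -1 to 1.

```python
import math

def db_to_gain(db):
    """Convert a level change in decibels to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)

def gain_to_db(gain):
    """Convert a positive linear amplitude factor to decibels."""
    return 20.0 * math.log10(gain)

def apply_gain_db(samples, db):
    """Scale normalized float samples by a dB amount (negative = attenuate)."""
    g = db_to_gain(db)
    return [s * g for s in samples]
```

For example, apply_gain_db(samples, -6.0) scales every sample by roughly 0.5.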

Python: Get volume decibel Level real time or from a wav file

Short answer: dB isn't the same as dB. Your results are probably correct.

Long answer: dB levels always define a relation to some reference value. For audio / acoustics, there are many reference values, and you need to specify which one you are using for a value in dB to be meaningful. When you say

normal sound may be in the range of 50-70 dB

that's not really an accurate statement, you probably mean

normal sound may be in the range of 50-70 dB SPL

where you are giving a value relative to the reference sound pressure level of 20 µPa.

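In digital systems, sound files are typically represented by floating-point samples with magnitude at most 1; then we speak of dB FS (dB full scale), with reference value 1. Since the logarithm of a number no greater than 1 is never positive, dB FS values are at most 0, and in practice negative.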

It is also clear that you cannot directly relate dB FS values to dB SPL values: if you play the same audio file (i.e. the same dB FS values) twice, but turn up the volume knob of your speaker the second time, you get two different dB SPL values (what you hear).
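
To make the dB FS idea concrete, here is a small sketch that computes the peak level of a block of samples relative to full scale. It assumes float samples normalized to the range -1 to 1; the function name is hypothetical.

```python
import math

def peak_dbfs(samples):
    """Peak level in dB FS for float samples normalized to [-1.0, 1.0]."""
    peak = max(abs(s) for s in samples)
    if peak == 0.0:
        return float("-inf")  # digital silence has no finite dB value
    return 20.0 * math.log10(peak)
```

A full-scale sample gives 0 dB FS, and anything quieter gives a negative value. Nothing in this number says how loud the playback will be in dB SPL.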

Finding the 'volume' of a .wav at a given time

In digital audio processing you typically refer to the momentary peak amplitude of the signal (this is also called PPM -- peak programme metering). Depending on how accurate you want to be or if you wish to model some standardised metering or not, you could either

just use a sliding window of sample frames (find the maximum absolute value per window)
implement some sort of peak-hold mechanism that retains the last peak value for a given duration and then lets the value 'fall' by a given number of decibels per second.
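
Both options can be sketched roughly as below. This is not a model of any standardised meter: the function names, the hold length (counted in windows) and the fall rate (in dB per window rather than per second) are all assumptions chosen for illustration.

```python
def window_peaks(samples, window):
    """Maximum absolute value of each consecutive block of `window` samples."""
    return [max(abs(s) for s in samples[i:i + window])
            for i in range(0, len(samples), window)]

def peak_hold(peaks_db, hold_windows, fall_db_per_window):
    """Hold the last peak for `hold_windows` readings, then let it decay."""
    out = []
    held = float("-inf")
    age = 0  # number of readings since the held peak was set
    for p in peaks_db:
        if p >= held:
            held, age = p, 0  # new peak: restart the hold period
        else:
            age += 1
            if age > hold_windows:
                # hold period is over: fall, but never below the current reading
                held = max(p, held - fall_db_per_window)
        out.append(held)
    return out
```

The first function yields one linear peak per window; convert those to decibels before passing them to the second, since its fall rate is expressed in dB.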

The other measuring mode is RMS, which is calculated by integrating over a certain time window (add the squared sample values, divide by the window length, and take the square root, thus root-mean-square, RMS). This gives a better idea of the 'energy' of the signal and moves more smoothly than peak measurements, but does not capture the maximum values observed. This mode is sometimes called a VU meter as well. You can approximate this with a sort of lagging (lowpass) filter, e.g. y[i] = y[i-1]*a + |x[i]|*(1-a), for some value 0 < a < 1. The weights a and (1-a) sum to 1, so a constant input level is reproduced unchanged.
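
A minimal sketch of both ideas, assuming normalized float samples. The function names are hypothetical, and the smoother weights the new input by (1 - a) so that its output settles at the input level. Note that the smoother tracks the average of |x| rather than a true RMS, so it is only an approximation.

```python
import math

def rms(block):
    """Root mean square of a non-empty block of samples."""
    return math.sqrt(sum(s * s for s in block) / len(block))

def smooth_abs(samples, a):
    """One-pole lowpass of |x|: y[i] = a*y[i-1] + (1 - a)*|x[i]|, 0 < a < 1."""
    y, out = 0.0, []
    for s in samples:
        y = a * y + (1.0 - a) * abs(s)
        out.append(y)
    return out
```

Values of a close to 1 give a slow, smooth meter; smaller values make it react faster.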

You typically display the values logarithmically, i.e. in decibels, as this corresponds better with our perception of signal strength and also for most signals produces a more regular coverage of your screen space.

Three projects I'm involved with may help you:

ScalaAudioFile which you can use to read the sample frames from an AIFF or WAVE file
ScalaAudioWidgets which is a still young and incomplete project to provide some audio application widgets on top of scala-swing, including a PPM view -- just use a sliding window and set the window's current peak value (and optionally RMS) at a regular interval, and the view will take care of peak-hold and fall times
(ScalaCollider, a client for the SuperCollider sound synthesis system, which you might use to play back the sound file and measure the peak and RMS amplitudes in real time. This is probably overkill for your project and would involve a serious learning curve if you have never heard of SuperCollider. The advantage would be that you don't need to worry about synchronising your sound playback with the meter display.)

Analyzing Sound in a WAV file

The FFT has nothing to do with volume and everything to do with frequencies. To find out how loud a scene is on average, simply average the magnitudes of the sampled values. If your language gives you the data as signed values, apply an absolute function first so that negative amplitudes don't cancel out the positive ones; if it gives you unsigned values, subtract the midpoint first. That's pretty much it. If you don't get the results you were expecting, that must have to do with the way you are extracting the individual values in line 20.

That said, there are a few refinements that might or might not affect your task. Perceived loudness, amplitude and acoustic power are in fact related in non-linear ways, but as long as you are only trying to get a rough estimate of how much is "going on" in the audio signal I doubt that this is relevant for you. And of course, humans hear different frequencies better or worse - for instance, bats emit ultrasound squeals that would be absolutely deafening to us, but luckily we can't hear them at all. But again, I doubt this is relevant to your task, since frequencies above half the sampling rate (about 22 kHz for the common 44.1 kHz rate) are in fact not representable in simple WAV format.

Representing music audio samples in terms of dB?

As Bastyen (+1 from me) indicates, calculating decibels is actually NOT simple, but requires looking at a large number of samples. However, since sound samples run MUCH more frequently than visual frames in an animation, making an aggregate measure works out rather neatly.

A nice visual animation rate, for example, updates 60 times per second, and the most common sampling rate for sound is 44100 times per second. So, 735 samples (44100 / 60 = 735) might end up being a good choice for interfacing with a visualizer.
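
As a rough sketch of that aggregation, the following computes one RMS level in dB FS per animation frame. It assumes normalized float samples, and the function name and default parameters are chosen only for illustration; any leftover samples that do not fill a whole frame are ignored.

```python
import math

def levels_per_frame(samples, sample_rate=44100, fps=60):
    """RMS level in dB FS for each animation frame's worth of samples."""
    n = sample_rate // fps  # 735 for 44100 Hz at 60 fps
    levels = []
    for i in range(0, len(samples) - n + 1, n):
        block = samples[i:i + n]
        r = math.sqrt(sum(s * s for s in block) / n)
        # a silent block has no finite dB value
        levels.append(20.0 * math.log10(r) if r > 0.0 else float("-inf"))
    return levels
```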

By the way, of all the official Java tutorials I've read (I am a big fan), I have found the ones that accompany the javax.sound.sampled to be the most difficult. http://docs.oracle.com/javase/tutorial/sound/TOC.html

But they are still worth reading. If I were in charge of a rewrite, there would be many more code examples. Some of the best code examples are several sections deep, e.g., the "Using Files and Format Converters" discussion.

If you don't wish to compute the RMS, a hack would be to store the local high and/or low value for the given number of samples. Relating these numbers to decibels would be dubious, but MAYBE could be useful after giving it a mapping of your choice to the visualizer. Part of the problem is that values for a single point on a given wave can range wildly. The local high might be more due to the phases of the constituent harmonics happening to line up than about the energy or volume.

Your PCM top and bottom values would probably NOT be 0 and 256, more likely -128 to 127 for 8-bit encoding. More common still is 16-bit encoding (-32768 to 32767). But you will get the hang of this if you follow Bastyen's links. To make your code independent of the bit-encoding, you would likely normalize the data (convert to floats between -1 and 1) before doing any other calculations.
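
The normalization step might look like the sketch below, shown in Python rather than Java purely for brevity. It assumes little-endian signed 16-bit PCM, which is an assumption about the data rather than something every file guarantees; the function name is hypothetical.

```python
import struct

def pcm16_to_floats(raw):
    """Decode little-endian signed 16-bit PCM bytes to floats in [-1.0, 1.0)."""
    count = len(raw) // 2  # two bytes per sample; a trailing odd byte is dropped
    ints = struct.unpack("<%dh" % count, raw[:count * 2])
    return [v / 32768.0 for v in ints]
```

Dividing by 32768 maps the most negative value to exactly -1.0 and keeps every result inside the range. Other bit depths need their own decoding and scale factor.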

Increase volume by X decibels and rewrite audio file

Your code looks good. To measure the average sound level of an audio sample you need to calculate the RMS (root mean square) of its sample values:

RMS := Sqrt( Sum(x_i*x_i)/N)

with x_i being the i-th sample and N the number of samples. The RMS is a measure of the average amplitude of your signal. Use

RMS_dB = 20*log10(RMS/ref)

(with ref being 1.0 for normalized float samples or 32767.0 for 16-bit integer samples)

to convert it to a decibel value.

You may calculate this RMS value before and after you change the volume. The difference should be exactly the dB value you used in your IncreaseVolume().
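
That check can be sketched as follows. The increase_volume function here is a hypothetical stand-in for your IncreaseVolume(), assumed to scale every sample by 10^(dB/20); the test signal and all names are made up for illustration, and samples are assumed to be normalized floats.

```python
import math

def rms_db(samples, ref=1.0):
    """RMS level of the samples in dB relative to `ref`."""
    r = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(r / ref)

def increase_volume(samples, db):
    """Hypothetical stand-in for IncreaseVolume(): scale by 10^(db/20)."""
    g = 10.0 ** (db / 20.0)
    return [s * g for s in samples]

# a quiet 440 Hz test tone, 0.1 s at 44100 Hz
signal = [0.1 * math.sin(2.0 * math.pi * 440.0 * n / 44100.0) for n in range(4410)]
before = rms_db(signal)
after = rms_db(increase_volume(signal, 6.0))
print(after - before)  # difference in dB between the two measurements
```

The difference matches the requested change only as long as nothing clips; once samples are clamped to the maximum value of the format, the measured increase will come out smaller.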