Generating a Waveform Using Ffmpeg

Default waveform

ffmpeg -i input.wav -filter_complex showwavespic -frames:v 1 output.png

Notes

Silent segments are drawn as blank space, such as the gap in the middle of this example (see "Fancy waveform" below if you want to see how to add a center line).
The background is transparent.
Default colors are red (left channel) and green (right channel) for a stereo input. The color is mixed where the channels overlap.
You can change the channel colors with the colors option, such as "showwavespic=colors=blue|yellow". See the FFmpeg documentation for the list of valid color names, or use hexadecimal notation, such as #ffcc99.
See the showwavespic filter documentation for additional options.
If you want a video instead of an image use the showwaves filter.
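
As a minimal sketch of the colors option described in the notes, the Python helper below assembles the same command as an argument list. The function name waveform_cmd and its parameters are hypothetical, chosen only for illustration; the ffmpeg options themselves are the ones shown above.

```python
def waveform_cmd(src, dst, colors=None, size=None):
    """Build an ffmpeg argument list for a single-image waveform.

    colors: optional list of color names or hex values, one per channel.
    size:   optional "WIDTHxHEIGHT" string for the output picture.
    """
    opts = []
    if size:
        opts.append(f"s={size}")
    if colors:
        # showwavespic separates the per-channel colors with '|'
        opts.append("colors=" + "|".join(colors))
    filt = "showwavespic" + ("=" + ":".join(opts) if opts else "")
    return ["ffmpeg", "-i", src, "-filter_complex", filt, "-frames:v", "1", dst]


if __name__ == "__main__":
    # Hand the list to subprocess.run(...) to execute it; here it is only printed.
    print(waveform_cmd("input.wav", "output.png", colors=["blue", "yellow"]))
```

Passing the arguments as a list avoids shell quoting problems with characters such as | and #.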

Fancy waveform

ffmpeg -i input.mp4 -filter_complex \
"[0:a]aformat=channel_layouts=mono, \
 compand=gain=-6, \
 showwavespic=s=600x120:colors=#9cf42f[fg]; \
 color=s=600x120:color=#44582c, \
 drawgrid=width=iw/10:height=ih/5:color=#9cf42f@0.1[bg]; \
 [bg][fg]overlay=format=auto,drawbox=x=(iw-w)/2:y=(ih-h)/2:w=iw:h=1:color=#9cf42f" \
-frames:v 1 output.png

Explanation of options

aformat downmixes the audio to mono. Otherwise, by default, a stereo input would result in a waveform with a different color for each channel (see Default waveform example above).
compand modifies the dynamic range of the audio to make the waveform look less flat. It makes a less accurate representation of the actual audio, but can be more visually appealing for some inputs.
showwavespic makes the actual waveform.
color source filter is used to make a colored background that is the same size as the waveform.
drawgrid adds a grid over the background. The grid does not represent anything, but is just for looks. The grid color is the same as the waveform color (#9cf42f), but opacity is set to 10% (@0.1).
overlay will place [bg] (what I named the filtergraph for the background) behind [fg] (the waveform).
Finally, drawbox will make the horizontal line so any silent areas are not blank.
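
To make the relationship between the three filter chains easier to see, here is a rough Python sketch that rebuilds the same filtergraph string from a few parameters. The function name fancy_filtergraph and its parameter names are assumptions made for this example; the filters and their options are copied from the command above.

```python
def fancy_filtergraph(width=600, height=120, wave="#9cf42f", background="#44582c"):
    """Return the -filter_complex string for the 'fancy waveform' layout."""
    size = f"{width}x{height}"
    # Chain 1: mono downmix, compression, then the waveform picture -> [fg]
    fg = (f"[0:a]aformat=channel_layouts=mono,compand=gain=-6,"
          f"showwavespic=s={size}:colors={wave}[fg]")
    # Chain 2: solid background of the same size with a faint grid -> [bg]
    bg = (f"color=s={size}:color={background},"
          f"drawgrid=width=iw/10:height=ih/5:color={wave}@0.1[bg]")
    # Chain 3: waveform over background, then a 1-pixel center line
    out = (f"[bg][fg]overlay=format=auto,"
           f"drawbox=x=(iw-w)/2:y=(ih-h)/2:w=iw:h=1:color={wave}")
    return ";".join([fg, bg, out])


if __name__ == "__main__":
    print(fancy_filtergraph())
```

Keeping the size in one variable guarantees that the background and the waveform stay the same dimensions when you change them.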

Gradient example

Using gradients filter:

ffmpeg -i input.mp3 -filter_complex "gradients=s=1920x1080:c0=000000:c1=434343:x0=0:x1=0:y0=0:y1=1080,drawbox=x=(iw-w)/2:y=(ih-h)/2:w=iw:h=1:color=#0000ff[bg];[0:a]aformat=channel_layouts=mono,showwavespic=s=1920x1080:colors=#0068ff[fg];[bg][fg]overlay=format=auto" -frames:v 1 output.png

Color background

ffmpeg -i input.opus -filter_complex "color=c=blue[color];aformat=channel_layouts=mono,showwavespic=s=1280x720:colors=white[wave];[color][wave]scale2ref[bg][fg];[bg][fg]overlay=format=auto" -frames:v 1 output.png

The scale2ref filter automatically makes the background the same size as the waveform.

Image background

Of course you can use an image or video instead for the background:

ffmpeg -i audio.flac -i background.jpg -filter_complex \
"[1:v]scale=600:-1,crop=iw:120[bg]; \
 [0:a]showwavespic=s=600x120:colors=cyan|aqua[fg]; \
 [bg][fg]overlay=format=auto" \
-q:v 3 showwavespic_bg.jpg

Getting waveform stats and data

Use the astats filter. Many stats are available: RMS, peak, min, max, difference, etc.

RMS level per audio frame

Example to get standard RMS level measured in dBFS per audio frame:

ffprobe -v error -f lavfi -i "amovie=input.wav,astats=metadata=1:reset=1" -show_entries frame_tags=lavfi.astats.Overall.RMS_level -of csv=p=0 > rms.log

Peak level per second

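Add the asetnsamples filter. With asetnsamples=44100, each frame holds 44100 samples, which is one second of audio at a 44.1 kHz sample rate.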

ffprobe -v error -f lavfi -i "amovie=input.wav,asetnsamples=44100,astats=metadata=1:reset=1" -show_entries frame_tags=lavfi.astats.Overall.Peak_level -of csv=p=0
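
The value 44100 only equals one second when the input is sampled at 44.1 kHz. The following Python sketch derives the asetnsamples value from a sample rate and a window length and builds the same ffprobe call. The function name peak_probe_cmd and its parameters are hypothetical, and the input file name is assumed to contain no characters that need escaping inside a filtergraph.

```python
def peak_probe_cmd(src, sample_rate=44100, window_seconds=1.0):
    """Build an ffprobe argument list that reports one peak level per window."""
    # Number of samples per frame so that one frame spans the requested window
    n = round(sample_rate * window_seconds)
    graph = f"amovie={src},asetnsamples={n},astats=metadata=1:reset=1"
    return [
        "ffprobe", "-v", "error", "-f", "lavfi", "-i", graph,
        "-show_entries", "frame_tags=lavfi.astats.Overall.Peak_level",
        "-of", "csv=p=0",
    ]


if __name__ == "__main__":
    # 48 kHz input, one reading every half second
    print(peak_probe_cmd("input.wav", sample_rate=48000, window_seconds=0.5))
```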

Same as above but with timestamps

ffprobe -v error -f lavfi -i "amovie=input.wav,asetnsamples=44100,astats=metadata=1:reset=1" -show_entries frame=pts_time:frame_tags=lavfi.astats.Overall.Peak_level -of csv=p=0
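
Once the timestamps are in the output, each line can be read as a time and level pair. A small Python sketch of that parsing step follows. It assumes each line has the form time,level with the timestamp first, and that silent windows are reported as -inf; the sample text is made up for illustration and is not real ffprobe output.

```python
def parse_levels(text):
    """Turn 'time,level' CSV lines into a list of (seconds, dB) tuples."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        time_s, level = line.split(",")[:2]
        # float() accepts "-inf", which is how digital silence is assumed to appear
        rows.append((float(time_s), float(level)))
    return rows


if __name__ == "__main__":
    sample = "0.000000,-12.5\n1.000000,-inf\n2.000000,-3.25\n"  # illustrative data
    rows = parse_levels(sample)
    loudest = max(rows, key=lambda r: r[1])
    print(rows, loudest)
```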

Output to file

Just append > output.log to the end of your command:

ffprobe -v error -f lavfi -i "amovie=input.wav,asetnsamples=44100,astats=metadata=1:reset=1" -show_entries frame_tags=lavfi.astats.Overall.RMS_level -of csv=p=0 > output.log

JSON

ffprobe -v error -f lavfi -i "amovie=input.wav,asetnsamples=44100,astats=metadata=1:reset=1" -show_entries frame_tags=lavfi.astats.Overall.RMS_level -of json > output.json
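
The JSON form is convenient to consume from a script. The Python sketch below pulls the RMS values out of such a file. It assumes the output is an object with a frames array whose entries carry a tags object; the sample string is hand-written to match that assumed shape, and the function name rms_levels is hypothetical.

```python
import json

RMS_KEY = "lavfi.astats.Overall.RMS_level"


def rms_levels(json_text):
    """Return the per-frame RMS levels (in dB) found in ffprobe JSON output."""
    doc = json.loads(json_text)
    levels = []
    for frame in doc.get("frames", []):
        value = frame.get("tags", {}).get(RMS_KEY)
        if value is not None:  # frames without the tag are skipped
            levels.append(float(value))
    return levels


if __name__ == "__main__":
    sample = (
        '{"frames": ['
        '{"tags": {"lavfi.astats.Overall.RMS_level": "-20.5"}}, '
        '{"tags": {"lavfi.astats.Overall.RMS_level": "-18.25"}}]}'
    )
    print(rms_levels(sample))
```

With a real file you would read output.json and pass its contents to the function.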

Produce waveform video from audio using FFmpeg

What showwaves does is show the waveform in realtime, and the display window is 1/framerate i.e. if the video output is 25 fps, then each frame shows the waveform of 40 ms of audio. There's no 'history' or 'memory' so you can't (directly) get a scrolling output like it seems your reference video shows.

The workaround for this is to use the showwavespic filter to produce a single frame showing the entire waveform at a high enough horizontal resolution. Then do a scrolling overlay of that picture over a desired background, at a speed such that the scroll lasts as long as the audio.

Basic command template would be:

ffmpeg -loop 1 -i bg.png -loop 1 -i wavespic.png -i audio.mp3 \
 -filter_complex "[0][1]overlay=W-w*t/mp3dur:y=SOMEFIXEDVALUE" -shortest waves.mp4

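mp3dur above should be replaced with the duration of the audio file in seconds, and SOMEFIXEDVALUE with the vertical position of the waveform picture on the background.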
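
To check how the scroll behaves, the Python sketch below builds the overlay expression for a given duration and evaluates the x position at the start and at the end. The function names and the example widths are assumptions for illustration. W is the background width and w is the width of the waveform picture, as in the overlay filter's own variables.

```python
def scroll_overlay(duration_s, y=0):
    """Filtergraph that scrolls input 1 leftwards over input 0 during duration_s."""
    return f"[0][1]overlay=x=W-w*t/{duration_s}:y={y}"


def x_at(t, bg_width, pic_width, duration_s):
    """Left edge of the waveform picture at time t, mirroring W-w*t/dur."""
    return bg_width - pic_width * t / duration_s


if __name__ == "__main__":
    print(scroll_overlay(100, y=300))
    # At t=0 the picture starts just off the right edge of the background.
    print(x_at(0, 1280, 6400, 100))
    # At t=duration its right edge lines up with the background's right edge.
    print(x_at(100, 1280, 6400, 100) + 6400)
```

The duration can be read with ffprobe -v error -show_entries format=duration -of csv=p=0 audio.mp3.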

Generate waveforms for audio files with a large number of channels

Untested workaround is to use pan to choose each channel, a showwavespic per channel, and stack them with vstack:

ffmpeg -i input.wav -filter_complex \
  "[0:a]pan=mono|c0=c0,showwavespic=s=1920x40[a0];
   [0:a]pan=mono|c0=c1,showwavespic=s=1920x40[a1];
   ...
   [0:a]pan=mono|c0=c29,showwavespic=s=1920x40[a29];
   [a0][a1]...[a29]vstack=inputs=30" output.png
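
Writing thirty nearly identical chains by hand is error-prone, so the filtergraph can be generated instead. This Python sketch does that for any channel count. It follows the same untested pattern as the command above, and the function name per_channel_graph and its parameters are hypothetical.

```python
def per_channel_graph(channels, width=1920, row_height=40):
    """Build one pan + showwavespic chain per channel and stack them vertically."""
    if channels < 2:
        raise ValueError("vstack needs at least two inputs")
    chains = [
        # pan=mono|c0=cN picks channel N as the only output channel
        f"[0:a]pan=mono|c0=c{i},showwavespic=s={width}x{row_height}[a{i}]"
        for i in range(channels)
    ]
    labels = "".join(f"[a{i}]" for i in range(channels))
    chains.append(f"{labels}vstack=inputs={channels}")
    return ";".join(chains)


if __name__ == "__main__":
    graph = per_channel_graph(30)
    # Pass the string as the -filter_complex argument, for example via subprocess.run
    print(graph)
```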

Get waveform data from audio file using FFmpeg

Sabona budi,

I wrote about the manual way to get a waveform, but while looking for an example to show you I found this code, which does what you want (or at the least, you can learn something from it).

1) Use FFmpeg to get array of samples

Try the example code shown here : http://blog.wudilabs.org/entry/c3d357ed/?lang=en-US

Experiment with it and try tweaking it with suggestions from the manual. In the code shown there, just change the string path to point to your own file path. Then edit the proc.StartInfo.Arguments line so that it looks like:

proc.StartInfo.Arguments = "-i \"" + path + "\" -vn -ac 1 -filter:a aresample=myNum -map 0:a -c:a pcm_s16le -f data -";

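aresample takes an output sample rate in Hz, so the myNum in aresample=myNum has to be a rate, not a sample count. To end up with about one sample per pixel of waveform width, it is calculated by:

44100 * total Seconds = X (total input samples, assuming a 44.1 kHz source).
X / WaveForm Width = input samples that each pixel stands for.
myNum = WaveForm Width / total Seconds.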
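
As a worked example of this arithmetic, the Python sketch below computes two numbers for a given duration and waveform width: how many input samples each pixel covers, and the output sample rate that yields about one sample per pixel. The rate is what aresample expects, because its argument is a sample rate in Hz. The function name waveform_rates and the 44100 default are assumptions for illustration.

```python
def waveform_rates(duration_s, width_px, source_rate=44100):
    """Return (input samples per pixel, output rate in Hz for ~1 sample per pixel)."""
    total_samples = source_rate * duration_s        # X in the text above
    samples_per_pixel = total_samples / width_px    # input samples behind each pixel
    # aresample needs a whole number of Hz, and at least 1
    output_rate = max(1, round(width_px / duration_s))
    return samples_per_pixel, output_rate


if __name__ == "__main__":
    # A 60 second file drawn 600 pixels wide
    per_pixel, rate = waveform_rates(60, 600)
    print(per_pixel, rate)
```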

Finally use the ProcessBuffer function with this logic :

static void ProcessBuffer(byte[] buffer, int length)
{
    float val; //amplitude value of a sample
    int index = 0; //position within sample bytes
    int slicePos = 0; //horizontal (X-axis) position for pixels of next slice

    while (index + sizeof(short) <= length)
    {
        val = BitConverter.ToInt16(buffer, index); //16-bit signed sample, -32768 to 32767
        index += sizeof(short);

        // use the number in val to do something...
        // eg: Draw a line on canvas for part of waveform's pixels
        // eg: myBitmap.SetPixel(slicePos, y, Color.Green); where y is val scaled to the bitmap height

        slicePos++;
    }
}
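
The same idea in Python, as a rough sketch rather than a port of the linked code: decode the raw bytes into signed 16-bit samples, then scale each sample to a pixel row so that it can be drawn. It assumes little-endian data, as produced by -c:a pcm_s16le. The function names amplitudes and to_row are hypothetical.

```python
import struct


def amplitudes(buffer):
    """Decode little-endian signed 16-bit PCM bytes into a list of ints."""
    count = len(buffer) // 2  # a trailing odd byte is ignored
    return list(struct.unpack(f"<{count}h", buffer[:count * 2]))


def to_row(value, height):
    """Map a sample in -32768..32767 to a pixel row, with row 0 at the top."""
    return (32767 - value) * (height - 1) // 65535


if __name__ == "__main__":
    raw = struct.pack("<3h", 0, 32767, -32768)  # stand-in for ffmpeg's stdout
    for x, sample in enumerate(amplitudes(raw)):
        print(x, sample, to_row(sample, 120))
```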

If you want to do it manually without FFmpeg, you could try...

2) Decode audio to PCM
You could just load the audio file (mp3) into your app and first decode that to PCM (i.e. raw digital audio). Then read just the PCM numbers to make the waveform. Don't read numbers directly from the bytes of a compressed format like MP3, since those bytes are not amplitudes.

These PCM data values (about audio amplitudes) go into a byte array. If your sound is 16-bit then you extract the PCM value by reading each sample as a short (ie: getting value of two consecutive bytes at once since 16 bits == 2 bytes length).

Basically when you have 16-bit audio PCM inside a byte array, every two bytes represents an audio sample's amplitude value. This value becomes your height (loudness) at each slice. A slice is a 1-pixel vertical line from a time in the waveform.

Now sample rate means how many samples per second, usually 44100 samples (44.1 kHz). You can see that using 44 thousand pixels to represent one second of sound is not feasible, so divide the total number of samples (sample rate multiplied by total seconds) by the required waveform width. Take the result and multiply by 2 (to cover the two bytes of each sample), and that is how many bytes you jump between the amplitudes you sample as you form the waveform. Do this in a while loop.
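
A minimal Python sketch of this jump-and-sample loop, assuming mono little-endian 16-bit PCM already decoded into a bytes object. The function name downsample_pcm is hypothetical.

```python
import struct


def downsample_pcm(pcm, width_px):
    """Pick one 16-bit sample per pixel column from raw mono PCM bytes."""
    total_samples = len(pcm) // 2
    step = max(1, total_samples // width_px)  # samples to jump per pixel column
    stride = step * 2                         # the same jump measured in bytes
    values = []
    for offset in range(0, total_samples * 2, stride):
        (value,) = struct.unpack_from("<h", pcm, offset)
        values.append(value)
    return values[:width_px]  # drop any extras caused by integer division


if __name__ == "__main__":
    pcm = struct.pack("<8h", *range(8))  # eight fake samples: 0..7
    print(downsample_pcm(pcm, 4))
```

Taking a single sample per slice can miss peaks that fall between the jumps. Keeping the largest absolute value within each jump is a common alternative.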
