Are there any libraries that can create a waveform image out of an audio file (mp3)?
http://www.freesound.org/blog/?p=10
I found this library, and it works well.
Generate visual (waveform) from MP3/WAV file in Windows 2008 Server?
SoX, "the Swiss Army knife of audio manipulation", can generate accurate PNG spectrograms from sound files. It reads pretty much any format, and binaries are available for Windows. At the most basic level, you'd use something like this:
sox my.wav -n spectrogram
If you want a spectrogram with no axes, titles, or legends, with a light background, and with a target height of 130px (SoX may render it slightly shorter):
sox "Me, London.mp3" -n spectrogram -Y 130 -l -r -o "Me, London.png"
SoX accepts many more options, for example if you only want to analyze a single channel. If you need fancier visuals, you could post-process the resulting PNG.
Here is a short overview of all available parameters, as printed by the command-line help; the man page has more details:
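As a sketch of the single-channel case: SoX's remix effect can select one input channel before the spectrogram effect runs. The function name spectrogram_channel and the file names are placeholders for illustration, not part of SoX.

```shell
# Hypothetical helper: render a spectrogram of a single channel.
# Usage: spectrogram_channel INPUT CHANNEL OUTPUT.png
#   CHANNEL is 1 for the first (left) channel, 2 for the second (right).
spectrogram_channel() {
  # "remix N" keeps only input channel N; "-n" discards the audio output,
  # since only the PNG written by the spectrogram effect is wanted.
  sox "$1" -n remix "$2" spectrogram -o "$3"
}
```

For example, spectrogram_channel "Me, London.mp3" 1 left.png would write the left channel's spectrogram to left.png.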
-x num    X-axis size in pixels; default derived or 800
-X num    X-axis pixels/second; default derived or 100
-y num    Y-axis size in pixels (per channel); slow if not 1 + 2^n
-Y num    Y-height total (i.e. not per channel); default 550
-z num    Z-axis range in dB; default 120
-Z num    Z-axis maximum in dBFS; default 0
-q num    Z-axis quantisation (0 - 249); default 249
-w name   Window: Hann (default), Hamming, Bartlett, Rectangular, Kaiser
-W num    Window adjust parameter (-10 - 10); applies only to Kaiser
-s        Slack overlap of windows
-a        Suppress axis lines
-r        Raw spectrogram; no axes or legends
-l        Light background
-m        Monochrome
-h        High colour
-p num    Permute colours (1 - 6); default 1
-A        Alternative, inferior, fixed colour-set (for compatibility only)
-t text   Title text
-c text   Comment text
-o text   Output file name; default `spectrogram.png'
-d time   Audio duration to fit to X-axis; e.g. 1:00, 48
-S time   Start the spectrogram at the given time through the input
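To illustrate how -t and -o combine in practice, here is a minimal sketch that writes one spectrogram per input file, titled and named after that file. The function name spectrogram_batch is made up for this example.

```shell
# Hypothetical helper: one spectrogram PNG per input file.
# Usage: spectrogram_batch FILE...
spectrogram_batch() {
  for input in "$@"; do
    output="${input%.*}.png"       # replace the last extension with .png
    title=$(basename "$input")     # use the bare file name as the title text
    sox "$input" -n spectrogram -t "$title" -o "$output"
  done
}
```

Calling spectrogram_batch *.mp3 would then produce a PNG next to each MP3 in the current directory.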
Generating a waveform using ffmpeg
Default waveform
ffmpeg -i input.wav -filter_complex showwavespic -frames:v 1 output.png
Notes
Any segment of silent audio shows up as an empty gap in the waveform (see "Fancy waveform" below if you want to see how to add a line).
The background is transparent.
Default colors are red (left channel) and green (right channel) for a stereo input. The color is mixed where the channels overlap.
You can change the channel colors with the colors option, such as "showwavespic=colors=blue|yellow". See a list of valid color names or use hexadecimal notation, such as #ffcc99.
See the showwavespic filter documentation for additional options.
If you want a video instead of an image use the showwaves filter.
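The size and colors options can be wrapped in a small function. This is only a sketch: the name waveform_png and its argument order are assumptions, while 600x240 is the default size given in the showwavespic documentation and red|green matches the default colors described above.

```shell
# Hypothetical helper: render a waveform PNG with optional size and colors.
# Usage: waveform_png INPUT OUTPUT.png [WIDTHxHEIGHT] [COLORS]
waveform_png() {
  size=${3:-600x240}          # falls back to the filter's default size
  colors=${4:-"red|green"}    # one color per channel, separated by "|"
  ffmpeg -i "$1" \
    -filter_complex "showwavespic=s=${size}:colors=${colors}" \
    -frames:v 1 "$2"
}
```

For example, waveform_png input.wav output.png 800x200 "blue|yellow" draws an 800x200 image with a blue left channel and a yellow right channel.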
Fancy waveform
ffmpeg -i input.mp4 -filter_complex \
"[0:a]aformat=channel_layouts=mono, \
compand=gain=-6, \
showwavespic=s=600x120:colors=#9cf42f[fg]; \
color=s=600x120:color=#44582c, \
drawgrid=width=iw/10:height=ih/5:color=#9cf42f@0.1[bg]; \
[bg][fg]overlay=format=auto,drawbox=x=(iw-w)/2:y=(ih-h)/2:w=iw:h=1:color=#9cf42f" \
-frames:v 1 output.png
Explanation of options
aformat downmixes the audio to mono. Otherwise, by default, a stereo input would result in a waveform with a different color for each channel (see Default waveform example above).
compand modifies the dynamic range of the audio to make the waveform look less flat. It makes a less accurate representation of the actual audio, but can be more visually appealing for some inputs.
showwavespic makes the actual waveform.
color source filter is used to make a colored background that is the same size as the waveform.
drawgrid adds a grid over the background. The grid does not represent anything, but is just for looks. The grid color is the same as the waveform color (#9cf42f), but opacity is set to 10% (@0.1).
overlay will place [bg] (what I named the filtergraph for the background) behind [fg] (the waveform).
Finally, drawbox will make the horizontal line so any silent areas are not blank.
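Because the size and the waveform color each appear several times in the "Fancy waveform" filtergraph, it can help to build the string from variables. The sketch below assumes a helper named fancy_filter (not an ffmpeg feature) that prints the same filtergraph with the size, foreground color, and background color substituted in.

```shell
# Hypothetical helper: print the "fancy waveform" filtergraph.
# Usage: fancy_filter WIDTHxHEIGHT WAVE_COLOR BACKGROUND_COLOR
fancy_filter() {
  size=$1; fg=$2; bg=$3
  # printf reuses the '%s' format for every argument, so the pieces are
  # concatenated without separators or a trailing newline.
  printf '%s' \
    "[0:a]aformat=channel_layouts=mono,compand=gain=-6," \
    "showwavespic=s=${size}:colors=${fg}[fg];" \
    "color=s=${size}:color=${bg}," \
    "drawgrid=width=iw/10:height=ih/5:color=${fg}@0.1[bg];" \
    "[bg][fg]overlay=format=auto," \
    "drawbox=x=(iw-w)/2:y=(ih-h)/2:w=iw:h=1:color=${fg}"
}
```

It would be used as: ffmpeg -i input.mp4 -filter_complex "$(fancy_filter 600x120 '#9cf42f' '#44582c')" -frames:v 1 output.png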
Gradient example
Using the gradients filter:
ffmpeg -i input.mp3 -filter_complex "gradients=s=1920x1080:c0=000000:c1=434343:x0=0:x1=0:y0=0:y1=1080,drawbox=x=(iw-w)/2:y=(ih-h)/2:w=iw:h=1:color=#0000ff[bg];[0:a]aformat=channel_layouts=mono,showwavespic=s=1920x1080:colors=#0068ff[fg];[bg][fg]overlay=format=auto" -frames:v 1 output.png
Color background
ffmpeg -i input.opus -filter_complex "color=c=blue[color];aformat=channel_layouts=mono,showwavespic=s=1280x720:colors=white[wave];[color][wave]scale2ref[bg][fg];[bg][fg]overlay=format=auto" -frames:v 1 output.png
The scale2ref filter automatically makes the background the same size as the waveform.
Image background
Of course, you can use an image or video for the background instead:
ffmpeg -i audio.flac -i background.jpg -filter_complex \
"[1:v]scale=600:-1,crop=iw:120[bg]; \
[0:a]showwavespic=s=600x120:colors=cyan|aqua[fg]; \
[bg][fg]overlay=format=auto" \
-q:v 3 showwavespic_bg.jpg
Getting waveform stats and data
Use the astats filter. Many stats are available: RMS, peak, min, max, difference, etc.
RMS level per audio frame
Example to get standard RMS level measured in dBFS per audio frame:
ffprobe -v error -f lavfi -i "amovie=input.wav,astats=metadata=1:reset=1" -show_entries frame_tags=lavfi.astats.Overall.RMS_level -of csv=p=0 > rms.log
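Once the values are in rms.log (one dBFS number per line), a short awk pass can summarize them. The following is a sketch only; the function name loudest_rms is an assumption, and it skips non-numeric lines such as "-inf", which astats reports for digital silence.

```shell
# Hypothetical helper: count the frames in an RMS log and report the loudest.
# Usage: loudest_rms rms.log
loudest_rms() {
  awk '
    /^-?[0-9]+(\.[0-9]+)?$/ {          # numeric lines only; skips "-inf" and blanks
      n++
      if (n == 1 || $1 > max) max = $1 + 0
    }
    END { if (n) printf "%d frames, loudest RMS %.2f dBFS\n", n, max }
  ' "$1"
}
```

Values closer to 0 dBFS are louder, so the maximum is the loudest frame.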
Peak level per second
Add the asetnsamples filter so that each frame holds one second of audio (44100 samples, assuming a 44.1 kHz sample rate).
ffprobe -v error -f lavfi -i "amovie=input.wav,asetnsamples=44100,astats=metadata=1:reset=1" -show_entries frame_tags=lavfi.astats.Overall.Peak_level -of csv=p=0
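Same as above but with timestamps (newer FFmpeg releases dropped pkt_pts_time from ffprobe; if no timestamp is printed, try frame=pts_time instead)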
ffprobe -v error -f lavfi -i "amovie=input.wav,asetnsamples=44100,astats=metadata=1:reset=1" -show_entries frame=pkt_pts_time:frame_tags=lavfi.astats.Overall.Peak_level -of csv=p=0
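With timestamps, each CSV line has the form time,level, which makes it easy to locate the loudest second. As a sketch, assuming the output of the command above was redirected to a file (peaks.csv is a made-up name, as is the function loudest_second):

```shell
# Hypothetical helper: find the timestamp of the highest peak level.
# Usage: loudest_second peaks.csv   (lines look like "1.000000,-1.25")
loudest_second() {
  awk -F, '
    $2 ~ /^-?[0-9]+(\.[0-9]+)?$/ {     # ignore "-inf" and malformed lines
      if (!seen || $2 + 0 > max) { max = $2 + 0; at = $1; seen = 1 }
    }
    END { if (seen) printf "peak %.2f dBFS at %s s\n", max, at }
  ' "$1"
}
```

If several seconds share the same peak level, the first one is reported.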
Output to file
Just append > output.log to the end of your command:
ffprobe -v error -f lavfi -i "amovie=input.wav,asetnsamples=44100,astats=metadata=1:reset=1" -show_entries frame_tags=lavfi.astats.Overall.RMS_level -of csv=p=0 > output.log
JSON
ffprobe -v error -f lavfi -i "amovie=input.wav,asetnsamples=44100,astats=metadata=1:reset=1" -show_entries frame_tags=lavfi.astats.Overall.RMS_level -of json > output.json
Generate .WAV sound frequency?
https://bitbucket.org/corfr/wavegenerator/src
A friend wrote this one. You need:
Linux (I have used it successfully on CentOS and Ubuntu)
libmad
If I remember correctly, that was enough. It generates a .png from an .mp3 file using libmad. The code is quite simple to understand, and as always, feel free to submit improvements!
It generates a waveform pretty close to what you can find on SoundCloud, for example.