Conversion of audio files using SoX
Introduction
You have audio files for your application, but the speech platform does not accept their file type or audio format.
You therefore need to do one or more of the following:
- convert the audio files from one file format to another (e.g. raw to WAV)
- convert the audio encoding (e.g. 16-bit PCM to a-law)
- change the sampling rate (e.g. from 16 kHz to 8 kHz)
- reduce the number of channels (e.g. stereo to mono)
- apply filters to the signal
Needed Steps
- Download SoX: http://sourceforge.net/projects/sox/
SoX is open-source software.
The project describes it as follows: "SoX is meant to be the Swiss Army Knife of sound processing utils. It can convert audio files to other popular audio file types and also apply sound effects and filters during the conversion."
- Install SoX
SoX does not need to be installed. Simply unzip the archive to a folder of your choice.
- Using SoX
Assuming you extracted SoX to C:\sox, the usage is as follows.
- First example: convert the file in.wav (e.g. 16-bit PCM WAVE) into an a-law-encoded WAVE file with an 8 kHz sampling rate (out.wav):
Open a command shell and change to the folder where your audio file is stored. Then type and execute:
C:\sox\sox.exe -V in.wav -A -t .wav -r 8000 out.wav
Meaning of the options used:
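- -V = verbose
- -A = encoding of destination file: a-law (recent SoX releases deprecate this short form in favor of -e a-law; use that if your version rejects -A)
- -t .wav = file format of destination file: WAVE
- -r 8000 = sampling rate of destination file: 8 kHz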
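When the conversion has to be triggered from a program rather than typed by hand, the same call can be issued from Python. The following is a minimal sketch, assuming SoX was extracted to C:\sox as described above; build_sox_command and convert are hypothetical helper names, not part of SoX.

```python
import subprocess
from typing import List

# Assumed install location, taken from the text above.
SOX = r"C:\sox\sox.exe"


def build_sox_command(src: str, dst: str, rate: int = 8000) -> List[str]:
    """Return the argument list for the first example (a-law WAVE output)."""
    # The input file comes first; the format options that follow it
    # (-A, -t, -r) describe the output file, which is named last.
    return [SOX, "-V", src, "-A", "-t", ".wav", "-r", str(rate), dst]


def convert(src: str, dst: str) -> None:
    """Run SoX; check=True raises CalledProcessError on a non-zero exit code."""
    subprocess.run(build_sox_command(src, dst), check=True)
```

Calling convert("in.wav", "out.wav") from the folder that contains in.wav should then run the same command as the one typed manually. Passing the arguments as a list avoids quoting problems with file names that contain spaces.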
- Second example: convert a folder containing many audio files (batch conversion)
Put the following line into a batch file (e.g. convert.bat; this is a single line of code):
for %%f in (*.wav) do (C:\sox\sox.exe -V "%%f" -c 1 -A -t .wav -r 8000 "C:\temp\converted\%%~nf.wav" >> C:\temp\converted\conversion.log 2>&1)
Save the file in the folder where your audio files are stored and run it. It processes all *.wav files in that folder.
Converted files are stored in C:\temp\converted. This folder must exist before you start the batch file. The output of SoX is appended to C:\temp\converted\conversion.log.
Meaning of the SoX options used:
- -V = verbose
- -c 1 = number of channels of destination file: 1
- -A = encoding of destination file: a-law
- -t .wav = file format of destination file: WAVE
- -r 8000 = sampling rate of destination file: 8 kHz
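On systems without a Windows command shell, the batch conversion can be approximated in Python. This is a sketch under assumptions, not a replacement for the batch file: the SoX path is the one used in the text, plan_batch and run_batch are hypothetical names, and unlike the batch file the sketch creates the destination folder if it is missing.

```python
import subprocess
from pathlib import Path
from typing import List

# Assumed install location, taken from the text above.
SOX = r"C:\sox\sox.exe"


def plan_batch(src_dir: Path, dst_dir: Path) -> List[List[str]]:
    """Build one SoX command per *.wav file in src_dir (sorted by name)."""
    commands = []
    for src in sorted(src_dir.glob("*.wav")):
        dst = dst_dir / src.name  # same file name, different folder
        commands.append([SOX, "-V", str(src), "-c", "1", "-A",
                         "-t", ".wav", "-r", "8000", str(dst)])
    return commands


def run_batch(src_dir: Path, dst_dir: Path,
              log_name: str = "conversion.log") -> None:
    """Run every planned command and append SoX's output to a log file."""
    dst_dir.mkdir(parents=True, exist_ok=True)
    with open(dst_dir / log_name, "a") as log:
        for cmd in plan_batch(src_dir, dst_dir):
            # stderr is merged into stdout, like "2>&1" in the batch file.
            subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT)
```

Separating plan_batch from run_batch makes it possible to print and inspect the commands before any file is converted.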
For further options and examples, refer to the SoX documentation (sox.txt and soxexam.txt) included in the downloaded archive.
Note: SoX is a very powerful tool that can manipulate audio files in many different ways. Refer to the documentation mentioned above for a complete overview.
Convert Audio Files For Prophecy
Prophecy prefers audio files as 8-bit, 8 kHz, u-law WAVE files (RIFF format). To convert audio files into this format, use the following command lines:
- 16-bit, 8 kHz WAV (as created by Audacity) to 8-bit, 8 kHz, u-law WAVE:
$ sox -V my16bitAudio.wav -b 8 -U -t .wav -r 8000 my8bitAudio.wav channels 1 rate 8k
- Apple AIFF (e.g. created with GarageBand) to 8-bit, 8 kHz, u-law WAVE:
$ sox -V myAIFFAudio.aif -b 8 -U -t .wav -r 8000 my8bitAudio.wav channels 1 rate 8k
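After a conversion it can be useful to confirm that the result really has the preferred format. Python's standard wave module only reads PCM files, so the sketch below parses the RIFF "fmt " chunk directly. The helper names read_wav_format and is_prophecy_format are hypothetical, and the mono requirement is an assumption taken from the "channels 1" effect in the commands above.

```python
import struct
from typing import BinaryIO, Dict

# WAVE format tags as they appear in the "fmt " chunk.
FORMAT_NAMES = {1: "pcm", 6: "a-law", 7: "u-law"}


def read_wav_format(stream: BinaryIO) -> Dict[str, object]:
    """Return encoding, channels, rate and bits from a RIFF/WAVE header."""
    head = stream.read(12)
    if len(head) < 12 or head[:4] != b"RIFF" or head[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    while True:
        chunk_head = stream.read(8)
        if len(chunk_head) < 8:
            raise ValueError("no fmt chunk found")
        chunk_id, chunk_size = struct.unpack("<4sI", chunk_head)
        if chunk_id == b"fmt ":
            # Fields: format tag, channels, sample rate, byte rate,
            # block align, bits per sample (all little-endian).
            tag, channels, rate, _byte_rate, _align, bits = struct.unpack(
                "<HHIIHH", stream.read(16))
            return {"encoding": FORMAT_NAMES.get(tag, "unknown"),
                    "channels": channels, "rate": rate, "bits": bits}
        # Skip any other chunk; chunks are padded to an even length.
        stream.seek(chunk_size + (chunk_size & 1), 1)


def is_prophecy_format(info: Dict[str, object]) -> bool:
    """True for 8-bit, 8 kHz, mono, u-law audio."""
    return info == {"encoding": "u-law", "channels": 1,
                    "rate": 8000, "bits": 8}
```

A file can then be checked with open("my8bitAudio.wav", "rb") as f: print(is_prophecy_format(read_wav_format(f))).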
Links