Digitization has brought convenience and entertainment into our lives through all kinds of digital devices. More and more of these devices are expected to provide voice prompts for their content, especially for elderly and disadvantaged users. As a result, the voice module has become a focus of the industry, and manufacturers all over the world are investing in the development and application of voice module technology.
Many circuits convey their internal information to users through sound effects or voice. The simplest and most common way is to drive a speaker directly; a more advanced way is to use a voice processor that generates speech and sound effects.
This article is the first chapter of Computer Sound Effects: How to Make Your Computer Sing (The Basic Principles), part of the “IoT series.” It introduces the basic principles of computer sound effects; later chapters will build on them.
What is MP3
MP3 is a compression format used to store digital audio data. Audio compressed to MP3 on a computer is generally only 1/10 to 1/12 of its original size. This compression, however, discards part of the original data: the smaller the file, the greater the loss, which is why it is called lossy compression.
MP3 is short for ISO-MPEG Audio Layer-3 (MPEG Audio Layer III). Its development began in 1987 as part of a Digital Audio Broadcasting project. Note that MP3 and MPEG-3 are two completely different things. The format achieves a high compression ratio (a small file size) without an obvious loss in audio quality. It relies on a technique called perceptual coding, which discards the parts of the signal that listeners are least likely to notice, so the file shrinks while the audio stays approximately faithful to the original.
Because MP3 audio is the result of lossy compression, its quality is always slightly different from the original, but most people can’t tell the difference. Small files with decent audio quality are what made MP3 music so popular, and they are also why MP3 is so common on the Internet.
Audio File Compression
Generally speaking, digitized audio produces a huge amount of data, so we want to compress it. Audio stored without discarding any data is usually known as lossless audio (FLAC is one well-known lossless format). It preserves the sound exactly as it was digitized, whether that is a voice or a musical instrument. Digital music usually comes as a CD, WAV or MP3, and because MP3 files are usually converted from CDs, we call CD audio and WAV files “lossless audio.”
The other kind of compression is lossy compression: after the data is compressed and decompressed it is no longer identical to the original, only very close to it. Formats of this kind are known as lossy formats; common ones include MP3, RM, etc. Simply put, MP3 removes the sounds in CD music that people can’t hear in order to reduce file size, at some cost in quality.
WAV, mentioned above, is the most common lossless audio format. It was developed by Microsoft and conforms to the Resource Interchange File Format (RIFF) specification.
Every WAV file has a header that stores the parameters of the audio stream. WAV places no strict rules on how the stream itself is encoded; Pulse-code modulation (PCM) is the usual choice, and any kind of PCM encoding is compatible with WAV.
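To make the header concrete, here is a minimal sketch that reads those parameters with the `wave` module from Python's standard library. The helper names `describe_wav` and `make_silent_wav` are made up for this illustration, and the sketch assumes plain PCM data, which is the only encoding the `wave` module handles.

```python
import io
import wave


def describe_wav(source):
    """Read the header of a PCM WAV file (path or file object) into a dict."""
    with wave.open(source, "rb") as wav:
        channels = wav.getnchannels()
        sample_width = wav.getsampwidth()  # bytes per sample
        sample_rate = wav.getframerate()   # sample frames per second
        frames = wav.getnframes()          # total number of sample frames
    return {
        "channels": channels,
        "bits_per_sample": sample_width * 8,
        "sample_rate_hz": sample_rate,
        "duration_s": frames / sample_rate,
        "bit_rate_bps": sample_rate * sample_width * 8 * channels,
    }


def make_silent_wav(seconds=1, sample_rate=44100, channels=2, sample_width=2):
    """Build a silent WAV file in memory so the example needs no file on disk."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00" * (seconds * sample_rate * channels * sample_width))
    buffer.seek(0)  # rewind so the file can be read back
    return buffer


if __name__ == "__main__":
    for name, value in describe_wav(make_silent_wav()).items():
        print(f"{name}: {value}")
```

Passing a path such as `describe_wav("song.wav")` (a hypothetical file name) should work the same way for an ordinary PCM WAV file on disk.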
An Introduction to WAV
WAV is the most common digital music file format. Its defining feature is that it is not compressed at all, so it offers the best sound quality. It works on both PC and Mac, and of course on audio samplers.
Unlike traditional sample material on Audio CD, a WAV file does not need an audio sampler: you can use the digital file directly in an audio editor, drag with the mouse to form loops, and save yourself a great deal of post-production time. File compatibility is not a problem either, because WAV is so common that almost every audio editor supports it.
Pros and Cons of WAV:
- Pros: Uncompressed, best quality, raw audio, high compatibility with all audio editors.
- Cons: Huge files that demand a lot of storage and are difficult to share. A WAV file usually needs about 10 MB for one minute of audio, which is a big issue on devices with little storage or a slow internet connection.
Here is a simple chart comparing WAV and MP3:
When talking about WAV and MP3, we also need to talk about bit rate. Many people might not know what bit rate is, so here is a simple explanation:
Bit rate is the amount of data it takes to represent one second of audio. Take WAV and MP3 as an example: WAV is like DVD video, with better quality but a bigger file size, while MP3 is like VCD, inferior to DVD in quality but with a smaller file size.
Formula 1: Bit rate
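As a rough illustration of where the 1411.2 kbps used in the WAV example in this article comes from, the sketch below computes the bit rate of uncompressed PCM audio as sample rate × bits per sample × channels. The CD-quality defaults (44,100 Hz, 16 bits, 2 channels) are our assumption; the article does not state them. The function name `pcm_bit_rate_kbps` is also just for illustration.

```python
def pcm_bit_rate_kbps(sample_rate_hz=44100, bits_per_sample=16, channels=2):
    """Bit rate of uncompressed PCM audio in kilobits per second (1 kb = 1000 bits).

    The defaults are assumed CD-quality parameters, not values from the article.
    """
    bits_per_second = sample_rate_hz * bits_per_sample * channels
    return bits_per_second / 1000


if __name__ == "__main__":
    print(f"CD-quality PCM: {pcm_bit_rate_kbps():.1f} kbps")  # prints 1411.2 kbps
```

A lossy format such as MP3 does not follow this formula: its bit rate (for example the 192 kbps used later in this article) is a setting chosen by the encoder.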
The above shows the bit rate of WAV and other formats, but how is the size of a WAV file determined?
WAV file formula:
Dividing the bit rate by 8 gives the disk space the file uses per second (KB/s); multiplying that result by the length of the song in seconds gives the total file size.
Formula 2: WAV file size
Example for a WAV file of a 5-minute song:
- 1411.2 kbps / 8 (bits per byte) = 176.4 KB/s (unit conversion)
- 176.4 KB/s × (60 seconds × 5 minutes) = 52920 KB (file size)
- 52920 KB / 1024 (KB per MB) = about 51.68 MB
The size of an MP3 file is calculated the same way.
Example for an MP3 file of a 5-minute song:
- 192 kbps / 8 (bits per byte) = 24 KB/s (unit conversion)
- 24 KB/s × (60 seconds × 5 minutes) = 7200 KB (file size)
- 7200 KB / 1024 (KB per MB) = about 7 MB
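The two worked examples can be sketched as one small function. This is only an estimate under the article's own conventions: it divides the bit rate by 8 to get KB/s, divides by 1024 to get MB, and ignores file headers and metadata. The function name `audio_file_size_mb` is made up for this sketch.

```python
def audio_file_size_mb(bit_rate_kbps, duration_s):
    """Estimate audio file size in MB from bit rate (kbps) and length (seconds).

    Follows the article's arithmetic: kbps / 8 -> KB/s, then KB / 1024 -> MB.
    Headers and metadata are ignored, so real files will differ slightly.
    """
    kb_per_second = bit_rate_kbps / 8
    total_kb = kb_per_second * duration_s
    return total_kb / 1024


if __name__ == "__main__":
    five_minutes = 60 * 5
    print(f"WAV: {audio_file_size_mb(1411.2, five_minutes):.2f} MB")  # 51.68 MB
    print(f"MP3: {audio_file_size_mb(192, five_minutes):.2f} MB")     # 7.03 MB
```

With these numbers the MP3 comes out at roughly one seventh of the WAV size; lower MP3 bit rates shrink the file further.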
This article is the first chapter of Computer Sound Effects: How to Make Your Computer Sing (The Basic Principles), part of the “IoT series.” It aims to show readers how to build sound and voice features when developing IoT products.
This chapter also introduces the Computer Voices sub-series of the IoT series, which is meant to help readers understand how to create sound effects, music and voices for IoT applications, development and design. We hope to present more IoT products and experiments in this column in the future.
More articles in the IoT series will follow, and we hope they lead to better and more forward-looking IoT products and technology.
Please stay tuned for more articles.