An MPEG audio file is built up from smaller parts called frames.
Generally, frames are independent items. Each frame has its own
header and audio information. As there is no file header, you can
cut out any part of an MPEG file and play it correctly (this should be
done on frame boundaries, but most applications will handle incorrect
headers). For Layer III, however, this is not entirely true. Due
to the internal data organization of MPEG Layer III files, frames are
often dependent on each other and cannot be cut apart so
freely.
When you want to read information about an MPEG
file, it is usually enough to find the first frame, read its header
and assume that the other frames are the same. But this may not
always be the case. Variable bitrate MPEG files may use so-called
bitrate switching, which means that the bitrate changes according to
the content of each frame. This way, lower bitrates may be used in
frames where they will not reduce sound quality. This allows
better compression while keeping high sound quality.
The frame header consists of the
very first four bytes (32 bits) of a frame. The first eleven bits
(or first twelve bits, see below about frame sync) of a frame header
are always set and they are called "frame sync". Therefore,
you can search through the file for the first occurrence of frame
sync (meaning that you have to find a byte with a value of 255
followed by a byte with its three (or four) most significant
bits set). Then you read the whole header and check whether the values
are valid. The following table gives the exact meaning
of each bit in the header. Any value that is specified as reserved,
invalid, bad, or not allowed indicates an invalid header.
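The search described above can be sketched in Python as follows. This is only a minimal illustration using the 11-bit form of the sync; the function name find_frame_sync is made up for this example, and a real reader would still validate the rest of the header before trusting a match.

```python
def find_frame_sync(data: bytes, start: int = 0) -> int:
    """Return the offset of the first candidate frame sync, or -1 if none."""
    pos = data.find(b"\xff", start)          # first byte: all eight bits set (255)
    while pos != -1 and pos + 1 < len(data):
        # Second byte: its three most significant bits must also be set.
        if (data[pos + 1] & 0xE0) == 0xE0:
            return pos
        pos = data.find(b"\xff", pos + 1)    # false alarm, keep searching
    return -1
```

A byte of 255 can also occur inside audio data, so a match is only a candidate until the remaining header fields have been checked.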
Frames may have a CRC check. The CRC is
16 bits long and, if it exists, it follows the frame header. After
the CRC comes the audio data. You may calculate the CRC of the frame,
and compare it with the one you read from the file. This is actually
a very good method to check the MPEG frame validity.
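As a rough illustration of that comparison, here is a minimal bitwise CRC-16 sketch in Python. The generator polynomial 0x8005 and the initial value 0xFFFF are not stated in this text; they are assumptions based on the CRC definition in ISO/IEC 11172-3 as it is commonly described. The function name mpeg_crc16 is hypothetical, and which bits of a frame are protected depends on the layer and is not covered here.

```python
def mpeg_crc16(data: bytes, crc: int = 0xFFFF) -> int:
    """Bitwise CRC-16, MSB first, polynomial 0x8005 (assumed), no final XOR."""
    for byte in data:
        crc ^= byte << 8                      # feed the next byte into the top bits
        for _ in range(8):
            if crc & 0x8000:                  # top bit set: shift and apply polynomial
                crc = ((crc << 1) ^ 0x8005) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc
```

The value returned for the protected bytes would be compared with the 16-bit value stored after the header. This byte-oriented sketch only suits protected regions that are a whole number of bytes long.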
Here is a presentation of the header content.
Characters from A to M are used to indicate different fields. In
the table below, you can see details about the content of each field.
AAAAAAAA AAABBCCD EEEEFFGH IIJJKLMM
Sign | Length (bits) | Position (bits) | Description |
A | 11 | (31-21) | Frame sync (all bits set) |
B | 2 | (20,19) | MPEG Audio version ID
00 - MPEG Version 2.5 (unofficial)
01 - reserved
10 - MPEG Version 2 (ISO/IEC 13818-3)
11 - MPEG Version 1 (ISO/IEC 11172-3)
Note: MPEG Version 2.5 is not an official standard. It is an
extension of the standard used for very low bitrate files. |
C | 2 | (18,17) | Layer description
00 - reserved
01 - Layer III
10 - Layer II
11 - Layer I |
D | 1 | (16) | Protection bit
0 - Protected by CRC (16-bit CRC follows header)
1 - Not protected |
E | 4 | (15-12) | Bitrate index
bits | V1,L1 | V1,L2 | V1,L3 | V2,L1 | V2, L2 & L3 |
0000 | free | free | free | free | free |
0001 | 32 | 32 | 32 | 32 | 8 |
0010 | 64 | 48 | 40 | 48 | 16 |
0011 | 96 | 56 | 48 | 56 | 24 |
0100 | 128 | 64 | 56 | 64 | 32 |
0101 | 160 | 80 | 64 | 80 | 40 |
0110 | 192 | 96 | 80 | 96 | 48 |
0111 | 224 | 112 | 96 | 112 | 56 |
1000 | 256 | 128 | 112 | 128 | 64 |
1001 | 288 | 160 | 128 | 144 | 80 |
1010 | 320 | 192 | 160 | 160 | 96 |
1011 | 352 | 224 | 192 | 176 | 112 |
1100 | 384 | 256 | 224 | 192 | 128 |
1101 | 416 | 320 | 256 | 224 | 144 |
1110 | 448 | 384 | 320 | 256 | 160 |
1111 | bad | bad | bad | bad | bad |
NOTES: All values are in kbps
V1 - MPEG Version 1
V2 - MPEG Version 2 and Version 2.5
L1 - Layer I
L2 - Layer II
L3 - Layer III
"free" means free format. If the fixed bitrate in use
(such files cannot use variable bitrate) differs from
those listed in the table above, it must be determined by the
application. This may be implemented only for internal purposes,
since third-party applications have no means to find out the correct
bitrate. However, doing so is not impossible, but it demands
a lot of effort.
"bad" means that this is not an allowed value
MPEG files may have a variable bitrate (VBR). Each frame may
be created with a different bitrate. VBR may be used in all layers.
Layer III decoders must support this method. Layer I & II
decoders may support it.
For Layer II there are some combinations of bitrate and mode
which are not allowed. Here is a list of allowed combinations.
bitrate | single channel | stereo | intensity stereo | dual channel |
free | yes | yes | yes | yes |
32 | yes | no | no | no |
48 | yes | no | no | no |
56 | yes | no | no | no |
64 | yes | yes | yes | yes |
80 | yes | no | no | no |
96 | yes | yes | yes | yes |
112 | yes | yes | yes | yes |
128 | yes | yes | yes | yes |
160 | yes | yes | yes | yes |
192 | yes | yes | yes | yes |
224 | no | yes | yes | yes |
256 | no | yes | yes | yes |
320 | no | yes | yes | yes |
384 | no | yes | yes | yes |
|
F | 2 | (11,10) | Sampling rate frequency index (values are in Hz)
bits | MPEG1 | MPEG2 | MPEG2.5 |
00 | 44100 | 22050 | 11025 |
01 | 48000 | 24000 | 12000 |
10 | 32000 | 16000 | 8000 |
11 | reserved | reserved | reserved |
|
G | 1 | (9) | Padding bit
0 - frame is not padded
1 - frame is padded with one extra slot
Padding is used to fit the bitrate exactly. For example,
128 kbps 44.1 kHz Layer II uses many 418-byte frames and some
417-byte frames to get the exact 128 kbps bitrate. For Layer I
a slot is 32 bits long; for Layer II and Layer III a slot is 8 bits
long. |
H | 1 | (8) | Private bit. It may be freely used for specific needs of an
application. |
I | 2 | (7,6) | Channel Mode
00 - Stereo
01 - Joint stereo (Stereo)
10 - Dual channel (2 mono channels)
11 - Single channel (Mono)
Note: Dual channel files are made of two independent mono channels.
Each one uses exactly half the bitrate of the file. Most decoders
output them as stereo, but this might not always be the case.
One example of use would be speech
in two different languages carried in the same bitstream, where
an appropriate decoder would decode only the chosen language. |
J | 2 | (5,4) | Mode extension (Only if Joint stereo)
Mode extension is used to join information that is of no
use for the stereo effect, thus reducing the needed resources. These
bits are dynamically determined by an encoder in Joint stereo
mode.
The complete frequency range of an MPEG file is divided into
32 subbands. For Layer I & II these two bits determine the
frequency range (bands) where intensity stereo is applied.
For Layer III these two bits determine which type of joint
stereo is used (intensity stereo or m/s stereo). The frequency
range is determined within the decompression algorithm.
value | Layer I & II | Layer III: Intensity stereo | Layer III: MS stereo |
00 | bands 4 to 31 | off | off |
01 | bands 8 to 31 | on | off |
10 | bands 12 to 31 | off | on |
11 | bands 16 to 31 | on | on |
|
K | 1 | (3) | Copyright
0 - Audio is not copyrighted
1 - Audio is copyrighted |
L | 1 | (2) | Original
0 - Copy of original media
1 - Original media |
M | 2 | (1,0) | Emphasis
00 - none
01 - 50/15 µs
10 - reserved
11 - CCITT J.17 |
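The field layout in the table above can be turned into a small decoder. The following Python sketch is only an illustration under stated assumptions: the names parse_header, BITRATES_KBPS and the other constants are invented for this example, the lookup tables are transcribed from the table, and any reserved or bad value is treated as an invalid header, as suggested earlier.

```python
import struct

VERSIONS = {0b00: "2.5", 0b10: "2", 0b11: "1"}   # 0b01 is reserved
LAYERS = {0b01: 3, 0b10: 2, 0b11: 1}             # 0b00 is reserved
# Bitrates in kbps for bitrate index 1..14 (index 0 = free, 15 = bad).
BITRATES_KBPS = {
    ("V1", 1): [32, 64, 96, 128, 160, 192, 224, 256, 288, 320, 352, 384, 416, 448],
    ("V1", 2): [32, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320, 384],
    ("V1", 3): [32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320],
    ("V2", 1): [32, 48, 56, 64, 80, 96, 112, 128, 144, 160, 176, 192, 224, 256],
    ("V2", 2): [8, 16, 24, 32, 40, 48, 56, 64, 80, 96, 112, 128, 144, 160],
}
BITRATES_KBPS[("V2", 3)] = BITRATES_KBPS[("V2", 2)]  # V2 shares L2 and L3 values
SAMPLE_RATES_HZ = {
    "1": [44100, 48000, 32000],
    "2": [22050, 24000, 16000],
    "2.5": [11025, 12000, 8000],
}
CHANNEL_MODES = ["stereo", "joint stereo", "dual channel", "single channel"]


def parse_header(raw: bytes):
    """Decode a 4-byte frame header; return a dict, or None if it is invalid."""
    if len(raw) < 4:
        return None
    (h,) = struct.unpack(">I", raw[:4])          # big-endian, so bit 31 comes first
    if h >> 21 != 0x7FF:                         # A: all 11 sync bits must be set
        return None
    version = VERSIONS.get((h >> 19) & 0b11)     # B
    layer = LAYERS.get((h >> 17) & 0b11)         # C
    bitrate_index = (h >> 12) & 0b1111           # E
    rate_index = (h >> 10) & 0b11                # F
    emphasis = h & 0b11                          # M
    if (version is None or layer is None or bitrate_index == 15
            or rate_index == 3 or emphasis == 0b10):
        return None                              # reserved or bad value
    column = ("V1" if version == "1" else "V2", layer)
    bitrate = None if bitrate_index == 0 else BITRATES_KBPS[column][bitrate_index - 1]
    return {
        "version": version,
        "layer": layer,
        "protected": ((h >> 16) & 1) == 0,       # D: 0 means a CRC follows
        "bitrate_kbps": bitrate,                 # None means free format
        "sample_rate_hz": SAMPLE_RATES_HZ[version][rate_index],
        "padding": bool((h >> 9) & 1),           # G
        "private": bool((h >> 8) & 1),           # H
        "channel_mode": CHANNEL_MODES[(h >> 6) & 0b11],  # I
        "mode_extension": (h >> 4) & 0b11,       # J: meaningful only for joint stereo
        "copyright": bool((h >> 3) & 1),         # K
        "original": bool((h >> 2) & 1),          # L
        "emphasis": emphasis,
    }
```

For example, parse_header(bytes([0xFF, 0xFB, 0x90, 0x64])) describes an MPEG Version 1 Layer III frame at 128 kbps and 44100 Hz in joint stereo, without CRC protection. The Layer II restrictions on bitrate and channel mode are not checked in this sketch.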