MPEG-2 Elementary Video
- stream structure
- Sequence headers
- Sequence Extension
- Sequence Display Extension
- Quant Matrix Extension
- Copyright extension
- Sequence Scalable Extension
- Group of Pictures
- Picture headers
- Picture Display Extension
- Picture Coding Extension
- Slice
The structure differs from that of MPEG-1 elementary video in terms of the possible extensions and, in the case of the TMPG encoder, the number of slices.
Stream structure
Structure of an MPEG-2 elementary video stream.
Video stream:
Sequence | Sequence | ... | Sequence |
The video stream consists of several consecutive sequences.
Sequence:
Sequence start code |
video parameters |
bitstream parameters |
QT's, Misc |
Group of Pictures |
... | Group of Pictures |
Sequence end code |
A sequence can, and should, contain further sequence headers, usually one before each Group of Pictures (GOP). The sequence headers are needed for creating entry points (chapter marks) and for fast scrolling. The sequence should end with a sequence end code, although this does not seem to be common with MPEG-2.
Group of Pictures (GOP):
GOP start code |
Time code |
GOP params |
Picture | ... | Picture |
The GOP contains the different picture types. There are I, P, B and D frames.
Intra pictures (I-frames) are complete images. Predicted pictures (P-frames) are partial pictures and refer to the previous I- or P-frame. Bidirectional pictures (B-frames) are also partial pictures and refer to the preceding and the following I- or P-frame. DC-coded pictures (D-frames) are not used in MPEG-2.
Picture:
Picture start code |
Type | buffer params |
Encode Params |
Slice | ... | Slice |
The order of the images does not correspond to the order in which they are displayed. The images are composed of slices.
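The difference between coded order and display order can be illustrated with a small Python sketch. It assumes each picture carries a temporal reference (the display position within the GOP, stored in the picture header); the sample GOP and the function name `display_order` are invented for illustration.

```python
# Coded (bitstream) order of one hypothetical GOP: (picture type, temporal reference).
coded_order = [("I", 2), ("B", 0), ("B", 1), ("P", 5), ("B", 3), ("B", 4)]


def display_order(pictures):
    """Sort the pictures of one GOP by their temporal reference."""
    return sorted(pictures, key=lambda pic: pic[1])


shown = display_order(coded_order)
print("coded:  ", "".join(kind for kind, _ in coded_order))
print("display:", "".join(kind for kind, _ in shown))
```

The I- and P-frames are transmitted before the B-frames that depend on them, which is why the two orders differ.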
Slice:
Slice start code |
Vertical position |
QScale | macro block | ... | macro block |
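A slice is a run of consecutive macroblocks. In MPEG-2 a slice lies within a single macroblock row, so a picture is divided into slices both vertically and horizontally.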
macro block:
Address Increment |
Type | Motion Vector |
QScale | CBP | b0 | ... | b5 |
A macroblock covers an image area of 16x16 picture elements (pixels). It stores the YUV 4:2:0 (YCbCr) color information. The Y components of all pixels are contained in the first four blocks b0 to b3. The fifth block b4 stores the blue chrominance values (Cb), one value for every four pixels. The corresponding red chrominance values (Cr) are in the sixth block b5.
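As a rough sketch of the arithmetic behind this layout, the following Python snippet counts the macroblocks of a picture and the samples per 4:2:0 macroblock. The function names are made up for illustration, and the rounding shown is the simple case (it ignores the field-based rounding that applies to interlaced sequences).

```python
def macroblock_grid(width: int, height: int) -> tuple[int, int]:
    """Number of 16x16 macroblocks per row and per column (rounded up)."""
    return (width + 15) // 16, (height + 15) // 16


def samples_per_macroblock_420() -> dict:
    """Samples in one 4:2:0 macroblock, split by component."""
    # Four 8x8 luma blocks (b0..b3), one 8x8 Cb block (b4), one 8x8 Cr block (b5).
    return {"Y": 4 * 64, "Cb": 64, "Cr": 64}


cols, rows = macroblock_grid(720, 576)
print(cols, rows, cols * rows)
print(samples_per_macroblock_420())
```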
Sequence headers
Each header starts with the PACK_START_CODE_PREFIX, which consists of the three bytes 0, 0 and 1, written in hexadecimal as $000001. This is followed by the ID; for the sequence header this is the value $B3. The length of the header varies depending on which matrices are included.
Each sequence header represents a possible entry point. This means that a chapter can only be created where there is a sequence header.
Reportedly, sequence headers are sometimes marked with $000000B3 instead of $000001B3.
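The start-code layout described above can be sketched in a few lines of Python. The function name `find_start_codes` and the sample bytes are invented for illustration; the sketch only looks for the regular $000001 prefix and does not handle the irregular $000000B3 case.

```python
def find_start_codes(data: bytes):
    """Yield (offset, id_byte) for every 00 00 01 xx start code in data."""
    prefix = b"\x00\x00\x01"
    pos = data.find(prefix)
    while pos != -1 and pos + 3 < len(data):
        yield pos, data[pos + 3]
        pos = data.find(prefix, pos + 3)


# Hypothetical sample: a sequence header code, two payload bytes, a GOP start code.
sample = b"\x00\x00\x01\xb3" + b"\x12\x34" + b"\x00\x00\x01\xb8" + b"\x00"
codes = list(find_start_codes(sample))
for offset, ident in codes:
    print(f"offset {offset}: ID ${ident:02X}")
```

Collecting the offsets of all $B3 codes gives the list of possible entry points mentioned above.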
Construction
- 4 bytes: SEQUENCE_HEADER_CODE = $000001B3
- 12 bits: Width - image width in pixels
- 12 bits: Height - image height in pixels
- 4 bits: Aspect ratio
- 4 bits: Frame rate
- 18 bits: Bit rate
- 1 bit: Marker
- 10 bits: VBV buffer size
- 1 bit: Constrained parameter flag
- 1 bit: Load intra matrix (1) or use the standard matrix (0)?
- 64 bytes: Intra matrix, only if a non-standard matrix is used.
- 1 bit: Load non-intra matrix (1) or use the standard matrix (0)?
- 64 bytes: Non-intra matrix, only if a non-standard matrix is used.
As a scheme (with both matrices present):
Byte | Contents (bit 7 first) |
0-3 | SEQUENCE_HEADER_CODE $000001B3 |
4 | Image width (8 bits) |
5 | Image width (4 bits) | Image height (4 bits) |
6 | Image height (8 bits) |
7 | Aspect ratio (4 bits) | Frame rate (4 bits) |
8 | Bit rate (8 bits) |
9 | Bit rate (8 bits) |
10 | Bit rate (2 bits) | Marker (1 bit) | VBV (5 bits) |
11 | VBV (5 bits) | CPF (1 bit) | Load? (1 bit) | Intra matrix (1 bit) |
12-74 | Intra matrix |
75 | Intra matrix (7 bits) | Load? (1 bit) |
76-139 | Non-intra matrix |
The MPEG type can be determined directly from the sequence header only if the aspect ratio has a value above 4; in that case the stream is MPEG-1 video.
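A minimal Python sketch of reading the fixed part of the header (bytes 0 to 11 of the scheme) might look as follows. The function name, the dictionary keys and the sample values are assumptions chosen for illustration; the matrices are not parsed.

```python
def parse_sequence_header(buf: bytes) -> dict:
    """Decode the fixed-length fields of a sequence header (bytes 0..11)."""
    if len(buf) < 12 or buf[:4] != b"\x00\x00\x01\xb3":
        raise ValueError("not a sequence header")
    bits = int.from_bytes(buf[4:12], "big")  # the 64 bits after the start code

    def take(shift: int, width: int) -> int:
        return (bits >> shift) & ((1 << width) - 1)

    return {
        "width": take(52, 12),
        "height": take(40, 12),
        "aspect_ratio": take(36, 4),
        "frame_rate": take(32, 4),
        "bit_rate": take(14, 18),          # still in units of 400 bit/s
        "marker": take(13, 1),
        "vbv_buffer_size": take(3, 10),
        "constrained_parameter_flag": take(2, 1),
        "load_intra_matrix": take(1, 1),
    }


# Hypothetical sample: 720x576, aspect ratio 2, frame rate 3, bit rate field 15000.
fields = (720 << 52) | (576 << 40) | (2 << 36) | (3 << 32) | (15000 << 14) | (1 << 13) | (112 << 3)
sample = b"\x00\x00\x01\xb3" + fields.to_bytes(8, "big")
hdr = parse_sequence_header(sample)
print(hdr)
```

Reading the eight bytes as one integer and shifting keeps the sketch short; a real parser would use a bit reader so that the optional matrices can follow.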
Explanations
The values for the aspect ratio:
value | Aspect ratio text |
0 | 'forbidden' |
1 | 1:1 square pixels |
2 | 4:3 screen |
3 | 16:9 screen |
4 | 2.21:1 display |
5-15 | 'reserved' |
The values for the frame rate (refresh rate):
value | Frame rate number | Frame rate text |
0 | 0 | 'forbidden' |
1 | 24000/1001.0 | '23.976 fps -- NTSC encapsulated film rate' |
2 | 24.0 | 'Standard international cinema film rate' |
3 | 25.0 | 'PAL (625/50) video frame rate' |
4 | 30000/1001.0 | '29.97 -- NTSC video frame rate' |
5 | 30.0 | 'NTSC drop-frame (525/60) video frame rate' |
6 | 50.0 | 'double frame rate/progressive PAL' |
7 | 60000/1001.0 | 'double frame rate NTSC' |
8 | 60.0 | 'double frame rate drop frame NTSC' |
9-15 | 0 | 'reserved' |
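The frame rate table can be kept exact by storing the rates as fractions, as in the following sketch. The names `FRAME_RATES` and `frame_rate_from_code` are made up for illustration.

```python
from fractions import Fraction

# Frame rate codes 1..8 from the table above; 0 is forbidden, 9..15 are reserved.
FRAME_RATES = {
    1: Fraction(24000, 1001),
    2: Fraction(24),
    3: Fraction(25),
    4: Fraction(30000, 1001),
    5: Fraction(30),
    6: Fraction(50),
    7: Fraction(60000, 1001),
    8: Fraction(60),
}


def frame_rate_from_code(code: int) -> Fraction:
    """Return the exact frame rate for a 4-bit frame rate code."""
    if code not in FRAME_RATES:
        raise ValueError(f"frame rate code {code} is forbidden or reserved")
    return FRAME_RATES[code]


print(float(frame_rate_from_code(4)))
```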
The bit rate is given in units of 400 bits per second. The value $3FFFF is supposed to mean a variable bit rate. In tests, TMPGEnc and CCE Basic each wrote the maximum bit rate there.
The marker bit is intended to detect errors. It must always have the value 1.
The VBV is the buffer memory required to decode the images. It is specified in units of 16 kbit (16 x 1024 bits, i.e. 2 KB).
The constrained parameter flag is always set to 0 here. The TMPG Encoder 2.5* uses the 10 bits of the VBV together with the bit of the constrained parameter flag for the VBV specification, so that only half of the VBV is displayed here.
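The unit conversions for the bit rate and VBV fields can be written out as a short sketch. It assumes the VBV unit of 16 x 1024 bits defined in ISO/IEC 13818-2; the function names and sample field values are chosen for illustration only.

```python
def bit_rate_bps(bit_rate_value: int) -> int:
    """Convert the 18-bit bit rate field (units of 400 bit/s) to bit/s."""
    return bit_rate_value * 400


def vbv_buffer_bits(vbv_value: int) -> int:
    """Convert the 10-bit VBV field to bits, assuming units of 16 * 1024 bits."""
    return vbv_value * 16 * 1024


print(bit_rate_bps(15000))          # bit/s for a field value of 15000
print(vbv_buffer_bits(112) // 8)    # buffer size in bytes for a field value of 112
```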
The load bit for the intra or non-intra matrix indicates whether the standard matrix is used or whether the corresponding matrix is stored at this point and must be loaded. The matrix is stored in zig-zag scan order.
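To turn the 64 stored values back into an 8x8 matrix, the zig-zag order has to be undone. The following sketch generates the conventional zig-zag order and applies it; the function names are invented here, and the input is simply the numbers 0 to 63 rather than a real matrix.

```python
def zigzag_order(n: int = 8):
    """Return the (row, column) positions of an n x n block in zig-zag order."""
    cells = [(r, c) for r in range(n) for c in range(n)]
    # Walk the anti-diagonals; odd diagonals run downwards, even ones upwards.
    return sorted(cells, key=lambda rc: (rc[0] + rc[1],
                                         rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]))


def dezigzag(values):
    """Place 64 values stored in zig-zag order into an 8x8 matrix (list of rows)."""
    if len(values) != 64:
        raise ValueError("expected 64 values")
    matrix = [[0] * 8 for _ in range(8)]
    for value, (row, col) in zip(values, zigzag_order()):
        matrix[row][col] = value
    return matrix


for row in dezigzag(list(range(64))):
    print(row)
```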
Extensions
At least one extension follows the sequence header; this is what identifies the stream as MPEG-2. The ID of the extension start code is $B5. This is followed by the ID of the specific extension. These extensions should be considered for the sequence header:
- Sequence Extension
- Sequence Display Extension
- Quant Matrix Extension
- Copyright Extension
- Sequence Scalable Extension
Sequence Extension
Extensions have the ID $B5. This is followed by the ID of the extension, here $1.
Construction
- 4 bytes: EXTENSION_START_CODE = $000001B5
- 4 bits: Start Code Identifier - Sequence Extension = $1
- 4 bits: Profile
- 4 bits: Level
- 1 bit: Progressive sequence
- 2 bits: Chroma format
- 2 bits: Width extension
- 2 bits: Height extension
- 12 bits: Bit rate extension
- 1 bit: Marker
- 8 bits: VBV buffer extension
- 1 bit: Low delay
- 2 bits: Frame rate extension numerator
- 5 bits: Frame rate extension denominator
As a scheme:
Byte | Contents (bit 7 first) |
0-3 | EXTENSION_START_CODE $000001B5 |
4 | Start Code Identifier = $1 (4 bits) | Profile (4 bits) |
5 | Level (4 bits) | Progressive sequence (1 bit) | Chroma format (2 bits) | Width ext. (1 bit) |
6 | Width ext. (1 bit) | Height ext. (2 bits) | Bit rate extension (5 bits) |
7 | Bit rate extension (7 bits) | Marker (1 bit) |
8 | VBV buffer extension (8 bits) |
9 | Low delay (1 bit) | Frame rate ext. N (2 bits) | Frame rate ext. D (5 bits) |
Note: One source uses four bits for the profile. Andrew Duncan calls the first of these bits 'Profile/Level Escape', with the value 0 being reserved.
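The layout above can be read with the same shift-and-mask approach as the sequence header. This is a sketch only: the function name and dictionary keys are assumptions, and the profile is read as four bits, as in the construction list.

```python
def parse_sequence_extension(buf: bytes) -> dict:
    """Decode a sequence extension (10 bytes including the start code)."""
    if len(buf) < 10 or buf[:4] != b"\x00\x00\x01\xb5":
        raise ValueError("not an extension start code")
    bits = int.from_bytes(buf[4:10], "big")  # the 48 bits after the start code

    def take(shift: int, width: int) -> int:
        return (bits >> shift) & ((1 << width) - 1)

    if take(44, 4) != 0x1:
        raise ValueError("not a sequence extension")
    return {
        "profile": take(40, 4),
        "level": take(36, 4),
        "progressive_sequence": take(35, 1),
        "chroma_format": take(33, 2),
        "width_extension": take(31, 2),
        "height_extension": take(29, 2),
        "bit_rate_extension": take(17, 12),
        "marker": take(16, 1),
        "vbv_buffer_extension": take(8, 8),
        "low_delay": take(7, 1),
        "frame_rate_extension_n": take(5, 2),
        "frame_rate_extension_d": take(0, 5),
    }


# Hypothetical sample: profile 4 (Main), level 8 (Main), interlaced, 4:2:0.
sample = b"\x00\x00\x01\xb5\x14\x82\x00\x01\x00\x00"
ext = parse_sequence_extension(sample)
print(ext)
```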
Explanations
The profiles are:
Value | Profile | Typical for |
1 | High Profile | production equipment requiring 4:2:2 |
2 | Spatially Scalable Profile | simulcasting |
3 | SNR Scalable Profile | simulcasting |
4 | Main Profile | 95% of TVs, VCRs, cable applications |
5 | Simple Profile | low-cost memory, e.g. no B pictures |
The profiles Multiview and 4:2:2 are sometimes also specified, but without coding.
The levels are:
Value | Level | Typical for |
4 | High Level | HDTV production rates: e.g. 1920 x 1080 x 30 Hz |
6 | High 1440 Level | HDTV consumer rates: e.g. 1440 x 960 x 30 Hz |
8 | Main Level | CCIR 601 rates: e.g. 720 x 480 x 30 Hz |
10 | Low Level | SIF video rate: e.g. 352 x 240 x 30 Hz |
Allowed combinations of level and profile:
Level | Simple | Main | SNR scalable | Spatially scalable | High | Multiview | 4:2:2 |
High level | | X | | | X | | |
High-1440 level | | X | | X | X | | |
Main level | X | X | X | | X | X | X |
Low level | | X | X | | | | |
The chroma formats:
Value | Format | Description |
1 | 4:2:0 | half resolution in both dimensions (most common format) |
2 | 4:2:2 | Half resolution in horizontal direction (High Profile only) |
3 | 4:4:4 | full resolution (not allowed in any currently defined profile) |
Low Delay means that no B-frames are used: "Frame reordering delay is not present in the VBV description, skipped pictures (VBV underflow) may occur." The Frame Rate Extension Numerator and Denominator scale the base frame rate from the sequence header: frame rate = base frame rate * (numerator + 1) / (denominator + 1).
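How the extension fields combine with the sequence header values can be sketched as follows. The formulas follow ISO/IEC 13818-2 as far as I know them (extension bits are the most significant bits of the combined value; the frame rate is scaled by (n + 1) / (d + 1)); the function names are invented for illustration.

```python
from fractions import Fraction


def effective_frame_rate(base: Fraction, ext_n: int, ext_d: int) -> Fraction:
    """Scale the base frame rate by (ext_n + 1) / (ext_d + 1)."""
    return base * (ext_n + 1) / (ext_d + 1)


def combine(extension: int, base: int, base_bits: int) -> int:
    """Put the extension bits in front of a base field that is base_bits wide."""
    return (extension << base_bits) | base


# Width/height use a 12-bit base, bit rate an 18-bit base, VBV a 10-bit base.
print(combine(0, 1920, 12))                         # width with extension 0
print(combine(1, 0, 12))                            # smallest width needing the extension
print(effective_frame_rate(Fraction(60), 0, 1))     # base rate 60 with n = 0, d = 1
```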
Sequence Display Extension
Extensions have the ID $B5. The ID of this extension is $2.
Construction
- 4 bytes: EXTENSION_START_CODE = $000001B5
- 4 bits: Start Code Identifier - Sequence Display Extension = $2
- 3 bits: Video Format
- 1 bit: Color; if set, the following three bytes are present:
- 1 byte: Color Primaries
- 1 byte: Transfer Characteristics
- 1 byte: Matrix Coefficients
- 14 bits: Display Width
- 1 bit: Marker
- 14 bits: Display Height
As a scheme:
Byte | Contents (bit 7 first) |
0-3 | EXTENSION_START_CODE $000001B5 |
4 | Start Code Identifier = $2 (4 bits) | Video Format (3 bits) | Color (1 bit) |
5 | Color Primaries (only if Color is set) |
6 | Transfer Characteristics (only if Color is set) |
7 | Matrix Coefficients (only if Color is set) |
5 / 8 | Display Width (8 bits) |
6 / 9 | Display Width (6 bits) | Marker (1 bit) | Display Height (1 bit) |
7 / 10 | Display Height (8 bits) |
8 / 11 | Display Height (5 bits) | ... |
Where two byte numbers are given, the first applies without and the second with the color description.
Explanations
Video formats are:
Value | Format |
0 | Component |
1 | PAL |
2 | NTSC |
3 | SECAM |
4 | MAC |
5 | Unspecified |
6-7 | Reserved |
Color Primaries and Transfer Characteristics are:
Value | Standard |
0 | Forbidden |
1 | ITU-R Rec. 709 (1990) |
2 | Unspecified |
3 | Reserved |
4 | ITU-R Rec. 624-4 System M |
5 | ITU-R Rec. 624-4 System B, G |
6 | SMPTE 170M |
7 | SMPTE 240M (1987) |
... | Reserved |
Matrix Coefficients:
Value | Profile |
0 | Forbidden |
1 | ITU-R Rec. 709 (1990) |
2 | Unspecified |
3 | Reserved |
4 | FCC |
5 | ITU-R Rec. 624-4 System B, G |
6 | SMPTE 170M |
7 | SMPTE 240M (1987) |
... | Reserved |
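Because the color description is optional, the Sequence Display Extension is easier to read with a small bit reader than with fixed shifts. The following is a minimal sketch; the `BitReader` class, the function name and the sample payload are all invented for illustration, and the payload is assumed to start right after the 4-byte EXTENSION_START_CODE.

```python
class BitReader:
    """Read big-endian bit fields from a byte string."""

    def __init__(self, data: bytes):
        self.value = int.from_bytes(data, "big")
        self.remaining = len(data) * 8

    def read(self, width: int) -> int:
        if width > self.remaining:
            raise EOFError("not enough bits")
        self.remaining -= width
        return (self.value >> self.remaining) & ((1 << width) - 1)


def parse_sequence_display_extension(payload: bytes) -> dict:
    """Decode the bytes that follow the EXTENSION_START_CODE."""
    reader = BitReader(payload)
    if reader.read(4) != 0x2:
        raise ValueError("not a sequence display extension")
    out = {"video_format": reader.read(3)}
    if reader.read(1):  # color flag: three description bytes follow
        out["color_primaries"] = reader.read(8)
        out["transfer_characteristics"] = reader.read(8)
        out["matrix_coefficients"] = reader.read(8)
    out["display_width"] = reader.read(14)
    if reader.read(1) != 1:
        raise ValueError("marker bit not set")
    out["display_height"] = reader.read(14)
    return out


# Hypothetical payload: video format 1 (PAL), no color description, 720 x 576.
sde = parse_sequence_display_extension(b"\x22\x0b\x42\x12\x00")
print(sde)
```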
Quant Matrix Extension
Extensions have the ID $B5. The ID of this extension is $3.
Construction
As a scheme (with all four matrices present):
Byte | Contents (bit 7 first) |
0-3 | EXTENSION_START_CODE $000001B5 |
4 | Start Code Identifier = $3 (4 bits) | Load? (1 bit) | Intra Quantizer Matrix (3 bits) |
5-67 | Intra Quantizer Matrix |
68 | Intra Quantizer Matrix (5 bits) | Load? (1 bit) | Non Intra Quantizer Matrix (2 bits) |
69-131 | Non Intra Quantizer Matrix |
132 | Non Intra Quantizer Matrix (6 bits) | Load? (1 bit) | Chroma Intra Quantizer Matrix (1 bit) |
133-195 | Chroma Intra Quantizer Matrix |
196 | Chroma Intra Quantizer Matrix (7 bits) | Load? (1 bit) |
197-260 | Chroma Non Intra Quantizer Matrix |
Copyright Extension
Extensions have the ID $B5. This extension has the ID $4.
Construction
As a scheme:
Byte | Contents (bit 7 first) |
0-3 | EXTENSION_START_CODE $000001B5 |
4 | Start Code Identifier = $4 (4 bits) | Copyright flag (1 bit) | Copyright identifier (3 bits) |
5 | Copyright identifier (5 bits) | Copy? (1 bit) | Reserved (2 bits) |
6 | Reserved (5 bits) | Marker (1 bit) | Copyright number 1 (2 bits) |
7-8 | Copyright number 1 (16 bits) |
9 | Copyright number 1 (2 bits) | Marker (1 bit) | Copyright number 2 (5 bits) |
10-11 | Copyright number 2 (16 bits) |
12 | Copyright number 2 (1 bit) | Marker (1 bit) | Copyright number 3 (6 bits) |
13-14 | Copyright number 3 (16 bits) |
Sequence Scalable Extension
Extensions have the ID $B5. This extension has the ID $5.
Construction
As a scheme:
Byte | Contents (bit 7 first) |
0-3 | EXTENSION_START_CODE $000001B5 |
4 | Start Code Identifier = $5 (4 bits) | Scalable Mode (2 bits) | Layer ID (2 bits) |
5 | Layer ID (2 bits) | ... |
If the Scalable Mode is "spatial scalability":
Byte | Contents (bit 7 first) |
5 | Layer ID (2 bits) | Lower Layer Prediction Horizontal Size (6 bits) |
6 | Lower Layer Prediction Horizontal Size (8 bits) |
7 | Marker (1 bit) | Lower Layer Prediction Vertical Size (7 bits) |
8 | Lower Layer Prediction Vertical Size (7 bits) | Horizontal Subsampling Factor M (1 bit) |
9 | Horizontal Subsampling Factor M (4 bits) | Horizontal Subsampling Factor N (4 bits) |
10 | Horizontal Subsampling Factor N (1 bit) | Vertical Subsampling Factor M (5 bits) | Vertical Subsampling Factor N (2 bits) |
11 | Vertical Subsampling Factor N (3 bits) | ... |
If the Scalable Mode is "temporal scalability":
Byte | Contents (bit 7 first) |
5 | Layer ID (2 bits) | Picture Mux Enable (1 bit) | Mux to Progressive Sequence (1 bit) | Picture Mux Order (3 bits) | Picture Mux Factor (1 bit) |
6 | Picture Mux Factor (2 bits) | ... |
Group of Pictures
The ID $B8 follows the PACK_START_CODE_PREFIX for the GROUP_START_CODE. Following the start code, the header is 4 bytes long.
Construction
As a scheme:
Byte | Contents (bit 7 first) |
0-3 | GROUP_START_CODE $000001B8 |
4 | Drop frame (1 bit) | Time code hours (5 bits) | Time code minutes (2 bits) |
5 | Time code minutes (4 bits) | Marker (1 bit) | Time code seconds (3 bits) |
6 | Time code seconds (3 bits) | Time code pictures (5 bits) |
7 | Time code pictures (1 bit) | Closed GOP (1 bit) | Broken link (1 bit) | ... |
Explanations
The drop frame flag is used with NTSC to bring the frame rate from 30 down to 29.97 fps. The marker must again be set. A GOP is closed if the pictures in the GOP refer only to pictures in the same GOP. Even if the stream is not encoded with closed GOPs, the first and the penultimate (CCE Basic) or last (TMPG Encoder) GOP are closed. The broken link flag is set when frames refer to frames that no longer exist after a cut.
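The GOP header fits into 32 bits after the start code, so it can be decoded with a few shifts. The sketch below follows the byte scheme of this section; the function name and dictionary keys are assumptions made for illustration.

```python
def parse_gop_header(buf: bytes) -> dict:
    """Decode the 4 bytes that follow the GROUP_START_CODE."""
    if len(buf) < 8 or buf[:4] != b"\x00\x00\x01\xb8":
        raise ValueError("not a GOP header")
    bits = int.from_bytes(buf[4:8], "big")

    def take(shift: int, width: int = 1) -> int:
        return (bits >> shift) & ((1 << width) - 1)

    if take(19) != 1:
        raise ValueError("marker bit not set")
    return {
        "drop_frame": take(31),
        "hours": take(26, 5),
        "minutes": take(20, 6),
        "seconds": take(13, 6),
        "pictures": take(7, 6),
        "closed_gop": take(6),
        "broken_link": take(5),
    }


# Hypothetical sample: time code 00:00:00:00, closed GOP, no broken link.
gop = parse_gop_header(b"\x00\x00\x01\xb8\x00\x08\x00\x40")
print("{hours:02d}:{minutes:02d}:{seconds:02d}:{pictures:02d}".format(**gop), gop["closed_gop"])
```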
Picture Header
The PACK_START_CODE_PREFIX is followed by the PICTURE_START_CODE with the ID $00.
Construction
As a scheme (layout of a B-frame header):
Byte | Contents (bit 7 first) |
0-3 | PICTURE_START_CODE $00000100 |
4 | Temporal Reference (8 bits) |
5 | Temporal Reference (2 bits) | Coding Type (3 bits) | VBV Delay (3 bits) |
6 | VBV Delay (8 bits) |
7 | VBV Delay (5 bits) | Full pel forward vector (1 bit) | Forward f code (2 bits) |
8 | Forward f code (1 bit) | Full pel backward vector (1 bit) | Backward f code (3 bits) | Extra Bit set (1 bit) | Extra Information (2 bits) |
9 | Extra Information (6 bits) | ... | Extra Bit cleared |
The forward vector fields are present only for P- and B-frames, the backward vector fields only for B-frames.
Explanations
The Temporal Reference is the order in which the pictures are to be displayed. The first picture of the group has the value 0.
The coding types are:
Value | Coding Type |
1 | I-frame |
2 | P-frame |
3 | B-frame |
4 | D-frame |
For constant bit rates, the VBV Delay is given in cycles of a 90 kHz clock. For variable bit rates the delay is set to $FFFF.
I have no further information on the forward and backward vectors or on the extra information. The Extra Information consists of nine bits. The first bit indicates whether extra information follows. The following 8 bits are the extra information.
Extensions
In MPEG-2, various extensions follow the picture header. The ID of the extension start code is $B5. This is followed by the ID of the specific extension. These extensions are likely to be relevant for the picture header:
- Picture Display Extension
- Picture Coding Extension
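The first fields of the picture header described above (temporal reference, coding type and VBV delay) can be decoded as in this sketch. The function name, the `CODING_TYPES` table and the sample bytes are assumptions for illustration; the vector fields and extra information are not parsed.

```python
CODING_TYPES = {1: "I", 2: "P", 3: "B", 4: "D"}


def parse_picture_header(buf: bytes) -> dict:
    """Decode temporal reference, coding type and VBV delay."""
    if len(buf) < 8 or buf[:4] != b"\x00\x00\x01\x00":
        raise ValueError("not a picture header")
    bits = int.from_bytes(buf[4:8], "big")
    vbv_delay = (bits >> 3) & 0xFFFF
    return {
        "temporal_reference": (bits >> 22) & 0x3FF,
        "coding_type": CODING_TYPES.get((bits >> 19) & 0x7, "reserved"),
        "vbv_delay": vbv_delay,
        # 90 kHz clock ticks converted to seconds; $FFFF carries no delay value.
        "vbv_delay_seconds": None if vbv_delay == 0xFFFF else vbv_delay / 90000,
    }


# Hypothetical sample: I-frame, temporal reference 2, VBV delay $FFFF.
pic = parse_picture_header(b"\x00\x00\x01\x00\x00\x8f\xff\xf8")
print(pic)
```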
Picture Display Extension
The extension start code with the ID $B5 is followed by four bits with the extension ID $7. This extension is not necessary for the normal decoding process. However, it allows the picture to be positioned on the display. One application is pan and scan. This extension must be preceded by a Sequence Display Extension.
Construction
As a scheme:
Byte | Contents (bit 7 first) |
0-3 | EXTENSION_START_CODE $000001B5 |
4 | Start Code Identifier = $7 (4 bits) | frame_centre_horizontal_offset (4 bits) |
5 | frame_centre_horizontal_offset (8 bits) |
6 | frame_centre_horizontal_offset (4 bits) | Marker (1 bit) | frame_centre_vertical_offset (3 bits) |
7 | frame_centre_vertical_offset (8 bits) |
8 | frame_centre_vertical_offset (5 bits) | Marker (1 bit) | ... |
Explanations
frame_centre_horizontal_offset: the horizontal offset in units of 1/16 sample. A positive value means the center of the picture is to the right of the center of the display.
frame_centre_vertical_offset: the vertical offset in units of 1/16 sample. A positive value means the center of the picture is below the center of the display.
The size of the display is defined in the Sequence Display Extension, the coordinates of the decoded picture in the Picture Display Extension. Without offsets, the center of the decoded picture is the center of the display.
A picture can refer to one, two or three decoded fields, i.e. it can contain up to three offsets. The number of offsets results from the flags progressive_sequence, repeat_first_field and top_field_first and from picture_structure. The flag progressive_sequence is contained in the Sequence Extension; repeat_first_field, top_field_first and picture_structure are contained in the Picture Coding Extension.
If the offsets are missing, the previously used values are to be used. This also applies if not all offsets are given. After a sequence header, zero values are used again until new values are given.
The figure below is taken from a draft of ISO 13818-2.
if ( progressive_sequence == 1 ) {
    if ( repeat_first_field == 1 ) {
        if ( top_field_first == 1 )
            number_of_frame_centre_offsets = 3
        else
            number_of_frame_centre_offsets = 2
    } else {
        number_of_frame_centre_offsets = 1
    }
} else {
    if ( picture_structure == "field" ) {
        number_of_frame_centre_offsets = 1
    } else {
        if ( repeat_first_field == 1 )
            number_of_frame_centre_offsets = 3
        else
            number_of_frame_centre_offsets = 2
    }
}
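The decision tree above translates directly into a small Python function. This is only a sketch: the function signature and the string values used for `picture_structure` ("frame", "top_field", "bottom_field") are assumptions chosen for illustration.

```python
def number_of_frame_centre_offsets(progressive_sequence: bool,
                                   repeat_first_field: bool,
                                   top_field_first: bool,
                                   picture_structure: str) -> int:
    """Return how many offset pairs a Picture Display Extension carries."""
    if progressive_sequence:
        if repeat_first_field:
            return 3 if top_field_first else 2
        return 1
    if picture_structure in ("top_field", "bottom_field"):
        return 1
    return 3 if repeat_first_field else 2


# An interlaced frame picture without a repeated field carries two offsets.
print(number_of_frame_centre_offsets(False, False, True, "frame"))
```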
Picture Coding Extension
The extension start code with the ID $B5 is followed by four bits with the extension ID $8.
Construction
As a scheme:
Byte | Contents (bit 7 first) |
0-3 | EXTENSION_START_CODE $000001B5 |
4 | Start Code Identifier = $8 (4 bits) | f_code[0][0] (4 bits) |
5 | f_code[0][1] (4 bits) | f_code[1][0] (4 bits) |
6 | f_code[1][1] (4 bits) | Intra DC Precision (2 bits) | Picture Structure (2 bits) |
7 | Top Field First | Frame Pred Frame DCT | Concealment Motion Vectors | Q Scale Type | Intra VLC Format | Alternate Scan | Repeat First Field | Chroma 420 Type | (1 bit each)
8 | Progressive Frame (1 bit) | Composite Display Flag (1 bit) | V Axis (1 bit) | Field Sequence (3 bits) | Sub Carrier (1 bit) | Burst Amplitude (1 bit) |
9 | Burst Amplitude (6 bits) | Sub Carrier Phase (2 bits) |
10 | Sub Carrier Phase (6 bits) | ... |
Explanations
f_code[s][t]: The values are used to decode the motion vectors. The value 0 is forbidden; 1 to 9 and 15 are allowed.
Intra DC Precision: Precision with which the DC coefficients of the discrete cosine transform are coded in intra blocks. The Intra DC multiplier is derived from it.
Value | Precision | Multiplier |
00 | 8 bits | 8 |
01 | 9 bits | 4 |
10 | 10 bits | 2 |
11 | 11 bits | 1 |
Picture Structure: If a frame was encoded as fields, there must always be a pair with the same picture coding type.
Value | Picture Structure |
00 | Reserved |
01 | Top Field |
10 | Bottom Field |
11 | Frame Picture |
Top Field First: This flag depends on Picture Structure, Progressive Sequence and Repeat First Field. If the flag Progressive Sequence is not set, it is used during decoding to reconstruct the frame and indicates whether the top field is output first. If the flag Progressive Sequence is set, it indicates, together with the flag Repeat First Field, how often a frame is output during decoding: with Repeat First Field = 0 and Top Field First = 0 one progressive frame is output, with Repeat First Field = 1 and Top Field First = 0 two identical progressive frames, and with Repeat First Field = 1 and Top Field First = 1 three identical progressive frames.
Frame Pred Frame DCT: If this flag is set, only frame DCT and frame prediction are used. For field pictures it is 0, for progressive frames 1.
Concealment Motion Vectors: This flag indicates whether the intra macroblocks were encoded with motion vectors.
Q Scale Type: This flag is used for the quantiser scale factor.
Intra VLC Format: This flag is used for determining the DCT coefficients.
Alternate Scan: This flag is used for determining the DCT coefficients.
Repeat First Field: If the flags progressive_sequence (see Sequence Extension header) and progressive_frame are not set, the flag repeat_first_field is not set either. During decoding the frame is then composed of two fields. If the flag progressive_sequence is not set but the flag progressive_frame is, the frame is composed of two fields. The first field (top field or bottom field) is identified by the flag top_field_first and is followed by the other. If the flag repeat_first_field is then set, the frame is composed of three fields: the first field is determined by the flag top_field_first and is followed by the other, after which the first field is repeated. If the flag progressive_sequence is set and the flag repeat_first_field is not, the frame is decoded as a single frame.
Chroma 420 Type: Set the same way as Progressive Frame. Exists for historical reasons.
Progressive Frame: If the flag is not set, the two fields of a frame are two interlaced fields, and Repeat First Field must be 0 (two-field duration). If the flag is set, the two fields are merged into one; Picture Structure must be set to "Frame" and Frame Pred Frame DCT to 1.
Composite Display Flag: This flag is set if the pictures from which the MPEG stream was encoded were encoded as (analog) composite video. The information refers to the picture following the extension. In the case of a frame picture, the information refers to the first field. The information is adapted for the second field, since it cannot be stored for it. The following elements are not used for decoding. The Repeat First Field flag and the Composite Display Flag must not be set at the same time.
V Axis: 1-bit integer used only when the bitstream represents a signal that had previously been encoded according to PAL systems. v_axis is set to 1 on a positive sign, v_axis is set to 0 otherwise.
Field Sequence: Gives the number of the field within the eight-field sequence of a PAL system or the four-field sequence of an NTSC system, according to the following table:
Field Sequence | Frame | Field |
000 | 1 | 1 |
001 | 1 | 2 |
010 | 2 | 3 |
011 | 2 | 4 |
100 | 3 | 5 |
101 | 3 | 6 |
110 | 4 | 7 |
111 | 4 | 8 |
Sub Carrier: If the flag is not set, the sub-carrier/line frequency relationship is correct.
Burst Amplitude: Gives the burst amplitude for PAL and NTSC.
Sub Carrier Phase: Gives the phase of the reference sub-carrier at field synchronisation.
Sub Carrier Phase | Phase |
0 | ([360° / 256] * 0) |
1 | ([360° / 256] * 1) |
... | ... |
255 | ([360° / 256] * 255) |
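The flags of the Picture Coding Extension can be pulled out of bytes 4 to 8 with shifts, as in the following sketch. The function name, the dictionary keys and the sample bytes are assumptions for illustration; the composite display fields (V Axis onwards) are not parsed.

```python
PICTURE_STRUCTURES = {1: "top field", 2: "bottom field", 3: "frame picture"}


def parse_picture_coding_extension(buf: bytes) -> dict:
    """Decode the fields of bytes 4..8 of a Picture Coding Extension."""
    if len(buf) < 9 or buf[:4] != b"\x00\x00\x01\xb5":
        raise ValueError("not an extension start code")
    bits = int.from_bytes(buf[4:9], "big")  # 40 bits

    def take(shift: int, width: int = 1) -> int:
        return (bits >> shift) & ((1 << width) - 1)

    if take(36, 4) != 0x8:
        raise ValueError("not a picture coding extension")
    return {
        "f_code": [[take(32, 4), take(28, 4)], [take(24, 4), take(20, 4)]],
        "intra_dc_precision_bits": 8 + take(18, 2),
        "picture_structure": PICTURE_STRUCTURES.get(take(16, 2), "reserved"),
        "top_field_first": take(15),
        "frame_pred_frame_dct": take(14),
        "concealment_motion_vectors": take(13),
        "q_scale_type": take(12),
        "intra_vlc_format": take(11),
        "alternate_scan": take(10),
        "repeat_first_field": take(9),
        "chroma_420_type": take(8),
        "progressive_frame": take(7),
        "composite_display_flag": take(6),
    }


# Hypothetical sample: progressive frame picture, all f_codes 15, 8-bit DC precision.
pce = parse_picture_coding_extension(b"\x00\x00\x01\xb5\x8f\xff\xf3\x41\x80")
print(pce)
```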