MPEG format


In most video sequences, much of the scene is static or changes very little from one image to the next; this is called temporal redundancy.

When only the lips of the actor move, almost the only pixels that change from one image to the next are those of the mouth, so it is sufficient to describe just the change between images. This is the main difference between MPEG (Moving Picture Experts Group) and M-JPEG, which compresses each image independently. However, this method has much less impact on an action scene, where most of the image changes.
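The idea of describing only the change between two images can be illustrated with a minimal sketch. It works pixel by pixel on tiny greyscale frames stored as nested lists, which is an assumption made for readability; actual MPEG coding works on blocks, as described further on. The function names are hypothetical.

```python
def changed_pixels(prev, curr):
    """Return (row, col, new_value) for every pixel that differs between two frames."""
    return [
        (r, c, curr[r][c])
        for r in range(len(curr))
        for c in range(len(curr[r]))
        if curr[r][c] != prev[r][c]
    ]


def apply_changes(prev, changes):
    """Rebuild the current frame from the previous frame plus the change list."""
    frame = [row[:] for row in prev]
    for r, c, value in changes:
        frame[r][c] = value
    return frame


# Two tiny 4x4 greyscale frames: only two "mouth" pixels differ.
frame_a = [
    [10, 10, 10, 10],
    [10, 50, 50, 10],
    [10, 80, 80, 10],
    [10, 10, 10, 10],
]
frame_b = [row[:] for row in frame_a]
frame_b[2][1] = 90
frame_b[2][2] = 95

delta = changed_pixels(frame_a, frame_b)
print(len(delta), "of 16 pixels changed")
```

Storing `delta` instead of `frame_b` is only worthwhile when few pixels change, which is why the gain shrinks on an action scene.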

The MPEG Group was established in 1988 with the aim of developing international standards for compression, decompression, processing and coding of animated images and audio data.

There are several MPEG standards:

  • MPEG-1, published in 1993, is a standard for the compression of video data and the associated audio channels (up to 2 channels for stereo listening). It allows the storage of videos at a rate of 1.5 Mbps in a quality close to that of VHS cassettes on a CD medium called VCD (Video CD).
  • MPEG-2, a standard originally dedicated to digital television (including HDTV), offering high quality at a rate that may go up to 40 Mbps, and 5 surround audio channels. MPEG-2 also allows identification and copy protection. This is the format used for DVD videos.
  • MPEG-4, a standard intended to allow multimedia coding of data in the form of digital objects, in order to achieve greater interactivity, which makes it particularly suitable for the Web and for mobile devices.
  • MPEG-7, a standard aimed at providing a standard representation of audio and visual data in order to allow the search for information in such data flows. This standard is thus also known as Multimedia Content Description Interface.
  • MPEG-21, a standard still under development, whose goal is to provide a framework for all digital actors (producers, consumers, …) in order to standardize the management of content, as well as of access rights, copyrights, …


The MPEG-1 standard represents each image as a set of 16x16 blocks. It supports the following resolutions:

  • 352x240 at 30 images per second in NTSC
  • 352x288 at 25 images per second in PAL/SECAM
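As a quick check of what the 16x16 block structure means for these two resolutions, the following sketch counts the blocks needed to cover each image. The function name is hypothetical; the block size and resolutions come from the text.

```python
MACROBLOCK = 16  # block edge in pixels, as stated in the text


def macroblock_grid(width, height, size=MACROBLOCK):
    """Return (columns, rows, total) of size x size blocks covering the image."""
    cols = -(-width // size)   # ceiling division
    rows = -(-height // size)
    return cols, rows, cols * rows


for name, (w, h) in {"NTSC": (352, 240), "PAL/SECAM": (352, 288)}.items():
    cols, rows, total = macroblock_grid(w, h)
    print(f"{name}: {cols} x {rows} = {total} blocks")
```

Both resolutions are exact multiples of 16, giving 330 blocks per image in NTSC and 396 in PAL/SECAM.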

MPEG-1 makes it possible to achieve rates of around 1.2 Mbps (readable on a CD-ROM).
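To get a feel for how much compression this rate implies, the sketch below compares it with the uncompressed rate of the two resolutions listed above. The figure of 12 bits per pixel (8-bit samples with colour stored at a quarter of the luminance resolution) is an assumption not stated in the text, and 1 Mbps is taken as 1,000,000 bits per second.

```python
def raw_bitrate(width, height, fps, bits_per_pixel=12):
    """Uncompressed bit rate in bits per second (bits_per_pixel is an assumption)."""
    return width * height * fps * bits_per_pixel


TARGET_BPS = 1_200_000  # the "around 1.2 Mbps" quoted in the text

formats = {"NTSC": (352, 240, 30), "PAL/SECAM": (352, 288, 25)}
for name, (w, h, fps) in formats.items():
    raw = raw_bitrate(w, h, fps)
    print(f"{name}: raw {raw / 1e6:.1f} Mbps, ratio about {raw / TARGET_BPS:.0f}:1")
```

Under these assumptions both formats need roughly 30 Mbps uncompressed, so the encoder has to shrink the data by a factor of about 25.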

MPEG-1 allows videos to be encoded using several techniques:

  • Intra-coded frames (I frames, corresponding to internal coding): the images are coded separately without referring to the preceding images
  • Predictive coded frames (P frames, or predictive coding): the images are described by their differences relative to the preceding images
  • Bi-directionally predictive coded frames (B frames): the images are described by their differences relative to the preceding image and the following image
  • DC coded frames (D frames): the images are coded using block averages only

I frames

These images are coded using JPEG-like coding only, without reference to the images which surround them. Such images are necessary in an MPEG video because they ensure image cohesion (since the other images are described relative to their surrounding images); they are particularly useful for video streams which can be tuned in to at any time (television), and are essential for recovering from reception errors. There are thus one or two of these per second in an MPEG video.
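The role of I frames as entry points can be sketched as follows. The stream is simplified to a string of frame types in display order containing only I and P frames, which is an assumption for illustration; the function name is hypothetical.

```python
def decode_start(frame_types, target):
    """Index of the closest I frame at or before `target`.

    Decoding must start there, because every later frame up to `target`
    is described relative to the images before it.
    """
    for i in range(target, -1, -1):
        if frame_types[i] == "I":
            return i
    raise ValueError("no I frame at or before the target frame")


stream = "IPPPIPPP"  # hypothetical stream: one I frame every 4 frames
start = decode_start(stream, 6)
print(f"to show frame 6, decode frames {start}..6")
```

The further apart the I frames are, the more frames have to be decoded before the requested one can be displayed, which is why a viewer tuning in to a broadcast depends on them arriving regularly.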

P frames

These images are defined by their difference relative to the preceding image. The encoder looks for the differences between the image and the preceding one and defines blocks, called macroblocks (16x16 pixels), which will be superimposed on the preceding image.

The algorithm compares the two images block by block and, above a certain difference threshold, considers the area of the preceding image to be different from that of the current image and applies JPEG-like compression to it.
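The block-by-block comparison can be sketched as below. The text does not say how the difference is measured or what the threshold is, so the sum of absolute differences and the threshold values used here are assumptions, as are the function names. Frames are nested lists whose dimensions are assumed to be multiples of the block size.

```python
def block_sad(prev, curr, top, left, size):
    """Sum of absolute differences over one size x size block."""
    return sum(
        abs(curr[r][c] - prev[r][c])
        for r in range(top, top + size)
        for c in range(left, left + size)
    )


def blocks_to_recode(prev, curr, size=16, threshold=0):
    """List (top, left) of blocks whose difference exceeds the threshold."""
    height, width = len(curr), len(curr[0])
    return [
        (top, left)
        for top in range(0, height, size)
        for left in range(0, width, size)
        if block_sad(prev, curr, top, left, size) > threshold
    ]


# A 32x32 frame (four 16x16 blocks) with one real change and one noisy pixel.
prev = [[100] * 32 for _ in range(32)]
curr = [row[:] for row in prev]
for r in range(2, 6):
    for c in range(18, 22):
        curr[r][c] = 200          # large change in the top-right block
curr[20][20] = 101                # tiny change in the bottom-right block

print(blocks_to_recode(prev, curr, threshold=64))
```

With a threshold of 64 only the top-right block is flagged; with a threshold of 0 the noisy bottom-right block would be flagged as well.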

It is the search for macroblocks that determines the speed of the encoding: the longer the algorithm searches for “good” blocks, the more time it takes.
Unlike I frames (directly compressed), P frames require the preceding image to always be in memory.
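The cost of the search can be made concrete with a small sketch. It uses an exhaustive search over a square window and a sum of absolute differences as the matching cost; both are assumptions, since the text does not say which search strategy or cost an encoder uses. All names are hypothetical, and the block is 4x4 rather than 16x16 to keep the example small.

```python
def sad(ref, cur, ref_top, ref_left, cur_top, cur_left, size):
    """Sum of absolute differences between a block of cur and a block of ref."""
    return sum(
        abs(cur[cur_top + r][cur_left + c] - ref[ref_top + r][ref_left + c])
        for r in range(size)
        for c in range(size)
    )


def best_match(ref, cur, top, left, size, radius):
    """Exhaustive search: return (cost, dy, dx, candidates_tested)."""
    height, width = len(ref), len(ref[0])
    best = None
    tested = 0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            rt, rl = top + dy, left + dx
            if rt < 0 or rl < 0 or rt + size > height or rl + size > width:
                continue  # candidate block would fall outside the image
            tested += 1
            cost = sad(ref, cur, rt, rl, top, left, size)
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    return best + (tested,)


def frame_with_pattern(top, left, dim=12, size=4):
    """Black frame containing one distinctive size x size pattern."""
    frame = [[0] * dim for _ in range(dim)]
    for r in range(size):
        for c in range(size):
            frame[top + r][left + c] = 50 + 4 * r + c
    return frame


ref = frame_with_pattern(2, 3)   # preceding image
cur = frame_with_pattern(4, 4)   # same pattern, moved 2 down and 1 right

print(best_match(ref, cur, top=4, left=4, size=4, radius=2))
print(best_match(ref, cur, top=4, left=4, size=4, radius=4))
```

A window of radius 2 tests 25 candidate positions and radius 4 tests 81, so the work grows with the square of the search range while the block found here is the same.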

B frames

Like P frames, B frames are coded as differences relative to a reference image, except that the reference can be either the preceding image (as for P frames) or the following one. This allows better compression, but introduces a delay (since the following image needs to be known) and makes it necessary to keep three images in memory (the preceding one, the current one and the following one).
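The delay comes from the fact that a B frame can only be handled once the image that follows it is available. A minimal sketch of the resulting reordering is shown below; the frame labels and the function name are hypothetical, and the sketch ignores everything except frame order.

```python
def coding_order(display):
    """Reorder frame labels so every B frame comes after both of its references."""
    out, pending_b = [], []
    for frame in display:
        if frame[0] == "B":
            pending_b.append(frame)   # hold back until the next I or P frame
        else:
            out.append(frame)         # the reference frame goes first
            out.extend(pending_b)     # then the B frames that depend on it
            pending_b = []
    out.extend(pending_b)             # trailing B frames with no following reference
    return out


display = ["I0", "B1", "B2", "P3", "B4", "B5", "P6"]
print(coding_order(display))
```

In this example P3 is processed before B1 and B2 even though it is displayed after them, so the first B frame cannot be produced until three images have been seen.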

D frames

These images offer very low quality but allow very fast decompression, which is particularly useful during fast-forward viewing, where “normal” decoding would require too many processor resources.
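The "block averages" behind D frames can be illustrated by replacing each block of an image with its mean value. The block size of 8 used here is an assumption, as the text does not give one, and the function name is hypothetical; the image dimensions are assumed to be multiples of the block size.

```python
def block_averages(frame, size=8):
    """Return one averaged value per size x size block (integer mean)."""
    height, width = len(frame), len(frame[0])
    return [
        [
            sum(
                frame[r][c]
                for r in range(top, top + size)
                for c in range(left, left + size)
            ) // (size * size)
            for left in range(0, width, size)
        ]
        for top in range(0, height, size)
    ]


# A 16x16 frame made of four flat quadrants.
frame = [
    [(0 if c < 8 else 64) if r < 8 else (128 if c < 8 else 255) for c in range(16)]
    for r in range(16)
]
print(block_averages(frame))
```

A 16x16 image is reduced to a 2x2 grid of values, which shows both why the result is coarse and why it is cheap to decode.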

In practice…

To optimize MPEG coding, in practice image sequences are coded as a succession of I, B, and P images (D being, as mentioned above, reserved for fast-forward viewing), in an order that was determined experimentally. The sequence, known as a GOP (Group Of Pictures), is the following:


An I image is thus inserted every 12 frames.
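A sequence with one I image every 12 frames can be generated with the sketch below. Only the 12-frame spacing comes from the text; placing a P frame every third position with B frames in between is an assumption chosen for illustration, and the function names are hypothetical.

```python
def gop_pattern(gop_size=12, anchor_spacing=3):
    """Frame types for one group of pictures, in display order.

    gop_size is the I-frame spacing from the text; anchor_spacing
    (distance between I/P frames) is an assumed value.
    """
    types = []
    for i in range(gop_size):
        if i == 0:
            types.append("I")
        elif i % anchor_spacing == 0:
            types.append("P")
        else:
            types.append("B")
    return "".join(types)


def frame_type(n, gop_size=12, anchor_spacing=3):
    """Type of the n-th frame of a stream built from repeated groups."""
    return gop_pattern(gop_size, anchor_spacing)[n % gop_size]


print(gop_pattern())
print([n for n in range(25) if frame_type(n) == "I"])
```

With these assumed settings each group holds 1 I frame, 3 P frames and 8 B frames, and the I frames fall on frames 0, 12 and 24.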

CCM is a leading international tech website. Our content is written in collaboration with IT experts, under the direction of Jean-François Pillou, founder of CCM. CCM reaches more than 50 million unique visitors per month and is available in 11 languages.
This document, titled « MPEG format », is available under the Creative Commons license. Any copy, reuse, or modification of the content should be sufficiently credited to CCM (