Introduction to digital video

January 2017

What is a video?

A video is a succession of images presented at a certain rate. The human eye can distinguish approximately 20 images per second. Thus, when more than 20 images are displayed per second, the eye is tricked into perceiving a moving image. The fluidity of a video is characterized by its frame rate, i.e. the number of images per second, expressed in FPS (frames per second).

In addition, multimedia video is usually accompanied by sound, i.e. audio data.

Digital and analogue video

“Animated images” are usually classified into several large families:

  • Cinema, which consists in storing the succession of negative images on a film. The film is displayed by using a light source which projects the successive images, from a positive copy, onto a screen.

  • Analogue video, which represents information as a continuous flow of analogue data, is intended to be shown on a TV screen (based on the scanning principle). There are several standards for analogue video. The three main ones are PAL, SECAM and NTSC.

  • Digital video, which consists in coding the video as a succession of digital images.


The PAL/SECAM format (Phase Alternating Line/Sequential Color with Memory), used in Europe for terrestrial (over-the-air) television, codes video on 625 rows (only 576 are displayed, because about 8% of the rows are used for synchronization), at a rate of 25 images per second in a 4:3 format (i.e. with a width/height ratio of 4/3).

However, at 25 images per second, many people perceive flicker in the image. Since bandwidth limitations made it impossible to send more information, it was decided to interlace the images, i.e. to send the even rows first, then the odd rows. The term “field” thus designates the “half-image” formed either by the even rows or by the odd rows. A display built from two such fields per image is said to be interlaced. When there is no interlacing, it is said to be progressive.


Thanks to this process, called "interlacing", a PAL/SECAM television set displays 50 fields per second (i.e. at a frequency of 50 Hz), that is to say, two fields for each of the 25 images per second.
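The splitting of an image into two fields can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the image is modelled as a list of rows, rows are numbered from 0 (so row 0 counts as "even"), and the function names `split_fields` and `weave_fields` are hypothetical, not part of any standard.

```python
def split_fields(frame):
    """Split a frame (a list of rows) into its even and odd fields."""
    even_field = frame[0::2]  # rows 0, 2, 4, ... (assumed zero-based numbering)
    odd_field = frame[1::2]   # rows 1, 3, 5, ...
    return even_field, odd_field


def weave_fields(even_field, odd_field):
    """Rebuild a full frame by interleaving the rows of the two fields."""
    frame = []
    for i in range(len(even_field) + len(odd_field)):
        source = even_field if i % 2 == 0 else odd_field
        frame.append(source[i // 2])
    return frame


# A toy 6-row "image"; each row is just a label here.
frame = [f"row{i}" for i in range(6)]
even, odd = split_fields(frame)

FRAMES_PER_SECOND = 25                     # PAL/SECAM image rate from the text
fields_per_second = 2 * FRAMES_PER_SECOND  # two fields are sent per image
```

Here `even` holds rows 0, 2 and 4, `odd` holds rows 1, 3 and 5, and weaving them back together restores the original frame. Each field carries half of the rows, which is why 50 fields per second need no more bandwidth than 25 full images.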


The NTSC standard (National Television System Committee), used in the United States and Japan, uses a system of 525 interlaced rows at 30 images per second (i.e. 60 fields per second, a frequency of 60 Hz). As with PAL/SECAM, about 8% of the rows are used to synchronize the receiver. Thus, since NTSC displays a 4:3 image, the resolution actually displayed is 640x480.
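The figures quoted for both standards can be cross-checked with a short calculation. The sketch below takes the row counts from the text and assumes square pixels when deriving a width from the 4:3 ratio. The helper names `hidden_fraction` and `width_for` are illustrative only.

```python
from fractions import Fraction

ASPECT_RATIO = Fraction(4, 3)  # width / height, as stated in the text

# (total rows, displayed rows) for each standard, taken from the text
STANDARDS = {
    "PAL/SECAM": (625, 576),
    "NTSC": (525, 480),
}


def hidden_fraction(total_rows, displayed_rows):
    """Share of the rows that is not displayed (used for synchronization)."""
    return (total_rows - displayed_rows) / total_rows


def width_for(displayed_rows, aspect=ASPECT_RATIO):
    """Width in pixels for a given number of rows, assuming square pixels."""
    return int(displayed_rows * aspect)


for name, (total, shown) in STANDARDS.items():
    print(f"{name}: {hidden_fraction(total, shown):.1%} of rows hidden, "
          f"{width_for(shown)}x{shown} with square pixels")
```

The "8%" in the text is therefore a rounded figure: the exact share is a little under 8% for PAL/SECAM and a little over 8% for NTSC. With 480 displayed rows, the 4:3 ratio gives the 640-pixel width quoted above.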

Digital video

Digital video consists in showing a succession of digital images. Since these digital images are displayed at a certain rate, it is possible to work out the data rate required to display the video, i.e. the number of bytes displayed (or transferred) per unit of time.

Thus, the rate required to display a video (in bytes per second) is equal to the size of one image (in bytes) multiplied by the number of images per second.

Consider a true color image (24 bits per pixel) with a definition of 640x480 pixels. In order to display a video correctly at this definition, it is necessary to display at least 30 images per second, i.e. at a rate equal to:

640 x 480 x 3 bytes = 900 KB per image
900 KB x 30 images/s = approximately 27 MB/s
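The same calculation can be written as a small Python sketch. The function name `raw_video_rate` is hypothetical, and the example assumes uncompressed video and 1 KB = 1024 bytes, which is what makes one image come to exactly 900 KB.

```python
def raw_video_rate(width, height, bits_per_pixel, fps):
    """Bytes per second needed to display uncompressed video."""
    bytes_per_image = width * height * bits_per_pixel // 8
    return bytes_per_image * fps


# True color (24 bits per pixel), 640x480, 30 images per second
bytes_per_image = 640 * 480 * 24 // 8
rate = raw_video_rate(640, 480, 24, 30)

print(f"{bytes_per_image / 1024:.0f} KB per image")
print(f"{rate / 1024:.0f} KB/s, i.e. {rate / 1_000_000:.1f} million bytes per second")
```

Changing any of the four parameters shows how quickly the raw rate grows with definition, color depth or frame rate, which is the motivation for reducing the amount of data.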


Since the eye is not very sensitive to chrominance variations, the technique known as chroma subsampling (also called decimation) consists in removing part of the chrominance information within each group of 4x4 pixels.
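As a rough illustration of the idea, the sketch below reduces a chrominance plane by keeping a single value per block of pixels. It rests on several assumptions not stated in the text: the plane is a list of rows of integers, the retained value is the block average (other schemes simply keep one sample), and the function name `subsample_chroma` is hypothetical. The luminance plane would be left untouched.

```python
def subsample_chroma(plane, block=4):
    """Replace each block x block group of chroma samples by its average."""
    height, width = len(plane), len(plane[0])
    reduced = []
    for y in range(0, height, block):
        row = []
        for x in range(0, width, block):
            # Collect the samples of the current block (clipped at the edges)
            samples = [plane[j][i]
                       for j in range(y, min(y + block, height))
                       for i in range(x, min(x + block, width))]
            row.append(sum(samples) // len(samples))
        reduced.append(row)
    return reduced


# A toy 4x4 chrominance plane (values are arbitrary)
cb = [
    [100, 102, 200, 202],
    [104, 106, 204, 206],
    [50, 52, 10, 12],
    [54, 56, 14, 16],
]

one_value = subsample_chroma(cb, block=4)   # one sample for the 4x4 group
four_values = subsample_chroma(cb, block=2)  # one sample per 2x2 group
```

With `block=4`, the 16 chrominance samples of the group are replaced by a single one, so the chrominance data for that group shrinks by a factor of 16. A smaller block keeps more color detail at the cost of more data.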


