Video Compression
So far our discussion of compression has dealt with still images. Those techniques exploit the spatial correlation that exists within a still image. When we want to compress video, or a sequence of images, we have an added dimension to exploit, namely, the temporal dimension. Generally, there is little change in the spatial arrangement of objects between two or more consecutive frames in a video. Therefore, it is advantageous to send or store the differences between consecutive frames rather than sending or storing each frame in full. The difference frame is called the residual or differential frame and may contain far less detail than the actual frame itself. Due to this reduction in detail in the differential frames, compression is achieved. To illustrate the idea, let us consider compressing two consecutive frames (frame 120 and frame 121) of a video sequence, as shown in Figures 1a and 1b, respectively (see the M-file listing shown below). The difference between frames 121 and 120 is shown in Figure 1c.
Figure 1. Compressing a video sequence: (a) frame 120 of a table tennis video sequence; (b) frame 121 of the video sequence; (c) difference between frames 121 and 120; (d) histogram of frame 121; (e) histogram of the difference of frames; (f) quantized difference frame; and (g) frame 121 reconstructed by adding the quantized difference frame to frame 120.
The differential frame has a small amount of detail corresponding to the movements of the hand and the racket. Note that stationary objects do not appear in the difference frame. This is evident from the histogram of the differential frame shown in Figure 1e, where the intensity range occupied by the differential pixels is much smaller. Compare this with the histogram of frame 121 in Figure 1d, which is much wider. The quantized differential frame and the reconstructed frame 121 are shown in Figures 1f and 1g, respectively. We see some distortion in the edges due to quantization.
When objects move between successive frames, simple differencing will introduce large residual values, especially when the motion is large. Due to this relative motion of objects, simple differencing is not efficient from the point of view of achievable compression. It is more advantageous to estimate the relative motions of objects between successive frames, compensate for the motion, and then do the differencing, thereby achieving much higher compression. This type of prediction is known as motion compensated prediction. Because we perform motion estimation and compensation at the encoder, we need to inform the decoder about this motion compensation. This is done by sending motion vectors as side information, which convey the object motion in the horizontal and vertical directions. The decoder then uses the motion vectors to align the blocks and reconstruct the image. A block-matching sketch of this idea follows the M-file listing below.
% generates a differential frame by subtracting two
% temporally adjacent intensity image frames,
% quantizes the differential frame, and reconstructs the
% original frame by adding the quantized differential frame
% to the other frame.
close all
clear
Frm1 = 'tt120.ras';
Frm2 = 'tt121.ras';
I = imread(Frm1); % read frame # 120
I1 = im2single(I); % convert from uint8 to float single
I = imread(Frm2); % read frame # 121
figure,imhist(I,256),title(['Histogram of frame ' num2str(121)])
xlabel('Pixel Value'), ylabel('Pixel Count')
I2 = im2single(I); % convert from uint8 to float single
clear I
figure,imshow(I1,[]), title([num2str(120) 'th frame'])
figure,imshow(I2,[]), title([num2str(121) 'st frame'])
%
Idiff = imsubtract(I2,I1); % subtract frame 120 from frame 121
figure,imhist(Idiff,256),title('Histogram of difference image')
xlabel('Pixel Value'), ylabel('Pixel Count')
figure,imshow(Idiff,[]),title('Difference image')
% quantize and dequantize the differential image
IdiffQ = round(4*Idiff)/4; % uniform quantizer with step size 1/4
figure,imshow(IdiffQ,[]),title('Quantized Difference image')
y = I1 + IdiffQ; % reconstruct frame 121
figure,imshow(y,[]),title('Reconstructed image')
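To make motion compensated prediction concrete, here is a minimal full-search block-matching sketch; it is not part of the original listing. It reuses the variables I1 (reference frame) and I2 (current frame) from the listing above, and the block size of 16 and search range of ±7 pixels are arbitrary choices for illustration. For each block of the current frame it finds the displacement in the reference frame that minimizes the sum of absolute differences (SAD), keeps that displacement as the motion vector, and builds the motion-compensated prediction; the residual is what would be quantized and coded in place of the simple difference.
% full-search block-matching motion estimation (illustrative sketch)
% assumes I1 (reference) and I2 (current) are single-precision grayscale
% images whose dimensions are divisible by the block size B
B = 16; R = 7; % block size and search range (assumed values)
[rows, cols] = size(I2);
pred = zeros(rows, cols, 'single'); % motion-compensated prediction
for r = 1:B:rows-B+1
    for c = 1:B:cols-B+1
        cur = I2(r:r+B-1, c:c+B-1);
        best = inf; mv = [0 0]; % motion vector for this block
        for dr = -R:R
            for dc = -R:R
                rr = r + dr; cc = c + dc;
                if rr < 1 || cc < 1 || rr+B-1 > rows || cc+B-1 > cols
                    continue % skip candidates outside the frame
                end
                ref = I1(rr:rr+B-1, cc:cc+B-1);
                sad = sum(abs(cur(:) - ref(:))); % matching criterion
                if sad < best
                    best = sad; mv = [dr dc];
                end
            end
        end
        pred(r:r+B-1, c:c+B-1) = ...
            I1(r+mv(1):r+mv(1)+B-1, c+mv(2):c+mv(2)+B-1);
    end
end
Iresid = I2 - pred; % motion-compensated residual
In a real codec the per-block motion vectors mv would be entropy coded and sent to the decoder as side information along with the quantized residual.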
A video sequence is generally divided into scenes, with scene changes marking the boundaries between consecutive scenes. Frames within a scene are similar, and there is a high temporal correlation between successive frames within a scene. We may, therefore, send differential frames within a scene to achieve high compression. However, when the scene changes, differencing may result in far more detail than the actual frame due to the absence of correlation, and therefore compression may not be possible. The first frame in a scene is referred to as the key frame, and it is compressed by any of the previously mentioned schemes, such as the DCT or DWT. The other frames in the scene are compressed using temporal differencing, as sketched below.
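The following loop is a minimal sketch of this key-frame scheme, again not from the original text. The frame source frames{k}, the scene-change test, and the threshold T are all assumptions made for illustration; a real codec would code the key frame with the DCT or DWT rather than the simple quantizer reused here. Note that the encoder predicts from the reconstructed previous frame, so the decoder, which only ever has reconstructions, stays in step with it.
% key frame plus temporal differencing (illustrative sketch)
% frames{} is an assumed cell array of uint8 grayscale frames
T = 0.1; % hypothetical scene-change threshold on mean absolute difference
prev = []; % reconstruction of the previous frame
for k = 1:numel(frames)
    f = im2single(frames{k});
    if isempty(prev) || mean(abs(f(:) - prev(:))) > T
        rec = round(4*f)/4; % scene change: code the key frame itself
    else
        d = f - prev; % differential frame within the scene
        rec = prev + round(4*d)/4; % add back the quantized difference
    end
    prev = rec; % predict from the reconstruction, as the decoder will
end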
VIDEO COMPRESSION STANDARDS
Interoperability is crucial when different platforms and devices are involved in the delivery of images and video data. If, for instance, images and video are compressed using a proprietary algorithm, then decompression at the user end is not feasible unless the same proprietary algorithm is used, thereby encouraging monopolization.
Figure 2. A taxonomy of image and video compression methods.
This, therefore, calls for standardization of the compression algorithms as well as the data transport mechanisms and protocols, so as to guarantee not only interoperability but also competitiveness. This will eventually open up growth potential for the technology and will benefit consumers as prices come down. It has motivated people to form organizations across nations to develop solutions for interoperability.
The first successful standard for still image compression, known as JPEG, was developed jointly by the International Organization for Standardization (ISO) and the International Telegraph and Telephone Consultative Committee (CCITT) in a collaborative effort. CCITT is now known as the Telecommunication Standardization Sector of the International Telecommunication Union (ITU-T). The JPEG standard uses the DCT as the compression tool for grayscale and true color still image compression. In 2000, JPEG [4] adopted the 2D DWT as the compression vehicle.
For video coding and distribution, MPEG was developed under the auspices of the ISO and the International Electrotechnical Commission (IEC). MPEG [5] denotes a family of standards used to compress audio-visual information. Since its inception, the MPEG standard has been extended into several versions. MPEG-1 was meant for video compression at a rate of about 1.5 Mb/s, suitable for CD-ROM. MPEG-2 aims for higher data rates of 10 Mb/s or more and is intended for SD and HD TV applications. MPEG-4 is intended for very low data rates of 64 kb/s or less. MPEG-7 is concerned more with the standardization of the description of multimedia information than with compression; it is intended to enable efficient search of multimedia content and is aptly called the multimedia content description interface. MPEG-21 aims at enabling the use of multimedia sources across many different networks and devices used by different communities in a transparent manner. This is to be accomplished by defining the entire multimedia framework as digital items.