FRIENDz

FAILURE IS THE SIGN OF SUCCESS!!

IMAGE AND VIDEO COMPRESSION TECHNIQUES

Still Image Compression
Let us first see the difference between data compression and bandwidth compression.
Data compression refers to the process of reducing the digital source data to a desired
level. On the other hand, bandwidth compression refers to the process of reducing the
analog bandwidth of the analog source. What do we really mean by these terms? Here
is an example. Consider the conventional wire line telephony. A subscriber’s voice
is filtered by a lowpass filter to limit the bandwidth to a nominal value of 4 kHz. So,
the channel bandwidth is 4 kHz. Suppose that it is converted to digital data for longdistance
transmission. As we will see later, in order to reconstruct the original analog
signal that is band limited to 4 kHz exactly, sampling theory dictates that one should
have at least 8000 samples per second. Additionally, for digital transmission each
analog sample must be converted to a digital value. In telephony, each analog voice
sample is converted to an 8-bit digital number using pulse code modulation (PCM).

Therefore, the voice data rate that a subscriber originates is 64,000 bits per second.
As we mentioned above, in an ideal case this digital source will require 32 kHz of
bandwidth for transmission. Even if we employ some form of data compression to
reduce the source rate to say, 16 kilobits per second, it will still require at least 8 kHz
of channel bandwidth for real-time transmission. Hence, data compression does not
necessarily reduce the analog bandwidth. Note that the original analog voice requires
only 4 kHz of bandwidth. If we want to compress bandwidth, we can simply filter
theanalog signal by a suitable filter with a specified cutoff frequency to limit the
bandwidth occupied by the analog signal.

Figure 1Original cameraman picture.
Having clarified the terms data compression and bandwidth compression, let us
look into some basic data compression techniques known to us. Henceforth, we will
use the terms compression and data compression interchangeably. All image and
video sources have redundancies. In a still image, each pixel in a row may have a
value very nearly equal to a neighboring pixel value. As an example, consider the
cameraman picture shown in Figure 1. Figure 2 shows the profile (top figure)

Figure 2 Profile of cameraman image along row number 164. The top graph shows pixel
intensity, and the bottom graph shows corresponding normalized correlation over 128 pixel
displacements.
and the corresponding correlation (bottom figure) of the cameraman picture along
row 164. The MATLAB M-file for generating Figure 1.4 is listed below. Observe
that the pixel values are very nearly the same over a large number of neighboring
pixels and so is the pixel correlation. In other words, pixels in a row have a high correlation.
Similarly, pixels may also have a high correlation along the columns. Thus,
pixel redundancies translate to pixel correlation. The basic principle behind image
data compression is to decorrelatethe pixels and encode the resulting decorrelated
image for transmission or storage. A specific compression scheme will depend on
the method by which the pixel correlations are removed.
Figure1 4.m
% Plots the image intensity profile and pixel correlation
% along a specified row.
clear
close all
I = imread(’cameraman.tif’);
figure,imshow(I),title(’Input image’)
%
Row = 164; % row number of image profile
x = double(I(Row,:));
Col = size(I,2);
%
MaxN = 128; % number of correlation points to calculate
Cor = zeros(1,MaxN); % array to store correlation values
for k = 1:MaxN
l = length(k:Col);
Cor(k) = sum(x(k:Col) .* x(1:Col-k+1))/l;
end
MaxCor = max(Cor);
Cor = Cor/MaxCor;
figure,subplot(2,1,1),plot(1:Col,x,’k’,’LineWidth’,2)
xlabel(’Pixel number’), ylabel(’Amplitude’)
legend([’Row’ ’ ’ num2str(Row)],0)
subplot(2,1,2),plot(0:MaxN-1,Cor,’k’,’LineWidth’,2)
xlabel(’Pixel displacement’), ylabel(’Normalized corr.’)

One of the earliest and basic image compression techniques is known as the differential
pulse code modulation (DPCM) [1]. If the pixel correlation along only one
dimension (row or column) is removed, then the DPCM is called one-dimensional
(1D) DPCM or row-by-row DPCM. If the correlations along both dimensions are
removed, then the resulting DPCM is known as 2D DPCM. A DPCM removes
pixel correlation and requantizes the residual pixel values for storage or transmission.
The residual image has a variance much smaller than that of the original image.
Further, the residual image has a probability density function, which is a doublesided
exponential function. These give rise to compression.
The quantizer is fixed no matter how the decorrelated pixel values are. A variation
on the theme is to use quantizers that adapt to changing input statistics, and therefore,
the corresponding DPCM is called an adaptive DPCM. DPCM is very simple to
implement, but the compression achievable is about 4:1. Due to limited bit width of
thequantizer for the residual image, edges are not preserved well in the DPCM. It
also exhibits occasional streaks across the image when channel error occurs.We will
discuss DPCM in detail in a later chapter.
Another popular and more efficient compression scheme is known by the generic
nametransform coding. Remember that the idea is to reduce or remove pixel correlation
to achieve compression. In transform coding, a block of image pixels is linearly
transformed into another block of transform coefficients of the same size as the pixel
block with the hope that only a few of the transform coefficients will be significant
and the rest may be discarded. This implies that storage space is required to store
only the significant transform coefficients, which are a fraction of the total number
of coefficients and hence the compression. The original image can be reconstructed
by performing the inverse transform of the reduced coefficient block. It must be
pointed out that the inverse transform must exist for unique reconstruction. There are
a number of such transforms available in the field to choose from, each having its own
merits and demerits. The most efficient transform is one that uses the least number of
transform coefficients to reconstruct the image for a given amount of distortion. Such
a linear transform is known as the optimal transform where optimality is with respect
to the minimum mean square error between the original and reconstructed images.
This optimal image transform is known by the names Karhunen–Lo`eve transform
(KLT) or Hotelling transform. The disadvantage of the KLT is that the transform
kernel depends on the actual image to be compressed, which requires a lot more side
information for the receiver to reconstruct the original image from the compressed
image than other fixed transforms. A highly popular fixed transform is the familiar
discrete cosine transform (DCT). The DCT has very nearly the same compression
efficiencyas the KLT with the advantage that its kernel is fixed and so no side information
is required by the receiver for the reconstruction. The DCT is used in
the JPEG and MPEG video compression standards. The DCT is usually applied on
nonoverlapping blocks of an image. Typical DCT blocks are of size 8 × 8 or 16 × 16.
One of the disadvantages of image compression using the DCT is the blocking
artifact. Because the DCT blocks are small compared with the image and because
the average values of the blocks may be different, blocking artifacts appear when
the zero-frequency (dc) DCT coefficients are quantized rather heavily. However, at
low compression, blocking artifacts are almost unnoticeable. An example showing
blockingartifacts due to compression using 8 × 8 DCT is shown in Figure 3 a.
Blockiness is clearly seen in flat areas—both low and high intensities as well as undershoot
and overshoot along the sharp edges—see Figure 3 b. A listing of M-file
for Figures 3 a,b is shown below.

Figure 3 (a) Cameraman image showing blocking artifacts due to quantization of the DCT
coefficients. The DCT size is 8 × 8. (b) Intensity profile along row number 164 of the
image in (a).
% Figure3.m
% Example to show blockiness in DCT compression
% Quantizes and dequantizes an intensity image using
% 8x8 DCT and JPEG quantization matrix
close all
clear
I = imread(’cameraman.tif’);
figure,imshow(I), title(’Original Image’)
%
fun = @dct2; % 2D DCT function
N = 8; % block size of 2D DCT
T = blkproc(I,[N N],fun); % compute 2D DCT of image using NxN blocks
%
Scale = 4.0; % increasing Scale quntizes DCT coefficients heavily
% JPEG default quantization matrix
jpgQMat = [16 11 10 16 24 40 51 61;
12 12 14 19 26 58 60 55;
14 13 16 24 40 57 69 56;
14 17 22 29 51 87 80 62;
18 22 37 56 68 109 103 77;
24 35 55 64 81 194 113 92;
49 64 78 87 103 121 120 101;
72 92 95 98 121 100 103 99];
Qstep = jpgQMat * Scale; % quantization step size
% Quantize and dequantize the coefficients
for k = 1:N:size(I,1)
for l = 1:N:size(I,2)
T1(k:k+N-1,l:l+N-1) = round(T(k:k+N-1,l:l+N-1)./ Qstep).*Qstep;
end
end
% do inverse 2D DCT
fun = @idct2;
y = blkproc(T1,[N N],fun);
y = uint8(round(y));
figure,imshow(y), title(’DCT compressed Image’)
% Plot image profiles before and after compression
ProfRow = 164;
figure,plot(1:size(I,2),I(ProfRow,:),’k’,’LineWidth’,2)
hold on
plot(1:size(I,2),y(ProfRow,:),’-.k’,’LineWidth’,1)
title([’Intensity profile of row ’ num2str(ProfRow)])
xlabel(’Pixel number’), ylabel(’Amplitude’)
%legend([’Row’ ’ ’ num2str(ProfRow)],0)
legend(’Original’,’Compressed’,0)

A third and relatively recent compression method is based on wavelet transform.
As we will see in a later chapter, wavelet transform captures both long-term and
short-term changes in an image and offers a highly efficient compression mechanism.
As a result, it is used in the latest versions of the JPEG standards as a compression
tool. It is also adopted by the SMPTE (Society of Motion Pictures and Television
Engineers). Even though the wavelet transform may be applied on blocks of an image
like the DCT, it is generally applied on the full image and the various wavelet

Figure 4.A two-level 2D DWT of cameraman image.
coefficients are quantized according to their types. A two-level discrete wavelet transform
(DWT) of the cameraman image is shown in Figure 1.6 to illustrate how the 2D
wavelet transform coefficients look like. Details pertaining to the levels and subbands
of the DWT will be given in a later chapter. The M-file to implement multilevel 2D
DWT that generates Figure 5 is listed below. As we will see in a later chapter, the
2D DWT decomposes an image into one approximation and many detail coefficients.
The number of coefficient subimages corresponding to an L-level 2D DWT equals
3 × L + 1. Therefore, for a two-level 2D DWT, there are seven coefficient subimages.
In the first level, there are three detail coefficient subimages, each of size 14
the original image. The second level consists of four sets of DWT coefficients—one
approximation and three details, each 1
16 the original image. As the name implies the
approximation coefficients are lower spatial resolution approximations to the original
image. The detail coefficients capture the discontinuities or edges in the image with
orientations in the horizontal, vertical, and diagonal directions. In order to compress,
an image using 2D DWT we have to compute the 2D DWT of the image up to a given
level and then quantize each coefficient subimage. The achievable quality and compression
ratio depend on the chosen wavelets and quantization method. The visual
effect of quantization distortion in DWT compression scheme is different from that
in DCT-based scheme. Figure 6 a is the cameraman image compressed using 2D
DWT. The wavelet used is called Daubechies 2 (db2 in MATLAB) and the number
of levels used is 1. We note that there are no blocking effects, but there are patches
in the flat areas. We also see that the edges are reproduced faithfully as evidenced
in the profile (Figure 6 b). It must be pointed out that the amount of quantization
applied in Figure 6 a is not the same as that used for the DCT example and that the
two examples are given only to show the differences in the artifacts introduced by
the two schemes. An M-file listing to generate Figures 6 a,b is shown below


% 2D Discrete Wavelet Transform (DWT)
% Computes multi-level 2D DWT of an intensity image
close all
clear
I = imread(’cameraman.tif’);
figure,imshow(I), title(’Original Image’)
L = 2; % number of levels in DWT
[W,B] = wavedec2(I,L,’db2’); % do a 2-level DWT using db2 wavelet
% declare level-1 subimages
w11 = zeros(B(3,1),B(3,1));
w12 = zeros(B(3,1),B(3,1));
w13 = zeros(B(3,1),B(3,1));
% declare level-2 subimages
w21 = zeros(B(1,1),B(1,1));
w22 = zeros(B(1,1),B(1,1));
w23 = zeros(B(1,1),B(1,1));
w24 = zeros(B(1,1),B(1,1));
% extract level-1 2D DWT coefficients
offSet11 = 4*B(1,1)*B(1,2);
offSet12 = 4*B(1,1)*B(1,2)+B(3,1)*B(3,2);
offSet13 = 4*B(1,1)*B(1,2)+2*B(3,1)*B(3,2);
for c = 1:B(2,2)
for r = 1:B(2,1)
w11(r,c) = W(offSet11+(c-1)*B(3,1)+r);
w12(r,c) = W(offSet12+(c-1)*B(3,1)+r);
w13(r,c) = W(offSet13+(c-1)*B(3,1)+r);
end
end
% extract level-2 2D DWT coefficients
offSet22 = B(1,1)*B(1,2);
offSet23 = 2*B(1,1)*B(1,2);
offSet24 = 3*B(1,1)*B(1,2);
for c = 1:B(1,2)
for r = 1:B(1,1)
w21(r,c) = W((c-1)*B(1,1)+r);
w22(r,c) = W(offSet22+(c-1)*B(1,1)+r);
w23(r,c) = W(offSet23+(c-1)*B(1,1)+r);
w24(r,c) = W(offSet24+(c-1)*B(1,1)+r);
end
end
% declare output array y to store all the DWT coefficients
%y = zeros(261,261);
y = zeros(2*B(1,1)+B(3,1),2*B(1,2)+B(3,2));
y(1:B(1,1),1:B(1,1))=w21;
y(1:B(1,1),B(1,1)+1:2*B(1,1))=w22;
y(B(1,1)+1:2*B(1,1),1:B(1,1))=w23;
y(B(1,1)+1:2*B(1,1),B(1,1)+1:2*B(1,1))=w24;
%
y(1:B(3,1),2*B(1,1)+1:261)=w11;
y(2*B(1,1)+1:261,1:129)=w12;
y(2*B(1,1)+1:261,2*B(1,1)+1:261)=w13;
figure,imshow(y,[]),title([num2str(L) ’-level 2D DWT’])
% Figure6.m
% An example to show the effect of quantizing the 2D DWT
% coefficients of an intensity image along with intensity
% profile along a specified row
%
close all
clear
I = imread(’cameraman.tif’);
figure,imshow(I), title(’Original Image’)
% do a 1-level 2D DWT
[W,B] = wavedec2(I,1,’db2’);
w11 = zeros(B(1,1),B(1,2));
w12 = zeros(B(1,1),B(1,2));
w13 = zeros(B(1,1),B(1,2));
w14 = zeros(B(1,1),B(1,2));
%
offSet12 = B(1,1)*B(1,2);
offSet13 = 2*B(1,1)*B(1,2);
offSet14 = 3*B(1,1)*B(1,2);
% quantize only the approximation coefficients
Qstep = 16;
for c = 1:B(1,2)
for r = 1:B(1,1)
W((c-1)*B(1,1)+r) = floor(W((c-1)*B(1,1)+r)/Qstep)*Qstep;
%{
W(offSet12+(c-1)*B(1,1)+r) = floor(W(offSet12+(c-1)*B(1,1)+r)/8)*8;
W(offSet13+(c-1)*B(1,1)+r) = floor(W(offSet13+(c-1)*B(1,1)+r)/8)*8;
W(offSet14+(c-1)*B(1,1)+r) = floor(W(offSet14+(c-1)*B(1,1)+r)/8)*8;
%}
end
end
% do inverse 2D DWT
y = waverec2(W,B,’db2’);
figure,imshow(y,[])
% plot profile
ProfRow = 164;
figure,plot(1:size(I,2),I(ProfRow,:),’k’,’LineWidth’,2)
hold on
plot(1:size(I,2),y(ProfRow,:),’-.k’,’LineWidth’,1)
title([’Profile of row ’ num2str(ProfRow)])
xlabel(’Pixel number’), ylabel(’Amplitude’)
%legend([’Row’ ’ ’ num2str(ProfRow)],0)
legend(’Original’,’Compressed’,0)


 

Figure 6(a) Cameraman image compressed using one-level 2D DWT. (b) Intensity profile of

No comments:

Post a Comment