Still Image Compression
Let us first see the difference between data compression and bandwidth compression. Data compression refers to the process of reducing the digital source data to a desired level. On the other hand, bandwidth compression refers to the process of reducing the analog bandwidth of the analog source. What do we really mean by these terms? Here is an example. Consider conventional wireline telephony. A subscriber's voice is filtered by a lowpass filter to limit the bandwidth to a nominal value of 4 kHz, so the channel bandwidth is 4 kHz. Suppose that it is converted to digital data for long-distance transmission. As we will see later, in order to reconstruct exactly the original analog signal that is band limited to 4 kHz, sampling theory dictates that one should have at least 8000 samples per second. Additionally, for digital transmission each analog sample must be converted to a digital value. In telephony, each analog voice sample is converted to an 8-bit digital number using pulse code modulation (PCM). Therefore, the voice data rate that a subscriber originates is 64,000 bits per second. As we mentioned above, in an ideal case this digital source will require 32 kHz of bandwidth for transmission. Even if we employ some form of data compression to reduce the source rate to, say, 16 kilobits per second, it will still require at least 8 kHz of channel bandwidth for real-time transmission. Hence, data compression does not necessarily reduce the analog bandwidth. Note that the original analog voice requires only 4 kHz of bandwidth. If we want to compress bandwidth, we can simply filter the analog signal by a suitable filter with a specified cutoff frequency to limit the bandwidth occupied by the analog signal.
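The numbers above reduce to simple arithmetic. As a quick check, here is a small Python sketch (Python is used only for illustration; the ideal figure of 2 bits/s per hertz of channel bandwidth is the assumption carried over from the discussion):

```python
# Idealized telephony figures from the discussion above.
analog_bandwidth_hz = 4_000               # lowpass-filtered voice bandwidth
sample_rate = 2 * analog_bandwidth_hz     # Nyquist rate: 8000 samples/s
bits_per_sample = 8                       # PCM word length used in telephony

bit_rate = sample_rate * bits_per_sample  # 64,000 bits per second
# Assumed ideal channel efficiency: 2 bits/s per Hz of bandwidth.
channel_bw_hz = bit_rate / 2              # 32 kHz for the uncompressed stream
compressed_bw_hz = 16_000 / 2             # still 8 kHz at 16 kbit/s

print(bit_rate, channel_bw_hz, compressed_bw_hz)
```

Either way, the digital stream needs more than the 4 kHz the analog voice itself occupies, which is the whole point of the distinction.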
Figure 1 Original cameraman picture.
Having clarified the terms data compression and bandwidth compression, let us look into some basic data compression techniques known to us. Henceforth, we will use the terms compression and data compression interchangeably. All image and video sources have redundancies. In a still image, each pixel in a row may have a value very nearly equal to a neighboring pixel value. As an example, consider the cameraman picture shown in Figure 1. Figure 2 shows the profile (top figure) and the corresponding correlation (bottom figure) of the cameraman picture along row 164. The MATLAB M-file for generating Figure 2 is listed below. Observe that the pixel values are very nearly the same over a large number of neighboring pixels, and so is the pixel correlation. In other words, pixels in a row have a high correlation. Similarly, pixels may also have a high correlation along the columns. Thus, pixel redundancies translate to pixel correlation. The basic principle behind image data compression is to decorrelate the pixels and encode the resulting decorrelated image for transmission or storage. A specific compression scheme will depend on the method by which the pixel correlations are removed.

Figure 2 Profile of cameraman image along row number 164. The top graph shows pixel intensity, and the bottom graph shows the corresponding normalized correlation over 128 pixel displacements.
% Figure2.m
% Plots the image intensity profile and pixel correlation
% along a specified row.
clear
close all
I = imread('cameraman.tif');
figure,imshow(I),title('Input image')
%
Row = 164; % row number of image profile
x = double(I(Row,:));
Col = size(I,2);
%
MaxN = 128; % number of correlation points to calculate
Cor = zeros(1,MaxN); % array to store correlation values
for k = 1:MaxN
    l = length(k:Col);
    Cor(k) = sum(x(k:Col) .* x(1:Col-k+1))/l;
end
MaxCor = max(Cor);
Cor = Cor/MaxCor;
figure,subplot(2,1,1),plot(1:Col,x,'k','LineWidth',2)
xlabel('Pixel number'), ylabel('Amplitude')
legend(['Row' ' ' num2str(Row)],0)
subplot(2,1,2),plot(0:MaxN-1,Cor,'k','LineWidth',2)
xlabel('Pixel displacement'), ylabel('Normalized corr.')
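For readers without MATLAB at hand, the same correlation computation can be mirrored in a few lines of Python on a synthetic smooth signal (a hypothetical stand-in for an image row). The sketch also shows the flip side of the principle: once the signal is decorrelated by differencing, the correlation collapses.

```python
import numpy as np

# Synthetic "image row": a random walk is smooth, like a real image profile.
n = 256
x = np.cumsum(np.random.default_rng(0).normal(0, 1, n))
x = x - x.mean()

def norm_corr(x, max_lag=128):
    """Normalized correlation over lags 0..max_lag-1 (as in the M-file)."""
    c = np.array([np.mean(x[k:] * x[:len(x) - k]) for k in range(max_lag)])
    return c / c[0]

c = norm_corr(x)
print(c[1])            # near 1: neighboring samples are highly correlated

d = np.diff(x)         # differencing decorrelates the signal
cd = norm_corr(d - d.mean(), 32)
print(abs(cd[1]))      # far smaller: the redundancy has been removed
```

This is exactly the redundancy that the compression schemes discussed next exploit.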
One of the earliest and most basic image compression techniques is known as differential pulse code modulation (DPCM) [1]. If the pixel correlation along only one dimension (row or column) is removed, then the DPCM is called one-dimensional (1D) DPCM or row-by-row DPCM. If the correlations along both dimensions are removed, then the resulting DPCM is known as 2D DPCM. A DPCM removes pixel correlation and requantizes the residual pixel values for storage or transmission. The residual image has a variance much smaller than that of the original image. Further, the residual image has a probability density function that is a double-sided exponential function. These give rise to compression.

The quantizer is fixed no matter what the decorrelated pixel values are. A variation on the theme is to use quantizers that adapt to changing input statistics; the corresponding DPCM is called an adaptive DPCM. DPCM is very simple to implement, but the compression achievable is only about 4:1. Due to the limited bit width of the quantizer for the residual image, edges are not preserved well in the DPCM. It also exhibits occasional streaks across the image when channel errors occur. We will discuss DPCM in detail in a later chapter.
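As a rough sketch of the idea, consider a 1D DPCM over one row, assuming a previous-pixel predictor and a uniform residual quantizer (an illustration of the principle in Python, not the implementation treated in the later chapter):

```python
import numpy as np

def dpcm_1d(row, step=8):
    """1D DPCM of one image row: predict each pixel by the previous
    reconstructed pixel and quantize the residual with a uniform step."""
    row = row.astype(float)
    recon = np.empty_like(row)
    residuals = np.empty_like(row)
    pred = 128.0                       # fixed initial prediction
    for i, pix in enumerate(row):
        e = pix - pred                 # prediction residual
        q = step * np.round(e / step)  # uniform quantization of the residual
        residuals[i] = q
        recon[i] = pred + q            # decoder tracks the same value
        pred = recon[i]
    return residuals, recon

# Synthetic smooth row in [0, 255].
row = np.clip(np.cumsum(np.random.default_rng(1).normal(0, 3, 256)) + 128,
              0, 255)
res, rec = dpcm_1d(row)
print(np.var(res), np.var(row))        # residual variance is much smaller
print(np.max(np.abs(rec - row)))       # error bounded by step/2
```

The small residual variance is what allows the residual to be coded with fewer bits than the raw pixels.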
Another popular and more efficient compression scheme is known by the generic name transform coding. Remember that the idea is to reduce or remove pixel correlation to achieve compression. In transform coding, a block of image pixels is linearly transformed into another block of transform coefficients of the same size as the pixel block, with the hope that only a few of the transform coefficients will be significant and the rest may be discarded. This implies that storage space is required only for the significant transform coefficients, which are a fraction of the total number of coefficients; hence the compression. The original image can be reconstructed by performing the inverse transform of the reduced coefficient block. It must be pointed out that the inverse transform must exist for unique reconstruction. There are a number of such transforms available in the field to choose from, each having its own merits and demerits. The most efficient transform is one that uses the fewest transform coefficients to reconstruct the image for a given amount of distortion. Such a linear transform is known as the optimal transform, where optimality is with respect to the minimum mean square error between the original and reconstructed images.
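The principle can be demonstrated numerically. The Python sketch below uses an orthonormal DCT as the invertible linear transform and a simple keep-the-largest selection rule (both assumptions made here for illustration): it transforms a smooth 8 × 8 block, discards all but four coefficients, and inverts.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix; C @ C.T == I, so the inverse exists."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    C[0, :] /= np.sqrt(2.0)
    return C

n = 8
C = dct_matrix(n)
# A smooth 8x8 block (highly correlated pixels), like a flat image region.
x = np.add.outer(np.arange(n), np.arange(n)).astype(float)

T = C @ x @ C.T                        # forward 2D transform
keep = 4                               # retain only the 4 largest coefficients
idx = np.argsort(np.abs(T).ravel())[::-1][:keep]
T_kept = np.zeros_like(T)
T_kept.ravel()[idx] = T.ravel()[idx]

x_rec = C.T @ T_kept @ C               # inverse transform of the reduced block
kept_energy = np.sum(T_kept**2) / np.sum(T**2)
print(kept_energy)                     # most of the energy in 4 of 64 values
```

For a correlated block, a handful of coefficients carries nearly all the signal energy, which is why discarding the rest costs so little distortion.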
This optimal image transform is known as the Karhunen–Loève transform (KLT) or Hotelling transform. The disadvantage of the KLT is that the transform kernel depends on the actual image to be compressed, which requires much more side information for the receiver to reconstruct the original image from the compressed image than fixed transforms do. A highly popular fixed transform is the familiar discrete cosine transform (DCT). The DCT has very nearly the same compression efficiency as the KLT, with the advantage that its kernel is fixed, so no side information is required by the receiver for the reconstruction. The DCT is used in the JPEG and MPEG video compression standards. The DCT is usually applied on nonoverlapping blocks of an image. Typical DCT blocks are of size 8 × 8 or 16 × 16.
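The KLT construction mentioned above can be illustrated with synthetic data (a sketch, with first-order autoregressive rows standing in for image scan lines): the transform kernel is the eigenvector matrix of the data covariance, and the transformed coefficients come out decorrelated with the energy packed into a few of them.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 8, 5000
rho = 0.95                        # pixel-to-pixel correlation of the model
rows = np.empty((m, n))
rows[:, 0] = rng.normal(size=m)   # AR(1) rows stand in for image scan lines
for j in range(1, n):
    rows[:, j] = rho * rows[:, j - 1] + np.sqrt(1 - rho**2) * rng.normal(size=m)

cov = np.cov(rows, rowvar=False)        # sample covariance of the "pixels"
eigvals, eigvecs = np.linalg.eigh(cov)  # KLT kernel: eigenvectors of cov
coeffs = rows @ eigvecs                 # transform each row with the KLT
ccov = np.cov(coeffs, rowvar=False)     # covariance of the KLT coefficients

off_diag = ccov - np.diag(np.diag(ccov))
print(np.max(np.abs(off_diag)))         # essentially zero: decorrelated
print(np.sort(eigvals)[::-1] / eigvals.sum())  # energy packed into few terms
```

Note that the kernel `eigvecs` was computed from the data itself; this is exactly the side information a receiver would need, and exactly what a fixed kernel like the DCT avoids.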
One of the disadvantages of image compression using the DCT is the blocking artifact. Because the DCT blocks are small compared with the image and because the average values of the blocks may differ, blocking artifacts appear when the zero-frequency (dc) DCT coefficients are quantized rather heavily. However, at low compression, blocking artifacts are almost unnoticeable. An example showing blocking artifacts due to compression using an 8 × 8 DCT is shown in Figure 3a. Blockiness is clearly seen in flat areas, at both low and high intensities, as well as undershoot and overshoot along the sharp edges; see Figure 3b. A listing of the M-file for Figures 3a,b is shown below.
Figure 3 (a) Cameraman image showing blocking artifacts due to quantization of the DCT coefficients. The DCT size is 8 × 8. (b) Intensity profile along row number 164 of the image in (a).
% Figure3.m
% Example to show blockiness in DCT compression.
% Quantizes and dequantizes an intensity image using
% 8x8 DCT and the JPEG quantization matrix.
close all
clear
I = imread('cameraman.tif');
figure,imshow(I), title('Original Image')
%
fun = @dct2; % 2D DCT function
N = 8; % block size of 2D DCT
T = blkproc(I,[N N],fun); % compute 2D DCT of image using NxN blocks
%
Scale = 4.0; % increasing Scale quantizes DCT coefficients more heavily
% JPEG default luminance quantization matrix
jpgQMat = [16 11 10 16 24 40 51 61;
           12 12 14 19 26 58 60 55;
           14 13 16 24 40 57 69 56;
           14 17 22 29 51 87 80 62;
           18 22 37 56 68 109 103 77;
           24 35 55 64 81 104 113 92;
           49 64 78 87 103 121 120 101;
           72 92 95 98 112 100 103 99];
Qstep = jpgQMat * Scale; % quantization step size
% Quantize and dequantize the coefficients
for k = 1:N:size(I,1)
    for l = 1:N:size(I,2)
        T1(k:k+N-1,l:l+N-1) = round(T(k:k+N-1,l:l+N-1)./ Qstep).*Qstep;
    end
end
% do inverse 2D DCT
fun = @idct2;
y = blkproc(T1,[N N],fun);
y = uint8(round(y));
figure,imshow(y), title('DCT compressed Image')
% Plot image profiles before and after compression
ProfRow = 164;
figure,plot(1:size(I,2),I(ProfRow,:),'k','LineWidth',2)
hold on
plot(1:size(I,2),y(ProfRow,:),'-.k','LineWidth',1)
title(['Intensity profile of row ' num2str(ProfRow)])
xlabel('Pixel number'), ylabel('Amplitude')
legend('Original','Compressed',0)
A third and relatively recent compression method is based on the wavelet transform. As we will see in a later chapter, the wavelet transform captures both long-term and short-term changes in an image and offers a highly efficient compression mechanism. As a result, it is used in the latest versions of the JPEG standards as a compression tool. It is also adopted by the SMPTE (Society of Motion Picture and Television Engineers). Even though the wavelet transform may be applied on blocks of an image like the DCT, it is generally applied on the full image, and the various wavelet coefficients are quantized according to their types. A two-level discrete wavelet transform (DWT) of the cameraman image is shown in Figure 4 to illustrate what the 2D wavelet transform coefficients look like. Details pertaining to the levels and subbands of the DWT will be given in a later chapter. The M-file to implement the multilevel 2D DWT that generates Figure 4 is listed below. As we will see in a later chapter, the 2D DWT decomposes an image into one approximation and many detail coefficients. The number of coefficient subimages corresponding to an L-level 2D DWT equals 3 × L + 1. Therefore, for a two-level 2D DWT, there are seven coefficient subimages.

Figure 4 A two-level 2D DWT of the cameraman image.
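The 3 × L + 1 count and the subimage sizes are easy to verify with a toy single-step 2D DWT written directly in Python (the Haar wavelet is used here purely for simplicity; the figures in this section use db2):

```python
import numpy as np

def haar2d_step(x):
    """One level of the 2D Haar DWT: returns the approximation (LL) and
    the three detail subimages, each half the size in each dimension."""
    a = (x[0::2, :] + x[1::2, :]) / np.sqrt(2)   # rows: lowpass
    d = (x[0::2, :] - x[1::2, :]) / np.sqrt(2)   # rows: highpass
    ll = (a[:, 0::2] + a[:, 1::2]) / np.sqrt(2)  # columns of each half
    lh = (a[:, 0::2] - a[:, 1::2]) / np.sqrt(2)
    hl = (d[:, 0::2] + d[:, 1::2]) / np.sqrt(2)
    hh = (d[:, 0::2] - d[:, 1::2]) / np.sqrt(2)
    return ll, (lh, hl, hh)

x = np.random.default_rng(0).normal(size=(256, 256))
subimages = []
approx = x
L = 2
for _ in range(L):
    approx, details = haar2d_step(approx)  # recurse on the approximation
    subimages.extend(details)
subimages.append(approx)                   # plus the final approximation

print(len(subimages))                      # 3*L + 1 = 7 subimages
print(subimages[0].shape, approx.shape)    # (128, 128) and (64, 64)
```

Because the transform is orthonormal, the subimages together carry exactly the energy of the original image; only their statistics change, which is what the quantizer exploits.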
In the first level, there are three detail coefficient subimages, each of size 1/4 that of the original image. The second level consists of four sets of DWT coefficients, one approximation and three details, each 1/16 the size of the original image. As the name implies, the approximation coefficients are lower spatial resolution approximations to the original image. The detail coefficients capture the discontinuities or edges in the image with orientations in the horizontal, vertical, and diagonal directions. In order to compress an image using the 2D DWT, we have to compute the 2D DWT of the image up to a given level and then quantize each coefficient subimage. The achievable quality and compression ratio depend on the chosen wavelets and quantization method. The visual effect of quantization distortion in a DWT compression scheme is different from that in a DCT-based scheme. Figure 6a is the cameraman image compressed using the 2D DWT. The wavelet used is called Daubechies 2 (db2 in MATLAB), and the number of levels used is 1. We note that there are no blocking effects, but there are patches in the flat areas. We also see that the edges are reproduced faithfully, as evidenced in the profile (Figure 6b). It must be pointed out that the amount of quantization applied in Figure 6a is not the same as that used for the DCT example and that the two examples are given only to show the differences in the artifacts introduced by the two schemes. An M-file listing to generate Figures 6a,b is shown below.
% Figure4.m
% 2D Discrete Wavelet Transform (DWT)
% Computes a multi-level 2D DWT of an intensity image.
close all
clear
I = imread('cameraman.tif');
figure,imshow(I), title('Original Image')
L = 2; % number of levels in DWT
[W,B] = wavedec2(I,L,'db2'); % do a 2-level DWT using the db2 wavelet
% declare level-1 subimages
w11 = zeros(B(3,1),B(3,1));
w12 = zeros(B(3,1),B(3,1));
w13 = zeros(B(3,1),B(3,1));
% declare level-2 subimages
w21 = zeros(B(1,1),B(1,1));
w22 = zeros(B(1,1),B(1,1));
w23 = zeros(B(1,1),B(1,1));
w24 = zeros(B(1,1),B(1,1));
% extract level-1 2D DWT coefficients
offSet11 = 4*B(1,1)*B(1,2);
offSet12 = 4*B(1,1)*B(1,2)+B(3,1)*B(3,2);
offSet13 = 4*B(1,1)*B(1,2)+2*B(3,1)*B(3,2);
for c = 1:B(3,2)
    for r = 1:B(3,1)
        w11(r,c) = W(offSet11+(c-1)*B(3,1)+r);
        w12(r,c) = W(offSet12+(c-1)*B(3,1)+r);
        w13(r,c) = W(offSet13+(c-1)*B(3,1)+r);
    end
end
% extract level-2 2D DWT coefficients
offSet22 = B(1,1)*B(1,2);
offSet23 = 2*B(1,1)*B(1,2);
offSet24 = 3*B(1,1)*B(1,2);
for c = 1:B(1,2)
    for r = 1:B(1,1)
        w21(r,c) = W((c-1)*B(1,1)+r);
        w22(r,c) = W(offSet22+(c-1)*B(1,1)+r);
        w23(r,c) = W(offSet23+(c-1)*B(1,1)+r);
        w24(r,c) = W(offSet24+(c-1)*B(1,1)+r);
    end
end
% declare output array y to store all the DWT coefficients
y = zeros(2*B(1,1)+B(3,1),2*B(1,2)+B(3,2));
y(1:B(1,1),1:B(1,1)) = w21;
y(1:B(1,1),B(1,1)+1:2*B(1,1)) = w22;
y(B(1,1)+1:2*B(1,1),1:B(1,1)) = w23;
y(B(1,1)+1:2*B(1,1),B(1,1)+1:2*B(1,1)) = w24;
%
y(1:B(3,1),2*B(1,1)+1:end) = w11;
y(2*B(1,1)+1:end,1:B(3,1)) = w12;
y(2*B(1,1)+1:end,2*B(1,1)+1:end) = w13;
figure,imshow(y,[]),title([num2str(L) '-level 2D DWT'])
% Figure6.m
% An example to show the effect of quantizing the 2D DWT
% coefficients of an intensity image along with the intensity
% profile along a specified row.
close all
clear
I = imread('cameraman.tif');
figure,imshow(I), title('Original Image')
% do a 1-level 2D DWT
[W,B] = wavedec2(I,1,'db2');
w11 = zeros(B(1,1),B(1,2));
w12 = zeros(B(1,1),B(1,2));
w13 = zeros(B(1,1),B(1,2));
w14 = zeros(B(1,1),B(1,2));
%
offSet12 = B(1,1)*B(1,2);
offSet13 = 2*B(1,1)*B(1,2);
offSet14 = 3*B(1,1)*B(1,2);
% quantize only the approximation coefficients
Qstep = 16;
for c = 1:B(1,2)
    for r = 1:B(1,1)
        W((c-1)*B(1,1)+r) = floor(W((c-1)*B(1,1)+r)/Qstep)*Qstep;
        %{
        W(offSet12+(c-1)*B(1,1)+r) = floor(W(offSet12+(c-1)*B(1,1)+r)/8)*8;
        W(offSet13+(c-1)*B(1,1)+r) = floor(W(offSet13+(c-1)*B(1,1)+r)/8)*8;
        W(offSet14+(c-1)*B(1,1)+r) = floor(W(offSet14+(c-1)*B(1,1)+r)/8)*8;
        %}
    end
end
% do inverse 2D DWT
y = waverec2(W,B,'db2');
figure,imshow(y,[])
% plot profile
ProfRow = 164;
figure,plot(1:size(I,2),I(ProfRow,:),'k','LineWidth',2)
hold on
plot(1:size(I,2),y(ProfRow,:),'-.k','LineWidth',1)
title(['Profile of row ' num2str(ProfRow)])
xlabel('Pixel number'), ylabel('Amplitude')
legend('Original','Compressed',0)
Figure 6 (a) Cameraman image compressed using a one-level 2D DWT. (b) Intensity profile of the image in (a).