Abstract

Introduction

Decorrelation

Energy Concentration

Examples

Results

Conclusions

Further Work

Appendix: Data Reference

 


 

Abstract

This project will look at different ways of applying the eigendecomposition (decorrelation) ideas of the Karhunen-Loeve transform to color images. In these images each pixel is vector-valued, so there is more than one set of components to decorrelate. The project will try three approaches. The first decorrelates the three colors (transforming to three decorrelated variables spanning the color space) and then spatially decorrelates each variable. The second spatially decorrelates each color component and then decorrelates the three color variables. The third decorrelates the joint color-space distribution (i.e., it regards the image as 3*n^2 values and decorrelates them together). The project will use a training set of images to develop a transform for each of these three approaches, and will compare performance (i.e., energy concentration ability) within and outside of the training set.

 


 

Introduction

A decorrelating transform for a grayscale image is typically found by evaluating a blockwise Karhunen-Loeve transform on a training set of images. In color images, each pixel is three-dimensional, and so there are a number of ways of generalizing the Karhunen-Loeve transform. The first is to act as if each block contains 3 times as many pixels, and to decorrelate them as usual. One can, however, also decorrelate along the marginal distribution of the pixels or the colors. In every case, the autocorrelation matrix is found by averaging the empirical block autocorrelation over all blocks (possibly across multiple images of the training set), and the transform used is the Hermitian transpose of the autocorrelation matrix's eigenvector matrix. The methods differ in what constitutes a block. In joint decorrelation, a block consists of all three colors of a square segment of pixels. In spatial decorrelation, a block consists of one color component of a square array of pixels. In color decorrelation, a block consists of the three dimensions of one pixel. Thus, each joint block is composed of three spatial blocks or a square of color blocks. Applying the resultant eigenvector matrix in the joint case yields a blockwise decorrelated eigenimage, but after decorrelating along either marginal, the other marginal is left to decorrelate. Thus, there are three ways to decorrelate an image:

-Jointly decorrelate.

-Decorrelate the image spatially, and then color decorrelate the resultant eigenimage.

-Decorrelate the image by colors, and then spatially decorrelate the resultant eigenimage.

This project will examine each of these three methods and compare results on a set of images from the USC SIPI database [1], using cross-validation techniques. It should be noted that under self-fitting (evaluation on the training data) joint decorrelation is guaranteed to be optimal (for the given blocksize) because of the optimality properties of the KL transform. This might suggest that it should perform best on test images as well. The joint method allows the mathematics to optimize jointly on its training set, whereas the other methods constrain the mathematics to ignore space-color correlation. Thus, the comparative performance has a statistical modeling interpretation: either images have a systematic space-color correlation that the algorithm should be allowed to fit, or this correlation is unimportant and amounts to noise that will cause an overfit model and hamper its test performance.

 


 

Decorrelation

Joint decorrelation

We begin with a set of n x n color images, each divided into blocks of size b x b. Each block thus has dimension 3b^2, and there are m^2 blocks in each image, where m = n/b.

Joint decorrelation treats each block the same as if it were a grayscale block: as a single vector whose empirical autocorrelation matrix is to be taken. The only real difference is that in the color case the vectors have length 3b^2 instead of b^2; this makes no difference to the KL algorithm. The empirical joint block autocorrelation matrix is thus

$$R = \frac{1}{N} \sum_{k=1}^{N} x_k x_k^H$$

where x_k are the block vectors and N is the total number of blocks in the training set (m^2 per image).

After obtaining R, we compute its eigenvector matrix; the Hermitian transpose of that matrix becomes the transformation, which is applied to each block. This can also be thought of as applying a block-diagonal transform, with one copy of the block transform on the diagonal for each block, to the entire image expressed as a single vector.
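The joint procedure described above can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions, not the code used for the project: the helper names joint_blocks and klt_basis are hypothetical, the image is random data standing in for a training image, and the pixels are real-valued so the Hermitian transpose reduces to an ordinary transpose.

```python
import numpy as np

def joint_blocks(image, b):
    """Split an (n, n, 3) image into m*m block vectors of length 3*b*b."""
    m = image.shape[0] // b
    # (m, b, m, b, 3) -> (block row, block col, pixel row, pixel col, color)
    blocks = image[:m * b, :m * b].reshape(m, b, m, b, 3).transpose(0, 2, 1, 3, 4)
    return blocks.reshape(m * m, 3 * b * b)

def klt_basis(vectors):
    """Eigen-decompose the empirical autocorrelation, largest eigenvalue first."""
    R = vectors.T @ vectors / vectors.shape[0]
    eigvals, eigvecs = np.linalg.eigh(R)      # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order]

rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))               # stand-in for a training image
X = joint_blocks(image, b=4)                  # 64 blocks, each of dimension 48
eigvals, Phi = klt_basis(X)
Y = X @ Phi                                   # row k holds Phi^T applied to block k
```

On the data used to fit it, the transform leaves the components uncorrelated: the empirical autocorrelation of Y is diagonal, with the eigenvalues of R on the diagonal.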

Spatial decorrelation

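Spatial decorrelation treats each block's three color components as separate blocks ("spatial blocks"). Thus, in determining the empirical spatial autocorrelation matrix, the three colors merely provide three times as many blocks to average over. The empirical spatial autocorrelation matrix is

$$R_s = \frac{1}{3N} \sum_{k=1}^{N} \sum_{c=1}^{3} x_{k,c} x_{k,c}^H$$

where x_{k,c} is the length-b^2 vector holding color component c of block k and N is the total number of b x b blocks in the training set. From this we get the Hermitian-eigenmatrix transform, which is applied to each spatial block.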
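As a rough illustration of the pooling just described, the sketch below builds the spatial blocks of a single image, so that each b x b block contributes three rows, one per color. The helper name spatial_blocks and the random test image are assumptions made for the example.

```python
import numpy as np

def spatial_blocks(image, b):
    """Return an (m*m*3, b*b) array: one row per color plane of each b x b block."""
    m = image.shape[0] // b
    blocks = image[:m * b, :m * b].reshape(m, b, m, b, 3)
    # reorder axes to (block row, block col, color, pixel row, pixel col)
    return blocks.transpose(0, 2, 4, 1, 3).reshape(m * m * 3, b * b)

rng = np.random.default_rng(1)
image = rng.random((32, 32, 3))               # stand-in for a training image
S = spatial_blocks(image, b=4)                # 3 * 64 = 192 spatial blocks
R_s = S.T @ S / S.shape[0]                    # 16 x 16 spatial autocorrelation
eigvals, Phi_s = np.linalg.eigh(R_s)
eigvals, Phi_s = eigvals[::-1], Phi_s[:, ::-1]  # largest eigenvalue first
T = S @ Phi_s                                 # transformed spatial blocks
```

The transform here is only b^2 x b^2, a third of the joint transform's side length, because the colors enlarge the sample rather than the vector.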

 

Color decorrelation

Just as spatial decorrelation regards the three colors as merely providing more blocks, color decorrelation considers different pixels to be different blocks. Color decorrelation corresponds exactly to the special case of joint decorrelation where the blocksize is 1. The empirical color autocorrelation matrix is

$$R_c = \frac{1}{P} \sum_{p=1}^{P} c_p c_p^H$$

where c_p is the length-3 color vector of pixel p and P is the total number of pixels in the training set. The Hermitian-eigenmatrix transform is then applied to each color block (pixel).
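Color decorrelation is small enough to show in full. The sketch below assumes a synthetic image whose three channels share a common brightness term (an assumption chosen so that the channels are strongly correlated, as they tend to be in natural images); it is not one of the project's test images.

```python
import numpy as np

rng = np.random.default_rng(2)
# Synthetic image: a shared brightness term plus a little per-channel variation.
base = rng.random((16, 16, 1))
image = base + 0.1 * rng.random((16, 16, 3))

pixels = image.reshape(-1, 3)                 # each pixel is one length-3 color block
R_c = pixels.T @ pixels / pixels.shape[0]     # 3 x 3 empirical color autocorrelation
eigvals, Phi_c = np.linalg.eigh(R_c)
eigvals, Phi_c = eigvals[::-1], Phi_c[:, ::-1]  # largest eigenvalue first

eigen_image = (pixels @ Phi_c).reshape(image.shape)  # same pixels, new "eigencolors"
energy_share = eigvals / eigvals.sum()        # fraction of energy per eigencolor
```

Because the channels in this synthetic image are nearly identical, almost all of the energy lands in the first eigencolor, while the spatial layout of the image is untouched.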

 


 

Energy Concentration

The joint decorrelation provides energy concentration; in fact, it provides optimal energy concentration (for the given blocksize) on the training set. However, after applying spatial decorrelation, energy may still be evenly spread across the colors, and after applying color decorrelation, energy may still be evenly spread across the pixels. Thus, to concentrate energy with these methods, we apply one decorrelation, then the other.

If J, S, and C are the joint, spatial, and color decorrelation operations respectively, then the three methods applied to an image x are:

$$J(x), \qquad C(S(x)), \qquad S(C(x))$$

Note that after performing spatial (resp., color) decorrelation, the color (resp., spatial) decorrelation takes the color (resp., spatial) autocorrelation matrix of the intermediate image, not the original one.
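The three compositions can be compared side by side in a short sketch. Everything here is an assumption made for illustration: the function names, the random test image, and the choice to fit and apply each transform on the same data (the self-fitting case). Note how the second stage of each two-step method calls klt on the output of the first stage, so its autocorrelation matrix comes from the intermediate image.

```python
import numpy as np

def klt(vectors):
    """Transform rows by the eigenvectors of their own empirical autocorrelation."""
    R = vectors.T @ vectors / vectors.shape[0]
    _, Phi = np.linalg.eigh(R)
    return vectors @ Phi[:, ::-1]             # largest-eigenvalue component first

def to_blocks(image, b):
    """(n, n, 3) image -> (m*m, b*b, 3) array of blocks."""
    m = image.shape[0] // b
    x = image[:m * b, :m * b].reshape(m, b, m, b, 3).transpose(0, 2, 1, 3, 4)
    return x.reshape(m * m, b * b, 3)

def joint(blocks):
    N, P, C = blocks.shape
    return klt(blocks.reshape(N, P * C)).reshape(N, P, C)

def spatial(blocks):
    """Decorrelate along the pixel axis, pooling the colors as extra blocks."""
    N, P, C = blocks.shape
    rows = blocks.transpose(0, 2, 1).reshape(N * C, P)
    return klt(rows).reshape(N, C, P).transpose(0, 2, 1)

def color(blocks):
    """Decorrelate along the color axis, pooling all pixels as length-3 blocks."""
    N, P, C = blocks.shape
    return klt(blocks.reshape(N * P, C)).reshape(N, P, C)

def top_share(out):
    """Share of total energy held by the single strongest component."""
    energy = np.sum(out ** 2, axis=0)
    return energy.max() / energy.sum()

rng = np.random.default_rng(3)
blocks = to_blocks(rng.random((32, 32, 3)), b=4)
joint_out = joint(blocks)
space_color = color(spatial(blocks))          # C(S(x))
color_space = spatial(color(blocks))          # S(C(x))
```

All three outputs carry the same total energy, since each step is an orthogonal transform. On the fitting data the joint method's strongest component can never hold less energy than the strongest component of either two-step method, which is the optimality property mentioned above.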

 


 

Examples

Here is a simple example applying all three techniques to a baboon image [1] with blocksize 16.

 

Original baboon image

 

 

Eigenimage after applying joint decorrelation

The eigencomponents are displayed with the first 256 in red, the next 256 in green, and the last 256 in blue. Since they fall off quickly, this produces an eigenimage that resembles the result of a grayscale KLT, except that the large components are red. Alternatively, pixel position could play the most significant ordering role, which would instead produce colored pixels (about a third as many) in the upper left corners of blocks. That display approach is a bit less informative, as it is easier for the human eye to distinguish the relative brightness of different reds than to accurately assess the strengths of the different components of a mixed-color pixel.

 

 

Original after applying spatial decorrelation...

...and then color decorrelation.

Note that after the spatial decorrelation, as might be expected, the colors are still somewhat evident, since the colors have not been transformed yet. After the second decorrelation, we get an eigenimage similar to the joint eigenimage, though slightly less well concentrated. This is evidenced, for example, by the bright dots, which illustrate the fact that with non-joint decorrelation, the order of the strengths of the components may not correspond exactly to their spatial/color ordering.

 

Original after applying color decorrelation...

...and then spatial decorrelation.

The color decorrelation, which transforms the colors but not the pixels, always results in the "same" image differently colored. (The mind's eye is much better at mental color transformations than at mental spatial blockwise transformations.) By default ordering, red denotes the largest-energy eigencolor, which subjectively appears to have picked up most of the energy. The baboon's blue cheeks, however, contain more of the second component, and thus appear orange-yellow.

After spatial decorrelation, we once again have a concentrated image. As in the space-color case, the non-joint decorrelation produces a different ordering of energy. This means that, while the second eigencolor component may have had much less energy than the first, its largest spatial component still has enough energy to whiten the first pixel in many of the blocks.

 


 

Results

Tests of the different methods were performed on a set of 9 different 256 by 256 images [1]. To evaluate out-of-training-set performance, cross-validation was used. This is a statistical technique by which each observation (in this case, each image) is in turn excluded from the training set, which consists of all the other observations, and used as the test image. After singling out each image in turn, performance (in this case, energy in each component) is averaged over the results from each image. This technique nearly doubles the effective sample size from n to 2n-1, as the training set size is n-1 and the test set size is n, and yet no self-fitting on the training set has been done.
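The leave-one-out loop described above might look roughly like the following. This is a sketch under stated assumptions rather than the project's code: the function names are hypothetical, and the "images" are synthetic block-vector arrays that share one random mixing matrix so that they have a common correlation structure to learn.

```python
import numpy as np

def fit_klt(vectors):
    """Eigenvector matrix of the training autocorrelation, largest eigenvalue first."""
    R = vectors.T @ vectors / vectors.shape[0]
    _, Phi = np.linalg.eigh(R)
    return Phi[:, ::-1]

def component_energy(vectors, Phi):
    """Fraction of the total energy that lands in each transform component."""
    energy = np.sum((vectors @ Phi) ** 2, axis=0)
    return energy / energy.sum()

def leave_one_out(image_blocks):
    """image_blocks: list of (num_blocks, dim) arrays, one array per image."""
    profiles = []
    for i, test in enumerate(image_blocks):
        # Train on every image except the held-out one, then test on that image.
        train = np.vstack([x for j, x in enumerate(image_blocks) if j != i])
        profiles.append(component_energy(test, fit_klt(train)))
    return np.mean(profiles, axis=0)          # average over the held-out images

rng = np.random.default_rng(4)
mixing = rng.standard_normal((12, 12))        # shared structure across "images"
images = [rng.standard_normal((50, 12)) @ mixing for _ in range(5)]
cv_profile = leave_one_out(images)
```

Each transform is only ever evaluated on an image it was not fitted to, which is what separates these numbers from the training set fits.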

Performance of the methods is measured by energy concentration in the eigenimage. In all cases, joint decorrelation performed the best by a wide margin, with space-color outperforming color-space by a smaller margin. The first margin grew larger with increasing block size, whereas the second shrank. One interesting result is that the results for cross-validation are almost as good as those for within-training-set fits, particularly for joint decorrelation. Test set performance almost matching training set performance is an indicator that a technique is actually well suited to the data. Thus, these results suggest that decorrelation via a training set's empirical blockwise joint KLT actually is a good way to compress images. Of course, evaluation of results for a larger and more diverse dataset of images could confirm this with more confidence.

 

 

Figure: Energy concentration results for blocksizes 4, 8, and 16 (one row of plots per blocksize), showing cross-validation performance and training set performance side by side. Each plot compares joint decorrelation, space-color decorrelation, and color-space decorrelation.

 

 

 

Table of energy concentration in the first 1, 10, and 100 components (note that 4x4 blocks yield only 3*4^2 = 48 components, so their 100-component entries are 1)

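Fit type           Blocksize   Method        1 component   10 components   100 components
Cross-validation   4x4         Joint         .9469         .9949           1
Cross-validation   4x4         Space-color   .9389         .9865           1
Cross-validation   4x4         Color-space   .7638         .9830           1
Cross-validation   8x8         Joint         .9363         .9880           .9990
Cross-validation   8x8         Space-color   .7853         .9127           .9935
Cross-validation   8x8         Color-space   .5379         .8869           .9928
Cross-validation   16x16       Joint         .9209         .9775           .9930
Cross-validation   16x16       Space-color   .4344         .6883           .8947
Cross-validation   16x16       Color-space   .2665         .6389           .9063
Training set fit   4x4         Joint         .9491         .9950           1
Training set fit   4x4         Space-color   .9391         .9866           1
Training set fit   4x4         Color-space   .8130         .9834           1
Training set fit   8x8         Joint         .9386         .9881           .9992
Training set fit   8x8         Space-color   .7874         .9143           .9936
Training set fit   8x8         Color-space   .5807         .8869           .9933
Training set fit   16x16       Joint         .9232         .9780           .9951
Training set fit   16x16       Space-color   .4406         .6971           .9145
Training set fit   16x16       Color-space   .2826         .6294           .8814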
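The quantity tabulated above, the fraction of total energy held by the first k components, can be computed as in the sketch below. The function name energy_in_first and the tiny coefficient array are hypothetical; the sketch also assumes that the components are already in transform order and that a request for more components than exist is clamped to the full set.

```python
import numpy as np

def energy_in_first(coeffs, ks=(1, 10, 100)):
    """Fraction of total energy in the first k components, for each k in ks.

    coeffs: (num_blocks, dim) array of transform coefficients, with the
    components kept in the order the transform produced them.
    """
    energy = np.sum(np.asarray(coeffs, dtype=float) ** 2, axis=0)
    cumulative = np.cumsum(energy) / energy.sum()
    # Asking for more components than exist returns the full energy.
    return {k: float(cumulative[min(k, len(cumulative)) - 1]) for k in ks}

# Two blocks with three components each; component energies are 25, 1, and 1.
coeffs = np.array([[3.0, 1.0, 0.0],
                   [4.0, 0.0, 1.0]])
result = energy_in_first(coeffs, ks=(1, 2, 100))
# result[1] = 25/27, result[2] = 26/27, result[100] = 1.0
```

The clamping is why a blocksize with fewer than 100 components reports exactly 1 in the last column.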

 

 


Conclusions

Joint decorrelation performed uniformly the best by a wide margin, as hypothesized. The statistical modeling interpretation of this is that there is a color-spatial interaction which should be accounted for. Space-color outperformed color-space in nearly every case (the one exception in the table is the first 100 components at blocksize 16 under cross-validation), but by a smaller margin. The first margin (between joint and space-color) grew larger with increasing block size, whereas the second (between space-color and color-space) shrank. One interesting result is that the results for cross-validation are almost as good as those for within-training-set fits, particularly for joint decorrelation. Test set performance almost matching training set performance is an indicator that a technique is actually well suited to the data. Thus, these results suggest that decorrelation via a training set's empirical blockwise joint KLT actually is a good way to compress images. Of course, evaluation of results for a larger and more diverse dataset of images could confirm this with more confidence.

 


 

Further Work

-Evaluating performance on a larger data set, possibly with different image/block sizes.

-Applying transforms other than KL (DCT, etc.) jointly or across color or space marginals.

 


 

Appendix: Data Reference

[1] Dataset of images obtained from the USC SIPI database at http://sipi.usc.edu/

 


Charles Mathis - EE368A - 5-31-01