Histogram-Based Color Image Retrieval

Psych221/EE362 Project Report
Mar.15,  2001
Sangoh Jeong <sojeong@stanford.edu>

 

Abstract

      In this project, six histogram-based search methods are investigated in two different color spaces, RGB and HSV. Histogram search characterizes an image by its color distribution, or histogram but the drawback of a global histogram representation is that information about object location, shape, and texture is discarded. Thus this project showed that images retrieved by using the global color histogram may not be semantically related even though they share similar color distribution in some results. An image retrieval demo system was built to make it easy to test the retrieval performance and to expedite further algorithm investigation. And six histogram-based image retrieval methods in two color spaces were exhaustively compared by providing precision vs. recall graphs for each image class and for all test images. In general, histogram-based retrievals in HSV color space showed better performance than in RGB color space. The hitogram Intersection-based image retrieval in HSV color space was found to be  most desirable among six retrieval methods.


Table of Contents

1. Introduction
2. Color Space
3. Histogram-Based Image Search
4. Retrieval Effectiveness
5. Implemented Method
6. Results
7. Conclusion
8. References
9. Appendix


1. Introduction 

   In the past decade, many image retrieval systems have been developed, such as the IBM QBIC system developed at the IBM Almaden Research Center, the VIRAGE System developed by the Virage Incorporation, the Photobook System developed by the MIT Media Lab, the VisualSeek system developed at Columbia University, the WBIIS System developed at Stanford University, and the Blobworld System developed at U.C. Berkeley[1]. The common ground for them is to extract a signature for every image based on its pixel values, and to define a rule for comparing images. The signature can be shape, texture, color or any other information with which two images could be compared. However, in this project, only the color signature is used as a signature to retrieve images.  
   Existing color-based general-purpose image retrieval systems roughly fall into three categories depending on the signature extraction approach used: histogram, color layout, and region-based search. And, in this project, histogram-based search methods are investigated in two different color spaces. A color space is defined as a model for representing color in terms of intensity values. Typically, a color space defines a one- to four- dimensional space. A color component, or a color channel, is one of the dimensions. Color spaces are related to each other by mathematical formulas. In this project, only two three-dimensional color spaces, RGB and HSV, are investigated.
   Histogram search characterizes an image by its color distribution, or histogram. Many histogram distances have been used to define the similarity of two color histogram representations. Euclidean distance and its variations are the most commonly used. The drawback of a global histogram representation is that information about object location, shape, and texture is discarded. The global color histogram indexing method, which is used in this project, correlates to the image semantics well. But, images retrieved by using the global color histogram may not be semantically related even though they share similar color distribution.
   In the experiment, an image retrieval demo system was built to make it easy to test the retrieval performance and to expedite further algorithm investigation. And six histogram-based image retrieval methods in two color spaces are exhaustively compared by providing precision vs. recall graphs for each image class and for all test images.


2. Color Space

   A color space is defined as a model for representing color in terms of intensity values[1]. Typically, a color space defines a one- to four-dimensional space. A color component, or a color channel, is one of the dimensions. A color dimensional space (i.e. one dimension per pixel) represents the gray-scale space. The following two models are commonly used in color image retrieval system.

 2.1 RGB color model 

   The RGB color model is composed of the primary colors Red, Green, and Blue. This system defines the color model that is used in most color CRT monitors and color raster graphics. They are considered the "additive primaries" since the colors are added together to produce the desired color. The RGB model uses the cartesian coordinate system as shown in Figure 1. (a).  Notice the diagonal from (0,0,0) black to (1,1,1) white which represents the grey-scale. Figure 1. (b) is a view of the RGB color model looking down from "White" to origin.

                

Figure 1.  (a) RGB coordinates system                                                            (b) RGB color model             

 2.2 HSV Color Model

   The HSV stands for the Hue, Saturation, and Value based on the artists (Tint, Shade, and Tone). The coordinate system in a hexacone in Figure 2. (a). And Figure 2.(b) a view of the HSV color model. The Value represents intensity of a color, which is decoupled from the color information in the represented image. The hue and saturation components are intimately related to the way human eye perceives color resulting in image processing algorithms with physiological basis. 
   As hue varies from 0 to 1.0, the corresponding colors vary from red, through yellow, green, cyan, blue, and magenta, back to red, so that there are actually red values both at 0 and 1.0. As saturation varies from 0 to 1.0, the corresponding colors (hues) vary from unsaturated (shades of gray) to fully saturated (no white component). As value, or brightness, varies from 0 to 1.0, the corresponding colors become increasingly brighter.

                                                            

           Figure 2.  (a) HSV coordinates system                                                               (b) HSV color model                         

 2.3 Color conversion

   In order to use a good color space for a specific application, color conversion is needed between color spaces. The good color space for image retrieval system should preserve the perceived color differences. In other words, the numerical Euclidean difference should approximate the human perceived difference. 

  2.3.1 RGB to HSV conversion

   In Figure 3., the obtainable HSV colors lie within a triangle whose vertices are defined by the three primary colors in RGB space:

.
Figure 3. Obtainable HSV color from RGB color space 

   The hue of the point P is the measured angle between the line connecting P to the triangle center and line connecting RED point to the triangle center. The saturation of the point P is the distance between P and triangle center. The value (intensity) of the point P is represented as height on a line perpendicular to the triangle and passing through its center. The grayscale points are situated onto the same line. And the conversion formula is as follows :

  ,  
  ,                     
                                                                                                                                  (1)

  2.3.1 HSV to RGB conversion

   Conversion from HSV space to RGB space is more complex. And, given to the nature of the hue information, we will have a different formula for each sector of the color triangle.

            Red-Green Sector:

                                                                  

            Green-Blue Sector:

                                                                    

            Blue-Red Sector:

                                                     (2)    


3. Histogram-Based Image Search

   The color histogram for an image is constructed by counting the number of pixels of each color. Retrieval from image databases using color histograms has been investigated in [tools, fully, automated]. In these studies the developments of the extraction algorithms follow a similar progression: (1) selection of a color space, (2) quantization of the color space, (3) computation of histograms, (4) derivation of the histogram distance function, (5) identification of indexing shortcuts. Each of these steps may be crucial towards developing a successful algorithm. 
   There are several difficulties with histogram based retrieval. The first of these is the high dimensionality of the color histograms. Even with drastic quantization of the color space, the image histogram feature spaces can occupy over 100 dimensions in real valued space. This high dimensionality ensures that methods of feature reduction, pre-filtering and hierarchical indexing must be implemented. The large dimensionality also increases the complexity and computation of the distance function. It particularly complicates `cross' distance functions that include the perceptual distance between histogram bins [2].

 3.1 Color histogram definition

   An image histogram refers to the probability mass function of the image intensities. This is extended for color images to capture the joint probabilities of the intensities of the three color channels. More formally, the color histogram is defined by,

                                                                                    (3)

where A , B and C represent the three color channels (R,G,B or H,S,V) and N is the number of pixels in the image. Computationally, the color histogram is formed by discretizing the colors within an image and counting the number of pixels of each color.
   Since the typical computer represents color images with up to 224 colors, this process generally requires substantial quantization of the color space. The main issues regarding the use of color histograms for indexing involve the choice of color space and quantization of the color space. When a perceptually uniform color space is chosen uniform quantization may be appropriate. If a non-uniform color space is chosen, then non-uniform quantization may be needed. Often practical considerations, such as to be compatible with the workstation display, encourage the selections of uniform quantization and RGB color space. The color histogram can be thought of as a set of vectors. For gray-scale images these are two dimensional vectors. One dimension gives the value of the gray-level and the other the count of pixels at the gray-level. For color images the color histograms are composed of 4-D vectors. This makes color histograms very difficult to visualize. There are several lossy approaches for viewing color histograms, one of the easiest is to view separately the histograms of the color channels. This type of visualization does illustrate some of the salient features of the color histogram [2].        

 3.2 Color uniformity

   The RGB color space is far from being perceptually uniform. To obtain a good color representation of the image by uniformly sampling the RGB space it is necessary to select the quantization step sizes to be fine enough such that distinct colors are not assigned to the same bin. The drawback is that oversampling at the same time produces a larger set of colors than may be needed. The increase in the number of bins in the histogram impacts performance of database retrieval. Large sized histograms become computationally unwieldy, especially when distance functions are computed for many items in the database. Furthermore, as we shall see in the next section, to have finer but not perceptually uniform sampling of colors negatively impacts retrieval effectiveness.
    However, the HSV color space mentioned earlier offers improved perceptual uniformity. It represents with equal emphasis the three color variants that characterize color: Hue, Saturation and Value (Intensity). This separation is attractive because color image processing performed independently on the color channels does not introduce false colors. Furthermore, it is easier to compensate for many artifacts and color distortions. For example, lighting and shading artifacts are typically be isolated to the lightness channel. But this color space is often inconvenient due to the non-linearity in forward and reverse transformation with RGB space [2]. 

 3.3 Color Histogram Discrimination

   There are several distance formulas for measuring the similarity of color histograms. In general, the techniques for comparing probability distributions, such as the kolmogoroff-smirnov test are not appropriate for color histograms. This is because visual perception determines similarity rather than closeness of the probability distributions. Essentially, the color distance formulas arrive at a measure of similarity between images based on the perception of color content. Three distance formulas that have been used for image retrieval including histogram euclidean distance, histogram intersection and histogram quadratic (cross) distance [2, 3].

  3.3.1 Histogram euclidean distance

   Let h and represent two color histograms. The euclidean distance between the color histograms h and g can be computed as:

                                                                                                   (4)           

In this distance formula, there is only comparison between the identical bins in the respective histograms. Two different bins may represent perceptually similar colors but are not compared cross-wise. All bins contribute equally to the distance.

  3.3.2 Histogram intersection distance

The color histogram intersection was proposed for color image retrieval in [4]. The intersection of histograms h and g is given by: 

                                                                    (5)

where |h| and |g| gives the magnitude of each histogram, which is equal to the number of samples. Colors not present in the user's query image do not contribute to the intersection distance. This reduces the contribution of background colors. The sum is normalized by the histogram with fewest samples.

 3.3.3 Histogram quadratic (cross) distance

   The color histogram quadratic distance was used by the QBIC system introduced in [1, 6]. The cross distance formula is given by:

                                                                                       (6)

The cross distance formula considers the cross-correlation between histogram bins based on the perceptual similarity of the colors represented by the bins. And the set of all cross-correlation values are represented by a matrix A, which is called a similarity matrix. And a (i,j)th element in the similarity matrix A is given by :    for RGB space, 

                                                                                                (7)

where is the distance between the color i and j in the RGB space. In the case that quantization of the color space is not perceptually uniform the cross term contributes to the perceptual distance between color bins.
For HSV space it is given in [5] by:

   (8)

which corresponds to the proximity in the HSV color space[fully]. 

กก


4. Retrieval Effectiveness

Two metrics for retrieval effectiveness are recall and precision. Recall signifies the relevant images in the database that are retrieved in response to a query. Precision is the proportion of the retrieved images that are relevant to the query. More precisely, let A be the set of relevant items, let B the set of retrieved items and a,b,c and d are given in Figure 4. 

Figure 4. Sets for explaining retrieval effectiveness

In the picture, a stands for 'retrieved relevant' images, b for 'retrieved irrelevant' images, c for 'unretrieved relevant' images and d for 'unretrieved irrelevant' images. Then recall and precision are defined as the following conditional probabilities [3].

                                                            (9)    

With these conditions, image retrieval is said to be more effective if precision values are higher at the same recall values.


5. Implemented Method  

 5.1 Overall scheme

   The flow of the experiment in Figure 5. follows the procedure of the general image retrieval system .

  

Figure 5. Flow of implemented image retrieval system

   In the figure above, once a query image and a retrieval method are chosen by users, the rest of whose process is done automatically. However, the histogram data for all images in database are computed and saved in DB (Database) in advance so that only the image indexes and histogram data can be used to compare the query image with images in DB. The six retrieval methods used in this experiment are all based on histogram distance measures introduced in equations (4)~(8). The three histogram distance measures are used in RGB color space and in HSV color space separately, which makes up the six retrieval methods. However, each color space requires different processing to the query image and images in DB. All processes were realized using MATLAB. The following sections explain the experiment in detail.

 5.2 Generation of image DB

   The image data used in this experiment were downloaded from a web site (http://wang1.ist.psu.edu/). Those were the same images used in SIMPLIcity[Wang].  However, in order to reduce the computation time of the whole process, only the first half out of 1000 images were used and their sizes were reduced to 64x64 pixels.  Those 500 images have numbers as their names from 0 to 499 with JPEG-compressed for convenience and for saving storage. The original 1000 images has 10 categories (Africans & villages, Beach, Buildings, Buses, Dinosaurs, Elephants, Flowers, Horses, Mountains & glaciers, Food). And this experiment uses only first 5 categories which have 100 images each. The category information is used in calculating precision and recall for a retrieval method.

 5.3 Quantization

   In order to reduce computation time, a 256x256x256 = 16777216 color image is quantized into a 8 x 8 x 8 = 512 color image in RGB color space. Otherwise,  it will take really long time to compute and compare 16777216 bins of a histogram with others. This is a trade-off between performance and time. Since R,G and B have same distance in its color space, they are quantized into same levels. However, in order for the HSV color space to be used, a RGB color space image should first be transformed to a HSV color space image. In HSV color space, quantization of hue requires the most attention. The hue circle consists of the primaries red, green and blue separated by 120 degrees. A circular quantization at 20 degree steps sufficiently separates the hues such that the three primaries and yellow, magenta and cyan are represented each with three sub-divisions. Saturation and value are each quantized to three levels yielding greater perceptual tolerance along these dimensions. Thus, H is quantized to 18 levels and S & V is quantized to 3 levels. The quantized HSV space has 18x3x3  = 162 histogram bins. 

 5.4 Computation and comparison of histogram

   Computation of histogram is simple. All we have to do is just to count number of pixels that correspond to a specific color in quantized color space whether it is the RGB color space or the HSV space. In order to compare histograms of two images, we first need to generate specific codes for    all histogram bins. In this experiment, (r : 0~7, g: 0~7, b: 0~7 ) codes were generated for 512 RGB histogram bins and (h: 0~17, s: 0~3, v: 0~3) codes were generated for 162 HSV histogram bins. Then, histogram distances introduced in (4)~(8) were computed in each quantized color space.   Since the number of bins of qunatized HSV color space to be compared is less than a third, it was expected that the comparison in HSV color space would be done faster than in RGB color space. 


6. Results

6.1 Image Retrieval Demo System

   In the experiment, a image retrieval demo system in Figure 6. was built to test image retrieval efficiently. If the user selects an image in the listbox on the upper left side and chooses a retrieval method on the lower right listbox, by pressing 'Apply' button, the retrieval system finds the best two candidate images excluding the same image as the query image. Internally, it has information about all sorted indexes of images in DB in the order of increasing histogram distance. But, it shows only two best candidates. This system has also a function that show a precision vs. recall graph for an image regarding 6 histogram-based retrieval methods (3 distance measures in RGB and in HSV color spaces). 

Figure 6. Image retrieval demo system

6.2 Retrieval results for several images

   Figure 7. shows retrieval results for several images. Each query image was chosen from each category stated in 5.2 and the following abbreviations represent distance measures:       HE-hsv distance -> Histogram Euclidean distance in HSV color space, HI-hsv distance -> Histogram Intersection distance in HSV color space, HQ-hsv distance -> Histogram Quadratic distance in HSV color space. For RGB color space, HE, HI and HQ have the same meaning as in HSV space with rgb instead of hsv. For a query image, all six results were shown in the figure.

[ Query image 1]                 [ Retrieved images - Distance measures :  HE-hsv distance, HI-hsv distance, HQ-hsv distance (from left) ]     

  

[ Query image 1]                [ Retrieved images - Distance measures :  HE-rgb distance, HI-rgb distance, HQ-rgb distance (from left) ]   

  

[ Query image 2]                   [ Retrieved images - Distance measures :  HE-hsv distance, HI-hsv distance, HQ-hsv distance (from left) ]      

  

[ Query image 2]                  [ Retrieved images - Distance measures :  HE-rgb distance, HI-rgb distance, HQ-rgb distance (from left) ]    

  

[ Query image 3]                   [ Retrieved images - Distance measures :  HE-hsv distance, HI-hsv distance, HQ-hsv distance (from left) ]    

  

[ Query image 3]                  [ Retrieved images - Distance measures :  HE-rgb distance, HI-rgb distance, HQ-rgb distance (from left) ]    

  

[ Query image 4]                   [ Retrieved images - Distance measures :  HE-hsv distance, HI-hsv distance, HQ-hsv distance (from left) ]     

  

[ Query image 4]                  [ Retrieved images - Distance measures :  HE-rgb distance, HI-rgb distance, HQ-rgb distance (from left) ]    

  

[ Query image 5]                   [ Retrieved images - Distance measures :  HE-hsv distance, HI-hsv distance, HQ-hsv distance (from left) ]    

  

[ Query image 5]                  [ Retrieved images - Distance measures :  HE-rgb distance, HI-rgb distance, HQ-rgb distance (from left) ]    

  

Figure 7. Retrieval results for several images

   For query images 2,3 and 5, it's hard to judge in which color space the histogram-base retrieval showed better performance. For query image 1, it seems to be sure that histogram-based image retrieval in RGB color space showed better performance than in HSV color space. However, for query image 4,  histogram-based image retrieval in HSV color space showed better performance than in RGB color space. Since these images are only one case in 100 images in each category and each image in the same category has quite different characteristics, it's hard to find a general trend by the subjective test. It depends on the user's inclination for the most part. Therefore, it is required to test the results on the objective basis. The following section shows those objective results. 

6.3 Retrieval effectiveness for histogram-based image retrieval

   In order to measure retrieval effectiveness for a image retrieval system, precision vs. recall graphs are used. Although this effectiveness measure is useful to test a retrieval system objectively, it cannot exclude subjectivity inherent in image retrieval system. It's because the category itself was classified by humans. However, it is still most popular to test an information retrieval system. In the experiment, since the precision vs. recall graph for a query image regarding six histogram-based retrieval methods cannot show the generality of a specific retrieval methods, averages were taken for each retrieval method. The results are shown in Figure 8.~ Figure 13.

            Figure 8. Histogram-based Retrieval Effectiveness (Africans)                    Figure 9. Histogram-based Retrieval Effectiveness (Beach)     

            Figure 10. Histogram-based Retrieval Effectiveness (Buildings)                    Figure 11. Histogram-based Retrieval Effectiveness (Buses)    

        Figure 12. Histogram-based Retrieval Effectiveness (Dinosaurs)              Figure 13. Histogram-based Retrieval Effectiveness (500 images)     

   As we can see from the result from Figure 8. ~ Figure 11., except for 'Africans' category images, histogram-based retrievals in HSV color space showed better performance than in RGB color space since it showed higher precision values for the same recall values. In Figure 12., all three histogram-based image retrieval methods in RGB color space showed better performances than those in HSV color space. It seems to come from the characteristics of 'Dinosaurs' images, since they are not real photographs but composite graphic images. They are created to the characteristics of the RGB color space faithfully from the start. Thus, although the histogram intersection-based retrieval in RGB space showed good performance in average in Figure 13, it's because 'Africans' images and 'Dinosaurs' images affected dominantly. And we can find that the histogram euclidean distance and histogram intersection distance in HSV color space are most useful among six histogram distance measures in the average sense. In a viewpoint of computation time, using HSV color space is faster than using RGB color space and using Intersection method is faster than using Euclidean or Quadratic method. Thus, we can conclude that the Histogram Intersection-based image retrieval in HSV color space is most desirable among six retrieval methods mentioned in considering both computation time and retrieval effectiveness. From the graphs, we can also find that the quadratic distance measure is not efficient though it is computationally burdensome.    


7. Conclusion

   In this project, six histogram-based search methods were investigated in two different color spaces, RGB and HSV. Histogram search characterizes an image by its color distribution, or histogram but the drawback of a global histogram representation is that information about object location, shape, and texture is discarded. Thus this project showed that images retrieved by using the global color histogram may not be semantically related even though they share similar color distribution in some results. An image retrieval demo system was built to make it easy to test the retrieval performance and to expedite further algorithm investigation. And six histogram-based image retrieval methods in two color spaces were exhaustively compared by providing precision vs. recall graphs for each image class and for all test images. In general, histogram-based retrievals in HSV color space showed better performance than in RGB color space since it showed higher precision values for the same recall values. For graphic images, 3., all three histogram-based image retrieval methods in RGB color space showed better performance than those in HSV color space. And it is found  that the histogram euclidean distance and histogram intersection distance in HSV color space are most useful among six histogram distance measures in the average sense. In a viewpoint of computation time, using HSV color space is faster than using RGB color space and using Intersection method is faster than using Euclidean or Quadratic method. Thus, in conclusion, the Histogram Intersection-based image retrieval in HSV color space is most desirable among six retrieval methods mentioned in considering both computation time and retrieval effectiveness. From the results, the quadratic distance measure is not efficient though it is computationally burdensome.    

   For future work, it is possible to use the histogram intersection distance measure with other features such as shape and texture to improve the overall image retrieval performance. Finding good parameters for quadratic distance measure could be another subject to improve histogram-based search algorithm. However, even in that case, it would be difficult to reduce the computation time for quadratic measure significantly.


8. References

[1] James Z. Wang, "Integrated Region-Based Image Retrieval", Boston, Kluwer Academic Publishers, 2001
[2] J. R. Smith and S.-F. Chang. " Automated image retrieval using color and texture", Technical Report CU/CTR 408-95-14, Columbia University, July 1995.
[3] J. R. Smith and S.-F. Chang. " Tools and techniques for color image retrieval", In Symposium on Electronic Imaging: Science and Technology - Storage & Retrieval for Image and Video Databases IV, volume 2670, San Jose, CA, February 1996. IS&T/SPIE.
[4] M. J. Swain and D. H. Ballard. "Color indexing", International Journal of Computer Vision, 7:1 1991.
[5] J. R. Smith and S.-F. Chang. "VisualSEEk: a fully automated content-based image query system.", ACM Multimedia '96, November, 1996.
[6] James Hafner, Harpreet S.Sawhney, Will Equits, Myron Flickner and Wayne Niblack, "Efficient Color Histogram Indexing for Quadratic Form Distance Functions", IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 17, No. 7, July 1995


9. Appendixกก