Final Project : Psych 221

Applied Vision and Image Systems

 

 

 

Abstract

 

Digital halftoning is the process of rendering an image with greater amplitude resolution to one with lesser amplitude resolution. This has been practiced for over one hundred years in the printing industry : indeed a printer can only produce black dots on a sheet of paper, and one has to optimally distribute them across the space to render a visually pleasing pattern, giving the illusion of the original tonal quality of the image. This is not the only application of halftoning, though. Image rendering with a significant reduction of physically distinct colors is also a problem that arises for low level displays. A VGA screen is only driven by two bits for each color gun (RGB), which gives 64 different colors, instead of the 224 possibilities in a bitmap image. This is an immediate generalization of our work, known as multilevel dithering.

We review the different halftoning techniques applied to low level resolution displays, both grayscale and color images, for the convenience of being able to see the differences on the screen. The algorithms we present for both monochrome and color digital halftoning can be categorized into three classes, in decreasing order of how myopically they transform a given image into a halftone and, thus in increasing order of computational complexity and halftone quality. In the first class are the pointwise approaches like screening or dithering where the image is pixelwise compared to an array of threshold and the piwels exceeding this spatially varying mask of thresholds are changed to black. The next class of approaches uses the information about a neighbourhood of pixels to decide the halftone state of a given pixel. Error diffusion is the most important example of this category. The third class consists of iterative approaches where several passes over the image pixels are made to minimize an error metric or satisfy certain constraints before the halftoning process is completed. All three of these classes of algorithms can be generalized to digital color halftoning with some modifications.


Introduction

 

The “perceptually similar” criterion is very important one, and is linked to the characteristics of the human visual system. If we take a simpler criterion of minimal mean square difference between the original and the halftoned image, we would generate a thresholded image that occludes most of the nuances of the original images. However, this is what is used when we control a computer remotely with netmeeting, for example. Instead of transmitting the whole data for the background image for example, it transmits only a few bits to characterize it. This is a multilevel thresholding, and is very noticeable to the eye. We shall see that with only one bit for each color, we may do a better job than this ! This appears also when reducing the resolution of the screen.

We exploit the fact that our high spatial frequency response drops quickly above 6 cycles per degree, and even less for chrominance components. So high frequency patterns are perceived as their average over a neighbourhood depending on the distance between two adjacent pixels on the screen, and the viewing distance.

Here are some plots of the image that we are going to use throughout this paper, and their thresholded counterpart.

 

Gamma correction

 

One needs to account for the fact that the 8-bit values that are put in the frame buffer to trigger the red, green and blue guns of the CRT are not the RGB tristimulus values of the colors displayed on the monitor. This is because the CRT has a nonlinear response  to frame buffer values. Thus, we need to pass the  RGB values of the image through this non-linearity to obtain the RGB coordinates of the colors displayed on the monitor. This corresponds to the inverse of the gamma correction. The color images are first preprocessed with this point nonlinearity before they are halftoned. This ensures that the colors in the halftone are closest to the color actually rendered on the monitor.

Here is the relationship I have used :

 

Color halftoning

 

Another idea that I use to go from an original grayscale halftoning algorithm to a color one is to use the fact that the eye is much more sensitive to luminance variation than to chromatic variations. Therefore, we want to try to put more noise in the chromatic components to better shape the noise in the luminance. This idea is adapted to the main techniques of color halftoning.

 

 

 

This is the linear transformation to change color space, process data as visual pathways intensities, and go back to RGB values.
Ordered dithering

 

a)    White noise dithering

 

Perhaps the first solution that comes to mind in order to try and recreate the local average of the intensities is to use the most common type of noise, white noise, in order to vary the threshold. We use a uniform distribution from 0 to 255. This way, we can recreate the noise impression. Here is what we get :

 

Three main remarks. First, when seen from far enough, it is recreating the continuous grayscale, which is what we were aiming at.

Second, an important drawback of this method is the nature of the noise, which confers the grainy aspect of the image, and is very visually disturbing .

Finally, it is blurring the image more than our eyes do and by this way, information is lost.

I tried to sharpen the image as a pre-process to white noise dithering with a Laplacian filter.

 

I­­sharp[x,y] = I[x,y] – β.L[x,y]*I[x,y]

 

Here is what I get, somewhat an improvement, but still has this disturbing grainy aspect

 

b)     Clustered-dot dithering

 

Clustered-dot halftones are those that we commonly see in mass hard-copy publications produce by offset printing, such as in newspapers, magazines and books. As opposed to dispersed dot patterns, where the addressability of each pixel is used, the pixels in clustered dot pattern are nucleated in groups in regular intervals. Figure X illustrates an example of this type of halftone both for grayscale and color images.

This process has a main advantage, which is that it is very robust against differences among printers characteristics. In the case of offset printing, there is a minimum size dot that will hold ink, so high resolution dispersed dot screens will not work.

Many models of the human spatial contrast sensitivity function illustrate the property that the visual system is more sensitive to orientations at 0 and 90 degree, and less acute at 45 degrees, so that’s what has been used from the earliest days of halftoning.

Here is how we build such a dither template and we see the pattern that it produces when used to reproduce a constant gray level. The dither template is a matrix specifying the order in which pixels will be turned on as gray level increases. It is periodic and is thus replicated throughout the entire image. In this figure, the threshold value ordering begins at the center of the blue spirals, until half of the array elements are assigned, then continues from the outside of the red spiral.

 

 

Here we can derive a formula to know at which viewing distance the pattern is not going to be seen anymore by the eye, depending on the display/printer resolution in dots/pixels per inch, and the size of the dither template.

 

 

In the case of the tests performed (figure  ), you have to look at it at a distance of :

Using the fact that the frequency discrimination of the eye drops at about 6 cycles per degree.

 

In color printing, the color components are rotated at different angles to avoid moiré patterns, black (most apparent color) is set at 0°, Yellow (least apparent) is set at 45°, and the two other colors, Magenta and Cyan are set at +/-15°.

 

c)    Recursive tessellation (Bayer dithering)

 

For displays or printers that are not of the highest resolution, it is desirable to spread the black and white dots more over the space, and in fact we can search the optimal template to produce a distribution of black and white pixels all over the screen, for any gray level. The procedure to build such a template is a particular case of the blue noise mask construction, so I won’t go into great detail. Let’s just look at the output of this process.

 

It is very visually pleasing. The drawback of this deterministic process is precisely its deterministic nature. We have regular patterns and we can see straight line and crosses all across the image.

We’ll see with the blue noise mask how to keep the same properties that make it visually pleasing without being so straight.

As before, I’ve tried to optimise it for color images, and a simple enhancement that I’ve thought about is to scroll the template vertically and horizontally before we apply it the the different color components. An optimal scrolling value can be found by optimally distributing the luminance across the space. Since the green component gives more luminance than the two others, here is what I end up with for the 4x4 dimensions.

 

 

 


2-  Frequency domain investigation and Blue Noise

 

Representing signals in the frequency domain can often simplify complexity seen in the spatial domain. This is indeed the case with dither patterns. It allows us a mean to examine the distribution of energy and its consequences on the quality of the patterns. As it is the flat regions of an image where the nature of dither is most important, the focus will be on the power spectrum of patterns that result from the dithering of a single fixed gray level. The DC or frequency center point contributes to the macroscopic average, or gray level, of the dither pattern. As this datum is already known, it contributes to nothing to aiding our interpretation of the nature of the distribution of pixels that make up the dither pattern, and so is omitted in resulting plots.

This turns out to be equivalent to the power spectrum of the quantization error between the original image and the dithered one, because for a uniform gray level image, they are scaled versions of one another.

 

However two dimensional spectral plots can also afford to be made more succinct. Most well formed patterns share the characteristic of being isotropic. This leads to a metric that summarizes the spectrum radially. Figure X shows the segmenting of a spectral period into concentric annuli. Averaging the power spectrum in each annulus results in a radially averaged power spectrum, where these averages are plotted against the radial frequency, the radial distance from the DC center point of the annulus. As with the horizontal or vertical spatial frequency, this radial frequency is in units of inverse spatial sample periods.

Tadial frequency can go as high as 1/√2 at the high frequency corners ; these high frequency corners correspond to a checkerboard pattern, the highest 2-D pattern possible.

The power spectrum also needs to be normalized for the gray level g, which we will specify as ranging grom g=0 (black) to g=1 (white). Spectral energy increases as the number of minority pixels in a bitonal pattern increases, peaking at g=0.5, and so the spectral values are divided by the gray level variance.

 

Figure X shows the measured radially averaged power spectrum for white noise patterns. Using the normalization described above, we can plot the spectra for different gray levels on the same plot, and compare them.

 

 

 

The spatial contrast sensitivity function is shown in Figure X

In order for the quantization error to be perceived as little as possible, we see that we must avoid frequency components in the low frequencies, because the high frequencies are not perceived by the human vision.

The viewing distance is related to the width of the CSF, and we can relate the viewing distance to the principal frequency at a particular gray level in the case of Bayer dithering to determine how far we have to be from a display not to see the artefacts.

 

This acknowledges the optimality of  the Bayer dithering matrix among all ordered dither matrices. However, the structured natured of this is reflected in the spike in the spectrum at the end of the range, for all gray levels.

 

The kind of spectrum we are looking for is often referred to in the literature as “blue noise”. Ideally, a well-formed dither pattern should have the unstructured nature of white noise without the low frequency components.

 

I describe now the process for building a blue noise mask, along with the result of my experiments.

 

A blue noise mask is a template, usually of smaller size than the image (which doesn’t introduce any problem since it is built with wrap-around properties, as we shall see). A blue noise mask is independent from the image to be halftoned.

We want to design a matrix with coefficients ordered in the order according to which the pixels are to be turned on, as the gray level increases. The mask can be viewed as a collection of dot profiles, the binary matrices obtained by thresholding the mask with a certain gray level g, for all values of the gray level.

The dot profiles have to be related to one another, and the relationship can be expressed in this way : the dot profile for a gray level g1>g2 is to be designed from the dot profile of g2 such that all pixels turned on in the first dot profile are also turned on in the second. This condition cannot be explained by spectral consideration, because we have only properties for the spectrum of a fixed gray level. However we can understand it the following way : If two different gray levels are present in an image, the boundary between those should not introduce any discontinuities in the dot repartition. Thus, the profiles of the higher gray level should be include the dots presents in the lower one.

 

The actual mask design comprises three main phases. First, we design a dot pattern for a certain gray level, which is arbitrary, but presents the characteristics of blue noise.

To do this, I started with a white noise pattern, and improved it to get blue noise by moving the pixels one at a time, from a very dense region, to the less dense. I have used a gaussian filter to detect high and low densities.

Then we have to iteratively generate the dot profiles for the other gray levels. From our starting point, we first generate the ones corresponding to the gray levels below the first one, and then the ones above it, both starting from the initial blue noise pattern. For both ways, I still use my void/cluster finding filter.

 

I believe the size of the mask should be chosen in relation to the average area of uniform gray levels, such that we never see any regular pattern reproduced across the image. What that means is that we have to relate the spectrum of the original image to the spectrum of the quantization error in some way.

 

MATLAB Handle GraphicsMATLAB Handle Graphics 


3-  Error diffusion

 

                                     

 

­­Traditional error diffusion halftoning produces high quality binary images from digital grayscale images. Error diffusion shapes the quantization noise power into the high frequency regions where the human eye is the least sensitive.

More precisely, the input image is compared to a threshold, and the quantization error is used to modify the surrounding yet to be considered input values as governed by the error filter.

This algorithm was first introduced by Floyd and Steinberg, who also proposed the most common filter, shown in figure X.

The algorithm processes pixels in a raster order, so the only non-zero filter elements are those in front and below the current pixel. As with all error filters, the elements must sum to one. Several larger error filters have been proposed that appear to create better looking images. However, the output looks better because the filters tends to sharpen more ; areas of flat gray tend to be less homogeneous than the original Floyd and Steinberg filter. Another solution that has been investigated is to use a sharpening filter as a pre-filter, which also introduces better looking images. Sometimes visual artefacts occur, especially for medium gray levels. These drawbacks can be cancelled by introducing two features : first, a less regular raster of the image, and second, the add of noise to the filter coefficients. The result of this modified error diffusion is shown in figure X.

 

Error diffusion may be extended to color images as well. The easiest extension is generated by diffusing the error into separate color channels independently. The result is pretty good, but we can see some undesirable features, that could be reduced if we process the RGB values at the same time. Indeed, the eye is more sensitive to achromatic high frequencies than to the lower ones.

So the idea we want to exploit is to diffuse the error from each gun of the pixel into not only neighbouring pixels of the same color, but also the ones of different colors.

To obtain a true matrix linear color model, one needs to model the color processing of the human visual system as a convolution with a matrix valued coefficient. We use a pattern color separable model, which first transforms devide-dependent RGB values into a space with basis functions represented by the normalized color sensitivities of the three cone types. I used the spectra provided in the tutorials for the power spectral densities of the three display guns, along with the entire spectral responsivities of the LMS cones. The transition matrix is the product of those spectra. Thus, each RGB pixel value is transformed into the corresponding cone photoreceptor absorption rate. The LMS coordinates are then transformed using a color transformation into an opponent representation. The three opponent visual pathways are the luminance (L+M+S), the red-green (L-M) and the blue yellow (L+M-S).

Then, we perform spatial filtering on each of these channel using a different spatial filter on each channel.

 

A matrix valued filter can be obtained using this model and minimizing a cost function of the visual distortion.

The matrix we used was taken from []. Here it is :

 

The following plots show the very good results of this method.

 

 


4-  Model-based halftoning

 

 

We present a halftoning technique which searches for a halftone  which minimizes a global measure of error based on models of the human visual system (can also include a model of the output printer, but we consider only displays). The halftone which minimizes the error is found iteratively. The interest of this approach is not a practical application, because the computational cost is discouraging. However, such searches for optimality are theoretically very useful, because they allow us to see how far we are from the optimum, with different methods. 

We have implemented a computationally efficient (takes only a few minutes and about 12 passes over the image to find an optimum on matlab for a 256x256 grayscale image) algorithm using direct binary search. The DBS algorithm proceeds by transforming an initial halftone into one which locally minimizes the visual model response to the color image. Since this minimum is local, it is not unique, and an important drawback of this method is that the result is dependent on the starting point.

 

Let’s start with the description of the algorithm : the global mean squared error between the perceived halftone image and the perceived original image is computed. We model the low pass characteristic of the eye with a luminance spatial frequency response function (model by Daly)

We compute this error efficiently by computing first the autocorrelation function of the filter, Cpp and the correlation between the error and the filter, Cpe. As explained in [], the error is a function of those two. Each time we try a change, we can evaluate the consequences of this change on the error by using only four values of these matrix. If the change we tried effectively reduces the error, then it is accepted, and this time we need to change as many values of Cpe as there are values in Cpp. This is acceptable though, because especially in the last passes over the image, we try much more changes than the number we accept.

 

This way, the process scans the image, for each pixel it tries both changing the value of this pixel, or swaping it with one of its neighbours. If the error decreases, the change is validated, the error is updated, and it proceeds to the next pixel.

It keeps scanning the image until the error decreases less than 0.1 % in an entire image scan.

We consider then that a local minimum has been found.

We can visualize in figure X the evolution of the error as a function of time, with different starting points, reaching different stable points (local optima)

We show several plots of the result for our test images.

The reader should note that the algorithm only ensures convergence toward a local minimum, with varying performance depending on the starting point. This is illustrated in the following plot

 

Starting point : Blue noise mask (7 cycles to converge)

Starting point : Bayer matrix ( 6 cycles to converge)

Starting point : Error diffusion (4 cycles to converge)

 

 

The grid on the images is due to the scaling performed by the softwares. To view the real image, you should download it and display it with your favorite viewer.

Let’s look at the generalization of this procedure for color images : Once again, the data we have to work with has to be expressed in terms of the visual pathways, which have different modulation transfer functions (low pass for the luminance, very low pass for the opponent channels)

 

We have to compute the following linear transformation :

 

 

in order to have the image in the color space we want, and then filter the three channels independently  with the contrast sensitivity functions, for both the original and halftoned image, and take the mean squared error of those two. This gives us our error measure.

The algorithm will have to be modified a little bit, but we should still be able to exploit most of the low complexity ideas her.