Final Project : Psych 221
Applied Vision and Image Systems
Digital halftoning is the process of rendering an image with greater
amplitude resolution to one with lesser amplitude resolution. This has been
practiced for over one hundred years in the printing industry : indeed a
printer can only produce black dots on a sheet of paper, and one has to
optimally distribute them across the space to render a visually pleasing
pattern, giving the illusion of the original tonal quality of the image. This
is not the only application of halftoning, though. Image rendering with a
significant reduction of physically distinct colors is also a problem that
arises for low level displays. A VGA screen is only driven by two bits for each
color gun (RGB), which gives 64 different colors, instead of the 224
possibilities in a bitmap image. This is an immediate generalization of our
work, known as multilevel dithering.
We review the different halftoning techniques applied to low level
resolution displays, both grayscale and color images, for the convenience of
being able to see the differences on the screen. The algorithms we present for
both monochrome and color digital halftoning can be categorized into three
classes, in decreasing order of how myopically they transform a given image
into a halftone and, thus in increasing order of computational complexity and
halftone quality. In the first class are the pointwise approaches like
screening or dithering where the image is pixelwise compared to an array of
threshold and the piwels exceeding this spatially varying mask of thresholds
are changed to black. The next class of approaches uses the information about a
neighbourhood of pixels to decide the halftone state of a given pixel. Error
diffusion is the most important example of this category. The third class
consists of iterative approaches where several passes over the image pixels are
made to minimize an error metric or satisfy certain constraints before the
halftoning process is completed. All three of these classes of algorithms can
be generalized to digital color halftoning with some modifications.
The “perceptually similar” criterion is very important one, and is linked
to the characteristics of the human visual system. If we take a simpler
criterion of minimal mean square difference between the original and the
halftoned image, we would generate a thresholded image that occludes most of
the nuances of the original images. However, this is what is used when we
control a computer remotely with netmeeting, for example. Instead of
transmitting the whole data for the background image for example, it transmits
only a few bits to characterize it. This is a multilevel thresholding, and is
very noticeable to the eye. We shall see that with only one bit for each color,
we may do a better job than this ! This appears also when reducing the
resolution of the screen.
We exploit the fact that our high spatial frequency response drops
quickly above 6 cycles per degree, and even less for chrominance components. So
high frequency patterns are perceived as their average over a neighbourhood
depending on the distance between two adjacent pixels on the screen, and the
viewing distance.
![]()
Here are some plots of the image that we are going to use throughout this
paper, and their thresholded counterpart.
One needs to account for the fact that the 8-bit values that are put in
the frame buffer to trigger the red, green and blue guns of the CRT are not the
RGB tristimulus values of the colors displayed on the monitor. This is because
the CRT has a nonlinear response to
frame buffer values. Thus, we need to pass the
RGB values of the image through this non-linearity to obtain the RGB
coordinates of the colors displayed on the monitor. This corresponds to the
inverse of the gamma correction. The color images are first preprocessed with
this point nonlinearity before they are halftoned. This ensures that the colors
in the halftone are closest to the color actually rendered on the monitor.
Here is the relationship I have used :
![]()
Another idea that I use to go from an original grayscale halftoning
algorithm to a color one is to use the fact that the eye is much more sensitive
to luminance variation than to chromatic variations. Therefore, we want to try
to put more noise in the chromatic components to better shape the noise in the
luminance. This idea is adapted to the main techniques of color halftoning.

This is the linear transformation to change color space, process data as
visual pathways intensities, and go back to RGB values.
Ordered dithering
a) White noise dithering
Perhaps the first solution that comes to mind in order to try and
recreate the local average of the intensities is to use the most common type of
noise, white noise, in order to vary the threshold. We use a uniform
distribution from 0 to 255. This way, we can recreate the noise impression.
Here is what we get :
Three main remarks. First, when seen from far enough, it is recreating
the continuous grayscale, which is what we were aiming at.
Second, an important drawback of this method is the nature of the noise,
which confers the grainy aspect of the image, and is very visually disturbing .
Finally, it is blurring the image more than our eyes do and by this way,
information is lost.
I tried to sharpen the image as a pre-process to white noise dithering
with a Laplacian filter.
Isharp[x,y]
= I[x,y] – β.L[x,y]*I[x,y]
Here is what I get, somewhat an improvement, but still has this
disturbing grainy aspect
b) Clustered-dot dithering
Clustered-dot halftones are those that we commonly see in mass hard-copy
publications produce by offset printing, such as in newspapers, magazines and
books. As opposed to dispersed dot patterns, where the addressability of each
pixel is used, the pixels in clustered dot pattern are nucleated in groups in
regular intervals. Figure X illustrates an example of this type of halftone
both for grayscale and color images.
This process has a main advantage, which is that it is very robust
against differences among printers characteristics. In the case of offset
printing, there is a minimum size dot that will hold ink, so high resolution
dispersed dot screens will not work.
Many models of the human spatial contrast sensitivity function illustrate
the property that the visual system is more sensitive to orientations at 0 and
90 degree, and less acute at 45 degrees, so that’s what has been used from the
earliest days of halftoning.
Here is how we build such a dither template and we see the pattern that
it produces when used to reproduce a constant gray level. The dither template
is a matrix specifying the order in which pixels will be turned on as gray
level increases. It is periodic and is thus replicated throughout the entire
image. In this figure, the threshold value ordering begins at the center of the
blue spirals, until half of the array elements are assigned, then continues
from the outside of the red spiral.

Here we can derive a formula to know at which viewing distance the
pattern is not going to be seen anymore by the eye, depending on the
display/printer resolution in dots/pixels per inch, and the size of the dither
template.
![]()
In the case of the tests performed (figure ), you have to look at it at a distance of :
Using the fact that the frequency discrimination of the eye drops at
about 6 cycles per degree.
In color printing, the color components are rotated at different angles
to avoid moiré patterns, black (most apparent color) is set at 0°, Yellow
(least apparent) is set at 45°, and the two other colors, Magenta and Cyan are
set at +/-15°.
c) Recursive tessellation (Bayer dithering)
For displays or printers that are not of the highest resolution, it is
desirable to spread the black and white dots more over the space, and in fact
we can search the optimal template to produce a distribution of black and white
pixels all over the screen, for any gray level. The procedure to build such a
template is a particular case of the blue noise mask construction, so I won’t
go into great detail. Let’s just look at the output of this process.

It is very visually pleasing. The drawback of this deterministic process
is precisely its deterministic nature. We have regular patterns and we can see
straight line and crosses all across the image.
We’ll see with the blue noise mask how to keep the same properties that
make it visually pleasing without being so straight.
As before, I’ve tried to optimise it for color images, and a simple
enhancement that I’ve thought about is to scroll the template vertically
and horizontally before we apply it the the different color components. An
optimal scrolling value can be found by optimally distributing the luminance
across the space. Since the green component gives more luminance than the two
others, here is what I end up with for the 4x4 dimensions.

2- Frequency domain investigation and Blue Noise
Representing signals in the frequency domain can often simplify
complexity seen in the spatial domain. This is indeed the case with dither
patterns. It allows us a mean to examine the distribution of energy and its
consequences on the quality of the patterns. As it is the flat regions of an
image where the nature of dither is most important, the focus will be on the
power spectrum of patterns that result from the dithering of a single fixed
gray level. The DC or frequency center point contributes to the macroscopic
average, or gray level, of the dither pattern. As this datum is already known,
it contributes to nothing to aiding our interpretation of the nature of the
distribution of pixels that make up the dither pattern, and so is omitted in
resulting plots.
This turns out to be equivalent to the power spectrum of the quantization
error between the original image and the dithered one, because for a uniform
gray level image, they are scaled versions of one another.
However two dimensional spectral plots can also afford to be made more
succinct. Most well formed patterns share the characteristic of being
isotropic. This leads to a metric that summarizes the spectrum radially. Figure X shows the segmenting of a spectral
period into concentric annuli. Averaging the power spectrum in each annulus
results in a radially averaged power spectrum, where these averages are plotted
against the radial frequency, the radial distance from the DC center point of
the annulus. As with the horizontal or vertical spatial frequency, this radial
frequency is in units of inverse spatial sample periods.
Tadial frequency can go as high as 1/√2 at the high frequency
corners ; these high frequency corners correspond to a checkerboard pattern,
the highest 2-D pattern possible.
The power spectrum also needs to be normalized for the gray level g,
which we will specify as ranging grom g=0 (black) to g=1 (white). Spectral
energy increases as the number of minority pixels in a bitonal pattern
increases, peaking at g=0.5, and so the spectral values are divided by the gray
level variance.
Figure X shows the measured radially averaged power spectrum for white
noise patterns. Using the normalization described above, we can plot the
spectra for different gray levels on the same plot, and compare them.





The spatial contrast sensitivity function is shown in Figure X
In order for the quantization error to be perceived as little as
possible, we see that we must avoid frequency components in the low
frequencies, because the high frequencies are not perceived by the human
vision.
The viewing distance is related to the width of the CSF, and we can
relate the viewing distance to the principal frequency at a particular gray
level in the case of Bayer dithering to determine how far we have to be from a
display not to see the artefacts.
This acknowledges the optimality of
the Bayer dithering matrix among all ordered dither matrices. However,
the structured natured of this is reflected in the spike in the spectrum at the
end of the range, for all gray levels.
The kind of spectrum we are looking for is often referred to in the
literature as “blue noise”. Ideally, a well-formed dither pattern should have
the unstructured nature of white noise without the low frequency components.
I describe now the process for building a blue noise
mask, along with the result of my experiments.
A blue noise mask is a template, usually of smaller size than the image
(which doesn’t introduce any problem since it is built with wrap-around
properties, as we shall see). A blue noise mask is independent from the image
to be halftoned.
We want to design a matrix with coefficients ordered in the order
according to which the pixels are to be turned on, as the gray level increases.
The mask can be viewed as a collection of dot profiles, the binary matrices
obtained by thresholding the mask with a certain gray level g, for all values
of the gray level.
The dot profiles have to be related to one another, and the relationship
can be expressed in this way : the dot profile for a gray level g1>g2
is to be designed from the dot profile of g2 such that all pixels
turned on in the first dot profile are also turned on in the second. This
condition cannot be explained by spectral consideration, because we have only
properties for the spectrum of a fixed gray level. However we can understand it
the following way : If two different gray levels are present in an image, the
boundary between those should not introduce any discontinuities in the dot
repartition. Thus, the profiles of the higher gray level should be include the
dots presents in the lower one.
The actual mask design comprises
three main phases. First, we design a dot pattern for a
certain gray level, which is arbitrary, but presents the characteristics of
blue noise.
To do this, I started with a white noise pattern, and improved it to get
blue noise by moving the pixels one at a time, from a very dense region, to the
less dense. I have used a gaussian filter to detect high and low densities.
Then we have to iteratively generate the dot profiles for the
other gray levels. From our starting point, we first generate the ones corresponding
to the gray levels below the first one, and then the ones above it, both
starting from the initial blue noise pattern. For both ways, I still use my void/cluster finding filter.
I believe the size of the mask should be
chosen in relation to the average area of uniform gray levels, such that we
never see any regular pattern reproduced across the image. What that means is
that we have to relate the spectrum of the original image to the spectrum of
the quantization error in some way.

3- Error diffusion
Traditional error diffusion halftoning
produces high quality binary images from digital grayscale images. Error
diffusion shapes the quantization noise power into the high frequency regions
where the human eye is the least sensitive.
More precisely, the input image is compared to a threshold, and the
quantization error is used to modify the surrounding yet to be considered input
values as governed by the error filter.
This algorithm was first introduced by Floyd and Steinberg, who also
proposed the most common filter, shown in figure X.
The algorithm processes pixels in a raster order, so the only non-zero
filter elements are those in front and below the current pixel. As with all
error filters, the elements must sum to one. Several larger error filters have
been proposed that appear to create better looking images. However, the output
looks better because the filters tends to sharpen more ; areas of flat gray
tend to be less homogeneous than the original Floyd and Steinberg filter. Another
solution that has been investigated is to use a sharpening filter as a
pre-filter, which also introduces better looking images. Sometimes visual
artefacts occur, especially for medium gray levels. These drawbacks can be
cancelled by introducing two features : first, a less regular raster of the
image, and second, the add of noise to the filter coefficients. The result of
this modified error diffusion is shown in figure X.
Error diffusion may be extended to color images as well.
The easiest extension is generated by diffusing the error into separate color
channels independently. The result is pretty good, but we can see some
undesirable features, that could be reduced if we process the RGB values at the
same time. Indeed, the eye is more sensitive to achromatic high frequencies
than to the lower ones.
So the idea we want to exploit is to diffuse the error from each gun of
the pixel into not only neighbouring pixels of the same color, but also the
ones of different colors.
To obtain a true matrix linear color model, one needs to model the color
processing of the human visual system as a convolution with a matrix valued
coefficient. We use a pattern color separable model, which first transforms
devide-dependent RGB values into a space with basis functions represented by
the normalized color sensitivities of the three cone types. I used the spectra
provided in the tutorials for the power spectral densities of the three display
guns, along with the entire spectral responsivities of the LMS cones. The transition
matrix is the product of those spectra. Thus, each RGB pixel value is
transformed into the corresponding cone photoreceptor absorption rate. The LMS
coordinates are then transformed using a color transformation into an opponent
representation. The three opponent visual pathways are the luminance (L+M+S),
the red-green (L-M) and the blue yellow (L+M-S).
Then, we perform spatial filtering on each of these channel using a
different spatial filter on each channel.
A matrix valued filter can be obtained using this model and minimizing a
cost function of the visual distortion.
The matrix we used was taken from []. Here it is :

The following plots show the very good results of this method.


4- Model-based halftoning

We present a halftoning technique which searches for a halftone which minimizes a global measure of error
based on models of the human visual system (can also include a model of the
output printer, but we consider only displays). The halftone which minimizes
the error is found iteratively. The interest of this approach is not a
practical application, because the computational cost is discouraging. However,
such searches for optimality are theoretically very useful, because they allow
us to see how far we are from the optimum, with different methods.
We have implemented a computationally efficient (takes only a few minutes
and about 12 passes over the image to find an optimum on matlab for a 256x256
grayscale image) algorithm using direct binary search. The DBS algorithm
proceeds by transforming an initial halftone into one which locally minimizes
the visual model response to the color image. Since this minimum is local, it
is not unique, and an important drawback of this method is that the result is
dependent on the starting point.
Let’s start with the description of the algorithm : the
global mean squared error between the perceived halftone image and the
perceived original image is computed. We model the low pass characteristic of
the eye with a luminance spatial frequency response function (model by Daly)
We compute this error efficiently by computing first the autocorrelation
function of the filter, Cpp and the correlation between the error
and the filter, Cpe. As explained in [], the error is a function of
those two. Each time we try a change, we can evaluate the consequences of this
change on the error by using only four values of these matrix. If the change we
tried effectively reduces the error, then it is accepted, and this time we need
to change as many values of Cpe as there are values in Cpp.
This is acceptable though, because especially in the last passes over the
image, we try much more changes than the number we accept.
This way, the process scans the image, for each pixel it tries both
changing the value of this pixel, or swaping it with one of its neighbours. If
the error decreases, the change is validated, the error is updated, and it
proceeds to the next pixel.
It keeps scanning the image until the error decreases less than 0.1 % in
an entire image scan.
We consider then that a local minimum has been found.
We can visualize in figure X the evolution of the error as a function of
time, with different starting points, reaching different stable points (local
optima)
We show several plots of the result for our test images.
The reader should note that the algorithm only ensures convergence toward
a local minimum, with varying performance depending on the starting point. This
is illustrated in the following plot

|
|
|
|
|
Starting point : Blue noise mask (7 cycles to
converge) |
Starting point : Bayer matrix ( 6 cycles to
converge) |
Starting point : Error diffusion (4 cycles to
converge) |
The grid on the images is due to the scaling performed by the softwares.
To view the real image, you should download it and display it with your
favorite viewer.
Let’s look at the generalization of this procedure for color images :
Once again, the data we have to work with has to be expressed in terms of the
visual pathways, which have different modulation transfer functions (low pass
for the luminance, very low pass for the opponent channels)
We have to compute the following linear transformation :

in order to have the image in the color space we want, and then filter
the three channels independently with
the contrast sensitivity functions, for both the original and halftoned image,
and take the mean squared error of those two. This gives us our error measure.
The algorithm will have to be modified a little bit, but we should still
be able to exploit most of the low complexity ideas her.