Project List
| Authors | Topic | Contact | Presentation | Report |
| Min-Wook Jeong, Young Min Kim, Jungsuk Kwac | Image Compression using Inpainting | Shantanu | | |
| Stephanie Kwan, Karen Zhu | Compression of Color Mosaic Images | Markus and Joyce | | |
| Wei Hsu, Maya Khaneboubi, Jonathan Solnit | Lossless Compression of Hyperspectral Images with Random Access Support | Pierpaolo | | |
| Po-Kai Chen, Liz Li | | David V | | |
| Vijay Chandrasekhar, Louis Chen, Gabriel Takacs | Joint Compression of Similar Photos | Anne | | |
| Pradeep Dunna, Deepan Gandhi, Rohit Watve | Image Transcoding for Web Browsing on Mobile Devices | Xiaoqing | | |
| David Lau, Shaan Patel, Shahar Yuval | Interactive Image Browsing with JPEG2000/JPIP | Chuo-Ling and Aditya | | |
| Eric Lin, Hyungsik Shin | Film Grain Noise Removal and Synthesis for JPEG2000 Image Coding | Aditya and Prof Girod | | |
| Chien-An Lai, George Yu | Tuning Baseline JPEG Encoding for Individual Images | David V | | |
| David Chen, Sameh Zakhary | Improving Laplacian Pyramid Image Coding | Aditya and Markus | | |
Course Project Topics
Each group should either select two of the project topics (as 1st and 2nd choices) listed below OR submit their own topic and proposal by February 8. Please send a single email informing the General TA of the members of your group and your project selections.
Groups usually consist of 2 or 3 students. We might combine groups of 2 students and individuals interested in the same topic at our discretion. Groups with more than 3 students are not encouraged.
Please address your questions pertaining to specific topics to the indicated contacts, who are members of the Image, Video and Multimedia Systems Group. You may contact them for advice before and after making your selections. They may provide you with further references and code. Most papers are available at the IEEE Xplore web site, CiteSeer or Google Scholar.
IMAGE COMPRESSION |
| |
|
| Topic 1 |
Image Compression using Inpainting |
| Contact |
Shantanu Rane |
| Proposal |
Image inpainting [1] involves filling in missing portions of an image.
The goal of an inpainting algorithm is to smoothly propagate colors and
edges into the region to be inpainted. An inpainting algorithm [1,2]
based on partial differential equations has been applied for error
concealment. Inpainting can also be applied at the decoder to fill in
regions that have intentionally been removed at the encoder [3]. Thus,
if a large portion of an image can be inpainted at the decoder with a
very small degradation in picture quality, then it is profitable (from a
compression point of view) to skip that portion of the image. As proposed in [2,3], edge detection can be used to find regions that can
be inpainted with a small distortion, and these regions are removed from
the image to be compressed. The remaining image is then compressed using
a standardized compression algorithm such as JPEG2000. At the decoder,
the JPEG2000 bit stream is first decoded to obtain an image with holes.
These holes are then inpainted to give the final decoded image.
Students are expected to implement and optimize an inpainting algorithm
in concert with a standard compression algorithm and investigate its
rate-distortion performance in comparison with JPEG and JPEG-2000. The
decoder should be able to automatically identify regions containing
structure (predominantly edges) and texture, and employ suitable
inpainting algorithms for both. In addition to rate-distortion performance, students
should comment on the compression artifacts resulting from this approach. |
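As a rough illustration of decoder-side hole filling, the sketch below fills masked pixels by repeated 4-neighbour averaging. This is plain isotropic diffusion, chosen only for brevity; it is not the PDE-based method of [1] or the structure-aware scheme of [3], and it will blur edges that those methods are designed to preserve. The function name `inpaint_diffusion` and its parameters are hypothetical.

```python
def inpaint_diffusion(image, mask, iterations=200):
    """Fill masked pixels by repeated 4-neighbour averaging (isotropic diffusion).

    image: list of rows of numbers; mask: same shape, True where a pixel is missing.
    Illustrative stand-in only, not the inpainting algorithm of the cited papers.
    """
    height, width = len(image), len(image[0])
    # Initialise every hole with the mean of the known pixels.
    known = [image[y][x] for y in range(height) for x in range(width) if not mask[y][x]]
    fill = sum(known) / len(known)
    current = [[fill if mask[y][x] else float(image[y][x]) for x in range(width)]
               for y in range(height)]
    for _ in range(iterations):
        updated = [row[:] for row in current]
        for y in range(height):
            for x in range(width):
                if not mask[y][x]:
                    continue  # known pixels are never modified
                neighbours = [current[ny][nx]
                              for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                              if 0 <= ny < height and 0 <= nx < width]
                updated[y][x] = sum(neighbours) / len(neighbours)
        current = updated
    return current
```

On a smooth region (for example a linear intensity ramp) this recovers the missing pixels almost exactly, which is the situation in which skipping a region at the encoder is cheap in distortion.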
| References |
[1] M. Bertalmio, G. Sapiro, V. Caselles and C. Ballester, "Image
Inpainting," SIGGRAPH 2000, pp. 417-422.
[2] S. Rane, M. Bertalmio and G. Sapiro, "Structure and Texture Filling-In
of missing image blocks for Wireless Transmission and for Compression
Applications," IEEE Trans. Image Processing, Vol. 12, No. 3, pp.
296-303, March 2003.
[3] Chen Wang, Xiaoyan Sun, Feng Wu and Hongkai Xiong, "Image compression
with structure-aware inpainting," IEEE International Symposium on
Circuits and Systems 2006, May 2006. |
| |
|
| Topic 2 |
Shape-adaptive Transforms for Coding Arbitrarily Shaped Objects |
| Contact |
Aditya Mavlankar and Chuo-Ling Chang |
| Proposal |
For certain applications, e.g., image compositing, it is advantageous to code objects in an image separately. This requires a compression algorithm which can handle arbitrarily shaped regions. To apply conventional techniques, one could use zero padding beyond the object
boundaries to obtain a rectangle of pixels which can be coded.
Alternatively, one might want to employ a shape-adaptive algorithm, for
example, using the SA-DCT proposed in [1]. There are also other
techniques, some of which are mentioned in [2,3].
As part of the project, you should investigate how compression
efficiency is affected by coding more and more objects in an image
independently of the rest of the image. You will present the pros and
cons of some methods as well as suggest possible improvements. The
suggested improvements may cover the gamut of transforms, quantization
and entropy coding. It should be noted that the focus of the project is
NOT the segmentation algorithm used for deciding object boundaries. An
algorithm with human feedback can be quickly devised at the beginning to
obtain the input data. |
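To make the shape-adaptive idea concrete, here is a minimal sketch of the two-pass structure commonly described for SA-DCT: object pixels in each column are shifted to the top and transformed with a DCT of matching length, then the resulting coefficients in each row are shifted left and transformed again. It assumes an orthonormal DCT-II for every length, which may differ from the normalization used in [1]; the names `dct` and `sa_dct` are hypothetical.

```python
import math

def dct(values):
    """Orthonormal DCT-II of a list of any length (empty input gives an empty list)."""
    n = len(values)
    out = []
    for k in range(n):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        out.append(scale * sum(v * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                               for i, v in enumerate(values)))
    return out

def sa_dct(block, mask):
    """Shape-adaptive transform sketch for a square block.

    block: list of rows of numbers; mask: same shape, True inside the object.
    Returns a list of coefficient rows of varying length.
    """
    size = len(block)
    # Vertical pass: gather the object pixels of each column (shifted to the top).
    columns = [dct([block[y][x] for y in range(size) if mask[y][x]])
               for x in range(size)]
    # Horizontal pass: row r collects the r-th coefficient of every column that
    # has one (shifted to the left) and transforms that shorter row.
    rows = []
    for r in range(size):
        row = [col[r] for col in columns if len(col) > r]
        rows.append(dct(row))
    return rows
```

The number of output coefficients equals the number of object pixels, in contrast to zero padding, which always codes the full rectangle.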
| References |
[1] T. Sikora and B. Makai, “Shape-adaptive DCT for generic coding of
video,” IEEE Transactions on Circuits and Systems for Video Technology
(CSVT), vol. 5, no. 1, pp. 59-62, Feb. 1995.
[2] G. Shen, B. Zeng and M. L. Liou, “An efficient hybrid arbitrarily
shaped object coding technique,” IEEE International Symposium on
Circuits and Systems (ISCAS), Geneva, Switzerland, May 2000.
[3] C.-L. Chang, X. Zhu, P. Ramanathan and B. Girod, "Light Field
Compression Using Disparity-Compensated Lifting and Shape Adaptation,"
IEEE Trans. Image Processing, vol. 15, no. 4, pp. 793-806, April 2006. |
| |
|
| Topic 3 |
Efficient Compression of MCOT Low-Band Images
|
| Contact |
Markus Flierl |
| Proposal |
Recently, motion-compensated orthogonal transforms (MCOT)
have been introduced for video compression [1]. These transforms achieve
orthonormal signal decompositions for arbitrary motion fields between
temporally successive frames. The temporal low-band of the MCOT
looks like a conventional image, except that the intensity
is scaled by a (known) factor that can change from pixel to pixel.
Therefore, efficient spatial compression of these low-band images
is challenging.
The simplest coding approach is to divide each pixel value by its
scale factor and use standard image coding techniques like JPEG 2000
to compress the unscaled image. But this does not encode the low-band
image in the original orthonormal subspace. In [1], Section 3, we
have proposed a Haar-like adaptive spatial transform that considers
these scale factors.
The goal of this project is to study adaptive block transforms and
wavelet transforms for efficient subband coding of MCOT low-band images. The
above-mentioned approaches shall be investigated and compared. Further,
novel techniques with superior compression performance shall be proposed.
No video processing is required. We will provide some sample low-band
images with scale factor matrices for experimentation. |
| References |
[1] M. Flierl and B. Girod, "A Motion-Compensated Orthogonal Transform with Energy Concentration Constraint," Proc. of the IEEE Workshop on Multimedia Signal Processing, Victoria, Canada, Oct. 2006. Available at http://www.stanford.edu/~mflierl/Publications/flierl06-MMSP.pdf
|
| |
|
| Topic 4 |
Compression of Color Mosaic Images |
| Contact |
Markus Flierl and Joyce Farrell |
| Proposal |
Most digital cameras use only one image sensor with
a color filter array that yields interleaved sensor elements
for red, green, and blue. A color demosaicking algorithm
generates the RGB image, which is subsequently compressed,
typically by baseline JPEG. An alternative approach, which
compresses the raw color mosaic image directly, is promising;
it can reduce processing in the camera, while simultaneously
lowering the required bit-rate and improving the ultimate
picture quality. It is also attractive for archiving of
professional-quality photos.
Excellent results were reported recently for direct lossless compression
of color mosaic images using discrete wavelet transforms [1].
In this project, students should select a scheme from [1] as a
starting point and investigate its extension to lossy compression.
Quality can be evaluated by measuring PSNR of RGB images reconstructed
by a simple demosaicking algorithm and shall be compared with
the conventional architecture (demosaicking first - then compression).
Color mosaic images for experimentation can be provided. |
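The two helpers below sketch the bookkeeping around this evaluation: splitting a raw mosaic into its colour planes and computing PSNR between two images. The RGGB layout is an assumption made for illustration (the actual filter arrangement depends on the sensor), and the names `split_bayer_rggb` and `psnr` are hypothetical.

```python
import math

def split_bayer_rggb(mosaic):
    """Split a mosaic (list of rows) into four colour planes.

    Assumes an RGGB pattern: R at even row/even column, B at odd row/odd column.
    """
    offsets = {"R": (0, 0), "G1": (0, 1), "G2": (1, 0), "B": (1, 1)}
    return {name: [row[dx::2] for row in mosaic[dy::2]]
            for name, (dy, dx) in offsets.items()}

def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    ref = [v for row in reference for v in row]
    tst = [v for row in test for v in row]
    mse = sum((a - b) ** 2 for a, b in zip(ref, tst)) / len(ref)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)
```

For the comparison described above, `psnr` would be applied to the demosaicked RGB output of each architecture against a common reference.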
| References |
[1] N. Zhang and X. Wu, "Lossless Compression of Color Mosaic Images,"
IEEE Transactions on Image Processing, vol. 15, no. 6, pp. 1379-1388,
June 2006. |
| |
|
| Topic 5 |
Lossless Compression of Hyperspectral Images with Random Access Support |
| Contact |
Pierpaolo Baccichet |
| Proposal |
Spectral image sensors acquire images of surfaces in several hundred
spectral bands. In [1], a simple lossless
compression scheme is proposed that jointly encodes all bands to ensure
high compression gains. The algorithm encodes a first band using the
JPEG-LS standard and then subsequently encodes the remaining bands,
exploiting previously processed bands for prediction. Although this
solution achieves very good compression performance, it might require
the decoding of the complete file to access individual bands.
For many interactive applications, random access to individual
bands and/or regions of the image is required. Thus, the signal
should be encoded such that the desired content can be extracted
quickly without too much overhead.
Students should develop and evaluate a lossless compression scheme to
support region and band extraction, without requiring the complete file
to be decoded, and evaluate the trade-offs in terms of overall file size
and computation for decoding. A hyperspectral image is available for
testing purposes.
|
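The trade-off between prediction gain and random access can be illustrated with a toy scheme in which bands are split into groups: the first band of each group is stored independently and each remaining band is stored as a difference from the previous one. This is only a stand-in for the context-based prediction of [1] and for JPEG-LS; the grouping rule and the names `encode_residuals` and `decode_band` are assumptions made for this sketch.

```python
def encode_residuals(bands, group_size):
    """Store the first band of each group as is, the others as band-to-band differences.

    bands: list of bands, each a flat list of integer samples.
    """
    coded = []
    for index, band in enumerate(bands):
        if index % group_size == 0:
            coded.append(list(band))  # independently decodable anchor band
        else:
            coded.append([c - p for c, p in zip(band, bands[index - 1])])
    return coded

def decode_band(coded, target, group_size):
    """Losslessly recover one band; also return how many bands had to be decoded."""
    anchor = target - target % group_size
    band = list(coded[anchor])
    for index in range(anchor + 1, target + 1):
        band = [p + r for p, r in zip(band, coded[index])]
    return band, target - anchor + 1
```

Smaller groups reduce the number of bands touched per access but store more anchor bands without inter-band prediction, which is the file-size versus decoding-effort trade-off to be evaluated.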
| References |
[1] H. Wang, S.D. Babacan and K. Sayood, "Lossless Hyperspectral Image Compression Using Context-based Conditional Averages," Proc. Data Compression Conference, Snowbird UT, March 2005. |
| |
|
| Topic 6 |
Fractal Image Coding |
| Contact |
David Varodayan |
| Proposal |
Fractal image coding [1, 2] is very different from conventional methods;
the image is represented as a contractive function that maps the image to a very similar version of itself. The idea is that the image is near a fixed point of the function. Thus, the decoder can begin with any image and iteratively apply the
function to recover the fixed point image. Fractal image coding
exploits self-similarity of the image at different resolutions.
In this project, students should implement a fractal image coder and compare its rate-distortion performance to JPEG and JPEG-2000.
|
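The fixed-point idea can be seen in a one-dimensional toy coder. Each block of the signal is approximated by a shrunken, scaled and offset copy of a larger block of the same signal; the decoder starts from an arbitrary signal and applies the stored maps repeatedly. The block sizes, the clamp on the scale factor (which keeps the map contractive) and all function names are assumptions of this sketch, not details taken from [1] or [2].

```python
def shrink(block):
    """Average neighbouring pairs, halving the block length."""
    return [(block[i] + block[i + 1]) / 2 for i in range(0, len(block), 2)]

def fit(domain, target, max_scale=0.75):
    """Least-squares scale and offset mapping domain onto target (scale clamped)."""
    n = len(target)
    mean_d, mean_t = sum(domain) / n, sum(target) / n
    var_d = sum((d - mean_d) ** 2 for d in domain)
    cov = sum((d - mean_d) * (t - mean_t) for d, t in zip(domain, target))
    scale = 0.0 if var_d == 0 else max(-max_scale, min(max_scale, cov / var_d))
    offset = mean_t - scale * mean_d
    error = sum((scale * d + offset - t) ** 2 for d, t in zip(domain, target))
    return scale, offset, error

def encode(signal, block=4):
    """For each range block, keep the best (domain start, scale, offset) triple."""
    code = []
    for start in range(0, len(signal), block):
        target = signal[start:start + block]
        best = None
        for dom_start in range(0, len(signal) - 2 * block + 1, block):
            domain = shrink(signal[dom_start:dom_start + 2 * block])
            scale, offset, error = fit(domain, target)
            if best is None or error < best[0]:
                best = (error, dom_start, scale, offset)
        code.append(best[1:])
    return code

def decode(code, length, block=4, iterations=40, start_value=0.0):
    """Iterate the stored maps from a constant starting signal."""
    signal = [start_value] * length
    for _ in range(iterations):
        previous = signal
        signal = []
        for dom_start, scale, offset in code:
            domain = shrink(previous[dom_start:dom_start + 2 * block])
            signal.extend(scale * d + offset for d in domain)
    return signal
```

Because every scale factor has magnitude at most 0.75, decoding from two different starting signals converges to the same result, which is the property the decoder relies on.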
| References |
[1] A. E. Jacquin, "A novel fractal block-coding technique for digital
images," International Conference on Acoustics, Speech, and Signal Processing, 1990.
[2] A. E. Jacquin, "Fractal image coding: a review," Proceedings of
the IEEE, vol. 81, no. 10, pp. 1451-1465, October 1993. |
| |
|
| Topic 7 |
Joint Compression of Similar Photos |
| Contact |
Anne Aaron |
| Proposal |
With the proliferation of digital cameras, people often collect and
store a large number of similar photos of the same scene or event (for
example, very similar photos of a landscape or photos with the same
background). These images exhibit high redundancy among each other which
can be exploited by a compression algorithm specifically targeted for
similar photo files. Such an algorithm could be used to back up a
personal photo collection or a large online photo sharing site.
The goal of this project is to design an algorithm to jointly compress a
set of similar images. The algorithm should include image registration,
arrangements of views in an order most amenable to compression, and
prediction or transform coding across the images. Related work for lightfield compression [1,2] can serve as a starting point. The
algorithm should be compared with individual compression of the images using JPEG and JPEG-2000.
|
| References |
[1] M. Magnor and B. Girod, "Data Compression in Image-Based Rendering," IEEE Transactions on Circuits and Systems for Video Technology, vol. 10,
no. 3, pp. 338-343, April 2000.
[2] C.-L. Chang, X. Zhu, P. Ramanathan, and B. Girod, "Light Field Compression Using Disparity-Compensated Lifting and Shape Adaptation,"
IEEE Trans. Image Processing, vol. 15, no. 4, pp. 793-806, April 2006.
|
| |
|
IMAGE COMMUNICATION |
| |
|
| Topic 8 |
Image Transcoding for Web Browsing on Mobile Devices |
| Contact |
Xiaoqing Zhu |
| Proposal |
Web browsing on mobile devices is challenging due to
the limited bitrate of the wireless connection, as well
as the small display size. The bulk of the information
for Web browsing consists of images, typically encoded
as JPEG or GIF. These images can be transcoded by a gateway
to the wireless network into lower spatial resolution and
smaller file size to improve the overall user experience.
Target rate and file format (e.g. GIF or JPEG) of the
transcoded images need to be determined on-the-fly, depending
on the characteristics of the original image (e.g., graphics
vs photo), as well as current channel rate. One example
of such an image transcoding system is described in [1].
In this project, you will design and investigate a fast
algorithm for automatically transcoding JPEG and GIF images
in Web pages. The scheme should select the resolution, target
encoding rate and appropriate file format of the transcoded
images. Performance of the proposed algorithm should be evaluated
in terms of reconstructed quality and download time of the
transcoded image files. Note that download time is increased by
the processing required for transcoding. This effect should be
analyzed and taken into account for the automatic selection of
the best transcoding.
|
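One possible way to frame the selection is as a small decision rule: each transcoding option has a file size, an estimated quality and a processing time, and the gateway picks the option with the smallest total delay (processing plus transmission) among those that meet a quality floor. The sketch below assumes that these quantities are already known or estimated; the `Candidate` fields, the quality floor and the function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Candidate:
    label: str                # e.g. "original" or "half-resolution JPEG"
    size_bytes: int           # size of the file sent over the wireless link
    quality_db: float         # estimated quality, e.g. PSNR against the original
    transcode_seconds: float  # processing time at the gateway (0 for the original)

def total_delay(candidate, channel_bps):
    """Processing time plus transmission time over a channel of channel_bps bits/s."""
    return candidate.transcode_seconds + 8 * candidate.size_bytes / channel_bps

def choose(candidates, channel_bps, min_quality_db):
    """Pick the fastest option that meets the quality floor (or the fastest overall)."""
    acceptable = [c for c in candidates if c.quality_db >= min_quality_db]
    pool = acceptable or candidates
    return min(pool, key=lambda c: total_delay(c, channel_bps))
```

On a fast channel the processing time dominates and sending the original can win, which is the effect the proposal asks you to analyze.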
| References |
[1] R. Han, P. Bhagwat, R. LaMaire, T. Mummert, V. Perret and J. Rubas, "Dynamic adaptation in an image transcoding proxy for mobile Web browsing," IEEE Personal Communications, vol. 5, no. 6, pp. 8-17, Dec. 1998. |
| |
|
| Topic 9 |
Interactive Image Browsing with JPEG2000/JPIP |
| Contact |
Chuo-Ling Chang and Aditya Mavlankar |
| Proposal |
In this project, we consider interactive browsing of large aerial/satellite images. The user at the client terminal requests
image data from the server. The user may, for instance, zoom in and
move around to view details. The requested image region, at a requested spatial resolution and quality, is transmitted from the
server to the client. In most existing systems, such as Google Maps, a
large image is partitioned into independent image tiles. For each
image tile, different versions, corresponding to different spatial
resolutions and image qualities, are stored as independent files at
the server.
JPIP is a client-server protocol for efficient interactive browsing of
JPEG2000 encoded images over networked environments [1,2]. Using
JPIP, together with the support for resolution and quality scalability by JPEG2000, the data already available at the client can be progressively refined to higher resolution and/or higher quality in an
optimized manner, rather than being replaced entirely as in the
existing systems. In this project, the performance of the JPEG2000/JPIP
system will be compared with that of a Google-Maps-like system. In
particular, you will compare the compression efficiency of the two
approaches. You may assume, for simplicity, that the network has zero
delay and zero packet loss.
|
| References |
[1] D. Taubman and R. Prandolini, "Architecture, Philosophy and
Performance of JPIP: Internet Protocol Standard for JPEG2000," Proc.
Visual Communications and Image Processing, 2003.
[2] C++ implementation of JPEG2000 and JPIP. Available at http://www.kakadusoftware.com |
| |
|
| Topic 10 |
Estimating the Fidelity of Analog TV using Digital Side Information |
| Contact |
Yao-Chung Lin and David Varodayan |
| Proposal |
Analog television broadcast suffers from additive noise and multipath effects (which create ghosting artifacts). A viewer could perform blind quality assessment of a frame of video without reference to the original. This could be done subjectively, but objective metrics include edge sharpness, random noise level and structural noise level [1]. If the broadcaster provides the viewer with additional digital side information (of a few bytes per frame), fidelity estimation becomes possible; that is, how similar a frame is to the original (usually measured as peak signal-to-noise ratio).
For this project, students should make several suggestions for the low-rate digital side information; for example, a small subset of pixels or transform coefficients. The data could be selected from each frame in a fixed manner, or randomly, or in a content-dependent way. The data should be compressed conventionally or using distributed source codes [2,3]. For each candidate for the digital side information, the trade off between error in fidelity estimation and rate should be explored. The candidates should be compared in terms of average performance as well as robustness. |
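As one concrete instance of the "small subset of pixels" suggestion, the sketch below has the broadcaster send the values of a few pseudo-randomly chosen pixels, and the viewer estimate PSNR from the received frame at the same positions. Sharing the positions through a common seed, sending the values uncompressed, and the function names are all assumptions made for illustration.

```python
import math
import random

def sample_positions(num_pixels, count, seed):
    """Pseudo-random pixel indices, reproducible from a seed shared by both sides."""
    return random.Random(seed).sample(range(num_pixels), count)

def side_information(original, count, seed):
    """Broadcaster side: original pixel values at the sampled positions."""
    return [original[i] for i in sample_positions(len(original), count, seed)]

def estimate_psnr(received, side_info, seed, peak=255.0):
    """Viewer side: PSNR estimate from the sampled positions only."""
    positions = sample_positions(len(received), len(side_info), seed)
    mse = sum((received[i] - v) ** 2 for i, v in zip(positions, side_info)) / len(side_info)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)
```

The estimate is exact when the error is the same at every pixel and becomes noisier as fewer samples are sent, which is the rate versus estimation-error trade-off to be explored.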
| References |
[1] X. Li, "Blind image quality assessment," Proc. International Conference on Image Processing, Rochester, NY, September 2002.
[2] D. Varodayan, A. Aaron and B. Girod, "Rate-adaptive codes for distributed source coding," EURASIP Signal Processing Journal, Special Section on Distributed Source Coding, vol. 86, no. 11, pp. 3123-3130, November 2006.
[3] Matlab/C implementation of distributed source codes. Available at http://www.stanford.edu/~divad/software.html
|
| |
|
| Topic 11 |
Film Grain Noise Removal and Synthesis for JPEG2000 Image Coding |
| Contact |
Aditya Mavlankar and Bernd Girod |
| Proposal |
For digital cinema applications, each frame is independently
encoded at very high quality using JPEG2000. The quality
requirement is such that the grain of the film material used is reproduced. Exact encoding and reproduction of a
noise-like signal is not efficient and requires many bits [1].
Recently, techniques have been proposed to remove the film grain
noise at the encoder and re-synthesize visually indistinguishable film grain noise at the decoder, e.g., [2]. This allows a substantial reduction in bit-rate. Building on the prior work,
students will develop their own algorithm to remove, model, and
re-synthesize film grain noise and explore the bit-rate savings
possible for JPEG2000 image coding. The noise removal and
synthesis algorithm should take the correlation of the film
noise across color channels into account. As test images, scanned
color photographs can be used.
|
| References |
[1] O. K. Al-Shaykh and R. M. Mersereau, "Lossy compression of noisy
images," IEEE Transactions on Image Processing, vol. 7, no. 12, pp.
1641-1642, Dec. 1998.
[2] B. T. Oh, C.-C. Jay Kuo, S. Sun, S. Lei, "Film Grain
Noise Modeling in Advanced Video Coding," Proc. Visual Communications
and Image Processing, VCIP-2007, San Jose, CA, SPIE vol. 6508,
January 2007.
|
| |
|
IMAGE CODER OPTIMIZATION |
| |
|
| Topic 12 |
Vector Quantization for Optimal Color Palette Selection |
| Contact |
David Rebollo-Monedero |
| Proposal |
The Graphics Interchange Format (GIF) image compression standard [1]
uses a palette limited to 256 distinct colors from the 24-bit RGB
color space. Pixel colors are mapped to palette values and
then compressed losslessly. The purpose of this project is to develop an algorithm to choose an appropriate palette for a given image; that
is, to quantize pixel colors efficiently in terms of the color distortion introduced and the resulting compression rate.
We would like to investigate color palette selection methods,
including entropy constrained vector quantization with the Lloyd
Algorithm [2] and an application of quantization for distributed
source coding [3]. Rate-distortion performance should be compared to
the method followed by the GIF standard. Technical details will be
available from the contact listed above.
|
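A minimal sketch of palette design with the Lloyd iteration follows, with an optional Lagrangian rate term in the spirit of entropy-constrained vector quantization: each pixel is assigned to the palette entry minimizing squared error plus `lam` times an ideal codeword length, and entries are then moved to the centroid of their pixels. With `lam = 0` this reduces to the plain Lloyd algorithm. The function name, the initial palette argument and the fixed iteration count are assumptions of this sketch, and it is not claimed to match the exact procedure in [2].

```python
import math

def lloyd_palette(pixels, palette, lam=0.0, iterations=20):
    """Refine a colour palette for a list of (r, g, b) pixels.

    Returns the refined palette and the fraction of pixels using each entry.
    Entries that lose all their pixels get probability 0 and are never used again.
    """
    palette = [tuple(float(v) for v in colour) for colour in palette]
    probs = [1.0 / len(palette)] * len(palette)
    for _ in range(iterations):
        # Assignment step: squared error plus lam times the ideal code length.
        clusters = [[] for _ in palette]
        for pixel in pixels:
            costs = [
                sum((a - b) ** 2 for a, b in zip(pixel, colour)) - lam * math.log2(q)
                if q > 0 else math.inf
                for colour, q in zip(palette, probs)
            ]
            clusters[costs.index(min(costs))].append(pixel)
        # Update step: move each used entry to the centroid of its pixels.
        for k, members in enumerate(clusters):
            probs[k] = len(members) / len(pixels)
            if members:
                palette[k] = tuple(sum(ch) / len(members) for ch in zip(*members))
    return palette, probs
```

Increasing `lam` trades color distortion for a more skewed (cheaper to code) distribution of palette indices.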
| References |
[2] P. A. Chou, T. Lookabaugh, and R. M. Gray, "Entropy constrained
vector quantization," IEEE Transactions on Acoustics, Speech, and
Signal Processing, Vol. 37, pp. 31-42, January 1989.
[3] D. Rebollo-Monedero and B. Girod, "Design of optimal quantizers
for distributed coding of noisy sources," in Proc. IEEE Int. Conf.
Acoust., Speech, Signal Processing (ICASSP), Philadelphia, PA, Mar.
2005. See other publications by the same author on high-rate
quantization and transforms for distributed source coding, available
at http://www.stanford.edu/~drebollo/publications.htm.
|
| |
|
| Topic 13 |
Tuning Baseline JPEG Encoding for Individual Images |
| Contact |
David Varodayan |
| Proposal |
The baseline JPEG standard [1] for image compression gives flexibility
to the encoder to choose quantization matrices and Huffman tables for
each image to be encoded. But in practice, most encoder
implementations forgo this opportunity and use fixed values. Clearly,
coding efficiency could be improved by tuning quantization matrices
and Huffman tables for individual images. An iterative algorithm for
the joint optimization is presented in [2].
For this project, students should implement the joint optimization
algorithm and compare the rate-distortion performance of tuned
baseline JPEG to untuned baseline JPEG as well as JPEG2000. Students
should also investigate how image properties (such as resolution and
content) influence the performance gain. |
| References |
[1] G. K. Wallace, "The JPEG still picture compression standard,"
Communications of the ACM, vol. 34, no. 4, April 1991.
[2] M. Crouse and K. Ramchandran, "Joint thresholding and quantizer
selection for transform image coding: entropy-constrained analysis and
applications to baseline JPEG", IEEE Transactions on Image Processing,
vol. 6, no. 2, February 1997. |
| |
|
| Topic 14 |
Improving Laplacian Pyramid Image Coding |
| Contact |
Aditya Mavlankar and Markus Flierl |
| Proposal |
The Laplacian pyramid [1] permits efficient compression of an image
through an overcomplete multi-resolution decomposition. The relationship
between the transform coefficients at different resolutions is
hierarchical; reconstructing an image at a particular resolution
requires the corresponding coefficients as well as the coefficients for
all lower resolutions. Conditions for reconstructing these samples with
minimum mean squared error are presented in [2]. This project will
explore practical ways of accomplishing this optimal or near-optimal synthesis for both the open-loop and the closed-loop modes.
As a starting point, the contact listed above will describe to you a
novel structure that incorporates quantization noise processing into the Laplacian pyramid decomposition. As part of the project, you will analyze
theoretically the propagation of quantization noise to the reconstructed
image. This insight might be used for further improvements in
encoder/decoder design. You might optimize the system with respect to various
criteria, such as compression performance, spatial scalability and
computational burden.
|
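The difference between the open-loop and closed-loop modes can be seen in a two-level, one-dimensional toy pyramid. In open-loop mode the detail signal is computed from the unquantized coarse level, so coarse quantization error passes straight into the reconstruction; in closed-loop mode the detail is computed from the quantized coarse level and therefore absorbs that error. The averaging and sample-repetition filters and the uniform quantizer are simplifications chosen for this sketch; it does not implement the reconstruction of [2] or the novel structure mentioned above.

```python
def downsample(signal):
    """Coarse level: average of each pair of samples."""
    return [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]

def upsample(coarse):
    """Prediction of the fine level: repeat every coarse sample twice."""
    return [value for value in coarse for _ in range(2)]

def quantize(values, step):
    """Uniform quantizer followed by reconstruction."""
    return [step * round(value / step) for value in values]

def encode(signal, coarse_step, detail_step, closed_loop):
    """Return the quantized coarse level and the quantized detail level."""
    coarse_q = quantize(downsample(signal), coarse_step)
    reference = coarse_q if closed_loop else downsample(signal)
    detail = [s - p for s, p in zip(signal, upsample(reference))]
    return coarse_q, quantize(detail, detail_step)

def decode(coarse_q, detail_q):
    """Standard synthesis: upsampled coarse level plus detail."""
    return [p + d for p, d in zip(upsample(coarse_q), detail_q)]
```

For the signal `[10, 12, 3, 7]` with a coarse step of 8 and a detail step of 1, closed-loop decoding returns the signal exactly, while open-loop decoding is off by 3 at every sample.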
| References |
[1] P. J. Burt and E. H. Adelson, "The Laplacian Pyramid as a compact image code," IEEE Transactions on Communications, vol. 31, pp. 532–540, Apr. 1983.
[2] Minh N. Do and Martin Vetterli, "Framing Pyramids," IEEE Transactions on Signal Processing, vol. 51, no. 9, pp. 2329-2342, Sept. 2003. |
|