EE398A - Image Communication I

Project List

Authors | Topic | Contact
Min-Wook Jeong, Young Min Kim, Jungsuk Kwac | Image Compression using Inpainting | Shantanu
Stephanie Kwan, Karen Zhu | Compression of Color Mosaic Images | Markus and Joyce
Wei Hsu, Maya Khaneboubi, Jonathan Solnit | Lossless Compression of Hyperspectral Images with Random Access Support | Pierpaolo
Po-Kai Chen, Liz Li | Fractal Image Coding | David V
Vijay Chandrasekhar, Louis Chen, Gabriel Takacs | Joint Compression of Similar Photos | Anne
Pradeep Dunna, Deepan Gandhi, Rohit Watve | Image Transcoding for Web Browsing on Mobile Devices | Xiaoqing
David Lau, Shaan Patel, Shahar Yuval | Interactive Image Browsing with JPEG2000/JPIP | Chuo-Ling and Aditya
Eric Lin, Hyungsik Shin | Film Grain Noise Removal and Synthesis for JPEG2000 Image Coding | Aditya and Prof Girod
Chien-An Lai, George Yu | Tuning Baseline JPEG Encoding for Individual Images | David V
David Chen, Sameh Zakhary | Improving Laplacian Pyramid Image Coding | Aditya and Markus

 

Course Project Topics

Each group should either select two of the project topics (as 1st and 2nd choices) listed below OR submit their own topic and proposal by February 8. Please send a single email informing the General TA of the members of your group and your project selections.

Groups usually consist of 2 or 3 students. We might combine groups of 2 students and individuals interested in the same topic at our discretion. Groups with more than 3 students are not encouraged.

Please address your questions pertaining to specific topics to the indicated contacts, who are members of the Image, Video and Multimedia Systems Group. You may contact them for advice before and after making your selections. They may provide you with further references and code. Most papers are available at the IEEE Xplore web site, CiteSeer or Google Scholar.

IMAGE COMPRESSION
   
Topic 1 Image Compression using Inpainting
Contact Shantanu Rane
Proposal

Image inpainting [1] involves filling in missing portions of an image. The goal of an inpainting algorithm is to smoothly propagate colors and edges into the region to be inpainted. An inpainting algorithm [1,2] based on partial differential equations has been applied for error concealment. Inpainting can also be applied at the decoder to fill in regions that have intentionally been removed at the encoder [3]. Thus, if a large portion of an image can be inpainted at the decoder with a very small degradation in picture quality, then it is profitable (from a compression point of view) to skip that portion of the image. As proposed in [2,3], edge detection can be used to find regions that can be inpainted with a small distortion, and these regions are removed from the image to be compressed. The remaining image is then compressed using a standardized compression algorithm such as JPEG2000. At the decoder, the JPEG2000 bit stream is first decoded to obtain an image with holes. These holes are then inpainted to give the final decoded image.
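The hole-filling step at the decoder can be illustrated with a minimal sketch. It is not the PDE-based method of [1]; it assumes a much simpler diffusion rule in which every removed pixel is repeatedly replaced by the average of its four neighbours. The function name and the list-of-rows image representation are hypothetical choices made for illustration.

```python
def inpaint_diffusion(image, mask, iterations=500):
    """Fill masked pixels by repeatedly averaging their 4-neighbours.

    image: list of rows of numbers (the decoded image with holes).
    mask:  same shape, True where a pixel was removed at the encoder.
    Known pixels are never modified.
    """
    h, w = len(image), len(image[0])
    known = [image[r][c] for r in range(h) for c in range(w) if not mask[r][c]]
    start = sum(known) / len(known)  # initial guess for every hole pixel
    out = [[start if mask[r][c] else float(image[r][c]) for c in range(w)]
           for r in range(h)]
    for _ in range(iterations):
        nxt = [row[:] for row in out]
        for r in range(h):
            for c in range(w):
                if not mask[r][c]:
                    continue
                nbrs = [out[rr][cc]
                        for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                        if 0 <= rr < h and 0 <= cc < w]
                nxt[r][c] = sum(nbrs) / len(nbrs)
        out = nxt
    return out
```

A rule of this kind recovers smooth gradients well but blurs edges and texture, which is why the project asks for separate treatment of structure and texture regions.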

Students are expected to implement and optimize an inpainting algorithm in concert with a standard compression algorithm and investigate its rate-distortion performance in comparison with JPEG and JPEG-2000. The decoder should be able to automatically identify regions containing structure (predominantly edges) and texture, and employ suitable inpainting algorithms for both. In addition to rate-distortion performance, students should comment on the compression artifacts resulting from this approach.

References

[1] M. Bertalmio, G. Sapiro, V. Caselles and C. Ballester, "Image Inpainting," SIGGRAPH 2000, pp. 417-422.

[2] S. Rane, M. Bertalmio and G. Sapiro, "Structure and Texture Filling-in of missing image blocks for Wireless Transmission and for Compression Applications," IEEE Trans. Image Processing, Vol. 12, No. 3, pp. 296-303, March 2003.

[3] Chen Wang, Xiaoyan Sun, Feng Wu and Hongkai Xiong, "Image compression with structure-aware inpainting," IEEE International Symposium on Circuits and Systems 2006, May 2006.

   
Topic 2 Shape-adaptive Transforms for Coding Arbitrarily Shaped Objects
Contact Aditya Mavlankar and Chuo-Ling Chang
Proposal

For certain applications, e.g., image compositing, it is advantageous to code objects in an image separately. This requires a compression algorithm which can handle arbitrarily shaped regions. To apply conventional techniques, one could use zero padding beyond the object boundaries to obtain a rectangle of pixels which can be coded. Alternatively, one might want to employ a shape-adaptive algorithm, for example, using the SA-DCT proposed in [1]. There are also other techniques, some of which are mentioned in [2,3].
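The shape-adaptive idea can be sketched as follows: object pixels in each column are shifted to the top and transformed with a DCT whose length equals the number of object pixels, and the resulting coefficients are then shifted left and transformed row by row. This sketch assumes an orthonormal DCT-II scaling, which may differ from the normalization used in [1], and the function names are hypothetical.

```python
import math

def dct(vec):
    """Orthonormal DCT-II of a list of any length n >= 1."""
    n = len(vec)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * sum(v * math.cos(math.pi * (i + 0.5) * k / n)
                               for i, v in enumerate(vec)))
    return out

def sa_dct(block, mask):
    """Shape-adaptive transform of the pixels where mask is True.

    block, mask: square lists of rows of equal size.
    Returns a ragged list of coefficient rows; the total number of
    coefficients equals the number of object pixels.
    """
    size = len(block)
    # Step 1: per column, gather the object pixels (shift to top) and transform.
    cols = []
    for c in range(size):
        pixels = [block[r][c] for r in range(size) if mask[r][c]]
        cols.append(dct(pixels) if pixels else [])
    # Step 2: per row, gather the available coefficients (shift left) and transform.
    rows = []
    for r in range(size):
        coeffs = [col[r] for col in cols if len(col) > r]
        if coeffs:
            rows.append(dct(coeffs))
    return rows
```

In contrast to zero padding, no coefficients are spent on pixels outside the object, at the cost of transform lengths that vary with the shape.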

As part of the project, you should investigate how compression efficiency is affected by coding more and more objects in an image independently of the rest of the image. You will present the pros and cons of some methods as well as suggest possible improvements. The suggested improvements may cover the gamut of transforms, quantization and entropy coding. It should be noted that the focus of the project is NOT the segmentation algorithm used for deciding object boundaries. An algorithm with human feedback can be quickly devised at the beginning to obtain the input data.

References

[1] T. Sikora and B. Makai, “Shape-adaptive DCT for generic coding of video,” IEEE Transactions on Circuits and Systems for Video Technology (CSVT), vol. 5, no. 1, pp. 59-62, Feb. 1995.

[2] G. Shen, B. Zeng and M. L. Liou, “An efficient hybrid arbitrarily shaped object coding technique,” IEEE International Symposium on Circuits and Systems (ISCAS), Geneva, Switzerland, May 2000.

[3] C.-L. Chang, X. Zhu, P. Ramanathan and B. Girod, "Light Field Compression Using Disparity-Compensated Lifting and Shape Adaptation," IEEE Trans. Image Processing, vol. 15, no. 4, pp. 793 – 806, April 2006.

   
Topic 3 Efficient Compression of MCOT Low-Band Images
Contact Markus Flierl
Proposal

Recently, motion-compensated orthogonal transforms (MCOT) have been introduced for video compression [1]. These transforms achieve orthonormal signal decompositions for arbitrary motion fields between temporally successive frames. The temporal low-band of the MCOT looks like a conventional image, except that the intensity is scaled by a (known) factor that can change from pixel to pixel. Therefore, efficient spatial compression of these low-band images is challenging.

The simplest coding approach is to divide each pixel value by its scale factor and use standard image coding techniques like JPEG 2000 to compress the unscaled image. But this does not encode the low-band image in the original orthonormal subspace. In [1], Section 3, we have proposed a Haar-like adaptive spatial transform that considers these scale factors.

The goal of this project is to study adaptive block transforms and wavelet transforms for efficient subband coding of MCOT low-band images. The above-mentioned approaches shall be investigated and compared. Further, novel techniques with superior compression performance shall be proposed. No video processing is required. We will provide some sample low-band images with scale factor matrices for experimentation.

References

[1] M. Flierl and B. Girod, "A Motion-Compensated Orthogonal Transform with Energy Concentration Constraint," Proc. of the IEEE Workshop on Multimedia Signal Processing, Victoria, Canada, Oct. 2006. Available at http://www.stanford.edu/~mflierl/Publications/flierl06-MMSP.pdf

   
Topic 4 Compression of Color Mosaic Images
Contact Markus Flierl and Joyce Farrell
Proposal

Most digital cameras use only one image sensor with a color filter array that yields interleaved sensor elements for red, green, and blue. A color demosaicking algorithm generates the RGB image, which is subsequently compressed, typically by baseline JPEG. An alternative approach, which compresses the raw color mosaic image directly, is promising; it can reduce processing in the camera, while simultaneously lowering the required bit-rate and improving the ultimate picture quality. It is also attractive for archiving of professional-quality photos.
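One simple way to handle the interleaved samples before compression is to separate the mosaic into its color planes, as in the sketch below. It assumes an RGGB Bayer layout with even image dimensions, which is an assumption rather than something the proposal specifies, and the function names are hypothetical.

```python
def split_bayer(mosaic):
    """Split an RGGB Bayer mosaic (list of rows) into four half-resolution planes."""
    planes = {"R": [], "G1": [], "G2": [], "B": []}
    for r in range(0, len(mosaic), 2):
        top, bottom = mosaic[r], mosaic[r + 1]
        planes["R"].append(top[0::2])      # even row, even column
        planes["G1"].append(top[1::2])     # even row, odd column
        planes["G2"].append(bottom[0::2])  # odd row, even column
        planes["B"].append(bottom[1::2])   # odd row, odd column
    return planes

def merge_bayer(planes):
    """Inverse of split_bayer: re-interleave the four planes into one mosaic."""
    mosaic = []
    for r_row, g1_row, g2_row, b_row in zip(planes["R"], planes["G1"],
                                            planes["G2"], planes["B"]):
        mosaic.append([v for pair in zip(r_row, g1_row) for v in pair])
        mosaic.append([v for pair in zip(g2_row, b_row) for v in pair])
    return mosaic
```

The split is exactly invertible, so it adds no distortion by itself; any loss comes from the coder applied to the planes.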

Excellent results were reported recently for direct lossless compression of color mosaic images using discrete wavelet transforms [1]. In this project, students should select a scheme from [1] as a starting point and investigate its extension to lossy compression. Quality can be evaluated by measuring PSNR of RGB images reconstructed by a simple demosaicking algorithm and shall be compared with the conventional architecture (demosaicking first - then compression). Color mosaic images for experimentation can be provided.

References

[1] N. Zhang and X. Wu, "Lossless Compression of Color Mosaic Images," IEEE Transactions on Image Processing, vol. 15, no. 6, pp. 1379 - 1388, June 2006.

   
Topic 5 Lossless Compression of Hyperspectral Images with Random Access Support
Contact Pierpaolo Baccichet
Proposal

Hyperspectral image sensors acquire images of surfaces in several hundred spectral bands. In [1], a simple lossless compression scheme is proposed that jointly encodes all bands to achieve high compression gains. The algorithm encodes the first band using the JPEG-LS standard and then encodes the remaining bands in sequence, exploiting previously processed bands for prediction. Although this solution achieves very good compression performance, it might require decoding the complete file to access individual bands.
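The dependency between bands can be made concrete with a small sketch. It assumes the crudest possible predictor, namely that each band is predicted by the co-located sample of the previous band, as a stand-in for the context-based conditional average predictor of [1]. The first band is kept raw here instead of being JPEG-LS coded, and the function names are hypothetical.

```python
def encode_bands(bands):
    """bands: list of equally sized flat lists of ints.

    Returns [first band, residual of band 1, residual of band 2, ...],
    where each residual is taken against the previous band.
    """
    coded = [list(bands[0])]
    for prev, cur in zip(bands, bands[1:]):
        coded.append([c - p for c, p in zip(cur, prev)])
    return coded

def decode_band(coded, k):
    """Reconstruct band k; all residuals 1..k must be processed to get there."""
    band = list(coded[0])
    for residual in coded[1:k + 1]:
        band = [b + e for b, e in zip(band, residual)]
    return band
```

Reading band k touches k + 1 coded bands in this chain, so the cost of access grows with the band index.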

For many interactive applications, random access to individual bands and/or regions of the image is required. Thus, the signal should be encoded such that the desired content can be extracted quickly without too much overhead.

Students should develop and evaluate a lossless compression scheme to support region and band extraction, without requiring the complete file to be decoded, and evaluate the trade-offs in terms of overall file size and computation for decoding. A hyperspectral image is available for testing purposes.

References

[1] H. Wang, S.D. Babacan and K. Sayood, "Lossless Hyperspectral Image Compression Using Context-based Conditional Averages," Proc. Data Compression Conference, Snowbird UT, March 2005.

   
Topic 6 Fractal Image Coding
Contact David Varodayan
Proposal

Fractal image coding [1, 2] is very different from conventional methods: the image is represented as a contractive function that maps the image to a very similar version of itself. The idea is that the image is near a fixed point of the function. Thus, the decoder can begin with any image and iteratively apply the function to recover the fixed-point image. Fractal image coding exploits self-similarity of the image at different resolutions.
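The decoding principle can be sketched in one dimension. The sketch assumes a toy code format in which each range block of 2 samples is described by a triple (domain start, scale, offset), the 4-sample domain block is shrunk by averaging pairs, and every scale has magnitude below 1 so that the mapping is contractive. The format and the function name are hypothetical and much simpler than the block coder of [1].

```python
def decode_fractal(code, start, iterations=30):
    """Iterate a contractive block mapping from an arbitrary start signal.

    code:  one (domain_start, scale, offset) triple per range block of
           2 samples, with abs(scale) < 1; the signal has 2 * len(code) samples.
    start: any initial signal of that length.
    """
    signal = list(start)
    for _ in range(iterations):
        nxt = []
        for domain_start, scale, offset in code:
            d = signal[domain_start:domain_start + 4]
            shrunk = [(d[0] + d[1]) / 2, (d[2] + d[3]) / 2]  # halve the resolution
            nxt.extend(scale * v + offset for v in shrunk)
        signal = nxt
    return signal
```

Starting from an all-zero signal or from an all-255 signal gives practically the same result after a few dozen iterations, because the distance between the two iterates shrinks by at least the largest scale magnitude in every pass.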

In this project, students should implement a fractal image coder and compare its rate-distortion performance to JPEG and JPEG-2000.

References

[1] A. E. Jacquin, "A novel fractal block-coding technique for digital images," International Conference on Acoustics, Speech, and Signal Processing, 1990.

[2] A. E. Jacquin, "Fractal image coding: a review," Proceedings of the IEEE,  vol. 81, no. 10, pp. 1451-1465, October 1993.

   
Topic 7 Joint Compression of Similar Photos
Contact Anne Aaron
Proposal

With the proliferation of digital cameras, people often collect and store a large number of similar photos of the same scene or event (for example, very similar photos of a landscape or photos with the same background). These images exhibit high redundancy among each other, which can be exploited by a compression algorithm specifically targeted at sets of similar photos. Such an algorithm could be used to back up a personal photo collection or a large online photo sharing site.

The goal of this project is to design an algorithm to jointly compress a set of similar images. The algorithm should include image registration, arrangement of views in an order most amenable to compression, and prediction or transform coding across the images. Related work on light field compression [1,2] can serve as a starting point. The algorithm should be compared with individual compression of the images using JPEG and JPEG-2000.

References

[1] M. Magnor and B. Girod, "Data Compression in Image-Based Rendering," IEEE Transactions on Circuits and Systems for Video Technology, vol. 10, no. 3, pp. 338-343, April 2000.

[2] C.-L. Chang, X. Zhu, P. Ramanathan, and B. Girod, "Light Field Compression Using Disparity-Compensated Lifting and Shape Adaptation," IEEE Trans. Image Processing, vol. 15, no. 4, pp. 793 – 806, April 2006.

   
IMAGE COMMUNICATION
   
Topic 8 Image Transcoding for Web Browsing on Mobile Devices
Contact Xiaoqing Zhu
Proposal

Web browsing on mobile devices is challenging due to the limited bitrate of the wireless connection, as well as the small display size. The bulk of the information for Web browsing consists of images, typically encoded as JPEG or GIF. These images can be transcoded by a gateway to the wireless network into lower spatial resolution and smaller file size to improve the overall user experience. Target rate and file format (e.g., GIF or JPEG) of the transcoded images need to be determined on-the-fly, depending on the characteristics of the original image (e.g., graphics vs photo), as well as the current channel rate. One example of such an image transcoding system is described in [1].

In this project, you will design and investigate a fast algorithm for automatically transcoding JPEG and GIF images in Web pages. The scheme should select the resolution, target encoding rate and appropriate file format of the transcoded images. Performance of the proposed algorithm should be evaluated in terms of reconstructed quality and download time of the transcoded image files. Note that download time is increased by the processing required for transcoding. This effect should be analyzed and taken into account for the automatic selection of the best transcoding.
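The interplay between processing time and transmission time can be sketched as a simple selection rule. The sketch assumes that each candidate transcoding is described by a file size, a processing time and a quality score, and that the original image is included as a candidate with zero processing time. The dictionary keys, the quality threshold and the function names are hypothetical.

```python
def total_delay(size_bytes, processing_s, channel_bps):
    """Seconds until the image is available: transcoding plus transmission."""
    return processing_s + 8.0 * size_bytes / channel_bps

def pick_transcoding(candidates, channel_bps, min_quality):
    """Return the candidate with the smallest total delay that is good enough.

    candidates: list of dicts with keys "name", "size_bytes",
                "processing_s" and "quality" (e.g. PSNR in dB).
    """
    ok = [c for c in candidates if c["quality"] >= min_quality]
    return min(ok, key=lambda c: total_delay(c["size_bytes"],
                                             c["processing_s"], channel_bps))
```

With such a rule, transcoding tends to pay off on slow channels, while on fast channels the processing time can outweigh the saved transmission time.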

References

[1] R. Han, P. Bhagwat, R. LaMaire, T. Mummert, V. Perret and J. Rubas, "Dynamic adaptation in an image transcoding proxy for mobile Web Browsing," IEEE Personal Communications, vol. 5, no. 6, pp. 8-17, Dec. 1998.

   
Topic 9 Interactive Image Browsing with JPEG2000/JPIP
Contact Chuo-Ling Chang and Aditya Mavlankar
Proposal

In this project, we consider interactive browsing of large aerial/satellite images. The user at the client terminal requests image data from the server. The user may, for instance, zoom in and move around to view details. The requested image region, at a requested spatial resolution and quality, is transmitted from the server to the client. In most existing systems, such as Google Maps, a large image is partitioned into independent image tiles. For each image tile, different versions, corresponding to different spatial resolutions and image qualities, are stored as independent files at the server.
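For the tile-based baseline, the server must send every tile that overlaps the requested region. A minimal sketch of that lookup is given below, assuming square tiles of a fixed size and pixel coordinates at the requested resolution; the function name and parameters are hypothetical.

```python
def tiles_for_viewport(x, y, width, height, tile_size):
    """Return the (row, column) indices of all tiles overlapping a viewport.

    x, y:          top-left corner of the viewport in pixels.
    width, height: viewport size in pixels (both at least 1).
    tile_size:     edge length of the square tiles in pixels.
    """
    first_col, last_col = x // tile_size, (x + width - 1) // tile_size
    first_row, last_row = y // tile_size, (y + height - 1) // tile_size
    return [(r, c)
            for r in range(first_row, last_row + 1)
            for c in range(first_col, last_col + 1)]
```

Because whole tiles are transmitted, a viewport that straddles tile boundaries costs more pixels than it displays, which is one source of overhead to account for in the comparison.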

JPIP is a client-server protocol for efficient interactive browsing of JPEG2000 encoded images over networked environments [1,2]. Using JPIP, together with the support for resolution and quality scalability by JPEG2000, the data already available at the client can be progressively refined to higher resolution and/or higher quality in an optimized manner, rather than being replaced entirely as in the existing systems. In this project, the performance of the JPEG2000/JPIP system will be compared with that of a Google-Maps-like system. In particular, you will compare the compression efficiency of the two approaches. You may assume, for simplicity, that the network has zero delay and zero packet loss.

References

[1] D. Taubman and R. Prandolini, "Architecture, Philosophy and Performance of JPIP: Internet Protocol Standard for JPEG2000," Proc. Visual Communications and Image Processing, 2003.

[2] C++ implementation of JPEG2000 and JPIP. Available at http://www.kakadusoftware.com

   
Topic 10 Estimating the Fidelity of Analog TV using Digital Side Information
Contact Yao-Chung Lin and David Varodayan
Proposal

Analog television broadcast suffers from additive noise and multipath effects (which create ghosting artifacts). A viewer could perform blind quality assessment of a frame of video without reference to the original. This could be done subjectively, but objective metrics include edge sharpness, random noise level and structural noise level [1]. If the broadcaster provides the viewer with additional digital side information (of a few bytes per frame), fidelity estimation becomes possible; that is, the viewer can estimate how similar a frame is to the original (usually measured as peak signal-to-noise ratio).

For this project, students should make several suggestions for the low-rate digital side information; for example, a small subset of pixels or transform coefficients. The data could be selected from each frame in a fixed manner, or randomly, or in a content-dependent way. The data should be compressed conventionally or using distributed source codes [2,3]. For each candidate for the digital side information, the trade-off between error in fidelity estimation and rate should be explored. The candidates should be compared in terms of average performance as well as robustness.
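The pixel-subset candidate can be sketched as follows. The sketch assumes that the broadcaster sends uncompressed (position, value) pairs for a pseudo-randomly chosen set of pixels and that the viewer estimates PSNR from the squared error at those positions only. Frames are flat lists of 8-bit values, and all function names are hypothetical.

```python
import math
import random

def psnr(mse, peak=255.0):
    """Peak signal-to-noise ratio in dB for a given mean squared error."""
    return float("inf") if mse == 0 else 10.0 * math.log10(peak * peak / mse)

def make_side_info(original, count, seed):
    """Broadcaster side: pick `count` pixel positions and record their values."""
    rng = random.Random(seed)
    positions = rng.sample(range(len(original)), count)
    return [(p, original[p]) for p in positions]

def estimate_psnr(received, side_info):
    """Viewer side: estimate PSNR from the sampled positions only."""
    mse = sum((received[p] - v) ** 2 for p, v in side_info) / len(side_info)
    return psnr(mse)
```

The estimate is exact when the error is the same at every pixel and becomes noisier as the error varies across the frame, which is the estimation-error versus rate trade-off the project explores.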

References

[1] X. Li, "Blind image quality assessment," Proc. International Conference on Image Processing, Rochester, NY, September 2002. 

[2] D. Varodayan, A. Aaron and B. Girod, "Rate-adaptive codes for distributed source coding," EURASIP Signal Processing Journal, Special Section on Distributed Source Coding, vol. 86, no. 11, pp. 3123-3130, November 2006. 

[3] Matlab/C implementation of distributed source codes. Available at http://www.stanford.edu/~divad/software.html

   
Topic 11 Film Grain Noise Removal and Synthesis for JPEG2000 Image Coding
Contact Aditya Mavlankar and Bernd Girod
Proposal

For digital cinema applications, each frame is independently encoded at very high quality using JPEG2000. The quality requirement is such that the grain of the film material used is reproduced. Exact encoding and reproduction of a noise-like signal is not efficient and requires many bits [1].

Recently, techniques have been proposed to remove the film grain noise at the encoder and re-synthesize visually indistinguishable film grain noise at the decoder, e.g., [2]. This allows a substantial reduction in bit-rate. Building on the prior work, students will develop their own algorithm to remove, model, and re-synthesize film grain noise and explore the bit-rate savings possible for JPEG2000 image coding. The noise removal and synthesis algorithm should take the correlation of the film noise across color channels into account. As test images, scanned color photographs can be used.

References

[1] O. K. Al-Shaykh and R. M. Mersereau, "Lossy compression of noisy images," IEEE Transactions on Image Processing, vol. 7, no. 12, pp. 1641-1642, Dec. 1998.

[2] B. T. Oh, C.-C. Jay Kuo, S. Sun, S. Lei, "Film Grain Noise Modeling in Advanced Video Coding," Proc. Visual Communications and Image Processing, VCIP-2007, San Jose, CA, SPIE vol. 6508, January 2007.

   
IMAGE CODER OPTIMIZATION
   
Topic 12 Vector Quantization for Optimal Color Palette Selection
Contact David Rebollo-Monedero
Proposal

The Graphics Interchange Format (GIF) image compression standard [1] uses a palette limited to 256 distinct colors from the 24-bit RGB color space. The pixel colors of an image are mapped to palette values and then compressed losslessly. The purpose of this project is to develop an algorithm to choose an appropriate palette for a given image; that is, to quantize pixel colors efficiently in terms of the color distortion introduced and the resulting compression rate.

We would like to investigate color palette selection methods, including entropy-constrained vector quantization with the Lloyd algorithm [2] and an application of quantization for distributed source coding [3]. Rate-distortion performance should be compared to the method followed by the GIF standard. Technical details will be available from the contact listed above.
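As a starting point, the plain Lloyd algorithm for palette design can be sketched as below. The sketch deliberately omits the entropy constraint of [2], so it minimizes color distortion only and ignores rate. The initial palette is supplied by the caller, and the function names are hypothetical.

```python
def nearest(pixel, palette):
    """Index of the palette color with the smallest squared distance to pixel."""
    return min(range(len(palette)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(pixel, palette[i])))

def lloyd_palette(pixels, palette, iterations=20):
    """Refine a palette by alternating assignment and centroid steps.

    pixels:  list of (r, g, b) tuples.
    palette: initial list of (r, g, b) tuples; its length fixes the palette size.
    """
    palette = [tuple(float(ch) for ch in color) for color in palette]
    for _ in range(iterations):
        # Assignment step: map every pixel to its nearest palette color.
        clusters = [[] for _ in palette]
        for px in pixels:
            clusters[nearest(px, palette)].append(px)
        # Centroid step: move each color to the mean of its pixels;
        # a color with no pixels keeps its old value.
        palette = [
            tuple(sum(ch) / len(members) for ch in zip(*members)) if members else old
            for members, old in zip(clusters, palette)
        ]
    return palette
```

An entropy-constrained variant would add a rate term, weighted by a Lagrange multiplier, to the distance used in the assignment step.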

References

[2] P. A. Chou, T. Lookabaugh, and R. M. Gray, "Entropy constrained vector quantization," IEEE Transactions on Acoustics, Speech, and Signal Processing, Vol. 37, pp. 31-42, January 1989.

[3] D. Rebollo-Monedero and B. Girod, "Design of optimal quantizers for distributed coding of noisy sources," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing (ICASSP), Philadelphia, PA, Mar. 2005. See other publications by the same author on high-rate quantization and transforms for distributed source coding, available at http://www.stanford.edu/~drebollo/publications.htm.

   
Topic 13 Tuning Baseline JPEG Encoding for Individual Images
Contact David Varodayan
Proposal

The baseline JPEG standard [1] for image compression gives flexibility to the encoder to choose quantization matrices and Huffman tables for each image to be encoded. But in practice, most encoder implementations forgo this opportunity and use fixed values. Clearly, coding efficiency could be improved by tuning quantization matrices and Huffman tables for individual images. An iterative algorithm for the joint optimization is presented in [2].
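The effect of the quantization matrix can be illustrated with a small sketch that quantizes a block of transform coefficients and scores a candidate table by a Lagrangian cost. It is not the algorithm of [2]: the rate is crudely approximated by the number of non-zero levels, and the function names and the multiplier `lam` are hypothetical.

```python
def quantize(coeffs, table):
    """Divide each coefficient by its table entry and round to an integer level."""
    return [[round(c / q) for c, q in zip(crow, qrow)]
            for crow, qrow in zip(coeffs, table)]

def dequantize(levels, table):
    """Decoder side: multiply each level by its table entry."""
    return [[l * q for l, q in zip(lrow, qrow)]
            for lrow, qrow in zip(levels, table)]

def lagrangian_cost(coeffs, table, lam):
    """Squared error plus lam times the number of non-zero levels (a rate proxy)."""
    levels = quantize(coeffs, table)
    recon = dequantize(levels, table)
    distortion = sum((c - r) ** 2
                     for crow, rrow in zip(coeffs, recon)
                     for c, r in zip(crow, rrow))
    nonzero = sum(1 for row in levels for l in row if l != 0)
    return distortion + lam * nonzero
```

Evaluating this cost for several candidate tables on the blocks of one image, and keeping the cheapest, is the simplest form of per-image tuning; a real encoder would measure the rate with the actual Huffman tables.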

For this project, students should implement the joint optimization algorithm and compare the rate-distortion performance of tuned baseline JPEG to untuned baseline JPEG as well as JPEG2000. Students should also investigate how image properties (such as resolution and content) influence the performance gain.

References

[1] G. K. Wallace, "The JPEG still picture compression standard," Communications of the ACM,  vol. 34, no. 4, April 1991.

[2] M. Crouse and K. Ramchandran, "Joint thresholding and quantizer selection for transform image coding: entropy-constrained analysis and applications to baseline JPEG", IEEE Transactions on Image Processing, vol. 6, no. 2, February 1997.

   
Topic 14 Improving Laplacian Pyramid Image Coding
Contact Aditya Mavlankar and Markus Flierl
Proposal

The Laplacian pyramid [1] permits efficient compression of an image through an overcomplete multi-resolution decomposition. The relationship between the transform coefficients at different resolutions is hierarchical; reconstructing an image at a particular resolution requires the corresponding coefficients as well as the coefficients for all lower resolutions. Conditions for reconstructing these samples with minimum mean squared error are presented in [2]. This project will explore practical ways of accomplishing this optimal or near-optimal synthesis for both the open-loop and the closed-loop modes.
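The hierarchical structure can be sketched in one dimension without quantization. The sketch assumes pair averaging for the reduction and sample repetition for the expansion, which are simplifications of the filters in [1], and a signal length divisible by two at every level. The function names are hypothetical.

```python
def reduce_level(signal):
    """Halve the resolution by averaging neighbouring pairs."""
    return [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]

def expand_level(coarse):
    """Double the resolution by repeating every sample."""
    return [v for v in coarse for _ in range(2)]

def build_pyramid(signal, levels):
    """Return (coarsest signal, list of detail signals from fine to coarse)."""
    details = []
    for _ in range(levels):
        coarse = reduce_level(signal)
        details.append([s - p for s, p in zip(signal, expand_level(coarse))])
        signal = coarse
    return signal, details

def reconstruct(coarse, details):
    """Rebuild the signal by adding the details back in, coarse to fine."""
    signal = coarse
    for detail in reversed(details):
        signal = [p + d for p, d in zip(expand_level(signal), detail)]
    return signal
```

For an 8-sample signal and two levels the pyramid holds 2 + 4 + 8 = 14 coefficients, which shows the overcompleteness; once the coefficients are quantized, the choice of reconstruction rule starts to matter.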

As a starting point, the contact listed above will describe to you a novel structure that incorporates quantization noise processing into the Laplacian pyramid decomposition. As part of the project, you will analyze theoretically the propagation of quantization noise to the reconstructed image. This insight might be used for further improvements in encoder/decoder design. You might optimize the system with respect to various criteria, such as compression performance, spatial scalability and computational burden.

References

[1] P. J. Burt and E. H. Adelson, "The Laplacian Pyramid as a compact image code," IEEE Transactions on Communications, vol. 31, pp. 532–540, Apr. 1983.

[2] Minh N. Do and Martin Vetterli, "Framing Pyramids," IEEE Transactions on Signal Processing, vol. 51, no. 9, pp. 2329 –2342, Sept. 2003.

 

Last modified: 17-Mar-2007