Project List
| Authors | Topic | Contact | Presentation | Report |
|---|---|---|---|---|
| Glen Gibb, Pierre Souillot | Wavelet Coding of Color Images Using a Spectral Transform in the Subband Domain | David R-M | | |
| Michael Yeh, Doe Hyun Yoon | Adaptive Reversible Multiplication-Free Color Transforms | David V | | |
| Jad Naous, Tao Xu | Scalable Geometric Flow Representation for Image Compression with Directional Lifting | Chuo-Ling | | |
| Carri Chan, Yuki Konda | ρ-domain Modeling for Rate Estimation | Xiaoqing | | |
| John Poon, Laura Savidge | Multi-View Image Prediction with Sub-Sampled Neighboring Views | Aditya and Markus | | |
| Steven Lansel, Nandhini Nandiwada Santhanam | Multi-View Image Coding with Disparity-Compensated Lifted Wavelets | Markus | | |
| Jim Deng, Alex Giladi, Fernando Gomez-Pancorbo | Noise Reduction Prefiltering for Video Compression | Mark | | |
| Binh Ho, Harold Nyikal, Hoi Wong | Rate Estimation of H.264 Encoding based on Encoding at a Lower Resolution | David V | | |
Course Project Topics
Each group should either select two of the project topics (as 1st and
2nd choices) listed below OR submit their own topic and proposal by
February 9. Please send a single email informing the General TA of the
members of your group and your project selections.
Groups usually consist of 3 students. We might combine groups of
2 students and individuals interested in the same topic at our
discretion. Groups with more than 3 students are not encouraged.
Please address your questions pertaining to specific topics to the indicated contacts, who are members of the Image, Video and Multimedia Systems Group. You may contact them for advice before and after making your selections. They may provide you with further references and code. Most papers are available at the IEEE Xplore web site, CiteSeer or Google Scholar.
IMAGE CODING |
| |
|
| Topic 1 |
Wavelet Coding of Color Images Using a Spectral Transform in the Subband Domain |
| Contact |
David Rebollo-Monedero |
| Proposal |
For compression of color images, the RGB components are usually transformed into YCbCr components, which are then compressed independently. Visual inspection of a YCbCr image suggests that statistical dependence remains among the components, which could be exploited for compression.
Strobel et al. [1] have proposed applying a decorrelating transform AFTER a spatial wavelet decomposition of the RGB image components. Their scheme is for lossless compression. By tailoring the color transform to each subband, they achieve substantial gains relative to the conventional scheme, in which the color transform precedes the wavelet decomposition. In this project, you should investigate the Strobel et al. scheme for lossy compression of color images. In particular, explore how many bits are required to convey the color transform tailored to a particular image subband. The scheme should be implemented as an extension of the JPEG-2000 wavelet coding algorithm. |
| References |
[1] N. Strobel, S. K. Mitra, and B. S. Manjunath, "Reversible wavelet and spectral transforms for lossless compression of color images," Proc. IEEE Intern. Conf. Image Processing, ICIP-98, Chicago, IL, vol. 3, pp. 896-900, Oct. 1998. |
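For the side-information question in the Topic 1 proposal, a rough back-of-the-envelope count may help as a starting point. The sketch below assumes a dyadic 2-D decomposition, one color transform per subband, and fixed-length coding of each transform parameter; the function names and the example settings are illustrative and do not come from Strobel et al.

```python
def num_subbands(levels):
    """Subbands of a dyadic 2-D wavelet decomposition: 3 per level plus the final lowpass band."""
    return 3 * levels + 1


def transform_side_info_bits(levels, params_per_transform, bits_per_param):
    """Side information in bits, assuming every subband carries its own color transform."""
    return num_subbands(levels) * params_per_transform * bits_per_param


def side_info_bpp(levels, params_per_transform, bits_per_param, width, height):
    """The same side information expressed in bits per pixel of the image."""
    bits = transform_side_info_bits(levels, params_per_transform, bits_per_param)
    return bits / (width * height)


if __name__ == "__main__":
    # Illustrative settings only: 5 levels, a full 3x3 matrix (9 entries), 16 bits per entry.
    print(transform_side_info_bits(5, 9, 16))
    # The same overhead spread over a 512x512 image.
    print(side_info_bpp(5, 9, 16, 512, 512))
```

Restricting the transform to an orthonormal matrix would reduce `params_per_transform` from 9 to 3 (three rotation angles), which is one of the trade-offs the project could examine.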
| |
|
| Topic 2 |
Multiplication-Free Implementations of Reversible Integer-to-Integer Transforms |
| Contact |
David Varodayan |
| Proposal |
Reversible integer-to-integer color and spatial transforms are required for truly lossless image compression [1] because they avoid losses due to finite precision computations. Some of these transforms even permit efficient multiplication-free implementations; refer to [2] for the case of color transforms and [3] for an approximation to the discrete cosine transform (DCT).
For this project, students will automate the construction of these multiplication-free lossless transforms by casting arbitrary color and block-based spatial transforms as lifting structures and rounding lifting weights to multiples of dyadic fractions. For a given image, this will enable the creation of customized reversible integer-to-integer transforms parameterized by the rounded weights. Students will investigate to what extent it is beneficial to use customized transforms and how fine the rounding of lifting weights should be. |
| References |
[1] M. J. Gormish, E. L. Schwartz, A. F. Keith, M. P. Boliek, and A. Zandi, "Lossless and nearly lossless compression of high-quality images," Proc. SPIE, vol. 3025, pp. 62–70, Mar. 1997.
[2] P. Hao and Q. Shi, "Comparative study of color transforms for image coding and derivation of integer reversible color transform," in Proc. ICPR, vol. 3, Barcelona, Spain, Sept. 3–8, 2000, pp. 228–231.
[3] Y. Zeng, L. Cheng, G. Bi, and A. C. Kot, "Integer DCTs and fast algorithms," IEEE Trans. Signal Processing, vol. 49, no. 11, pp. 2774–2782, Nov. 2001. |
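The construction described in the Topic 2 proposal can be illustrated with a minimal sketch, assuming a single two-channel predict/update lifting pair. The function names, the example weights (0.7 and 0.3) and the 3-bit precision are illustrative choices, not part of the project specification. Each weight is rounded to a multiple of 2^-bits, and the product is then formed with shifts and adds only; the floor inside each lifting step is what keeps the mapping integer-to-integer, and the lifting structure keeps it exactly invertible.

```python
def round_to_dyadic(weight, bits):
    """Round a real lifting weight to the nearest multiple of 2**-bits.

    Returns the integer numerator k, so the rounded weight is k / 2**bits.
    """
    return int(round(weight * (1 << bits)))


def mul_dyadic(x, k, bits):
    """Compute floor(x * k / 2**bits) using only shifts and adds."""
    sign = -1 if k < 0 else 1
    k = abs(k)
    acc = 0
    shift = 0
    while k:
        if k & 1:
            acc += x << shift  # add x * 2**shift for every set bit of k
        k >>= 1
        shift += 1
    return (sign * acc) >> bits  # arithmetic shift floors toward -infinity


def forward(a, b, k_predict, k_update, bits):
    """One predict step followed by one update step."""
    d = b - mul_dyadic(a, k_predict, bits)
    s = a + mul_dyadic(d, k_update, bits)
    return s, d


def inverse(s, d, k_predict, k_update, bits):
    """Undo the lifting steps in reverse order with the same rounded terms."""
    a = s - mul_dyadic(d, k_update, bits)
    b = d + mul_dyadic(a, k_predict, bits)
    return a, b


if __name__ == "__main__":
    bits = 3
    kp = round_to_dyadic(0.7, bits)
    ku = round_to_dyadic(0.3, bits)
    s, d = forward(17, -5, kp, ku, bits)
    print(inverse(s, d, kp, ku, bits))  # recovers the input pair
```

Varying `bits` is one way to study how fine the rounding of the lifting weights needs to be.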
| |
|
| Topic 3 |
Improving Laplacian Pyramid Image Coding by Quantization Noise Processing |
| Contact |
Aditya Mavlankar |
| Proposal |
The Laplacian pyramid [1] permits the efficient compression of an image at several resolutions at once. The relationship between the transform coefficients at different resolutions is hierarchical. Reconstructing an image at a particular resolution requires the corresponding coefficients as well as the coefficients for all lower resolutions. The optimality conditions for synthesizing these samples, in order to minimize mean squared error in the reconstruction, are presented in [2]. This project will explore practical ways of accomplishing this optimal or near-optimal synthesis for both the open-loop and the closed-loop modes.
As a starting point, the contact listed above will describe for you a novel structure that incorporates quantization noise processing into the Laplacian pyramid decomposition. Through design, implementation and experimentation, you can optimize the system with respect to various criteria, such as compression performance, spatial scalability and computational burden. |
| References |
[1] P. J. Burt and E. H. Adelson, "The Laplacian Pyramid as a compact image code," IEEE Transactions on Communications, vol. COM-31, pp. 532–540, Apr. 1983.
[2] Minh N. Do and Martin Vetterli, "Framing Pyramids," IEEE Transactions on Signal Processing, vol. 51, no. 9, pp. 2329–2342, Sept. 2003. |
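To make the open-loop and closed-loop modes mentioned in the Topic 3 proposal concrete, here is a minimal one-level, 1-D sketch. It assumes pair averaging for decimation, sample repetition for interpolation, and a plain uniform quantizer; all names are illustrative. It shows only the standard pyramid, not the optimal synthesis of [2] or the novel structure referred to in the proposal.

```python
def downsample(x):
    """Halve the resolution by averaging neighboring pairs (assumes even length)."""
    return [(x[i] + x[i + 1]) / 2.0 for i in range(0, len(x), 2)]


def upsample(c):
    """Double the resolution by sample repetition."""
    return [v for v in c for _ in range(2)]


def quantize(x, step):
    """Uniform quantizer followed by reconstruction."""
    return [step * round(v / step) for v in x]


def encode(x, coarse_step, detail_step, closed_loop):
    coarse = downsample(x)
    coarse_q = quantize(coarse, coarse_step)
    # Open loop predicts from the unquantized coarse layer;
    # closed loop predicts from the layer the decoder will actually see.
    reference = coarse_q if closed_loop else coarse
    detail = [a - b for a, b in zip(x, upsample(reference))]
    return coarse_q, quantize(detail, detail_step)


def decode(coarse_q, detail_q):
    """Standard synthesis: interpolate the coarse layer and add the detail layer."""
    return [a + b for a, b in zip(upsample(coarse_q), detail_q)]


def max_error(x, y):
    return max(abs(a - b) for a, b in zip(x, y))


if __name__ == "__main__":
    x = [10, 12, 40, 44, 7, 9, 100, 90]
    for mode in (False, True):
        rec = decode(*encode(x, 8, 2, closed_loop=mode))
        print("closed" if mode else "open", max_error(x, rec))
```

In closed-loop mode the coarse-layer quantization error is absorbed into the detail layer, so the reconstruction error is bounded by half the detail step; in open-loop mode it passes through the standard synthesis unchanged, which is the situation the quantization noise processing is meant to improve.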
| |
|
| Topic 4 |
Scalable Geometric Flow Representation for Image Compression with Directional Lifting |
| Contact |
Chuo-Ling Chang |
| Proposal |
For wavelet-based image compression, the 2-D DWT is conventionally carried out as a separable transform by cascading two 1-D transforms in the vertical and horizontal directions. Such a separable transform cannot efficiently represent image features whose geometric flow is not aligned with these two directions, such as an oblique edge, since it distributes the energy of these features across several subbands. Directional lifting has recently been proposed in wavelet-based image coding to locally adapt the filtering direction to the geometric flow, and has demonstrated significant subjective and objective quality improvements [1][2]. The estimated geometric flow, and hence the selection of filtering directions, is explicitly signalled to the decoder.
For scalable image coding, the target total bit-rate is unknown during encoding. Therefore, it is desirable to have a scalable representation also for the geometric flow so that the amount of the geometric flow overhead can be adjusted adaptively according to the target total bit-rate. A similar issue of a scalable motion vector representation for scalable video coding is discussed in [3]. In this project, students shall devise a scalable representation for the geometric flow for image compression and investigate its impact on both the objective and subjective compression performance. |
| References |
[1] W. Ding, F. Wu, and S. Li, "Lifting-based wavelet transform with directionally spatial prediction," in Proc. Picture Coding Symposium 2004, San Francisco, CA, USA, Dec. 2004.
[2] C.-L. Chang, A. Maleki, and B. Girod, "Adaptive wavelet transform for image compression via directional quincunx lifting," in Proc. IEEE Workshop on Multimedia Signal Processing, Shanghai, China, Oct. 2005.
[3] A. Secker and D. Taubman, "Highly Scalable Video Compression with Scalable Motion Coding," IEEE Trans. on Image Processing, Aug. 2004. |
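The direction-adaptive predict step at the heart of the Topic 4 proposal can be sketched as follows. This is a simplified illustration under stated assumptions: integer pixel displacements only, three candidate directions, border clamping, and one direction chosen per row rather than per block. It is not the exact scheme of [1] or [2], and all names are illustrative.

```python
def clamp(i, n):
    """Keep a column index inside the row."""
    return min(max(i, 0), n - 1)


def predict_row(above, below, d):
    """Predict an odd row from its even neighbors along direction d.

    d = 0 is the conventional vertical predictor; d = -1 and d = +1 follow
    edges that lean by one pixel per row in either direction.
    """
    n = len(above)
    return [(above[clamp(x + d, n)] + below[clamp(x - d, n)]) // 2 for x in range(n)]


def residual_cost(row, prediction):
    """Sum of absolute prediction residuals."""
    return sum(abs(r - p) for r, p in zip(row, prediction))


def best_direction(above, row, below, candidates=(-1, 0, 1)):
    """Pick the candidate direction with the smallest residual energy proxy."""
    return min(candidates, key=lambda d: residual_cost(row, predict_row(above, below, d)))


def predict_step(above, row, below, d):
    """Forward predict step: the highpass residual for the odd row."""
    return [r - p for r, p in zip(row, predict_row(above, below, d))]


def inverse_predict_step(above, residual, below, d):
    """Inverse predict step: needs the same direction d, hence the signalling."""
    return [h + p for h, p in zip(residual, predict_row(above, below, d))]


if __name__ == "__main__":
    # An oblique edge that moves one pixel to the right per row.
    above = [0, 0, 0, 9, 9, 9, 9, 9]
    row = [0, 0, 0, 0, 9, 9, 9, 9]
    below = [0, 0, 0, 0, 0, 9, 9, 9]
    d = best_direction(above, row, below)
    print(d, predict_step(above, row, below, d))
```

Because the decoder must use the same `d` to invert the step, the directions are side information, and a scalable representation of them is exactly what the project asks for.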
| |
|
| Topic 5 |
Rate-Distortion Modeling and Rate Control Using the rho-Domain Model |
| Contact |
Xiaoqing Zhu |
| Proposal |
The rho-domain model is a widely used rate-distortion model
for transform coding of images or video, which relates both rate
and distortion to the number of non-zero quantized transform coefficients [1]. In this project, we explore the suitability of the rho-domain model for JPEG image compression. Based on a sufficiently large set of test images, investigate the accuracy of the model and its limitations. Suggest improvements to the model that overcome these limitations. Demonstrate the utility of the model in a rate control scheme that transmits video sequences using Motion-JPEG encoding at a constant bit-rate. How small can you make your buffer for constant bit-rate transmission?
|
| References |
[1] Zhihai He, Sanjit Mitra, "A Unified Rate-Distortion Analysis Framework for Transform Coding," IEEE Trans. on Circuits and Systems for Video Technology, vol. 11, no. 12, pp. 1221-1236, December 2001. |
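As a starting point for Topic 5, the linear rate model of [1], R = θ(1 − ρ), where ρ is the fraction of quantized coefficients equal to zero, can be sketched as below. The sketch assumes a simple deadzone quantizer that maps every coefficient with magnitude below the step size to zero, and it fits θ from a single measurement; the function names and the toy coefficient values are illustrative, not measured data.

```python
def rho(coeffs, step):
    """Fraction of coefficients that the assumed deadzone quantizer maps to zero."""
    zeros = sum(1 for c in coeffs if abs(c) < step)
    return zeros / len(coeffs)


def predicted_rate(theta, rho_value):
    """Linear rho-domain rate model: R = theta * (1 - rho)."""
    return theta * (1.0 - rho_value)


def fit_theta(measured_rate, rho_value):
    """Estimate theta from one encoding pass (requires rho < 1)."""
    return measured_rate / (1.0 - rho_value)


def pick_step(coeffs, theta, target_rate, candidate_steps):
    """Smallest candidate step whose predicted rate meets the target."""
    for step in sorted(candidate_steps):
        if predicted_rate(theta, rho(coeffs, step)) <= target_rate:
            return step
    return max(candidate_steps)


if __name__ == "__main__":
    coeffs = [0, 1, -2, 5, -9, 12, 3, -1]  # toy values for illustration only
    r = rho(coeffs, 4)
    print(r, predicted_rate(6.0, r))
    print(pick_step(coeffs, 6.0, 2.25, [1, 2, 4, 8]))
```

In a rate control loop, `pick_step` would be called per frame with the bit budget left by the buffer, which is where the buffer-size question in the proposal comes in.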
| |
|
MULTI-VIEW IMAGE CODING |
| |
|
| Topic 6 |
Multi-View Image Coding with Disparity-Compensated Lifted Wavelets
|
| Contact |
Markus Flierl |
| Proposal |
This project considers coding of multi-view imagery. The images are generated by capturing a real-world scene from different viewpoints.
Note that these images are very similar, as they result from the same
scene. Coding schemes exploit this similarity. However, the different
viewpoints cause a so-called 'disparity' between corresponding pixels
in two images. Consequently, coding schemes use disparity compensation
to warp one image onto another. In this project, you will investigate
coding schemes that use adaptive wavelet transforms implemented with the
lifting structure. The adaptivity is achieved by disparity compensation
in the lifting steps of the transform [1][2].
The disparity-compensated wavelet transform will decorrelate the multi-view
images and will generate subbands that can be coded independently with lossy
image compression algorithms. The disparity between pairs of images will
be approximated by block-based disparity estimates. The encoded disparity
information as well as the encoded individual subbands represent the
original multi-view images. We aim for a scheme that is optimal in the
rate-distortion sense. In particular, the bit rate spent for the disparity
information should be allocated efficiently [3]. You will investigate
methods to allocate the bit rate for the disparity information such that
the overall rate-distortion performance is optimized.
|
| References |
[1] B. Pesquet-Popescu and V. Bottreau, "Three-Dimensional Lifting Schemes
for Motion Compensated Video Compression," Proceedings of the IEEE
International Conference on Acoustics, Speech and Signal Processing,
Salt Lake City, UT, May 2001, pp. 1793-1796.
[2] Chuo-Ling Chang, Xiaoqing Zhu, Prashant Ramanathan, and Bernd Girod, "Inter-view Wavelet Compression of Light Fields with Disparity-Compensated
Lifting," Proc. SPIE Visual Communications and Image Processing, VCIP-03,
Lugano, Switzerland, July 2003, pp. 14-22.
[3] B. Girod, "Rate-Constrained Motion Estimation", in Proceedings of the
SPIE Conference on Visual Communications and Image Processing, Chicago, USA,
Sept. 1994, pp. 1026-1034.
|
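The two ingredients of the Topic 6 proposal, disparity compensation inside the lifting steps and rate-constrained selection of the disparity, can be sketched together on a pair of 1-D "views". The sketch assumes a single integer disparity for the whole row, border clamping, a Haar-like integer lifting pair, and a Lagrangian cost J = D + λR in the spirit of [3]. The bit-cost function `disparity_bits` is a hypothetical stand-in for a real entropy coder, and all other names are illustrative.

```python
def warp(row, d):
    """Shift a row by integer disparity d, clamping at the borders."""
    n = len(row)
    return [row[min(max(x + d, 0), n - 1)] for x in range(n)]


def forward(v0, v1, d):
    """Disparity-compensated Haar-like lifting of two views."""
    high = [b - p for b, p in zip(v1, warp(v0, d))]           # predict v1 from warped v0
    low = [a + (u >> 1) for a, u in zip(v0, warp(high, -d))]  # update v0 with warped residual
    return low, high


def inverse(low, high, d):
    """Exact inverse: undo the update step, then the predict step."""
    v0 = [l - (u >> 1) for l, u in zip(low, warp(high, -d))]
    v1 = [h + p for h, p in zip(high, warp(v0, d))]
    return v0, v1


def disparity_bits(d):
    """Hypothetical side-information cost: larger disparities cost more bits."""
    return 1 + 2 * abs(d)


def choose_disparity(v0, v1, candidates, lam):
    """Minimize J = D + lam * R, with D the sum of absolute highpass values."""
    def cost(d):
        _, high = forward(v0, v1, d)
        return sum(abs(h) for h in high) + lam * disparity_bits(d)
    return min(candidates, key=cost)


if __name__ == "__main__":
    v0 = [10, 20, 30, 40, 50, 60, 70, 80]
    v1 = warp(v0, 2)  # second view is the first one displaced by two samples
    d = choose_disparity(v0, v1, range(-3, 4), lam=1.0)
    print(d, forward(v0, v1, d))
```

Raising `lam` makes the side information more expensive relative to the residual, so the chosen disparity drifts toward cheaper values; sweeping it is one simple way to trace out the rate allocation trade-off.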
| |
|
VIDEO CODING |
| |
|
| Topic 7 |
Noise Reduction Prefiltering for Video Compression |
| Contact |
Mark Kalman |
| Proposal |
Practical video compression systems typically use a nonlinear, temporally recursive prefilter for noise reduction. Such a "Crawford filter" forms the difference between the previous noise-reduced frame and the current frame and performs a "soft-coring" operation on the frame difference [1] [2]. Both rate-distortion performance and subjective quality of the overall system can be substantially improved, particularly for noisy input sequences.
In this project, you shall investigate the impact of noise added to test video sequences on the rate-distortion performance of an H.264 video encoder. Then, introduce the Crawford filter for noise reduction and find a rule for the optimal filter settings depending on the noise level of the source sequence and the bit-rate of the H.264 coder. Quantify the gains achievable by noise reduction prefiltering both by measuring PSNR and by (informal) subjective testing. |
| References |
[1] D. J. Crawford, "Spatio-temporal filtering in television picture coding," Ph.D. dissertation, Univ. Essex, July 1983.
[2] B. Girod, EE398 course notes, Stanford University. |
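The temporally recursive prefilter described in the Topic 7 proposal can be sketched per pixel as below. The exponential soft-coring curve used here, which passes a fraction 1 − exp(−(|d|/τ)^γ) of the frame difference d, is an assumption about the form of the nonlinearity; check it against [1] and [2] before relying on it. The parameter names `tau` and `gamma`, their default values, and the function names are likewise illustrative.

```python
import math


def pass_fraction(d, tau, gamma):
    """Fraction of the frame difference d that survives soft coring (assumed form)."""
    return 1.0 - math.exp(-((abs(d) / tau) ** gamma))


def crawford_filter(frames, tau=4.0, gamma=2.0):
    """Recursive noise reduction over a list of frames (each a flat list of pixels).

    Small differences to the previous filtered frame are treated as noise and
    suppressed; large differences are treated as real change and passed on.
    """
    prev = [float(p) for p in frames[0]]
    out = [prev]
    for frame in frames[1:]:
        cur = [p + (x - p) * pass_fraction(x - p, tau, gamma)
               for x, p in zip(frame, prev)]
        out.append(cur)
        prev = cur
    return out


if __name__ == "__main__":
    # One pixel over four frames: small flicker, then a genuine scene change.
    frames = [[100], [101], [99], [200]]
    print(crawford_filter(frames))
```

The filter settings to be optimized in the project correspond here to `tau` (roughly the noise amplitude below which differences are cored) and `gamma` (how sharp the transition is).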
| |
|
| Topic 8 |
Rate Estimation of H.264 Encoding based on Encoding at a Lower Resolution |
| Contact |
David Varodayan |
| Proposal |
Predicting the bit rate of H.264-encoded video without actually performing the
encoding is very difficult. The problem of estimating the rate of spatially downsampled video given the high resolution encoding was addressed in [1]. For this project, students will consider the dual problem: estimating the bit rate of H.264-encoded high resolution video based only on the encoding of a lower resolution version. This scenario may occur in practice if the encoder has limited computational power yet needs to control the bit rate. Students should consider the cases in which the lower resolution counterpart is subsampled spatially, temporally and both. A good first step would be to compare the bit rates of several video sequences encoded at different resolutions.
|
| References |
[1] P. H. W. Wong, R. T. W. Hung, J. Y. B. Lee, S. C. Liew, C. S. Kim, and R. T. Chin, "Rate estimation for H.264/AVC spatial resolution reduction," Proc. ICIP, vol. 4, Singapore, Oct. 24-27, 2004, pp. 2773-2776.
|
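For the suggested first step in Topic 8, comparing bit rates across resolutions, one possible way to summarize the comparison is to fit a simple relation between the low-resolution and high-resolution rates. The power-law form R_high = a · R_low^b used below is purely an assumption for illustration; neither the proposal nor [1] prescribes it, and the function names are hypothetical. The rates passed in would come from your own encodings.

```python
import math


def fit_power_law(low_rates, high_rates):
    """Least-squares fit of log(high) = log(a) + b * log(low); returns (a, b)."""
    xs = [math.log(r) for r in low_rates]
    ys = [math.log(r) for r in high_rates]
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(my - b * mx)
    return a, b


def predict_high_rate(low_rate, a, b):
    """Estimate the high-resolution rate from a low-resolution encoding."""
    return a * low_rate ** b


if __name__ == "__main__":
    # Synthetic placeholder numbers that follow the assumed model exactly;
    # replace them with rates measured from real H.264 encodings.
    low = [100.0, 200.0, 400.0, 800.0]
    high = [3.0 * r ** 1.1 for r in low]
    a, b = fit_power_law(low, high)
    print(a, b, predict_high_rate(1600.0, a, b))
```

How well any such fit holds across sequences, quantization parameters, and spatial versus temporal subsampling is precisely what the project should establish.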
|