EE398B - Image Communication II

2004 Projects.

 


Suggested Topics for the Course Project

Please refer your questions to the people listed in the specific topic. They may provide you with further references and code. Most papers are available at the IEEE Xplore web site.

For advice, you may contact anyone in the Image, Video and Multimedia Systems Group who is mentioned in the topic lists or is presenting research (see schedule), both before and after submitting your proposal.

 

I. WYNER-ZIV CODING

 

Area Wyner-Ziv Video Coding (see more on this area below)

There are a number of applications in which we would like to code video with a low-complexity encoder, while a high-complexity decoder is affordable. Distributed source coding, and in particular Wyner-Ziv coding, is a data compression approach that can make this possible with efficiency close to that of conventional systems based on high-complexity coders.

Wyner-Ziv coding, named after [1], consists of lossy source coding with decoder side information. More precisely, the source data X is encoded with a rate constraint, and decoded with a certain distortion, using some side information Y available at the decoder only. Although the values Y takes on are not available at the encoder, the statistical dependence between X and Y is known, and is exploited when designing the entire system. Information-theoretic studies suggest that the compression efficiency achieved can be similar to the case in which the side information Y is available at the encoder as well. In fact, in the lossless case, known as Slepian-Wolf coding, the rate in both cases is equal to H(X|Y) [2].
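As a small numerical illustration of the Slepian-Wolf rate H(X|Y), the following Python sketch computes H(X) and H(X|Y) for a toy joint distribution. The binary source, the crossover probability of 0.1 and the function names are assumptions made for this example only; they are not taken from [1] or [2].

```python
import numpy as np

def entropy(p):
    """Entropy in bits of a probability vector (zero entries are skipped)."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def conditional_entropy(joint):
    """H(X|Y) in bits for a joint pmf with rows indexed by x and columns by y."""
    joint = np.asarray(joint, dtype=float)
    # Chain rule: H(X|Y) = H(X,Y) - H(Y).
    return entropy(joint.ravel()) - entropy(joint.sum(axis=0))

# Hypothetical example: X is a fair bit and Y is X flipped with probability eps.
eps = 0.1
joint = 0.5 * np.array([[1 - eps, eps],
                        [eps, 1 - eps]])
h_x = entropy(joint.sum(axis=1))          # rate needed if Y is ignored
h_x_given_y = conditional_entropy(joint)  # Slepian-Wolf rate
print(f"H(X) = {h_x:.3f} bits, H(X|Y) = {h_x_given_y:.3f} bits")
```

The gap between the two values is the rate that can in principle be saved by exploiting Y at the decoder alone.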

Despite the recent theoretical and experimental effort made to build practical Wyner-Ziv coders [3,4], there is a fundamental problem still unsolved. Motion compensation led to a substantial improvement in video coding, but also increased the complexity of video encoders. If a low-complexity encoder is desired, we face the challenge of designing a Wyner-Ziv video coder where the motion compensation is almost entirely carried out at the decoder. This is theoretically possible if one thinks of the source data as the current frame and of the side information as the previously reconstructed frames.

The following project topics are based on ideas of the IVMS group to solve the problem of practical Wyner-Ziv motion compensation.

Contact

 David Rebollo-Monedero (also Anne Aaron and Shantanu Rane)

 

Topics

1. Wyner-Ziv Video Coding with Increasingly Accurate Motion-Compensated Side Information

The current frame is divided into blocks, each of which undergoes some transformation, and a scan order is defined on groups of the resulting coefficients.

Initially, the previously reconstructed frames without motion compensation or with motion compensation based on previous motion vector fields are used to estimate the first group of transform coefficients. Using this estimate as side information, the first group of transform coefficients is Wyner-Ziv coded. At the decoder, this coarse reconstruction is used to obtain more accurate motion estimation and motion-compensated side information. Successive groups of transform coefficients are coded with more and more accurate side information, obtained with some motion estimation technique based on intermediate reconstructions and previous motion vectors. At the last stages, a rather fine reconstruction of the current block is available and advanced motion-compensation techniques such as subpixel resolution or variable block sizes are possible.

[3] suggests the use of the DCT as the blockwise transform. As for the scan order, the conventional zigzag scan seems reasonable, provided that high frequencies are more susceptible to accurate motion compensation than low frequencies. The maximum number of stages, equal to the number of DCT coefficients, would lead to the best performance but also the highest decoder complexity. Also, at the first stage, or even at each stage, some additional information can be sent to help the decoder perform the motion estimation (see 'hash' in note below).
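The blockwise transform and the grouping of coefficients along the zigzag scan can be sketched as follows. This is only an illustration under assumptions: the 8x8 block size, the choice of four equally sized groups and all function names are hypothetical, and no Wyner-Ziv coding is performed here.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def zigzag_order(n=8):
    """List of (row, col) positions in the conventional zigzag scan order."""
    return sorted(((r, c) for r in range(n) for c in range(n)),
                  key=lambda p: (p[0] + p[1],
                                 p[0] if (p[0] + p[1]) % 2 else p[1]))

def coefficient_groups(block, n_groups):
    """Transform a square block and split its zigzag-scanned coefficients."""
    n = block.shape[0]
    c = dct_matrix(n)
    coeffs = c @ block @ c.T  # separable 2-D DCT
    scanned = np.array([coeffs[r, col] for r, col in zigzag_order(n)])
    return np.array_split(scanned, n_groups)

# Hypothetical block of pixel values; each group would be coded at one stage.
rng = np.random.default_rng(0)
block = rng.integers(0, 256, size=(8, 8)).astype(float)
groups = coefficient_groups(block, n_groups=4)
print([len(g) for g in groups])
```

With as many groups as coefficients, each stage would carry a single coefficient, which corresponds to the maximum number of stages mentioned above.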

The objective of the project is to implement a Wyner-Ziv video coder based on this idea, and to compare it with other distributed and nondistributed schemes.

Note: The formulation above is in fact quite general. For instance, it also includes a two-stage coding setting using pixel-domain block subsampling for motion compensation followed by the DCT. In this case, the transform is linear, and is represented by a matrix composed of a few canonical basis vectors and also the DCT vectors, which gives an overcomplete representation of the source data. There are only two coefficient groups. The role of the first group is to provide some type of message digest or ‘hash’ that makes motion compensation for the second group possible at the decoder. More complicated ‘hashes’ can be used, possibly nonlinear. Subsampling in the pixel, frequency or bit domain is possible. In fact, this is one of the lines of research of the group, proposed by Prof. Girod.

2. Motion Compensation for Wyner-Ziv Video Coding

A generalization of Wyner-Ziv coding is Wyner-Ziv coding of noisy sources, in which the observed data Z to encode and the target data X to be reconstructed are different random variables, possibly in different alphabets. In this case, X is the motion of a block in the current frame, which is not known at the encoder, Z is the block itself, and Y is the previously reconstructed frame. How much information about Z do we need to transmit if we are interested in recovering X only? At a second coding stage, X shall be used to motion-compensate Y and recover Z with very small rate, hopefully small enough to make up for the coding of the motion vector.

The objective of the project is to study the problem of Wyner-Ziv coding of the motion vectors only, theoretically or practically, in terms of rate and distortion. The second problem, namely the second stage of the Wyner-Ziv coding, is not a primary objective.
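To make the roles of X, Z and Y concrete, the sketch below computes a motion vector X by full-search block matching of a block Z against a previous frame Y. It is not a Wyner-Ziv scheme: it assumes Z and Y are both available in one place, which is exactly what the low-complexity encoder cannot afford. The frame size, search range, SAD criterion and function name are assumptions of this example.

```python
import numpy as np

def best_motion_vector(block, reference, top, left, search=4):
    """Full-search block matching: return the (dy, dx) minimising the SAD."""
    h, w = block.shape
    best, best_sad = (0, 0), np.inf
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            r, c = top + dy, left + dx
            # Skip candidates that fall outside the reference frame.
            if r < 0 or c < 0 or r + h > reference.shape[0] or c + w > reference.shape[1]:
                continue
            sad = np.abs(block - reference[r:r + h, c:c + w]).sum()
            if sad < best_sad:
                best, best_sad = (dy, dx), sad
    return best

# Hypothetical data: Y is a random previous frame, Z is an 8x8 block of the
# current frame that equals a displaced block of Y.
rng = np.random.default_rng(1)
Y = rng.integers(0, 256, size=(32, 32)).astype(float)
true_mv = (2, -3)
top, left = 12, 12
Z = Y[top + true_mv[0]: top + true_mv[0] + 8,
      left + true_mv[1]: left + true_mv[1] + 8].copy()
X = best_motion_vector(Z, Y, top, left)
print(X)
```

The project question is how few bits about Z the encoder must send so that a decoder holding Y can still recover X.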

 

References 

[1] A. D. Wyner, J. Ziv, “The rate-distortion function for source coding with side information at the decoder,” IEEE Trans. Inform. Theory, vol. IT-22, Jan. 1976.

[2] D. Slepian and J. K. Wolf, “Noiseless coding of correlated information sources,” IEEE Trans. Inform. Theory, vol. IT-19, pp. 471–480, Jul. 1973.

[3] D. Rebollo-Monedero, A. Aaron and B. Girod, "Transforms for High-Rate Distributed Source Coding," Asilomar Conference on Signals, Systems and Computers, 2003 (invited paper) [PDF]

 [4] B. Girod, A. Aaron, S. Rane and D. Rebollo-Monedero, "Distributed Video Coding", in Proc. IEEE, Special Issue on Advances in Video Coding and Delivery, 2003 (invited paper, submitted) [PDF]

 

 

Area Wyner-Ziv Video Coding (see more on this area above)

Wyner-Ziv coding – source coding with side information only at the decoder – has been shown to be useful and suitable for certain video applications.  In our recent work we applied Wyner-Ziv coding to build an intraframe encoder & interframe decoder system which has a very simple encoder, suitable for low-complexity video applications, such as mobile camera-phones, wireless PC cameras and surveillance cameras.  We have also used Wyner-Ziv coding techniques to develop a novel error resiliency scheme for video broadcasting, which outperforms traditional forward error correction schemes and does not require a layered video bitstream for graceful quality degradation.  Although this has been an active research area in the last few years, there are still many open problems which need to be solved to make Wyner-Ziv coding more practical for real-world systems.

 

Contact

 Anne Aaron and Shantanu Rane

 

Topics

1. Codes for Wyner-Ziv Coding

Channel codes have been shown to work well for source coding with decoder side information. In our current systems, we use a turbo codec as a near-lossless Slepian-Wolf codec. Design other channel codes, especially Low Density Parity Check (LDPC) codes, study their compression efficiency, and investigate how practical they are for video systems. One important aspect to study is the rate flexibility of the codes for changing source statistics.
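The idea of using a channel code as a Slepian-Wolf code can be illustrated with a textbook toy example rather than the turbo or LDPC codes mentioned above. In this sketch, which assumes that the 7-bit source word x and the side information y differ in at most one bit, the encoder sends only the 3-bit syndrome of x with respect to the (7,4) Hamming code. The function names are hypothetical.

```python
import numpy as np

# Parity-check matrix of the (7,4) Hamming code. Column j (0-based) holds the
# binary representation of j + 1, least significant bit in the first row.
H = np.array([[(j >> b) & 1 for j in range(1, 8)] for b in range(3)], dtype=int)

def encode(x):
    """Slepian-Wolf encoder: transmit only the 3-bit syndrome of x."""
    return (H @ x) % 2

def decode(syndrome, y):
    """Recover x from its syndrome and side information y (distance <= 1)."""
    diff = (syndrome + H @ y) % 2  # syndrome of the difference x XOR y
    pos = int(diff[0] + 2 * diff[1] + 4 * diff[2])
    x_hat = y.copy()
    if pos:
        x_hat[pos - 1] ^= 1        # flip the single position where x and y differ
    return x_hat

x = np.array([1, 0, 1, 1, 0, 0, 1])
y = x.copy()
y[4] ^= 1                          # side information differs in one bit
x_hat = decode(encode(x), y)
print(x_hat, "recovered from", len(encode(x)), "bits instead of", len(x))
```

Practical systems replace this fixed-rate toy code with longer codes whose rate can adapt to the correlation between source and side information.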

2. Rate Control for Wyner-Ziv Coding

For Wyner-Ziv coding scenarios, the rate is dependent on the statistics between the source and the side information. However, the side information is not available at the encoder. Therefore, determining the rate at the encoder is an important issue. Our current rate control assumes feedback from the decoder to the encoder. Investigate better Wyner-Ziv rate control schemes, especially for our current low-complexity video encoder.

3. Other Applications of Wyner-Ziv Coding

We have shown that Wyner-Ziv coding can be used for low-complexity video encoding, error resiliency schemes for video broadcasting and compression for light field images. Can Wyner-Ziv coding be used for other video applications, such as layered video coding, multiple description coding, etc.?

 

References 

[1] A. D. Wyner, J. Ziv, “The rate-distortion function for source coding with side information at the decoder,” IEEE Trans. Inform. Theory, vol. IT-22, Jan. 1976.

[2] D. Slepian and J. K. Wolf, “Noiseless coding of correlated information sources,” IEEE Trans. Inform. Theory, vol. IT-19, pp. 471–480, Jul. 1973.

[3] B. Girod, A. Aaron, S. Rane and D. Rebollo-Monedero, "Distributed Video Coding", in Proc. IEEE, Special Issue on Advances in Video Coding and Delivery, 2003 (invited paper, submitted)

 

 

Area

Wyner-Ziv Quantization

Design of optimal quantizers with decoder side information (see area “Wyner-Ziv Video Coding” for an explanation of Wyner-Ziv coding).

 

Contact

David Rebollo-Monedero

Topics

1. High-Rate Wyner-Ziv Quantization with an Unconditional Entropy Constraint

A recent extension of the Lloyd algorithm exists to design optimal quantizers for distributed source coding [1]. A high-rate approximation principle has been established for the case in which the rate equals the conditional entropy of the quantization index given the side information, i.e., R=H(Q|Y) [2]. We would like to find a high-rate approximation for the case R=H(Q). This approximation could shed some light on optimal implementation of Slepian-Wolf codes.

The objective of the project is to find a mathematical theorem that characterizes Wyner-Ziv quantizers as R=H(Q) goes to infinity, along with an experimental verification for a particular case, for instance, Gaussian. MATLAB code implementing the Lloyd algorithm for this particular case is available.
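For orientation, the sketch below runs the classical Lloyd algorithm on Gaussian samples and reports the empirical entropy H(Q) of the quantization index. It is not the extension of [1]: there is no side information and no entropy constraint, the number of levels is fixed, and the sample size, initialisation and function names are assumptions of this example. It is written in Python and is unrelated to the MATLAB code mentioned above.

```python
import numpy as np

def lloyd(samples, n_levels, n_iter=50):
    """Classical Lloyd iterations for a scalar quantizer under MSE distortion."""
    # Initialise the reproduction levels at evenly spaced sample quantiles.
    levels = np.quantile(samples, (np.arange(n_levels) + 0.5) / n_levels)
    for _ in range(n_iter):
        # Nearest-neighbour condition: assign each sample to the closest level.
        idx = np.abs(samples[:, None] - levels[None, :]).argmin(axis=1)
        # Centroid condition: move each level to the mean of its cell.
        for k in range(n_levels):
            if np.any(idx == k):
                levels[k] = samples[idx == k].mean()
    idx = np.abs(samples[:, None] - levels[None, :]).argmin(axis=1)
    mse = float(np.mean((samples - levels[idx]) ** 2))
    return levels, idx, mse

def index_entropy(idx, n_levels):
    """Empirical entropy H(Q) in bits of the quantization index."""
    p = np.bincount(idx, minlength=n_levels) / idx.size
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

rng = np.random.default_rng(0)
samples = rng.standard_normal(20000)
levels, idx, mse = lloyd(samples, n_levels=4)
h_q = index_entropy(idx, 4)
print(levels, mse, h_q)
```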

 

References 

[1] D. Rebollo-Monedero, R. Zhang and B. Girod, "Design of Optimal Quantizers for Distributed Source Coding," Data Compression Conference (DCC), 2003 [PDF]

 [2] D. Rebollo-Monedero, A. Aaron and B. Girod, "Transforms for High-Rate Distributed Source Coding," Asilomar Conference on Signals, Systems and Computers, 2003 (invited paper) [PDF]

 

 

 

II. NETWORK STREAMING

 

Area Congestion-distortion optimized scheduling of video over a bottleneck link

In a multimedia streaming system, schedulers are responsible for determining the transmission times of different packets of a sequence. When the channel conditions are adverse, e.g. varying delay or random losses, each packet cannot always be reliably delivered to the receiver by its playout deadline. To address this issue, smart schedulers which seek to maximize the reconstructed media quality have been proposed recently [1, 2]. In a bandwidth-limited environment, the impact of the sender also needs to be taken into account as the media stream itself might overwhelm a bottleneck link and create large amounts of congestion. This has motivated us to develop a scheduler which strives to achieve the highest reconstructed media quality for a given level of congestion [3].
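The effect of the sender on a bottleneck link can be sketched with a simple FIFO queue model. The example compares sending three packets in a burst with pacing them at the link rate, using end-to-end delay as a stand-in for congestion. The packet sizes, link rate, deadlines and function name are hypothetical, and this is not the scheduler of [3].

```python
def bottleneck_arrivals(send_times, sizes_bits, rate_bps, prop_delay=0.0):
    """Arrival times of packets sent through a FIFO queue drained at rate_bps."""
    arrivals = []
    link_free = 0.0  # time at which the link has finished its backlog
    for t, size in zip(send_times, sizes_bits):
        start = max(t, link_free)              # wait behind queued packets
        link_free = start + size / rate_bps    # transmission time on the link
        arrivals.append(link_free + prop_delay)
    return arrivals

# Hypothetical numbers: three 8000-bit packets over a 100 kbit/s bottleneck.
sizes = [8000, 8000, 8000]
deadlines = [0.10, 0.15, 0.20]

burst = [0.0, 0.0, 0.0]
paced = [0.0, 0.08, 0.16]

burst_arrivals = bottleneck_arrivals(burst, sizes, rate_bps=100_000)
paced_arrivals = bottleneck_arrivals(paced, sizes, rate_bps=100_000)

burst_delays = [a - s for a, s in zip(burst_arrivals, burst)]
paced_delays = [a - s for a, s in zip(paced_arrivals, paced)]
on_time = [a <= d for a, d in zip(burst_arrivals, deadlines)]

print("mean delay, burst:", sum(burst_delays) / 3)
print("mean delay, paced:", sum(paced_delays) / 3)
print("on time:", on_time)
```

Both schedules deliver the packets at the same instants, but the burst keeps packets waiting in the queue, which is the congestion a scheduler of this kind tries to limit.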

 

Contact Eric Setton

Topics

1. Comparison of layered and multiple description coding using the CoDiO scheduler

Several comparisons of layered coding and multiple description coding using heuristic or rate-distortion optimized schedulers have been conducted [4]. The aim of the project is to carry out a new version of this comparison for a bandwidth-limited environment and with a new scheduler. Students working on this project will learn how to use the H.264 encoder/decoder to generate compressed sequences with different encoding structures and will simulate video streaming and scheduling in the network simulator ns-2 (code and help will be provided!). Motivated students may also want to consider other kinds of source coding techniques in their comparisons.

2. Delivering video to a mobile wireless client

Work on CoDiO has focused so far on a fixed bottleneck bandwidth. The idea of this project is to extend this to a mobile wireless client. In this case, the capacity of the last hop will vary, making the channel estimate inaccurate. Students working on this project will analyze how feedback may be used to estimate the channel variation and how the frequency of the estimation impacts the performance of video streaming with the CoDiO scheduler. Students will simulate video streaming and scheduling in the network simulator ns-2 (code and help will be provided!). Motivated students may also want to consider the impact of time-varying cross traffic over the bottleneck link.

 

References

[1] P. A. Chou and Z. Miao, "Rate-distortion optimized streaming of packetized media," IEEE Transactions on Multimedia, February 2001. Submitted. Can be found at: http://research.microsoft.com/~pachou

[2] M. Kalman, P. Ramanathan, and B. Girod, "Rate-Distortion Optimized Streaming with Multiple Deadlines," Proc. IEEE International Conference on Image Processing, ICIP-2003, Barcelona, Spain, Sept. 2003. Can be found at: http://www.stanford.edu/~bgirod/publications.html

[3] E. Setton and B. Girod, "Congestion-distortion optimized scheduling of video over a bottleneck link," submitted to MMSP 2004. Can be found at: http://www.stanford.edu/~esetton/publications.htm

[4] J. Chakareski, S. Han, and B. Girod, "Layered Coding vs. Multiple Descriptions for Video Streaming Over Multiple Paths," Proc. ACM Multimedia 2003, Berkeley, CA, Nov. 2003. Can be found at: http://www.stanford.edu/~bgirod/publications.html

 

 

Area

 

Video Streaming Over Ad Hoc Wireless Network

In an ad hoc wireless network, resources are limited and channel qualities fluctuate over time. This poses intriguing challenges for video streaming, an application with stringent delay constraints and high bitrate requirement. Design of the video source coding, rate allocation over the channels, as well as the routing algorithm, need to be tailored for the characteristics of the network, or optimized jointly.

 

Contact

Xiaoqing Zhu

 

Topics

1. Rate adaptation over wireless ad hoc network

One way to mitigate the effect of channel fluctuation on received video quality is to adapt the source rate accordingly and avoid unnecessary packet losses. Rate adaptation techniques include straightforward transcoding [1], skipping less important frames for prediction-based video coders (e.g., MPEG-2 B frames), dropping enhancement information in a stream with a layered representation (e.g., in H.263+), switching between multiple representations of a video stream (e.g., SP frames in H.26L), or simply truncating bitstreams if the video stream has an embedded representation (e.g., FGS coding in MPEG-4/H.26L or 3-D wavelet coders [2]).
In this project, one can choose one or more of the rate adaptation techniques and see how they perform over a time-varying wireless ad hoc network. Experiments can be carried out with simple Markovian channel models or over a simulated network using ns-2 [3].
Starter code and help are provided.
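A simple Markovian channel model of the kind mentioned above can be sketched as a two-state (good/bad) packet-loss chain. This is not the provided starter code; the transition probabilities, loss probabilities and function name are assumptions chosen for illustration.

```python
import numpy as np

def simulate_two_state_channel(n_packets, p_gb, p_bg, loss_good, loss_bad, seed=0):
    """Simulate packet losses on a two-state (good/bad) Markov channel."""
    rng = np.random.default_rng(seed)
    lost = np.zeros(n_packets, dtype=bool)
    bad = False  # start in the good state
    for i in range(n_packets):
        lost[i] = rng.random() < (loss_bad if bad else loss_good)
        # State transition before the next packet.
        if bad:
            bad = rng.random() >= p_bg   # stay bad with probability 1 - p_bg
        else:
            bad = rng.random() < p_gb    # go bad with probability p_gb
    return lost

# Hypothetical parameters.
p_gb, p_bg = 0.05, 0.25
loss_good, loss_bad = 0.01, 0.5
lost = simulate_two_state_channel(100_000, p_gb, p_bg, loss_good, loss_bad)

# Stationary probability of the bad state and the resulting average loss rate.
pi_bad = p_gb / (p_gb + p_bg)
expected_loss = (1 - pi_bad) * loss_good + pi_bad * loss_bad
print(f"simulated loss rate {lost.mean():.4f}, stationary value {expected_loss:.4f}")
```

A rate adaptation scheme can then be evaluated by feeding its packet stream through the simulated loss pattern.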

2. Performance of ad hoc routing protocols for video streaming

Routing protocols for ad hoc wireless networks have been a popular research topic [4]. Existing algorithms include Dynamic Source Routing (DSR) [5], Ad hoc On-Demand Distance Vector (AODV) [6], etc. The design objectives of these algorithms generally involve minimizing the number of hops in a path, and may not necessarily work well with video streaming. The goal of the project is to evaluate the performance of the routing protocols for video streaming over an ad hoc network. Experiments will be performed over a simulated network using ns-2 [3]. (The mobile nodes and routing algorithms are provided by the network simulator.) One may also propose additional improvements for the video streaming scheme. Starter code and help are offered.
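Why a minimum-hop objective may not suit video can be seen on a toy graph. The sketch below compares a minimum-hop path, found by breadth-first search, with the path that maximises the smallest link capacity. It does not implement DSR or AODV; the topology, the capacities and the function names are hypothetical.

```python
import heapq
from collections import deque

def _trace(parent, dst):
    """Rebuild the path ending at dst from a parent map."""
    path, node = [], dst
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]

def min_hop_path(graph, src, dst):
    """Breadth-first search: path with the fewest hops."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return _trace(parent, dst) if dst in parent else None

def widest_path(graph, src, dst):
    """Dijkstra variant: path maximising the smallest link capacity."""
    best = {src: float("inf")}
    parent = {src: None}
    heap = [(-best[src], src)]
    while heap:
        neg_width, u = heapq.heappop(heap)
        if -neg_width < best.get(u, 0):
            continue  # stale heap entry
        for v, cap in graph[u].items():
            width = min(-neg_width, cap)
            if width > best.get(v, 0):
                best[v], parent[v] = width, u
                heapq.heappush(heap, (-width, v))
    return (_trace(parent, dst), best[dst]) if dst in best else None

# Hypothetical topology: link capacities in Mbit/s, symmetric links.
graph = {
    "A": {"D": 0.2, "B": 2.0},
    "B": {"A": 2.0, "C": 2.0},
    "C": {"B": 2.0, "D": 2.0},
    "D": {"A": 0.2, "C": 2.0},
}
print(min_hop_path(graph, "A", "D"))
print(widest_path(graph, "A", "D"))
```

Here the one-hop route uses a link that is too slow for a video stream, while the three-hop route offers ten times the capacity.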

 

References

[1] A. Vetro, C. Christopoulos and H. Sun, "Video transcoding architectures and techniques: an overview", IEEE Signal Processing Magazine, March 2003, pp. 18-29 [PDF]

[2] X. Zhu, S. Han and B. Girod, "Congestion-optimized multipath streaming of video over ad hoc wireless network", IEEE International Conference on Image Processing, ICIP-2004, Singapore, October, 2004 [PDF]

[3] NS-2: The Network Simulator http://www.isi.edu/nsnam/ns

[4] E. M. Royer and C-K Toh, "A review of current routing protocols for ad hoc mobile wireless network", IEEE Personal Communication, April 1999, pp.46-55 [PDF]

[5] D. B. Johnson and D. A. Maltz, "Dynamic source routing in ad hoc wireless networks", Mobile Computing, 1996 [PDF]

[6] M. K. Marina and S. R. Das, "On demand multipath distance vector routing in ad hoc network", IEEE International Conference on Network Protocols, Riverside, CA, USA, November 2001, pp. 14-23 [PDF]

 

 

Area

SP-frames for video coding

 

Contact

Prashant Ramanathan

 

Topics

1. Investigation of SP-frames for video coding

SP-frames have been incorporated in the H.264 standard to allow for switching between bitstreams that have different quality levels, as well as enable error resiliency, and random access without any prediction mismatch [1]. In this project, you will implement SP-frames, and investigate and optimize coder parameters and see how they affect overall compression efficiency under various conditions. Possible improvements include better entropy coding, combining loop filter with the transform for prediction error, and selecting the best quantization parameters for the primary and the switching bitstreams.

 

References

[1] M. Karczewicz, R. Kurceren, "The SP- and SI-Frames Design for H.264/AVC," IEEE Trans. CSVT, July 2003.

 

 

 

III. LIGHT FIELDS

 

Area

Light Field Streaming & Compression

 

Contact

Prashant Ramanathan

Contact Prashant Ramanathan at pramanat@stanford.edu for more details, and
pointers to relevant papers.

 

Topics

1. User Interaction Modelling for Light Field Streaming

Light fields are image-based rendering data sets that allow users to experience a photo-realistic environment or interact with a photo-realistic object.  Recently, there has been preliminary work on rate-distortion optimized streaming of these large light field data sets over a lossy packet network.  Currently, this work assumes perfect knowledge of the user's view trajectory.  In this project, you would investigate and model the way a user interacts with a light field, and predict the future actions of the user, for use by the streaming system.

2. Geometry/Image Trade-off in Light Field Compression

Light fields are image-based rendering data sets that tend to be very large.  Accurate geometry is important for the efficient compression of these data sets, but encoding the geometry more accurately also requires more bits. In this project, you will experimentally (and possibly theoretically) investigate the optimal trade-off between the bits spent on geometry and on image information.  You will experiment with and possibly implement various geometry coding algorithms, and examine how they affect the trade-off.

3. Good Heuristic Schemes for Light Field Streaming Packet Scheduling

Rate-distortion optimized (RaDiO) packet scheduling has been applied to the problem of interactive light field streaming [1]. Effective rate control and computational complexity are challenges that the RaDiO approach faces in order to be useful in a practical setting. Recent work on streaming scalable light field data sets [2] takes a different approach, which is less computationally complex and more amenable to rate control. This project explores the design and implementation of a "heuristic" algorithm for packet scheduling that can come close to RaDiO in terms of R-D performance, but is more practical.


References 

[1] P. Ramanathan, M. Kalman, B. Girod, "Rate-Distortion Optimized Streaming of Compressed Light Fields", Proc. ICIP 2003, Barcelona, Spain, Sep 2003.

[2] C.-L. Chang, B. Girod, "Rate-Distortion Optimized Interactive Streaming for Scalable Bitstreams of Light Fields", VCIP-2004.

Other: Contact Prashant.

 

 

 

IV. WAVELET VIDEO CODING

 

Area

3-D wavelet video coding with scalable motion vectors

Over the years, many researchers have proposed 3-D wavelet coding of video sequences. Thanks to the multi-resolution nature of wavelet transforms as well as efficient embedded coding of the wavelet coefficients, 3-D subband video coding provides great support for scalability, a very desirable feature when transmitting video over the network. However, linear transforms applied in the temporal direction may be inefficient if the motion between frames is not fully exploited.

Many attempts have been made to incorporate motion compensation into the 3-D wavelet video coding framework. Earlier works are somewhat unsatisfactory in terms of rate-distortion performance because the motion vector field is severely restricted and the temporal transform is usually limited to the two-tap Haar wavelet. Recently, motion-compensated lifting has been proposed, which successfully incorporates unrestricted motion compensation into 3-D wavelet coding and provides compression efficiency approaching that of state-of-the-art predictive video coding schemes [1][2][3].
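The lifting structure is what keeps the temporal transform invertible whatever motion model is used. The sketch below shows one level of a motion-compensated Haar transform on a pair of frames, written without normalisation. As a strong simplification it assumes the motion is a single global circular shift; the function names and frame sizes are hypothetical, and the schemes in [1][2][3] use far more general motion fields and longer filters.

```python
import numpy as np

def warp(frame, mv):
    """Toy motion model: circular shift of the whole frame by mv = (dy, dx)."""
    return np.roll(frame, shift=mv, axis=(0, 1))

def mc_haar_forward(a, b, mv):
    """One level of motion-compensated Haar lifting on the frame pair (a, b)."""
    high = b - warp(a, mv)                        # predict step
    low = a + 0.5 * warp(high, (-mv[0], -mv[1]))  # update step
    return low, high

def mc_haar_inverse(low, high, mv):
    """Undo the lifting steps in reverse order."""
    a = low - 0.5 * warp(high, (-mv[0], -mv[1]))
    b = high + warp(a, mv)
    return a, b

rng = np.random.default_rng(0)
a = rng.standard_normal((16, 16))
b = warp(a, (1, 2))                               # second frame is a shifted copy

low, high = mc_haar_forward(a, b, mv=(1, 2))      # correct motion: no residual
a_rec, b_rec = mc_haar_inverse(low, high, mv=(1, 2))

low_w, high_w = mc_haar_forward(a, b, mv=(0, 0))  # wrong motion: larger residual
a_w, b_w = mc_haar_inverse(low_w, high_w, mv=(0, 0))
print(np.abs(high).max(), np.abs(high_w).max())
```

Reconstruction is exact in both cases; the accuracy of the motion only changes how much energy ends up in the high-pass frame.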

 

Contact

Chuo-Ling Chang and Sangeun Han

 

Topics

1. 3-D Wavelet Video Coding with Scalable Motion Coding and Optimized Bit Allocation

In video coding, motion vectors are usually encoded losslessly as side information. The number of bits available for coding the motion vectors directly affects the efficiency of motion compensation, hence significantly influences compression performance. In non-scalable coders, various techniques have been used to optimize the portion of bit-rate spent on motion vectors for a target total bit-rate. However, in scalable video coding, such as 3-D wavelet video coding, the target total bit-rate is unknown during encoding.

For 3-D wavelet video coding, a scalable motion coding scheme using techniques similar to JPEG 2000 has been proposed in conjunction with scalable coding of the subbands (frame texture) [4]. In addition, it is shown that the distortion in the reconstructed video frames can be approximated as a linear function of the distortion in the reconstructed motion vectors. For such a scheme, the total bit-rate is the sum of the motion bit-rate and the subband bit-rate, and the allocation between the two plays an important role in the resulting compression performance. In [4], the best trade-off between the motion bit-rate and the subband bit-rate is determined by an exhaustive search over combinations of certain bit-rates. The interaction between the motion bit-rate and the subband bit-rate, and the resulting distortion in the reconstructed video frames, are yet to be established and verified in order to achieve a joint bit allocation framework.
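An exhaustive search over combinations of motion and subband bit-rates can be sketched as below. The additive model, in which the frame distortion is the subband distortion plus a factor alpha times the motion-vector distortion, is an assumption of this example inspired by the linear approximation mentioned above. The operating points, alpha and the function name are hypothetical and are not results from [4].

```python
from itertools import product

def best_allocation(motion_rd, subband_rd, total_rate, alpha):
    """Exhaustive search over (motion, subband) operating points within budget."""
    best = None
    for (r_m, d_m), (r_s, d_s) in product(motion_rd, subband_rd):
        if r_m + r_s > total_rate:
            continue  # combination exceeds the total bit-rate
        d = d_s + alpha * d_m  # assumed additive distortion model
        if best is None or d < best[0]:
            best = (d, r_m, r_s)
    return best

# Hypothetical operating points as (rate, distortion) pairs.
motion_rd = [(10, 4.0), (20, 1.5), (40, 0.5)]
subband_rd = [(60, 30.0), (80, 23.0), (100, 17.0), (120, 14.0)]

best = best_allocation(motion_rd, subband_rd, total_rate=120, alpha=5.0)
print(best)  # (distortion, motion rate, subband rate)
```

The cost of this search grows with the product of the numbers of operating points, which is one motivation for an analytical joint allocation framework.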

 

References

[1] A. Secker and D. Taubman, "Motion-compensated highly scalable video compression using an adaptive 3D wavelet transform based on lifting," in Proc. IEEE Int. Conf. on Image Processing 2001, Thessaloniki, Greece, Oct. 2001, vol. 2, pp. 1029-1032.

[2] B. Pesquet-Popescu and V. Bottreau, "Three dimensional lifting schemes for motion compensated video compression," in Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing 2001, Salt Lake City, UT, USA, May 2001, vol. 3, pp. 1793-1796.

[3] L. Luo, J. Li, S. Li, et al., "Motion compensated lifting wavelet and its application in video coding," in Proc. IEEE Int. Conf. on Multimedia and Expo 2001, Tokyo, Japan, Aug. 2001, pp. 481-484.

[4] D. Taubman and A. Secker, "Highly scalable video compression with scalable motion coding," in Proc. IEEE Int. Conf. on Image Processing 2003, Barcelona, Spain, Sep. 2003, vol. 3, pp. 273-276.

 

 

 

 

 

 



Please contact us if you have any questions about this page. Last modified: 20 Apr 2004.