SCIEN Colloquium Abstracts
Super-Resolution from Unregistered Aliased Images
Dr. Luciano Sbaiz
Abstract
Aliasing in images is often considered a nuisance. Artificial low-frequency patterns and jagged edges appear when an image is sampled at too low a frequency. However, aliasing also conveys useful information about the high-frequency content of the image, which is exploited in super-resolution applications. We use a set of input images of the same scene to extract such high-frequency information and create a higher-resolution, aliasing-free image. Typically, there is a small shift or more complex motion between the different images, such that they contain slightly different information about the scene. Super-resolution image reconstruction can be formulated as a multichannel sampling problem with unknown offsets. This results in a set of equations that are linear in the unknown signal coefficients but nonlinear in the offsets. If a part of the image spectra is free of aliasing, the planar shift and rotation parameters can be computed using only this low-frequency information. In such a case, the images can be registered pairwise to a reference image. Such a method is not applicable if the images are undersampled by a factor of two or larger; in that case, a higher number of images needs to be registered jointly. Two subspace methods are discussed for such highly aliased images. The first approach is based on a Fourier description of the aliased signals as a sum of overlapping parts of the spectrum. It uses a rank condition to find the correct offsets. The second one uses a more general expansion in an arbitrary Hilbert space to compute the signal offsets. The sampled signal is represented as a linear combination of sampled basis functions. The offsets are computed by projecting the signal onto varying subspaces. Under certain conditions, in particular for bandlimited signals, the nonlinear super-resolution equations can be written as a set of polynomial equations. Using Buchberger's algorithm, the solution can then be computed as a Gröbner basis for the corresponding polynomial ideal.
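The statement that the equations are linear in the signal coefficients but nonlinear in the offsets can be illustrated with a small one-dimensional sketch. It is not the speaker's method: the periodic bandlimited signal, the three channels, the offset values and the helper names `sampling_matrix` and `fit` are all assumptions chosen for illustration. With the offsets fixed, the Fourier coefficients follow from an ordinary least-squares solve; a wrong offset guess leaves a nonzero residual, which is the quantity a registration method would have to drive to zero.

```python
import numpy as np

def sampling_matrix(offsets, n_samples, max_freq):
    """Stack, for every channel, the Fourier basis exp(2*pi*i*m*x) at its sample positions."""
    freqs = np.arange(-max_freq, max_freq + 1)
    blocks = []
    for t in offsets:
        positions = np.arange(n_samples) / n_samples + t  # shifted uniform grid on [0, 1)
        blocks.append(np.exp(2j * np.pi * np.outer(positions, freqs)))
    return np.vstack(blocks)

def fit(offsets, samples, n_samples, max_freq):
    """Linear least-squares solve for the coefficients, given candidate offsets."""
    A = sampling_matrix(offsets, n_samples, max_freq)
    coeffs = np.linalg.lstsq(A, samples, rcond=None)[0]
    return coeffs, np.linalg.norm(A @ coeffs - samples)

rng = np.random.default_rng(0)
MAX_FREQ, N_SAMPLES = 3, 4            # 7 unknown coefficients, 4 samples per channel (aliased)
true_coeffs = rng.standard_normal(7) + 1j * rng.standard_normal(7)
true_offsets = [0.0, 0.10, 0.27]      # hypothetical sub-sample shifts of three channels
samples = sampling_matrix(true_offsets, N_SAMPLES, MAX_FREQ) @ true_coeffs

# Correct offsets: the linear system is consistent and the coefficients are recovered.
recovered, residual_true = fit(true_offsets, samples, N_SAMPLES, MAX_FREQ)
# Wrong offset for the second channel: no coefficient vector explains the samples.
_, residual_wrong = fit([0.0, 0.20, 0.27], samples, N_SAMPLES, MAX_FREQ)
```

Each channel alone has too few samples for the seven coefficients; only the stacked system with the right offsets is consistent, which is why the offsets have to be estimated jointly with the signal.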
Biography:
Dr. Luciano Sbaiz received the "Laurea in Ingegneria" degree in electronic engineering and the Ph.D. degree from the University of Padova, Padova, Italy, in December 1993 and June 1998, respectively. Between 1998 and 1999, he was a Postdoctoral Researcher at the Audiovisual Communications Laboratory at Ecole Polytechnique Federale de Lausanne (EPFL), Lausanne, Switzerland, where he conducted research on the application of computer vision techniques to the creation of video special effects. In 1999, he joined Dartfish Ltd., Fribourg, Switzerland, as Project Manager. Within the company, he developed video special effects for television broadcasting and sports analysis. In 2004, he became a Senior Researcher at the Audiovisual Communications Laboratory at EPFL, where he conducts research on signal processing. His activities are in the field of image and audio processing, super-resolution techniques, and tomography.
The Evolution of Video Quality Measurement
Dr. Stefan Winkler
Abstract
With the ever-increasing complexity of digital video services, video quality has become a critical issue. This encompasses not only the Quality of Service (QoS) of the delivery network, but even more importantly, the Quality of Experience (QoE) of the viewers. In this talk, I'll give an overview of the current state of the art of video quality assessment methods and summarize the main standardization efforts in this area. I'll also take a closer look at emerging new generations of quality metrics, in particular so-called "hybrid" approaches combining network and content metrics for QoE measurement, and discuss some aspects of audiovisual quality.
Biography:
Dr. Stefan Winkler is the Principal Technologist for Symmetricom's QoE Assurance Division. Formerly, he was the Chief Scientist and co-founder of Genista Corporation, a provider of quality assurance solutions for IPTV and mobile media. He has also held assistant professor positions at the National University of Singapore (NUS) and the University of Lausanne, Switzerland. He has published more than 40 papers on perceptual quality measurement and is the author of the book "Digital Video Quality." He has also been a member of and contributor to the Video Quality Experts Group (VQEG) since its foundation in 1997. Dr. Winkler holds a Master of Science in Electrical Engineering from the University of Technology in Vienna, Austria, and a doctorate from the Ecole Polytechnique Federale de Lausanne (EPFL), Switzerland.
Did the great masters "cheat" using optics? Computer image analysis of Renaissance masterpieces sheds light on a controversial theory
Dr. David Stork
Abstract
In 2001, artist David Hockney and scientist Charles Falco stunned the art world with a controversial theory that, if correct, would profoundly alter our view of the development of image making. They claimed that as early as 1420, Renaissance artists employed optical devices such as concave mirrors to project images onto their canvases, which they then traced or painted over. In this way, the theory attempts to explain the newfound heightened naturalism or "opticality" of painters such as Jan van Eyck, Robert Campin, Hans Holbein the Younger, and many others.
This talk will describe the application of rigorous computer image analysis to masterpieces adduced as evidence for this theory. It covers basic geometrical optics of image projection, the analysis of perspective, curved surface reflections, shadows, lighting and color. While there remain some loose ends, such analysis of the paintings, infra-red reflectograms, modern reenactments, internal consistency of the theory, and alternate explanations allows us to judge with high confidence the plausibility of this controversial theory. You may never see Renaissance paintings the same way again.
Joint work with Antonio Criminisi and Christopher W. Tyler.
Biography:
David G. Stork is Chief Scientist of Ricoh Innovations and has taught "Light, Color and Visual Phenomena," "Pattern Classification," "Optics, perspective and Renaissance painting," and other courses at Stanford University. He studied art history at Wellesley College and was Artist-in-Residence through the New York State Council of the Arts. He holds 35 patents and his five books include Seeing the Light: Optics in Nature, Photography, Color, Vision and Holography with D. Falk and D. Brill and Pattern Classification (2nd ed.) with R. Duda and P. Hart. He was one of four scientists invited to analyze Mr. Hockney's theory at a major symposium at the New York Institute for the Humanities in December 2001.
See also: http://www.diatrope.com/stork/HockneyComputerVision.html, www.diatrope.com/stork/FAQs.html
A Lateral Chromatic Aberration Correction System for Ultrahigh-definition Color Video Camera
Takayuki Yamashita
Abstract
We have developed a color camera for Super Hi-Vision - a 4000-scanning-line ultrahigh-definition video system - with a 5x zoom lens and a signal-processing system incorporating a function for real-time lateral chromatic aberration correction. The chromatic aberration of the lens degrades color image resolution. However, to enable low chromatic aberration, a lens usually has to be large, which makes the zoom lens for Super Hi-Vision cameras difficult to downsize. To develop a compact zoom lens consistent with ultrahigh-resolution characteristics, we incorporated a real-time correction function in the signal-processing system. In our current prototype system, we have focused on correction in relation to only the focal length. The signal-processing system has a memory table to store the correction data for each focal length on the blue and red channels. When the focal length is input from the lens system, the relevant correction data are selected. The system then performs geometrical conversion on both channels using these correction data. In this way, lateral chromatic aberration was successfully reduced to an amount small enough to ensure that the desired image resolution was achieved over the entire range of the lens in real time.
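The table-driven correction described above can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not specify the form of the correction data or of the geometrical conversion, so the sketch assumes a single radial scale factor per channel, a nearest-entry lookup, and bilinear resampling. The table values and the names `CORRECTION_TABLE`, `select_correction`, `rescale_channel` and `correct_lateral_ca` are hypothetical.

```python
import numpy as np

# Hypothetical correction table: focal length (mm) -> radial scale per channel.
# Real correction data would be measured for the specific lens.
CORRECTION_TABLE = {
    10.0: {"red": 1.004, "blue": 0.997},
    20.0: {"red": 1.002, "blue": 0.998},
    40.0: {"red": 1.001, "blue": 0.999},
}

def select_correction(focal_length_mm):
    """Pick the table entry whose focal length is nearest to the reported one."""
    nearest = min(CORRECTION_TABLE, key=lambda f: abs(f - focal_length_mm))
    return CORRECTION_TABLE[nearest]

def rescale_channel(channel, scale):
    """Magnify a channel about the image centre by `scale` using bilinear interpolation."""
    h, w = channel.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    # For each output pixel, find where it comes from in the uncorrected channel.
    src_y = np.clip(cy + (ys - cy) / scale, 0, h - 1)
    src_x = np.clip(cx + (xs - cx) / scale, 0, w - 1)
    y0, x0 = np.floor(src_y).astype(int), np.floor(src_x).astype(int)
    y1, x1 = np.minimum(y0 + 1, h - 1), np.minimum(x0 + 1, w - 1)
    wy, wx = src_y - y0, src_x - x0
    top = channel[y0, x0] * (1 - wx) + channel[y0, x1] * wx
    bottom = channel[y1, x0] * (1 - wx) + channel[y1, x1] * wx
    return top * (1 - wy) + bottom * wy

def correct_lateral_ca(rgb, focal_length_mm):
    """Resample red and blue so that they register with the untouched green channel."""
    factors = select_correction(focal_length_mm)
    out = rgb.astype(float).copy()
    out[..., 0] = rescale_channel(out[..., 0], factors["red"])
    out[..., 2] = rescale_channel(out[..., 2], factors["blue"])
    return out

rgb = np.random.default_rng(0).random((8, 8, 3))
corrected = correct_lateral_ca(rgb, focal_length_mm=23.0)  # uses the 20 mm entry
```

Leaving green as the reference and warping only red and blue mirrors the abstract's statement that correction data are stored for the blue and red channels.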
Speaker Biography:
Takayuki Yamashita received B.E. and M.E. degrees in electronics and information science from Kyoto Institute of Technology in 1993 and 1995, respectively. He joined NHK (Japan Broadcasting Corp.) in 1995 and has been engaged in research on HDTV cameras at NHK's Science and Technical Research Laboratories since 1999. He is working on ultrahigh-definition TV camera systems.
The Compressive Optical MONTAGE Photography Initiative
Professor David Brady
Addy Family Professor of Electrical and Computer Engineering, Pratt School of Engineering, Duke University
Abstract:
MONTAGE is a DARPA MTO program to use emerging computational imaging strategies to change the form factor of digital imaging systems. As a project within MONTAGE, COMP-I aims to produce digital imaging systems with an order of magnitude reduction in system thickness from the current baseline. In addition to the DARPA system metrics, COMP-I uses compressive sampling to reduce image data rates at the physical/discrete interface. The first phase of COMP-I used multichannel generalized sampling to produce 2 mm thick visible and 2.3 mm thick LWIR imagers.
COMP-I Phase I explored shift codes, focal plane codes and psf codes to modulate the sensor channel for nondegenerate multichannel sampling.
COMP-I is led by the Duke Integrated Sensing and Processing group (DISP) and includes participants from Raytheon, Digital Optics Corporation and the Universities of Delaware and North Carolina Charlotte as well as Michigan Tech and Rice. This talk briefly describes DISP projects in compressive spectroscopy and imaging, outcomes from COMP-I phase I and plans for COMP-I phase II.
Speaker Bio:
David J. Brady is the Addy Family Professor of Electrical and Computer Engineering in the Pratt School of Engineering at Duke University and Principal Investigator for DISP. Professor Brady graduated from Macalester College with a B.A. in Physics and Mathematics prior to earning M.S. and Ph.D. degrees from Caltech in Applied Physics. He was on the ECE faculty at the University of Illinois from 1990-2001 prior to moving to Duke University, where he was the founding director of the Fitzpatrick Institute for Photonics. Brady was a David and Lucile Packard Foundation Fellow from 1990-1995 and is a Fellow of the Optical Society of America. Brady's research focuses on computational imaging and spectroscopy.
High Dynamic Range Imaging
Greg Ward
Abstract:
High dynamic range imaging is a prominent trend in graphics, with implications for digital photography, film, special effects, and virtual reality. The speaker will describe the techniques and technologies behind HDR imaging, covering methods for capture, representation, and display. The talk will feature live demonstrations of HDR image capture using a standard digital camera and real-time HDR display. The speaker will also address tone-mapping and gamut-mapping issues for low dynamic range output and printing.
Speaker Bio:
Greg Ward developed the first widely-used HDR image file format in 1986 as part of the Radiance lighting simulation system. In 1998, he introduced the more advanced LogLuv TIFF encoding, and more recently, created a backwards-compatible HDR extension to JPEG. He is the author of the Mac OS X application Photosphere, which provides advanced HDR assembly and cataloging and is freely available from www.anyhere.com. He is also coauthor of a new book from Morgan Kaufmann Publishers entitled "High Dynamic Range Imaging" by Reinhard, Ward, Pattanaik and Debevec ( http://www.anyhere.com/gward/papers/REINHARD_Flyer.pdf ). He is currently working as an independent consultant in Albany, California.
Dual Photography
Dr. Hendrik P. A. Lensch
We present a novel photographic technique called dual photography, which exploits Helmholtz reciprocity to interchange the lights and cameras in a scene. With a video projector providing structured illumination, reciprocity permits us to generate pictures from the viewpoint of the projector, even though no camera was present at that location. The technique automatically captures all light transport paths, including shadows, inter-reflections and caustics. In its simplest form, the technique can be used to take photographs without a camera; we demonstrate this by capturing a photograph using a projector and a photo-resistor. If the photo-resistor is replaced by a camera, we can produce a 4D dataset that allows for relighting with 2D incident illumination. Using an array of cameras we can produce a 6D slice of the 8D reflectance field that allows for relighting with arbitrary light fields. Since an array of cameras can operate in parallel without interference, whereas an array of light sources cannot, dual photography is fundamentally a more efficient way to capture such a 6D dataset than a system based on multiple projectors and one camera. Still, the sampling of the reflectance field with regard to the camera positions (virtual projectors) is rather sparse. Therefore, we also developed a technique for interpolating between virtual projector positions, allowing for simulation of moving projectors or area light sources. As an example, we show how dual photography can be used to capture and relight scenes.
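The interchange of lights and cameras can be sketched with a light transport matrix. The sketch assumes that light transport is linear, so a camera image c is related to a projector pattern p by c = T p, and that reciprocity lets the transposed matrix describe the dual configuration. The tiny random scene, the one-pixel-at-a-time measurement of T and the names `capture`, `T_true` and `dual_image` are illustrative assumptions, not the acquisition scheme used in the talk.

```python
import numpy as np

rng = np.random.default_rng(1)
N_PROJECTOR, N_CAMERA = 6, 4                     # tiny "images", flattened to vectors
T_true = rng.random((N_CAMERA, N_PROJECTOR))     # hypothetical scene: unknown to the method

def capture(projector_pattern):
    """Stand-in for a physical exposure: camera pixels under a given projector pattern."""
    return T_true @ projector_pattern

# Brute-force measurement of the transport matrix: light one projector pixel at a time,
# so each captured image is one column of T.
T = np.column_stack([capture(e) for e in np.eye(N_PROJECTOR)])

# Primal image: what the camera sees with the projector fully lit.
primal_image = capture(np.ones(N_PROJECTOR))

# Dual image: the scene as seen from the projector, lit from the camera position.
# By reciprocity the dual transport is the transpose of T.
camera_side_light = np.ones(N_CAMERA)
dual_image = T.T @ camera_side_light
```

With `N_CAMERA = 1` the camera degenerates to a single light-sensitive element, which corresponds to the photo-resistor case in the abstract: `T.T` then has one column, and that column is already the picture from the projector's viewpoint.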
Biography
Hendrik P. A. Lensch is a visiting assistant professor with Marc Levoy at Stanford University, where he leads the research group "General Appearance Acquisition" of the Max Planck Center for Visual Computing and Communication (Saarbrücken / Stanford). He received his diploma in computer science from the University of Erlangen in 1999 after studying both in Erlangen and at the KTH in Stockholm, Sweden. He worked as a research associate at Hans-Peter Seidel's computer graphics group at the MPI Informatik in Saarbrücken and received his PhD from Saarland University in 2003. His research interests include 3D appearance acquisition, relightable models and image-based rendering.
Computer Vision Research at Watson: From VeggieVision to PeopleVision
Dr. Arun Hampapur
IBM T.J. Watson Research Center
Computer Vision, the science of recognizing patterns in visual imagery, has a wide range of applications. This talk presents an overview of projects at the Exploratory Computer Vision Group in the IBM Watson Research Center. I will briefly describe our work in automatic object recognition (VeggieVision), automatic video indexing for broadcast (VideoVista), and audio-visual speech recognition. The focus of the talk will be on Biometrics and Video Surveillance.
In Biometrics, I will present our work on fingerprint matching, including acquisition, feature extraction and fingerprint verification. Anonymous biometrics is an effort to build technical solutions to the privacy and security challenges that arise from the widespread use of biometrics. Large-scale biometric matching explores the use of feature-based indexing techniques to address the accuracy and performance issues that arise in one-to-many matching.
PeopleVision is a project that is exploring the use of camera-based object detection, tracking and classification as the basis for building Smart Surveillance Systems. These systems are capable of automatically monitoring physical spaces to provide a variety of functionalities such as real-time behavioral alerts, automatic event-based retrieval, and event pattern analysis.
Smart surveillance systems have applications in a wide range of markets including Homeland Security, Retail, Travel and Transportation and Healthcare.
Biography: Dr. Arun Hampapur manages the Exploratory Computer Vision Group at the IBM T.J. Watson Research Center. The Exploratory Computer Vision Group is a 10-member team with PhDs from top universities around the world; the team currently has two thrusts: video surveillance and biometrics technologies. At IBM since 1997, Dr Hampapur is one of the early researchers in the field of Multimedia Database Management. Dr Hampapur obtained his PhD from the University of Michigan in 1995. Before moving to IBM he was leading the video effort at Virage Inc (1995 - 1997). At IBM Research, in addition to several research projects, Dr Hampapur served as a design consultant for the CNN Video archive system, a joint project between IBM and Sony. His role as an indexing technology expert included developing a technology adoption map for the customer and vendor qualification. Dr Hampapur now leads an Adventurous Research project called PeopleVision.
PeopleVision explores several aspects of understanding people and their actions using camera-based tracking. The technology developed in the PeopleVision project is currently being commercialized in the surveillance domain as the IBM Smart Surveillance System (S3). He has published more than 40 papers on various topics related to media indexing, video analysis, and video surveillance and holds 8 US patents. He is also active in the research community and serves on the program committees of several IEEE International conferences. He also served on an NSF review panel for small business innovation research. Dr Hampapur is an IEEE Senior Member.
"Super Hi-Vision" -
Ultrahigh Definition Video Camera System
Kohji Mitani
Senior Research Engineer
Advanced Television Systems
Science and Technical Research Laboratories
Japan Broadcasting Corporation (NHK)
We are investigating an ultrahigh-definition video system that can provide viewers with a greater sensation of reality than HDTV. The present target is to develop a system with 4000 scanning lines. This system should play a major role in any application that requires high resolution such as cinema, virtual museum, video archives, and telemedicine. A video camera and projection display together with a disc recorder system have been developed as experimental devices for the 4000 scanning line system. At the present time, the number of panel pixels is limited to 2k x 4k for both CCD and LCD.
Due to this resolution constraint, four panels (two green, one red and one blue) are combined to produce a resolution of 4k x 8k pixels (16 times that of HDTV) in both the camera and display. The two green panels are arranged by the diagonal-pixel-offset method to achieve the above resolution. This talk will describe the development of these devices and a new system to be exhibited at the EXPO 2005 that will be held in Aichi Pref., Japan.
Subpixel Rendering on Colour Matrix Displays.
Dean Messing, Louis Kerofsky and Scott Daly.
Sharp Labs of America.
It is well known that the luminance resolution of a colour matrix display exceeds the resolution implied by the spatial sampling frequency of the display pixels. This is due to the spatially discrete nature of the subpixels that comprise each pixel. By treating the subpixels as individual luminance-contributing elements it is possible to enhance resolution of the display. However, there are practical and theoretical difficulties in attaining this additional resolution, the most important of which is the attendant colour aliasing. The first half of our talk discusses the resolution enhancement problem and presents an image processing algorithm that increases the resolution of a flat-panel display having the usual one-dimensional "striped" subpixel geometry. By a clever use of properties of the Human Visual System (HVS) the added resolution is obtained without the usual annoying colour artifacts. As a coda to this half of the presentation we also discuss some interesting results related to multi-frame resolution enhancement and the role played by the image acquisition system's colour filter array (CFA). As it turns out, CFA Interpolation and Subpixel Rendering are dual problems.
The rendering algorithm presented in the first half is specific to panels that have striped RGB subpixel geometry. Until recently this geometry has been nearly universal. However, a new generation of colour matrix panels is on the horizon. These panels have both two-dimensional subpixel patterns and subpixels that do not (necessarily) use the standard RGB colour primaries. The second half of our talk discusses the rendering problem in the context of such panels and presents our recently proposed Optimal Rendering Framework for rendering imagery on such displays. This Framework uses the spatial sensitivities of the HVS and the subpixel geometry of the display to pose a Constrained Optimisation Problem, the solution of which yields an array of 2D optimal rendering filters. The constraints are determined by the display panel, and the Cost Function is determined by the HVS. This Framework is very general. It can support displays with quite arbitrary 2D geometries and colour primaries. We provide examples showing how to apply this framework to several different 2D subpixel geometries and to displays having more than three primaries. We conclude with a novel and elegant application of the Framework to a problem unrelated to its original intent.
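The idea of treating each subpixel of a striped RGB panel as its own luminance element can be sketched as below. This is not the authors' HVS-based algorithm, which the abstract does not detail. The sketch assumes a grayscale source image with three times the panel's horizontal pixel count, maps consecutive source columns directly onto the R, G and B stripes, and optionally applies a plain three-tap horizontal box filter as a crude stand-in for colour-aliasing suppression. The function name `subpixel_render` is hypothetical.

```python
import numpy as np

def subpixel_render(gray_hi, prefilter=True):
    """Map a grayscale image of width 3*W onto the R, G, B stripes of a W-pixel-wide panel.

    gray_hi   -- 2D array, shape (H, 3*W), one sample per subpixel position
    prefilter -- if True, apply a 3-tap horizontal box filter first; this trades some
                 sharpness for less colour fringing (an assumed, simplistic choice)
    """
    h, w3 = gray_hi.shape
    if w3 % 3 != 0:
        raise ValueError("source width must be a multiple of 3")
    img = gray_hi.astype(float)
    if prefilter:
        padded = np.pad(img, ((0, 0), (1, 1)), mode="edge")
        img = (padded[:, :-2] + padded[:, 1:-1] + padded[:, 2:]) / 3.0
    # Columns 3j, 3j+1, 3j+2 become the R, G, B subpixels of panel pixel j.
    return img.reshape(h, w3 // 3, 3)

source = np.arange(12, dtype=float).reshape(2, 6)     # 2 rows, 6 subpixel columns
sharp = subpixel_render(source, prefilter=False)      # shape (2, 2, 3), colour-aliased
smooth = subpixel_render(source, prefilter=True)      # same shape, fringing reduced
```

Without the prefilter, a one-subpixel-wide white line lights a single primary and appears coloured, which is the colour aliasing the abstract refers to; the box filter spreads that energy over neighbouring subpixels at the cost of some of the resolution gain.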
Media Representation - From Software Tools to Theory
Alexandros Eleftheriadis
Department of Electrical Engineering, Columbia University.
Media representation encompasses the various ways to digitally represent audio and video signals for the purposes of efficient creation, processing, storage, or transmission. It is a cross section of multimedia software, signal processing and compression, communication systems, and information theory. In this talk I briefly review some of our work in rate shaping, model-assisted video coding, mobile broadcast video, and semi-automatic video segmentation, and discuss how seemingly simple questions and the need to bridge the gap between low-level signal properties and high-level authoring structures have led us to the design of the Flavor language and the development of Complexity Distortion Theory. Flavor is a software tool for codec designers, whereas Complexity Distortion is a new theoretical framework for media representation based on complexity theory. I will also talk about our participation and the lessons learned in MPEG-4, an important milestone in media representation standards, as well as our current design philosophy in new software tools for next-generation media applications.
Adaptive Image Segmentation Based on Perceptual Color and Texture Features
Thrasyvoulos Pappas
Department of Electrical and Computer Engineering, Northwestern University, Evanston, Illinois.
We propose an image segmentation algorithm that is based on spatially adaptive color and texture features. The color features are based on the estimation of spatially adaptive dominant colors, which on one hand, reflect the fact that the human visual system cannot simultaneously perceive a large number of colors, and on the other, the fact that image colors are spatially varying. The spatially adaptive dominant colors are obtained using a previously developed adaptive clustering algorithm for color segmentation. The (spatial) texture features are based on a steerable filter decomposition, which offers an efficient and flexible approximation of early processing in the human visual system. In contrast to texture analysis/synthesis techniques that use a large number of parameters to describe texture, our segmentation algorithm relies on only a few parameters to segment the image into simple yet meaningful texture categories. Since texture feature estimation requires a finite neighborhood that limits spatial resolution, the proposed algorithm combines texture with color information to obtain accurate and precise edge localization. The performance of the proposed algorithm is demonstrated in the domain of photographic images, including low resolution, degraded, and compressed images.
Using Lightfields in Image Processing
Heinrich Niemann
Chair for Pattern Recognition, University of Erlangen-Nuremberg, Germany
A short introduction is given to the concept of a lightfield (LF) and to its recording by a single hand- or robot-manipulated camera. The need for at least rough scene geometry for good rendering quality is pointed out. A so-called free-form LF is used, where the camera positions can be arbitrary and, in particular, need not be on a regular grid. The main part of the talk is devoted to the potential usage of LFs for various tasks in image processing.
It is shown that the purely image based modeling of an object or a scene by a LF may be used to track an object and to self-localize a robot in an indoor environment. Both tasks basically amount, in a probabilistic framework, to a recursive state estimation which is performed by particle filters. For the tracking task the accuracy of state vector estimation was investigated for different resolutions and particle numbers. Self-localization was performed with a mobile platform.
Another application is to provide a 3-D recording, rendering, and augmentation of a (laparoscopic) surgery situation by a LF. The recording is being done by either a hand-held or a robot-manipulated camera. Sufficiently stable prominent points can be detected and tracked from different organs so that a camera calibration is possible. It was evaluated experimentally that the LF is useful to remove the effect of specular reflections from images taken by an endoscope during surgery. Ongoing work is directed towards augmenting the LF by vessels hidden from the camera and by organs from preoperative images.
Information Embedding and Digital Watermarking
Joachim Eggers
University of Erlangen-Nuremberg, Telecommunications Laboratory
The ease of perfect copying, distribution, and manipulation of digital data has become a significant problem for copyright protection and integrity verification of digitized multimedia content. Digital watermarking has been proposed as one possible method to combat these problems. Thus, information embedding and digital watermarking have gained a lot of attention in recent years. The many publications on information embedding and digital watermarking show a mutual improvement of watermarking schemes and attacks against embedded watermarks. However, recently, research on theoretical performance limits of digital watermarking has intensified as well. In this presentation, theoretical and experimental results for several more or less restricted watermarking scenarios are discussed. The focus is on blind watermarking, where the receiver does not have access to the original non-watermarked data. In this case, digital watermarking can be considered communication with side information about the original signal at the encoder. Further, robust digital watermarking schemes should be designed by considering digital watermarking a game between the watermark embedder and attacker. The presentation closes with example results for image watermarking.
Audio-Visual Content Indexing, Filtering, and Adaptation
Shih-Fu Chang
Digital Video and Multimedia Research Lab
Columbia University
Recently, researchers have been very active in developing new techniques and standards for audio-visual content description. Such descriptions can be used in innovative applications such as multimedia search engines, personalized media filters, and intelligent video navigators. In our research, we are particularly interested in emerging applications in two environments: personal media server (also called personal video recorder) and mobile multimedia device.
In this talk, I will first give an overview of technical issues arising in the exciting applications mentioned above. Then I will present our recent research in two specific areas.
The first area involves real-time event detection in specific application domains, such as sports. Such techniques are needed especially for filtering live broadcast programs in a mobile, time-sensitive environment. We will present a real-time system for sports event parsing and detection, combining approaches of machine learning, compressed-domain processing, and domain rule modeling. Demos of real-time performance will be shown. In addition, we will introduce a novel video streaming framework in which video bitrate is adapted dynamically according to how well detected events match the user's preferences. Such adaptation enables improved video quality and user experience by allocating scarce resources to video segments with important content. We will discuss interesting system-level issues inspired by such a content-adaptive streaming framework.
The second area involves parsing and summarizing high-level content in long unstructured programs, such as films. Here we will present our work on audio-visual scene segmentation using a multimedia integrative framework. Psychological memory-based models are used for detecting long-term scene boundaries. Structural patterns such as dialog and anchoring are analyzed. Multimedia cues (such as speech or silence) are incorporated for aligning audio-visual scenes. We will report promising results from experiments with films of different genres. Given the scene structure, we have also developed a unique approach to audio-visual content skimming by incorporating production syntax and perceptual complexity of video.
At the end, I will briefly review the emerging MPEG-7 standard. I will discuss the relationships between MPEG-7 and the above research tools. I will also briefly describe our work in indexing medical video in Columbia's Digital Library Project.
Perceptual Video Distortion Metrics And Coding
Henry Wu and Damian Tan
School of Computer Science and Software Engineering, Monash University, Australia
Quantitative quality and impairment metrics based on the human visual system have remained one of the most critical issues in the field of digital video coding and compression. The progress made on this issue significantly affects research activities in at least the following three areas: design of new high-performance coding/compression algorithms, quality/impairment measurements for digital video coding algorithms and products, and quantitative definition of "psychovisual redundancy".
This talk will discuss the vision model which we use in devising our blocking and ringing impairment metrics for digital video quality assessment [1], in comparison with those used in proposals in the forum of VQEG (Video Quality Experts Group). The performance of our vision-model-based impairment metrics has been evaluated, showing high correlation with the corresponding subjective test results.
A new perceptual image coder [2] will also be described which adopts the coding structure of the EBCOT with the proposed perceptual distortion measure in place of the MSE and the CVIS. Examples will be given to compare the performance of the new perceptual coder with that of the EBCOT with the MSE and CVIS.
References:
[1] Z. Yu, H.R. Wu, S. Winkler and T. Chen, "Objective assessment of blocking artifacts for digital video with a vision model", the Proceedings of the IEEE, November 2001.
[2] D. Tan, H.R. Wu and Z. Yu, "Perceptual coding of digital colour images", Proceedings of IEEE International Symposium on Intelligent Signal Processing and Communication Systems, November 2001.
What is the entropy of a 2-manifold graph?
Peter Schroeder
California Institute of Technology, Pasadena
With the increasing availability of 3D scanning methodologies, surfaces are emerging as a new multimedia datatype. Surface scans can be quite detailed and are typically given as a mesh, i.e., a set of samples on the surface together with neighborhood relations ("triangles"). Finding efficient representations for such surface descriptions is a subject of ongoing research.
In this talk I will consider a particular problem that arises in the compression of meshes: How efficiently can we represent the connectivity of a general polyhedral mesh? We have recently developed an algorithm based on entropy coding valence (number of neighbors of a vertex) and degree (number of vertices in a face) streams which is near optimal according to a census of all planar graphs. I will discuss the development of algorithms of this kind and how the optimality property is proven.
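The two symbol streams named above, vertex valences and face degrees, and the cost of entropy coding them can be sketched as follows. This is only a zeroth-order illustration, not the connectivity coder from the talk: it assumes a closed polygonal mesh given as lists of vertex indices, and it reports the empirical Shannon entropy per symbol rather than running an actual coder. The helper names and the square-pyramid example are hypothetical.

```python
import math
from collections import Counter

def empirical_entropy(symbols):
    """Shannon entropy in bits per symbol of the empirical symbol distribution."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def vertex_valences(faces):
    """Number of distinct neighbours of every vertex, in order of vertex index."""
    neighbours = {}
    for face in faces:
        n = len(face)
        for i, v in enumerate(face):
            # The previous and next vertex around the face are neighbours of v.
            neighbours.setdefault(v, set()).update((face[i - 1], face[(i + 1) % n]))
    return [len(neighbours[v]) for v in sorted(neighbours)]

def face_degrees(faces):
    """Number of vertices of every face."""
    return [len(face) for face in faces]

# Square pyramid: apex 0, quadrilateral base 1-2-3-4.
pyramid = [(0, 1, 2), (0, 2, 3), (0, 3, 4), (0, 4, 1), (4, 3, 2, 1)]
valences = vertex_valences(pyramid)      # [4, 3, 3, 3, 3]
degrees = face_degrees(pyramid)          # [3, 3, 3, 3, 4]
bits_per_valence = empirical_entropy(valences)
bits_per_degree = empirical_entropy(degrees)
```

A perfectly regular mesh has a constant valence stream with zero entropy, so irregularity in valence and degree is what such a coder pays for.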
Time permitting, I will speculate on the larger (and much harder) question of how to define the entropy of a surface.
Joint work with Andrei Khodakovsky, Pierre Alliez, and Mathieu Desbrun.
Context-based Adaptive Coding and the Emerging H.26L Video Compression Standard
Thomas Wiegand
Heinrich-Hertz-Institute, Berlin, Germany
H.26L is the current project of the ITU-T Video Coding Experts Group. The main goals of the new ITU-T H.26L standardization effort are a simple and straightforward video design to achieve enhanced compression performance and provision of a "network-friendly" packet-based video representation addressing "conversational" (i.e., video telephony) and "non-conversational" (i.e., storage, broadcast, or streaming) applications.
H.26L contains a new entropy coding scheme that is based on context-based adaptive binary arithmetic coding. In this entropy coding scheme, context models are utilized for efficient prediction of the coding symbols. The novel binary adaptive arithmetic coding technique is employed to match the conditional entropy of the coding symbols given the context model estimates. The adaptation is also employed to keep track of non-stationary symbol statistics. By using our new entropy coding scheme instead of the single variable-length code approach of the current TML, bit-rate savings of up to 32% can be achieved. As a remarkable outcome of our experiments, we observed that high gains are reached not only at high bit-rates, but also at very low rates.
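The effect of context modeling and adaptation can be sketched with a toy estimator. The code below is an assumption-laden illustration, not the H.26L design: it uses a simple count-based probability estimate per context (the class and function names, the test source, and the "previous bin" context are all hypothetical) and sums the ideal arithmetic-coding cost of -log2(p) bits per bin.

```python
import math


class AdaptiveBinaryModel:
    """Count-based probability estimate for one context (a stand-in estimator)."""

    def __init__(self):
        self.counts = [1, 1]  # smoothed counts of 0s and 1s seen so far

    def cost_and_update(self, bit):
        p = self.counts[bit] / (self.counts[0] + self.counts[1])
        self.counts[bit] += 1  # adapt to the statistics observed so far
        return -math.log2(p)   # ideal arithmetic-coding cost in bits


def ideal_code_length(bits, context_of):
    """Total ideal cost when each bin is coded with the model of its context."""
    models = {}
    total = 0.0
    for i, bit in enumerate(bits):
        ctx = context_of(bits, i)
        model = models.setdefault(ctx, AdaptiveBinaryModel())
        total += model.cost_and_update(bit)
    return total


# Hypothetical source: each bin usually repeats the previous bin.
bits = [0] * 20 + [1] * 20 + [0] * 20
no_context = ideal_code_length(bits, lambda b, i: 0)
prev_bit_context = ideal_code_length(bits, lambda b, i: b[i - 1] if i > 0 else 0)

print(f"single model: {no_context:.1f} bits, previous-bin context: {prev_bit_context:.1f} bits")
```

Conditioning on the previous bin lets each model see a skewed distribution, which is what lowers the total cost for this source.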
Quantized Frame Expansions with Erasures
Jelena Kovacevic
Bell Laboratories, Lucent Technologies
A large fraction of the information that flows across today's networks is useful even in a degraded condition. Examples include speech, audio, still images and video. When this information is subject to packet losses and retransmission is impossible due to real-time constraints, superior performance with respect to total transmitted rate, distortion, and delay may sometimes be achieved by adding redundancy to the bit stream rather than repeating lost packets. In multiple description coding, the data is broken into several streams with some redundancy among the streams. When all the streams are received, one can guarantee low distortion at the expense of having a slightly higher bit rate than a system designed purely for compression. On the other hand, when only some of the streams are received, the quality of the reconstruction degrades gracefully, which is very unlikely to happen with a system designed purely for compression. I will describe a scheme that achieves redundancy by using overcomplete expansions, i.e., frames. I will then discuss frame design issues in the presence of quantization and losses.
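A minimal sketch of the idea, assuming a specific frame chosen only for illustration: three unit vectors 120 degrees apart in the plane. A 2D signal is expanded into three coefficients, the coefficients are uniformly quantized, and the signal is reconstructed by least squares from whichever coefficients arrive. The function names, the quantizer step, and the test vector are hypothetical and not taken from the talk.

```python
import math

# Three unit vectors 120 degrees apart: a redundant (overcomplete) frame for R^2.
ANGLES = (math.pi / 2, math.pi / 2 + 2 * math.pi / 3, math.pi / 2 + 4 * math.pi / 3)
FRAME = [(math.cos(a), math.sin(a)) for a in ANGLES]


def analyze(x):
    """Frame coefficients <x, f_k>, one per transmitted description."""
    return [f[0] * x[0] + f[1] * x[1] for f in FRAME]


def quantize(coeffs, step):
    """Uniform scalar quantization of each coefficient."""
    return [step * round(c / step) for c in coeffs]


def reconstruct(coeffs, received):
    """Least-squares reconstruction from the coefficients indexed by `received`."""
    a11 = a12 = a22 = b1 = b2 = 0.0
    for k in received:
        fx, fy = FRAME[k]
        a11 += fx * fx
        a12 += fx * fy
        a22 += fy * fy
        b1 += fx * coeffs[k]
        b2 += fy * coeffs[k]
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("received vectors do not span the plane")
    # Closed-form solution of the 2x2 normal equations.
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)


x = (0.3, -1.1)
step = 0.1
coeffs = analyze(x)
q = quantize(coeffs, step)

x_all = reconstruct(q, [0, 1, 2])   # nothing lost
x_lossy = reconstruct(q, [0, 2])    # coefficient 1 erased
err_all = math.hypot(x_all[0] - x[0], x_all[1] - x[1])
err_lossy = math.hypot(x_lossy[0] - x[0], x_lossy[1] - x[1])
print(f"error with all coefficients: {err_all:.4f}, with one erasure: {err_lossy:.4f}")
```

With this frame any single erasure still leaves two vectors that span the plane, so reconstruction remains possible; losing two coefficients does not.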
Surface Representations and Signal Processing
Professor Markus Gross
Computer Graphics Laboratory
ETH Zurich
In recent years, powerful low-cost computing resources have continuously pushed up the size and complexity of scientific and engineering simulations. As a consequence, contemporary visualization methods have to cope with graphics objects of highly complex geometry or topology. As conventional algorithms approach their limits, more and more sophisticated surface representation, processing and rendering techniques are being developed to improve the quality and performance of visual data analysis.
In this talk I will give two examples of surface processing and rendering algorithms. The first one discusses fairing methods to remove high frequency noise and distortions from input models. I will introduce a multilevel smoothing concept that handles non-manifold models. The method works with a variety of fairing operators and can be extended to preserve volume and other mesh features. I will also present algorithms for automatic feature detection in meshes.
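One of the simplest fairing operators is Laplacian (umbrella) smoothing, sketched below as an illustration. It is not the multilevel method from the talk; the function names, the step size, and the noisy ring used as input are assumptions. Because it works on a plain adjacency list, it does not require a manifold mesh.

```python
import math


def laplacian_smooth(positions, neighbors, step=0.5, iterations=3):
    """Move each vertex toward the average of its neighbors (umbrella operator)."""
    pts = [tuple(p) for p in positions]
    for _ in range(iterations):
        new_pts = []
        for i, p in enumerate(pts):
            nbrs = neighbors[i]
            if not nbrs:
                new_pts.append(p)  # isolated vertex: leave it in place
                continue
            avg = [sum(pts[j][d] for j in nbrs) / len(nbrs) for d in range(len(p))]
            new_pts.append(tuple(p[d] + step * (avg[d] - p[d]) for d in range(len(p))))
        pts = new_pts
    return pts


def radius_spread(points):
    """Difference between the largest and smallest distance from the origin."""
    radii = [math.hypot(px, py) for px, py in points]
    return max(radii) - min(radii)


# Hypothetical input: a ring of 8 vertices with alternating radial noise.
n = 8
ring = [((1.0 + 0.2 * (-1) ** i) * math.cos(2 * math.pi * i / n),
         (1.0 + 0.2 * (-1) ** i) * math.sin(2 * math.pi * i / n)) for i in range(n)]
adjacency = [[(i - 1) % n, (i + 1) % n] for i in range(n)]

smoothed = laplacian_smooth(ring, adjacency)
print(f"spread before: {radius_spread(ring):.3f}, after: {radius_spread(smoothed):.3f}")
```

Plain Laplacian smoothing also shrinks the shape, which is why volume-preserving extensions such as those mentioned in the abstract are of interest.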
In the second part of the talk, I will introduce surface representations (surfels) that use point sampled geometry without explicit connectivity. As a pre-process, a hierarchical representation is computed. Rendering attributes, such as normals, material and different levels of texture colors can be stored per point sample. During rendering, a hierarchical forward warping algorithm projects surfels to a z-buffer. Point sampled representations are especially suited to render complex geometry.
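The core of point-based rendering, projecting samples and resolving visibility with a z-buffer, can be sketched as follows. This is a simplified illustration under stated assumptions: a pinhole camera looking down the positive z axis, one pixel per sample, and no hierarchy or hole filling. The function name and the sample data are hypothetical.

```python
import math


def splat_surfels(surfels, width, height, focal=1.0):
    """Project point samples with a pinhole camera; keep the nearest one per pixel."""
    depth = [[math.inf] * width for _ in range(height)]
    color = [[None] * width for _ in range(height)]
    for (x, y, z), rgb in surfels:
        if z <= 0:
            continue  # behind the camera
        # Perspective projection to normalized coordinates in [-1, 1], then to pixels.
        u = focal * x / z
        v = focal * y / z
        col = math.floor((u + 1.0) * 0.5 * width)
        row = math.floor((1.0 - v) * 0.5 * height)
        # Depth test: a sample is written only if it is closer than what is stored.
        if 0 <= col < width and 0 <= row < height and z < depth[row][col]:
            depth[row][col] = z
            color[row][col] = rgb
    return color, depth


# Hypothetical point samples: (position, color attribute).
samples = [
    ((0.0, 0.0, 2.0), "red"),    # hidden behind the blue sample
    ((0.0, 0.0, 1.0), "blue"),
    ((0.5, 0.0, 1.0), "green"),
    ((0.0, 0.0, -1.0), "gray"),  # behind the camera, skipped
]
color, depth = splat_surfels(samples, width=4, height=4)
print(color[2])
```

No connectivity is consulted at any point; each sample carries its own attributes and is handled independently, which is what makes the representation attractive for very complex geometry.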