Petr Kellnhofer


November 28, 2018 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Perceptual modeling with multi-modal sensing

Talk Abstract: Research on human perception has enabled many visual applications in computer graphics that efficiently utilize computational resources to deliver a high-quality experience within the limitations of the hardware. Beyond vision, humans perceive their surroundings using a variety of senses to build a mental model of the world and act upon it. This mental image is often incomplete or incorrect, which may have safety implications. As we cannot directly see inside the head, we need to read indirect signals projected outside. In the first part of the talk, I will show how perceptual modeling can be used to overcome and exploit the limitations of one specific human sense: vision. Then, I will describe how we can build sensors to observe other human interactions, connected first with physical touch and then with eye gaze patterns. Finally, I will outline how such readings can be used to teach computers to understand human behavior, to predict it, and to provide assistance or safety.

Speaker's Biography: Dr. Petr Kellnhofer completed his PhD at the Max Planck Institute for Informatics in Germany under the supervision of Prof. Hans-Peter Seidel and Prof. Karol Myszkowski. His thesis on perceptual modeling of human vision for stereoscopy was awarded the Eurographics PhD award. After graduating, he became a postdoc in the group of Prof. Wojciech Matusik at MIT CSAIL, working on topics related to human sensing such as eye tracking. Dr. Kellnhofer’s current research interest is the combination of perceptual modeling and machine learning to utilize data gathered from various types of sensors and to learn about human perception and higher-level behavior.


Shalin Mehta

Chan Zuckerberg Biohub

October 17, 2018 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Computational microscopy of dynamic order across biological scales

Talk Abstract: Living systems are characterized by emergent behavior of
ordered components. Imaging technologies that reveal dynamic
arrangement of organelles in a cell and of cells in a tissue are
needed to understand the emergent behavior of living systems. I will
present an overview of challenges in imaging dynamic order at the
scales of cells and tissue, and discuss advances in computational
label-free microscopy to overcome these challenges.

Speaker's Biography: Shalin Mehta received his Ph.D. at the National University of
Singapore, focusing on optics and biological microscopy. His Ph.D.
research led to better mathematical models and novel approaches for
label-free imaging of cellular morphology. He then joined the Marine
Biological Laboratory in Woods Hole, where he developed novel imaging
and computational methods for detecting molecular order across a range
of scales in living systems. He built an instantaneous fluorescence
polarization microscope that revealed the dynamics of molecular
assemblies by tracking the orientation and position of molecules in live cells.
At CZ Biohub, his lab seeks to measure
physical properties of biological systems with increasing precision,
resolution, and throughput by exploiting diverse light-matter
interactions and algorithms.


Mohammad Musa

Deepen AI

October 10, 2018 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: How to train neural networks on LiDAR point cloud data

Talk Abstract: Accurate LiDAR classification and segmentation are required for developing critical ADAS and autonomous vehicle components; in particular, they are required for high-definition mapping and for developing perception and path/motion planning algorithms. This talk will cover best practices for accurately annotating and benchmarking your AV/ADAS models against LiDAR point cloud ground-truth training data.
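
As a hedged illustration of the benchmarking side (this is not Deepen AI's tooling, and the function name is hypothetical): per-class intersection-over-union against ground-truth point labels is a common metric for LiDAR segmentation, and a minimal numpy sketch looks like this:

```python
import numpy as np

def per_class_iou(pred, gt, num_classes):
    # pred, gt: integer class labels, one per LiDAR point (shape (N,))
    ious = []
    for c in range(num_classes):
        inter = np.sum((pred == c) & (gt == c))   # points both agree are class c
        union = np.sum((pred == c) | (gt == c))   # points either calls class c
        ious.append(inter / union if union else float('nan'))
    return ious

# Toy example: 6 points, 3 classes (road=0, car=1, pedestrian=2, say)
gt   = np.array([0, 0, 1, 1, 2, 2])
pred = np.array([0, 1, 1, 1, 2, 0])
ious = per_class_iou(pred, gt, 3)
print(ious)
```

Averaging the per-class values (mIoU) gives a single benchmark number that is robust to class imbalance, which matters because road points vastly outnumber pedestrian points in real scans.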

Speaker's Biography: Mohammad Musa started Deepen AI in January 2017, focusing on AI tools and infrastructure for the autonomous development industry. Mohammad previously led product efforts for Google-wide initiatives that enable teams to build excellent products, working specifically on infrastructure products for tracking user-centered metrics, bug management, and user feedback loops. Prior to that, he was the head of Launch & Readiness at Google Apps for Work, where he led a cross-functional team managing product launches, the product roadmap, trusted-tester programs, and launch communications. Before Google, Mohammad worked in software engineering and technical sales positions in the video games and semiconductor industries at multiple startups.


Professor Hany Farid

University of California at Berkeley

November 14, 2018 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Photo Forensics from JPEG Coding Artifacts

Talk Abstract: The past few years have seen a startling and troubling rise in the fake-news phenomenon, in which everyone from individuals to state-sponsored entities produces and distributes misinformation, which is then widely promoted and disseminated on social media. The implications of fake news range from a misinformed public to an existential threat to democracy, and horrific violence. At the same time, recent and rapid advances in machine learning are making it easier than ever to create sophisticated and compelling fake images and videos, making the fake-news phenomenon even more powerful and dangerous. I will start by providing a broad overview of the field of image and video forensics, and then I will describe in detail a suite of image forensic techniques that explicitly detect inconsistencies in JPEG coding artifacts.
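
To give a flavor of the kind of JPEG artifact such techniques exploit (this is a toy sketch, not Prof. Farid's method, and the helper names are hypothetical): JPEG quantization leaves each blockwise DCT coefficient at a multiple of its quantization step, so the step can be estimated from the coefficients themselves; a region whose estimated steps disagree with the rest of the image can betray splicing or re-compression.

```python
import numpy as np

def dct_matrix(n=8):
    # Orthonormal DCT-II basis matrix, as used in JPEG's 8x8 transform
    k = np.arange(n)
    M = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    M[0] /= np.sqrt(2)
    return M * np.sqrt(2.0 / n)

def block_dct(img, n=8):
    # Split the image into n x n blocks and DCT each block
    h, w = img.shape
    D = dct_matrix(n)
    blocks = img.reshape(h // n, n, w // n, n).transpose(0, 2, 1, 3)
    return D @ blocks @ D.T

def estimate_step(coeffs, max_q=20):
    # Quantization leaves coefficients at multiples of the step q;
    # return the largest candidate consistent with (almost) all nonzero ones.
    c = coeffs[np.abs(coeffs) > 0.5]
    for q in range(max_q, 0, -1):
        r = np.abs(c / q - np.round(c / q))
        if np.mean(r < 0.05) > 0.95:
            return q
    return 1

rng = np.random.default_rng(0)
img = rng.integers(0, 256, (64, 64)).astype(float) - 128
q = 12
C = block_dct(img)
Cq = np.round(C / q) * q          # simulate JPEG quantization/dequantization
ac = Cq[:, :, 0, 1].ravel()       # one AC frequency across all blocks
print(estimate_step(ac))          # recovers the quantization step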

Speaker's Biography: I am the Albert Bradley 1915 Third Century Professor of Computer Science at Dartmouth. My research focuses on digital forensics, image analysis, and human perception. I received my undergraduate degree in Computer Science and Applied Mathematics from the University of Rochester in 1989 and my Ph.D. in Computer Science from the University of Pennsylvania in 1997. Following a two-year post-doctoral fellowship in Brain and Cognitive Sciences at MIT, I joined the faculty at Dartmouth in 1999. I am the recipient of an Alfred P. Sloan Fellowship, a John Simon Guggenheim Fellowship, and I am a Fellow of the National Academy of Inventors.


Professor Michael Zollhöfer

Stanford University

October 24, 2018 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Is it real? Deep Neural Face Reconstruction and Rendering

Talk Abstract: A broad range of applications in visual effects, computer animation, autonomous driving, and man-machine interaction heavily depend on robust and fast algorithms to obtain high-quality reconstructions of our physical world in terms of geometry, motion, reflectance, and illumination. In particular, with the increasing popularity of virtual, augmented, and mixed reality devices comes a rising demand for real-time and low-latency solutions.
This talk covers data-parallel optimization and state-of-the-art machine learning techniques to tackle the underlying 3D and 4D reconstruction problems based on novel mathematical models and fast algorithms. The particular focus of this talk is on self-supervised face reconstruction from a collection of unlabeled in-the-wild images. The proposed approach can be trained end-to-end without dense annotations by fusing a convolutional encoder with a differentiable expert-designed renderer and a self-supervised training loss.
The resulting reconstructions are the foundation for advanced video editing effects, such as photo-realistic re-animation of portrait videos. The core of the proposed approach is a generative rendering-to-video translation network that takes computer graphics renderings as input and generates photo-realistic modified target videos that mimic the source content. With the ability to freely control the underlying parametric face model, we are able to demonstrate a large variety of video rewrite applications. For instance, we can reenact the full head using interactive user-controlled editing and realize high-fidelity visual dubbing.

Speaker's Biography: Michael Zollhöfer is a Visiting Assistant Professor at Stanford University. His stay at Stanford is funded by a postdoctoral fellowship of the Max Planck Center for Visual Computing and Communication (MPC-VCC), which he received for his work in the fields of computer vision, computer graphics, and machine learning. Before joining Stanford University, Michael was a Postdoctoral Researcher at the Max Planck Institute for Informatics working with Christian Theobalt. He received his PhD from the University of Erlangen-Nuremberg for his work on real-time reconstruction of static and dynamic scenes. During his PhD, he was an intern at Microsoft Research Cambridge working with Shahram Izadi on data-parallel optimization for real-time template-based surface reconstruction. The primary goal of his research is to teach computers to reconstruct and analyze our world at frame rate based on visual input. To this end, he develops key technology to invert the image formation models of computer graphics based on data-parallel optimization and state-of-the-art deep learning techniques. The reconstructed intrinsic scene properties, such as geometry, motion, reflectance, and illumination are the foundation for a broad range of applications not only in virtual and augmented reality, visual effects, computer animation, autonomous driving, and man-machine interaction, but also in other fields such as medicine and biomechanics.


Professor Jerome Mertz

Boston University

October 3, 2018 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: The challenge of large-scale brain imaging

Talk Abstract: Advanced optical microscopy techniques have enabled the recording and stimulation of large populations of neurons deep within living, intact animal brains. I will present a broad overview of these techniques, and discuss challenges that still remain in performing large-scale imaging with high spatio-temporal resolution, along with various strategies that are being adopted to address these challenges.

Speaker's Biography: Jerome Mertz received an AB in physics from Princeton University in 1984, and a PhD in quantum optics from UC Santa Barbara and the University of Paris VI in 1991. Following postdoctoral studies at the University of Konstanz and at Cornell University, he became a CNRS research director at the Ecole Supérieure de Physique et de Chimie Industrielle in Paris. He is currently a professor of Biomedical Engineering at Boston University. His interests are in the development and applications of novel optical microscopy techniques for biological imaging. He is also author of a textbook titled "Introduction to Optical Microscopy".


Loic Royer

Chan Zuckerberg Biohub

May 16, 2018 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Pushing the Limits of Fluorescence Microscopy with adaptive imaging and machine learning

Talk Abstract: Fluorescence microscopy lets biologists see and understand the intricate machinery at the heart of living systems and has led to numerous discoveries. Any technological progress toward improving image quality extends the range of possible observations and consequently opens the path to new findings. I will show how modern machine learning and smart robotic microscopes can push the boundaries of observability. One fundamental obstacle in microscopy takes the form of a trade-off between imaging speed, spatial resolution, light exposure, and imaging depth. We have shown that deep learning can circumvent these physical limitations: microscopy images can be restored even if 60-fold fewer photons are used during acquisition, isotropic resolution can be achieved even with 10-fold under-sampling along the axial direction, and diffraction-limited structures can be resolved at 20-fold higher frame rates compared to state-of-the-art methods. Moreover, I will demonstrate how smart microscopy techniques can achieve the full optical resolution of light-sheet microscopes: instruments capable of capturing the entire developmental arc of an embryo from a single cell to a fully formed motile organism. Our instrument improves spatial resolution and signal strength two- to five-fold, recovers cellular and sub-cellular structures in many regions otherwise not resolved, adapts to the spatiotemporal dynamics of genetically encoded fluorescent markers, and robustly optimises imaging performance during large-scale morphogenetic changes in living organisms.

Speaker's Biography: Royer first studied engineering in his native France and then obtained a master's degree in Artificial Intelligence, specializing in Cognitive Robotics, followed by a Ph.D. in Bioinformatics from the Dresden University of Technology in Germany. He then joined Gene Myers’ lab, first at HHMI's Janelia Farm and then at the Max Planck Institute of Molecular Cell Biology and Genetics, where he developed novel technology at the intersection of computer science and microscopy, including the first adaptive multi-view light-sheet microscope, developed in collaboration with Philipp Keller. As a group leader at CZ Biohub, Royer and his team are building ‘discovery machines’ that not only acquire image data but also perform online processing, instant 3D visualization, adaptive imaging, and automated photo-manipulation. These integrated instruments bring together state-of-the-art optics, robotics, machine learning, and image analysis with the aim of advancing beyond the automation of repetitive tasks and into the realm of actual automated scientific reasoning.


Mark McCord

Cepton Technologies

May 23, 2018 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: LiDAR Technology for Autonomous Vehicles

Talk Abstract: LiDAR is a key sensor for autonomous vehicles that enables them to understand their surroundings in three dimensions. I will discuss the evolution of LiDAR and describe various LiDAR technologies currently being developed. These include rotating sensors, MEMS and optical phased array scanning devices, flash detector arrays, and single-photon avalanche detectors. The requirements for autonomous vehicles are very challenging, and the different technologies each have advantages and disadvantages that will be discussed. The architecture of a LiDAR also affects how it fits into the overall vehicle architecture. Image fusion with other sensors, including radar, cameras, and ultrasound, will be part of the overall solution. Other LiDAR applications, including non-automotive transportation, mining, precision agriculture, UAVs, mapping, surveying, and security, will also be described.

Speaker's Biography: As Co-Founder and Vice President of Engineering at Cepton Technologies, Dr. Mark McCord leads the development of high performance, low-cost imaging LiDAR systems. Prior to Cepton, Dr. McCord was Director of System Engineering, Advanced Development at KLA-Tencor, where he developed electron beam technologies for etching and imaging silicon chips. Earlier in his career, Dr. McCord served as an Associate Professor of Electrical Engineering at Stanford University, where he and his group researched various methods of nanometer-scale silicon processing, and as a Research Staff Member at IBM Research, where he worked on development of X-ray and electron beam chip lithography. Dr. McCord earned a B.S. in Electrical Engineering from Princeton University and a PhD in Electrical Engineering from Stanford University.


Jake Li

Hamamatsu Photonics

May 30, 2018 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Emerging LIDAR concepts and sensor technologies for autonomous vehicles

Talk Abstract: Sensor technologies such as radar, camera, and LIDAR have become the key enablers for achieving higher levels of autonomous control in vehicles, from fleet to commercial models. There are, however, still questions remaining: to what extent will radar and camera technologies continue to improve, and which LIDAR concepts will be the most successful? This presentation will provide an overview of the tradeoffs for LIDAR vs. competing sensor technologies (camera and radar); this discussion will reinforce the need for sensor fusion. We will also discuss the types of improvements that are necessary for each sensor technology. The presentation will summarize and compare various LIDAR designs -- mechanical, flash, MEMS-mirror based, optical phased array, and FMCW (frequency modulated continuous wave) -- and then discuss each LIDAR concept’s future outlook. Finally, there will be a quick review of guidelines for selecting photonic components such as photodetectors, light sources, and MEMS mirrors.

Speaker's Biography: Jake Q. Li is in charge of research and analysis of various market segments, with a concentration in the automotive LiDAR market. He is knowledgeable about various optical components such as photodetectors -- including MPPC (a type of silicon photomultiplier or SiPM), avalanche photodiodes, and PIN photodiodes -- and light emitters that are important parts of LIDAR system designs. He has expert understanding of the upcoming solid-state technology needs for the autonomous vehicle market. Together with his experience and understanding of the specific requirements needed for LIDAR systems, he will guide you through the selection process of the best photodetectors and light sources that will fit your individual needs.


Dr. Ben Backus

Vivid Vision

June 6, 2018 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Mobile VR for vision testing and treatment

Talk Abstract: Consumer-level HMDs are adequate for many medical applications. Vivid Vision (VV) takes advantage of their low cost, light weight, and large VR gaming code base to make vision tests and treatments. The company’s software is built using the Unity engine, allowing it to run on many hardware platforms. New headsets become available every six months or less, which creates interesting challenges in the medical device space. VV’s flagship product is the commercially available Vivid Vision System, used by more than 120 clinics to test and treat binocular dysfunctions such as convergence difficulties, amblyopia, strabismus, and stereo blindness. VV has recently developed a new, VR-based visual field analyzer.

Speaker's Biography: Ben Backus was Empire Innovation Associate Professor in Manhattan at the Graduate Center of the SUNY College of Optometry until October 2017, when he gave up tenure and moved to San Francisco to be Chief Science Officer at Vivid Vision, Inc. He still teaches and leads NIH-funded research in New York. He has training in mathematics (BA, Swarthmore), human vision (PhD, UC Berkeley), and visual neuroscience (postdoc, Stanford). He was a math teacher in the Oakland Public Schools and a professor of Psychology at the University of Pennsylvania. He makes a good Meyer limoncello.


Dr. Boyd Fowler


May 9, 2018 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Advances in automotive image sensors

Talk Abstract: In this talk I present recent advances in 2D and 3D image sensors for automotive applications such as rear-view cameras, surround-view cameras, ADAS cameras, and in-cabin driver monitoring cameras. This includes developments in high-dynamic-range image capture, LED flicker mitigation, high-frame-rate capture, global shutter, near-infrared sensitivity, and range imaging. I will also describe sensor developments for short-range and long-range LIDAR systems.

Speaker's Biography: Boyd Fowler joined OmniVision in December 2015 and is its CTO. Prior to joining OmniVision, he was a founder and VP of Engineering at Pixel Devices, where he focused on developing high-performance CMOS image sensors. After Pixel Devices was acquired by Agilent Technologies, Dr. Fowler was responsible for advanced development of their commercial CMOS image sensor products. In 2005, Dr. Fowler joined Fairchild Imaging as CTO and VP of Technology, where he developed sCMOS image sensors for high-performance scientific applications. After Fairchild Imaging was acquired by BAE Systems, Dr. Fowler was appointed technology director of the CCD/CMOS image sensor business. He has authored numerous technical papers, book chapters, and patents. Dr. Fowler received his M.S. and Ph.D. degrees in Electrical Engineering from Stanford University in 1990 and 1995, respectively.


Dr. Seishi Takamura


April 25, 2018 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Video Coding before and beyond HEVC

Talk Abstract: We enjoy video content in many situations. Though it is already compressed down to 1/10 - 1/1000 of its original size, it has been reported that video traffic over the internet is increasing by 31% per year and that video will account for 82% of internet traffic by 2020. This is why the development of better compression technology is in strong demand. ITU-T/ISO/IEC jointly developed the latest video coding standard, High Efficiency Video Coding (HEVC), in 2013, and they are about to start work on the next-generation standard. Corresponding proposals will be evaluated at the April 2018 meeting in San Diego, just a week before this talk.

In this talk, I will first review the advances in video coding technology over the last several decades, and then present the latest topics, including a report from the San Diego meeting and new approaches such as deep learning techniques.

Speaker's Biography: Seishi Takamura received his B.E., M.E., and Ph.D. from the Department of Electronic Engineering, Faculty of Engineering, the University of Tokyo in 1991, 1993, and 1996, respectively. He joined NTT Corporation in 1996 and was appointed a Distinguished Technical Member in 2009. From 2005 to 2006, he was a visiting scientist at Stanford University, California, USA. Currently, he is a Senior Distinguished Engineer at NTT Media Intelligence Laboratories. His current research interests include efficient video coding and ultrahigh-quality video coding.

He has fulfilled various duties in the research and academic community in current and prior roles including Associate Editor of IEEE Trans. CSVT (2006-2014), Executive Committee Member of the IEEE Tokyo Section, the IEEE Japan Council, and the Institute of Electronics, Information and Communication Engineers (IEICE) Image Engineering SIG Chair and Board of Directors of the Institute of Image Information and Television Engineers (ITE). He has also served as Japan National Body Chair and Japan Head of Delegates of ISO/IEC JTC 1/SC 29, and as an International Steering Committee Member of the Picture Coding Symposium.

He has received 41 academic awards including the ITE Niwa-Takayanagi Best Paper Award in 2002, the Information Processing Society of Japan (IPSJ) Nagao Special Researcher Award in 2006, PCSJ (Picture Coding Symposium of Japan) Frontier/Best Paper Awards in 2004, 2008 and 2015, the ITE Fujio Frontier Award in 2014, and TAF (Telecommunications Advancement Foundation) Telecom System Technology Awards in 2004, 2008 and 2015 (with highest honor). Dr. Takamura is a senior member of IEEE, IEICE, and IPSJ and a member of APSIPA, SID, ITE and MENSA.


Professor Kyros Kutulakos

University of Toronto

April 18, 2018 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Transport-Aware Cameras

Talk Abstract: Conventional cameras record all light falling onto their sensor regardless of
its source or its 3D path to the camera. In this talk I will present an
emerging family of coded-exposure video cameras that can be programmed to
record just a fraction of the light coming from an artificial source---be it a
common street lamp or a programmable projector---based on the light path's
geometry or timing. Live video from these cameras offers a very unconventional
view of our everyday world in which refraction and scattering can be
selectively blocked or enhanced, visual structures too subtle to notice with
the naked eye can become apparent, and the flicker of electric lights can be
turned into a powerful cue for analyzing the electrical grid from room to city scale.

I will discuss the unique optical properties and power efficiency of these
"transport aware cameras" through three case studies: the ACam for analyzing
the electrical grid, EpiScan3D for robust structured-light 3D imaging, and
EpiToF for robust time-of-flight imaging. I will also discuss our initial
progress toward designing a computational CMOS sensor for
coded two-bucket imaging---a novel capability that promises much more flexible and
powerful transport-aware cameras compared to existing off-the-shelf solutions.
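
To make the two-bucket idea concrete (a toy numpy sketch under stated assumptions, not the actual sensor design): a coded two-bucket pixel routes incoming light into one of two charge buckets according to a binary exposure code. If a controlled source is modulated in sync with the code while ambient light is not, the two bucket sums let us separate the source's contribution from the ambient light:

```python
import numpy as np

# Simulate one coded two-bucket pixel over T sub-exposures.
# Light goes to bucket 1 when the code is 1, to bucket 0 when it is 0.
rng = np.random.default_rng(1)
T = 1000                       # sub-exposures within one frame
code = rng.integers(0, 2, T)   # binary exposure code (synced to the source)
ambient = 5.0                  # constant ambient flux per sub-exposure
source = 3.0                   # controlled source, on only when code == 1

light = ambient + source * code           # flux arriving at the pixel
bucket1 = np.sum(light * code)            # accumulated while code == 1
bucket0 = np.sum(light * (1 - code))      # accumulated while code == 0
n1, n0 = code.sum(), T - code.sum()

# Per-sub-exposure averages: ambient appears in both and cancels.
source_est = bucket1 / n1 - bucket0 / n0
print(source_est)
```

The same cancellation is what lets such cameras selectively block or enhance light by its source or timing; real designs must additionally contend with photon noise and imperfect synchronization, which this sketch ignores.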

Speaker's Biography: Kyros Kutulakos is a Professor of Computer Science at the University of
Toronto. He received his PhD degree from the University of Wisconsin-Madison
in 1994 and his BS degree from the University of Crete in 1988, both in
Computer Science. In addition to the University of Toronto, he has held
appointments at the University of Rochester (1995-2001) and Microsoft Research
Asia (2004-05 and 2011-12). He is the recipient of an Alfred P. Sloan
Fellowship, an Ontario Premier's Research Excellence Award, a Marr Prize in
1999, a Marr Prize Honorable Mention in 2005, and four other paper awards
(CVPR 1994, ECCV 2006, CVPR 2014, CVPR 2017). He also served as Program
Co-Chair of CVPR 2003, ICCP 2010 and ICCV 2013.


Thomas Burnett

FoVI3D

April 11, 2018 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Light-field Display Architecture and the Heterogeneous Display Ecosystem

Talk Abstract: Human binocular vision and acuity, and the accompanying 3D retinal processing of the human eye and brain, are specifically designed to promote situational awareness and understanding in the natural 3D world. The ability to resolve depth within a scene, whether natural or artificial, improves our spatial understanding of the scene and as a result reduces the cognitive load accompanying the analysis of, and collaboration on, complex tasks.

A light-field display projects 3D imagery that is visible to the unaided eye (without glasses or head tracking) and allows for perspective correct visualization within the display’s projection volume. Binocular disparity, occlusion, specular highlights and gradient shading, and other expected depth cues are correct from the viewer’s perspective as in the natural real-world light-field.

Light-field displays are no longer a science fiction concept and a few companies are producing impressive light-field display prototypes. This presentation will review:
· The application agnostic light-field display architecture being developed at FoVI3D.
· General light-field display properties and characteristics such as field of view, directional resolution, and their effect on the 3D aerial image.
· The computation challenge for generating high-fidelity light-fields.
· A display agnostic ecosystem.

Demo after the talk: The FoVI3D Light-field Display Developer Kit (LfD DK2) is a prototype wide field-of-view, full-parallax, monochrome light-field display capable of projecting ~100 million unique rays to fill a 9cm x 9cm x 9cm projection volume. The particulars of the light-field compute, photonics subsystem, and hogel optics will be discussed during the presentation.

Speaker's Biography: Thomas graduated from Texas A&M University in 1989 with a bachelor’s degree in Computer Science. He has spent 25+ years developing, architecting, and managing computer software and hardware projects including processor logic synthesis and simulation, 2D image processing pipelines, 2D/3D and light-field rendering solutions, 3D physics engines and 2D/3D games.

Thomas has been a developer/manager with multiple visualization start-up companies in and around Austin's Silicon Hills. At Applied Science Fiction (ASF) he co-developed image processing libraries and a processing pipeline to render images from exposed yet undeveloped 35mm film. As the software lead at Zebra Imaging, Thomas was a key contributor in the development of static light-field topographic maps for use by the Department of Defense in Iraq and Afghanistan. He was the computation architect for the DARPA Urban Photonic Sandtable Display (UPSD) program which produced several wide-area light-field display prototypes for human factors testing and research.

More recently, Thomas launched a new light-field display development program at FoVI3D, where he serves as President and CTO. FoVI3D is developing a next-generation light-field display architecture and display prototype to further socialize the cognitive benefits of spatially accurate 3D aerial imagery.


Dr. Anna-Karin Gustavsson

Stanford University

May 2, 2018 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: 3D single-molecule super-resolution microscopy using a tilted light sheet

Talk Abstract: To obtain a complete picture of subcellular structures, cells must be imaged with high resolution in all three dimensions (3D). In this talk, I will present tilted light sheet microscopy with 3D point spread functions (TILT3D), an imaging platform that combines a novel, tilted light sheet illumination strategy with engineered long axial range point spread functions (PSFs) for low-background, 3D super localization of single molecules as well as 3D super-resolution imaging in thick cells. Here the axial positions of the single molecules are encoded in the shape of the PSF rather than in the position or thickness of the light sheet. TILT3D is built upon a standard inverted microscope and has minimal custom parts. The result is simple and flexible 3D super-resolution imaging with tens of nm localization precision throughout thick mammalian cells. We validated TILT3D for 3D super-resolution imaging in mammalian cells by imaging mitochondria and the full nuclear lamina using the double-helix PSF for single-molecule detection and the recently developed Tetrapod PSFs for fiducial bead tracking and live axial drift correction. We think that TILT3D in the future will become an important tool not only for 3D super-resolution imaging, but also for live whole-cell single-particle and single-molecule tracking.

Speaker's Biography: Dr. Anna-Karin Gustavsson is a postdoctoral fellow in the Moerner Lab at the Department of Chemistry at Stanford University, and she also holds a postdoctoral fellowship from the Karolinska Institute in Stockholm, Sweden. Her research is focused on the development and application of 3D single-molecule super-resolution microscopy for cell imaging, and includes the implementation of light sheet illumination for optical sectioning. She has a background in physics and received her PhD in Physics in 2015 from the University of Gothenburg, Sweden. Her PhD project was focused on studying dynamic responses in single cells by combining and optimizing techniques such as fluorescence microscopy, optical tweezers, and microfluidics. Dr. Gustavsson has received several awards, most notably the FEBS Journal Richard Perham Prize for Young Scientists in 2012 and the PicoQuant Young Investigator Award in 2018.


April 5, 2018

Talk Title: Workshop on Medical VR and AR

Talk Abstract: This workshop will bring researchers and clinicians who are using VR/AR to advance healthcare together with engineers who are developing the enabling technologies and applications.


Professor Christian Theobalt

Max-Planck-Institute (MPI) for Informatics

March 21, 2018 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Video-based Reconstruction of the Real World in Motion

Talk Abstract: New methods for capturing highly detailed models of moving real-world scenes with cameras, i.e., models of detailed deforming geometry, appearance, or even material properties, are becoming more and more important in many application areas. They are needed in visual content creation, for instance in visual effects, to build highly realistic models of virtual human actors. Furthermore, efficient, reliable, and highly accurate dynamic scene reconstruction is nowadays an important prerequisite for many other application domains, such as human-computer and human-robot interaction, autonomous robotics and autonomous driving, virtual and augmented reality, 3D and free-viewpoint TV, immersive telepresence, and even video editing.

The development of dynamic scene reconstruction methods has been a long-standing challenge in computer graphics and computer vision. Recently, the field has seen important progress: new methods were developed that capture, without markers or scene instrumentation, rather detailed models of individual moving humans or general deforming surfaces from video recordings, along with simple models of appearance and lighting. However, despite this recent progress, the field is still at an early stage, and current technology remains starkly constrained in many ways. Many of today's state-of-the-art methods are niche solutions designed to work under very constrained conditions, for instance only in controlled studios, with many cameras, for very specific object types, for very simple types of motion and deformation, or at processing speeds far from real-time.

In this talk, I will present some of our recent work on detailed marker-less dynamic scene reconstruction and performance capture, in which we advanced the state of the art in several ways. For instance, I will briefly show new methods for marker-less capture of the full body (such as our VNect approach) and hands that work in more general environments, even in real-time and with one camera. I will then show some of our work on high-quality face performance capture and face reenactment. Here, I will also illustrate the benefits of both model-based and learning-based approaches and show how different ways of joining the forces of the two open up new possibilities. Live demos included!


Speaker's Biography: Christian Theobalt is a Professor of Computer Science and the head of the research group "Graphics, Vision, & Video" at the Max-Planck-Institute (MPI) for Informatics, Saarbrücken, Germany. He is also a Professor of Computer Science at Saarland University, Germany.
From 2007 until 2009 he was a Visiting Assistant Professor in the Department of Computer Science at Stanford University.
He received his MSc degree in Artificial Intelligence from the University of Edinburgh, his Diplom (MS) degree in Computer Science from Saarland University, and his PhD (Dr.-Ing.) from Saarland University and Max-Planck-Institute for Informatics.

In his research he looks at algorithmic problems that lie at the intersection of Computer Graphics, Computer Vision and machine learning, such as: static and dynamic 3D scene reconstruction, marker-less motion and performance capture, virtual and augmented reality, computer animation, appearance and reflectance modelling, intrinsic video and inverse rendering, machine learning for graphics and vision, new sensors for 3D acquisition, advanced video processing, as well as image- and physically-based rendering. He is also interested in using reconstruction techniques for human computer interaction.

For his work, he received several awards, including the Otto Hahn Medal of the Max-Planck Society in 2007, the EUROGRAPHICS Young Researcher Award in 2009, the German Pattern Recognition Award 2012, and the Karl Heinz Beckurts Award in 2017. He received two ERC grants, among the most prestigious and competitive individual research grants in Europe: an ERC Starting Grant in 2013 and an ERC Consolidator Grant in 2017. In 2015, he was elected one of the top 40 innovation leaders under 40 in Germany by the business magazine Capital. Christian Theobalt is also a co-founder of an award-winning spin-off company from his group that is commercializing one of the most advanced solutions for marker-less motion and performance capture.


Professor Marty Banks and Dr. Stephen Cholewiak

UC Berkeley

February 28, 2018 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: ChromaBlur: Rendering Chromatic Eye Aberration Improves Accommodation and Realism

Talk Abstract: Computer-graphics engineers and vision scientists want to generate images that reproduce realistic depth-dependent blur. Current rendering algorithms take into account scene geometry, aperture size, and focal distance, and they produce photorealistic imagery as with a high-quality camera. But to create immersive experiences, rendering algorithms should aim instead for perceptual realism. In so doing, they should take into account the significant optical aberrations of the human eye. We developed a method that, by incorporating some of those aberrations, yields displayed images that produce retinal images much closer to the ones that occur in natural viewing. In particular, we create displayed images taking the eye’s chromatic aberration into account. This produces different chromatic effects in the retinal image for objects farther or nearer than current focus. We call the method ChromaBlur. We conducted two experiments that illustrate the benefits of ChromaBlur. One showed that accommodation (eye focusing) is driven quite effectively when ChromaBlur is used and that accommodation is not driven at all when conventional methods are used. The second showed that perceived depth and realism are greater with imagery created by ChromaBlur than in imagery created conventionally. ChromaBlur can be coupled with focus-adjustable lenses and gaze tracking to reproduce the natural relationship between accommodation and blur in HMDs and other immersive devices. It can thereby minimize the adverse effects of vergence-accommodation conflicts.
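The geometry behind the method can be sketched in a few lines: the eye's longitudinal chromatic aberration (LCA) shifts the effective defocus per color channel, so the blur-circle size differs for red, green, and blue. The thin-lens model and LCA offsets below are illustrative assumptions for this sketch, not the paper's calibrated values.

```python
def per_channel_blur(obj_diopters, focus_diopters, pupil_mm=4.0, lca=None):
    """Per-channel retinal blur-circle size for a simplified thin-lens eye.

    `lca` holds longitudinal chromatic aberration offsets in diopters
    relative to green; the defaults are illustrative, not measured data.
    Blur-circle size scales as pupil diameter (mm) times |defocus| (D).
    """
    if lca is None:
        lca = {"R": -0.3, "G": 0.0, "B": 0.8}
    return {ch: pupil_mm * abs((obj_diopters + off) - focus_diopters)
            for ch, off in lca.items()}
```

Note that even when the green channel is perfectly in focus, red and blue carry residual blur of opposite sign; rendering that asymmetry is what gives the visual system a usable accommodation cue.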

Speaker's Biography: Martin S. Banks is a professor in the UC Berkeley School of Optometry. Before arriving at UCB in 1985, he was a professor of Psychology at the University of Texas at Austin. He received his graduate degree in Developmental Psychology from the University of Minnesota in 1976. Banks has received many awards for his work on basic and applied research on human visual development, on visual space perception, and on the development and evaluation of visual displays. These include the Boyd R. McCandless Award from the American Psychological Association (1984), Kurt Koffka Medal from Giessen University (2007), Charles F. Prentice Award from the American Academy of Optometry (2016), and Otto Schade Prize from the Society for Information Display (2017). He is a Fellow of the Center for Advanced Study of the Behavioral Sciences (1988), Fellow of the American Association for the Advancement of Science (2008), Fellow of the American Psychological Society (2009), Holgate Fellow of Durham University (2011), WICN Fellow of University of Wales (2011), Honorary Professor of University of Wales (2017), and Borish Scholar of Indiana University (2017). This year he will receive the Edgar D. Tillyer Award from the Optical Society of America.

Steven A. Cholewiak received his bachelor's degree from the University of Virginia in 2006, double majoring in Psychology and Physics. While at UVA, he worked as a research assistant in the laboratories of Gerald Clore, Timothy Salthouse, and Dennis Proffitt. After graduation, he joined the Haptic Interface Research Laboratory with Hong Z. Tan at Purdue University for a year. Steven entered Rutgers University's Perceptual Science program in the Psychology department in 2007 and received his M.S. in 2010 in Cognitive Psychology and Ph.D. in 2012 with advisor Manish Singh. His first postdoc was in the Experimental Psychology department at Justus-Liebig-University Giessen in Germany from 2012 to 2015 with Roland W. Fleming and was co-advised by Steven W. Zucker at Yale University. He is currently a postdoctoral scholar in the UC Berkeley School of Optometry's Vision Science Program and is working with Martin S. Banks.


Chris Metzler

Rice University

February 21, 2018 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Data-driven Computational Imaging

Talk Abstract: Between ever-increasing pixel counts, ever-cheaper sensors, and the ever-expanding World Wide Web, natural image data has become plentiful. These vast quantities of data, be they high-frame-rate videos or huge curated datasets like ImageNet, stand to substantially improve the performance and capabilities of computational imaging systems. However, using this data efficiently presents its own unique set of challenges. In this talk I will use data to develop better priors, improve reconstructions, and enable new capabilities for computational imaging systems.

Speaker's Biography: Chris Metzler is a PhD candidate in the Machine Learning, Digital Signal Processing, and Computational Imaging labs at Rice University. His research focuses on developing and applying new algorithms, including neural networks, to problems in computational imaging. Much of his work concerns imaging through scattering media, like fog and water, and last summer he interned in the U.S. Naval Research Laboratory's Applied Optics branch. He is an NSF graduate research fellow and was formerly an NDSEG graduate research fellow.


Steve Silverman


February 7, 2018 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Street View 2018 - The Newest Generation Of Mapping Hardware

Talk Abstract: A brief overview of Street View from its inception 10 years ago until now will be presented. Street-level imagery was the prime objective of Google's Street View in the past; it has now migrated into a state-of-the-art mapping platform. Challenges and solutions in the design and fabrication of the imaging system, and the optimization of hardware to align with specific software post-processing, will be discussed. Real-world challenges of fielding hardware in 80+ countries will also be addressed.

Speaker's Biography: Steven Silverman is a Technical Program Manager at Google, Inc., developing and deploying camera/mapping systems for Street View. He has developed flash lidar systems that are part of the SpaceX Dragon vehicle berthing system. He was the Chief Engineer for the Thermal Emission Spectrometers (TES and Mini-TES) for Mars Global Surveyor and both Mars Exploration Rovers, as well as the Chief Engineer for the Thermal Emission Imaging System (THEMIS) for Mars Odyssey. He graduated from Cal Poly SLO in Engineering Science and has an MS in ECE from UCSB.


Professor Kristen Grauman

University of Texas at Austin

January 24, 2018 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Learning where to look in 360 environments

Talk Abstract: Many vision tasks require not just categorizing a well-composed human-taken photo, but also intelligently deciding “where to look” in order to get a meaningful observation in the first place. We explore how an agent can anticipate the visual effects of its actions, and develop policies for learning to look around actively, both for the sake of a specific recognition task and for generic exploratory behavior. In addition, we examine how a system can learn from unlabeled video to mimic human videographer tendencies, automatically deciding where to look in unedited 360-degree panoramas. Finally, to facilitate 360-degree video processing, we introduce spherical convolution, which allows application of off-the-shelf deep networks and object detectors to 360-degree imagery.
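To see why ordinary convolutions struggle on equirectangular frames, note that a fixed-size kernel covers less and less of the sphere horizontally as rows approach the poles. The sketch below is my own simplification of that idea (not the authors' spherical convolution): it widens the kernel per image row to compensate for the 1/cos(latitude) horizontal stretch.

```python
import numpy as np

def row_kernel_widths(height, base_width=3):
    """Per-row horizontal kernel widths for an equirectangular image.

    Rows near the poles are stretched by 1/cos(latitude), so the kernel
    is widened by the same factor (clamped, and kept odd so it stays
    centered). Illustrative sketch only.
    """
    # Latitude of each row, from +pi/2 (top) down to -pi/2 (bottom).
    lat = np.linspace(np.pi / 2, -np.pi / 2, height)
    stretch = 1.0 / np.maximum(np.cos(lat), 1e-3)
    widths = np.clip(np.round(base_width * stretch), base_width, None).astype(int)
    widths += (widths % 2 == 0)  # force odd widths
    return widths
```

At the equator the width equals the base kernel size; toward the poles it grows rapidly, which is exactly the spatial variation a single shared kernel cannot express.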

Speaker's Biography: Kristen Grauman is a Professor in the Department of Computer Science at the University of Texas at Austin. Her research in computer vision and machine learning focuses on visual recognition. Before joining UT-Austin in 2007, she received her Ph.D. at MIT. She is an Alfred P. Sloan Research Fellow and Microsoft Research New Faculty Fellow, a recipient of NSF CAREER and ONR Young Investigator awards, the PAMI Young Researcher Award in 2013, the 2013 IJCAI Computers and Thought Award, and a Presidential Early Career Award for Scientists and Engineers (PECASE) in 2013. Work with her collaborators has been recognized with paper awards at CVPR 2008, ICCV 2011, ACCV 2016, and CHI 2017. She currently serves as an Associate Editor in Chief for the Transactions on Pattern Analysis and Machine Intelligence (TPAMI) and an Editorial Board Member for the International Journal of Computer Vision (IJCV), and she served as a Program Chair of CVPR 2015 in Boston.


Professor Jacob Chakareski

University of Alabama

March 14, 2018 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Drone IoT Networks for Virtual Human Teleportation

Talk Abstract: Cyber-physical/human systems (CPS/CHS) are set to play an increasingly visible role in our lives, advancing research and technology across diverse disciplines. I am exploring novel synergies between three emerging CPS/CHS technologies of prospectively broad societal impact, virtual/augmented reality (VR/AR), the Internet of Things (IoT), and autonomous micro-aerial robots (UAVs). My long-term research objective is UAV-IoT-deployed ubiquitous VR/AR immersive communication that can enable virtual human teleportation to any corner of the world. Thereby, we can achieve a broad range of technological and societal advances that will enhance energy conservation, quality of life, and the global economy.
I am investigating fundamental problems at the intersection of signal acquisition and representation, communications and networking, (embedded) sensors and systems, and rigorous machine learning for stochastic control that arise in this context. I envision a future where UAV-IoT-deployed immersive communication systems will break existing barriers in remote sensing, monitoring, localization and mapping, navigation, and scene understanding. The presentation will outline some of my present and envisioned investigations. Interdisciplinary applications will be highlighted.

Speaker's Biography: Jacob Chakareski is an Assistant Professor of Electrical and Computer Engineering at The University of Alabama, where he leads the Laboratory for VR/AR Immersive Communication (LION). His interests span virtual and augmented reality, UAV-IoT sensing and communication, and rigorous machine learning for stochastic control. Dr. Chakareski received the Swiss NSF Ambizione Career Award and the best paper award at ICC 2017. He trained as a PhD student at Rice and Stanford, held research appointments with Microsoft, HP Labs, and EPFL, and sits on the advisory board of Frame, Inc. His research is supported by the NSF, AFOSR, Adobe, NVIDIA, and Microsoft.


Dr. Fu-Chung Huang

Nvidia Research

February 14, 2018 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Accelerated Computing for Light Field and Holographic Displays

Talk Abstract: In this talk, I will present two papers recently published at SIGGRAPH Asia 2017.
In the first paper, we present a 4D light field sampling and rendering system for light field displays that supports both foveation and accommodation, reducing rendering cost while maintaining perceptual quality and comfort.
In the second paper, we present a light-field-based computer-generated holography (CGH) rendering pipeline that allows reproduction of high-definition 3D scenes with continuous depth and supports intra-pupil view-dependent occlusion using computer-generated holograms. Our rendering and Fresnel integration accurately account for diffraction and support various types of reference illumination for holograms.

Speaker's Biography: Fu-Chung Huang is a research scientist at Nvidia Research. He works on computational displays, where high-performance computation is applied to solve problems related to optics and perception. Recently, his research has focused specifically on near-eye displays for virtual reality and augmented reality.

He received his Ph.D. in Computer Science from UC Berkeley in 2013, and his dissertation on vision-correcting light field displays was named one of Scientific American's World Changing Ideas of 2014. He was a visiting scientist at the MIT Media Lab with Prof. Ramesh Raskar from 2011 to 2013 and at Stanford University with Prof. Gordon Wetzstein from 2014 to 2015.


Dr. Alex Lidow

Efficient Power Conversion (EPC)

January 17, 2018 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Driverless Anything and the Role of LiDAR

Talk Abstract: LiDAR, or light detection and ranging, is a versatile light-based remote sensing technology that has received a great deal of attention recently. It has shown up in a number of media venues and has even sparked public debate about the engineering choices of a well-known electric car company, Tesla Motors. During this talk the speaker will provide some background on LiDAR and discuss why it is a key link in the future autonomous vehicle ecosystem, as well as its strong connection to power electronics technologies.

Speaker's Biography: Alex Lidow is CEO and co-founder of Efficient Power Conversion Corporation (EPC). Since 1977 Dr. Lidow has been dedicated to making power conversion more efficient upon the belief that this will reduce the harm to our environment and increase the global standard of living.

Dr. Lidow served as CEO of International Rectifier for 12 years prior to founding EPC in 2007. Under his leadership, International Rectifier was named one of the best managed companies in America by Forbes magazine in 2005. Dr. Lidow recently received the 2015 SEMI Award for North America for the commercialization of more efficient power devices.

Dr. Lidow was one of the lead representatives of the Semiconductor Industry Association (SIA) for the trade negotiations that resulted in the U.S.-Japan Trade Accord of 1986 and testified to Congress many times on behalf of the industry.

Dr. Lidow holds many patents in power semiconductor technology, including basic patents in power MOSFETs as well as in GaN transistors and integrated circuits. His MOSFET patents received royalties exceeding $900M over their lifetime and International Rectifier, where he spent 30 years of his career, was the largest producer of these products until their merger with Infineon Technologies in January 2015. He has authored numerous peer reviewed publications on related subjects, and recently co-authored the first textbook on GaN transistors, “GaN Transistors for Efficient Power Conversion”, now in its second edition published by John Wiley and Sons.

Dr. Lidow earned his Bachelor of Science in Applied Physics from Caltech in 1975, and his PhD in Applied Physics from Stanford in 1977 as a Fannie and John Hertz Foundation Fellow.

Since 1998 Dr. Lidow has been a member of the Board of Trustees of the California Institute of Technology, and has been the Chairman of the Compensation and Nominating Committees, and Vice Chair of the Investment Committee.


Dr. Edward Chang

Research and Healthcare (DeepQ) at HTC

January 10, 2018 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Advancing Healthcare with AI and VR

Talk Abstract: Quality, cost, and accessibility form an iron triangle that has prevented healthcare from achieving accelerated advancement in the last few decades: improving any one of the three metrics may degrade the other two. However, thanks to recent breakthroughs in artificial intelligence (AI) and virtual reality (VR), this iron triangle can finally be shattered. In this talk, I will share the experience of developing DeepQ, an AI platform for AI-assisted diagnosis and VR-facilitated surgery. I will present three healthcare initiatives we have undertaken since 2012: Healthbox, Tricorder, and VR surgery, and explain how AI and VR play pivotal roles in improving diagnosis accuracy and treatment effectiveness. More specifically, I will explain how we have dealt with not only big-data analytics but also small-data learning, which is typical in the medical domain. The talk concludes with roadmaps and a list of open research issues in signal processing and AI toward precision medicine and surgery.

Speaker's Biography: Edward Chang currently serves as the President of Research and Healthcare (DeepQ) at HTC. Ed's most notable work is co-leading the DeepQ project (with Prof. CK Peng at Harvard), working with a team of physicians, scientists, and engineers to design and develop mobile wireless diagnostic instruments. Such instruments can help consumers make their own reliable health diagnoses anywhere at any time. The project entered the Tricorder XPRIZE competition in 2013 alongside 310 other entrants and was awarded second place in April 2017, with a US$1M prize. The deep architecture that powers DeepQ is also applied to power Vivepaper, an AR product Ed's team launched in 2016 to support immersive augmented reality experiences (for education, training, and entertainment).

Prior to his HTC post, Ed was a director of Google Research for 6.5 years, leading research and development in several areas including scalable machine learning, indoor localization, social networking and search integration, and Web search (spam fighting). His contributions in parallel machine learning algorithms and data-driven deep learning (US patents 8798375 and 9547914) have been recognized through several keynote invitations, and the open-source code his team developed has been downloaded over 30,000 times collectively. His work on IMU calibration/fusion with project X was first deployed via Google Indoor Maps (see the XINX paper and ASIST/ACM SIGIR/ICADL keynotes) and is now widely used on mobile phones and VR/AR devices. Ed's team also developed the Google Q&A system (codenamed Confucius), which was launched in over 60 countries.

Prior to Google, Ed was a full professor of Electrical Engineering at the University of California, Santa Barbara (UCSB). He joined UCSB in 1999 after receiving his PhD from Stanford University, and was tenured in 2003 and promoted to full professor in 2006. Ed has served on ACM (SIGMOD, KDD, MM, CIKM), VLDB, IEEE, WWW, and SIAM conference program committees, and co-chaired several conferences including MMM, ACM MM, ICDE, and WWW. He is a recipient of the NSF Career Award, IBM Faculty Partnership Award, and Google Innovation Award. He is also an IEEE Fellow for his contributions to scalable machine learning.


Professor Liang Gao

University of Illinois Urbana-Champaign

December 6, 2017 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Compressed Ultrafast Photography and Microscopy: Redefining the Limit of Passive Ultrafast Imaging

Talk Abstract: High-speed imaging is an indispensable technology for blur-free observation of fast transient dynamics in virtually all areas, including science, industry, defense, energy, and medicine. Unfortunately, the frame rates of conventional cameras are significantly constrained by their data-transfer bandwidth and onboard storage. We demonstrate a two-dimensional dynamic imaging technique, compressed ultrafast photography (CUP), which can capture non-repetitive time-evolving events at up to 100 billion frames per second. Compared with existing ultrafast imaging techniques, CUP has the prominent advantage of measuring an x, y, t (x, y, spatial coordinates; t, time) scene with a single camera snapshot, thereby allowing observation of transient events occurring on time scales down to tens of picoseconds. Thanks to CUP, humans can, for the first time, see light pulses on the fly. Because this technology advances the imaging frame rate by orders of magnitude, it opens a new regime of observation and new applications.

In this talk, I will discuss our recent effort to develop a second-generation CUP system and demonstrate its applications at scales from the macroscopic to the microscopic. For the first time, we imaged photonic Mach cones, capturing a "sonic boom" of light in action. Moreover, by adapting CUP for microscopy, we enabled two-dimensional fluorescence lifetime imaging at an unprecedented speed. An advantage of CUP recording is that even visually simple systems can be scientifically interesting when captured at such high speeds. Given CUP's capability, we expect it to find widespread applications in both fundamental and applied sciences, including biomedical research.
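A toy version of the CUP forward model makes the single-snapshot claim concrete. In the sketch below (a deliberate simplification of the published system), each time slice of the scene is multiplied by one pseudo-random mask, sheared vertically as a streak camera would, and integrated onto a single sensor image; reconstruction then amounts to inverting this compressive measurement.

```python
import numpy as np

def cup_forward(scene, mask):
    """Toy CUP measurement: encode, shear, and integrate.

    scene: (T, H, W) array of time-resolved frames.
    mask:  (H, W) pseudo-random code applied to every frame.
    Frame t is shifted down by t rows (the streak-camera shear) and all
    sheared frames sum into one (H + T - 1, W) snapshot.
    """
    T, H, W = scene.shape
    snapshot = np.zeros((H + T - 1, W))
    for t in range(T):
        snapshot[t:t + H] += mask * scene[t]
    return snapshot
```

Because time is folded into the vertical axis of one exposure, the camera's data-transfer bandwidth no longer limits the frame rate; the cost is moved into the computational reconstruction.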

Speaker's Biography: Dr. Liang Gao is currently an Assistant Professor in the Electrical and Computer Engineering Department at the University of Illinois Urbana-Champaign. He is also affiliated with the Beckman Institute for Advanced Science and Technology. His primary research interests encompass multidimensional optical imaging, including hyperspectral imaging and ultrafast imaging, photoacoustic tomography and microscopy, and cost-effective high-performance optics for diagnostics. Dr. Gao is the author of more than 30 peer-reviewed publications in top-tier journals, such as Nature, Science Advances, Physics Reports, and Annual Review of Biomedical Engineering. He received his BS degree in Physics from Tsinghua University in 2005 and his Ph.D. degree in Applied Physics and Bioengineering from Rice University in 2011. Dr. Gao received an NSF CAREER Award in 2017.


Rudolf Oldenbourg

Marine Biological Laboratory, Woods Hole MA

November 1, 2017 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Mapping molecular orientation using polarized light microscopy

Talk Abstract: Polarization is a basic property of light, but the human eye is not sensitive to it. Therefore, we don’t have an intuitive understanding of polarization and of optical phenomena that are based on it. They either elude us, like the polarization of the blue sky or the rainbow, or they puzzle us, like the effect of Polaroid sunglasses. Meanwhile, polarized light plays an important role in nature and can be used to manipulate and analyze molecular order in materials, including living cells, tissues, and whole organisms, by observation with the polarized light microscope.
In this seminar, Rudolf Oldenbourg will first illustrate the nature of polarized light and its interaction with aligned materials using hands-on demonstrations. He will then introduce a modern version of the polarized light microscope, the LC-PolScope, created at the MBL. Enhanced by liquid crystal devices, electronic imaging, and digital image processing techniques, the LC-PolScope reveals and measures the orientation of molecules in every resolved specimen point at once. In recent years, his lab expanded the LC-PolScope technique to include the measurement of polarized fluorescence of GFP and other fluorescent molecules, and applied it to record the remarkable choreography of septin proteins during cell division, observed in organisms from yeast to mammalian cells.
Talon Chandler will then discuss extending polarized light techniques to multi-view microscopes, including light sheet and light field microscopes. In contrast to traditional, single-view microscopy, the recording of specimen images along two or more viewing directions allows us to unambiguously measure the three dimensional orientation of molecules and their aggregates. Chandler will discuss ongoing work on optimizing the design and reconstruction algorithms for multi-view polarized light microscopy.


Speaker's Biography: Rudolf Oldenbourg is a Senior Scientist at the Marine Biological Laboratory in Woods Hole, Massachusetts, where he develops light microscopy techniques for studying living cells. He received his undergraduate degree in physics from the Technical University of Munich, and his PhD in physics in 1981 from the University of Konstanz, Germany. In 1989, Oldenbourg joined Shinya Inoué at the MBL, starting a fruitful collaboration that led to important technical advances in polarized light microscopy. He continues the MBL tradition of cutting-edge research through interdisciplinary collaborations, combining technical, physical, and biological insights to search for answers to life's persistent question: What is life?

Talon Chandler received his B.A.Sc. degree in engineering physics from the University of British Columbia in 2015. He is currently a Ph.D. candidate in medical physics at the University of Chicago, and he is researching new techniques for measuring molecular orientation at the Marine Biological Laboratory.


SCIEN Colloquia

October 4, 2017 4:30 pm to 5:30 pm

Talk Title: Fall Quarter Schedule

Talk Abstract: The SCIEN Colloquia take place on Wednesdays at 4:30 pm in the Packard 101 Auditorium.

The schedule for the Fall Academic Quarter (2017) is

October 4: Gordon Wetzstein (Stanford) on "Computational Near-Eye Displays"
October 11: Doug Lanman (Facebook - Oculus) on "Focal Surface Displays"
October 18: Andrew Maimone (Microsoft) on "Holographic Near-Eye Displays for Virtual and Augmented Reality"
October 25: Basel Salahieh (Intel) on "Light field Retargeting for Integral and Multi-panel Displays"
November 1: Rudolf Oldenbourg (MBL at Woods Hole)
November 8: Andrew Jones (USC) on "Interactive 3D Digital Humans"
November 15: Kaan Aksit (NVIDIA) on "Near-Eye Varifocal Augmented Reality Displays"
November 22: Thanksgiving recess
November 29: Hakan Urey (Koç University and CY Vision) on "Next Generation Wearable AR Display Technologies"
December 6: Liang Gao (University of Illinois) on "Compressed Ultrafast Photography and Microscopy: redefining the limit of passive ultrafast imaging"


Professor Gordon Wetzstein

Stanford University

October 4, 2017 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Computational Near-Eye Displays

Talk Abstract: Virtual reality is a new medium that provides unprecedented user experiences. Eventually, VR/AR systems will redefine communication, entertainment, education, collaborative work, simulation, training, telesurgery, and basic vision research. In all of these applications, the primary interface between the user and the digital world is the near-eye display. While today’s VR systems struggle to provide natural and comfortable viewing experiences, next-generation computational near-eye displays have the potential to provide visual experiences that are better than the real world. In this talk, we explore the frontiers of VR/AR systems engineering and discuss next-generation near-eye display technology, including gaze-contingent focus, light field displays, monovision, holographic near-eye displays, and accommodation-invariant near-eye displays.


Speaker's Biography: Gordon Wetzstein is an Assistant Professor of Electrical Engineering and, by courtesy, of Computer Science at Stanford University. He leads the Stanford Computational Imaging Lab, an interdisciplinary research group focused on advancing imaging, microscopy, and display systems. Prior to joining Stanford in 2014, Prof. Wetzstein was a Research Scientist in the Camera Culture Group at the MIT Media Lab. He founded a forum for sharing computational display design instructions with the DIY community, and he has presented a number of courses on computational displays and computational photography at ACM SIGGRAPH.


Dr. Douglas Lanman

Oculus Research

October 11, 2017 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Focal Surface Displays

Talk Abstract: Conventional binocular head-mounted displays (HMDs) vary the stimulus to vergence with the information in the picture, while the stimulus to accommodation remains fixed at the apparent distance of the display, as created by the viewing optics. Sustained vergence-accommodation conflict (VAC) has been associated with visual discomfort, motivating numerous proposals for delivering near-correct accommodation cues. We introduce focal surface displays to meet this challenge, augmenting conventional HMDs with a phase-only spatial light modulator (SLM) placed between the display screen and viewing optics. This SLM acts as a dynamic freeform lens, shaping synthesized focal surfaces to conform to the virtual scene geometry. We introduce a framework to decompose target focal stacks and depth maps into one or more pairs of piecewise smooth focal surfaces and underlying display images. We build on recent developments in "optimized blending" to implement a multifocal display that allows the accurate depiction of occluding, semi-transparent, and reflective objects. Practical benefits over prior accommodation-supporting HMDs are demonstrated using a binocular focal surface display employing a liquid crystal on silicon (LCOS) phase SLM and an organic light-emitting diode (OLED) display.

Speaker's Biography: Douglas Lanman is the Director of Computational Imaging at Oculus Research, where he leads investigations into advanced display and imaging technologies. His prior research has focused on head-mounted displays, glasses-free 3D displays, light field cameras, and active illumination for 3D reconstruction and interaction. He received a B.S. in Applied Physics with Honors from Caltech in 2002 and M.S. and Ph.D. degrees in Electrical Engineering from Brown University in 2006 and 2010, respectively. He was a Senior Research Scientist at NVIDIA Research from 2012 to 2014, a Postdoctoral Associate at the MIT Media Lab from 2010 to 2012, and an Assistant Research Staff Member at MIT Lincoln Laboratory from 2002 to 2005.


Dr. Andrew Maimone

Oculus Research

October 18, 2017 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Holographic Near-Eye Displays for Virtual and Augmented Reality

Talk Abstract: Today's near-eye displays are a compromise of field of view, form factor, resolution, supported depth cues, and other factors. There is no clear path to obtain eyeglasses-like displays that reproduce the full fidelity of human vision. Computational displays are a potential solution in which hardware complexity is traded for software complexity, where it is easier to meet many conflicting optical constraints. Among computational displays, digital holography is a particularly attractive solution that may scale to meet all the optical demands of an ideal near-eye display. I will present novel designs for virtual and augmented reality near-eye displays based on phase-only holographic projection. The approach is built on the principles of Fresnel holography and double phase amplitude encoding, with additional hardware, phase correction factors, and spatial light modulator encodings to achieve full-color, high-contrast, and low-noise holograms with high resolution and true per-pixel focal control. A unified focus, aberration correction, and vision correction model, along with a user calibration process, accounts for any optical defects between the light source and retina. This optical correction ability not only fixes minor aberrations but also enables truly compact, eyeglasses-like displays with wide fields of view (80 degrees) that would be inaccessible through conventional means. All functionality is evaluated across a series of proof-of-concept hardware prototypes; I will discuss remaining challenges to incorporate all features into a single device and obtain practical displays.
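
The double phase amplitude encoding mentioned in the abstract rests on a simple identity: any complex value A·e^{iφ} with |A| ≤ 1 equals the average of two pure phase terms. Below is a minimal NumPy sketch of just this identity; the display pipeline described in the talk additionally involves checkerboard interleaving on the SLM, hardware calibration, and phase corrections not shown here.

```python
import numpy as np

def double_phase_encode(field):
    """Split a complex field A*exp(i*phi), |A| <= 1, into two phase-only
    maps via A*e^{i phi} = 0.5*(e^{i phi_a} + e^{i phi_b}), where
    phi_a = phi + arccos(A) and phi_b = phi - arccos(A)."""
    amp = np.clip(np.abs(field), 0.0, 1.0)
    phi = np.angle(field)
    delta = np.arccos(amp)
    return phi + delta, phi - delta

def double_phase_decode(phi_a, phi_b):
    """Average the two phase-only terms back into a complex field."""
    return 0.5 * (np.exp(1j * phi_a) + np.exp(1j * phi_b))

# Round-trip a random complex field with amplitudes in [0, 1).
rng = np.random.default_rng(0)
target = rng.random((4, 4)) * np.exp(1j * rng.uniform(-np.pi, np.pi, (4, 4)))
phi_a, phi_b = double_phase_encode(target)
recon = double_phase_decode(phi_a, phi_b)
```

In practice the two phase maps are typically interleaved on adjacent pixels of a phase-only SLM so that the optical system itself performs the averaging.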

Speaker's Biography: Andrew Maimone is a research scientist at Oculus Research (Facebook). His research focuses on the use of computation to simplify and enhance virtual and augmented reality displays. Previously, Andrew was a researcher in the Hardware, Devices, and Experiences group of Microsoft Research NExT. Andrew completed a PhD and MS in computer science at the University of North Carolina at Chapel Hill.


Dr. Basel Salahieh


October 25, 2017 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Light Field Retargeting for Integral and Multi-panel Displays

Talk Abstract: Light fields are a collection of rays emanating from a 3D scene in various directions that, when properly captured, provide a means of projecting depth and parallax cues on 3D displays. However, due to the limited aperture size and constrained spatio-angular sampling of many light field capture systems (e.g., plenoptic cameras), the displayed light fields provide only a narrow viewing zone in which parallax views can be supported. In addition, the autostereoscopic display device may have a mismatched spatio-angular resolution (e.g., an integral display) or a different architecture (e.g., a multi-panel display) than the capturing plenoptic system, which requires careful engineering between the capture and display stages. This talk presents an efficient light field retargeting pipeline for integral and multi-panel displays that provides controllable, enhanced parallax content. This is accomplished by slicing the captured light fields according to their depth content, boosting the parallax, and merging these slices with data filling. In integral displays, the synthesized views are simply resampled and reordered to create elemental images that, beneath a lenslet array, collectively create a multi-view rendering. For multi-panel displays, additional processing steps are needed to achieve seamless transitions over different depth panels and viewing angles, where displayed views are synthesized and aligned dynamically according to the position of the viewer. The retargeting technique is simulated and verified experimentally on actual integral and multi-panel displays.
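
The slice, boost, and merge idea can be illustrated with a toy NumPy sketch. The slice images, disparities, and boost factor below are invented for illustration, and the actual pipeline also performs data filling in disoccluded regions, which this sketch omits.

```python
import numpy as np

def boost_parallax(slices, disparities, factor=2.0):
    """Toy slice-and-shift retargeting: translate each depth slice of a
    view by its disparity times `factor`, then merge back to front so
    nearer slices occlude farther ones (nonzero pixels count as occupied)."""
    h, w = slices[0].shape
    out = np.zeros((h, w))
    for img, disp in zip(slices, disparities):       # ordered back to front
        shifted = np.roll(img, int(round(disp * factor)), axis=1)
        out = np.where(shifted > 0, shifted, out)
    return out

# A background plane (disparity 0) and a foreground block (disparity 3).
bg = np.full((8, 16), 0.2)
fg = np.zeros((8, 16))
fg[2:6, 4:8] = 1.0
view = boost_parallax([bg, fg], [0, 3], factor=2.0)  # block lands at columns 10:14
```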

Speaker's Biography: Basel Salahieh is a research scientist at Intel Labs working on computational imaging, light field processing, autostereoscopic displays, and virtual reality. He received his BS in electrical engineering from the University of Aleppo, Syria, in 2007, his first MS degree in electrical engineering from the University of Oklahoma in 2010, and his second MS degree in optical sciences and PhD degree in electrical and computer engineering from the University of Arizona in 2015. He has published more than 15 papers in international conferences and journals and has more than 5 pending patents.


Dr. Kaan Aksit


November 15, 2017 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Near-Eye Varifocal Augmented Reality Displays

Talk Abstract: With the goal of registering dynamic synthetic imagery onto the real world, Ivan Sutherland envisioned a fundamental idea to combine digital displays with conventional optical components in a wearable fashion. Since then, various new advancements in the display engineering domain, and a broader understanding in the vision science domain, have led us to computational displays for virtual reality and augmented reality applications. Today, such displays promise a more realistic and comfortable experience through techniques such as lightfield displays, holographic displays, always-in-focus displays, multiplane displays, and varifocal displays. In this talk, as an Nvidian, I will be presenting our new optical layouts for see-through computational near-eye displays that are simple, compact, varifocal, and provide a wide field of view with clear peripheral vision and a large eyebox. Key to our efforts so far are novel see-through rear-projection holographic screens and deformable mirror membranes. We establish fundamental trade-offs between the quantitative parameters of resolution, field of view, and the form factor of our designs, opening an intriguing avenue for future work on accommodation-supporting augmented reality displays.

Speaker's Biography: Kaan Akşit received his B.S. degree in electrical engineering from Istanbul Technical University, Turkey, in 2007, his M.Sc. degree in electrical power engineering from RWTH Aachen University, Germany, in 2010, and his Ph.D. degree in electrical engineering from Koç University, Turkey, in 2014. In 2009, he joined Philips Research in Eindhoven, the Netherlands, as an intern. In 2013, he joined Disney Research, Zurich, Switzerland, as an intern. His past research includes topics such as visible light communications, optical medical sensing, solar cars, and auto-stereoscopic displays. Since July 2014, he has been working as a research scientist at Nvidia Corporation in Santa Clara, USA, tackling problems related to computational displays for virtual and augmented reality.


Dr. Andrew Jones


November 8, 2017 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Interactive 3D Digital Humans

Talk Abstract: This talk will cover recent methods for recording and displaying interactive life-sized digital humans using the ICT Light Stage, natural language interfaces, and automultiscopic 3D displays. We will then discuss the first full application of this technology to preserve the experience of in-person interactions with Holocaust survivors.

Speaker's Biography: Andrew Jones is a computer graphics programmer and inventor at the University of Southern California's Institute for Creative Technologies. In 2004, Jones began working in cultural heritage, using 3D scanning techniques to virtually reunite the Parthenon and its sculptures. The resulting depictions of the Parthenon were featured in the 2004 Olympics, PBS's NOVA, National Geographic, the IMAX film Greece: Secrets of the Past, and The Louvre. However, computer-generated worlds only truly come alive when combined with interactive human characters. Subsequently, Andrew developed new techniques to record dynamic human facial and full-body performances. These photoreal real-time characters have been used by companies such as Image Metrics, Activision, Digital Domain, and Weta for visual effects and games. As part of his PhD, Jones designed new display devices that can show 3D imagery to multiple viewers without the need for stereo glasses, winning "Best Emerging Technology" at SIGGRAPH 2007. His current work with the USC Shoah Foundation explores how to use digital humans and holographic technology to change how we communicate with each other and the past.


Professor Hakan Urey

Koç University in Istanbul-Turkey and CY Vision in San Jose, CA

November 29, 2017 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Next Generation Wearable AR Display Technologies

Talk Abstract: Wearable AR/VR displays have a long history, and earlier efforts failed due to various limitations. Advances in sensors, optical technologies, and computing have renewed interest in this area. Most people are convinced AR will be very big. A key question is whether AR glasses can become the new computing platform and replace smartphones. I'll discuss some of the challenges ahead. We have been working on various wearable display architectures, and I'll discuss our efforts related to MEMS scanned-beam displays, head-mounted projectors and smart telepresence screens, and holographic near-eye displays.

Speaker's Biography: Hakan Urey is co-founder of CY Vision, San Jose, and Professor of Electrical Engineering at Koç University, Istanbul, Turkey. He received the BS degree from Middle East Technical University and MS and Ph.D. degrees from the Georgia Institute of Technology, all in Electrical Engineering. He worked for Microvision (in the Seattle area) before joining Koç University.

He is the inventor of more than 50 issued and pending patents in the areas of novel displays, imaging systems, MEMS, optical sensors, and microtechnologies. His inventions have been licensed by five companies for commercialization and have resulted in four spin-off companies. He has published about 200 papers in international journals and conferences, given more than 30 invited talks, and received a number of awards. He received the prestigious European Research Council Advanced Grant in 2013 to develop next-generation wearable and 3D display technologies.


December 8, 2017 1:00 pm to 5:30 pm

Talk Title: Stanford Center for Image Systems Engineering (SCIEN) 2017 Industry Affiliates Meeting

Talk Abstract: The Stanford Center for Image Systems Engineering will hold its annual meeting on December 8, 2017 from 1:00 to 5:30 pm in the David Packard Electrical Engineering Building.

The program includes talks from 3 new Stanford faculty:
- Kayvon Fatahalian architects high performance visual computing systems that enable immersive and intelligent visual computing applications.
- Jeannette Bohg's research lies at the intersection of robotics, computer vision and machine learning.
- Sean Follmer creates interactive, dynamic physical 3D displays and haptic interfaces.

The talks will be followed by a poster session featuring postdocs and graduate students with expertise in computer vision, machine learning and human perception.

Seating is limited, so registration is required.


Dr. Alex Hegyi


May 17, 2017 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Hyperspectral imaging using polarization interferometry

Talk Abstract: Polarization interferometers use birefringent crystals to generate an optical path delay between two polarizations of light. In this talk I will describe how I have employed polarization interferometry to build two kinds of Fourier imaging spectrometers: in one case by temporally scanning the optical path delay with a liquid crystal cell, and in the other by using relative motion between the scene and detector to spatially scan the optical path delay through a position-dependent wave plate.
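
The recovery step shared by both Fourier imaging spectrometers can be sketched in a few lines: an interferogram recorded over optical path delay is, up to its DC term, a sum of cosine fringes, one per wavenumber, so a real FFT recovers the spectral lines. The toy two-line spectrum and sample counts below are invented for illustration.

```python
import numpy as np

# Toy two-line spectrum, sampled by scanning optical path delay (OPD).
n = 1024
opd = np.arange(n)                       # OPD in sampling units
spectrum = np.zeros(n // 2 + 1)
spectrum[100] = 1.0                      # spectral line at bin 100
spectrum[180] = 0.5                      # weaker line at bin 180

# Each wavenumber contributes a raised-cosine fringe across OPD.
freqs = np.arange(n // 2 + 1) / n
interferogram = sum(s * (1 + np.cos(2 * np.pi * f * opd))
                    for s, f in zip(spectrum, freqs) if s)

# Removing the DC level and taking a real FFT recovers the spectrum.
recovered = np.abs(np.fft.rfft(interferogram - interferogram.mean())) * 2 / n
peaks = np.argsort(recovered)[-2:]       # indices of the two strongest bins
```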

Speaker's Biography: Alex Hegyi is a Member of Research Staff at PARC, a Xerox company, where he works on novel concepts for optical sensing. He holds a PhD in Electrical Engineering from UC Berkeley and a BS with Honors and Distinction in Physics from Stanford. He is a former Hertz Foundation Fellow, is a winner of the Hertz Foundation Thesis Prize, and is one of the 2016 Technology Review “35 Innovators Under 35”.


Dr. Patrick Llull


March 7, 2018 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Temporal coding of volumetric imagery

Talk Abstract: 'Image volumes' refer to realizations of images in other dimensions such as time, spectrum, and focus. Recent advances in scientific, medical, and consumer applications demand improvements in image volume capture. Though image volume acquisition continues to advance, it maintains the same sampling mechanisms that have been used for decades; every voxel must be scanned or captured in parallel and is presumed independent of its neighbors. Under these conditions, improving performance comes at the cost of increased system complexity, data rates, and power consumption.

This talk describes systems and methods with which to efficiently detect and visualize image volumes by temporally encoding the extra dimensions’ information into 2D measurements or displays. Some highlights of my research include video and 3D recovery from photographs, and true-3D augmented reality image display by time multiplexing. In the talk, I show how temporal optical coding can improve system performance, battery life, and hardware simplicity for a variety of platforms and applications.

Speaker's Biography: Currently with Google's Daydream virtual reality team, Patrick Llull completed his Ph.D. under Prof. David Brady at the Duke University Imaging and Spectroscopy Program (DISP) in May 2016. His doctoral research focused on compressive video and multidimensional sensing, with research internship experience with Ricoh Innovations in near-eye multifocal displays. During his Ph.D. Patrick won two best paper awards and was an NSF graduate fellowship honorable mention. Patrick graduated with his BS from the University of Arizona's College of Optical Sciences in May 2012.


Dr. Kari Pulli


May 3, 2017 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Heterogeneous Computational Imaging

Talk Abstract: Modern systems-on-a-chip (SoC) have many different types of processors that could be used in computational imaging. Unfortunately, they all have different programming models, and are thus difficult to optimize as a system. In this talk we discuss various standards (OpenCL, OpenVX) and domain-specific programming languages (Halide, Proximal) that make it easier to accelerate processing for computational imaging.

Speaker's Biography: Kari Pulli is CTO at Meta. Before joining Meta, Kari worked as CTO of the Imaging and Camera Technologies Group at Intel influencing the architecture of future IPUs. He was VP of Computational Imaging at Light and before that he led research teams at NVIDIA Research (Senior Director) and at Nokia Research (Nokia Fellow) on Computational Photography, Computer Vision, and Augmented Reality. He headed Nokia's graphics technology, and contributed to many Khronos and JCP mobile graphics and media standards, and wrote a book on mobile 3D graphics. Kari holds CS degrees from Univ. Minnesota (BSc), Univ. Oulu (MSc, Lic. Tech.), Univ. Washington (PhD); and an MBA from Univ. Oulu. He has taught and worked as a researcher at Stanford, Univ. Oulu, and MIT.


Prof. Reza Zadeh

Matroid and Stanford University

May 31, 2017 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: FusionNet: 3D Object Classification Using Multiple Data Representations

Talk Abstract: High-quality 3D object recognition is an important component of many vision and robotics systems. We tackle the object recognition problem using two data representations: a volumetric representation, where the 3D object is discretized spatially as binary voxels (1 if the voxel is occupied and 0 otherwise), and a pixel representation, where the 3D object is represented as a set of projected 2D pixel images. At the time of submission, we obtained leading results on the Princeton ModelNet challenge. Some of the best deep learning architectures for classifying 3D CAD models use Convolutional Neural Networks (CNNs) on the pixel representation, as seen on the ModelNet leaderboard. Diverging from this trend, we combine both of the above representations and exploit them to learn new features. This approach yields a significantly better classifier than using either of the representations in isolation. To do this, we introduce new Volumetric CNN (V-CNN) architectures.
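
The binary voxel representation described above is simple to construct. A minimal sketch (the resolution and random point cloud are illustrative stand-ins for a CAD model's surface samples; a grid like this is the kind of input a Volumetric CNN consumes):

```python
import numpy as np

def voxelize(points, resolution=32):
    """Binary occupancy grid: a voxel is 1 if any point falls inside it,
    0 otherwise. Points are first normalized into the unit cube."""
    pts = np.asarray(points, dtype=float)
    mins, maxs = pts.min(axis=0), pts.max(axis=0)
    idx = ((pts - mins) / np.maximum(maxs - mins, 1e-9)
           * (resolution - 1)).astype(int)
    grid = np.zeros((resolution,) * 3, dtype=np.uint8)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = 1
    return grid

# Stand-in for surface points sampled from a 3D model.
cloud = np.random.default_rng(0).random((500, 3))
vox = voxelize(cloud, resolution=8)
```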

Speaker's Biography: Reza Zadeh is CEO at Matroid and Adjunct Professor at Stanford University. His work focuses on machine learning, distributed computing, and discrete applied mathematics. He has served on the Technical Advisory Board of Microsoft and Databricks.


Dr. Donald Dansereau

Stanford University

June 7, 2017 4:30 pm to 5:30 pm

Talk Title: Computational Imaging for Robotic Vision

Talk Abstract: This talk argues for combining the fields of robotic vision and computational imaging. Both consider the joint design of hardware and algorithms, but with dramatically different approaches and results. Roboticists seldom design their own cameras, and computational imaging seldom considers performance in terms of autonomous decision-making. The union of these fields considers whole-system design from optics to decisions. This yields impactful sensors offering greater autonomy and robustness, especially in challenging imaging conditions. Motivating examples are drawn from autonomous ground and underwater robotics, and the talk concludes with recent advances in the design and evaluation of novel cameras for robotics applications.

Speaker's Biography: Donald G. Dansereau joined the Stanford Computational Imaging Lab as a postdoctoral scholar in September 2016. His research is focused on computational imaging for robotic vision, and he is the author of the open-source Light Field Toolbox for Matlab. Dr. Dansereau completed B.Sc. and M.Sc. degrees in electrical and computer engineering at the University of Calgary in 2001 and 2004, receiving the Governor General's Gold Medal for his work in light field processing. His industry experience includes physics engines for video games, computer vision for microchip packaging, and FPGA design for high-throughput automatic test equipment. In 2014 he completed a Ph.D. in plenoptic signal processing at the Australian Centre for Field Robotics, University of Sydney, and in 2015 joined the Australian Centre for Robotic Vision at the Queensland University of Technology, Brisbane, as a research fellow. Donald's field work includes marine archaeology on a Bronze Age city in Greece, seamount and hydrothermal vent mapping in the Sea of Crete and Aeolian Arc, habitat monitoring off the coast of Tasmania, and hydrochemistry and wreck exploration in Lake Geneva.


May 12, 2017 to May 14, 2017

Talk Title: 2017 International Conference on Computational Photography

Talk Abstract: The field of Computational Photography seeks to create new photographic functionalities and experiences that go beyond what is possible with traditional cameras and image processing tools. The IEEE International Conference on Computational Photography is organized with the vision of fostering the community of researchers, from many different disciplines, working on computational photography.


May 11, 2017

Talk Title: Workshop on Augmented and Mixed Reality

Talk Abstract: This workshop will bring together scientists and engineers who are advancing sensor technologies, computer vision, machine learning, head-mounted displays and our understanding of human vision, and developers who are creating new and novel applications for augmented and mixed reality in retail, education, science and medicine.


Dr. Greg Corrado


April 26, 2017 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Deep Learning Imaging Applications

Talk Abstract: Deep learning has driven huge progress in visual object recognition in the last five years, but this is only one aspect of its application to imaging. This talk will provide a brief overview of deep learning and artificial neural networks in computer vision before delving into the wide range of applications Google has pursued in this area. Topics will include image summarization, image augmentation, artistic style transfer, and medical diagnostics.

Speaker's Biography: Greg Corrado is a Principal Scientist at Google, and the co-founder of the Google Brain Team. He works at the nexus of artificial intelligence, computational neuroscience, and scalable machine learning, and has published in fields ranging from behavioral economics, to particle physics, to deep learning. In his time at Google he has worked to put AI directly into the hands of users via products like RankBrain and SmartReply, and into the hands of developers via open-source software releases like TensorFlow and word2vec. He currently leads several research efforts in advanced applications of machine learning, ranging from natural human communication to expanded healthcare availability. Before coming to Google, he worked at IBM Research on neuromorphic silicon devices and large-scale neural simulations. He did his graduate studies in both Neuroscience and Computer Science at Stanford University, and his undergraduate work in Physics at Princeton University.


Professor Steve Mann

University of Toronto

April 19, 2017 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Monitorless Workspaces and Operating Rooms of the Future: Virtual/Augmented Reality through Multiharmonic Lock-In Amplifiers.

Talk Abstract: In my childhood I invented a new kind of lock-in amplifier and used it as the basis for the world's first wearable augmented reality computer. This allowed me to see radio waves, sound waves, and electrical signals inside the human body, all aligned perfectly with the physical space in which they were present. I built this equipment into special electric eyeglasses that automatically adjusted their convergence and focus to match their surroundings. By shearing the spacetime continuum one sees a stroboscopic vision in coordinates in which the speed of light, sound, or wave propagation is exactly zero, or slowed down, making these signals visible to radio engineers, sound engineers, neurosurgeons, and the like. At Meta, we're creating the future of computing based on Human-in-the-Loop Intelligence.


Dr. Felix Heide

Stanford University

April 12, 2017 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Capturing the “Invisible”: Computational Imaging for Robust Sensing and Vision

Talk Abstract: Imaging has become an essential part of how we communicate with each other, how autonomous agents sense the world and act independently, and how we research chemical reactions and biological processes. Today's imaging and computer vision systems, however, often fail in critical scenarios, for example in low light or in fog. This is due to ambiguity in the captured images, introduced partly by imperfect capture systems, such as cellphone optics and sensors, and partly present in the signal before measuring, such as photon shot noise. This ambiguity makes imaging with conventional cameras challenging, e.g. low-light cellphone imaging, and it makes high-level computer vision tasks difficult, such as scene segmentation and understanding.

In this talk, I will present several examples of algorithms that computationally resolve this ambiguity and make sensing and vision systems robust. These methods rely on three key ingredients: accurate probabilistic forward models, learned priors, and efficient large-scale optimization methods. In particular, I will show how to achieve better low-light imaging using cell-phones (beating Google's HDR+), and how to classify images at 3 lux (substantially outperforming very deep convolutional networks, such as the Inception-v4 architecture). Using a similar methodology, I will discuss ways to miniaturize existing camera systems by designing ultra-thin, focus-tunable diffractive optics. Finally, I will present new exotic imaging modalities which enable new applications at the forefront of vision and imaging, such as seeing through scattering media and imaging objects outside direct line of sight.

Speaker's Biography: Felix Heide is a postdoctoral researcher working with Professor Gordon Wetzstein in the Department of Electrical Engineering at Stanford University. He is interested in the theory and application of computational imaging and vision systems. Researching imaging systems end-to-end, Felix's work lies at the intersection of optics, machine learning, optimization, computer graphics, and computer vision. Felix has co-authored over 25 publications and filed 3 patents. He co-founded the mobile vision start-up Algolux. Felix received his Ph.D. in December 2016 at the University of British Columbia under the advisement of Professor Wolfgang Heidrich. His doctoral dissertation focuses on optimization for computational imaging and won the Alain Fournier Ph.D. Dissertation Award.


Peter Gao


April 5, 2017 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Practical Computer Vision for Self-Driving Cars

Talk Abstract: Cruise is developing and testing a fleet of self-driving cars on the streets of San Francisco. Getting these cars to drive is a hard engineering and science problem; this talk explains roughly how self-driving cars work and how computer vision, from camera hardware to deep learning, helps make a self-driving car go.

Speaker's Biography: Peter Gao is the computer vision architect at Cruise.


Christian Sandor

Nara Institute of Science and Technology (NAIST)

March 14, 2017 4:30 pm to 5:30 pm

Location: Clark Auditorium in the BioX Building

Talk Title: Breaking the Barriers to True Augmented Reality

Talk Abstract: In 1950, Alan Turing introduced the Turing Test, an essential concept in the philosophy of Artificial Intelligence (AI). He proposed an “imitation game” to test the sophistication of an AI software. Similar tests have been suggested for fields including Computer Graphics and Visual Computing. In this talk, we will propose an Augmented Reality Turing Test (ARTT).

Augmented Reality (AR) embeds spatially registered computer graphics in the user's view in real time. This capability can be used for many purposes; for example, AR hands can demonstrate manual repair steps to a mechanic. To pass the ARTT, we must create AR objects that are indistinguishable from real objects. Ray Kurzweil bet USD 20,000 that the Turing Test will be passed by 2029. We think that the ARTT can be passed significantly earlier.

We will discuss the grand challenges for passing the ARTT, including: calibration, localization & tracking, modeling, rendering, display technology, and multimodal AR. We will also show examples from our previous and current work at Nara Institute of Science and Technology in Japan.

Speaker's Biography: Dr. Christian Sandor is an Associate Professor at one of Japan’s most prestigious research universities, Nara Institute of Science and Technology (NAIST), where he is co-directing the Interactive Media Design Lab together with Professor Hirokazu Kato. Since the year 2000, his foremost research interest is Augmented Reality, as he believes that it will have a profound impact on the future of mankind.

In 2005, he obtained a doctorate in Computer Science from the Munich University of Technology, Germany under the supervision of Prof. Gudrun Klinker and Prof. Steven Feiner. He decided to explore the research world in the spirit of Alexander von Humboldt and has lived outside of Germany ever since to work with leading research groups at institutions including: Columbia University (New York, USA), Canon’s Leading-Edge Technology Research Headquarters (Tokyo, Japan), Graz University of Technology (Austria), University of Stuttgart (Germany), and Tohoku University (Japan).

Before joining NAIST, he directed the Magic Vision Lab. Together with his students, he won awards at the premier Augmented Reality conference, the IEEE International Symposium on Mixed and Augmented Reality (ISMAR): best demo (2011) and best poster honourable mention (2012, 2013). He presented several keynotes and acquired funding close to 1.5 million dollars; in 2012, the Magic Vision Lab was the first, and still only, Australian lab to be awarded in Samsung's Global Research Outreach Program. In 2014, he received a Google Faculty Award for creating an Augmented Reality X-Ray system for Google Glass.


Professor Vivek Goyal

Boston University

February 22, 2017 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: First-Photon Imaging and Other Imaging with Few Photons

Talk Abstract: LIDAR systems use single-photon detectors to enable long-range reflectivity and depth imaging. By exploiting an inhomogeneous Poisson process observation model and the typical structure of natural scenes, first-photon imaging demonstrates the possibility of accurate LIDAR with only 1 detected photon per pixel, where half of the detections are due to (uninformative) ambient light. I will explain the simple ideas behind first-photon imaging. Then I will touch upon related subsequent works that mitigate the limitations of detector arrays, withstand 25 times more ambient light, allow for unknown ambient light levels, and capture multiple depths per pixel.
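
The role of the Poisson observation model can be seen in a toy single-pixel simulation: per laser pulse the detection count is Poisson with a mean combining signal and ambient rates, so the number of pulses until the first photon is geometric, and reflectivity can be estimated by inverting that relation. The gain, ambient rate, and pulse counts below are invented numbers, and real first-photon imaging additionally exploits spatial structure across pixels.

```python
import numpy as np

rng = np.random.default_rng(1)

def pulses_until_first_photon(reflectivity, ambient, gain=0.05):
    """Per pulse, detections are Poisson with mean gain*reflectivity + ambient,
    so the pulse index of the first detection is geometric with
    p = 1 - exp(-mean)."""
    p = 1.0 - np.exp(-(gain * reflectivity + ambient))
    return rng.geometric(p)

def estimate_reflectivity(mean_pulses, ambient, gain=0.05):
    """Invert E[pulses] = 1/p to recover a reflectivity estimate."""
    p_hat = 1.0 / mean_pulses
    return max((-np.log(1.0 - p_hat) - ambient) / gain, 0.0)

# Average many first-photon experiments for one pixel.
true_r, ambient = 0.8, 0.01
counts = [pulses_until_first_photon(true_r, ambient) for _ in range(20000)]
r_hat = estimate_reflectivity(np.mean(counts), ambient)
```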

Speaker's Biography: Vivek Goyal received the M.S. and Ph.D. degrees in electrical engineering from the University of California, Berkeley, where he received the Eliahu Jury Award for outstanding achievement in systems, communications, control, or signal processing. He was a Member of Technical Staff at Bell Laboratories, a Senior Research Engineer for Digital Fountain, and the Esther and Harold E. Edgerton Associate Professor of Electrical Engineering at MIT. He was an adviser to 3dim Tech, winner of the MIT $100K Entrepreneurship Competition Launch Contest Grand Prize, and subsequently worked with Nest Labs. He is now an Associate Professor of Electrical and Computer Engineering at Boston University.

Dr. Goyal is a Fellow of the IEEE. He was awarded the 2002 IEEE Signal Processing Society Magazine Award, an NSF CAREER Award, and the Best Paper Award at the 2014 IEEE International Conference on Image Processing. Work he supervised won student best paper awards at the IEEE Data Compression Conference in 2006 and 2011 and the IEEE Sensor Array and Multichannel Signal Processing Workshop in 2012, as well as five MIT thesis awards. He currently serves on the Editorial Board of Foundations and Trends in Signal Processing, the Scientific Advisory Board of the Banff International Research Station for Mathematical Innovation and Discovery, the IEEE SPS Computational Imaging SIG, and the IEEE SPS Industry DSP TC. He was a Technical Program Committee Co-chair of Sampling Theory and Applications 2015 and a Conference Co-chair of the SPIE Wavelets and Sparsity conference series from 2006 to 2016. He is a co-author of Foundations of Signal Processing (Cambridge University Press, 2014).


Professor Henry Fuchs

University of North Carolina

March 1, 2017 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: The AR/VR Renaissance: promises, disappointments, unsolved problems

Talk Abstract: Augmented and Virtual Reality have been hailed as “the next big thing” several times in the past 25 years. Some are predicting that VR will be the next computing platform, or at least the next platform for social media. Others worry that today’s VR systems are closer to the 1990s Apple Newton than the 2007 Apple iPhone. This talk will feature a short, personal history of AR and VR, a survey of some current work, sample applications, and remaining problems. Current work with encouraging results includes 3D acquisition of dynamic, populated spaces; compact and wide field-of-view AR displays; low-latency and high-dynamic-range AR display systems; and AR lightfield displays that may reduce the accommodation-vergence conflict.

Speaker's Biography: Henry Fuchs (PhD, Utah, 1975) is the Federico Gil Distinguished Professor of Computer Science and Adjunct Professor of Biomedical Engineering at UNC Chapel Hill, coauthor of over 200 papers, mostly on rendering algorithms (BSP Trees), graphics hardware (Pixel-Planes), head-mounted / near-eye and large-format displays, virtual and augmented reality, telepresence, medical and training applications. He is a member of the National Academy of Engineering, a fellow of the American Academy of Arts and Sciences, recipient of the 2013 IEEE VGTC Virtual Reality Career Award, and the 2015 ACM SIGGRAPH Steven Anson Coons Award.


Dr. Hans Kiening

Arnold & Richter Cine Technik

March 22, 2017 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: ARRIScope - A new era in surgical microscopy

Talk Abstract: The continuous increase in performance and the versatility of ARRI's digital motion picture camera systems led to our development of the first fully digital stereoscopic operating microscope, the ARRISCOPE. For the last 18 months, multiple units have been used in clinical trials at renowned clinics in the field of otology in Germany.
During our presentation we will cover the obstacles, initial applications and future potential of 3D camera-based surgical microscopes and give an insight into the technical preconditions and advantages of the digital imaging chain. To conclude the presentation, examples of different surgical procedures recorded with the ARRISCOPE, as well as near-future augmented and virtual reality 3D applications, will be demonstrated.

Speaker's Biography: Dr. Hans Kiening is general manager and founder of the medical business unit at Arnold & Richter (ARRI) based in Munich, Germany. He has more than 20 years of experience with image science and sensor technologies. He began his career at ARRI Research & Development in 1996, where he developed an automated image analysis and calibration system for the ARRILASER (a solid-state laser film recorder). In 2012, he conceptualized and realized the development of a purely digital surgical microscope based on the Alexa camera system - the ARRISCOPE. He holds a lectureship at the University of Applied Sciences in Munich and is the author of many SMPTE papers (journal award 2006) and patents related to medical and media image science. He holds a PhD in image science from the University of Cottbus, Germany.


Dr. Abe Davis

Stanford University

February 15, 2017 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Visual Vibration Analysis

Talk Abstract: I will show how video can be a powerful way to measure physical vibrations. By relating the frequencies of subtle, often imperceptible changes in video to the vibrations of visible objects, we can reason about the physical properties of those objects and the forces that drive their motion. In my talk I'll show how this can be used to recover sound from silent video (Visual Microphone), estimate the material properties of visible objects (Visual Vibrometry), and learn enough about the physics of objects to create plausible image-space simulations (Dynamic Video).
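As a toy illustration of the measurement principle (the camera frame rate, vibration frequency, and noise level below are arbitrary assumptions, not values from the work): when a surface vibrates, the mean intensity of an image patch wobbles at the same frequency, and a Fourier transform of that tiny time series can recover it even when the per-frame change is far below the noise floor.

```python
import numpy as np

fps = 2000.0                 # assumed high-speed camera frame rate
t = np.arange(4096) / fps
f_vib = 125.0                # hidden vibration frequency of the object, Hz

# Synthetic "video": the mean intensity of a patch wobbles subtly with the
# object's vibration, buried in sensor noise (amplitude well below the noise).
rng = np.random.default_rng(1)
signal = 0.1 * np.sin(2 * np.pi * f_vib * t) + rng.normal(0, 0.5, t.size)

# Recover the dominant motion frequency from the intensity time series.
spectrum = np.abs(np.fft.rfft(signal - signal.mean()))
freqs = np.fft.rfftfreq(t.size, d=1 / fps)
f_est = freqs[np.argmax(spectrum)]
print(f"recovered vibration frequency: {f_est:.1f} Hz")
```

Averaging over many frames concentrates the subtle periodic motion into a single spectral peak, which is the intuition behind recovering sound and material properties from video.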

Speaker's Biography: Abe Davis is a new postdoc at Stanford working with Doug James. He recently completed his PhD at MIT, where he was advised by Fredo Durand. His thesis focused on analyzing subtle variations in video to reason about physical vibrations. Abe has explored applications of his work in graphics, vision, and civil engineering, with publications in SIGGRAPH, SIGGRAPH Asia, and CVPR, as well as top venues in structural health monitoring and nondestructive testing. His dissertation won the 2016 Sprowls award for outstanding thesis in computer science. Abe's research has been featured in most major news outlets that cover science and technology. Business Insider named him one of the "8 most innovative scientists in tech and engineering" in 2015, and Forbes named him one of their "30 under 30" in 2016.


Dr. Jon Shlens

Google Research

March 8, 2017 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: A Learned Representation for Artistic Style

Talk Abstract: The diversity of painting styles represents a rich visual vocabulary for the construction of an image. The degree to which one may learn and parsimoniously capture this visual vocabulary measures our understanding of the higher level features of paintings, if not images in general. In this work we investigate the construction of a single, scalable deep network that can parsimoniously capture the artistic style of a diversity of paintings. We demonstrate that such a network generalizes across a diversity of artistic styles by reducing a painting to a point in an embedding space. Importantly, this model permits a user to explore new painting styles by arbitrarily combining the styles learned from individual paintings. We hope that this work provides a useful step towards building rich models of paintings and offers a window onto the structure of the learned representation of artistic style.
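The mechanism the paper uses to reduce a painting to a point in an embedding space is conditional instance normalization: a single shared network, with each style captured only by per-channel scale and shift parameters. A minimal sketch (shapes, style names, and parameter values are illustrative stand-ins, not learned weights):

```python
import numpy as np

rng = np.random.default_rng(0)
C, H, W = 8, 16, 16
features = rng.normal(size=(C, H, W))   # activations for one image

# Per-style embeddings: a (gamma, beta) pair per channel (random stand-ins
# here; in the model these are learned per painting).
styles = {
    "monet":     (rng.uniform(0.5, 2, C), rng.normal(0, 1, C)),
    "kandinsky": (rng.uniform(0.5, 2, C), rng.normal(0, 1, C)),
}

def cond_instance_norm(x, gamma, beta, eps=1e-5):
    """Normalize each channel over space, then apply the style's scale/shift."""
    mu = x.mean(axis=(1, 2), keepdims=True)
    sigma = x.std(axis=(1, 2), keepdims=True)
    return gamma[:, None, None] * (x - mu) / (sigma + eps) + beta[:, None, None]

# A new style as a convex combination of two learned embeddings:
a = 0.3
gamma = a * styles["monet"][0] + (1 - a) * styles["kandinsky"][0]
beta  = a * styles["monet"][1] + (1 - a) * styles["kandinsky"][1]
out = cond_instance_norm(features, gamma, beta)

# After the transform, each channel's mean/scale match the blended embedding.
print(out.mean(axis=(1, 2))[:2], beta[:2])
```

Blending the (gamma, beta) embeddings of two styles, as in the last lines, is how the model lets a user explore new combined painting styles.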

Speaker's Biography: Jonathon Shlens received his Ph.D. in computational neuroscience from UC San Diego in 2007, where his research focused on applying machine learning towards understanding visual processing in real biological systems. He was previously a research fellow at the Howard Hughes Medical Institute, a research engineer at Pixar Animation Studios and a Miller Fellow at UC Berkeley. He has been at Google Research since 2010 and is currently a research scientist focused on building scalable vision systems. During his time at Google, he has been a core contributor to deep learning systems including the recently open-sourced TensorFlow. His research interests have spanned the development of state-of-the-art image recognition systems and training algorithms for deep networks.


Professor Keisuke Goda

University of Tokyo

January 27, 2017 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: High-speed imaging meets single-cell analysis

Talk Abstract: High-speed imaging is an indispensable tool for blur-free observation and monitoring of fast transient dynamics in today’s scientific research, industry, defense, and energy. The field of high-speed imaging has steadily grown since Eadweard Muybridge demonstrated motion-picture photography in 1878. High-speed cameras are commonly used for sports, manufacturing, collision testing, robotic vision, missile tracking, and fusion science and are even available to professional photographers. Over the last few years, high-speed imaging has been shown to be highly effective for single-cell analysis – the study of individual biological cells among populations for identifying cell-to-cell differences and elucidating cellular heterogeneity invisible to population-averaged measurements. The marriage of these seemingly unrelated disciplines has been made possible by exploiting high-speed imaging’s capability of acquiring information-rich images at high frame rates to obtain a snapshot library of numerous cells in a short duration of time (with one cell per frame), which is useful for accurate statistical analysis of the cells. This is a paradigm shift in the field of high-speed imaging since the approach is radically different from its traditional use in slow-motion analysis. In this talk, I introduce a few different methods for high-speed imaging and their application to single-cell analysis for precision medicine and green energy.

Speaker's Biography: Keisuke Goda is Department Chair and a Professor of Physical Chemistry in the Department of Chemistry at the University of Tokyo and holds an adjunct faculty position at UCLA. He obtained his BA degree summa cum laude from UC Berkeley in 2001 and his PhD from MIT in 2007, both in physics. At MIT, he worked on the development of quantum-enhancement techniques in LIGO for gravitational-wave detection. His research currently focuses on the development of innovative laser-based molecular imaging and spectroscopy methods for data-driven science. He has been awarded the Gravitational Wave International Committee Thesis Award (2008), Burroughs Wellcome Fund Career Award (2011), Konica Minolta Imaging Science Award (2013), IEEE Photonics Society Distinguished Lecturers Award (2014), and WIRED Audi Innovation Award (2016). He serves as an Associate Editor for APL Photonics (AIP Publishing) and as a Young Global Leader of the World Economic Forum.


Professor Trevor Darrell

University of California at Berkeley

February 8, 2017 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Adversarial perceptual representation learning across diverse modalities and domains

Talk Abstract: Learning of layered or "deep" representations has provided significant advances in computer vision in recent years, but has traditionally been limited to fully supervised settings with very large amounts of training data. New results in adversarial adaptive representation learning show how such methods can also excel when learning in sparse/weakly labeled settings across modalities and domains. I'll review state-of-the-art models for fully convolutional pixel-dense segmentation from weakly labeled input, and will discuss new methods for adapting models to new domains with few or no target labels for categories of interest. As time permits, I'll present recent long-term recurrent network models that learn cross-modal description and explanation, visuomotor robotic policies that adapt to new domains, and deep autonomous driving policies that can be learned from heterogeneous large-scale dashcam video datasets.

Speaker's Biography: Prof. Darrell is on the faculty of the CS Division of the EECS Department at UC Berkeley and he is also appointed at the UC-affiliated International Computer Science Institute (ICSI). Darrell’s group develops algorithms for large-scale perceptual learning, including object and activity recognition and detection, for a variety of applications including multimodal interaction with robots and mobile devices. His interests include computer vision, machine learning, computer graphics, and perception-based human computer interfaces. Prof. Darrell was previously on the faculty of the MIT EECS department from 1999-2008, where he directed the Vision Interface Group. He was a member of the research staff at Interval Research Corporation from 1996-1999, and received the S.M. and Ph.D. degrees from MIT in 1992 and 1996, respectively. He obtained the B.S.E. degree from the University of Pennsylvania in 1988, having started his career in computer vision as an undergraduate researcher in Ruzena Bajcsy's GRASP lab.


Professor Alfredo Dubra

Stanford University Department of Ophthalmology

January 18, 2017 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Adaptive optics retinal imaging: more than just high-resolution

Talk Abstract: The majority of the cells in the retina do not reproduce, making early diagnosis of eye disease paramount. Through the improved resolution provided by the correction of ocular monochromatic aberrations, adaptive optics combined with conventional and novel imaging techniques reveals pathology at the cellular scale. When compared with existing clinical tools, the ability to visualize retinal cells and microscopic structures non-invasively represents a quantum leap in the potential for diagnosing and managing ocular, systemic and neurological diseases. The presentation will first cover the adaptive optics technology itself and some of its unique technical challenges. This will be followed by a review of AO-enhanced imaging modalities applied to the study of the healthy and diseased eye, with particular focus on multiple-scattering imaging to reveal transparent retinal structures.

Speaker's Biography: Alfredo (Alf) Dubra is an Associate Professor of Ophthalmology at Stanford (Byers Eye Institute). He trained in physics at the Universidad de la República in Uruguay (BSc and MSc) and at Imperial College London in the United Kingdom (PhD). Before joining Stanford, he was with the University of Rochester and the Medical College of Wisconsin. His research focuses on the translation of mathematical, engineering and optical tools for diagnosing, monitoring the progression of, and treating ocular disease.


Professor Emily Cooper

Psychological and Brain Sciences Department at Dartmouth College

January 11, 2017 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Designing and assessing near-eye displays to increase user inclusivity

Talk Abstract: Recent years have seen impressive growth in near-eye display systems, which are the basis of most virtual and augmented reality experiences. There are, however, a unique set of challenges to designing a display system that is literally strapped to the user’s face. With an estimated half of all adults in the United States requiring some level of visual correction, maximizing inclusivity for near-eye displays is essential. I will describe work that combines principles from optics, optometry, and visual perception to identify and address major limitations of near-eye displays, both for users with normal vision and for those who require common corrective lenses. I will also describe ongoing work assessing the potential for near-eye displays to assist people with less common visual impairments in performing day-to-day tasks.

Speaker's Biography: Emily Cooper is an assistant research professor in the Psychological and Brain Sciences Department at Dartmouth College. Emily’s research focuses on basic and applied visual perception, including 3D vision and perceptual graphics. She received her B.A. in Psychology and English Literature from the University of Chicago in 2007. She received her Ph.D. in Neuroscience from the University of California, Berkeley in 2012. Following a postdoctoral fellowship at Stanford University, she joined the faculty at Dartmouth College in 2015.


Professor Daniel Palanker

Stanford University

December 6, 2016 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Electronic augmentation of body functions: progress in electro-neural interfaces

Talk Abstract: The electrical nature of neural signaling allows efficient bi-directional electrical communication with the nervous system. Currently, electro-neural interfaces are utilized for partial restoration of sensory functions, such as hearing and sight, actuation of prosthetic limbs and restoration of tactile sensitivity, enhancement of tear secretion, and many others. Deep brain stimulation helps control tremor in patients with Parkinson’s disease, improves muscle control in dystonia, and aids in other neurological disorders. With technological advances and progress in understanding of the neural systems, these interfaces may allow not only restoration or augmentation of lost functions, but also expansion of our natural capabilities – sensory, cognitive and others. I will review the state of the field and future directions of technological development.

Speaker's Biography: Daniel Palanker is a Professor in the Department of Ophthalmology and Director of the Hansen Experimental Physics Laboratory at Stanford University. He received MSc in Physics in 1984 from the Yerevan State University in Armenia, and PhD in Applied Physics in 1994 from the Hebrew University of Jerusalem, Israel.

Dr. Palanker studies interactions of electric fields with biological cells and tissues, and develops optical and electronic technologies for diagnostic, therapeutic, surgical and prosthetic applications, primarily in ophthalmology. These studies include laser-tissue interactions with applications to ocular therapy and surgery, and interferometric imaging of neural signals. In the field of electro-neural interfaces, Dr. Palanker is developing a retinal prosthesis for restoration of sight to the blind and implants for electronic control of secretory glands and blood vessels.

Several of his developments are in clinical practice world-wide: Pulsed Electron Avalanche Knife (PEAK PlasmaBlade), Patterned Scanning Laser Photocoagulator (PASCAL), and OCT-guided Laser System for Cataract Surgery (Catalys). Several others are in clinical trials: Gene therapy of the retinal pigment epithelium (Ocular BioFactory, Avalanche Biotechnologies Inc); Neural stimulation for enhanced tear secretion (TearBud, Allergan Inc.); Smartphone-based ophthalmic diagnostics and monitoring (Paxos, DigiSight Inc.).


Dr. Alexandre Alahi

Stanford University

November 29, 2016 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Towards Socially-aware AI

Talk Abstract: Over the past sixty years, Intelligent Machines have made great progress in playing games, tagging images in isolation, and recently making decisions for self-driving vehicles. Despite these advancements, they are still far from making decisions in social scenes and effectively assisting humans in public spaces such as terminals, malls, campuses, or any crowded urban environment. To overcome these limitations, I claim that we need to empower machines with social intelligence, i.e., the ability to get along well with others and facilitate mutual cooperation. This is crucial to design future generations of smart spaces that adapt to the behavior of humans for efficiency, or develop autonomous machines that assist in crowded public spaces (e.g., delivery robots, or self-navigating Segways).

In this talk, I will present my work towards socially-aware machines that can understand human social dynamics and learn to forecast them. First, I will highlight the machine vision techniques behind understanding the behavior of more than 100 million individuals captured by multi-modal cameras in urban spaces. I will show how to use sparsity promoting priors to extract meaningful information about human behavior. Second, I will introduce a new deep learning method to forecast human social behavior. The causality behind human behavior is an interplay between both observable and non-observable cues (e.g., intentions). For instance, when humans walk into crowded urban environments such as a busy train terminal, they obey a large number of (unwritten) common sense rules and comply with social conventions. They typically avoid crossing groups and keep a personal distance from their surroundings. I will present detailed insights on how to learn these interactions from millions of trajectories. I will describe a new recurrent neural network that can jointly reason over correlated sequences and forecast human trajectories in crowded scenes. It opens new avenues of research in learning the causalities behind the world we observe. I will conclude my talk by mentioning some ongoing work in applying these techniques to social robots, and the future generations of smart hospitals.
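The "unwritten rules" modeling described above hinges on summarizing a pedestrian's neighbors before each recurrent update. A simplified occupancy-grid version of that social pooling step can be sketched as follows (grid size and cell dimensions are our own illustrative choices, and the full model pools recurrent hidden states rather than raw neighbor counts):

```python
import numpy as np

def social_grid(positions, me, grid=4, cell=0.5):
    """Occupancy counts of neighbors in a grid x grid neighborhood around `me`.

    This summary would be concatenated with pedestrian `me`'s own state and
    fed into a recurrent cell at every time step.
    """
    g = np.zeros((grid, grid))
    half = grid * cell / 2            # neighborhood extends `half` metres each way
    for j in range(len(positions)):
        if j == me:
            continue
        dx, dy = positions[j] - positions[me]
        if -half <= dx < half and -half <= dy < half:
            gx = int((dx + half) / cell)
            gy = int((dy + half) / cell)
            g[gy, gx] += 1
    return g

# Four pedestrians; the last one is far away and should not influence pedestrian 0.
positions = np.array([[0.0, 0.0], [0.4, 0.1], [-0.3, -0.2], [5.0, 5.0]])
grid = social_grid(positions, me=0)
print(grid)
```

Because the grid is recomputed at every step, the recurrent model sees how the local crowd configuration evolves, which is what lets it learn conventions like group avoidance and personal distance.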

Speaker's Biography: Alexandre Alahi is currently a research scientist at Stanford University and received his PhD from EPFL in Switzerland (nominated for the EPFL PhD prize). His research enables machines to perceive the world and make decisions in the context of transportation problems and built environments at all scales. His work centers on understanding and predicting human social behavior from multi-modal data. He has worked on the theoretical and practical applications of socially-aware Artificial Intelligence. He was awarded the Swiss NSF early and advanced researcher grants for his work on predicting human social behavior. He won the CVPR 2012 Open Source Award for his work on a retina-inspired image descriptor, and the ICDSC 2009 Challenge Prize for his sparsity-driven algorithm that has tracked more than 100 million pedestrians in train terminals. His research has been covered internationally by the BBC, Euronews, and The Wall Street Journal, as well as national news outlets in the US and Switzerland. Finally, he co-founded the startup Visiosafe, won several startup competitions, and was named one of the Top 20 Swiss Venture Leaders in 2010.


Saverio Murgia

Horus Technology

November 15, 2016 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Designing a smart wearable camera for blind and visually impaired people

Talk Abstract: Horus Technology was founded in July 2014 with the goal of creating a smart wearable camera for blind and visually impaired people, featuring intelligent algorithms that can understand the environment around the user and describe it out loud. Two years later, Horus has a working prototype being tested by a number of blind people in Europe and North America. Harnessing the power of portable GPUs, stereo vision and deep learning algorithms, Horus can read text in different languages, learn and recognize faces and objects, and identify obstacles. In designing a wearable device, we had to face a number of challenges and difficult choices. We will describe our system and our design choices for both software and hardware, and we will end with a small demo of Horus's capabilities.

Speaker's Biography: Founder and CEO of Horus Technology, Saverio Murgia is passionate about machine learning, computer vision and robotics. Both an engineer and an entrepreneur, in 2015 he obtained a double MSc/MEng in Advanced Robotics from the Ecole Centrale de Nantes and the University of Genoa. He also holds a degree in management from ISICT and a BSc in Biomedical Engineering from the University of Genoa. Before founding Horus Technology, Saverio was a visiting researcher at EPFL and the Italian Institute of Technology.


Dr. Emanuele Mandelli

InVisage Technologies

November 9, 2016 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Quantum dot-based image sensors for cutting-edge commercial multispectral cameras

Talk Abstract: This work presents the development of a quantum dot-based photosensitive film engineered to be integrated on standard CMOS process wafers. It enables the design of exceptionally high performance, reliable image sensors. Quantum dot solids absorb light much more strongly than typical silicon-based photodiodes do, and with the ability to tune the effective material bandgap, quantum dot-based imagers enable higher quantum efficiency over extended spectral bands, in both the visible and IR regions of the spectrum. Moreover, a quantum dot-based image sensor enables desirable functions such as ultra-small pixels with low crosstalk, high full well capacity, global shutter and wide dynamic range at a relatively low manufacturing cost. At InVisage, we have optimized the manufacturing process flow and are now able to produce high-end image sensors for both visible and NIR in quantity.

Speaker's Biography: Emanuele is Vice President of Engineering at InVisage Technologies, an advanced materials and camera platform company based in Menlo Park, CA. He has more than 20 years of experience with image sensors, X-ray, and particle physics detectors. He began his career at the Lawrence Berkeley National Laboratory, where he designed integrated circuits for high energy physics and helped deliver the pixel readout modules for the ATLAS inner detector at CERN's Large Hadron Collider, which confirmed the existence of the Higgs boson. He then joined AltaSens, an early stage startup company spun off from Rockwell Scientific and later acquired by JVC Kenwood, where he designed high-end CMOS image sensors for cinematographers, television broadcasters and filmmakers. He has been a reviewer for the NSS-MIC conference and is the author of numerous papers and image sensor-related patents. He holds a PhD, MS and BS in Electrical Engineering and Computer Science from the University of Pavia, Italy.


Dr. Christy Fernandez Cull

MIT Media Lab

November 1, 2016 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Smart pixel imaging with computational arrays

Talk Abstract: This talk will review architectures for computational imaging arrays in which algorithms and cameras are co-designed. The seminar will then focus on novel digital readout integrated circuits (DROICs) that achieve snapshot on-chip high dynamic range and object tracking, where most commercial systems require a multiple-exposure acquisition.

Speaker's Biography: Christy Fernandez-Cull received her M.S. and Ph.D. in Electrical and Computer Engineering from Duke University. She has worked at MIT Lincoln Laboratory as a member of the technical staff in the Sensor Technology and System Applications group and is a research affiliate with the Camera Culture Group at MIT Media Laboratory. She is an active member in the OSA, SPIE, IEEE, and SHPE. Christy has worked on and published papers pertaining to computational imager design, coded aperture systems, photonics, compressive holography, weather satellites, periscope systems and millimeter-wave holography systems. Her interests include research and development efforts in science and technology, volunteering to stimulate science, technology, engineering, and math disciplines in K-12 grades, and keeping up-to-date with advances in science policy.


Dr. Matthew O'Toole

Stanford University

October 18, 2016 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Optical Probing for Analyzing Light Transport

Talk Abstract: Active illumination techniques enable self-driving cars to detect and avoid hazards, optical microscopes to see deep into volumetric specimens, and light stages to digitally capture the shape and appearance of subjects. These active techniques work by using controllable lights to emit structured illumination patterns into an environment, and sensors to detect and process the light reflected back in response. Although such techniques confer many unique imaging capabilities, they often require long acquisition and processing times, rely on predictive models for the way light interacts with a scene, and cease to function when exposed to bright ambient sunlight.

In this talk, we introduce a generalized form of active illumination—known as optical probing—that provides a user with unprecedented control over which light paths contribute to a photo. The key idea is to project a sequence of illumination patterns onto a scene, while simultaneously using a second sequence of mask patterns to physically block the light received at select sensor pixels. This all-optical technique enables RAW photos to be captured in which specific light paths are blocked, attenuated, or enhanced. We demonstrate experimental probing prototypes with the ability to (1) record live direct-only or indirect-only video streams of a scene, (2) capture the 3D shape of objects in the presence of complex transport properties and strong ambient illumination, and (3) overcome the multi-path interference problem associated with time-of-flight sensors.
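The path-selection idea can be captured in a toy linear-transport model (our own simplification; a real probing system applies the patterns and masks optically, before the sensor ever reads out): describe a scene of n "pixels" by an n x n transport matrix T whose diagonal carries direct reflections and whose off-diagonal entries carry indirect light.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 6
# Synthetic transport matrix: strong direct reflections on the diagonal,
# weak interreflections everywhere else.
direct = np.diag(rng.uniform(0.5, 1.0, n))
indirect = 0.05 * rng.uniform(size=(n, n))
np.fill_diagonal(indirect, 0.0)
T = direct + indirect

def probe(T, patterns, masks):
    """Sum of masked photos: sum_k m_k * (T @ p_k), mask applied per photo."""
    return sum(m * (T @ p) for p, m in zip(patterns, masks))

# Matched impulse pattern/mask pairs pass only light that leaves projector
# pixel k and arrives at the corresponding camera pixel k: a direct-only photo.
impulses = list(np.eye(n))
direct_only = probe(T, impulses, impulses)

# A full-on pattern with a full-on mask is just an ordinary photo.
ones = [np.ones(n)]
regular = probe(T, ones, ones)
indirect_only = regular - direct_only
print(direct_only)
```

The matched impulse sequence accumulates exactly diag(T), i.e. the direct component; subtracting it from an ordinary photo leaves the indirect-only image. In practice far fewer, cleverly coded patterns achieve the same selection, which is what makes live direct-only or indirect-only video feasible.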

Speaker's Biography: Matthew O’Toole is a Banting Postdoctoral Fellow at Stanford’s Computational Imaging group. He received his Ph.D. degree from the University of Toronto in 2016 on research related to active illumination and light transport. He organized the IEEE 2016 International Workshop on Computational Cameras and Displays, and was a visiting student at the MIT Media Lab’s Camera Culture group in 2011. His work was the recipient of two “Best Paper Honorable Mention” awards at CVPR 2014 and ICCV 2007, and two “Best Demo” awards at CVPR 2015 and ICCP 2015.


Brian Cabral

Facebook

October 12, 2016 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: The Soul of a New Camera: The design of Facebook's Surround Open Source 3D-360 video camera

Talk Abstract: Around a year ago we set out to create an open-source reference design for a 3D-360 camera. In nine months, we had designed and built the camera, and published the specs and code. Our team leveraged a series of maturing technologies in this effort. Advances in and the availability of sensor technology, 20+ years of computer vision algorithm development, 3D printing, rapid design prototyping, and computational photography allowed our team to move extremely fast. We will delve into the roles each of these technologies played in the design of the camera, giving an overview of the system components and discussing the tradeoffs made during the design process. The engineering complexities and technical elements of 360 stereoscopic video capture will be discussed as well. We will end with some demos of the system and its output.

Speaker's Biography: Brian Cabral is Director of Engineering at Facebook, specializing in computational photography, computer vision, and computer graphics. He is the holder of numerous patents (filed and issued) and led the Surround 360 VR camera team. He has published a number of diverse papers in the area of computer graphics and imaging, including the pioneering Line Integral Convolution algorithm.


Bernard Kress

Microsoft Research

October 5, 2016 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Human-centric optical design: a key for next generation AR and VR optics

Talk Abstract: The ultimate wearable display is an information device that people can use all day. It should be as forgettable as a pair of glasses or a watch, but more useful than a smart phone. It should be small, light, low-power, high-resolution and have a large field of view (FOV). Oh, and one more thing, it should be able to switch from VR to AR.

These requirements pose challenges for hardware and, most importantly, optical design. In this talk, I will review existing AR and VR optical architectures and explain why it is difficult to create a small, light and high-resolution display that has a wide FOV. Because comfort is king, new optical designs for next-generation AR and VR systems should be guided by an understanding of the capabilities and limitations of the human visual system.
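A back-of-envelope calculation (our own numbers, not from the talk) shows why "high-resolution" and "wide FOV" pull against each other: matching roughly 1-arcminute foveal acuity across the whole field multiplies into enormous per-eye pixel counts.

```python
# The eye resolves roughly 1 arcminute = 1/60 degree, so matching foveal
# acuity everywhere requires one pixel per arcminute across the FOV.
acuity_deg = 1 / 60   # assumed angular resolution per pixel, degrees

for fov_deg in (30, 60, 100, 160):
    pixels_per_axis = fov_deg / acuity_deg
    total_mpix = pixels_per_axis ** 2 / 1e6   # square FOV, per eye
    print(f"{fov_deg:3d} deg FOV -> {pixels_per_axis:5.0f} px/axis, "
          f"{total_mpix:6.1f} Mpix per eye")
```

At a 100-degree field this works out to roughly 36 Mpix per eye, far beyond typical current microdisplays, which is one reason designs trade off FOV against resolution or exploit foveation.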

Speaker's Biography: Over the past two decades, Bernard has made significant scientific contributions as an engineer, researcher, associate professor, consultant, instructor, and author.
He has been instrumental in developing numerous optical sub-systems for consumer electronics and industrial products, generating IP, teaching, and transferring technological solutions to industry. Application sectors include laser materials processing, optical anti-counterfeiting, biotech sensors, optical telecom devices, optical data storage, optical computing, optical motion sensors, digital image projection, displays, depth-map sensors, and more recently head-up and head-mounted displays (smart glasses, AR and VR).
He is specifically involved in the fields of micro-optics, wafer-scale optics, holography and nanophotonics.
Bernard holds 32 patents granted worldwide and has published numerous books and book chapters on micro-optics. He is a short-course instructor for SPIE and has been involved in numerous SPIE conferences as a technical committee member and conference co-chair.
He has been an SPIE Fellow since 2013 and was recently elected to the SPIE Board of Directors.
Bernard joined Google [X] Labs in 2011 as Principal Optical Architect, and is now Partner Optical Architect at Microsoft Corp. on the HoloLens project.


Professor Gal Chechik

Google and the Gonda Brain Research Center at Bar-Ilan University in Israel

May 11, 2016 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Machine learning for large-scale image understanding

Talk Abstract: The recent progress in recognizing visual objects and annotating images has been driven by super-rich models and massive datasets. However, machine vision models still have a very limited 'understanding' of images, rendering them brittle when attempting to generalize to unseen examples. I will describe recent efforts to improve the robustness and accuracy of systems for annotating and retrieving images, first, by using structure in the space of images and fusing various types of information about image labels, and second, by matching structures in visual scenes to structures in their corresponding language descriptions or queries. We apply these approaches to billions of queries and images, to improve search and annotation of public images and personal photos.


Speaker's Biography: Gal Chechik is a professor at the Gonda brain research center, Bar-Ilan University, Israel, and a senior research scientist at Google. His work focuses on learning in brains and in machines. Specifically, he studies the principles governing representation and adaptivity at multiple timescales in the brain, and algorithms for training computers to represent signals and learn from examples. Gal earned his PhD in 2004 from the Hebrew University of Jerusalem, developing machine learning and probabilistic methods to understand the auditory neural code. He then studied computational principles regulating molecular cellular pathways as a postdoctoral researcher in the Computer Science department at Stanford. In 2007, he joined Google Research as a senior research scientist, developing large-scale machine learning algorithms for machine perception. Since 2009, he has headed the computational neurobiology lab at BIU, and he was appointed an associate professor in 2013. He was awarded a Fulbright fellowship, a complexity scholarship and the Israeli national Alon fellowship.


Professor Matthias Niessner

Stanford University

May 4, 2016 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Interactive 3D: Static and Dynamic Environment Capture in Real Time

Talk Abstract: In recent years, commodity 3D sensors have become easily and widely available. These advances in sensing technology have inspired significant interest in using captured 3D data for mapping and understanding 3D environments. In this talk, I will show how we can now easily obtain 3D reconstructions of static and dynamic environments in an interactive manner, and how we can process and utilize the data efficiently in real time on modern graphics hardware. In a concrete example application for 3D reconstruction, I will talk about facial reenactment, where we use an intermediate 3D reconstruction to interactively edit videos in real time.


Speaker's Biography: Matthias Niessner is a visiting assistant professor at Stanford University. Prior to his appointment at Stanford, he earned his PhD from the University of Erlangen-Nuremberg, Germany, under the supervision of Günther Greiner. His research focuses on different fields of computer graphics and computer vision, including the reconstruction and semantic understanding of 3D scene environments.



Professor Brian Wandell


April 27, 2016 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Learning the image processing pipeline

Talk Abstract: Many creative ideas are being proposed for image sensor designs, and these may be useful in applications ranging from consumer photography to computer vision. To understand and evaluate each new design, we must create a corresponding image-processing pipeline that transforms the sensor data into a form that is appropriate for the application. Designing and optimizing these pipelines is time-consuming and costly. I explain a method that combines machine learning and image systems simulation to automate pipeline design. The approach is based on a new way of thinking of the image-processing pipeline as a large collection of local linear filters. Finally, I illustrate how the method has been used to design pipelines for consumer photography and mobile imaging.
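
The key idea, treating the pipeline as a large collection of local linear filters whose coefficients are learned from matched sensor/target image pairs, can be sketched as follows. This is a minimal illustration only, not the method from the talk: the patch classification by mean-intensity quantile, the class count, and the plain least-squares solver are all simplifying assumptions.

```python
import numpy as np

def extract_patches(img, k=3):
    """Collect k x k patches around every interior pixel of a 2D sensor image."""
    H, W = img.shape
    r = k // 2
    return np.array([img[i - r:i + r + 1, j - r:j + r + 1].ravel()
                     for i in range(r, H - r) for j in range(r, W - r)])

def train_local_linear_filters(sensor, target, k=3, n_classes=4):
    """Learn one linear filter per patch class via least squares.

    sensor: raw (simulated) sensor image; target: desired rendered image.
    Patches are classified here by mean-intensity quantile, a stand-in
    for richer local statistics."""
    r = k // 2
    X = extract_patches(sensor, k)
    y = target[r:-r, r:-r].ravel()
    means = X.mean(axis=1)
    edges = np.quantile(means, np.linspace(0, 1, n_classes + 1))
    labels = np.clip(np.searchsorted(edges, means, side="right") - 1, 0, n_classes - 1)
    filters = {}
    for c in range(n_classes):
        m = labels == c
        if m.any():
            filters[c] = np.linalg.lstsq(X[m], y[m], rcond=None)[0]
    return filters, edges

def apply_filters(sensor, filters, edges, k=3):
    """Render an image by applying the class-appropriate linear filter per pixel."""
    r = k // 2
    X = extract_patches(sensor, k)
    means = X.mean(axis=1)
    n_classes = len(edges) - 1
    labels = np.clip(np.searchsorted(edges, means, side="right") - 1, 0, n_classes - 1)
    out = np.zeros(len(X))
    for c, f in filters.items():
        m = labels == c
        out[m] = X[m] @ f
    H, W = sensor.shape
    return out.reshape(H - 2 * r, W - 2 * r)
```

In the learned-pipeline setting, `sensor` would come from an image systems simulation of a candidate sensor design and `target` from the ideal rendering for the application; here any pair of registered images works.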


Speaker's Biography: Brian A. Wandell is the first Isaac and Madeline Stein Family Professor. He joined the Stanford Psychology faculty in 1979 and is a member, by courtesy, of Electrical Engineering and Ophthalmology. Wandell is the founding director of Stanford’s Center for Cognitive and Neurobiological Imaging, and a Deputy Director of the Stanford Neuroscience Institute. He is the author of the vision science textbook Foundations of Vision. His research centers on vision science, spanning topics from visual disorders and reading development in children to digital imaging devices and algorithms for both magnetic resonance imaging and digital imaging. In 1996, together with Prof. J. Goodman, Wandell founded Stanford’s Center for Image Systems Engineering, which evolved into SCIEN in 2003.


Professor Ramesh Raskar

MIT Media Lab

April 20, 2016 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Extreme Computational Photography

Talk Abstract: The Camera Culture Group at the MIT Media Lab aims to create a new class of imaging platforms. This talk will discuss three tracks of research: femto photography, retinal imaging, and 3D displays.
Femto Photography consists of femtosecond laser illumination, picosecond-accurate detectors and mathematical reconstruction techniques that allow researchers to visualize the propagation of light. Direct recording of reflected or scattered light at such a frame rate with sufficient brightness is nearly impossible. Using an indirect 'stroboscopic' method that records millions of repeated measurements by careful scanning in time and viewpoints, we can rearrange the data to create a 'movie' of a nanosecond-long event. Femto photography and a new generation of nano-photography (using ToF cameras) allow powerful inference with computer vision in the presence of scattering.

EyeNetra is a mobile phone attachment that allows users to test their own eyesight. The device reveals corrective measures, thus bringing vision care to billions of people who would not have had access otherwise. Another project, eyeMITRA, is a mobile retinal imaging solution that brings retinal exams into the realm of routine care by lowering the cost of the imaging device to a tenth of its current cost and integrating the device with image analysis software and predictive analytics. This provides early detection of diabetic retinopathy, which can change the arc of growth of the world’s largest cause of blindness.

Finally, the talk will describe novel lightfield cameras and lightfield displays that require a compressive optical architecture to deal with the high bandwidth requirements of 4D signals.


Speaker's Biography: Ramesh Raskar is an Associate Professor at the MIT Media Lab, which he joined from Mitsubishi Electric Research Laboratories in 2008 as head of the Lab’s Camera Culture research group. His research interests span the fields of computational photography, inverse problems in imaging and human-computer interaction. Recent projects and inventions include transient imaging to look around a corner, a next-generation CAT-scan machine, imperceptible markers for motion capture (Prakash), long-distance barcodes (Bokode), touch+hover 3D interaction displays (BiDi screen), low-cost eye care devices (Netra, Catra), new theoretical models to augment light fields (ALF) to represent wave phenomena, and algebraic rank constraints for 3D displays (HR3D).

In 2004, Raskar received the TR100 Award from Technology Review, which recognizes top young innovators under the age of 35, and in 2003, the Global Indus Technovator Award, instituted at MIT to recognize the top 20 Indian technology innovators worldwide. In 2009, he was awarded a Sloan Research Fellowship, and in 2010, the DARPA Young Faculty Award. Other awards include a Marr Prize honorable mention (2009); the LAUNCH Health Innovation Award, presented by NASA, USAID, the US State Department and NIKE (2010); and first place in the Vodafone Wireless Innovation Project Award (2011). He holds over 50 US patents and has received four Mitsubishi Electric Invention Awards. He is currently co-authoring a book on Computational Photography.


Professor Richard Baraniuk

Rice University

April 14, 2016 4:15 pm to 5:15 pm

Location: Clark Auditorium

Talk Title: A Probabilistic Theory of Deep Learning

Talk Abstract: A grand challenge in machine learning is the development of computational algorithms that match or outperform humans in perceptual inference tasks that are complicated by nuisance variation. For instance, visual object recognition involves unknown object position, orientation, and scale, while speech recognition involves unknown voice pronunciation, pitch, and speed. Recently, a new breed of deep learning algorithms has emerged for high-nuisance inference tasks, routinely yielding pattern recognition systems with near- or super-human capabilities. But a fundamental question remains: Why do they work? Intuitions abound, but a coherent framework for understanding, analyzing, and synthesizing deep learning architectures has remained elusive. We answer this question by developing a new probabilistic framework for deep learning based on the Deep Rendering Model: a generative probabilistic model that explicitly captures latent nuisance variation. By relaxing the generative model to a discriminative one, we can recover two of the current leading deep learning systems, deep convolutional neural networks and random decision forests, providing insights into their successes and shortcomings, a principled route to their improvement, and new avenues for exploration.

Speaker's Biography: Richard G. Baraniuk is the Victor E. Cameron Professor of Electrical and Computer Engineering at Rice University. His research interests lie in new theory, algorithms, and hardware for sensing, signal processing, and machine learning. He is a Fellow of the IEEE and AAAS and has received national young investigator awards from the US NSF and ONR, the Rosenbaum Fellowship from the Isaac Newton Institute of Cambridge University, the ECE Young Alumni Achievement Award from the University of Illinois, the Wavelet Pioneer and Compressive Sampling Pioneer Awards from SPIE, the IEEE Signal Processing Society Best Paper Award, and the IEEE Signal Processing Society Technical Achievement Award. His work on the Rice single-pixel compressive camera has been widely reported in the popular press and was selected by MIT Technology Review as a TR10 Top 10 Emerging Technology. For his teaching and education projects, including Connexions and OpenStax, he has received the C. Holmes MacDonald National Outstanding Teaching Award from Eta Kappa Nu, the Tech Museum of Innovation Laureate Award, the Internet Pioneer Award from the Berkman Center for Internet and Society at Harvard Law School, the World Technology Award for Education, the IEEE-SPS Education Award, the WISE Education Award, and the IEEE James H. Mulligan, Jr. Medal for Education.


Nicholas Frushour and Michael Carney

Canon Mixed Reality

April 13, 2016 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Practical uses of mixed reality in a manufacturing workflow

Talk Abstract: There are strong use-cases for augmented and mixed reality outside of entertainment. One particularly practical use is in the manufacturing industry. It’s an industry with well-established workflows and product cycles, but also known problems and pinch-points. In this talk, we will walk through each step of the Product Lifecycle Management workflow and discuss how mixed reality is helping manufacturers each step of the way. We will also cover the background of mixed reality and the key differences between AR, MR, and VR.

After the talk, there will be a demo: the Canon MREAL System enables designers and engineers to review and interact with CAD designs. 3D designs can be viewed from any angle, including a cross-section view for advanced visualization and design assessment.

Speaker's Biography: Nicholas Frushour is a software engineer at Canon and has been working with mixed reality for over 3 years.

Michael Carney is a Visualization Consultant specializing in Mixed Reality at Canon USA. He is proficient not only in the technology of New Media but also in the culture in which it is immersed, how it is used, and the direction in which trends are heading. He is now applying Mixed Reality conventions to enterprise use cases such as design and manufacturing, high-risk training, and education.


Dr. Nicolas Pégard

U.C. Berkeley

April 6, 2016 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Compressive light-field microscopy for 3D functional imaging of the living brain.

Talk Abstract: We present a new microscopy technique for 3D functional neuroimaging in live brain tissue. The device is a simple light-field fluorescence microscope that allows full-volume acquisition in a single shot and can be miniaturized into a portable implant. Our computational methods first rely on the spatial and temporal sparsity of fluorescence signals to identify and precisely localize neurons. We compute for each neuron a unique pattern, the light-field signature, that accounts for the effects of optical scattering and aberrations. The technique then yields a precise localization of active neurons and enables quantitative measurement of fluorescence with individual-neuron spatial resolution and at high speeds, all without ever reconstructing a volume image. Experimental results are shown on live zebrafish.
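
In spirit, once each neuron's light-field signature is known, per-neuron activity can be demixed from a raw sensor frame as a nonnegative linear inverse problem, with no volume reconstruction. The following sketch assumes the signature matrix is given and uses a generic projected-gradient solver; it illustrates the idea, not the authors' algorithm:

```python
import numpy as np

def demix_activity(signatures, frame, n_iter=500):
    """Recover nonnegative per-neuron activity a from a raw frame y ≈ S @ a.

    signatures: (n_pixels, n_neurons) matrix, one column per neuron's
    light-field signature; frame: flattened raw sensor measurement.
    Solved by projected gradient descent (a simple stand-in solver)."""
    S = np.asarray(signatures, dtype=float)
    y = np.asarray(frame, dtype=float)
    a = np.zeros(S.shape[1])
    step = 1.0 / np.linalg.norm(S.T @ S, 2)   # 1 / Lipschitz constant of the gradient
    for _ in range(n_iter):
        a -= step * (S.T @ (S @ a - y))       # gradient step on ||S a - y||^2 / 2
        np.maximum(a, 0.0, out=a)             # project onto the constraint a >= 0
    return a
```

Because fluorescence is nonnegative, the projection step is the natural constraint; with well-conditioned signatures the solver recovers each neuron's activity directly from the 2D measurement.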


Speaker's Biography: Nicolas Pégard received his B.S. in Physics from Ecole Polytechnique (France) in 2009, and his Ph.D. in Electrical Engineering from Princeton University under Prof. Fleischer in 2014. He is now a postdoctoral researcher at U.C. Berkeley under the supervision of Prof. H. Adesnik (Dept. of Molecular and Cell Biology) and Prof. L. Waller (Dept. of Electrical Engineering and Computer Sciences). His main research interests are in optical system design and computational microscopy. He is currently developing all-optical methods to observe and control the activity of individual neurons in deep, live brain tissue with high spatial and temporal resolution.


Visiting Assistant Professor Haricharan Lakshman

Stanford University

March 30, 2016 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Data Representations for Cinematic Virtual Reality

Talk Abstract: Historically, virtual reality (VR) with head-mounted displays (HMDs) has been associated with computer-generated content and gaming applications. However, recent advances in 360-degree cameras facilitate omnidirectional capture of real-world environments to create content to be viewed on HMDs, a technology referred to as cinematic VR. This can be used to immerse the user, for instance, in a concert or sports event. The main focus of this talk will be on data representations for creating such immersive experiences.

In cinematic VR, videos are usually represented in a spherical format to account for all viewing directions. To achieve high-quality streaming of such videos to millions of users, it is crucial to consider efficient representations for this type of data, in order to maximize compression efficiency under resource constraints, such as the number of pixels and bitrate. We formulate the choice of representation as a multi-dimensional, multiple choice knapsack problem and show that the resulting representations adapt well to varying content. 
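
A toy version of that selection problem, choosing exactly one representation per video segment so as to maximize total quality under a bitrate budget, can be written as a multiple-choice knapsack and solved by dynamic programming. This sketch is illustrative and not the formulation from the talk; the integer costs (e.g. in bitrate units) and quality scores are hypothetical:

```python
def mckp(options, budget):
    """Multiple-choice knapsack via dynamic programming over the budget.

    options[g] is a list of (cost, quality) pairs for group g; exactly one
    option must be chosen per group, and costs are nonnegative integers.
    Returns (best_quality, chosen_indices), or (None, None) if infeasible."""
    NEG = float("-inf")
    best = [NEG] * (budget + 1)
    best[0] = 0.0
    choice = [[None] * (budget + 1) for _ in options]
    for g, opts in enumerate(options):
        nxt = [NEG] * (budget + 1)
        for b in range(budget + 1):
            if best[b] == NEG:
                continue
            for i, (cost, q) in enumerate(opts):
                nb = b + cost
                if nb <= budget and best[b] + q > nxt[nb]:
                    nxt[nb] = best[b] + q
                    choice[g][nb] = (i, b)   # remember option and previous budget
        best = nxt
    b_star = max(range(budget + 1), key=lambda b: best[b])
    if best[b_star] == NEG:
        return None, None
    chosen, b = [], b_star
    for g in range(len(options) - 1, -1, -1):
        i, b = choice[g][b]
        chosen.append(i)
    chosen.reverse()
    return best[b_star], chosen
```

For example, with two segments offering (cost, quality) options [(2, 10), (4, 18), (7, 22)] and [(2, 10), (5, 20)] and a budget of 9, the solver picks the middle option for the first segment and the higher-rate option for the second.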

Existing cinematic VR systems update the viewports according to head rotation, but do not support head translation or focus cues. We propose a new 3D video representation, referred to as depth augmented stereo panorama, to address this issue. We show that this representation can successfully induce head-motion parallax in a predefined operating range, as well as generate light fields across the observer’s pupils, suitable for using with emerging light field HMDs.

Speaker's Biography: Haricharan Lakshman has been a Visiting Assistant Professor in the Electrical Engineering Department at Stanford University since Fall 2014. His research interests are broadly in Image Processing, Visual Computing and Communications. He received his PhD in Electrical Engineering from the Technical University of Berlin, Germany, in January 2014, while working as a Researcher in the Image Processing Group of Fraunhofer HHI. Between 2011 and 2012, he was a Visiting Researcher at Stanford. He was awarded the IEEE Communications Society MMTC Best Journal Paper Award in 2013, and was a finalist for the IEEE ICIP Best Student Paper Award in 2010 and 2012.


Professor Bas Rokers

University of Wisconsin - Madison

March 23, 2016 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Fundamental and individual limitations in the perception of 3D motion: Implications for Virtual Reality

Talk Abstract: Neuroscientists have extensively studied motion and depth perception, and have provided a good understanding of the underlying neural mechanisms. However, since these mechanisms are frequently studied in isolation, their interplay remains poorly understood. In fact, I will introduce a number of puzzling deficits in the perception of 3D motion in this talk. Given the advent of virtual reality (VR) and the need to provide a compelling user experience, it is imperative that we understand the factors that determine the sensitivity and limitations of 3D motion perception.

I will present recent work from our lab which shows that fundamental as well as individual limitations in the processing of retinal information cause specific deficits in the perception of 3D motion. Subsequently, I will discuss the potential of extra-retinal (head motion) information to overcome some of these limitations. Finally, I will discuss how individual variability in the sensitivity to 3D motion predicts the propensity for simulator sickness.

Our research sheds light on the interplay of the neural mechanisms that underlie perception, and accounts for the visual system’s sensitivity to 3D motion. Our results provide specific suggestions to improve VR technology and bring virtual reality into the mainstream.


Speaker's Biography: Bas Rokers is Associate Professor in the Department of Psychology and a member of the McPherson Eye Research Institute at the University of Wisconsin - Madison. His work in visual perception aims to uncover the neural basis of binocular perception, visual disorders and brain development. In 2015 he was a Visiting Professor in the Department of Brain and Cognitive Sciences at MIT, and he can currently be seen in the National Geographic television series Brain Games on Netflix.


Professor Colin Sheppard

Nanophysics Department at the Italian Institute of Technology, Genoa, Italy

March 16, 2016 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Confocal microscopy: past, present and future.

Talk Abstract: Confocal microscopy has made a dramatic impact on biomedical imaging in particular, but also on other areas such as industrial inspection. Confocal microscopy can image in 3D, with good resolution, into living biological cells and tissue. I have had the good fortune to be involved with the development of confocal microscopy over the last 40 years. Other techniques have been introduced that overcome some of its limitations, but it is still the preferred choice in many cases. And new developments in confocal microscopy, such as focal modulation microscopy and image-scanning microscopy, can improve its performance in terms of penetration depth, resolution and signal level.

Speaker's Biography: Colin Sheppard is Senior Scientist in the Nanophysics Department at the Italian Institute of Technology, Genoa. He is a Visiting Miller Professor at UC-Berkeley. He obtained his PhD degree from University of Cambridge. Previously he has been Professor in the Departments of Bioengineering, Biological Sciences and Diagnostic Radiology at the National University of Singapore, Professor of Physics at the University of Sydney, and University Lecturer in Engineering Science at the University of Oxford. He developed an early confocal microscope, the first with computer control and storage (1983), launched the first commercial confocal microscope (1982), published the first scanning multiphoton microscopy images (1977), proposed two-photon fluorescence and CARS microscopy (1978), and patented scanning microscopy using Bessel beams (1977). In 1988, he proposed scanning microscopy using a detector array with pixel reassignment, now known as image scanning microscopy.


EE 367 Computational Imaging and Display - Final Course Project Poster & Demo Session

March 9, 2016 4:15 pm to 5:15 pm

Talk Title: EE 367 Computational Imaging and Display - Final Course Project Poster & Demo Session

Talk Abstract: Computational imaging and display systems have a wide range of applications in consumer electronics, scientific imaging, HCI, medical imaging, microscopy, and remote sensing. In this end-of-the-academic-quarter poster & demo session, the graduate students will present their course projects. More information about the course can be found here: and an overview of projects presented in previous years is here:



The Stanford Center for Mind, Brain and Computation presents "Deep Learning: Fundamental Progress, Brain Representations and Semantic Learning"

March 2, 2016 1:00 pm to 5:00 pm

Location: Mackenzie Room (Room 300), Huang Engineering Building

Talk Title: The Stanford Center for Mind, Brain and Computation presents "Deep Learning: Fundamental Progress, Brain Representations and Semantic Learning"

Talk Abstract: Speakers include:

Surya Ganguli, Stanford: “Towards theories of deep learning: from semantic cognition to neural engineering”

Quoc V. Le, Google: “Large scale deep learning”

Daniel Yamins, MIT: “Using behaviorally-driven computational models to uncover principles of cortical representation”

With a panel and audience discussion moderated by Jay McClelland, Stanford, and a wine and cheese reception.



Professor Doug James

Stanford University

February 24, 2016 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Physics-based Animation Sound: Progress and Challenges

Talk Abstract: Decades of advances in computer graphics have made it possible to convincingly animate a wide range of physical phenomena, such as fracturing solids and splashing water. Unfortunately, our visual simulations are essentially "silent movies" with sound added as an afterthought. In this talk, I will describe recent progress on physics-based sound synthesis algorithms that can help simulate rich multi-sensory experiences where graphics, motion, and sound are synchronized and highly engaging. I will describe work on specific sound phenomena, and highlight the important roles played by precomputation techniques, and reduced-order models for vibration, radiation, and collision processing.


Speaker's Biography: Doug L. James is a Full Professor of Computer Science at Stanford University since June 2015, and was previously an Associate Professor of Computer Science at Cornell University from 2006-2015. He holds three degrees in applied mathematics, including a Ph.D. in 2001 from the University of British Columbia. In 2002 he joined the School of Computer Science at Carnegie Mellon University as an Assistant Professor, before joining Cornell in 2006. His research interests include computer graphics, computer sound, physically based animation, and reduced-order physics models. Doug is a recipient of a National Science Foundation CAREER award, and a fellow of both the Alfred P. Sloan Foundation and the Guggenheim Foundation. He recently received a Technical Achievement Award from The Academy of Motion Picture Arts and Sciences for "Wavelet Turbulence," and the Katayanagi Emerging Leadership Prize from Carnegie Mellon University and Tokyo University of Technology. He was the Technical Papers Program Chair of ACM SIGGRAPH 2015.


Dr. Tokuyuki Honda


February 10, 2016 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Medical innovation with minimally-invasive optical imaging and image-guided robotics

Talk Abstract: A fundamental challenge of healthcare is to meet the increasing demand for quality of care while minimizing the cost. We aim to make meaningful contributions to solving this challenge by creating innovative, minimally-invasive imaging and image-guided robotics technologies in collaboration with leading research hospitals and other stakeholders. In this lecture, we present technologies under development, such as an ultra-miniature endoscope, an image-guided needle robot, and a scanning-laser ophthalmoscope, and discuss how we can potentially address unmet clinical needs.


Speaker's Biography: Dr. Tokuyuki (Toku) Honda is a Senior Fellow at Canon U.S.A., Inc., and has been the head of the Healthcare Optics Research Laboratory in Cambridge, MA, since its creation in 2012. The mission of the laboratory is to grow the seeds of new medical business in collaboration with hospitals and universities. Dr. Honda received his Ph.D. in Applied Physics from the University of Tokyo, and worked from 1996 to 1998 as a Postdoctoral Fellow in the group of Professor Lambertus Hesselink at the Department of Electrical Engineering, Stanford University.


Professor Edoardo Charbon

Delft University of Technology, Netherlands

February 3, 2016 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: The Photon Counting Camera: a Versatile Tool for Quantum Imaging and Quantitative Photography

Talk Abstract: The recent availability of miniaturized photon counting pixels in standard CMOS processes has paved the way to the introduction of photon counting in low-cost image sensors. The uses of these devices are multifold, ranging from LIDARs to Raman spectroscopy, from fluorescence lifetime to molecular imaging, from super-resolution microscopy to data security and encryption.

In this talk we describe the technology at the core of this revolution: single-photon avalanche diodes (SPADs) and the architectures enabling SPAD-based image sensors. We discuss tradeoffs and design trends, often referring to specific sensor chips, new materials for extended sensitivity, and 3D integration for ultra-high-speed operation. We also discuss the recent impact of SPAD cameras in metrology, robotics, mobile phones, and consumer electronics.
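
For the LIDAR use case, the basic processing behind a SPAD pixel is time-correlated single-photon counting: photon arrival timestamps are histogrammed over many laser pulses, and the histogram peak is converted to a distance. A toy sketch, in which the bin width, time window, and noise model are illustrative assumptions:

```python
import numpy as np

C = 3e8  # speed of light, m/s

def depth_from_timestamps(timestamps_s, bin_width_s=100e-12, max_time_s=100e-9):
    """Estimate target distance from SPAD photon arrival times (TCSPC).

    Builds a histogram of arrival times relative to the laser pulse and
    takes the center of the peak bin as the round-trip time of flight."""
    bins = np.arange(0.0, max_time_s + bin_width_s, bin_width_s)
    counts, edges = np.histogram(timestamps_s, bins=bins)
    peak = np.argmax(counts)
    t_peak = 0.5 * (edges[peak] + edges[peak + 1])
    return 0.5 * C * t_peak  # one-way distance is half the round trip
```

With a pulse return clustered around 20 ns (plus uniform background counts), the estimate lands near 0.5 · c · 20 ns = 3 m; averaging over many pulses is what makes the peak stand out above background and dark counts.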

Speaker's Biography: Edoardo Charbon (SM’10) received the Diploma from ETH Zürich in 1988, the M.S. degree from UCSD in 1991, and the Ph.D. degree from UC-Berkeley in 1995, all in Electrical Engineering and EECS. From 1995 to 2000, he was with Cadence Design Systems, where he was the architect of the company’s intellectual property protection and on-chip information hiding tools; from 2000 to 2002, he was Canesta Inc.’s Chief Architect, leading the design of consumer time-of-flight 3D cameras; Canesta was sold to Microsoft Corp. in 2010. Since November 2002, he has been a member of the Faculty of EPFL in Lausanne, Switzerland and in Fall 2008 he joined the Faculty of TU Delft, Chair of VLSI Design, succeeding Patrick Dewilde. Dr. Charbon is the initiator and coordinator of MEGAFRAME and SPADnet, two major European projects for the creation of CMOS photon counting image sensors in biomedical diagnostics. He has published over 250 articles in peer-reviewed technical journals and conference proceedings and two books, and he holds 18 patents. Dr. Charbon is the co-recipient of the European Photonics Innovation V


Professor Ronnier Luo

Zhejiang University (China) and Leeds University (UK)

January 27, 2016 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: The Impact of New Developments of Colour Science on Imaging Technology

Talk Abstract: Colour science has been widely used in the imaging industry. This talk will introduce some new development areas of colour science, focusing on three areas related to imaging technology: LED lighting quality, CIE 2006 colorimetry, and comprehensive colour appearance modelling. LED lighting has recently been making great advances in the illumination industry. It has a unique adjustability feature: its spectrum can be tuned for different applications. A tuneable LED system for image sensor applications such as white balance, calibration and characterisation will be introduced and demonstrated, and its performance will be reported.

Speaker's Biography: Ronnier is a Global Expert Professor at Zhejiang University (China), and Professor of Colour and Imaging Science at Leeds University (UK). He is also the Vice-President of the International Commission on Illumination (CIE). He received his PhD in 1986 at the University of Bradford in the field of colour science. He has published over 500 scientific articles in the fields of colour science, imaging technology and LED illumination. He is a Fellow of the Society for Imaging Science and Technology, and the Society of Dyers and Colourists. He is also the Chief Editor of the Encyclopaedia of Colour Science and Technology published by Springer in December 2015.


Visiting Professor Dominik Michels

Stanford University

January 20, 2016 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Complex Real-Time High-Fidelity Simulations in Visual Computing

Talk Abstract: Whereas in the early days of visual computing mostly rudimentary physical models or rough approximations were employed to allow for real-time simulations, several modern applications of visual computing and related disciplines require simulations that are both fast and highly accurate; for example, interactive computer-aided design and manufacturing processes, digital modeling and fabrication, training simulators, and intelligent robotic devices. This talk covers a selection of high-fidelity algorithms and techniques for the fast simulation of complex scenarios, among others symbolic-numeric coupling, structure-preserving integration, and timescale segmentation. Their performance is demonstrated on a broad spectrum of concrete applications in science and industry, from the simulation of biological microswimmers and molecular structures to the optimization of consumer goods like toothbrushes and shavers.


Speaker's Biography: Dominik L. Michels is a member of the Computer Science Department at Stanford University and has run the High Fidelity Algorithmics Group at the Max Planck Center for Visual Computing and Communication since fall 2014. Previously, he was a Postdoc in Computing and Mathematical Sciences at Caltech. He studied Computer Science and Physics at the University of Bonn and B-IT, from where he received a B.Sc. in Computer Science and Physics in 2011, an M.Sc. in Computer Science in 2013, and a Ph.D. in Mathematics and Natural Sciences on Stiff Cauchy Problems in Scientific Computing in early 2014. He was a visiting scholar at several international institutions, among others at JINR in Moscow, and at MIT and Harvard University in Cambridge, MA. His research comprises both fundamental and applied aspects of computational mathematics and physics, addressing open research questions in algorithmics, computer algebra, symbolic-numeric methods, and mathematical modeling to solve practically relevant problems in scientific and visual computing. Outside academia, he works as a research partner in the high-technology and consumer-goods sectors.


Professor Laura Waller

University of California at Berkeley

January 13, 2016 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: 3D Computational Microscopy

Talk Abstract: This talk will describe new computational microscopy methods for high pixel-count 3D images. We describe two setups employing illumination-side and detection-side aperture coding of angle (Fourier) space for capturing 4D phase-space (e.g. light field) datasets with fast acquisition times. Using a multi-slice forward model, we develop efficient 3D reconstruction algorithms for both incoherent and coherent imaging models, with robustness to scattering. Experimentally, we achieve real-time 3D intensity and phase capture with high resolution across a large volume. Such computational approaches to optical microscopy add significant new capabilities to commercial microscopes without significant hardware modification.
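
The multi-slice forward model mentioned here treats the sample as a stack of thin slices: a coherent field is multiplied by each slice's transmittance and then propagated to the next slice with the angular spectrum method. A schematic scalar sketch under those assumptions (parameters are unit-free and purely illustrative, not the talk's implementation):

```python
import numpy as np

def angular_spectrum_propagate(field, dz, wavelength, dx):
    """Propagate a 2D complex field a distance dz via the angular spectrum method."""
    n = field.shape[0]
    fx = np.fft.fftfreq(n, d=dx)
    FX, FY = np.meshgrid(fx, fx, indexing="ij")
    arg = 1.0 / wavelength**2 - FX**2 - FY**2
    kz = 2 * np.pi * np.sqrt(np.maximum(arg, 0.0))
    H = np.exp(1j * kz * dz) * (arg > 0)   # evanescent components are dropped
    return np.fft.ifft2(np.fft.fft2(field) * H)

def multislice_forward(slices, dz, wavelength, dx):
    """Coherent multi-slice model: transmit through each thin slice, then propagate.

    slices: list of 2D complex transmittance functions, first slice nearest the
    source. Returns the field a distance dz past the last slice."""
    n = slices[0].shape[0]
    field = np.ones((n, n), dtype=complex)  # unit plane-wave illumination
    for t in slices:
        field = field * t                   # thin-slice transmittance
        field = angular_spectrum_propagate(field, dz, wavelength, dx)
    return field
```

A sanity check on the model: with empty (all-ones) slices, the plane wave only picks up phase, so the exit intensity stays uniform; 3D reconstruction then amounts to inverting this forward map for the slice transmittances from measured intensities.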


Speaker's Biography: Laura Waller is an Assistant Professor at UC Berkeley in the Department of Electrical Engineering and Computer Sciences (EECS) and a Senior Fellow at the Berkeley Institute of Data Science (BIDS), with affiliations in Bioengineering, QB3 and Applied Sciences & Technology. She was a Postdoctoral Researcher and Lecturer of Physics at Princeton University from 2010-2012 and received B.S., M.Eng., and Ph.D. degrees from the Massachusetts Institute of Technology (MIT) in 2004, 2005, and 2010, respectively. She is a Moore Foundation Data-Driven Investigator, Bakar fellow, NSF CAREER awardee and Packard Fellow.


Professor Audrey Bowden

Stanford University

January 6, 2016 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Lighting the Path to Better Healthcare

Talk Abstract: Cancer. Infertility. Hearing loss. Each of these phrases can bring a ray of darkness into an otherwise happy life. The Stanford Biomedical Optics group, led by Professor Audrey Bowden, aims to develop and deploy novel optical technologies to solve interdisciplinary challenges in the clinical and basic sciences. In short, we use light to image life -- and in so doing, illuminate new paths to better disease diagnosis, management and treatment. In this talk, I will discuss our recent efforts to design, fabricate and/or construct new hardware, software and systems-level biomedical optics tools to attack problems in skin cancer, bladder cancer, hearing loss and infertility. Our efforts span development of new fabrication techniques for 3D tissue-mimicking phantoms, new strategies for creating large mosaics and 3D models of biomedical data, machine-learning classifiers for automated detection of disease, novel system advances for multiplexed optical coherence tomography and low-cost technologies for point-of-care diagnostics.


Speaker's Biography: Audrey K. (Ellerbee) Bowden is an Assistant Professor of Electrical Engineering at Stanford University. She received her BSE in EE from Princeton University, her PhD in BME from Duke University and completed her postdoctoral training in Chemistry and Chemical Biology at Harvard University. During her career, Dr. Bowden served as an International Fellow at Ngee Ann Polytechnic in Singapore and as a Legislative Assistant in the United States Senate through the AAAS Science and Technology Policy Fellows Program sponsored by the OSA and SPIE. She is a member of the OSA, a Senior Member of SPIE and is the recipient of numerous awards, including the Air Force Young Investigator Award, the NSF CAREER Award and the Hellman Faculty Scholars Award. She currently serves as Associate Editor of IEEE Photonics Journal. Her research interests include biomedical optics, microfluidics, and point-of-care diagnostics.


Professor Ofer Levi

University of Toronto

November 18, 2015 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Portable optical brain imaging

Talk Abstract: Optical techniques are widely used in clinical settings and in biomedical research to interrogate bio-molecular interactions and to evaluate tissue dynamics. Miniature integrated optical systems for sensing and imaging can be portable, enabling long-term imaging studies in living tissues.

We present the development of a compact multi-modality optical neural imaging system to image tissue blood flow velocity and oxygenation, using a fast CCD camera and miniature VCSEL illumination. We combined two techniques, laser speckle contrast imaging (LSCI) and intrinsic optical signal imaging (IOSI), simultaneously, using these compact laser sources, to monitor induced cortical ischemia in a full-field format with high temporal acquisition rates. We have demonstrated tracking seizure activity, evaluating blood-brain barrier breaching, and integrating fast spatial light modulators for extended imaging depth and auto-focusing during brain imaging of flow dynamics. Our current studies include prototype designs and system optimization and evaluation for a low-cost portable imaging system as a minimally invasive method for long-term neurological studies in un-anesthetized animals. This system will provide a better understanding of the progression and treatment efficacy of various neurological disorders in freely behaving animals.


Speaker's Biography: Dr. Ofer Levi is an Associate Professor in the Institute of Biomaterials and Biomedical Engineering and the Edward S. Rogers Sr. Department of Electrical and Computer Engineering at the University of Toronto, currently on a Sabbatical leave at Stanford University. Dr. Levi received his Ph.D. in Physics from the Hebrew University of Jerusalem, Israel in 2000, and worked in 2000-2007 as a Postdoctoral Fellow and as a Research Associate at the Departments of Applied Physics and Electrical Engineering, Stanford University, CA. He serves as an Associate Editor in Biomedical Optics Express (OSA) and is a member of OSA, IEEE-Photonics, and SPIE. His recent research areas include biomedical imaging systems and optical bio-sensors based on semiconductor devices and nano-structures, and their application to bio-medical diagnostics, in vivo imaging, and study of bio-molecular interactions.


Dr. Boyd Fowler


November 11, 2015 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Highlights from the International Workshop on Imaging Sensors

Talk Abstract: Image sensor innovation continues after more than 50 years of development. New image sensor markets are being developed while old markets continue to grow. Higher performance and lower cost image sensors are enabling these new applications. Although CMOS image sensors dominate the market, CCDs and other novel image sensors continue to be developed. In this talk we discuss trends in image sensor technology and present results from selected workshop papers. Moreover, we will discuss developments in phase pixel technology, stacked die image sensors, time of flight image sensors, SPAD image sensors, Quanta image sensors, low light level sensors, wide dynamic range sensors and global shutter sensors.


Speaker's Biography: Boyd Fowler was born in California in 1965. He received his M.S.E.E. and Ph.D. degrees from Stanford University in 1990 and 1995, respectively. After finishing his Ph.D. he stayed at Stanford University as a research associate in the Electrical Engineering Information Systems Laboratory until 1998. In 1998 he founded Pixel Devices International in Sunnyvale, California. Between 2005 and 2013 he was the CTO and VP of Technology at Fairchild Imaging. He is currently at Google, researching future directions for image sensors and imaging systems. He has authored numerous technical papers, book chapters and patents. His current research interests include CMOS image sensors, low noise image sensors, noise analysis, data compression, machine learning and vision.


Professor Anat Levin

The Weizmann Institute of Science

October 28, 2015 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Inverse Volume Rendering with Material Dictionaries

Talk Abstract: Translucent materials are ubiquitous, and simulating their appearance requires accurate physical parameters. However, physically-accurate parameters for scattering materials are difficult to acquire. We introduce an optimization framework for measuring bulk scattering properties of homogeneous materials (phase function, scattering coefficient, and absorption coefficient) that is more accurate, and more applicable to a broad range of materials. The optimization combines stochastic gradient descent with Monte Carlo rendering and a material dictionary to invert the radiative transfer equation. It offers several advantages: (1) it does not require isolating single-scattering events; (2) it allows measuring solids and liquids that are hard to dilute; (3) it returns parameters in physically-meaningful units; and (4) it does not restrict the shape of the phase function using Henyey-Greenstein or any other low-parameter model. We evaluate our approach by creating an acquisition setup that collects images of a material slab under narrow-beam RGB illumination. We validate results by measuring prescribed nano-dispersions and showing that recovered parameters match those predicted by Lorenz-Mie theory. We also provide a table of RGB scattering parameters for some common liquids and solids, which are validated by simulating color images in novel geometric configurations that match the corresponding photographs with less than 5% error.

Speaker's Biography: Anat Levin is an Associate Professor at the Weizmann Institute of Science, Israel, doing research in the field of computational imaging. She is currently a Visiting Professor on sabbatical at the Stanford EE department with Prof. Wetzstein. She received her Ph.D. from the Hebrew University in 2006 and spent a couple of years as a postdoc at MIT CSAIL.


Professor Aydogan Ozcan

University of California at Los Angeles

October 20, 2015 4:15 pm to 5:15 pm

Location: AllenX

Talk Title: Democratization of Next-Generation Microscopy, Sensing and Diagnostics Tools through Computational Photonics

Talk Abstract: My research focuses on the use of computation/algorithms to create new optical microscopy, sensing, and diagnostic techniques, significantly improving existing tools for probing micro- and nano-objects while also simplifying the designs of these analysis tools. In this presentation, I will introduce a new set of computational microscopes which use lens-free on-chip imaging to replace traditional lenses with holographic reconstruction algorithms. Basically, 3D images of specimens are reconstructed from their “shadows” providing considerably improved field-of-view (FOV) and depth-of-field, thus enabling large sample volumes to be rapidly imaged, even at nanoscale. These new computational microscopes routinely generate >1–2 billion pixels (giga-pixels), where even single viruses can be detected with a FOV that is >100 fold wider than other techniques. At the heart of this leapfrog performance lie self-assembled liquid nano-lenses that are computationally imaged on a chip. These self-assembled nano-lenses are stable for >1 hour at room temperature, and are composed of a biocompatible buffer that prevents nano-particle aggregation while also acting as a spatial “phase mask.” The field-of-view of these computational microscopes is equal to the active-area of the sensor-array, easily reaching, for example, >20 mm2 or >10 cm2 by employing state-of-the-art CMOS or CCD imaging chips, respectively.

In addition to this remarkable increase in throughput, another major benefit of this technology is that it lends itself to field-portable and cost-effective designs which easily integrate with smartphones to conduct giga-pixel tele-pathology and microscopy even in resource-poor and remote settings where traditional techniques are difficult to implement and sustain, thus opening the door to various telemedicine applications in global health. Some other examples of these smartphone-based biomedical tools that I will describe include imaging flow cytometers, immunochromatographic diagnostic test readers, bacteria/pathogen sensors, blood analyzers for complete blood count, and allergen detectors. Through the development of similar computational imagers, I will also report the discovery of new 3D swimming patterns observed in human and animal sperm. One of these newly discovered and extremely rare motions takes the form of “chiral ribbons,” where the planar swings of the sperm head occur on an osculating plane, creating in some cases a helical ribbon and in others a twisted ribbon. Shedding light onto the statistics and biophysics of various micro-swimmers’ 3D motion, these results provide an important example of how biomedical imaging significantly benefits from emerging computational algorithms/theories, revolutionizing existing tools for observing various micro- and nano-scale phenomena in innovative, high-throughput, and yet cost-effective ways.


Speaker's Biography: Dr. Ozcan is the Chancellor’s Professor at UCLA and an HHMI Professor with the Howard Hughes Medical Institute. He leads the Bio- and Nano-Photonics Laboratory at the UCLA School of Engineering and is also the Associate Director of the California NanoSystems Institute (CNSI). Dr. Ozcan holds 31 issued patents (all of which are licensed) and >20 pending patent applications, and is also the author of one book and the co-author of more than 400 peer-reviewed research articles in major scientific journals and conferences. Dr. Ozcan is a Fellow of SPIE and OSA, and has received major awards including the Presidential Early Career Award for Scientists and Engineers (PECASE), SPIE Biophotonics Technology Innovator Award, SPIE Early Career Achievement Award, ARO Young Investigator Award, NSF CAREER Award, NIH Director’s New Innovator Award, ONR Young Investigator Award, IEEE Photonics Society Young Investigator Award and MIT’s TR35 Award for his seminal contributions to near-field and on-chip imaging, and telemedicine-based diagnostics. Dr. Ozcan is also the recipient of the National Geographic Emerging Explorer Award, the National Academy of Engineering (NAE) Grainger Foundation Frontiers of Engineering Award, Popular Science Brilliant 10 Award, Gates Foundation Grand Challenges Award, Popular Mechanics Breakthrough Award, Netexplorateur Award, Microscopy Today Innovation Award, and the Wireless Innovation Award organized by the Vodafone Americas Foundation, as well as the Okawa Foundation Award.


Rajiv Laroia


October 16, 2015 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Gathering Light

Talk Abstract: With digital cameras in every cell phone, everyone is a photographer. But people still aspire to the better zoom, the lower noise, and the artistic bokeh effects provided by digital SLR cameras, if only these features were available in as convenient and light-weight a package as a cell phone or a thin compact camera. Traditional high-end cameras have a big lens system that enables those features, but the drawback is weight, bulk, and the inconvenience of carrying and switching lenses. In this talk, we discuss an alternative approach of using a heterogeneous array of small cameras to provide those features, and more. Light's camera technology combines prime lenses that provide an optical zoom equivalent of 35mm, 70mm, and 150mm lenses. Small mirrors allow reconfiguring the cameras to select the right level of zoom and field of view. This talk describes the architecture of this flexible computational camera.


Speaker's Biography: Rajiv is the cofounder and CTO of The Light Company, a company dedicated to re-imagining photography. He previously founded and served as CTO of Flarion Technologies, which developed the base technology for LTE. Flarion was acquired by Qualcomm in 2006. Prior to Flarion, Rajiv held R&D leadership roles in Lucent Technologies Bell Labs. Rajiv holds a Ph.D. and Master's degree from the University of Maryland, College Park and a Bachelor's degree from the Indian Institute of Technology, Delhi, all in electrical engineering. He is the recipient of the 2013 IEEE Industrial Innovation Award.


Robert LiKamWa

Rice University

October 7, 2015 4:15 pm to 5:15 pm

Location: AllenX

Talk Title: Designing a Mixed-Signal ConvNet Vision Sensor for Continuous Mobile Vision

Talk Abstract: Continuously providing our computers with a view of what we see will enable novel services to assist our limited memory and attention. In this talk, we show that today’s system software and imaging hardware, highly optimized for photography, are ill-suited for this task. We present our early ideas towards a fundamental rethinking of the vision pipeline, centered around a novel vision sensor architecture, which we call RedEye. Targeting object recognition, we shift early convolutional processing into RedEye's analog domain, reducing the workload of the analog readout and of the computational system. To ease analog design complexity, we design a modular column-parallel design to promote physical circuitry reuse and algorithmic cyclic reuse. RedEye also includes programmable mechanisms to admit noise for energy reduction, further increasing the sensor's energy efficiency. Compared to conventional systems, RedEye reports an 85% reduction in sensor energy and a 45% reduction in computational energy.


Speaker's Biography: Robert LiKamWa is a final-year Ph.D. student at Rice University. His research focus is on efficient support for continuous mobile vision. To supplement his research, he has interned and collaborated with Microsoft Research and the Samsung Mobile Processor Innovation Lab on various projects related to vision systems. Robert received best paper awards from ACM MobiSys 2013 and PhoneSense 2011.


Dr. Liang Gao

Ricoh Innovations

September 30, 2015 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Developing Next Generation Multidimensional Optical Imaging Devices

Talk Abstract: When performing optical measurement with a limited photon budget, it is important to ensure that each detected photon is as rich in information as possible. Conventional optical imaging systems generally tag light with just two characteristics (x, y), measuring its intensity on a 2D (x, y) lattice. However, this throws away much of the information content actually carried by a photon. This information can be written as (x, y, z, θ, φ, λ, t, ψ, χ): the spatial coordinates (x, y, z) are in 3D, the propagation polar angles (θ, φ) are in 2D, the wavelength (λ) and emission time (t) are each in 1D, and the polarization orientation and ellipticity angles (ψ, χ) are in 2D. Neglecting coherence effects, a photon thus carries with it nine tags. In order to exploit this wealth of information, an imaging system should be able to characterize measured photons in 9D, rather than in 2D.
This presentation will provide an overview of the next generation of multidimensional optical imaging devices which leverage advances in computational optics, micro-fabrication, and detector technology. The resultant systems can simultaneously capture multiple photon tags in parallel, thereby maximizing the information content we can acquire from a single camera exposure. In particular, I will discuss our recent development of two game-changing technologies—a snapshot hyperspectral imager, the image mapping spectrometer (IMS), and an ultrafast imager, compressed ultrafast photography (CUP)—and how these techniques can potentially revolutionize our perception of the surrounding world.


Speaker's Biography: Dr. Liang Gao is currently an advisory research scientist in the computational optical imaging group at Ricoh Innovations. His primary research interests are microscopy, including super-resolution microscopy and photoacoustic microscopy, cost-effective high-performance optics for diagnostics, computational optical imaging, ultrafast imaging, and multidimensional optical imaging. Dr. Liang Gao is the author of more than 30 peer-reviewed publications in top-tier journals, such as Nature, Physics Reports, and Annual Review of Biomedical Engineering. He received his BS degree in Physics from Tsinghua University in 2005 and his PhD degree in Applied Physics and Bioengineering from Rice University in 2011.


Stephen Hicks

University of Oxford and Royal Academy of Engineering

August 14, 2015 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: From Electrodes to Smart Glasses: Augmented vision for the sight-impaired

Talk Abstract: The majority of the world’s 40 million “blind” people have some areas of remaining sight. This is known as residual vision and while it is generally insufficient for sighted tasks such as reading, navigating and detecting faces, it can often be augmented and enhanced through the use of near-eye displays.

Low vision is primarily a problem of contrast: patients are often unable to differentiate target objects (often foreground objects) from busy backgrounds. Depth imaging, and more recently semantic object segmentation, provide the tools to easily isolate foreground objects, allowing them to be enhanced in ways that exaggerate object boundaries, surface features and contrast. Computer vision and 3D mapping are also advancing a new form of enabling device, one that is more aware of its spatial surroundings and able to direct the user to specific objects on demand.

The emergence of small depth-RGB cameras, powerful portable computers and higher-quality wearable displays means that for the first time we are able to consider building a vision-augmenting system for daily long-term use. The requirements depend somewhat on the user's eye condition, such as the residual visual field and colour and contrast sensitivity, but also on the user's needs and context. Advances in all these areas, from low-profile displays to deep learning and context-sensitive task prioritisation, mean that advanced wearable assistants are now within reach.

In my talk I will discuss our efforts to develop and validate a generally useful Smart Specs platform, which is now part of the first wide-scale test of augmented vision in the UK, funded by Google. This work will be put in the context of ongoing Oxford projects such as implanted retinal prosthetics, gene therapies and sensory substitution devices. Much of this work has applications beyond visual impairment.

Speaker's Biography: Stephen Hicks is a Lecturer in Neuroscience at the University of Oxford, and Royal Academy of Engineering Enterprise Fellow. He is the lead investigator of the Smart Specs research group who are building and validating novel forms of sight enhancement for blind and partially sighted individuals. Stephen completed a PhD in neuroscience at the University of Sydney in Australia, studying vision and spatial memory. On moving to the UK he began a post doctoral position at Imperial College London developing portable eye trackers for neurological diagnoses and computer vision techniques for electronic retinal implants. Stephen joined Oxford in 2009 where he developed concepts for image optimization in prosthetic vision, leading to the formation of the Smart Specs research group. He works closely with Professor Phil Torr in the Department of Engineering to develop semantic imaging systems for 3D object recognition and mapping.

Stephen won the Royal Society’s Brian Mercer Award for Innovation in 2013 and led the team that won the UK’s Google Global Impact Challenge in 2014 to build portable smart glasses for sight enhancement. He is the co-founder of Visual Alchemy Ltd, which is beginning to commercialize the Smart Specs platform.


Professor Kristina Irsch

Wilmer Eye Institute, Johns Hopkins School of Medicine

July 14, 2015 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Remote detection of binocular fixation and focus using polarization optics and retinal birefringence properties of the eye

Talk Abstract: Amblyopia (“lazy eye”) is a major public health problem, caused by misalignment of the eyes (strabismus) or defocus. If detected early in childhood, there is an excellent response to therapy, yet most children are detected too late to be treated effectively. Commercially available vision screening devices that test for amblyopia’s primary causes can detect strabismus only indirectly and inaccurately via assessment of the positions of external light reflections from the cornea, but they cannot detect the anatomical feature of the eyes where fixation actually occurs (the fovea). This talk presents an accurate and calibration-free technique for remote localization of the true fixation point of an eye by employing the characteristic birefringence signature of the radially arrayed Henle fibers delineating the fovea. Progress on the development of a medical diagnostic screening device for eye misalignment and defocus will be presented, and other potential applications will be discussed.


Speaker's Biography: Kristina Irsch is a German physicist specializing in biomedical and ophthalmic optics. She received her Ph.D. from the University of Heidelberg in Germany where she trained under Josef F. Bille, Ph.D. She went to the Johns Hopkins University School of Medicine in Baltimore, Maryland in 2005, first as a visiting graduate student, and later completed a post-doctoral research fellowship in ophthalmic optics and instrumentation under David L. Guyton, M.D. before joining the faculty as Assistant Professor of Ophthalmology in 2010. Much of her research has focused on remote eye fixation and focus detection, using polarization optics and retinal birefringence properties of the eye, and its use in clinical settings. The main goal is to identify children with strabismus (misalignment of the eyes) and focusing abnormalities, at an early and still easily curable stage, before irreversible amblyopia (functional monocular visual impairment, or “lazy eye”) develops.


Dr Sara Abrahamsson

Rockefeller University

June 18, 2015 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Instantaneous, high-resolution 3D imaging using multifocus microscopy (MFM) and applications in functional neuronal imaging

Talk Abstract: Multifocus microscopy (MFM) is an optical imaging technique that delivers instant 3D data at high spatial and temporal resolution. Diffractive Fourier optics are used to multiplex and refocus light, forming a 3D image consisting of an instantaneously formed focal stack of 2D images. MFM optics can be appended to a commercial microscope in a “black box” at the camera port, to capture 3D movies of quickly moving systems at the native resolution of the imaging system to which it is attached. MFM can also be combined with super-resolution methods to image beyond 200nm resolution. Systems can be tailored to fit different imaging volumes, covering for example seven or nine image planes to study mRNA diffusion inside a cell nucleus, or 25 focal planes or more to study cell division or neuronal activity in a developing embryo.

In the Bargmann lab at the Rockefeller University, MFM is applied in functional neuronal imaging in the nematode C. elegans – a millimeter-sized worm that with its compact but versatile nervous system of 302 neurons is a common model organism in neurobiology. The genetically expressed calcium-indicator dye GCaMP is used to visualize neuronal activity in the worm. Using MFM it is possible to image entire 3D clusters of neurons in living animals with single neuron resolution, and to perform unbiased imaging screenings of entire circuits to identify neuronal function during for example olfactory stimulation.

Speaker's Biography: Sara came to UCSF in 2004 from the Royal Institute of Technology, Stockholm, Sweden to spend six months on an optical design project with Professor Mats Gustafsson. Mats tricked her into staying in the lab to do a Ph.D. and later to follow him to HHMI Janelia Research Campus. During this time, Sara developed two diffractive Fourier optics systems for live bio-microscopy in extended sample volumes: Extended Focus (EF) and multifocus microscopy (MFM). These techniques have since been applied to a wide variety of fast live-imaging projects in biological research.

Currently, Sara is a Leon Levy postdoctoral fellow in the laboratory of Cori Bargmann at the Rockefeller University, where she applies MFM to functional neuronal imaging in extended sample volumes. Sara also spends a lot of her time in the nanofabrication clean rooms at CNF, Cornell University, Ithaca, NY and NIST, MD, using multilevel, deep UV-lithography to fab the various diffractive Fourier optics devices she designs and builds in the labs of collaborators in the US and Europe. Sara spends her summers at the Marine Biological Laboratory in Woods Hole on Cape Cod, developing 3D polarization microscopy methods with the Oldenbourg lab.


Bernd Richter

Fraunhofer FEP

June 5, 2015 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Bidirectional OLED Microdisplays: Technology and Applications

Talk Abstract: Microdisplays based on OLED-on-Silicon technology are becoming more and more widespread in data glasses and electronic viewfinders. In these applications the OLED microdisplay can take advantage of its low power consumption and the self-emitting behaviour of the OLED, which enables simpler optics because no additional backlight illumination is needed. Fraunhofer's approach is to extend these advantages by embedding an additional image sensor inside the active display area, making it possible to create a bidirectional microdisplay (light emission and detection in the same plane). This special feature can be used to realize eye tracking by capturing the eye of the glasses' wearer, thus providing hands-free interaction with the system. This talk will introduce the OLED-on-Silicon technology and the approach of bidirectional OLED microdisplays and their applications in interactive data glasses. In addition, the latest generation of a bidirectional microdisplay with increased SVGA resolution will be demonstrated.


Speaker's Biography: Bernd Richter (Fraunhofer FEP) received his diploma in electrical engineering from TU Dresden in 2003. Afterwards he joined the analog & mixed-signal IC design group at Fraunhofer in Dresden. The focus of his work is CMOS design for OLED microdisplays and sensor applications, as well as More-than-Moore technologies with a focus on Organic-on-Silicon. He has managed several projects and generations of OLED microdisplays for public and industrial customers. Since 2012 Bernd Richter has headed the department “IC & System Design” at Fraunhofer FEP.


Dr. Francesco Aieta

HP Labs

May 27, 2015 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Achromatic Metasurfaces: towards broadband flat optics

Talk Abstract: Conventional optical components rely on the propagation through thick materials to control the amplitude, phase and polarization of light. Metasurfaces provide a new path for designing planar optical devices with new functionalities. In this approach, the control of the wavefront is achieved by tailoring the geometry of subwavelength-spaced nano antennas.
By designing an array of low-loss dielectric resonators we create metasurfaces with an engineered wavelength-dependent phase shift that compensates for the dispersion of the phase accumulated by light during propagation. In this way the large chromatic effects typical of all flat optical components can be corrected. A flat lens without chromatic aberrations and a beam deflector are demonstrated.
The suppression of chromatic aberrations in metasurface-based planar photonics will find applications in lightweight collimators for displays, and chromatically-corrected imaging systems.

Speaker's Biography: Francesco Aieta is a researcher at HP Labs specializing in novel photonic devices for imaging and sensing applications. He worked as a postdoctoral fellow at Harvard University and received a PhD in Applied Physics from the Politecnica delle Marche (Italy) in 2013. His present and past research focuses on the study of novel flat optical materials and the design of devices from the mid-infrared to the visible spectrum for biomedical applications as well as consumer electronics products. Other areas of interest include plasmonics, light-matter interaction at the nanoscale, optical trapping in anisotropic environments, and the properties of liquid crystals. He is a member of the Optical Society of America.


Jules Urbach


May 13, 2015 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Light Field Rendering and Streaming for VR and AR

Talk Abstract: Light field rendering produces realistic imagery that can be viewed from any vantage point. It is an ideal format for deploying cinematic experiences targeting consumer virtual and augmented reality devices with position tracking, as well as emerging light field display hardware. Because of their computational complexity and data requirements, light field rendering and content delivery have not been practical in the past. Jules Urbach, CEO of OTOY, will delve into the company's content creation and delivery pipelines which are designed to make light field production and content publishing practical today.

Speaker's Biography: Jules Urbach is a pioneer in computer graphics, streaming and 3D rendering with over 25 years of industry experience. He attended Harvard-Westlake High School in Los Angeles before being accepted to Harvard University. He decided to defer his acceptance to Harvard (indefinitely, as it turned out) to make revolutionary video games. He made his first game, Hell Cab (Time Warner Interactive), at age 18; it was one of the first CD-ROM games ever created. Six years after Hell Cab, Jules founded Groove Alliance. Groove created the first 3D game ever available on (Real Pool). Currently, Jules is busy working on his two latest ventures, OTOY and LightStage, which aim to revolutionize 3D content capture, creation and delivery.


Professor Thrasyvoulos (Thrasos) N. Pappas

Electrical Engineering and Computer Science Department , Northwestern University

May 6, 2015 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Visual Signal Analysis: Focus on Texture Similarity

Talk Abstract: Texture is an important visual attribute both for human perception and image analysis systems. We present structural texture similarity metrics (STSIM) and applications that critically depend on such metrics, with emphasis on image compression and content-based retrieval. The STSIM metrics account for human visual perception and the stochastic nature of textures. They rely entirely on local image statistics and allow substantial point-by-point deviations between textures that, according to human judgment, are similar or essentially identical.

We also present new testing procedures for objective texture similarity metrics. We identify three operating domains for evaluating the performance of such similarity metrics: the top of the similarity scale, where a monotonic relationship between metric values and subjective scores is desired; the ability to distinguish between perceptually similar and dissimilar textures; and the ability to retrieve "identical" textures. Each domain has different performance goals and requires different testing procedures. Experimental results demonstrate both the performance of the proposed metrics and the effectiveness of the proposed subjective testing procedures. The focus of our current work at Lawrence Livermore is on texture space characterization for surveillance applications.
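The key idea in the abstract — relying on local image statistics so that point-by-point deviations don't penalize perceptually identical textures — can be illustrated with a toy measure. This sketch is purely illustrative; the actual STSIM metrics are defined over subband statistics in the speaker's publications:

```python
import numpy as np

def stats_similarity(x, y, c=1e-6):
    """Toy similarity based on patch statistics (mean and variance)
    rather than point-by-point differences. Illustrative only: the
    real STSIM metrics use statistics of filter-bank subbands."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    mean_term = (2 * mx * my + c) / (mx**2 + my**2 + c)
    var_term = (2 * np.sqrt(vx * vy) + c) / (vx + vy + c)
    return mean_term * var_term

rng = np.random.default_rng(1)
patch = rng.random((32, 32))
# A random permutation has huge point-by-point differences but
# identical statistics, so it still scores as the "same" texture.
shuffled = rng.permutation(patch.ravel()).reshape(32, 32)
print(stats_similarity(patch, shuffled))   # → 1.0 (to rounding)
```

A pixel-wise metric such as MSE would rate the shuffled patch as maximally different, which is exactly the failure mode the statistics-based approach avoids.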

Speaker's Biography: Thrasos Pappas received the Ph.D. degree in electrical engineering and computer science from MIT in 1987. From 1987 until 1999, he was a Member of the Technical Staff at Bell Laboratories, Murray Hill, NJ. He joined the EECS Department at Northwestern in 1999. He is currently on sabbatical leave at Lawrence Livermore National Laboratory (January to May 2015). His research interests are in human perception and electronic media, and in particular, image and video quality and compression, image and video analysis, content-based retrieval, model-based halftoning, and tactile and multimodal interfaces. Prof. Pappas is a Fellow of the IEEE and SPIE. He has served as editor-in-chief of the IEEE Transactions on Image Processing (2010-12) and technical program co-chair of ICIP-01 and ICIP-09. Prof. Pappas is currently serving as VP-Publications for the IEEE Signal Processing Society. Since 1997 he has been co-chair of the SPIE/IS&T Conference on Human Vision and Electronic Imaging.


Peter Milford


April 29, 2015 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Wearable Eye Tracking

Talk Abstract: Eye interaction technology can be applied to wearable computing systems, ranging from augmented reality and virtual reality systems to information display devices. Applications of eye interaction include eye tracking, user interface control, iris recognition, foveated rendering, biometric data capture, and many others. Researchers have been working on eye tracking since the 1800s, progressing from 'observations' to photography, to direct-contact and electrical methods, to current camera-based methods. I will outline eye tracking in general, with a focus on wearable eye tracking and its applications.

Speaker's Biography: Peter received his Ph.D. in astrophysics from the University of Queensland, Brisbane, Australia. He worked for 5 years at Stanford on a satellite-based solar observing experiment, observing a 'large spherical object'. He left Stanford to join a startup developing a three-degree-of-freedom magnetic tracker with applications in virtual reality head tracking. He went on to start his consulting company, working with a variety of Silicon Valley firms, mainly in the consumer electronics industry, bringing a practical physics approach to embedded sensors, imaging, calibration, factory test, algorithms, etc. Peter has been working with Eyefluence Inc. since its founding and, as CTO/VP Engineering, oversees a strong multi-disciplinary team developing wearable eye interaction technology. He now looks at 'small spherical' objects.
Eyefluence's goal is to transform intent into action through your eyes. Eyefluence is developing a variety of eye interaction technologies for upcoming wearable display systems, including eye tracking, iris recognition, and user interfaces for control of HMDs.


Professor Paul Debevec

USC Institute for Creative Technologies

April 22, 2015 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Achieving Photoreal Digital Actors

Talk Abstract: We have entered an age where even the human actors in a movie can now be created as computer-generated imagery. Somewhere between "Final Fantasy: The Spirits Within" in 2001 and "The Curious Case of Benjamin Button" in 2008, digital actors crossed the "Uncanny Valley" from looking strangely synthetic to believably real. This talk describes how the Light Stage scanning systems and HDRI lighting techniques developed at the USC Institute for Creative Technologies have helped create digital actors in a wide range of recent films. For in-depth examples, the talk describes how high-resolution face scanning, advanced character rigging, and performance-driven facial animation were combined to create "Digital Emily", a collaboration with Image Metrics (now Faceware) yielding one of the first photoreal digital actors, and 2013's "Digital Ira", a collaboration with Activision Inc., yielding the most realistic real-time digital actor to date. A recent project with USC's Shoah Foundation is recording light field video of interviews with survivors of the Holocaust to allow interactive conversations with life-size automultiscopic projections.

Speaker's Biography: Paul Debevec is a Research Professor at the University of Southern California and the Chief Visual Officer at USC's Institute for Creative Technologies. Since his 1996 Ph.D. at UC Berkeley, Debevec's publications and animations have focused on techniques for photogrammetry, image-based rendering, high dynamic range imaging, image-based lighting, appearance measurement, facial animation, and 3D displays. Debevec is an IEEE Senior Member and Co-Chair of the Academy of Motion Picture Arts and Sciences' (AMPAS) Science and Technology Council. He received a Scientific and Engineering Academy Award® in 2010 for his work on the Light Stage facial capture systems, used in movies including Spider-Man 2, Superman Returns, The Curious Case of Benjamin Button, Avatar, Tron: Legacy, The Avengers, Oblivion, Gravity, Maleficent, and Furious 7. In 2014, Debevec was profiled in The New Yorker magazine's "Pixel Perfect: the scientist behind the digital cloning of actors" article by Margaret Talbot. He also recently worked with the Smithsonian Institution to digitize a 3D model of President Barack Obama.


Professor Peyman Milanfar


April 15, 2015 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Computational Imaging: From Photons to Photos

Talk Abstract: Fancy cameras used to be the exclusive domain of professional photographers and experimental scientists. Times have changed, but even as recently as a decade ago, consumer cameras were solitary pieces of hardware and glass; disconnected gadgets with little brains, and no software. But now, everyone owns a smartphone with a powerful processor, and every smartphone has a camera. These mobile cameras are simple, costing only a few dollars per unit. And on their own, they are no competition for their more expensive cousins. But coupled with the processing power native to the devices in which they sit, they are so effective that much of the low-end point-and-shoot camera market has already been decimated by mobile photography.

Computational imaging is the enabler for this new paradigm in consumer photography. It is the art, science, and engineering of producing a great shot (moving or still) from small form factor, mobile cameras. It does so by changing the rules of image capture — recording information in space, time, and across other degrees of freedom — while relying heavily on post-processing to produce a final result. Ironically, in this respect, mobile imaging devices are now more like scientific instruments than conventional cameras. This has deep implications for the future of consumer photography.

In this technological landscape, the ubiquity of devices and open platforms for imaging will inevitably lead to an explosion of technical and economic activity, as was the case with other types of mobile applications. Meanwhile, clever algorithms, along with dedicated hardware architectures, will take center stage and enable unprecedented imaging capabilities in the user’s hands.

Speaker's Biography: Peyman received his undergraduate education in electrical engineering and mathematics from the University of California, Berkeley, and the MS and PhD degrees in electrical engineering from the Massachusetts Institute of Technology. He was a Professor of EE at UC Santa Cruz from 1999-2014, having served as Associate Dean of the School of Engineering from 2010-12. From 2012-2014 he was at Google-x, where he helped develop the imaging pipeline for Google Glass. He currently leads the Computational Imaging team in Google Research. He holds 8 US patents; has been keynote speaker at numerous conferences including PCS, SPIE, and ICME; and along with his students, won several best paper awards from the IEEE Signal Processing Society. He is a Fellow of the IEEE.


Yangyan Li and Matthias Niessner

Stanford University

April 8, 2015 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: From Acquisition to Understanding of 3D Shapes

Talk Abstract: Understanding 3D shapes is an essential but very challenging task for many scenarios ranging from robotics to computer graphics and vision applications. In particular, range sensing technology has made the shape understanding problem even more relevant, as we can now easily capture the geometry of the real world. In this talk, we will demonstrate how we can obtain a 3D reconstruction of an environment, and how we can exploit these results to infer semantic attributes in a scene. More specifically, we introduce a SLAM technique for large-scale 3D reconstruction using a Microsoft Kinect sensor. We will then present a method to jointly segment and classify the underlying 3D geometry. Furthermore, we propose to locate and recognize individual 3D shapes during the scanning, where large shape collections are used as the source of prior knowledge. In the future, we plan to extend our work from reconstructing geometry and object labelling to inferring objects’ physical properties such as weights.

Speaker's Biography: Yangyan Li is a post-doctoral scholar in Prof. Leonidas J. Guibas' Geometric Computation Group at Stanford University, affiliated with the Max Planck Center for Visual Computing and Communication. Yangyan received his PhD degree from the Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, under the supervision of Prof. Baoquan Chen in 2013. His primary research interests fall in the field of computer graphics, with an emphasis on 3D reconstruction.

Matthias Niessner is a visiting assistant professor at Stanford University, affiliated with the Max Planck Center for Visual Computing and Communication. Prior to his appointment at Stanford, he earned his PhD from the University of Erlangen-Nuremberg, Germany, under the supervision of Günther Greiner. His research focuses on different fields of computer graphics and computer vision, including real-time rendering, reconstruction of 3D scene environments, and semantic scene understanding.


Dr. Kathrin Berkner

Ricoh Innovations

March 4, 2015 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Measuring material and surfaces properties from light fields

Talk Abstract: Light field imaging has been emerging for the past few years. The capability of capturing the light field of a scene in a single snapshot has enabled new applications not just for consumer photography, but also for industrial applications where 3D reconstruction of scenes is desired. A less-researched application area is the characterization of materials and surfaces from light fields. In this talk we discuss some of those applications and show how the imaging tasks impact the end-to-end design of a resulting task-specific light field imaging system.

Speaker's Biography: Kathrin Berkner is Deputy Director of Research at Ricoh Innovations, where she is leading research on sensing technologies and collaborations with Ricoh R&D teams in Japan and India. Her research team of computational imaging experts has been developing technology that made its way into several new imaging products for Ricoh. Before working with optical elements, Dr. Berkner worked on a variety of image and document processing technologies that were implemented in Ricoh's core multifunction printing products and earned several internal awards. Prior to joining Ricoh, she was a Postdoctoral Researcher at Rice University, Houston, TX, performing research on wavelets with the Rice DSP group. Dr. Berkner holds a PhD degree in mathematics from the University of Bremen, Germany.


Professor Martin S. Banks

University of California at Berkeley

February 25, 2015 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Vergence-Accommodation Conflicts in Stereoscopic Displays

Talk Abstract: Stereoscopic displays present different images to the two eyes and thereby create a compelling three-dimensional (3D) sensation. They are being developed for numerous applications. However, stereoscopic displays cause perceptual distortions, performance decrements, and visual discomfort. These problems occur because some of the presented depth cues (i.e., perspective and binocular disparity) specify the intended 3D scene while focus cues (blur and accommodation) specify the fixed distance of the display itself. We have developed a stereoscopic display that circumvents these problems. It consists of a fast switchable lens (>1 kHz) synchronized to the display such that focus cues are nearly correct. Using this display, we have investigated how the conflict between vergence and accommodation affects 3D shape perception, visual performance, and, most importantly, visual comfort. We offer guidelines to minimize these adverse effects.

Speaker's Biography: Martin S. Banks is a Professor of Optometry and Vision Science at the University of California at Berkeley. He has received numerous awards for his work on basic and applied research on human visual development, on visual space perception, and on the development and evaluation of stereoscopic displays. He was appointed Fellow of the Center for Advanced Study of the Behavioral Sciences (1988), Honorary Research Fellow of Cardiff University (2007), Fellow of the American Association for the Advancement of Science (2008), Fellow of the American Psychological Society (2009), Holgate Fellow of Durham University (2011), and WICN Fellow of University of Wales (2011).

Professor Banks received his Bachelor’s degree at Occidental College in 1970 where he majored in Psychology and minored in Physics. He received a Master’s degree in Experimental Psychology from UC San Diego in 1973 and a doctorate in Developmental Psychology from University of Minnesota in 1976. He was Assistant and Associate Professor of Psychology at the University of Texas at Austin from 1976-1985. He moved to UC Berkeley School of Optometry in 1985, and was Chairman of the Vision Science Program from 1995-2002, and again in 2012.


Professor Zeev Zalevsky

Bar-Ilan University

February 11, 2015 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Remote Photonic Sensing and Super Resolved Imaging

Talk Abstract: My talk will be divided into two parts. In the first, I will present a technological platform that can be used for remote sensing of biomedical parameters as well as for establishing a directional communication channel. The technology is based upon illuminating a surface with a laser and then using an imaging camera to perform temporal and spatial tracking of secondary speckle patterns, in order to obtain nanometric-accuracy estimation of the movement of the back-reflecting surface. If the back-reflecting surface is skin located close to main blood arteries, then biomedical monitoring can be realized. If the surface is close to the neck or head, then a directional communication channel can be established.
The proposed technology has already been applied to remote and continuous estimation of heartbeats, blood pulse pressure, intraocular pressure, and alcohol and glucose concentrations in the bloodstream, as well as to early detection of malaria. It has also been used experimentally as an invisible photonic means for remote, directional, and noise-isolated sensing of speech signals.
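The tracking step the abstract describes — estimating the frame-to-frame displacement of the secondary speckle pattern — is commonly implemented as a cross-correlation peak search. A minimal sketch under that assumption (illustrative only, not the speaker's actual pipeline):

```python
import numpy as np

def speckle_shift(frame_a, frame_b):
    """Estimate the (dy, dx) integer translation between two speckle
    frames by locating the peak of their FFT-based cross-correlation."""
    A = np.fft.fft2(frame_a)
    B = np.fft.fft2(frame_b)
    corr = np.fft.ifft2(A * np.conj(B)).real
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Indices past N/2 correspond to negative (wrapped) shifts.
    return tuple(p - n if p > n // 2 else p for p, n in zip(peak, corr.shape))

# Synthetic speckle frame and a copy translated by (3, -2) pixels.
rng = np.random.default_rng(0)
frame = rng.random((64, 64))
shifted = np.roll(frame, (3, -2), axis=(0, 1))
print(speckle_shift(shifted, frame))   # → (3, -2)
```

Tracking such shifts over time yields the surface-motion signal from which vital signs or speech can be recovered; sub-pixel refinement of the correlation peak would be needed for the nanometric accuracy quoted in the talk.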
The second part of my talk will deal with optical super-resolution. Digital imaging systems, as well as the human visual system, have limited capability for separating spatial features; therefore, imaging resolution is limited. The reasons for this limitation are related to diffraction, i.e. the finite dimensions of the imaging optics, the geometry of the sensing array and its sensitivity, as well as the axial position of the object itself, which may be out of focus.
In my talk I will present novel photonic approaches and means to exceed the above-mentioned limitations and eventually allow super-resolved imaging, providing improved lateral and axial capabilities for separating spatial features.

Speaker's Biography: Zeev Zalevsky is a full Professor in the Faculty of Engineering at Bar-Ilan University, Israel. His major fields of research are optical super-resolution, biomedical optics, nano-photonics and electro-optical devices, RF photonics, and beam shaping. Zeev received his B.Sc. and Ph.D. degrees in electrical engineering from Tel-Aviv University in 1993 and 1996, respectively. He has many publications, patents, and awards recognizing his significant contributions to the fields of super-resolved imaging and biomedical sensing. Zeev is an OSA, SPIE, and EOS fellow and an IEEE senior member. He is currently serving as Vice Dean of Engineering, head of the electro-optics track, and a director of the Nano-photonics Center at the Bar-Ilan Institute of Nanotechnology. Zeev is also the founder of several startup companies.


Professor Joseph Ford

University of California at San Diego

February 4, 2015 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Miniaturized panoramic cameras using fiber-coupled spherical optics

Talk Abstract: Conventional digital cameras require lenses that form images directly onto focal planes, a natural consequence of the difficulty of fabricating non-planar image sensors. But using a curved image surface can dramatically increase the aperture, resolution and field of view achievable within a compact volume. This presentation will highlight imager research by UCSD, and collaborators at Distant Focus, done within the DARPA "SCENICC" program. I'll show an acorn-sized F/1.7 lens with a 12 mm focal length, a 120˚ field of view, a spectrum that extends from the visible to near infrared, and a measured resolution of over 300 lp/mm on its spherical image surface. The spherical image surface is coupled to one or more focal planes by high-resolution optical fiber bundles, resulting in raw images that compare well to conventional cameras an order of magnitude larger. These images are further improved by computational photography techniques specific to the fiber-coupled "cascade" image, where a continuous image is sampled by a quasi-periodic fiber bundle before transfer and re-sampling by the rectangular pixel array. I'll show the results of such image processing, and how this technology can fit an F/1 omnidirectional 150 Mpixel/frame (or larger) movie camera into a 4" diameter sphere.
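The re-sampling the abstract describes — a continuous image sampled at quasi-periodic fiber positions, then transferred to a rectangular pixel grid — can be sketched as a nearest-neighbor gather. All positions, counts, and the test pattern below are made up for illustration; the processing described in the talk is far more sophisticated:

```python
import numpy as np

rng = np.random.default_rng(2)
n_fibers = 1500
# Irregular fiber positions in the unit square, standing in for the
# quasi-periodic packing of a real bundle.
fibers = rng.random((n_fibers, 2))
# Intensity each fiber picks up from a smooth synthetic scene.
intensity = np.sin(6 * fibers[:, 0]) * np.cos(6 * fibers[:, 1])

# Re-sample onto a 32x32 rectangular pixel grid: each pixel takes the
# value of its nearest fiber.
g = np.linspace(0.0, 1.0, 32)
gx, gy = np.meshgrid(g, g)
pixels = np.stack([gx.ravel(), gy.ravel()], axis=1)
d2 = ((pixels[:, None, :] - fibers[None, :, :]) ** 2).sum(axis=-1)
image = intensity[d2.argmin(axis=1)].reshape(32, 32)
print(image.shape)   # → (32, 32)
```

The double sampling (fiber bundle, then pixel array) is what creates the "cascade" artifacts that the computational photography techniques in the talk are designed to remove.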

Speaker's Biography: Joseph E. Ford is a Professor of ECE at the University of California San Diego working in free-space optics for communications, energy, and sensing. At AT&T Bell Labs from 1994 to 2000, Dr. Ford led research demonstrating the first MEMS attenuator, spectral equalizer and wavelength add/drop switch, technologies now in widespread use. Dr. Ford was General Chair of the first IEEE Conference on Optical MEMS in 2000, and General Chair for the 2008 OSA Optical Fiber Communications Conference. Dr. Ford is co-author on 47 United States patents and over 200 journal articles and conference proceedings, and a Fellow of the Optical Society of America. He leads UCSD's Photonics Systems Integration Lab, a research group doing advanced free-space optical system design, prototyping and characterization for a wide range of applications.


Imaging and Visual Quality Seminar to be held in Santa Clara, CA

February 3, 2015 1:30 pm to February 4, 2015 12:30 am

Location: Intel Corporation | Santa Clara 12 Building (SC12) at 3600 Juliette Lane, Santa Clara

Talk Title: Imaging and Visual Quality Seminar

Talk Abstract: The Imaging and Visual Quality Seminar is sponsored by Intel, the IEEE Standards Association and the Stanford Center for Image Systems Engineering. This seminar features 11 speakers with different expertise and perspectives on image quality assessment. There will be talks on standards, simulation and subjective methods for image quality evaluation. The unique challenges of quantifying the image quality of smartphones, automotive and novel mobile devices will also be discussed.

This event is free, but pre-registration is required.


Dr. Giacomo Chiari

Retired from the Getty Conservation Institute in Los Angeles

January 21, 2015 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Imaging Cultural Heritage

Talk Abstract: An image is worth a thousand words.
The recent progress in imaging techniques applied to cultural heritage has been immense; to cover it all would take a full university course. This lecture presents in a succinct way the applications of imaging done at the Getty Conservation Institute over the last 10 years. The fundamental problem of registering and superimposing images obtained using different techniques has been solved, and several examples will show how powerful this is. Chemical mapping, coupled with spot noninvasive analyses on selected points, enormously reduces the need to take samples. Several techniques to image the invisible are described: some old and revitalized thanks to new tools, like electron emission or defocused radiography; others totally new and made possible by the advent of modern detectors and excitation means. 3D visualization of medium-to-large bronze statues via CT scan has opened new insights into defects in a statue's manufacture and its subsequent deterioration. The possibility of detecting at a distance loose pieces of plaster that are dangerous to the public, using laser speckle interferometry, can save large amounts of money and conservators' time. Multispectral analysis, giving different information for each wavelength, makes it possible to select the most informative images and combine them. Visible Induced Luminescence can uniquely map Egyptian blue, and the examples shown demonstrate how powerful this technique is.

Speaker's Biography: Giacomo Chiari, a full professor of crystallography at Turin University, worked extensively on cultural heritage (Michelangelo's Last Judgement, Maya Blue, adobe conservation in many countries). When he retired from Turin University in 2003, he became the Chief Scientist at the Getty Conservation Institute in Los Angeles. He retired from the GCI in 2013 and has been consulting and lecturing since then. At the GCI he helped to develop new equipment (a CT scanner for bronzes, a portable noninvasive XRD/XRF device - DUETTO, a laser speckle interferometer to detect detached plaster, VIL, visible induced luminescence, for mapping the Egyptian Blue pigment, and X-ray electron emission radiography). In the field he has worked on mural paintings in Peru, in Tutankhamen's tomb, and in Herculaneum.


Dr. Achintya Bhowmik


January 14, 2015 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Intel® RealSense Technology: Adding Immersive Sensing and Interactions to Computing Devices

Talk Abstract: How we interface and interact with computing and entertainment devices is on the verge of a revolutionary transformation, with natural user inputs based on touch, gesture, and voice replacing or augmenting the use of traditional interfaces based on the mouse, remote controls, and joysticks. With the rapid advances in natural sensing technologies, we are endowing the devices with the abilities to see, hear, feel, and understand us and the physical world. In this talk, we will present and demonstrate Intel® RealSense Technology, which is enabling a new class of interactive and immersive applications based on embedded real-time 3D visual sensing. We will also take a peek at the future of multimodal sensing and interactions.

Speaker's Biography: Dr. Achin Bhowmik leads the research, development, and productization of advanced computing solutions based on natural interactions, intuitive interfaces, and immersive experiences, recently branded as Intel® RealSense Technology. Previously, he served as the chief of staff of the personal computing group, Intel’s largest business unit. Prior to that, he led the advanced video and display technology group, responsible for developing multimedia processing architecture for Intel’s computing products. His prior work includes liquid-crystal-on-silicon microdisplay technology and integrated optoelectronic devices.
As an adjunct and guest professor, he has taught graduate-level courses on advanced sensing and human-computer interactions, computer vision, and display technologies at the University of California, Berkeley; Kyung Hee University, Seoul; and the University of California, Santa Cruz Extension. He has more than 150 publications, including two books, "Interactive Displays: Natural Human-Interface Technologies" and "Mobile Displays: Technology & Applications", and 27 issued patents. He is an associate editor for the Journal of the Society for Information Display. He is the vice president of the Society for Information Display (SID), Americas, and a senior member of the IEEE. He is on the board of directors for OpenCV, the organization behind the open source computer vision library.


Dr. Andrew Gallagher


January 7, 2015 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Understanding Images of People with Social Context

Talk Abstract: When we see other humans, we can quickly make judgments such as their demographic description and, if they are familiar to us, their identity. We can answer questions related to the activities and relationships between people in an image. We draw conclusions based not just on what we see, but also from a lifetime of experience of living and interacting with other people. Even simple, common-sense knowledge, such as the fact that children are smaller than adults, allows us to better understand the roles of the people we see. In this work, we propose contextual features for modeling social context, drawn from a variety of public sources, and models for understanding images of people, with the objective of providing computers with access to the same contextual information that humans use.
Computer vision and data-driven image analysis can play a role in helping us learn about people. We now are able to see millions of candid and posed images of people on the Internet. We can describe people with a vector of possible first names, and automatically produce descriptions of particular people in an image. From a broad perspective, this work presents a loop in that our knowledge about people can help computer vision algorithms, and computer vision can help us learn more about people.

Speaker's Biography: Andy is a Senior Software Engineer with Google, working with geo-referenced imagery. Previously, he was a Visiting Research Scientist at Cornell University's School of Electrical and Computer Engineering, and part of a computer vision start-up, TaggPic, that identified landmarks in images. He earned the Ph.D. degree in electrical and computer engineering from Carnegie Mellon University in 2009, advised by Prof. Tsuhan Chen. Andy worked for the Eastman Kodak Company from 1996 to 2012, initially developing computational photography and computer vision algorithms for digital photofinishing, such as dynamic range compression, red-eye correction and face recognition.
More recently, Andy's interests are in the arena of improving computer vision by incorporating context, human interactions, and unique image cues. Andy is interested in a wide variety of data analysis problems, and has developed algorithms for detecting image forgeries, assembling puzzles, and deciding what NFL teams should do on fourth down.


Dr. Aldo Badano

US Food and Drug Administration

December 10, 2014 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: A stereoscopic computational observer model for image quality assessment

Talk Abstract: As stereoscopic display devices become more commonplace, their image quality evaluation becomes increasingly important. Most studies on 3D displays rely on physical measurements or on human preference. Currently, there is no link correlating bench testing with detection performance for medical imaging applications. We describe a computational stereoscopic observer approach inspired by the mechanisms of stereopsis in human vision for task-based image quality assessment. The stereo-observer uses a left and a right image generated through a visualization operator to render 3D datasets for white and lumpy backgrounds. Our simulation framework generalizes different types of model observers including existing 2D and 3D observers as well as providing flexibility for the stereoscopic model approach. We show results quantifying the changes in performance when varying stereo angle as measured by the ideal linear stereoscopic observer. We apply the framework to the study of performance trade-offs for three stereoscopic display technologies. Our results show that the crosstalk signature for 3D content varies considerably when using different models of 3D glasses for active stereoscopic displays. Our methodology can be extended to model other aspects of the stereoscopic imaging chain in medical, entertainment, and other demanding applications.
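The "ideal linear observer" performance measured above is conventionally summarized by the Hotelling observer, whose detectability for a known signal in zero-mean Gaussian noise with covariance K is SNR² = sᵀK⁻¹s. A minimal numerical sketch with toy numbers (not the authors' simulation framework):

```python
import numpy as np

def hotelling_snr(signal, cov):
    """Detectability (SNR) of the ideal linear (Hotelling) observer
    for a known signal in zero-mean Gaussian noise with the given
    covariance: SNR^2 = s^T K^{-1} s."""
    template = np.linalg.solve(cov, signal)   # Hotelling template K^{-1} s
    return float(np.sqrt(signal @ template))

# Toy 3-"pixel" signal in white noise of variance 4:
# SNR^2 = (1 + 4 + 4) / 4 = 2.25, so SNR = 1.5.
s = np.array([1.0, 2.0, 2.0])
K = 4.0 * np.eye(3)
print(hotelling_snr(s, K))   # → 1.5
```

In the stereoscopic setting described in the abstract, the signal and covariance would come from the left/right image pair produced by the visualization operator, so the SNR quantifies how display parameters such as stereo angle or crosstalk change task performance.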

Speaker's Biography: Aldo Badano is a member of the Senior Biomedical Research Service and the Laboratory Leader for Imaging Physics in the Division of Imaging, Diagnostics, and Software Reliability, Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, US Food and Drug Administration. Dr. Badano leads a program on the characterization, modeling and assessment of medical image acquisition and display devices using experimental and computational methods. Dr. Badano is an affiliate faculty at the Fischell Bioengineering Department at the University of Maryland College Park, and at the Computer Science and Electrical Engineering Department of University of Maryland, Baltimore County. He received a PhD degree in Nuclear Engineering and a MEng in Radiological Health Engineering from the University of Michigan in 1999 and 1995, and a ChemEng degree from the Universidad de la República, Montevideo, Uruguay in 1992. He serves as Associate Editor for several scientific journals and as reviewer of technical proposals for DOD and NIH. Dr. Badano has authored more than 250 publications and a tutorial textbook on medical displays.


Dr. Michael Zordan

Sony Biotechnology

February 18, 2015 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: The Spectral Flow Cytometer

Talk Abstract: Spectral flow cytometry is an exciting technology for cytomics and systems biology. Spectral flow cytometry differs from conventional flow cytometry in that the measured parameters for events are fluorescence spectra taken across all detectors, as opposed to being primarily the fluorescence signal measured from one detector. This gives spectral flow cytometry capabilities and flexibility that far exceed those of conventional flow cytometry.
There are several different hardware schemes that can be used to measure spectral data from cells. The core functions that a spectral detection scheme must have are:

1. A means to spatially separate collected light based on wavelength.
2. A multichannel detection system that will simultaneously measure the signals at different wavelengths independently.
3. The data processing power to perform spectral unmixing for real-time display.
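The spectral unmixing in step 3 can be sketched as a least-squares solve against per-fluorophore reference spectra. The tiny reference matrix and mixture below are made up for illustration; a real instrument measures references from single-stained controls, uses many more channels, and often enforces nonnegativity:

```python
import numpy as np

# Hypothetical reference spectra: rows = detector channels, columns = fluorophores.
reference = np.array([
    [0.9, 0.1],   # channel 1: dominated by fluorophore A
    [0.5, 0.4],   # channel 2: both fluorophores contribute
    [0.1, 0.8],   # channel 3: dominated by fluorophore B
])

# Measured spectrum for one event: a linear mixture of the two fluorophores.
true_abundance = np.array([2.0, 3.0])
measured = reference @ true_abundance

# Least-squares unmixing: recover the per-fluorophore abundances.
abundance, *_ = np.linalg.lstsq(reference, measured, rcond=None)
print(np.round(abundance, 6))  # → [2. 3.]
```

In this noise-free toy case the solve recovers the abundances exactly; with real detector noise the residual of the fit indicates unmixing quality.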

These fundamental differences enable spectral flow cytometers to perform applications that are not readily possible on conventional flow cytometers. Cellular autofluorescence can be used as a parameter in spectral flow cytometry, opening up options for analysis that are not present in conventional flow cytometry. Additionally, because a spectral flow cytometer measures the whole fluorescence spectrum for each fluorophore, overlapping fluorophores can be resolved based on spectral shape, allowing for the use of markers that would not be resolvable by conventional flow cytometry. Sony Biotechnology Inc. has recently released the SP6800, the world’s first commercial spectral flow cytometer.

Speaker's Biography: Michael is a Staff Engineer at Sony Biotechnology Inc. specializing in the design and use of flow cytometry instrumentation, with particular emphasis on the optical design of the systems. He has been a lead engineer on the SY3200 cell sorter, the EC800 flow analyzer and has contributed to the SP6800 Spectral Analyzer. He received a Ph.D. from Purdue University in 2010 in Biomedical Engineering where he developed optical methods for the detection and isolation of single rare cells. He is an ISAC Scholar, and a member of the ISAC Data Standards Task Force. His current research interests include spectral cell analysis and next generation cellular analysis techniques.


Tom Malzbender

Cultural Heritage Imaging

January 28, 2015 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Capturing and Transforming Surface Reflectance: Imaging the Antikythera Mechanism

Talk Abstract: In 1900, a party of sponge divers chanced on the wreck of a Roman merchant vessel between Crete and mainland Greece. It was found to contain numerous ancient Greek treasures, among them a mysterious corroded lump that split open to reveal ‘mathematical gears’ as it dried out. This object is now known as the Antikythera Mechanism, one of the most revealing artifacts of the advanced state of ancient Greek science and technology. In 2005 we traveled to the National Archaeological Museum in Athens to apply our reflectance imaging methods to the mechanism for the purpose of revealing ancient writing on the device. These methods capture surface appearance and transform reflectance properties so that subtle surface shape, otherwise difficult to perceive, becomes visible. We were successful, and together with the results of microfocus CT imaging, epigraphers were able to decipher 3,000 characters, compared with the 800 previously known. This led to the understanding that the device was a mechanical astronomical computer, built around 150 B.C.E. and capable of predicting solar and lunar eclipses. This talk will give an overview of the reflectance imaging methods as well as what they reveal about the Antikythera Mechanism.


Speaker's Biography: Tom is a researcher who recently completed a 31-year career at Hewlett-Packard Laboratories, working at the interface of computer graphics, vision, imaging, and signal processing. At HPL he developed the methods of Fourier Volume Rendering, Polynomial Texture Mapping (PTM), and Reflectance Transformation, as well as directing the Visual Computing Department. Tom also developed the capacitive sensing technology that allowed HP to penetrate the consumer graphics tablet market. His PTM/RTI methods are used by most major museums in North America and Europe and in the fields of criminal forensics, paleontology, and archaeology. He has co-chaired or served on the program committee of over 30 conferences in computer graphics and vision. Tom now serves on the board of Cultural Heritage Imaging.


Dr. Johnny Lee


December 3, 2014 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Project Tango: Giving Mobile Devices a Human-Scale Understanding of Space and Motion

Talk Abstract: Project Tango is a focused effort to harvest research from the last decade of work in computer vision and robotics and concentrate that technology into a mobile device. It uses computer vision and advanced sensor fusion to estimate the position and orientation of the device in real time, while simultaneously generating a 3D map of the environment. We will discuss some of the underlying technologies that make this possible, such as the hardware sensors and some of the software algorithms. We will also show demonstrations of how the technology could be used in both gaming and non-gaming applications. This is just the beginning and we hope you will join us on this journey. We believe it will be one worth taking.

Speaker's Biography: Johnny Lee is a Technical Program Lead at the Advanced Technology and Projects (ATAP) group at Google. He leads Project Tango, a focused effort to bring computer vision and advanced sensor fusion to mobile platforms. Previously, he helped Google X explore new projects as a Rapid Evaluator and was a core algorithms contributor to the original Xbox Kinect. His YouTube videos demonstrating Wii remote hacks have surpassed 15 million views, and his TED talk on them became one of the most popular TED talk videos. In 2008, he received his PhD in Human-Computer Interaction from Carnegie Mellon University, and he has been recognized in MIT Technology Review's TR35.


Dr. Arthur Zhang


November 12, 2014 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: iOptik: Contact Lens–Enabled Wearable Display Platform

Talk Abstract: A revolution in technology is currently underway with wearable electronics that will dramatically change how we interact with technology. Among all the ways that technology can provide us with sensory input and feedback, no method is more important than through our visual system. With smartphones, tablets, and TVs becoming ever larger to enable a more enjoyable and natural way to interact with digital content, consumers, industry, and military alike are turning to wearable displays. We are all seeking the ideal display that fits unobtrusively into our everyday lives, while providing very high visual performance. Innovega’s iOptik wearable display system accomplishes this by breaking away from any conventional optical method and merging the optics of the wearable display into high-tech contact lenses. The contact lenses provide the wearer with the ability to see their surroundings with perfectly corrected vision, while simultaneously allowing them the ability to view an immersive display, in a tiny form factor, embedded within fashionable eyewear. This talk will present an overview of how our technology works and provide some examples of the elements within our system.

Speaker's Biography: Arthur Zhang is an engineer, scientist, and entrepreneur. He has been working in the startup environment since graduating with a PhD in applied physics in 2010 and is now the Senior Member of Technical Staff at Innovega. He has been responsible for developing many of the key components of Innovega’s technology, including the world’s first polarized contact lens. He has also been a key contributor to Innovega’s many eyewear platforms. Arthur has great interest in the merging of technology with the human body and is an expert in the integration of nano/micro-scale devices into medical devices.


Professor Gordon Wetzstein

Stanford University

October 15, 2014 4:15 pm to 5:15 pm

Location: AllenX Auditorium

Talk Title: Compressive Light Field Display and Imaging Systems

Talk Abstract: With rapid advances in optical fabrication, digital processing power, and computational perception, a new generation of display technology is emerging: compressive displays exploring the co-design of optical elements and computational processing while taking particular characteristics of the human visual system into account. We will review advances in this field and give an outlook on next-generation compressive display and imaging technology. In contrast to conventional technology, compressive displays aim for a joint-design of optics, electronics, and computational processing that together exploit compressibility of the presented data. For instance, light fields show the same 3D scene from different perspectives - all these images are very similar and therefore compressible. By combining displays that use multilayer architectures or directional backlighting combined with optimal light field factorizations, limitations of existing devices, for instance resolution, depth of field, and field of view, can be overcome. In addition to light field display and projection, we will discuss a variety of technologies for compressive super-resolution and high dynamic range image display as well as compressive light field imaging and microscopy.
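The light field factorizations the abstract refers to can be illustrated with a toy nonnegative matrix factorization, where one factor stands in for directional backlight patterns and the other for an LCD layer. The sizes, random seed, and multiplicative update rule below are chosen purely for illustration; actual compressive displays use time-multiplexed tensor factorizations over real view images:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "light field" matrix: rows = viewing directions, columns = pixels.
# Nearby views of a scene are similar, so the matrix is nearly low rank;
# here we build it as rank-1 plus a little noise.
L = np.abs(rng.normal(size=(4, 1)) @ rng.normal(size=(1, 16))) \
    + 0.01 * rng.random((4, 16))

rank = 2
front = rng.random((4, rank)) + 0.1   # stand-in for backlight patterns
rear = rng.random((rank, 16)) + 0.1   # stand-in for LCD layer patterns

# Multiplicative-update NMF: physical layers attenuate light,
# so both factors must stay nonnegative.
for _ in range(500):
    front *= (L @ rear.T) / (front @ rear @ rear.T + 1e-12)
    rear *= (front.T @ L) / (front.T @ front @ rear + 1e-12)

err = np.linalg.norm(L - front @ rear) / np.linalg.norm(L)
print(f"relative reconstruction error: {err:.4f}")
```

Because nearby views are redundant, a factorization of much lower rank than the number of views reconstructs the light field with small error, which is exactly the compressibility these displays exploit.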

Speaker's Biography: Prior to joining Stanford University's Electrical Engineering Department as an Assistant Professor in 2014, Gordon Wetzstein was a Research Scientist in the Camera Culture Group at the MIT Media Lab. His research focuses on computational imaging, microscopy, and display systems as well as computational light transport. At the intersection of computer graphics, machine vision, optics, scientific computing, and perception, this research has a wide range of applications in next-generation consumer electronics, scientific imaging, human-computer interaction, remote sensing, and many other areas. Gordon's cross-disciplinary approach to research has been funded by DARPA, NSF, Intel, Samsung, and other grants from industry sponsors and research councils. In 2006, Gordon graduated with Honors from the Bauhaus in Weimar, Germany, and he received a Ph.D. in Computer Science from the University of British Columbia in 2011. His doctoral dissertation focuses on computational light modulation for image acquisition and display and won the Alain Fournier Ph.D. Dissertation Annual Award. He organized the IEEE 2012 and 2013 International Workshops on Computational Cameras and Displays, founded as a forum for sharing computational display design instructions with the DIY community, and presented a number of courses on Computational Displays and Computational Photography at ACM SIGGRAPH. Gordon won the best paper awards at the International Conference on Computational Photography in 2011 and 2014 as well as a Laval Virtual Award in 2005.



Professor Eli Peli

Harvard Medical School

October 29, 2014 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Visual Issues with Head-Mounted Displays

Talk Abstract: After 25 years of commercial development of head-mounted displays (HMDs), we seem to be approaching a point of maturation at which the technology will finally penetrate the marketplace. The presentation of images in near-eye displays, whether monocular, binocular, stereoscopic, or see-through for augmented vision, has important consequences for the visual experience, and of particular importance for the technology's success is the comfort and safety of the users. I will discuss the ophthalmic consequences of HMDs that have been suggested, and the evidence collected so far. A major concern has been the decoupling of accommodation and convergence in (stereo and non-stereo) HMDs, which is presumed to cause eye strain and has led to numerous technological approaches to overcome it. Motion-sickness-like symptoms are common with HMDs and with non-HMD stereo displays, but have been addressed to a much lesser extent. Other visual phenomena and visual challenges presented by HMDs will be discussed as well.

Speaker's Biography: Eli Peli is trained as an Electrical Engineer and an Optometrist. He is the Moakley Scholar in Aging Eye Research at Schepens, Massachusetts Eye and Ear, and Professor of Ophthalmology at Harvard Medical School. Dr. Peli is a Fellow of the American Academy of Optometry, the Optical Society of America, the Society for Information Display, and The International Society of Optical Engineering. He was presented the 2010 Otto Schade Prize from the SID (Society for Information Display) and the 2010 Edwin H Land Medal awarded jointly by the Optical Society of America and the Society for Imaging Science and Technology. His principal research interests are image processing in relation to visual function and clinical psychophysics in low vision rehabilitation, image understanding and evaluation of display-vision interaction. He also maintains an interest in oculomotor control and binocular vision. Dr. Peli is a consultant to many companies in the ophthalmic instrumentation area and to manufacturers of head mounted displays (HMD).



Dr. Bernard Kress

Google [X] Labs

October 22, 2014 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: From Virtual Reality headsets to Smart Glasses and beyond

Talk Abstract: Helmet Mounted Displays (HMDs) and Head Up Displays (HUDs) have been used extensively over the past decades especially within the defense sector. The complexity of the design and the fabrication of high quality see-through combiner optics to achieve high resolution over a large FOV have hindered their use in consumer electronic devices.
Occlusion Head Mounted Displays (HMD) have also been used in the defense sector for simulation and training purposes, over similar large FOV, packed with custom head tracking and eye gesture sensors.
Recently, a paradigm shift to consumer electronics has occurred as part of the wider wearable-computing effort. Technologies developed for the smartphone industry have been used to build smaller, lower-power, cheaper electronics. Similarly, novel integrated sensors and micro-displays have enabled the development of consumer electronic smart glasses and smart eyewear, professional AR (Augmented Reality) HMDs, as well as VR (Virtual Reality) headsets.
Reducing the FOV while addressing the need for an increased exit pupil (thus allowing their use by most people), alongside stringent industrial design constraints, has been pushing the limits of the design techniques and technologies available to the optical engineer (refractive, catadioptric, micro-optic, segmented Fresnel, waveguide, diffractive, holographic, …).
The integration of the optical combiner within conventional meniscus prescription lenses is a challenge that has yet to be solved.
We will review how a broad range of optical design techniques have been applied to fulfill such requirements, as well as the various head-worn devices developed to date. Finally, we will review additional optical technologies applied as input mechanisms (eye and head gesture sensing, gaze tracking and hand gesture sensing).

Speaker's Biography: For over 20 years, Bernard has made significant scientific contributions as a researcher, professor, consultant, advisor, instructor, and author in the fields of micro-optics, diffractive optics, and holography for research, industry, and consumer electronics. He has been involved in half a dozen start-ups in Silicon Valley on optical data storage, optical telecom, optical position sensors, and displays (picos, HUDs, and HMDs). Bernard holds 28 granted international patents and 30 patent applications. He has published more than 100 proceedings papers and 18 refereed journal papers. He is a short course instructor for the SPIE on micro-optics, diffractive optics, and wafer-scale optics. He has published three books, “Digital Diffractive Optics” (John Wiley and Sons, 1999), “Applied Digital Optics” (John Wiley and Sons, 2007), and “Optical System Design: Diffractive Optics” (McGraw-Hill, 2005), as well as a field guide, “Digital Micro-Optics” (SPIE, 2014). He has been chairman of the SPIE conference “Photonics for Harsh Environments” for the past three years. He is currently with Google [X] working on the Google Glass project as the Principal Optical Architect.



Professor Eric Fossum

Thayer School of Engineering at Dartmouth

October 1, 2014 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Quanta Image Sensor (QIS) Concept and Progress

Talk Abstract: The Quanta Image Sensor (QIS) was conceived when contemplating shrinking pixel sizes and storage capacities, and the steady increase in digital processing power. In the single-bit QIS, the output of each field is a binary bit plane, where each bit represents the presence or absence of at least one photoelectron in a photodetector. A series of bit planes is generated through high-speed readout, and a kernel or “cubicle” of bits (X, Y, t) is used to create a single output image pixel. The size of the cubicle can be adjusted post-acquisition to optimize image quality. The specialized sub-diffraction-limit photodetectors in the QIS are referred to as “jots”, and a QIS may have a gigajot or more, read out at 1000 fps, for a data rate exceeding 1 Tb/s. Basically, we are trying to count photons as they arrive at the sensor. Recent progress towards realizing the QIS for commercial and scientific purposes will be discussed. This includes investigation of a pump-gate jot device implemented in a 65nm process, power-efficient readout electronics, currently less than 20pJ/b in 0.18um CMOS, creating images from jot data with high dynamic range, and understanding the imaging characteristics of single-bit and multi-bit QIS devices, such as the inherent and interesting film-like D-log(H) characteristic. If successful, the QIS will represent a major paradigm shift in image capture.
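The cubicle-summing idea and the film-like D-log(H) characteristic can be sketched in a few lines: simulate Poisson photoelectron arrivals, threshold each jot to a single bit, sum one cubicle into an output pixel, and invert the bit-density curve D = 1 - exp(-H) to estimate the exposure. All sizes and rates below are illustrative toy values, not actual QIS parameters:

```python
import numpy as np

rng = np.random.default_rng(1)

# Mean photoelectron rate per jot per field (the quanta exposure H).
H = 0.2
jots_x, jots_y, fields = 8, 8, 64     # one 8x8x64 "cubicle" per output pixel

# Single-bit QIS: each jot reports 1 if at least one photoelectron arrived.
photoelectrons = rng.poisson(H, size=(fields, jots_y, jots_x))
bit_planes = (photoelectrons >= 1).astype(np.uint8)

# One output pixel = sum of the bits in the cubicle.
pixel = int(bit_planes.sum())

# Bit density D saturates like film: D = 1 - exp(-H),
# so exposure is recovered by inverting that curve.
D = pixel / bit_planes.size
H_est = -np.log(1.0 - D)
print(f"bit density {D:.3f}, estimated exposure {H_est:.3f} (true {H})")
```

The saturating D-log(H) response is why the single-bit QIS tolerates overexposure gracefully, much like photographic film.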

Speaker's Biography: Eric R. Fossum is a Professor at the Thayer School of Engineering at Dartmouth. His work on miniaturizing NASA interplanetary spacecraft cameras at Caltech’s Jet Propulsion Laboratory in the early 1990’s led to his invention of the CMOS image sensor “camera-on-a-chip” that has touched many here on Earth, from every smartphone to automobiles and medicine, from security and safety to art, social media and political change. Used in billions of cameras each year, his technology has launched a world-wide explosion in digital imaging and visual communications.

Honors include induction into the National Inventors Hall of Fame and election to the National Academy of Engineering and the National Academy of Inventors. He received the NASA Exceptional Achievement Medal and is a Fellow of the IEEE. He co-founded the International Image Sensor Society and served as its first President.

A graduate of Trinity College and Yale University, Dr. Fossum taught at Columbia and then worked at JPL. He co-founded and led Photobit Corporation and later led MEMS-maker Siimpel. He joined Dartmouth in 2010, where he teaches and continues research on image sensors, and is Director of the school’s Ph.D. Innovation Program. He has published over 260 technical papers and holds over 150 U.S. patents. He and his wife have a small hobby farm in New Hampshire and he enjoys his time on his tractor.



Professor Eero Simoncelli

Howard Hughes Medical Institute and New York University

July 31, 2014 4:15 pm to 5:15 pm

Location: Clark Center Auditorium

Talk Title: Embedding of prior probabilities in neural populations

Talk Abstract: The mammalian brain is a metabolically expensive device, and evolutionary pressures have presumably driven it to make productive use of its resources. In early stages of sensory processing, this concept can be expressed more formally as an optimality principle: the brain maximizes the information that is encoded about relevant sensory variables, given available resources. I'll describe a specific instantiation of this hypothesis that predicts a direct relationship between the distribution of sensory attributes encountered in the environment, and the selectivity and response levels of neurons within a population that encodes those attributes. This allocation of neural resources, in turn, imposes direct limitations on the ability of the organism to discriminate different values of the encoded attribute. I'll show that these physiological and perceptual predictions are borne out in a variety of visual and auditory attributes. Finally, I'll show that this encoding of sensory information provides a natural substrate for subsequent computation, which can make use of the knowledge of environmental (prior) distributions that is embedded in the population structure.



Professor Ramesh Jain

Department of Computer Science, University of California at Irvine

May 7, 2014 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Situation Recognition

Talk Abstract: With the growth in social media, the Internet of Things, wearable devices, mobile phones, and planetary-scale sensing, there is an unprecedented need and opportunity to assimilate spatio-temporally distributed heterogeneous data streams into actionable information. Consequently, concepts like objects, scenes, and events need to be extended to recognize situations (e.g., epidemics, traffic jams, seasons, flash mobs). This presentation motivates and computationally grounds the problem of situation recognition. It presents a systematic approach for combining multimodal, real-time, heterogeneous big data into actionable situations. Specifically, an approach for modeling and recognizing situations from available data streams is implemented using EventShop to model and detect situations of interest. A similar framework is applied at the personal level to determine evolving personal situations. By combining personal and environmental situations, it is possible to connect people's needs to appropriate resources efficiently, effectively, and promptly. We will discuss this framework using some early examples.

Speaker's Biography: Ramesh Jain is an entrepreneur, researcher, and educator.

He is a Donald Bren Professor in Information & Computer Sciences at the University of California, Irvine, where he does research in the Event Web and experiential computing. Earlier he served on the faculty of Georgia Tech, the University of California at San Diego, the University of Michigan, Ann Arbor, Wayne State University, and the Indian Institute of Technology, Kharagpur. He is a Fellow of the ACM, IEEE, AAAI, IAPR, and SPIE. His current research interests are in processing massive numbers of geo-spatial heterogeneous data streams for building Smart Social Systems. He is the recipient of several awards, including the 2010 ACM SIGMM Technical Achievement Award.

Ramesh co-founded several companies, managed them in initial stages, and then turned them over to professional management. These companies include PRAJA, Virage, and ImageWare. Currently he is involved in Stikco and SnapViz. He has also been advisor to several other companies including some of the largest companies in media and search space.


Professor Chris Bregler

New York University

May 28, 2014 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Next Gen Motion Capture: From the Silver Screen to the Stadium and the Streets

Talk Abstract: Mermaids and pirates, the Hulk and Iron Man! This talk will describe the behind-the-scenes technology of our match-moving and 3D capture system used in recent movies, including The Avengers, Pirates of the Caribbean, Avatar, Star Trek, and The Lone Ranger, to create the latest 3D visual effects. It will also show how we have used similar technology for New York Times infographics to demonstrate the body language of presidential debates, the motions of a New York Philharmonic conductor, New York Yankee Mariano Rivera's pitch style, and Olympic swimmer Dana Vollmer's famous butterfly stroke that won her four gold medals.

While motion capture is the predominant technology used for these domains, we have moved beyond such studio-based technology to do special effects, movement visualization, and recognition without markers and without multiple high-speed IR cameras. Instead, many projects are shot on-site, outdoors, and in challenging environments with the benefit of new interactive computer vision techniques as well as new crowd-sourced and deep learning techniques.


Speaker's Biography: Chris Bregler is a Professor of Computer Science at NYU's Courant Institute, director of the NYU Movement Lab, and C.E.O. of ManhattanMocap, LLC. He received his M.S. and Ph.D. in Computer Science from U.C. Berkeley and his Diplom from Karlsruhe University. Prior to NYU he was on the faculty at Stanford University and worked for several companies including Hewlett Packard, Interval, Disney Feature Animation, and LucasFilm's ILM. His motion capture research and commercial projects in science and entertainment have resulted in numerous publications, patents, and awards from the National Science Foundation, Sloan Foundation, Packard Foundation, Electronic Arts, Microsoft, Google, U.S. Navy, U.S. Airforce, and other sources. He has been named Stanford Joyce Faculty Fellow, Terman Fellow, and Sloan Research Fellow. He received the Olympus Prize for achievements in computer vision and pattern recognition and was awarded the IEEE Longuet-Higgins Prize for "Fundamental Contributions in Computer Vision that have withstood the test of time". Some of his non-academic achievements include being the executive producer of, which required building the world's largest real-time motion capture volume, and a massive multi-player motion game holding several world records in The Motion Capture Society. He was the chair for the SIGGRAPH Electronic Theater and Animation Festival. He has been active in the Visual Effects industry, for example, as the lead developer of ILM's Multitrack system that has been used in many feature film productions. His work has also been featured in mainstream media such as the New York Times, Los Angeles Times, Scientific American, National Geographic, WIRED, Business Week, Variety, Hollywood Reporter, ABC, CBS, NBC, CNN, Discovery/Science Channel, and many other outlets.


Professor Leo Guibas

Stanford University

April 30, 2014 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: The Space Between The Images

Talk Abstract: Multimedia content has become a ubiquitous presence on all our computing devices, spanning the gamut from live content captured by personal device sensors such as smartphone cameras to immense databases of images, audio and video stored in the cloud. As we try to maximize the utility and value of all these petabytes of content, we often do so by analyzing each piece of data individually and foregoing a deeper analysis of the relationships between the media. Yet with more and more data, there will be more and more connections and correlations, because the data captured comes from the same or similar objects, or because of particular repetitions, symmetries or other relations and self-relations that the data sources satisfy.

In this talk we focus on the "space between the images", that is, on expressing the relationships between different multimedia data. We aim to make such relationships explicit, tangible, first-class objects that themselves can be analyzed, stored, and queried -- irrespective of the media they originate from. We discuss mathematical and algorithmic issues in how to represent and compute relationships or mappings between media data sets at multiple levels of detail. We also show how to analyze and leverage networks of maps and relationships, small and large, between inter-related data. The network can act as a regularizer, allowing us to benefit from the "wisdom of the collection" in performing operations on individual data sets or in map inference between them.


Speaker's Biography: Leonidas Guibas obtained his Ph.D. from Stanford under the supervision of Donald Knuth. His main subsequent employers were Xerox PARC, DEC/SRC, MIT, and Stanford. He is currently the Paul Pigott Professor of Computer Science (and by courtesy, Electrical Engineering) at Stanford University. He heads the Geometric Computation group and is part of the Graphics Laboratory, the AI Laboratory, the Bio-X Program, and the Institute for Computational and Mathematical Engineering. Professor Guibas' interests span geometric data analysis, computational geometry, geometric modeling, computer graphics, computer vision, robotics, ad hoc communication and sensor networks, and discrete algorithms. Some well-known past accomplishments include the analysis of double hashing, red-black trees, the quad-edge data structure, Voronoi-Delaunay algorithms, the Earth Mover's distance, Kinetic Data Structures (KDS), Metropolis light transport, and the Heat-Kernel Signature. Professor Guibas is an ACM Fellow, an IEEE Fellow and winner of the ACM Allen Newell award.


Dr. Jon Shlens


May 21, 2014 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Engineering a Large Scale Vision System by Leveraging Semantic Knowledge

Talk Abstract: Computer-based vision systems are increasingly indispensable in our modern world. Modern visual recognition systems have been limited, though, in their ability to identify large numbers of object categories. This limitation is due in part to the increasing difficulty of acquiring sufficient training data in the form of labeled images as the number of object categories grows unbounded. One remedy is to leverage data from other sources – such as text data – both to train visual models and to constrain their predictions. In this talk I will present our recent efforts at Google to build a novel architecture that employs a deep neural network to identify visual objects using both labeled image data and semantic information gleaned from unannotated text. I will demonstrate that this model matches state-of-the-art performance on academic benchmarks while making semantically more reasonable errors. Most importantly, I will discuss how semantic information can be exploited to make predictions about image labels not observed during training. Semantic knowledge substantially improves "zero-shot" predictions, achieving state-of-the-art performance on predicting tens of thousands of object categories never previously seen by the visual model.


Speaker's Biography: Jon Shlens has been a senior research scientist at Google since 2010. Prior to joining Google Research he was a research fellow at the Howard Hughes Medical Institute and a Miller Fellow at UC Berkeley. His research interests include machine perception, statistical signal processing, machine learning, and biological neuroscience.


Dr. David Fattal


April 16, 2014 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Mobile Holography

Talk Abstract: The mobile computing industry is experiencing a booming development fueled by a growing worldwide demand for smartphones, tablets (and soon wearables), the availability of low-power system-on-a-chip (SoCs), and a well established supply chain in Asia. Remarkably, there is enough graphical processing power on the latest smartphones to manipulate and render multiview 3D "holographic" content on the fly. If only we had the technology to project this holographic content from a portable screen...

LEIA Inc. is a spin-off from HP Labs which aims at commercializing a disruptive display technology, a diffractive backlit LCD system allowing the rendering of holographic 3D content at video rate on a mobile platform. LEIA’s core technology resides in the surface treatment of the backlight, and otherwise utilizes a commercially available LCD panel to create animated content. It uses standard LED illumination, comes in an ultra-thin form factor, is capable of high pixel densities and large field of view and does not consume more optical power than a regular LCD screen.

In this talk, I will present some fundamental aspects of the technology and will discuss various consumer applications.

Speaker's Biography: David is the founder and CEO of LEIA Inc., a spin-off from HP Labs aiming at commercializing a novel holographic display technology for mobile devices. He previously spent 9 years as a senior researcher in the Intelligent Infrastructure Laboratory at HP Labs, working on various aspects of quantum computing and photonics and specializing in the manipulation of light at the nanoscale. He holds a PhD in Physics from Stanford University and a BS in theoretical physics from Ecole Polytechnique, France. David received the 2010 Pierre Faurre award for young French industrial career achievement, and was named French Innovator of the Year 2013 by the MIT Technology Review before featuring on its global innovator list that same year. David has 60 granted patents and co-authored the textbook "Single Photon Devices and Applications".


Professor Bernd Girod

Stanford University

April 2, 2014 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Mobile Visual Search - Linking the Virtual and the Physical World

Talk Abstract: Mobile devices are expected to become ubiquitous platforms for visual search and mobile augmented reality applications. For object recognition on mobile devices, a visual database is typically stored in the cloud. Hence, for a visual comparison, information must be either uploaded from, or downloaded to, the mobile over a wireless link. The response time of the system critically depends on how much information must be transferred in both directions, and efficient compression is the key to a good user experience. We review recent advances in mobile visual search, using compact feature descriptors, and show that dramatic speed-ups and power savings are possible by considering recognition, compression, and retrieval jointly. For augmented reality applications, where image matching is performed continually at video frame rates, interframe coding of SIFT descriptors achieves bit-rate reductions of 1-2 orders of magnitude relative to advanced video coding techniques. We will use real-time implementations for different example applications, such as recognition of landmarks, media covers or printed documents, to show the benefits of implementing computer vision algorithms on the mobile device, in the cloud, or both.
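A rough illustration of why interframe coding of descriptors saves bits (a simplified sketch, not the group's actual codec): in an augmented-reality stream, most descriptors barely change between consecutive frames, so they can be transmitted as references to the previous frame instead of in full. The matching threshold and descriptor dimensions below are arbitrary assumptions.

```python
import numpy as np

def interframe_encode(prev, curr, tol=4):
    """Delta-code a descriptor set against the previous frame.

    A descriptor that matches one in `prev` within `tol` (max absolute
    component difference) is sent as a reference index only; the rest
    are sent in full. Returns (reference_indices, new_descriptors).
    """
    refs, new = [], []
    for d in curr:
        diffs = np.abs(prev - d).max(axis=1)
        j = int(diffs.argmin())
        if diffs[j] <= tol:
            refs.append(j)   # cheap: just an index into the previous frame
        else:
            new.append(d)    # expensive: a full 128-byte descriptor
    return refs, new

rng = np.random.default_rng(0)
frame0 = rng.integers(0, 256, size=(50, 128))  # 50 SIFT-like descriptors
frame1 = frame0.copy()                         # a static scene repeats them
refs, new = interframe_encode(frame0, frame1)
```

For a static scene every descriptor becomes a reference, so the per-frame payload collapses from full descriptors to small indices, which is the source of the order-of-magnitude bit-rate reductions the abstract mentions.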


Speaker's Biography: Bernd Girod has been Professor of Electrical Engineering in the Information Systems Laboratory of Stanford University, California, since 1999. Previously, he was a Professor in the Electrical Engineering Department of the University of Erlangen-Nuremberg. His current research interests are in the area of networked media systems. He has published over 500 conference and journal papers and 6 books, receiving the EURASIP Signal Processing Best Paper Award in 2002, the IEEE Multimedia Communication Best Paper Award in 2007, the EURASIP Image Communication Best Paper Award in 2008, the EURASIP Signal Processing Most Cited Paper Award in 2008, as well as the EURASIP Technical Achievement Award in 2004 and the Technical Achievement Award of the IEEE Signal Processing Society in 2011. As an entrepreneur, Professor Girod has been involved in several startup ventures, among them Polycom, Vivo Software, 8x8, and RealNetworks. He received an Engineering Doctorate from the University of Hannover, Germany, and an M.S. degree from the Georgia Institute of Technology. Prof. Girod is a Fellow of the IEEE, a EURASIP Fellow, and a member of the German National Academy of Sciences (Leopoldina). He currently serves Stanford's School of Engineering as Senior Associate Dean for Online Learning and Professional Development.


Professor Roland Angst

Stanford University

March 11, 2015 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Challenges in Image-Based 3D Reconstructions

Talk Abstract: Driven by the needs of various applications such as robotics, immersive augmented and virtual reality, digitization of archeological sites and landmarks, and medical imaging, the extraction of 3D geometry from images has become increasingly important in the last couple of years. The theory of multiple view geometry, which relates images from different viewpoints, dates back more than 100 years. In practice, however, the required assumptions of this theory are often not met exactly, e.g. due to imperfections of cameras or measurement noise, which makes 3D computer vision inherently difficult.

In my talk, I will first outline some of the challenges we are faced with and in the second part, I will focus on two of those challenges. Specifically, we will look into radial distortion estimation without calibration targets and dense 3D reconstructions for scenes where the rigidity assumption is violated. We will see how simple and very intuitive reasoning in geometric terms can provide the foundation for algorithms to tackle those challenges.

Speaker's Biography: Roland Angst is currently affiliated with the Max Planck Center for Visual Computing and Communication. As such, he is a visiting assistant professor at Stanford University, where he is a member of Prof. Bernd Girod's Image, Video, and Multimedia Group as well as of Prof. Leonidas J. Guibas' Geometric Computation Group. He will join the Max Planck Institute in Saarbrücken in April 2015.

Roland received his PhD degree from the Swiss Federal Institute of Technology (ETH) Zürich in 2012 under the supervision of Prof. Marc Pollefeys. His research focused on geometric computer vision, and on subspace models and algorithms in particular. In 2010, he received a prestigious Google European Doctoral Fellowship in Computer Vision. Roland received his Master's degree in computer science with distinction from ETH Zürich in October 2007. His current primary research interests span computer vision and geometry, and augmented and virtual reality.


Dr. Simone Bianco

University of Milano-Bicocca, Italy

February 7, 2014 3:30 pm to 4:00 pm

Location: Packard 101

Talk Title: Adaptive illuminant estimation using faces

Talk Abstract: This talk will show that it is possible to use skin tones to estimate the illuminant color. In our approach, we use a face detector to find faces in the scene, and the corresponding skin colors to estimate the chromaticity of the illuminant. The method is based on two observations: first, skin colors tend to form a cluster in the color space, making it a cue to estimate the illuminant in the scene; second, many photographic images are portraits or contain people.
If no faces are detected, the input image is processed with a low-level illuminant estimation algorithm automatically selected.
The algorithm automatically switches from global to spatially varying color correction on the basis of the illuminant estimations on the different faces detected in the image. An extensive comparison with both global and local color constancy algorithms is carried out to validate the effectiveness of the proposed algorithm in terms of both statistical and perceptual significance on a large heterogeneous dataset of RAW images containing faces.
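A minimal sketch of the face-based idea, with an assumed canonical skin colour (the actual algorithm's skin-cluster modeling and fallback selection are more elaborate): average the pixels inside detected face boxes, divide by the canonical skin colour to estimate the illuminant chromaticity, then correct with a per-channel von Kries diagonal transform.

```python
import numpy as np

# Assumed canonical skin RGB under neutral light (illustrative value only).
CANONICAL_SKIN = np.array([0.45, 0.35, 0.30])

def estimate_illuminant(image, face_boxes):
    """Illuminant chromaticity from mean skin colour in face boxes."""
    pixels = np.concatenate(
        [image[y0:y1, x0:x1].reshape(-1, 3) for x0, y0, x1, y1 in face_boxes])
    illum = pixels.mean(axis=0) / CANONICAL_SKIN
    return illum / illum.sum()          # normalize so components sum to 1

def von_kries_correct(image, illum):
    """Per-channel (von Kries) diagonal colour correction."""
    return image * (illum.mean() / illum)

img = np.full((60, 80, 3), 0.5)
# Simulate a reddish cast on a face patch at an assumed detector box.
img[10:30, 20:40] = CANONICAL_SKIN * np.array([1.2, 1.0, 0.8])
illum = estimate_illuminant(img, [(20, 10, 40, 30)])
corrected = von_kries_correct(img, illum)
```

After correction the face patch returns to the canonical skin colour, since the estimated illuminant exactly cancels the simulated cast.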

Speaker's Biography: Simone Bianco obtained his BSc and MSc degrees in Mathematics from the University of Milano-Bicocca, Italy, in 2003 and 2006 respectively. He received his PhD in Computer Science from the Department of Informatics, Systems and Communication of the University of Milano-Bicocca in 2010, where he is currently a post-doc. His research interests include computer vision, optimization algorithms, and color imaging.


Professor Jon Yngve Hardeberg

Gjøvik University College, Gjøvik, Norway

February 7, 2014 3:00 pm to 3:30 pm

Location: Packard 101

Talk Title: Next Generation Colour Printing: Beyond Flat Colors

Talk Abstract: Colour Printing 7.0: Next Generation Multi-Channel Printing (CP7.0) is an Initial Training Network funded by the EU's Seventh Framework Programme. The project addresses a significant need for research, training and innovation in the printing industry. Its main objectives are to train a new generation of printing scientists who will be able to assume science and technology leadership in this traditional technological sector, and to do research in the colour printing field by fully exploring the possibilities of using more than the conventional four colorants (CMYK) in printing, focusing particularly on spectral properties.

We primarily focus on four key areas of research: spectral modeling of the printer/ink/paper combination, spectral gamut prediction and gamut mapping, the effect of paper optics and surface properties on the colour reproduction of multi-channel devices, and optimal halftoning algorithms and tonal reproduction characteristics of multi-channel printing devices. Several application areas are considered, including textile and fine art.

In one part of the project, an extra dimension is added to the print in the form of relief, with the goal of controlling the angle-dependent reflection properties. For such prints, called 2.5D prints, a surface texture is created by printing multiple layers of ink at desired locations. The ongoing study will lead us to create prints with relief that closely resemble the original, with a focus on aspects such as improving print quality, reducing print costs, and discovering new market opportunities.

In this presentation we will give an overview of the CP7.0 project, its goals, accomplishments and challenges. Furthermore, as most if not all of the involved scientists-in-charge, postdoctoral researchers and PhD students will be present, it will be possible to go into more depth in the discussions following the talk.

Speaker's Biography: Jon Y. Hardeberg received his sivilingeniør (MSc) degree in signal processing from the Norwegian Institute of Technology in Trondheim, Norway in 1995, and his PhD from Ecole Nationale Supérieure des Télécommunications in Paris, France in 1999. After a short but extremely valuable industry career near Seattle, Washington, where he designed, implemented, and evaluated colour imaging system solutions for multifunction peripherals and other imaging devices and systems, he joined Gjøvik University College (GUC) in Gjøvik, Norway, in 2001.

He is currently Professor of Colour Imaging at GUC's Faculty of Computer Science and Media Technology and a member of the Norwegian Colour and Visual Computing Laboratory, where he teaches, supervises MSc and PhD students, and does research in the field of colour imaging. His current research interests include multispectral colour imaging, print and image quality, colorimetric device characterisation, colour management, and cultural heritage imaging, and he has co-authored more than 150 publications within the field.

He is currently project co-ordinator for an EU project (Marie Curie ITN CP7.0), project leader for a large research project funded by the Research Council of Norway (HyPerCept), GUC's representative in the management committee of the Erasmus Mundus Master Course CIMET (Colour in Informatics and Media Technology), and Norway's Management Committee member in the COST Action COSCH (Colour and Space in Cultural Heritage).


Dr. Michael Kriss

March 12, 2014 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: ISO Speed for Digital Cameras: Real or Imaginary

Talk Abstract: The concept of an ISO speed for digital cameras is something of a conundrum. There is an accepted standard, ISO 12232:2006, on how to measure the ISO speed of a digital camera. In fact there are three accepted measures: the Recommended Exposure Index (REI), the Standard Output Sensitivity (SOS), and the saturation-based technique. These measures are all based on the final output of the digital imaging system and are often confined to a specific file format (TIFF, for example) and color encoding (sRGB, for example). The "traditional" negative film ISO (ASA) speed was empirically defined as the exposure that gave an excellent image when printed on either color or black-and-white paper. The "rule of thumb" was that on a bright sunny day the camera should be set to f/16 (or f/11 for reversal slide film) with a shutter speed of 1/ISO. This ensured two stops of under- and over-exposure protection, making it possible for simple cameras to always get a good picture on a bright day. The speaker will present a way to calculate the "ISO speed" of the sensor rather than that of the camera. The calculation will take into consideration all the physics involved in creating photoelectrons, storing them, and the degrading sources of noise present in image sensors. It will draw from film terminology, but instead of the threshold speed calculation used in film studies, a signal-to-noise (S/N) calculation will be used for sensors. The impact on sensor speed of f-stop and shutter-time manipulation will be discussed. The concept allows one to simply replace the film by a sensor and use the same metering systems. Higher ISO speeds are possible by increasing the overall gain of the imaging system (after white balance) and using better image processing to hold sharpness and lower noise.
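The noise-based S/N criterion the abstract alludes to can be sketched as follows (the sensitivity constant and read noise below are assumed illustrative values, not the speaker's numbers): model collected photoelectrons as proportional to exposure H in lux-seconds, combine photon shot noise with read noise, find the H at which S/N reaches 10, and report the ISO 12232-style noise-based speed S = 10/H.

```python
import math

def snr(h_lux_s, k_e_per_lux_s, read_noise_e):
    """S/N for exposure H: shot noise (sqrt of electrons) plus read noise."""
    e = k_e_per_lux_s * h_lux_s            # collected photoelectrons
    return e / math.sqrt(e + read_noise_e ** 2)

def noise_based_speed(k_e_per_lux_s, read_noise_e, target_snr=10.0):
    """Noise-based speed S = 10 / H_{S/N=10}, with H found by bisection."""
    lo, hi = 1e-9, 1e3                     # bracket: S/N below/above target
    for _ in range(200):
        mid = (lo + hi) / 2
        if snr(mid, k_e_per_lux_s, read_noise_e) < target_snr:
            lo = mid
        else:
            hi = mid
    return 10.0 / hi

# Assumed sensor: 2000 e-/lux-s sensitivity, 4 e- RMS read noise.
speed = noise_based_speed(k_e_per_lux_s=2000.0, read_noise_e=4.0)
```

The sketch also shows the qualitative behavior the talk discusses: lowering read noise raises the achievable speed, since the S/N = 10 point is reached at a smaller exposure.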

Speaker's Biography: Dr. Kriss received his BA (1962), MS (1964) and PhD (1969) in Physics from the University of California at Los Angeles. He joined the Eastman Kodak Research Laboratories, Color Photography Division, in 1969, and later the Physics Division, until his retirement in 1993. In his early years at Kodak, Dr. Kriss focused on color film image structure and modeled and simulated the impact of chemical development on image structure and color reproduction. When he joined the Physics Division he focused on image processing of scanned and captured digital images. Dr. Kriss spent three years in Japan, where he helped build an advanced research facility. At Kodak he headed the Imaging Processing Laboratory and the Algorithm Development Laboratory. He joined the University of Rochester in 1993, where he was the executive director of the Center for Electronic Imaging Systems and taught through the Computer and Electrical Engineering Department. He joined Sharp Laboratories of America in 2000, where he headed the Color Imaging Group. Dr. Kriss retired in 2004 but is still active as a consultant, as an Adjunct Professor at Portland State University, in IS&T activities, and as the Editor-in-Chief of the Wiley-IS&T Series on Imaging Science and Technology and the forthcoming Handbook of Digital Imaging Technologies.


Professor Daniel Palanker

Stanford University

March 5, 2014 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Restoration of Sight with Photovoltaic Subretinal Prosthesis

Talk Abstract: Retinal degeneration leads to blindness due to gradual loss of photoreceptors. Information can be reintroduced into the visual system by patterned electrical stimulation of the remaining retinal neurons. Photovoltaic subretinal prosthesis directly converts light into pulsed electric current in each pixel, stimulating the nearby inner retinal neurons. Visual information is projected onto the retina by video goggles using pulsed near-infrared (~900nm) light.
Subretinal arrays with 70μm photovoltaic pixels provide highly localized stimulation: retinal ganglion cells respond to alternating gratings with a stripe width of a single pixel, which is half of the native resolution in healthy controls (~30μm). Similarly to normal vision, retinal response to prosthetic stimulation exhibits flicker fusion at high frequencies (20-40 Hz), adaptation to static images, and non-linear summation of subunits in the receptive fields. In rats with retinal degeneration, the photovoltaic subretinal arrays also provide visual acuity up to half of its normal level (~1 cpd), as measured by the cortical response to alternating gratings. If these results translate to the human retina, such implants could restore visual acuity up to 20/250. With eye scanning and perceptual learning, human patients might even cross the 20/200 threshold of legal blindness. Ease of implantation and the tiling of these wireless modules to cover a large visual field, combined with high resolution, open the door to highly functional restoration of sight.


Speaker's Biography: Daniel Palanker is an Associate Professor in the Department of Ophthalmology and in the Hansen Experimental Physics Laboratory at Stanford University. He received his PhD in Applied Physics in 1994 from the Hebrew University of Jerusalem, Israel.
Dr. Palanker studies interactions of electric field with biological cells and tissues in a broad range of frequencies: from quasi-static to optical, and develops their diagnostic, therapeutic and prosthetic applications, primarily in ophthalmology.
Several of his developments are in clinical practice worldwide: the Pulsed Electron Avalanche Knife (PEAK PlasmaBlade™), the Patterned Scanning Laser Photocoagulator (PASCAL™), and the OCT-guided Laser System for Cataract Surgery (Catalys™). In addition to laser-tissue interactions, retinal phototherapy and associated neural plasticity, Dr. Palanker is working on electro-neural interfaces, including retinal prostheses and electronic control of the vasculature and glands.


Professor Austin Roorda

University of California at Berkeley

April 9, 2014 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Studying human vision one cone at a time

Talk Abstract: Vision scientists employ a diversity of approaches in their quest to understand human vision – from studying behavior of cells in a dish to studying responses of humans to visual stimuli. A new generation of tools are helping to bridge these two approaches. First, adaptive optics removes the blur caused by optical imperfections, offering optical access to single cells in the human retina. Second, advanced eye tracking allows us to repeatedly probe targeted retinal locations. The combined system allows us to perform psychophysics with an unprecedented level of stimulus control and localization. In this talk I will review the technology and present our latest results on human color and motion perception.

Speaker's Biography: Austin Roorda received his Ph.D. in Vision Science & Physics from the University of Waterloo, Canada in 1996. For over 15 years, Dr. Roorda has been pioneering applications of adaptive optics, including mapping of the trichromatic cone mosaic while a postdoc at the University of Rochester, designing and building the first adaptive optics scanning laser ophthalmoscope at the University of Houston, tracking and targeting light delivery to individual cones in the human eye at UC Berkeley, and being part of the first team to use AO imaging to monitor efficacy of a treatment to slow retinal degeneration. Since January 2005, he’s been at the UC Berkeley School of Optometry where he is the current chair of the Vision Science Graduate Program. He is a Fellow of the Optical Society of America and of the Association for Research in Vision and Ophthalmology and is a recipient of the Glenn A. Fry award, the highest research honor from the American Academy of Optometry.


Professor Hendrik Lensch

Eberhard Karls University Tübingen

February 12, 2014 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Emphasizing Depth and Motion

Talk Abstract: Monocular displays are rather poor at conveying relative distances and velocities between objects to the observer, as some of the binocular cues are missing. In our framework we use a stereo camera to first observe depth, relative distances and velocity, and then modify the captured images in different ways to convey the lost information. Depth, for example, can be emphasized even on a monocular display using depth-of-field rendering, local intensity or color contrast enhancement, or unsharp masking of the depth buffer. Linear motion, on the other hand, can be emphasized by motion blur, streaks, rendered bursts, or simply color-coding the remaining distances between vehicles. These are a few ways of modifying pictures of the real world to actively control the user's attention while trying to introduce only rather subtle modifications. We will present a real-time framework based on edge-optimized wavelets that optimizes depth estimation and emphasizes depth or motion.
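Unsharp masking of the depth buffer, one of the techniques mentioned, can be sketched like this (the blur radius and gain are assumed parameters, and a box blur stands in for whatever filter the real system uses): the high-pass component of the depth map is added to image luminance, exaggerating local contrast exactly at depth discontinuities.

```python
import numpy as np

def box_blur(a, r=3):
    """Separable box blur with edge padding (a stand-in for a Gaussian)."""
    p = np.pad(a, r, mode="edge")
    k = 2 * r + 1
    h = np.stack([p[:, i:i + a.shape[1]] for i in range(k)]).mean(axis=0)
    return np.stack([h[i:i + a.shape[0], :] for i in range(k)]).mean(axis=0)

def depth_unsharp(luma, depth, gain=0.5):
    """Add the high-pass of the *depth buffer* to image luminance.

    Regions nearer than their surroundings brighten, farther regions
    darken, emphasizing depth edges on a monocular display.
    """
    high_pass = depth - box_blur(depth)
    return np.clip(luma + gain * high_pass, 0.0, 1.0)

luma = np.full((32, 32), 0.5)              # a flat gray image
depth = np.zeros((32, 32)); depth[:, 16:] = 1.0   # a step in depth
out = depth_unsharp(luma, depth)
```

Away from the depth step the image is untouched; right at the step, the near side brightens and the far side darkens, producing the subtle contrast cue.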

Speaker's Biography: Hendrik P. A. Lensch holds the chair for computer graphics at Tübingen University. He received his diploma in computer science from the University of Erlangen in 1999. He worked as a research associate in the computer graphics group at the Max-Planck-Institut für Informatik in Saarbrücken, Germany, and received his PhD from Saarland University in 2003. Hendrik Lensch spent two years (2004-2006) as a visiting assistant professor at Stanford University, USA, followed by a stay at the MPI Informatik as the head of an independent research group. From 2009 to 2011 he was a full professor at the Institute for Media Informatics at Ulm University, Germany. In his career, he received the Eurographics Young Researcher Award in 2005, was awarded an Emmy Noether Fellowship by the German Research Foundation (DFG) in 2007, and received an NVIDIA Professor Partnership Award in 2010. His research interests include 3D appearance acquisition, computational photography, global illumination and image-based rendering, and massively parallel programming.


Professor EJ (Eduardo Jose) Chichilnisky

Stanford University

February 19, 2014 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Artificial retina: Design principles for a high-fidelity brain-machine interface

Talk Abstract: The retina communicates visual information to the brain in spatio-temporal patterns of electrical activity, and these signals mediate all of our visual experience. Retinal prostheses are designed to artificially elicit activity in retinas that have been damaged by disease, with the hope of conveying useful visual information to the brain. Current devices, however, produce limited visual function. The reasons for this can be understood based on the organization of visual signals in the retina, and I will show experimental data suggesting that it is possible in principle to produce a device with exquisite spatial and temporal resolution, approaching the fidelity of the natural visual signal. These advances in interfacing to the neural circuitry of the retina may have broad implications for future brain-machine interfaces in general. I will also discuss how novel technologies may be used to optimize the use of such devices for the purpose of helping blind people see.

Speaker's Biography: E.J. Chichilnisky received a BA in Mathematics from Princeton University and an MS in Mathematics and PhD in Neuroscience from Stanford University. He worked at the Salk Institute for Biological Studies in San Diego for 15 years, and is now a Professor in the Neurosurgery Department and Hansen Experimental Physics Laboratory at Stanford.


Professor Silvio Savarese

Stanford University

January 14, 2014 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Perceiving the 3D world from Images

Talk Abstract: When we look at an environment such as a coffee shop, we don't just recognize the objects in isolation, but rather perceive a rich scenery of the 3D space, its objects and all the relations among them. This allows us to effortlessly navigate through the environment, or to interact and manipulate objects in the scene with amazing precision.

The past several decades of computer vision research have, on the other hand, addressed the problems of 2D object recognition and 3D space reconstruction as two independent ones. Tremendous progress has been made in both areas. However, while methods for object recognition attempt to describe the scene as a list of class labels, they often make mistakes due to the lack of a coherent understanding of the 3D spatial structure. Similarly, methods for 3D scene modeling can produce accurate metric reconstructions but cannot put the reconstructed scene into a semantically useful form.

A major line of work from my group in recent years has been to design intelligent visual models that understand the 3D world by integrating 2D and 3D cues, inspired by what humans do. In this talk I will introduce a novel paradigm whereby objects and 3D space are modeled in a joint fashion to achieve a coherent and rich interpretation of the environment. I will start by giving an overview of our research for detecting objects and determining their geometric properties such as 3D location, pose or shape. Then, I will demonstrate that these detection methods play a critical role for modeling the interplay between objects and space which, in turn, enable simultaneous semantic reasoning and 3D scene reconstruction. I will conclude this talk by demonstrating that our novel paradigm for scene understanding is potentially transformative in application areas such as autonomous or assisted navigation, robotics, automatic 3D modeling of urban environments and surveillance.


Speaker's Biography: Silvio Savarese is an Assistant Professor of Computer Science at Stanford University. He earned his Ph.D. in Electrical Engineering from the California Institute of Technology in 2005 and was a Beckman Institute Fellow at the University of Illinois at Urbana-Champaign from 2005–2008. He joined Stanford in 2013 after being Assistant and then Associate Professor (with tenure) of Electrical and Computer Engineering at the University of Michigan, Ann Arbor, from 2008 to 2013.

His research interests include computer vision, object recognition and scene understanding, shape representation and reconstruction, human activity recognition and visual psychophysics.


Professor Kirk Martinez

University of Southampton

December 13, 2013 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Reflectance Transformation imaging of cultural heritage objects

Talk Abstract: The imaging of cultural heritage objects has helped to drive many imaging developments. This talk will briefly round up personal experiences of building high-resolution, colorimetric and 3D object imaging systems during five large European projects. Recently there has been growing interest in reflectance transformation imaging, where systems with many light positions are used to create images that can be viewed under varying light angles. This has proven useful in the study of archaeological objects with subtle surface textures that are not rendered well in a single image. Several "dome"-based systems, using high-power white LEDs and off-the-shelf digital SLR cameras, have been built for campaigns to image clay tablets. The designs and results will be discussed with examples from the Ashmolean Museum in Oxford.
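Dome setups like those described are commonly paired with Polynomial Texture Maps, one standard RTI representation (the talk may use a different fitting model): for each pixel, a biquadratic in the projected light direction (lu, lv) is least-squares fitted to the intensities observed under the dome's many lights, after which the pixel can be relit from any direction. A single-pixel sketch with a synthetic Lambertian-ish response:

```python
import numpy as np

def ptm_fit(light_dirs, intensities):
    """Fit the PTM biquadratic for one pixel:
    I(lu, lv) = a0*lu^2 + a1*lv^2 + a2*lu*lv + a3*lu + a4*lv + a5
    from intensities observed under known light directions."""
    lu, lv = light_dirs[:, 0], light_dirs[:, 1]
    A = np.stack([lu**2, lv**2, lu * lv, lu, lv, np.ones_like(lu)], axis=1)
    coeffs, *_ = np.linalg.lstsq(A, intensities, rcond=None)
    return coeffs

def ptm_relight(coeffs, lu, lv):
    """Evaluate the fitted pixel under a new light direction."""
    return coeffs @ np.array([lu**2, lv**2, lu * lv, lu, lv, 1.0])

# Synthetic pixel response I = 0.3*lu + 0.5 (an assumed toy material),
# sampled under the many light positions a dome provides.
rng = np.random.default_rng(1)
L = rng.uniform(-0.7, 0.7, size=(40, 2))
I = 0.3 * L[:, 0] + 0.5
c = ptm_fit(L, I)
```

Because the toy response lies inside the biquadratic model space, the fit recovers it exactly; for real captures the fit smooths measurement noise across the dome's light positions.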


Speaker's Biography: Kirk Martinez is a Reader in Electronics and Computer Science at the University of Southampton. He has a PhD in Image Processing from the University of Essex. He previously ran the MA in Computer Applications for the History of Art at Birkbeck College London while working on a variety of European imaging projects, including the VASARI (high-resolution colorimetric imaging of art), MARC (image and print), ACOHIR (3D objects), and Viseum (IIPImage viewer) projects. He went on to work on content-based retrieval and semantic web applications for museums (Artiste, SCULPTEUR, eCHASE). He now mainly works on sensor networks for the environment: Glacsweb and the Internet of Things. He founded the VIPS image processing library and co-designed RTI imaging systems as part of an AHRC project.


Professor Ramakrishna Kakarala and Vittal Premachandran

Nanyang Technological University

November 5, 2013 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: What parts of a shape are discriminative?

Talk Abstract: What is distinctive about the shape of an apple? The answer likely depends on comparison to similar shapes. The reason to study this question is that shape is a distinguishing feature of objects, and is therefore useful for object recognition in computer vision. Though shape is useful, when objects have similar overall shapes, discriminating among them becomes difficult; successful object recognition calls for the identification of important parts of the shapes. In this talk we introduce the concept of discriminative parts and propose a method to identify them. We show how we can assign levels of importance to different regions of contours, based on their discriminative potential. Our experiments show that the method is promising and can identify semantically meaningful segments as being important ones. We place our work in context by reviewing the related work on saliency.


Speaker's Biography: Ramakrishna Kakarala is an Associate Professor in the School of Computer Engineering at the Nanyang Technological University (NTU) in Singapore. He has worked in both academia and industry; prior to joining NTU, he spent 8 years at Agilent Laboratories in Palo Alto, and at Avago Technologies in San Jose. He received the Ph.D. in Mathematics at UC Irvine, after completing a B.Sc. in Computer Engineering at the University of Michigan. Two of his students have recently won awards: the BAE Systems award at EI 2012, and the Best Student Paper award at ICIP 2013. The latter award went to Vittal Premachandran, whose Ph.D. thesis work at NTU is the basis of this talk.


Kartik Venkataraman

Pelican Imaging

December 10, 2013 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: High Performance Camera Arrays for Light Field Imaging

Talk Abstract: Light field imaging with camera arrays has been explored extensively in academia and has been used to showcase applications such as viewpoint synthesis, synthetic refocus, computing range images, and capturing high-speed video, among others. However, none of the prior approaches have addressed the modifications needed to achieve the small form factor and image quality required to make them viable for mobile devices and consumer imaging. In our approach, we customize many aspects of the camera array, including lenses, pixels, and software algorithms, to achieve the imaging performance required for consumer imaging. We analyze the performance of camera arrays and establish scaling laws that allow one to predict the performance of such systems with varying system parameters. A key advantage of this architecture is that it captures depth. The technology is passive, supports both stills and video, and is low-light capable. In addition, we explore extending the capabilities of the depth map through regularization to support applications such as fast user-guided matting.
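Synthetic refocus, one of the applications listed, reduces at its simplest to shift-and-add over the array (a toy integer-pixel sketch, not Pelican's pipeline): each view is shifted by its camera offset times a chosen disparity and the stack is averaged, so content at that disparity aligns and stays sharp while everything else blurs.

```python
import numpy as np

def synthetic_refocus(views, positions, disparity):
    """Shift-and-add refocus for a camera array.

    `positions` are camera offsets in baseline units; each view is
    shifted by offset * disparity (rounded to whole pixels here for
    simplicity) before averaging.
    """
    acc = np.zeros_like(views[0], dtype=float)
    for img, (dx, dy) in zip(views, positions):
        sx, sy = int(round(dx * disparity)), int(round(dy * disparity))
        acc += np.roll(np.roll(img, sy, axis=0), sx, axis=1)
    return acc / len(views)

# Toy 2x2 array observing a single dot whose parallax is 1 px per
# unit of baseline.
positions = [(0, 0), (1, 0), (0, 1), (1, 1)]
views = []
for dx, dy in positions:
    v = np.zeros((16, 16))
    v[8 - dy, 8 - dx] = 1.0        # the dot shifts between views
    views.append(v)
refocused = synthetic_refocus(views, positions, disparity=1)
```

Refocusing at the dot's true disparity stacks all four copies onto one pixel; refocusing at the wrong disparity spreads them out, which is exactly the depth cue the array exploits.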

Speaker's Biography: Kartik Venkataraman has over 20 years of experience working with technology companies in Silicon Valley. Prior to founding Pelican Imaging, Kartik headed Computational Cameras at Micron Imaging (Aptina), where he spearheaded the design of Extended Depth of Field (EDOF) imaging systems for the mobile camera market. As Manager of the Camera & Scene Modeling group, he developed an end-to-end simulation environment for camera system architecture and module simulations that has been adopted in parts of the mobile imaging ecosystem. Previously, at Intel, Kartik was principally associated with investigating medical imaging and visualization between Johns Hopkins Medical School and the Institute of Systems Science in Singapore. His interests include image processing, computer graphics and visualization, computer architectures, and medical imaging. Venkataraman founded Pelican Imaging in 2008. He received his Ph.D. in Computer Science from the University of California, Santa Cruz, an MS in Computer Engineering from the University of Massachusetts, Amherst, and a B.Tech (Honors) in Electrical Engineering from the Indian Institute of Technology, Kharagpur.



Dr. Ram Narayanswamy


January 28, 2014 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Computational Imaging Platforms for Next Generation User Experience

Talk Abstract: Cameras in mobile devices have made photography ubiquitous. It is estimated that approximately 900 billion photos will be captured in 2014. Innovation and low cost are making cameras increasingly accessible to a wide range of consumer products, and they are also enabling imaging platforms with multiple cameras. Users would like newer experiences with their photographs and videos. Furthermore, they want to express themselves in novel ways using these visual media forms. The next revolution in imaging is happening at the nexus of computational imaging, industrial design, and user experience. In this talk, we develop this theme and discuss the opportunities for innovative work and the requirements it places on the various components and the system.

Speaker's Biography: Ram Narayanswamy currently leads an Intel effort in Computational Imaging as part of Intel Labs' User Experience Group. He cut his teeth in imaging at NASA Langley Research Center in the Visual Image Processing lab under Fred Huck, working on imaging system design and optimization. Back then he co-authored a paper titled "Characterizing digital image acquisition devices," better known today as the "slanted-edge test," a de facto standard for measuring camera MTF. Subsequently, he did his PhD at the Optoelectronics Systems Center at the University of Colorado under Prof. Kristina Johnson and joined CDM Optics, a start-up that pioneered Wavefront Coding and the field of computational imaging. Upon CDM's acquisition by OmniVision Technologies, Ram led the effort to productize Wavefront Coding for the mobile phone segment. He then joined Aptina Imaging, where he helped bring the world's first performance 720p reflowable camera modules to market – a complete camera that ships in tape and reel! While at Aptina, Ram also led their effort in array cameras. Ram has a PhD from the University of Colorado-Boulder, an MS from the University of Virginia – Charlottesville, and a BS from the National Institute of Technology – Trichy. He loves this golden age of imaging and is looking forward to ushering in the platinum age.

Video Files:


Tigran Galstian (CTO) and Peter Clark (VP, Optics)


November 19, 2013 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: How the collective behavior of molecules improves mobile imaging

Talk Abstract: LensVector Inc. is a Sunnyvale company producing a new generation of tunable optical elements based on the spatial alignment of molecules, enabling the fabrication of electrically variable "molecular" lenses and prisms. These components are suitable for a number of applications, including providing autofocus capability for miniature digital cameras using no moving parts. We will describe the basic technology behind the LensVector products and some design considerations for camera applications.

More Information:

Speaker's Biography:

Tigran V. Galstian, CTO, LensVector Inc., is involved in the area of academic research on optical materials and components as well as their applications in imaging, telecommunication and energy.

Peter P. Clark, VP Optics, LensVector Inc., does optical system design and engineering. Before joining LensVector, he was with Flextronics, Polaroid, Honeywell, and American Optical. He is a Fellow of the Optical Society of America.

Video Files:


Dr. Douglas Lanman


October 8, 2013 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Near-Eye Light Field Displays

Talk Abstract: Near-eye displays project images directly into a viewer's eye, encompassing both head-mounted displays (HMDs) and electronic viewfinders. Such displays confront a fundamental problem: the unaided human eye cannot accommodate (focus) on objects placed in close proximity. This talk introduces a light-field-based approach to near-eye display that allows for dramatically thinner and lighter HMDs capable of depicting accurate accommodation, convergence, and binocular-disparity depth cues. Such near-eye light field displays depict sharp images from out-of-focus display elements by synthesizing light fields that correspond to virtual scenes located within the viewer's natural accommodation range. Building on related integral imaging displays and microlens-based light-field cameras, we optimize performance in the context of near-eye viewing. Near-eye light field displays support continuous accommodation of the eye throughout a finite depth of field; as a result, binocular configurations provide a means to address the accommodation convergence conflict that occurs with existing stereoscopic displays. This talk will conclude with a demonstration featuring a binocular OLED-based prototype and a GPU-accelerated stereoscopic light field renderer.

More Information:

Speaker's Biography: Douglas Lanman works in the Computer Graphics and New User Experiences groups within NVIDIA Research. His research is focused on computational imaging and display systems, including head-mounted displays (HMDs), automultiscopic (glasses-free) 3D displays, light field cameras, and active illumination for 3D reconstruction. He received a B.S. in Applied Physics with Honors from Caltech in 2002 and M.S. and Ph.D. degrees in Electrical Engineering from Brown University in 2006 and 2010, respectively. Prior to joining NVIDIA, he was a Postdoctoral Associate at the MIT Media Lab from 2010 to 2012 and an Assistant Research Staff Member at MIT Lincoln Laboratory from 2002 to 2005. Douglas has worked as an intern at Intel, Los Alamos National Laboratory, INRIA Rhône-Alpes, Mitsubishi Electric Research Laboratories (MERL), and the MIT Media Lab. He presented the "Build Your Own 3D Scanner" course at SIGGRAPH 2009 and SIGGRAPH Asia 2009, the "Build Your Own 3D Display" course at SIGGRAPH 2010, SIGGRAPH 2011, and SIGGRAPH Asia 2010, and the "Computational Imaging" and "Computational Displays" courses at SIGGRAPH 2012.

Video Files:


Dr. Boyd Fowler


October 1, 2013 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Highlights of the International Image Sensor Workshop

Talk Abstract: In this talk we present the latest trends in image sensors, including developments in CMOS image sensors, CCDs, and single-photon avalanche diode (SPAD) image sensors.

Speaker's Biography: Boyd Fowler was born in California in 1965. He received his M.S.E.E. and Ph.D. degrees from Stanford University in 1990 and 1995, respectively. After finishing his Ph.D. he stayed at Stanford University as a research associate in the Electrical Engineering Information Systems Laboratory until 1998. In 1998 he founded Pixel Devices International in Sunnyvale, California. After selling Pixel Devices to Agilent Technologies, he served as the advanced development manager in the Sensor Solutions Division (SSD) between 2003 and 2005. Between 2005 and 2013 he was the CTO and VP of Technology at Fairchild Imaging/BAE Systems Imaging Solutions. Currently he is a technical program manager at Google in Mountain View, California. He has authored numerous technical papers and patents. His current research interests include CMOS image sensors, low-noise image sensors, noise analysis, low-power image sensors, data compression, and image processing.

Video Files:


Professor Marc Levoy

Stanford University and Google

October 15, 2013 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: What Google Glass means for the future of photography

Talk Abstract: Although head-mounted cameras (and displays) are not new, Google Glass has the potential to make these devices commonplace. This has implications for the practice, art, and uses of photography. So what's different about doing photography with Glass? First, Glass doesn't work like a conventional camera; it's hands-free, point-of-view, always available, and instantly triggerable. Second, Glass facilitates different uses than a conventional camera: recording documents, making visual to-do lists, logging your life, and swapping eyes with other Glass users. Third, Glass will be an open platform, unlike most cameras. This is not easy, because Glass is a heterogeneous computing platform, with multiple processors having different performance, efficiency, and programmability. The challenge is to invent software abstractions that allow control over the camera as well as access to these specialized processors. Finally, devices like Glass that are head-mounted and perform computational photography in real time have the potential to give wearers "superhero vision", like seeing in the dark, or magnifying subtle motion or changes. If such devices can also perform computer vision in real time and are connected to the cloud, then they can do face recognition, live language translation, and information recall. The hard part is not imagining these capabilities, but deciding which ones are feasible, useful, and socially acceptable.

More Information:

Speaker's Biography: Marc Levoy is the VMware Founders Professor of Computer Science at Stanford University, with a joint appointment in the Department of Electrical Engineering. He received a Bachelor's and Master's in Architecture from Cornell University in 1976 and 1978, and a PhD in Computer Science from the University of North Carolina at Chapel Hill in 1989. In the 1970s Levoy worked on computer animation, developing a cartoon animation system that was used by Hanna-Barbera Productions to make The Flintstones, Scooby Doo, and other shows. In the 1980s Levoy worked on volume rendering, a technique for displaying three-dimensional functions such as computed tomography (CT) or magnetic resonance (MR) data. In the 1990s he worked on 3D laser scanning, culminating in the Digital Michelangelo Project, in which he and his students spent a year in Italy digitizing the statues of Michelangelo. Outside of academia, Levoy co-designed the Google book scanner and launched Google's Street View project. His current interests include light fields, optical microscopy, and computational photography - meaning computational imaging techniques that extend the capabilities of digital photography. Awards: Charles Goodwin Sands Medal for best undergraduate thesis (1976), National Science Foundation Presidential Young Investigator (1991), ACM SIGGRAPH Computer Graphics Achievement Award (1996), ACM Fellow (2007).

Video Files:


Patrick Gill

Rambus Labs

September 24, 2013 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Ultraminiature lensless diffractive computational imagers from Rambus Labs

Talk Abstract: Rambus Labs is developing a new class of computational optical sensors and imagers that do not require traditional focusing. We have recently built our first proof-of-concept lensless imagers that exploit spiral phase anti-symmetric diffraction gratings. These gratings produce diffraction patterns on a photodiode array below, and the diffraction patterns contain information about the faraway scene sufficient to reconstruct the scene without ever having to focus incident light. Image resolution, sensor size, low-light performance and wavelength robustness are all improved over previous diffractive computational imagers.
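The reconstruction step the abstract alludes to can be viewed as a linear inverse problem: the diffraction grating turns the scene into a coded measurement, and the image is recovered computationally. The sketch below only illustrates that general idea; the sensing matrix here is a random stand-in, not the actual spiral-grating diffraction pattern, and all names and sizes are invented for the example.

```python
import numpy as np

# Hypothetical illustration: a lensless sensor records a linear mixture
# y = A @ x of the scene x, where A encodes the grating's point responses.
rng = np.random.default_rng(1)
n_scene, n_sensors = 64, 256
A = rng.standard_normal((n_sensors, n_scene))            # stand-in sensing matrix
x_true = rng.random(n_scene)                             # unknown scene
y = A @ x_true + 0.01 * rng.standard_normal(n_sensors)   # noisy readings

# Tikhonov-regularized least squares:
#   x_hat = argmin ||A x - y||^2 + lam ||x||^2
lam = 1e-3
x_hat = np.linalg.solve(A.T @ A + lam * np.eye(n_scene), A.T @ y)
print(np.max(np.abs(x_hat - x_true)))  # small reconstruction error
```

Because the system is overdetermined (256 measurements for 64 unknowns), the scene is recovered accurately without any focusing optics, which is the essence of the computational-imaging trade.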

Speaker's Biography: Patrick R. Gill was a national champion of both mathematics and physics contests in Canada prior to conducting his doctoral work in sensory neuroscience at the University of California at Berkeley (Ph.D. awarded in 2007). He conducted postdoctoral research at Cornell University and the University of Toronto before joining the Computational Sensing and Imaging (CSI) group at Rambus Labs in 2012. He is best known in the optics community for his lead role in inventing the planar Fourier capture array at Cornell University, and he was awarded the Best Early Career Research Paper Award at the 2013 Optical Society of America meeting on Computational Sensing and Imaging.

Video Files:


David Cardinal

Professional photographer, technologist and tech journalist

June 11, 2013 4:15 pm to 5:15 pm

Location: Packard 101

Talk Title: Photography: The Big Picture -- Current innovations in cameras, sensors and optics.

Talk Abstract: It is fitting that our final talk of the year will be an update on the state of camera technologies and how they are playing out in the marketplace. When you are heads-down in one area of technology, it can be easy to lose track of others, even those that might be relevant. A big part of the role of technology journalists like David Cardinal is to look at the big picture and provide this type of context. No area of technology has seen more rapid progress than digital photography. A broad array of new technologies, companies, and products is attacking imaging problems from every direction. David's unique perspective as both an award-winning pro photographer and a veteran tech journalist writing about imaging technologies allows him to speak about many of these developments in the context of how they are faring in the market and where the industry is likely to go from here.

In particular, some of the developments in smartphone imagers, mirrorless imaging, and new imaging form factors like Google Glass will be discussed. The goal is also to have a highly interactive presentation, as many of you in the audience are top researchers in these areas and will want to share your own perspectives.

More Information:

Speaker's Biography: David Cardinal is a professional photographer, technologist, and tech journalist with a decade of experience as a digital travel and nature photographer and over two decades working in high tech -- including software development and management positions at Sun Microsystems and as co-founder and CTO of FirstFloor Software (later part of Calico Commerce). A software developer by training, he co-wrote one of the first image management solutions for digital photographers and has written for PC Mag, Dr. Dobbs, and Outdoor Photographer, among others. His award-winning nature photographs have appeared in dozens of publications around the world.

When he's not writing about the latest new cameras or technology, David leads small group photo safaris to Africa, Asia, Alaska and Texas each year which you can learn about at


Video Files:


Tibor Balogh

Founder and CEO of Holografika

May 30, 2013 4:15 pm

Location: Packard 202

Talk Title: Light Field Display Architectures

Talk Abstract: Holografika multi-projector display systems provide high-quality, hologram-like, horizontal-parallax automultiscopic viewing for multiple observers without the need for tracking or restrictive viewing positions. They range in size from TV-set scale to wall-filling life size. Unlike other glasses-free 3D displays offering only a few perspectives, HoloVizio presents a near-continuous view of the 3D light field free of inversions, discontinuities, repeated views, or their accompanying discomfort – and all this with a nearly 180-degree viewing region. The presentation will highlight design considerations and performance issues, and cover aspects of interactive use.

The latest Holografika display– the HoloVizio 80WLT with a 30" diagonal – will be on site and available for viewing. A 2D video showing this 3D system is available here: Specifications can be seen at

More Information:

Speaker's Biography: Tibor Balogh is the Founder and CEO of Holografika, the world's leading company for light field displays. He is a graduate of the Budapest Technical University and has expertise in the fields of holography, lasers, electro-optical technologies, and engineering. Tibor began his career in software engineering, then moved to an academic position at the Eötvös Loránd University in Budapest. In 1989, he founded Holografika, focusing on developing a proprietary true-3D visualization technology, coming to market with the first HoloVizio system in 2004. Tibor's work has led to his being awarded the Joseph Petzval medal, the Kalmár award, and the Dennis Gabor Prize. He was a World Technology Award finalist in 2006. Tibor holds numerous patents for 3D displays, has authored a large number of publications presenting aspects of his display work and, as CEO of Holografika, coordinates and manages several EU R&D projects related to 3D visualization.

Video Files:


Professor Mark Schnitzer

Stanford University

April 23, 2013 4:15 pm to 5:15 pm

Location: Packard 202

Talk Title: Visualizing the neuronal orchestra: Imaging the dynamics of large-scale neural ensembles in freely behaving mice

Talk Abstract: A longstanding challenge in neuroscience is to understand how populations of individual neurons and glia contribute to animal behavior and brain disease. Addressing this challenge has been difficult partly due to lack of appropriate brain imaging technology for visualizing cellular properties in awake behaving animals. I will describe a miniaturized, integrated fluorescence microscope for imaging cellular dynamics in the brains of freely behaving mice. The microscope also allows time-lapse imaging, for watching how individual cells' coding properties evolve over weeks. By using the integrated microscope to perform calcium-imaging in behaving mice as they repeatedly explored a familiar environment, we tracked the place fields of thousands of CA1 hippocampal neurons over weeks. Spatial coding was highly dynamic, for on each day the neural representation of this environment involved a unique subset of neurons. Yet, the cells within the ~15–25% overlap between any two of these subsets retained the same place fields, and though this overlap was also dynamic it sufficed to preserve a stable and accurate ensemble representation of space across weeks. This study in CA1 illustrates the types of time-lapse studies on memory, ensemble neural dynamics, and coding that will now be possible in multiple brain regions of behaving rodents.

More Information:

Speaker's Biography: Mark Schnitzer is Associate Professor of Biology and Applied Physics and is an Investigator of the Howard Hughes Medical Institute. His research concerns the innovation of novel optical imaging technologies and their use in the pursuit of understanding neural circuits. The Schnitzer lab has invented two forms of fiber-optic imaging, one- and two-photon fluorescence microendoscopy, which enable minimally invasive imaging of cells in deep brain tissues. The lab is further developing microendoscopy technology, studying how experience or environment alters neuronal properties, and exploring two different clinical applications. The group has also developed two complementary approaches to imaging neuronal and astrocytic dynamics in awake behaving animals. Much research focuses on cerebellum-dependent forms of motor learning. By combining imaging, electrophysiological, behavioral, and computational approaches, the lab seeks to understand cerebellar dynamics underlying learning, memory, and forgetting. Further work in the lab concerns neural circuitry in other mammalian brain areas such as hippocampus and neocortex, as well as the neural circuitry of Drosophila.

Video Files:


Dr. Roland Angst

Stanford University

May 28, 2013 4:15 pm to 5:15 pm

Location: Packard 204

Talk Title: Geometry and Semantics in Computer Vision

Talk Abstract: Recent developments in computer vision have led to well-established pipelines for fully automated image- or video-based scene reconstructions. While we have seen progress in 2D scene understanding, 3D reconstructions and scene understanding have evolved independently. Hence, a major trend in computer vision is currently the development of more holistic views which combine scene understanding and 3D reconstructions in a joint, more robust and accurate framework (3D scene understanding).

In my talk, I will present two recent but entirely different approaches that build upon such geometric concepts in order to extract scene semantics. Using a structure-from-motion pipeline and a low-rank factorization technique, the first approach analyzes the motion of rigid parts in order to extract motion constraints between those parts. Such motion constraints reveal valuable information about the functionality of an object; e.g., a rolling motion parallel to a plane is likely due to a wheel. The second approach combines appearance-based superpixel classifiers with class-specific 3D shape priors for joint 3D scene reconstruction and class segmentation. The appearance-based classifiers guide the selection of an appropriate 3D shape regularization term, whereas the 3D reconstruction in turn helps in the class segmentation task. We will see how this can be formulated as a convex optimization problem over labels of a volumetric voxel grid.

More Information:

Speaker's Biography: Roland Angst received his PhD degree from the Swiss Federal Institute of Technology (ETH) Zürich in September 2012 under the supervision of Prof. Marc Pollefeys. As a member of the Computer Vision & Geometry Group, his research focused on geometric computer vision and on subspace models and algorithms in particular. In 2010, he received a prestigious Google European Doctoral Fellowship in Computer Vision. Roland received his Master's degree in computer science with distinction under the supervision of Prof. Markus Gross at the Swiss Federal Institute of Technology Zurich in October 2007.

Roland is currently affiliated with the Max Planck Center for Visual Computing and Communication. As such, he is a visiting assistant professor at Stanford University, where he is a member of Prof. Bernd Girod's Image, Video, and Multimedia Systems Group as well as of Prof. Leonidas J. Guibas' Geometric Computation Group. His primary research interests span computer vision and geometry. Specifically, he is interested in relating geometry and semantics in order to design new and improve existing algorithms in computer vision.

Video Files:


Shailendra Mathur


March 22, 2013 12:00 pm to 1:00 pm

Location: Packard 204

Talk Title: A post-production framework for light-field based story-telling

Talk Abstract: A strong trend in the media industry is the acquisition of higher data resolutions and broader ranges in the color, spatial, and temporal domains. Another associated trend is the acquisition of multiple views over the same domains. Media may be captured from multiple devices, from a single device with high resolution/range capabilities, or from a synthetic source such as a CGI scene. Video productions are evolving from working with a single view of the scene to capturing as much information as possible. The editor and story-teller is the winner in this world, since they can "super-sample the world and edit it later": they can extract the views required for the story they are telling after acquisition has taken place. Supporting these multi-view, multi-resolution data sets is a challenge for traditional editing and data management tools. Well-known editorial and data management techniques have to be applied to multiple media samples of the same scene at different ranges and resolutions in the temporal, color, and spatial domains. This talk presents how concepts from the area of light fields have been used in an editorial and data management system to work with these multi-view sources. A proposed data model and run-time system shows how media from various views can be grouped so that they appear as a single light field source to the editing and data management functions. The goal of the talk is to encourage a discussion on how this post-production framework can be used to make productions with light field acquisition a reality.
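The grouping idea in the abstract can be sketched as a tiny data model in which many views of one scene are addressed as a single logical source. This is only an illustration of the concept; the class and field names below are invented, not Avid's actual data model.

```python
from dataclasses import dataclass, field

@dataclass
class View:
    """One capture of the scene (hypothetical fields)."""
    camera_id: str
    frame_rate: float        # temporal resolution
    resolution: tuple        # spatial resolution (width, height)

@dataclass
class LightFieldSource:
    """Groups many views of the same scene so that editing and data
    management functions can treat them as one light field source."""
    name: str
    views: list = field(default_factory=list)

    def add_view(self, view: View) -> None:
        self.views.append(view)

    def extract(self, camera_id: str) -> View:
        # "Super-sample the world, edit later": select a view after acquisition.
        return next(v for v in self.views if v.camera_id == camera_id)

scene = LightFieldSource("take_042")
scene.add_view(View("camA", 24.0, (4096, 2160)))
scene.add_view(View("camB", 120.0, (1920, 1080)))
print(scene.extract("camB").frame_rate)  # 120.0
```

The point of the design is that editorial operations reference the group (`LightFieldSource`), while the choice of a specific range or resolution is deferred to extraction time.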

Speaker's Biography: Shailendra Mathur is the Chief Architect for video products at Avid with technology oversight over the editing, video server and broadcast graphics products. With over 18 years of experience in the media industry and a research background in the area of computer vision and medical imaging, Shailendra has contributed to a wide gamut of technical products and solutions in the media space. Beyond his responsibilities in product development, his research and engineering interests have led to multiple publications and patents in the areas of computer vision, medical imaging, visual effects, graphics, animation, media players and high performance compute architectures. Over the past few years understanding the art and science of stereoscopy, color, high frame rates, high resolutions, and applying them to storytelling tools has been a passion.

Video Files:


Ricardo Motta


April 9, 2013 4:15 pm to 5:15 pm

Location: Packard 204

Talk Title: Chimera: NVIDIA's Mobile Computational Photography Architecture

Talk Abstract: Chimera, NVIDIA's computational photography architecture, was created to increase the flexibility of camera image processing pipelines and support new capture methods. This flexibility has become essential for the support of new sensors, including multi-camera and plenoptic methods, for many new Bayer-space algorithms, such as HDR and non-local-means noise reduction, as well as for post-demosaic algorithms such as image merging and local tone mapping. Chimera achieves this flexibility by creating a simplified framework to mix and match the ISP, the GPU, and the CPU, allowing, for example, the GPU to be used in Bayer space before the ISP for noise reduction, and the CPU after the ISP for scaling. In this talk we will discuss how the evolution of sensors and capture methods is shaping mobile imaging, and provide an overview of Chimera and the Always-on-HDR method.

Speaker's Biography: Ricardo Motta is a Distinguished Engineer at NVIDIA Corp. and the CTO responsible for imaging technology and roadmap for the Tegra mobile products. He is a graduate of the Imaging and Photographic Science program at RIT, where he developed the first colorimetric model for computer-driven CRTs. He joined HP Labs in 1987 and was HP's first color imaging scientist, spearheading the development of HP's core color imaging products and technology, including the first color printers, copiers, cameras, and eventually the sRGB standard. From 1996 to 1999 he was a Chief Architect for HP's imaging business, then the world's largest. In 1999 he left HP to start Pixim Inc., a pioneer in computational photography, where he was VP and CTO, leading the development of the first HDR video chipset. He is on the board of advisors of the Munsell Color Science Lab, and is a past vice president of IS&T.

Video Files:


Professor Audrey Ellerbee

Stanford University

May 14, 2013 4:15 pm to 5:15 pm

Location: Packard 204

Talk Title: New Designs for the Next Generation Optical Coherence Tomography Systems

Talk Abstract: In its nearly 25-year history, optical coherence tomography (OCT) has gained appreciable recognition as a technology capable of non-invasively imaging the microstructure of biological tissue. The rapid penetration of OCT into the clinical market has both been driven by and itself stimulated new technological advances in the field, with the end result of a highly informative, patient-friendly diagnostic modality. Recently, our own group has taken a systematic approach to re-engineering the traditional OCT system, both to understand and to overcome some of the known limitations of current designs. In this talk, I provide a perspective on historical developments in the field, our recent contributions, and new application areas.

Speaker's Biography: Audrey K Ellerbee, PhD is a Gabilan Fellow and Assistant Professor of Electrical Engineering at Stanford University. She received her BSE in Electrical Engineering from Princeton University, her PhD in Biomedical Engineering from Duke University and completed her postdoctoral training in Chemistry and Chemical Biology at Harvard University. During her career, Dr. Ellerbee spent a short time as an International Fellow at Ngee Ann Polytechnic in Singapore and as a Legislative Assistant in the United States Senate through the AAAS Science and Technology Policy Fellows Program sponsored by OSA and SPIE. She is a member of the OSA and SPIE and is the recipient of numerous awards, including most recently the Air Force Young Investigator Award and the Hellman Faculty Scholars Award.

Dr. Ellerbee directs the Stanford Biomedical Optics group, whose mission is to develop and deploy novel tools for optical imaging at the microscale and nanoscale. The group’s applications of interest span clinical and basic science domains with a particular interest in the development of low-cost, portable technologies suited for use in poorly resourced environments. Building on expertise and experience with interferometry, they aim to create innovative technologies that serve as integral complements to the toolkits of biologists and clinicians, as well as use their own technologies to study various cellular phenomena relevant to disease.

Video Files:


Professor Jitendra Malik

University of California at Berkeley

May 7, 2013 4:15 pm to 5:15 pm

Location: Packard 204

Talk Title: The Three R's of Computer Vision: Recognition, Reconstruction and Reorganization

Talk Abstract: Over the last two decades, we have seen remarkable progress in computer vision with demonstration of capabilities such as face detection, handwritten digit recognition, reconstructing three-dimensional models of cities, automated monitoring of activities, segmenting out organs or tissues in biological images, and sensing for control of robots and cars. Yet there are many problems where computers still perform significantly below human perception. For example, in the recent PASCAL benchmark challenge on visual object detection, the average precision for most 3D object categories was under 50%.

I will argue that further progress on the classic problems of computational vision (recognition, reconstruction, and reorganization) requires us to study the interaction among these processes. For example, recognition of 3D objects benefits from a preliminary reconstruction of 3D structure, instead of just treating it as a 2D pattern classification problem. Recognition is also reciprocally linked to reorganization, with bottom-up grouping processes generating candidates that combine with top-down activations of object and part detectors. In this talk, I will show some of the progress we have made towards the goal of a unified framework for the three R's of computer vision. I will also point towards some of the exciting applications we may expect over the next decade as computer vision starts to deliver on even more of its grand promise.

Speaker's Biography: Jitendra Malik was born in Mathura, India in 1960. He received the B.Tech degree in Electrical Engineering from the Indian Institute of Technology, Kanpur in 1980 and the PhD degree in Computer Science from Stanford University in 1985. In January 1986, he joined the University of California at Berkeley, where he is currently the Arthur J. Chick Professor in the Computer Science Division, Department of Electrical Engineering and Computer Sciences. He is also on the faculty of the Department of Bioengineering, and the Cognitive Science and Vision Science groups. During 2002-2004 he served as the Chair of the Computer Science Division, and during 2004-2006 as the Department Chair of EECS. He serves on the advisory board of Microsoft Research India, and on the Governing Body of IIIT Bangalore.

Prof. Malik’s research group has worked on many different topics in computer vision, computational modeling of human vision, computer graphics and the analysis of biological images, resulting in more than 150 research papers and 30 PhD dissertations. Several well-known concepts and algorithms arose in this research, such as anisotropic diffusion, normalized cuts, high dynamic range imaging, and shape contexts. According to Google Scholar, seven of his papers have received more than a thousand citations each, and he is one of ISI’s Highly Cited Researchers in Engineering.

He received the gold medal for the best graduating student in Electrical Engineering from IIT Kanpur in 1980 and a Presidential Young Investigator Award in 1989. At UC Berkeley, he was selected for the Diane S. McEntyre Award for Excellence in Teaching in 2000, a Miller Research Professorship in 2001, and appointed to be the Arthur J. Chick Professor in 2002. He received the Distinguished Alumnus Award from IIT Kanpur in 2008. He was awarded the Longuet-Higgins Prize for a contribution that has stood the test of time twice, in 2007 and in 2008. He is a fellow of the IEEE and the ACM, and a member of the National Academy of Engineering.



Dr. Thomas Vogelsang


April 16, 2013 4:15 pm to 5:15 pm

Location: Packard 204

Talk Title: Binary Pixel and High Dynamic Range Imaging

Talk Abstract: Modern image sensors, especially in devices such as mobile phones, use smaller and smaller pixels. The resulting reduced full-well capacity significantly limits the achievable dynamic range. Spatial and temporal oversampling based on the concept of binary pixels is a way to overcome this limitation: it achieves high dynamic range imaging in a single exposure using small pixels.
The concept of binary pixels was initially proposed by Eric Fossum, now at Dartmouth College. Feng Yang et al. at EPFL developed the first theory of binary oversampling based on photon statistics. We have developed these initial proposals further to make them feasible in today's image sensor technology, while maintaining the theoretical grounding in photon statistics that has been key to the development of this technology.
In this talk I will describe the theory behind binary pixels and show how to achieve good low-light sensitivity together with high dynamic range. I will then give examples of how this technology can be implemented using today's technology and show a roadmap for taking advantage of further developments in image sensor technology.
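As a rough illustration of the photon-statistics view (a hypothetical simulation, not the speaker's implementation), each binary sub-pixel, or "jot", fires once it collects at least one photon, and the fraction of fired jots over an oversampled group encodes light level with a soft, saturation-resistant response:

```python
import numpy as np

rng = np.random.default_rng(0)

def binary_pixel_response(mean_photons, n_jots=4096, threshold=1):
    """Fraction of binary sub-pixels ('jots') that fire in one exposure.
    Photon arrivals are Poisson; a jot fires once it collects
    at least `threshold` photons."""
    photons = rng.poisson(mean_photons, size=n_jots)
    return (photons >= threshold).mean()

# Scene luminance spanning five decades: the aggregate response
# compresses softly (approximately 1 - exp(-lambda) for threshold 1)
# instead of clipping hard at a small full-well capacity.
intensities = np.logspace(-2, 3, 7)
responses = [binary_pixel_response(lam) for lam in intensities]
```

The oversampled group never saturates abruptly: the response keeps rising (ever more slowly) with luminance, which is the mechanism behind the extended dynamic range described in the abstract.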

Speaker's Biography: Dr. Thomas Vogelsang is a member of the Computational Sensing and Imaging group of Rambus Labs, where he has been leading the work on binary pixel technology since early 2011. Before his imaging work, his main research interests were DRAM architecture and DRAM core design. Prior to joining Rambus in 2009, he worked for Qimonda in Burlington, VT, where he led the development of low-power DRAM. Before that he was a researcher at the DRAM development alliance of Infineon, IBM, and Toshiba. He received his diploma and Ph.D. degrees in physics from the Technical University of Munich, Germany. Thomas Vogelsang is a senior member of the IEEE.



Professor Greg Asner

Department of Global Ecology, Carnegie Institution for Science, USA and the Department of Environmental Earth System Science, Stanford University

January 29, 2013 4:15 pm to 5:15 pm

Location: Packard 204

Talk Title: Imaging Ecosystems with the Carnegie Airborne Observatory

Talk Abstract: Following four billion years of evolution on Earth, the most biologically diverse biome to have arisen on land is the great swath of tropical forest straddling the equator, which inspires the naturalist in millions of people planet-wide. Yet ironically, tropical forests continue to undergo enormous losses and untold biotic rearrangements in patterns and rates that defy current scientific inquiry and conservation initiatives. It is also ironic that most of our understanding of tropical forest composition, structure and function comes from very sparsely distributed field plots, whilst our understanding of tropical deforestation is gleaned from myopic satellite remote sensing metrics of forest cover. There exists a persistent, and increasingly problematic, void in our understanding of tropical forest dynamics at the scale in which organisms disperse, reproduce, migrate and go extinct.

To address this problem, Dr Asner developed the Carnegie Airborne Observatory (CAO). The CAO is designed to assist in the measurement and interpretation of organismic-scale processes that mediate forest carbon dynamics, disturbance regimes, biosphere-atmosphere interactions, and evolutionary processes across large tropical regions. In June 2011, the team launched CAO AToMS, the next-generation technology in a unique line of advanced airborne measurement systems. Inaugural campaigns with AToMS were recently completed, including mapping in the Western Amazon, Mesoamerica, and Southern Africa. Here Dr Asner presents a brief history of the CAO program and some new insights derived from its unique observations. He also sheds light on how 3D laser-guided spectroscopic imaging of ecosystems can transform scientific knowledge, inspire artistic expression and communication, and accelerate both conservation actions and climate-change policy initiatives.


Speaker's Biography: Dr Greg Asner is a faculty member in the Carnegie Institution’s Department of Global Ecology and a Professor in the Department of Environmental Earth System Science at Stanford University. His scientific background spans the fields of evolutionary biology, ecosystems ecology, biogeochemistry, and remote sensing. Dr Asner’s research interests range from global deforestation rates to the evolution of chemical compounds in tropical forest canopies. He also seeks practical ways to reduce biodiversity losses through international climate agreements and emerging carbon markets. The aircraft and satellite mapping techniques developed by Dr Asner and his team are widely recognized as among the most innovative for exploring and monitoring the Earth's changing ecosystems.


Andrew (Beau) Watson

NASA Ames Research Center

March 19, 2013 4:15 pm to 5:15 pm

Location: Packard 204

Talk Title: High Frame Rate Movies and Human Vision

Talk Abstract: “High Frame Rate” is the new big thing in Hollywood. Directors James Cameron and Douglas Trumbull extol its virtues. Peter Jackson’s movie “The Hobbit,” filmed at 48 Hz, twice the industry standard, has recently been released.
In this talk I will provide a scientific look at the role of frame rate in the visual quality of movies. My approach is to represent the moving image, and the various steps involved in its capture, processing, and display, in the spatio-temporal frequency domain. This transformation exposes the artifacts generated in the process and makes it possible to predict their visibility. The prediction is accomplished by means of a tool we call the Window of Visibility, a simplified representation of the human spatio-temporal contrast sensitivity function. With the aid of the Window of Visibility, the movie-making process, including frame rate, can be optimized for the eye of the beholder.

Speaker's Biography: Andrew B. Watson is the Senior Scientist for Vision Research at NASA Ames Research Center in California. He is the author of over 100 papers and six patents on topics in vision science and imaging technology. In 2001, he founded the Journal of Vision, where he serves as Editor-in-Chief. Dr. Watson is a Fellow of the Optical Society of America, of the Association for Research in Vision and Ophthalmology, and of the Society for Information Display. He is Vice Chair for Vision Science and Human Factors of the International Committee on Display Measurement. In 2007 he received the Otto Schade Award from the Society for Information Display, and in 2008 the Special Recognition Award from the Association for Research in Vision and Ophthalmology. In 2011, he received the Presidential Rank Award from the President of the United States.



Harlyn Baker

February 12, 2013 4:15 pm to 5:15 pm

Location: Packard 204

Talk Title: Capture Considerations for Multi-view Panoramic Cameras

Talk Abstract: This presentation will discuss methods and insights from a lab/business-unit collaborative project in multi-view imaging and display. Our goal was to better understand the potential of new immersive technologies by developing demonstrations and experiments that we performed within the context of corporate customer interests. The demonstrations centered around using multi-viewpoint capture and binocular and multiscopic display for immersive -- very large scale -- 3D entertainment experiences.

Image acquisition used our wide-VGA and high-definition Herodion multi-camera capture system. 3D display at very large scale used tiled projection systems with polarization to permit binocular stereo viewing and, at a smaller scale, multiple pico-type projectors providing autostereo viewing.

Much of what will be described are lessons learned outside of success, as we have taken our gear into the field for life-sized 3D capture and presentation to large audiences -- a dangerous place for unproven technologies. This is an area where much is yet to be learned, and since a good way to change that is to just boldly go, our experiments have yielded insights and accolades in equal measure. Still, we view this as a progression, and we are confident that the end is a worthwhile pursuit, if not an eventuality, with success promising huge commercial potential.

Speaker's Biography: Harlyn Baker has a long history in multiple-image computer vision, from early 3D modeling in Edinburgh, to stereo in his PhD at Champaign-Urbana and the Stanford AI Lab, a dozen years at SRI where he co-developed Epipolar Plane Image (EPI) Analysis, four at Interval Research, and a dozen more at HP Labs.

Harlyn's EPI work, widely called seminal, has been instrumental in most later developments in light field analysis, including RaySpace and Hogel formulations. On leaving Interval Research, he co-founded TYZX. He joined HP Labs in 2000, where he designed and developed camera systems to support multi-view studies, demonstrated automultiscopic imaging and display systems for 3D interaction and immersive experiences, and most recently explored how 3D capture can impact print photography. Harlyn recently took an early retirement package from Hewlett-Packard Laboratories, and by summer will take up a position at the University of Heidelberg.



C. David Tobie


February 19, 2013 4:15 pm to 5:15 pm

Location: Packard 204

Talk Title: Why Color Management is Hard: Lessons from the Trenches

Talk Abstract: Color management is often considered a science, but it has many applied aspects and even a few artistic functions. This lecture will compare and contrast affordable color management's practical commercial side with the science behind it and the art it's used to create. Examples from the product line David Tobie has worked to create will provide opportunities to describe the relationship of these products to color science, optics, and industry standards, as well as to mass production, fast-changing commercial markets, and demanding but non-scientific end users.

Speaker's Biography: C. David Tobie has been involved in color management and digital imaging since their early development. David has worked to see affordable solutions put in place for graphic design, prepress, photography, and digital imaging, and has taught users how best to utilize them. He has consulted internationally for a wide range of color-related companies, and is best known by photographers for his writing and technical editing of texts and periodicals for the photo industry, such as Mastering Digital Printing and Professional Photographer magazine, and for his seminars on color and imaging at photographic workshops around the globe. David is currently Global Product Technology Manager at Datacolor, where he develops new products and features for their Spyder line of calibration tools. His work has received a long line of digital imaging product awards, including the coveted TIPA award and a nomination for the DesignPreis. He was recognized by Microsoft as an MVP in Print and Imaging, 2007-2010. Much of David's recent writing can be found on his photography blog, and samples of his photography are available online.



Jim DeFilippis

March 12, 2013 4:15 pm to 5:15 pm

Location: Packard 204

Talk Title: Frame Rate and the Perception of Moving Pictures

Talk Abstract: Television and movies are unique in the world of imaging in part due to the temporal dimension. This is evident in the 'frame rate' of the moving image medium; what is less known and understood is the effect of frame rate on the viewer's perception of the medium. In addition, there are issues related to frame rate on the capture end of the creative process as well as on the display side. Further complicating the situation are the effects of modern image compression, which exploits temporal redundancy to reduce the required data capacity.

My talk will discuss the tradeoffs and limitations of moving picture capture and display and their effect on viewer perception.

Speaker's Biography: Jim is an independent consultant to media and broadcasting. He has worked in radio and television broadcasting for over 32 years, including at the ABC Television and Radio Network, the Advanced Television Test Center, the Atlanta Olympic Broadcast Organization, and most recently as EVP of Digital Television Technologies and Standards for the FOX Television Group. His main focus is on 3D TV, high frame rate video production, Mobile DTV, and digital file-based workflows. Past activities include the development of progressive camera systems to replace film for television, 480p30 video systems (FOX Widescreen), and MPEG-2 splicing system design and deployment for the FOX Network. Before FOX, Jim was Head of Engineering for the 1996 Atlanta Olympic Games, where he championed the development of the first all-digital, disk-based, super slow-motion camera/recording system (Panasonic/EVS). Jim has been involved with the Olympics since 1993 and has participated in five Olympic Games; most recently he worked for the London Olympic 3D channel. He is the author of several papers on digital television, progressive scanning cameras, and digital media workflow.

Jim attended the School of Engineering at Columbia University in the City of New York, where he attained his Bachelor of Science in electrical engineering in 1980 and his Master of Science in electrical engineering in 1990.

Mr. DeFilippis is a Fellow of the SMPTE and is involved in standards development at the International Telecommunications Union. He is active in ATSC standards work including A/85 RP for Audio Loudness Control for DTV. In 2012 Jim received the David Sarnoff Award from the SMPTE for his contributions to improving television technologies.



Entertainment Technology in the Internet Age (2013)

June 18, 2013 to June 19, 2013

Location: Stanford University

Talk Title: Entertainment Technology in the Internet Age

Talk Abstract: Entertainment technology development and content deployment has historically been the purview of Hollywood and traditional broadcast media. However, rapid convergence of technology improvements in connectivity, bandwidth, and media processing coupled with consumer interest has caused a surge in media distribution over the web.

The Stanford Center for Image Systems Engineering is partnering with the Society for Motion Picture and Television Engineers to produce a two-day conference on "Entertainment in the Internet Age" (ETIA).

Through a series of panel discussions and presentations, with ample opportunity for audience participation, the ETIA conference will examine topics within the areas of Internet-focused content creation, distribution, and monetization, as well as technical tools and solutions for shaping the user experience.

More Information:



Professor Shih-Fu Chang

Electrical Engineering and Computer Science Digital Video and Multimedia Lab at Columbia University

December 4, 2012 4:15 pm to 5:15 pm

Location: Packard 204

Talk Title: Recent Advances of Compact Hashing for Large-Scale Visual Search

Talk Abstract: Finding nearest-neighbor data in high-dimensional spaces is a common yet challenging task in many applications, such as stereo vision, image retrieval, and large graph construction. A scalable indexing scheme is needed to achieve satisfactory accuracy, search complexity, and storage cost, but the well-known curse of dimensionality has made existing solutions impractical. Recent advances in locality sensitive hashing show promise by hashing high-dimensional features into a small number of bits while preserving proximity in the original feature space. The approach has attracted great attention due to its simplicity of implementation (linear projection only), constant search time, low storage cost, and applicability to diverse features and metrics. In this talk, I will first survey a few recent methods that extend basic hashing to incorporate label information through supervised and semi-supervised hashing, and that employ hyperplane hashing for finding nearest points to subspaces (e.g., planes). I will then demonstrate the practical utility of compact hashing in solving several challenging problems of large-scale mobile visual search: low bandwidth, limited processing power on mobile devices, and the need to search large databases on servers. Finally, we study the fundamental questions of high-dimensional search: how is nearest-neighbor search performance affected by data size, dimension, and sparsity, and can we predict the performance of hashing methods over a data set before implementation? (Joint work with Junfeng He, Sanjiv Kumar, Wei Liu, and Jun Wang.)
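A minimal sketch of the random-hyperplane ("sign of dot product") flavor of locality sensitive hashing that the abstract describes; all sizes and names here are illustrative, not from the talk. Each feature vector is projected onto a few random directions and only the signs are kept, so nearest-neighbor search reduces to cheap Hamming distance on short bit codes:

```python
import numpy as np

rng = np.random.default_rng(1)

def lsh_codes(X, hyperplanes):
    """Hash each row of X to a binary code: bit i is the sign of the
    dot product with random hyperplane i (cosine-similarity LSH)."""
    return (X @ hyperplanes.T > 0).astype(np.uint8)

dim, n_bits = 64, 32
hyperplanes = rng.normal(size=(n_bits, dim))

database = rng.normal(size=(1000, dim))
codes = lsh_codes(database, hyperplanes)

# A near-duplicate of item 42 lands at (almost) the same code, so the
# match is found by comparing 32 bits instead of 64 floats per item.
query = database[42] + 0.001 * rng.normal(size=dim)
q_code = lsh_codes(query[None, :], hyperplanes)[0]

hamming = (codes != q_code).sum(axis=1)
best = int(np.argmin(hamming))
```

Nearby vectors rarely straddle a random hyperplane, which is why proximity in the original space survives the projection to bits.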

Speaker's Biography: Shih-Fu Chang is the Richard Dicker Chair Professor, Director of the Digital Video and Multimedia Lab, and Senior Vice Dean of the Engineering School at Columbia University. His research focuses on multimedia retrieval, computer vision, signal processing, and machine learning. He and his students have made important contributions to the field of content-based visual retrieval and have developed several well-known image/video search engines for web images, consumer videos, news videos, and mobile products. He has received the ACM SIG Multimedia Technical Achievement Award, the IEEE Kiyo Tomiyasu Award, an IBM Faculty Award, and Service Recognition Awards from IEEE and ACM. His research has been supported by government agencies (NSF, DARPA, IARPA, NGA, ONR, NY State) as well as many industry sponsors. He is a Fellow of the IEEE and the American Association for the Advancement of Science.


Dr. Peter Vajda

Max Planck Center for Visual Computing and Communications at Stanford

November 20, 2012 4:15 pm to 5:15 pm

Location: Stanford University

Talk Title: Personalized TeleVision News

Talk Abstract: In this presentation we demonstrate a platform for personalized television news to replace the traditional one-broadcast-fits-all model. We forecast that next-generation video news consumption will be more personalized, device agnostic, and pooled from many different information sources. The technology for our research represents a major step in this direction, providing each viewer with a personalized newscast with stories that matter most to them. We believe that such a model can provide a vastly superior user experience and provide fine-grained analytics to content providers. While personalized viewing is increasingly popular for text-based news, personalized real-time video news streams are a critically missing technology.

Speaker's Biography: Peter Vajda is a visiting assistant professor in the Department of Electrical Engineering at Stanford University. He received his doctorate from the Ecole Polytechnique Fédérale de Lausanne (EPFL) in Switzerland. He was involved in several European projects, such as the Visnet II Network of Excellence (Networked Audiovisual Media Technologies), K-Space (The Knowledge Space of Technology to Bridge the Semantic Gap), and PetaMedia (Peer-to-Peer Tagged Media). Within these European projects, he worked on object detection for video surveillance applications, image segmentation evaluation, and metadata propagation based on object duplicate detection. His current research interests include large-scale image retrieval systems for mobile visual search applications and personalized real-time video news.


Professor Laura Waller

University of California at Berkeley

December 11, 2012 4:15 pm to 5:15 pm

Location: Packard 204

Talk Title: Phase imaging with partially coherent light

Talk Abstract: This talk will describe computational phase imaging methods based on intensity transport, with a focus on imaging systems using partially coherent illumination (e.g. optical and X-ray microscopes). Knowledge of propagation dynamics allows quantitative recovery of wavefront shape with very little hardware modification. The effect of spatial coherence in typical and ‘coherence engineered’ systems will be explored. All of these techniques use partially coherent light, whose wave-fields are inherently richer than coherent (laser) light, having many more degrees-of-freedom. Measurement and control of such high dimensional beams will allow new applications in bioimaging and metrology, as well as bring new challenges for efficient algorithm design.
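For background (this equation is standard in the intensity-transport literature and is not quoted from the abstract), such phase imaging methods typically build on the transport-of-intensity equation (TIE), which relates the measurable axial change of intensity to the phase being recovered:

```latex
\nabla_{\!\perp}\cdot\bigl(I(\mathbf{r})\,\nabla_{\!\perp}\phi(\mathbf{r})\bigr)
  = -\,k\,\frac{\partial I(\mathbf{r})}{\partial z},
\qquad k = \frac{2\pi}{\lambda}
```

Here $I$ is the intensity, $\phi$ the wavefront phase, and $\nabla_{\!\perp}$ the transverse gradient. Estimating $\partial I/\partial z$ from images at two or more slightly defocused planes is what allows quantitative recovery of wavefront shape with very little hardware modification.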

Speaker's Biography: Laura Waller is the newest faculty member in the Department of Electrical Engineering and Computer Sciences (EECS) at UC Berkeley, where she is building a lab in computational optical imaging with a focus on higher-dimensional wave-fields (such as partially coherent or nonlinear beams). She was a Postdoctoral Research Associate in Electrical Engineering and Lecturer of Physics at Princeton University from 2010-2012, and received B.S., M.Eng., and Ph.D. degrees in EECS from the Massachusetts Institute of Technology (MIT) in 2004, 2005, and 2010, respectively, where she was a SMART student (Singapore-MIT Alliance for Research and Technology).



Lingfei Meng

Ricoh Innovations, Inc.

January 15, 2013 4:15 pm to 5:15 pm

Location: Packard 204

Talk Title: System Model and Performance Evaluation of Spectrally Coded Plenoptic Camera

Talk Abstract: Plenoptic camera architectures are designed to capture the 4D light field of a scene and have been used for applications such as digital refocusing and depth estimation. The plenoptic architecture can also be modified to collect multispectral images in a single snapshot by inserting a filter array in the pupil plane of the main lens. In this talk I will first introduce an end-to-end imaging system model for a spectrally coded plenoptic camera. I will then present our prototype, developed from a modified DSLR camera with a microlens array on the sensor and a filter array in the main lens. Finally, I will show results based on both simulations and measurements obtained from the prototype.

Speaker's Biography: Lingfei Meng is a research scientist at Ricoh Innovations, where he is creating technology for computational imaging systems. He received the Ph.D. degree in imaging science from the Rochester Institute of Technology, Rochester, NY, in 2012, completing his thesis on polarimetric imaging system modeling and optimization, and the B.S. degree in electrical engineering from Tianjin University, China, in 2007. He won the 2011 IEEE Data Fusion Contest for his work on object tracking using satellite imagery.



Scott Daly, Timo Kunkel, Xing Sun, Suzanne Farrell, and Poppy Crum

Dolby Laboratories

November 27, 2012 4:15 pm to 5:15 pm

Location: Packard 204

Talk Title: Viewer preference statistics for shadow, diffuse, specular, and emissive luminance limits of high dynamic range displays

Talk Abstract: A subjective study was performed to find minimum and maximum display luminances based on viewer preferences. The motivation was to find values based on real-world structured image content, as opposed to the geometric test patterns commonly used in the literature, and to find values relevant for the display of video content. The test images were specifically designed, both in scene set-up and in capture techniques (i.e., HDR multiple-exposure merging), to test these limits without the usual perceptual conflicts of contrast induction, the Stevens effect, the Hunt effect, and contrast/sharpness interactions. The display was designed to render imagery at extreme ranges of luminance and high contrast, from 0.004 to 20,000 cd/m2, to avoid the usual elevation of black level with increasing brightness. The image signals were captured, represented, and processed to avoid the common unintended signal distortions of clipping, contrast reduction, and tonescale shape changes. The image range was broken into diffuse reflective and highlight regions. Preferences were studied, as opposed to detection thresholds, to provide results more directly relevant to viewers of media. Statistics of the preferences will be described, as opposed to solely reporting mean and standard deviation values. As a result, we believe these results are robust to future hardware capabilities in displays.

Speaker's Biography: Timo Kunkel is a color and imaging research engineer at Dolby Labs, Inc. in Sunnyvale, CA. His main areas of research are perception-based color models, high dynamic range imaging, advanced display technologies, and psychophysics. Before moving to the U.S., he worked for Dolby in Vancouver, Canada, investigating the dynamic range limits of the human visual system and creating novel color appearance models. Over the last twelve years, he also worked as an architecture and landscape photographer for clients in Europe and the U.S. and was co-founder and head of pre-press imaging of a German children’s book publisher. He received his PhD in Computer Science from the University of Bristol, UK and an MSc in Geography and Climatology from the University of Freiburg in Germany.

Scott Daly received an M.S. in Bioengineering from the University of Utah in 1984, completing a neurophysiology thesis on retinal temporal processing. He then worked in the Imaging Science Division at Eastman Kodak in the fields of image compression, image fidelity models, and image watermarking. Next, he worked at Sharp Laboratories of America in Washington State, where he led a group on display algorithms. Becoming a research fellow, he applied visual models towards digital video and displays, with publications on spatiotemporal and motion imagery, human interaction with wall-sized displays, and stereoscopic displays. These topics led him to join Dolby Laboratories in 2010 to focus on overall fundamental perceptual issues, and efforts to preserve artistic intent throughout the entire video path to reach the viewer. He is currently a member of



Dr. Hiroshi Shimamoto

NHK Science & Technology Research Laboratories

October 30, 2012 4:15 pm to 5:15 pm

Location: Packard 204

Talk Title: 120 Hz-frame-rate SUPER Hi-VISION Capture and Display Devices

Talk Abstract: SUPER Hi-VISION is the next-generation ultra-high definition television (UHDTV) broadcast system, with 33 megapixels of resolution (7,680 pixels by 4,320 lines, 16 times that of HDTV; some call this resolution '8K'). The frame frequency had been 60 Hz, but recently the frame rate was doubled to 120 Hz to improve motion portrayal quality. This new UHDTV system, which we call "full-spec" SUPER Hi-VISION, has been standardized as Recommendation ITU-R BT.2020.
In this talk I will describe the world's first 120-Hz SUPER Hi-VISION devices, which we have developed. The first is a 120-Hz SUPER Hi-VISION image-capture device that uses three 120-Hz, 33-megapixel CMOS image sensors. The sensor uses 12-bit ADCs and operates at a data rate of 51.2 Gbit/s. Our unique ADC technology achieves high-speed operation and low power consumption at the same time. The second is a 120-Hz SUPER Hi-VISION display system that uses three 8-megapixel LCOS chips and e-shift technology. These 120-Hz SUPER Hi-VISION devices were exhibited at our open house in May and at IBC2012 in September 2012, showing superb picture quality with less motion blur.
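As a back-of-the-envelope check of the quoted per-sensor data rate (the overhead interpretation at the end is my assumption, not a claim from the abstract):

```python
# Raw active-video data rate for one 33-megapixel, 12-bit, 120-Hz sensor.
pixels_per_frame = 7680 * 4320   # 33.2-megapixel sensor
bits_per_sample = 12             # 12-bit ADCs
frames_per_second = 120          # 120-Hz frame rate

raw_gbps = pixels_per_frame * bits_per_sample * frames_per_second / 1e9
# raw_gbps is about 47.8 Gbit/s of active video, consistent with the
# quoted 51.2 Gbit/s once readout overhead (e.g. blanking/framing,
# an assumption on my part) is included.
```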

Speaker's Biography: Hiroshi Shimamoto received the B.E. degree in electronic engineering from Chiba University in 1989, and M.E. and Ph.D. degrees in information processing from the Tokyo Institute of Technology in 1991 and 2008, respectively. In 1991, he joined NHK (Japan Broadcasting Corporation). Since 1993, he has been working on research and development of UHDTV cameras and image sensors at the NHK Science & Technology Research Laboratories. He is a Senior Research Engineer responsible for developing 120-Hz UHDTV image sensors. In 2005-2006, he was a visiting scholar at Stanford University. He is a member of the IEEE.



Professor Hamid Aghajan

Stanford University

October 16, 2012 4:15 pm to 5:15 pm

Location: Packard 204

Talk Title: Camera Networks for Ambient Intelligence: Personal Recommendations via Behavior Modeling

Talk Abstract: Vision offers rich information about events involving human activities in applications from gesture recognition to occupancy reasoning. Multi-camera vision allows for applications based on 3D perception and reconstruction, offers improved accuracy through feature and decision fusion, and provides access to different attributes of the observed events through camera task assignment. However, the inherent complexities in vision processing stemming from perspective view and occlusions, as well as setup and calibration requirements have challenged the creation of meaningful applications that can operate in uncontrolled environments. Moreover, the task of studying user acceptance criteria such as privacy management and the implications in visual ambient communication has for the most part stayed out of the realm of technology design, further hindering the roll-out of vision-based applications in spite of the available sensing, processing, and networking technologies.
Smart environments are spaces that sense, perceive, and react to the presence, commands, or observed events of their occupants through a variety of interfaces and offer services such as multimedia, home control, or pervasive communications, as well as accident detection and well-being applications. The notion of ambient intelligence refers to endowing such systems with unobtrusive and intuitive interfaces as well as mechanisms to learn and adapt to the behavior models and preferences of their user in order to offer context-aware and customized services tailored to the user needs.
A user-centric design paradigm in creating vision-based applications considers the user acceptance and social aspects of the intended solution as part of the design effort. Adaptation to the user’s set of preferences and behavior model, seamless and intuitive interfaces, automated setup and configuration, ease of use, awareness of the context, and responsiveness to the user’s privacy options are some of the attributes of a user-centric design. The incorporation of user’s behavior model and preferences into a reasoning system which employs demographic data or expert guidelines can lead to a personal recommendation system that supports the user in activities of daily life at home or in the office.
Novel opportunities in application development for smart homes and offices in the areas related to well-being, automation, and experience sharing in social networks are enabled by employing such user-centric approaches. A few examples of application development based on the mentioned concepts will be discussed.


Speaker's Biography: Hamid Aghajan is director of Stanford's AIR (Ambient Intelligence Research) Lab and the Wireless Sensor Networks Lab. He has served as consulting faculty in the Department of Electrical Engineering at Stanford University since 2003. His group's research focuses on methods and applications of ambient intelligence, with an emphasis on behavior modeling based on activity monitoring. Specific research topics include adaptive energy-efficient automation in smart homes and offices, occupancy modeling of smart buildings for resource efficiency, detection of anomalies or shifts in behavior for elderly care, improving users' well-being at home and in the office through personalized recommendations, and avatar-based social interactions.
Hamid is Editor-in-Chief of "Journal of Ambient Intelligence and Smart Environments", and has served as guest editor for IJCV, IEEE Trans. on Multimedia, CVIU, and IEEE J-STSP. Hamid was general chair of ACM/IEEE ICDSC 2008 and technical program chair of ICDSC 2007. He has organized workshops, special sessions, or tutorials at ECCV, ACM MM, CVPR, ICCV, ICMI-MLMI, FG, ECAI, EI, and ICASSP. Hamid received his MS and PhD degrees from Stanford University.


Professor Hagit Hel-Or

University of Haifa

June 5, 2012 4:15 pm to 5:15 pm

Location: Packard 204

Talk Title: Photometric Invariant Pattern Matching

Talk Abstract: Multi-modal image alignment (e.g. CT-MRI, SAR-Visual, Visual-IR) is ubiquitous in medical imaging, satellite imaging, and military and security applications, and involves matching images that differ significantly in their photometric content. Shadow removal from images and videos is crucial for object detection and recognition as well as for tracking applications. Typical approaches rely on a background model of the scene and a functional mapping between shadow and background regions. Patch-based applications such as image denoising, image super-resolution, image retargeting, image summarization and many more can greatly benefit and improve performance when patches are allowed to undergo photometric changes.

Common to all these applications is the need to compare and evaluate differences between images or image patches while disregarding photometric changes occurring between them. Classic metrics such as the Euclidean norm, Normalized Grayscale Correlation and others do not successfully compensate for photometric changes, particularly under non-linear tone mappings. In this talk, a fast pattern matching scheme termed Matching by Tone Mapping (MTM) is introduced that allows matching under photometric transformations, including non-linear and non-monotonic tone mappings. We exploit the recently introduced Slice Transform to implement a fast computational scheme whose running time is similar to that of the fast implementation of Normalized Cross Correlation (NCC). In fact, the MTM measure can be viewed as a generalization of NCC to non-linear mappings, and it reduces to NCC when the mappings are restricted to be linear. MTM is shown to be invariant to any non-linear photometric tone mapping, and is empirically shown to be highly discriminative and robust to noise.
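To make the baseline concrete, here is a minimal NumPy sketch of plain NCC, the linear special case that MTM generalizes. This illustrates only the classic metric discussed above, not the speaker's MTM implementation, and all names are our own.

```python
import numpy as np

def ncc(patch, window):
    """Normalized cross-correlation between two equal-sized patches.

    NCC is invariant to *linear* tone mappings (gain and offset);
    MTM, as described in the talk, extends this invariance to
    arbitrary non-linear tone mappings and reduces to NCC in the
    linear case.
    """
    p = patch.astype(float).ravel()
    w = window.astype(float).ravel()
    p = p - p.mean()   # remove offset
    w = w - w.mean()
    denom = np.linalg.norm(p) * np.linalg.norm(w)  # remove gain
    if denom == 0:
        return 0.0
    return float(np.dot(p, w) / denom)

# A linear tone mapping (gain 2, offset 30) leaves NCC unchanged:
rng = np.random.default_rng(0)
patch = rng.random((8, 8))
remapped = 2.0 * patch + 30.0
print(ncc(patch, remapped))  # ~1.0
```

A non-linear remapping (e.g. squaring the intensities) generally lowers this score, which is precisely the gap MTM is designed to close.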

Example applications that will be shown include multi-modal image alignment and shadow removal.

Speaker's Biography: Hagit Hel-Or received her PhD in Math and Computer Science from The Hebrew University of Jerusalem, Israel, in 1994. She did a postdoc in the Math and Computer Science Department at Bar-Ilan University and later in the Dept of Psychology at Stanford University, hosted by Prof. Brian Wandell. Since 1998 she has been a faculty member in the Department of Computer Science at the University of Haifa, Israel, and an affiliate of the Institute of Information Processing and Decision Making (IIPDM) in the Dept of Psychology at the University of Haifa. She has published over 70 papers in leading journals and major conferences and holds several U.S. patents. She is an Associate Editor of the Pattern Recognition Journal and a co-organizer of the yearly Israeli Computer Vision Conference. Her research interests in the area of Image Processing and Computer Vision include Pattern Recognition, Color Vision, Imaging Technologies, and Computational and Human Vision.


Dr. Jennifer Gille and Dr. Manu Parmar

Qualcomm MEMS Technologies, Inc.

May 29, 2012 4:15 pm to 5:15 pm

Location: Packard 204

Talk Title: The Mirasol Display and Advantages of Multi-State Color

Talk Abstract: An IMOD (Interferometric Modulator) is an interference-based MEMS device that can be used to create a low-power, sunlight-readable, color video display. The IMOD element is a folded etalon whose basic components are a partial reflector, a full reflector, and a gap between them. For fixed geometry and materials, the spectral reflectance of an IMOD element depends on the size of the gap. Since the full reflector component is a mirror, its reflectance is non-Lambertian. Just as in butterfly wings and peacock feathers, this property gives the appearance of glowing color.
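The gap dependence can be illustrated with a toy two-beam interference model. This is our own simplification for illustration only: it ignores multiple reflections, material dispersion, and phase shifts on reflection, and is not QMT's actual optical model.

```python
import numpy as np

def toy_etalon_spectrum(gap_nm, wavelengths_nm):
    """Toy two-beam interference model of an IMOD-like etalon.

    The wave reflected off the partial reflector interferes with the
    wave reflected off the full mirror after a round trip across the
    gap (extra path 2*gap, so phase 4*pi*gap/lambda).  Reflectance
    peaks where the two waves add constructively: lambda = 2*gap/m.
    """
    phase = 4.0 * np.pi * gap_nm / wavelengths_nm
    return np.abs(1.0 + np.exp(1j * phase)) ** 2 / 4.0  # normalized 0..1

wl = np.linspace(400.0, 700.0, 601)  # visible range, 0.5 nm steps
for gap in (200.0, 250.0, 325.0):
    peak = wl[np.argmax(toy_etalon_spectrum(gap, wl))]
    print(f"gap {gap:5.1f} nm -> peak reflectance near {peak:.1f} nm")
```

Even in this crude model, changing only the gap moves the reflectance peak from blue through green to red, which is the mechanism the abstract describes.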

We will describe how a display can be designed around such mirror elements, starting with a conventional three-color, three-subpixel RGB pixel geometry. In this ordinary geometry a given location (in this case, a mirror) can only assume one of two states: a color or black. However, a more efficient use of area can be realized if the same mirror is allowed to assume more than two states. In one instance, for example, there may continue to be three subpixels, but each mirror can take on one of three states (color 1, color 2, black). In another, there is no subpixel structure, and each mirror can take on one of many states. We will show the improvement in image quality as we go from the conventional 2-state geometry to multi-state geometries.

Speaker's Biography: Jennifer Gille, senior staff engineer with Qualcomm MEMS Technologies, Inc. (QMT), oversees image quality performance and color processing in R&D for QMT displays. Prior to joining Qualcomm, Jennifer was at NASA Ames, where she served as a senior scientist working with Jim Larimer as part of the ViDEOS team. She conducted vision research and developed software tools for display designers. Jennifer served on the faculty at the University of California, Santa Cruz, where she taught perception and experimental psychology. She was also a researcher in the Visual Perception group at SRI International. She holds a bachelor's degree in mathematics and a Ph.D. in vision science, with an emphasis on color and spatial vision, from UCLA. She is a member of SID, IS&T, SPIE and OSA.


Professor Gaurav Sharma

Electrical and Computer Engineering Department, University of Rochester

May 15, 2012 4:15 pm to 5:15 pm

Location: Packard 204

Talk Title: Imaging Arithmetic: Physics U Math > Physics + Math

Talk Abstract: For several real-world problems, signal and image processing approaches are most successful when they combine the insight offered by the physics underlying the problem with the mathematical framework and tools inherent in digital signal and image processing. Electronic imaging systems are a particularly fertile ground for problems in this class because they deal specifically with the capture of physical scenes and with the reproduction of images on physical devices. In this presentation, we highlight specific examples of problems in electronic imaging for which the combination of physical insight, mathematical tools, and engineering ingenuity leads to particularly elegant and effective solutions.

We illustrate the above ideas in some depth using a number of case studies drawn from our research in electronic imaging, in each case highlighting how the combination of physical modeling/insight with mathematical analysis enables solutions that neither of these tools alone can provide. The examples cover a wide range of applications, including methods for show-through cancellation in scanning, print watermarks detectable by viewers without any viewing aids, multiplexed images that are revealed under varying illumination, improved metrics for the accuracy of color capture devices, and color halftone separation estimation from scans.

Speaker's Biography: Gaurav Sharma is an associate professor in the Electrical and Computer Engineering Department at the University of Rochester, where, from 2008 to 2010, he also served as the Director of the Center for Emerging and Innovative Science (CEIS), a New York State funded center located at the University of Rochester and chartered with promoting economic development through university-industry technology transfer. He received the PhD degree in Electrical and Computer Engineering from North Carolina State University, Raleigh, in 1996. From 1993 through 2003, he was with the Xerox Innovation group in Webster, NY, most recently in the position of Principal Scientist and Project Leader. His research interests include color science and imaging, multimedia/print security, and bioinformatics. He is the Editor-in-Chief of the Journal of Electronic Imaging and the editor of the Digital Color Imaging Handbook, published by CRC Press in 2003. He has also served as an associate editor for the Journal of Electronic Imaging, the IEEE Transactions on Image Processing, and the IEEE Transactions on Information Forensics and Security. Dr. Sharma is a senior member of the IEEE, a member of IS&T, and has been elected to Sigma Xi, Phi Kappa Phi, and Pi Mu Epsilon. He is a member of the Image, Video, and Multi-dimensional Signal Processing (IVMSP) and Information Forensics and Security (IFS) technical committees of the IEEE Signal Processing Society, and served as Chair of the former committee in 2010-11. In recognition of his research contributions, he received an IEEE Region I technical innovation award in 2008.


Dr. Kathrin Berkner

Ricoh Innovations, Inc.

April 24, 2012 4:15 pm to 5:15 pm

Location: Packard 204

Talk Title: Design framework for a plenoptic camera using wave propagation techniques

Talk Abstract: Plenoptic cameras are designed to capture different combinations of light rays from a scene, sampling its light field and enabling applications such as digital refocusing and depth estimation. Image formation models for such cameras are typically based on geometric optics. Little work has been done so far on analyzing diffraction effects and aberrations of the optical system. We demonstrate simulation of a plenoptic camera optical system using wave propagation analysis and show how diffraction effects influence performance when a spectral filter mask is inserted into the main lens.

Speaker's Biography: Kathrin Berkner is research manager of the Computational Optics & Visual Processing Group at Ricoh Innovations, where she is responsible for creating technology for computational imaging and sensing systems, from optical and digital design, to prototyping, to making the systems practical for use in commercial applications.
Before working on computational imaging systems, Kathrin developed technologies for restoration and enhancement of images, reformatting of documents and images to adapt to small displays, and graphical models for structuring document information for social media. Prior to joining Ricoh, she was a Postdoctoral Researcher at Rice University, Houston, TX. Kathrin holds a Ph.D. degree in mathematics from the University of Bremen, Germany.


Professor Thomas Wiegand

Berlin Institute of Technology and Fraunhofer HHI

March 20, 2012 4:15 pm to 5:15 pm

Location: Packard 204

Talk Title: Towards Measurement of Perceived Differences Using EEG

Talk Abstract: An approach towards the direct measurement of video quality perception using electroencephalography (EEG) is presented. Subjects viewed video clips while their brain activity was registered using EEG. The presented video signals contained compressed as well as uncompressed video sequences. The distortions were introduced by a hybrid video codec. Subjects had to indicate whether or not they had perceived a quality change. In response to a quality change, a voltage change in EEG was observed for all subjects.

Potentially, a neuro-technological approach to video assessment could lead to a more objective quantification of quality change detection, overcoming the limitations of subjective approaches (such as subjective bias and the requirement of an overt response). Furthermore, it allows for real-time applications wherein the brain response to a video clip is monitored while it is being viewed.

Speaker's Biography: Thomas Wiegand is a professor in the Department of Electrical Engineering and Computer Science at the Berlin Institute of Technology, where he chairs the Image Communication Laboratory, and jointly heads the Image Processing department of the Fraunhofer Institute for Telecommunications - Heinrich Hertz Institute, Berlin, Germany. He received the Dipl.-Ing. degree in Electrical Engineering from the Technical University of Hamburg-Harburg, Germany, in 1995 and the Dr.-Ing. degree from the University of Erlangen-Nuremberg, Germany, in 2000. He joined the Heinrich Hertz Institute in 2000 as head of the Image Communication group in the Image Processing department. His research interests include video processing and coding, multimedia transmission, and computer vision and graphics. Since 1995, he has been an active participant in multimedia standardization, with successful submissions to ITU-T VCEG, ISO/IEC MPEG, 3GPP, DVB, and IETF. He is currently a visiting professor at Stanford University.


Professor Giacomo Langfelder

Politecnico di Milano, Italy

January 30, 2012 4:15 pm to 5:15 pm

Location: Packard 204

Talk Title: What would you do with a tunable sensor… if you had it?

Talk Abstract: Over the past year, several research groups have turned their attention to sensors with multiple and/or selectable primaries, for applications in spectral image capture or simply for more faithful color acquisition. In principle, a sensor with continuously tunable spectral responses would be a very elegant way to adapt the sensor response to different scene conditions and/or acquisition modes.
In this context, our laboratory is developing a CMOS image sensor with tunable pixels. Each pixel is directly sensitive to three or more tunable spectral bands in the visible or near-infrared range. This is achieved by exploiting the wavelength dependence of light penetration in silicon, together with a specific pixel structure featuring transverse, electrically reconfigurable collection trajectories.
In this presentation I will discuss the technological steps toward the development of a full sensor of tunable, reconfigurable pixels, and I will share my thoughts on possible applications, with comparisons to other examples in the scientific literature.
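The depth-dependent separation principle can be sketched with a simple Beer-Lambert absorption model. The absorption depths below are rough, illustrative order-of-magnitude values for silicon, not the speaker's device parameters.

```python
import math

# Rough 1/e absorption depths in silicon (illustrative values only):
# blue light is absorbed within a fraction of a micron, while red
# light penetrates several microns.
DEPTH_UM = {"blue(450nm)": 0.4, "green(550nm)": 1.5, "red(650nm)": 3.3}

def absorbed_fraction(depth_um, z0_um, z1_um):
    """Fraction of incident photons absorbed between depths z0 and z1
    under a Beer-Lambert model I(z) = exp(-z / depth)."""
    return math.exp(-z0_um / depth_um) - math.exp(-z1_um / depth_um)

# Photons collected in a shallow layer (0..0.5 um) vs a deep layer
# (2..6 um): shallow collection favors blue, deep collection favors red.
for name, d in DEPTH_UM.items():
    shallow = absorbed_fraction(d, 0.0, 0.5)
    deep = absorbed_fraction(d, 2.0, 6.0)
    print(f"{name}: shallow {shallow:.2f}, deep {deep:.2f}")
```

Steering the collection depth electrically, as the abstract describes, therefore amounts to choosing which part of this depth-resolved spectrum each pixel reads out.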

Speaker's Biography: Giacomo Langfelder received the M.S. and Ph.D. degrees, in 2005 and 2009 respectively, from Politecnico di Milano, Italy, where he is currently an Assistant Professor with the Department of Electronics and Information Technology. His research on sensors focuses on novel semiconductor radiation detectors on the one hand, and on micro- and nano-electromechanical systems and their associated readout electronics on the other. He is a co-inventor of a few patents on color-sensitive imaging detectors and on new methods for white balancing and image processing. He is the author of about 40 publications in refereed scientific journals and conference proceedings.


Dr. Ren Ng


February 10, 2012 4:15 pm to 5:15 pm

Location: Packard 204

Talk Title: The Lytro Consumer Light Field Camera

Talk Abstract: Lytro is a startup in Mountain View that has recently introduced the first light field camera for consumers. Light field cameras provide many new capabilities, including the ability to focus pictures after the shot is taken. The underlying technology is based on founder Ren Ng's PhD dissertation on light field photography, which he completed at Stanford in 2006. This talk will present the company, technology and product.

Speaker's Biography: Ren Ng is the founder and CEO of Lytro, and developed the underlying technology during grad school. Ren's Ph.D. research on light field technology earned the field’s top honor, the ACM Doctoral Dissertation Award for best thesis in computer science and engineering, as well as Stanford University’s Arthur Samuel Award for Best Ph.D. Dissertation. Ren holds a Ph.D. in computer science and a B.S. in mathematical and computational science from Stanford University.


CEO Candice Brown Elliott


February 28, 2012 4:15 pm to 5:15 pm

Location: Packard 204

Talk Title: Reducing Field Sequential Color Break-Up Artifacts using a Hybrid Display with Locally Desaturated Virtual Primaries

Talk Abstract: Recent work by several investigators has explored means of reducing the visibility of Field Sequential Color (FSC) Color Break-Up (CBU) on local-dimming RGB LED BackLight Units (BLU). The most promising approach appears to be locally desaturated color primaries, in which each local zone uses a set of mixed-color "virtual primaries" formed from combinations of the RGB LED BLU primaries. The colors in the image region above each locally dimmable zone are surveyed for the smallest triangle of locally desaturated virtual primaries that bounds all of the colors found in that zone. Since the colors within a zone are typically similar to one another, the resulting triangle is smaller than that formed by the original RGB LED primaries. Each field thus exhibits locally desaturated virtual primaries that, being closer together in color space, reduce the visibility of CBU. In developing these algorithms, researchers have noted that the zones must be very small and numerous in order to find triangles small enough to reduce CBU to acceptable levels while still reconstructing all colors.

In researching the statistics of many potential TV images that such displays may exhibit, I noted that while the brighter colors within a zone could be contained by reasonably small, locally desaturated virtual primaries, the darker colors required a larger virtual-primary triangle, too large to effectively reduce CBU. I hypothesized in early 2006 that combining locally desaturated virtual primaries with a color-filtered panel, similar to our PenTile RGBW but with more W subpixels and fewer color-filtered subpixels, would allow a smaller virtual triangle to bound the brighter colors. The color filters would reproduce the darker, more saturated colors lying outside the local FSC virtual-primary triangle. This hybrid approach would eliminate CBU while reproducing a wider local color gamut using larger, less numerous local dimming zones, thereby containing cost. Since then, my colleagues at Nouvoyance and Samsung and I have been working assiduously to develop the algorithms, first in simulation and later in an R&D prototype of the selected architecture.
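As a rough illustration of why zone color statistics matter, the toy function below estimates how far a zone's virtual primaries could be pulled toward white from the zone's maximum HSV-style saturation. This is a crude proxy of our own devising, not the bounding-triangle search developed at Nouvoyance and Samsung.

```python
import numpy as np

def max_desaturation(zone_rgb):
    """Toy proxy for how far a zone's virtual primaries can be pulled
    toward white: a primary triangle shrunk toward white by a factor
    alpha can still roughly cover colors whose HSV-style saturation
    (max-min)/max does not exceed 1 - alpha.
    """
    rgb = np.asarray(zone_rgb, dtype=float).reshape(-1, 3)
    mx = rgb.max(axis=1)
    mn = rgb.min(axis=1)
    sat = np.where(mx > 0, (mx - mn) / np.maximum(mx, 1e-12), 0.0)
    return 1.0 - float(sat.max())

# A zone of near-neutral colors tolerates heavy desaturation ...
print(max_desaturation([[0.8, 0.7, 0.75], [0.5, 0.5, 0.45]]))
# ... while a zone containing a fully saturated red tolerates none.
print(max_desaturation([[0.9, 0.0, 0.0], [0.4, 0.4, 0.4]]))  # -> 0.0
```

The second case mirrors the observation in the abstract: a single dark, saturated color in a zone forces the virtual-primary triangle wide open, which is exactly what the hybrid color-filtered subpixels are meant to absorb.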

During Display Week 2011, Samsung demonstrated in its exhibit-hall booth an R&D prototype of a PenTile RGBCW hybrid field sequential display. This display combined a locally dimmable RGB BLU with a multiprimary color-filtered LCD. Half of the subpixels are clear (W), enabling a locally desaturated virtual-primary FSC method of color reproduction. The other half of the subpixels are multiprimary color filtered, providing a larger color gamut, at reduced brightness, than the locally desaturated virtual-primary triangle. The two color reproduction methods are fully operable and blended together at all times, stabilizing the FSC and further reducing CBU. The display uses subpixel rendering of the color-filtered subpixels, which, combined with a fully populated W subpixel lattice, reconstructs images using two subpixels per pixel on average.

Work on the algorithms for our hybrid system continues, now incorporating an RGBE multiprimary backlight. Theory of operation and the results of this work will be presented.

Speaker's Biography: Candice Brown Elliott is CEO of Nouvoyance, an employee-owned, independent, fabless semiconductor firm that develops real-time subpixel rendering and color processing chip cores for PenTile displays and related technologies, in close partnership with Samsung. Ms. Elliott founded Clairvoyante in July 2000 to develop and license enhanced display architectures and subpixel rendering technology, which was sold to Samsung in March 2008. Ms. Elliott is a 35-year veteran of the display and semiconductor industries, having held positions in R&D, manufacturing, and engineering management at Fairchild, Advanced Micro Devices, Planar Systems, and the Micro Display Corporation. Ms. Elliott has been granted 63 U.S. patents. She earned a dual B.S. in Physics and Psychology in 1982 from the University of the State of New York and attended Stanford University Graduate School in the Materials Science Dept through the Honors COOP program the following year.


Professor Lorne Whitehead

University of British Columbia

February 21, 2012 4:15 pm to 5:15 pm

Location: Packard 204

Talk Title: Improving the CIE Color Rendering Index – how this can be done and why it matters.

Talk Abstract: The concept of color rendering of light sources is surprisingly subtle, and many find it difficult to fully understand. This matters because there is an unavoidable trade-off between the color rendering quality of light sources and their luminous efficacy, a fact that has significant economic and social consequences. A related issue is that recent human-factors research shows the current CIE Color Rendering Index (CRI) does a poor job of assessing the color rendering quality of the light from many LEDs. We have found that at least part of the problem is non-uniformity of the spectral sensitivity of the current CRI metric. We have developed an improved computational procedure that eliminates that problem, and numerous groups are now assessing it. The hope is that a better measure of color rendering will assist researchers in establishing the importance of lighting quality, as opposed to quantity, in turn leading to more pleasant and sustainable interior environments.

Speaker's Biography: Lorne A. Whitehead received a Ph.D. in physics from the University of British Columbia and is a registered Professional Engineer. His career has emphasized innovation in science, business, and administration. From 1983 to 1993 he served as CEO of TIR Systems Ltd., a university spin-off company that grew to 200 employees and was recently purchased by Philips Corp. Since joining UBC in 1994, he has been a Professor and held an NSERC Industrial Research Chair in the Department of Physics and Astronomy, carrying out studies of the optical, electrical, and mechanical properties of micro-structured surfaces, a field in which he holds more than 100 patents. His technology is used in many computer screens and televisions (via licenses to 3M and Dolby Laboratories) and he also helped to start three other companies based on his research. In addition to his scientific work at UBC, Dr. Whitehead has held a number of senior university administrative positions.


Dr. Vic Nalwa

President, FullView, Inc.

March 6, 2012 4:15 pm to 5:15 pm

Location: Packard 204

Talk Title: FullView Cameras

Talk Abstract: Panoramic and spherical-view cameras today are by and large one of three types:
1. A single camera with wide-angle optics, such as a fisheye lens or a curved mirror;
2. Multiple cameras looking out directly in different directions; or
3. Multiple cameras looking out off flat mirrors, which is FullView's patented approach.
Whereas multiple cameras offer much higher resolution than any single camera, multiple cameras looking out directly are in general incapable of producing seamless, artifact-free, blur-free composite images, because of parallax. FullView evades parallax through its patented approach, in which multiple cameras look out off flat mirrors such that all the cameras are effectively looking out in different directions but from the same single viewpoint. As a result, FullView's composite images, whether video or still, and irrespective of their resolution, are always seamless, artifact-free and blur-free, and they provide much higher resolution than outwardly pointing cameras.

Speaker's Biography: Vic Nalwa is President of FullView, which he cofounded with Lucent Technologies in 2000. He invented the original FullView camera at Bell Labs in 1995, in recognition of which he was elected a Fellow of the IEEE in 2004.
He attended, in sequence, St. Columba's High School, New Delhi, India, where he skipped his senior year to attend the Indian Institute of Technology (IIT), Kanpur, India, where he received the B.Tech. degree as The Best Graduating Student in Electrical Engineering in 1983; and then Stanford University, which he attended on an ISL Research Fellowship and from where he received the M.S. and Ph.D. degrees in Electrical Engineering in 1985 and 1987, respectively.
From 1987 to 2000, he was a Principal Investigator at Bell Labs Research. There, upon a dare, over the summer of 1993, he designed and implemented a system for automatic on-line signature verification whose equal-error rate was less than a tenth that of three competing systems developed at Bell Labs Research. For this, he was thereafter afforded complete freedom of research by the President of Bell Labs, freedom that culminated in the launch of FullView in 2000. In 1989, he was concurrently on the faculty of Electrical Engineering at Princeton University. He is the author of the text A Guided Tour of Computer Vision, Addison-Wesley, MA, 1993.


Professor Hany Farid

Department of Computer Science, Dartmouth

January 31, 2012 4:15 pm to 5:15 pm

Location: Packard 204

Talk Title: Photo Forensics

Talk Abstract: From tabloid magazines to the fashion industry, mainstream media outlets, political campaigns, courtrooms, and the photo hoaxes that land in our email inboxes, doctored and highly retouched photographs are appearing with growing frequency and sophistication. The resulting lack of trust is impacting law enforcement, national security, the media, e-commerce, and public health. The nascent field of digital photo forensics has emerged to help regain some trust in digital photographs. In the absence of any digital watermarks or signatures, we work on the assumption that most forms of tampering will disturb some statistical or geometric property of an image. To the extent that these perturbations can be quantified and detected, they can be used to invalidate or authenticate a photo. I will provide a broad overview of our work in this area. I will also describe a new perceptually meaningful rating system that rates photos of people on the degree to which they have been retouched.

Speaker's Biography: Hany Farid received his undergraduate degree in Computer Science and Applied Mathematics from the University of Rochester in 1989. He received his Ph.D. in Computer Science from the University of Pennsylvania in 1997. Following a two year post-doctoral position in Brain and Cognitive Sciences at MIT, he joined the faculty at Dartmouth in 1999, where he is currently a professor of computer science. Hany is also the Chief Technology Officer and co-founder of Fourandsix Technologies, Inc. Hany is the recipient of an NSF CAREER award, a Sloan Fellowship and a Guggenheim Fellowship.


Professor Norimichi Tsumura

Department of Information and Image Sciences, Chiba University

January 26, 2012 4:15 pm to 5:15 pm

Location: Packard 204

Talk Title: Physics and Physiologically Based Skin Color Image Analysis and Synthesis

Talk Abstract: Reproduced skin appearance, such as color, texture and translucency, depends on imaging devices, illuminants and environments. As a result of recent progress in color management technology, imaging devices and the color of an illuminant can be calibrated via device profiles to achieve high-fidelity appearance reproduction. However, high-fidelity reproduction is not always effective in the practical imaging systems used for facial imaging; therefore, additional functions for color, texture and translucency reproduction are required for high-quality facial image reproduction. We call these functions E-cosmetic functions, and we believe E-cosmetic functions should be based on physics-based and physiologically based processing. In this talk, therefore, physics-based and physiologically based image processing is introduced, built on the extraction of specular, hemoglobin, melanin and shading information from the skin color image.
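The pigment-extraction idea can be sketched as a linear unmixing in optical-density (log-RGB) space, which is the structure underlying this line of work. The component directions below are made-up illustrative numbers; in the speaker's research the actual directions are estimated from image data (e.g. by independent component analysis).

```python
import numpy as np

# Hypothetical absorbance directions in optical-density (-log RGB)
# space for melanin and hemoglobin.  Illustrative values only.
MELANIN = np.array([0.25, 0.45, 0.80])
HEMOGLOBIN = np.array([0.15, 0.70, 0.35])

def unmix(rgb):
    """Decompose a skin color into melanin/hemoglobin densities.

    Model: optical density -log(RGB) is approximately a linear
    combination of the two pigment components; solved by least squares.
    """
    density = -np.log(np.clip(np.asarray(rgb, float), 1e-6, 1.0))
    A = np.stack([MELANIN, HEMOGLOBIN], axis=1)  # 3x2 mixing matrix
    weights, *_ = np.linalg.lstsq(A, density, rcond=None)
    return weights  # [melanin_amount, hemoglobin_amount]

# Synthesize a skin color with known pigment amounts, then recover them.
true_m, true_h = 0.6, 0.3
rgb = np.exp(-(true_m * MELANIN + true_h * HEMOGLOBIN))
m, h = unmix(rgb)
print(m, h)  # recovers ~0.6, ~0.3
```

Once the pigment maps are separated, an E-cosmetic function can, for example, edit the melanin channel independently of hemoglobin before re-synthesizing the image.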

Speaker's Biography: Norimichi Tsumura was born in Wakayama, Japan, on 3 April 1967. He received the B.E., M.E. and Dr. Eng. degrees in applied physics from Osaka University in 1990, 1992 and 1995, respectively. He moved to the Department of Information and Computer Sciences, Chiba University, in April 1995 as an assistant professor. He was a visiting scientist at the University of Rochester from March 1999 to January 2000. He has been an associate professor in the Department of Information and Image Sciences, Chiba University, since February 2002, and was also a researcher in PRESTO, Japan Science and Technology Corporation (JST), from December 2001 to November 2005. He received the Optics Prize for Young Scientists (The Optical Society of Japan) in 1995, the Applied Optics Prize for excellent research and presentation (The Japan Society of Applied Optics) in 2000, and the Charles E. Ives Award (Journal Award: IS&T) in 2002. He is interested in color image processing, computer vision, computer graphics and biomedical optics. His highest-impact journal paper is "Independent component analysis of skin color image," Norimichi Tsumura et al., Journal of the Optical Society of America A, Vol. 16, No. 9, pp. 2169-2176 (1999); its results are used in practical applications for cosmetic evaluation and demonstration. His second-highest-impact journal paper is "Image-Based Control of Skin Melanin Texture," Norimichi Tsumura et al., Applied Optics, Vol. 45, Issue 25, pp. 6626-6633 (2006); its results are likewise used in practical applications for cosmetic evaluation and demonstration. Many alumni of his laboratory now work at major optics and imaging companies.


Dr. Quan Huynh-Thu

Technicolor Research & Innovation, France

January 20, 2012 4:15 pm to 5:15 pm

Location: Packard 204

Talk Title: Quality assessment of stereoscopic 3D content

Talk Abstract: Stereoscopic three-dimensional (S3D) video content has generated excitement in both the cinema and television industries, as it is considered a key feature that can significantly enhance the visual experience. However, one of the major challenges to the deployment of 3D is the difficulty of providing high-quality images that are also comfortable to view while meeting signal transmission requirements. The different processing steps that are necessary in a 3D-TV delivery chain can all introduce artifacts that may create problems in terms of human visual perception. A means of measuring perceptual video quality in an accurate and practical way is therefore of the highest importance for content providers, service providers, and display manufacturers. However, the measurement of 3D video quality is not a simple extension of 2D video quality measurement. A review of recent advances in 3D video quality assessment is provided and challenges are discussed. An outline of ongoing efforts in standards-related bodies is also provided.

Speaker's Biography: Quan Huynh-Thu received the Dipl.-Ing. degree in electrical engineering from the University of Liège (Belgium), the M.Eng. degree in electronics engineering from the University of Electro-Communications (Japan), and the Ph.D. degree in electronic systems engineering from the University of Essex (U.K.). He is currently Senior Scientist at Technicolor Research & Innovation, France. Prior to that, from 1997 to 2000, he was Research Scientist in the Image and Signal Processing Lab at the Belgian Forensic Institute. He was awarded a postgraduate fellowship from the Japanese Ministry of Education and, from 2000 to 2003, was Researcher at the University of Electro-Communications, Tokyo. From 2003 to 2010, he was Senior Research Engineer at Psytechnics Ltd (U.K.), where he co-developed the ITU-T Recommendation J.247 for the objective measurement of perceptual video quality. His current research interests include 3D video quality assessment, human factors, and visual attention. He has been actively contributing to the work of the International Telecommunication Union (ITU) and the Video Quality Experts Group (VQEG) since 2004. He is currently co-chair of the VQEG 3D-TV and Multimedia groups, and Rapporteur for Study Question 2 in ITU-T Study Group 9.


Dr. Akiko Yoshida

Sharp Corporate Research and Development

November 14, 2011 4:15 pm to 5:15 pm

Location: Packard Room 312

Talk Title: High Fidelity Color Image Capture and Display

Talk Abstract: We describe an imaging system that includes a camera and a display with high color fidelity. The spectral sensitivities of the camera were modified in order to satisfy the Luther-Ives condition and the display has five color primaries (red, green, blue, yellow and cyan) that produce a wider color gamut than conventional displays. The entire imaging process, from colorimetric image capture to display rendering, functions in real time.

Speaker's Biography: Akiko Yoshida received her B.S. degree from the University of Aizu in 2000 and her M.Sc. degree from Universität des Saarlandes in 2004. She joined Max-Planck-Institut für Informatik as a Ph.D. fellow and was awarded her Ph.D. degree in 2008. She joined SHARP Corporation as a researcher at the Corporate Research and Development Group. Her main topics of research concern human perception and its applications for image processing. She is a member of SID and ACM.


Professor Thomas Wiegand

Berlin Institute of Technology and Fraunhofer HHI

December 13, 2011 4:15 pm to 5:15 pm

Location: Packard 204

Talk Title: 3D Video Coding

Talk Abstract: New techniques for 3D video coding for stereoscopic and autostereoscopic 3D displays are presented, in which a small number of video views and potentially associated depth maps are coded. Based on the coded signals, additional views required for displaying the 3D video on an autostereoscopic display can be generated by Depth Image Based Rendering techniques. The developed coding scheme represents an extension of HEVC, similar to the MVC extension of H.264/AVC. However, in addition to the well-known disparity-compensated prediction, advanced techniques for inter-view and inter-component prediction, the representation of depth blocks, and the encoder control for depth signals have been integrated. In comparison to simulcasting the different signals using HEVC, the proposed approach provides about 40% and 50% bit rate savings for the tested configurations with 2 and 3 views, respectively. Bit rate reductions of about 20% have been obtained in comparison to a straightforward multiview extension of HEVC without the newly developed coding tools.

Speaker's Biography: Thomas Wiegand is a professor at the department of Electrical Engineering and Computer Science at the Berlin Institute of Technology, chairing the Image Communication Laboratory, and is jointly heading the Image Processing department of the Fraunhofer Institute for Telecommunications - Heinrich Hertz Institute, Berlin, Germany. He received the Dipl.-Ing. degree in Electrical Engineering from the Technical University of Hamburg-Harburg, Germany, in 1995 and the Dr.-Ing. degree from the University of Erlangen-Nuremberg, Germany, in 2000. He joined the Heinrich Hertz Institute in 2000 as the head of the Image Communication group in the Image Processing department. His research interests include video processing and coding, multimedia transmission, as well as computer vision and graphics. Since 1995, he has been an active participant in standardization for multimedia with successful submissions to ITU-T VCEG, ISO/IEC MPEG, 3GPP, DVB, and IETF. He is currently a visiting professor at Stanford University.


Dr. Kunal Ghosh


December 6, 2011 4:15 pm to 5:15 pm

Location: Packard 204

Talk Title: Brain imaging using an integrated fluorescence microscope

Talk Abstract: There is a rising emphasis today on the role of neural circuitry in neuropsychiatric disease. To understand how neural circuits shape mammalian behavior, and how normal circuit patterns go awry leading to behavioral deficits, it will be crucial to have means for imaging neural dynamics in behaving animals (such as mice) and in large numbers of individual neurons concurrently. Fluorescence microscopy has key advantages for tracking neural activity; however, the limitations of benchtop fluorescence microscopy have typically confined studies to anesthetized or immobilized animals. The sizes involved and typical costs of tens of thousands of dollars or more per microscope also preclude mass production and concurrent usage in large numbers of mice.

Inspired by the need to enable high-speed, cellular-level in vivo brain imaging in large numbers of freely behaving mice, over the last four years at Stanford we pioneered the development of miniaturized and mass-producible fluorescence microscopes. These tiny microscopes are each <2 g in mass, small enough to be borne on the head of an adult mouse for in vivo brain imaging during active animal behavior, and amenable to mass production at low costs. Researchers at Stanford have been using prototype microscopes of this kind on a daily basis for in vivo brain imaging studies in freely behaving mice, to: 1) Image cerebellar vasculature and microcirculation during activity; and 2) Image neural dynamics, analyzing spiking activity of populations of neurons with single neuron specificity during different behavioral tasks.

This talk will summarize these technological advances, which were recently reported in Nature Methods 8:871-878 (Oct 2011) and featured in Nature, MIT Technology Review and several media outlets.

Speaker's Biography: Kunal Ghosh is founder and CEO of Inscopix, Inc., a Stanford University spin-off backed by leading venture capitalists that is developing end-to-end solutions for in vivo cellular-level imaging in awake, behaving subjects. Prior to founding Inscopix, Kunal was a Postdoctoral Scholar in the Department of Biology at Stanford. Kunal received his M.S. and Ph.D., both in Electrical Engineering, from Stanford in 2006 and 2010, respectively. He graduated from the Jerome Fisher Program in Management and Technology at the University of Pennsylvania in 2004 with a B.S.E. in Electrical Engineering and a B.S. from the Wharton School.


Dr. Boyd Fowler

BAE Systems Imaging Solutions

March 13, 2012 4:15 pm to 5:15 pm

Location: Packard 204

Talk Title: CMOS Dental X-Ray Imaging

Talk Abstract: The digital intra-oral X-ray camera market has experienced rapid growth over the past decade with a CAGR of more than 20%. The transition from film to digital sensors has reduced the radiation dose by 10x and minimized the environmental impact of dentistry.
In this talk, we review the basics of dental X-ray imaging. This includes the theory and operation of intra-oral sensors, extra-oral sensors and cone beam CT sensors. We focus on indirect X-ray imaging systems that convert X-ray photons into visible photons using a scintillator and then optically couple these visible photons to a solid state image sensor. In addition, we also review the analysis of X-ray imaging quality based on detective quantum efficiency (DQE). Moreover, we discuss how efficiently X-ray quanta are measured by an indirect X-ray imaging system. This includes understanding both the system signal to noise ratio (SNR) and the modulation transfer function (MTF). Then we discuss the detailed construction and operation of intra-oral sensors. This includes reviewing their requirements, the design of the solid state imager, the X-ray triggering scheme, and chamfered corner design. Finally we discuss the detailed construction and operation of extra-oral sensors. This includes reviewing the operation and design of TDI CCDs and CMOS imagers for this application.
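As a sketch of the DQE analysis mentioned above (one common form for a linear, shift-invariant detector, not necessarily the speaker's exact formulation; all names are illustrative):

```python
import numpy as np

def dqe(mtf, nps, mean_signal, photon_fluence):
    """Detective quantum efficiency DQE(f) = SNR_out^2(f) / SNR_in^2(f).

    For a linear, shift-invariant detector this is commonly written as
        DQE(f) = mean_signal^2 * MTF(f)^2 / (photon_fluence * NPS(f))
    where mean_signal is the large-area output signal, photon_fluence the
    incident X-ray quanta per unit area (SNR_in^2 for Poisson statistics),
    and NPS(f) the output noise power spectrum.
    """
    mtf, nps = np.asarray(mtf, float), np.asarray(nps, float)
    return mean_signal ** 2 * mtf ** 2 / (photon_fluence * nps)
```

An ideal detector (MTF = 1, noise purely Poisson) yields DQE = 1 at all frequencies; real scintillator-coupled sensors fall below that, which is what the SNR and MTF discussion above quantifies.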

Speaker's Biography: Boyd Fowler was born in California in 1965. He received his M.S.E.E. and Ph.D. degrees from Stanford University in 1990 and 1995, respectively. After finishing his Ph.D. he stayed at Stanford University as a research associate in the Electrical Engineering Information Systems Laboratory until 1998. In 1998 he founded Pixel Devices International in Sunnyvale, California. Presently he is the technology director at BAE Systems Imaging Solutions, formerly Fairchild Imaging. He has authored numerous technical papers and patents. His current research interests include CMOS image sensors, low noise image sensors, noise analysis, and data compression.


Professor David Brainard

University of Pennsylvania

November 11, 2011 4:15 pm to 5:15 pm

Location: Packard 204

Talk Title: The Human Demosaicing Algorithm

Talk Abstract: The human visual system shares with most digital cameras the design feature that color information is acquired via spatially interleaved sensors with different spectral properties. That is, the human retina contains three distinct spectral classes of cone photoreceptors, the L-, M-, and S-cones, and cones of these three classes are spatially interleaved in the retina. Similarly, most digital cameras employ a design with interleaved red, green, and blue sensors. In each case, generating a full color image requires application of a demosaicing algorithm that uses the available image data to estimate the values of the two cone/sensor classes not present at each cone/sensor location. In this talk, I will review psychophysics and modeling that sheds light on the demosaicing algorithm employed by the human visual system. This algorithm requires that the visual system have knowledge of the spectral type of each of its cones. For the L and M cones, a variety of lines of evidence suggest that the class of the cone at each retinal location is learned, rather than signaled by some sort of biochemical marker. In the second part of the talk, I will present results that show that natural images contain sufficient statistical structure to support unsupervised learning of cone classes.
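For the camera side of the analogy, the simplest demosaicing approach, bilinear interpolation over an RGGB Bayer mosaic, can be sketched as follows (illustrative code, not the visual system's algorithm discussed in the talk; `conv3` and `demosaic_bilinear` are hypothetical names):

```python
import numpy as np

def conv3(img, k):
    """3x3 convolution with zero padding (the kernel used here is symmetric)."""
    p = np.pad(img, 1)
    out = np.zeros(img.shape)
    for i in range(3):
        for j in range(3):
            out += k[i, j] * p[i:i + img.shape[0], j:j + img.shape[1]]
    return out

def demosaic_bilinear(mosaic):
    """Bilinear demosaicing of a 2-D RGGB Bayer mosaic into an RGB image."""
    h, w = mosaic.shape
    masks = np.zeros((3, h, w), bool)
    masks[0, 0::2, 0::2] = True   # R samples
    masks[1, 0::2, 1::2] = True   # G samples
    masks[1, 1::2, 0::2] = True
    masks[2, 1::2, 1::2] = True   # B samples
    k = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]], float)
    out = np.zeros((h, w, 3))
    for c in range(3):  # normalized convolution fills in the missing samples
        known = np.where(masks[c], mosaic, 0.0)
        out[..., c] = conv3(known, k) / conv3(masks[c].astype(float), k)
    return out
```

At each pixel the two missing channels are estimated from neighbors of the matching color class, directly analogous to the retina estimating the two absent cone responses at each cone location.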

Speaker's Biography: David Brainard received his AB in physics from Harvard University (1982) and MS (electrical engineering) and PhD (psychology) from Stanford University in 1989. He is currently Professor of Psychology at the University of Pennsylvania and his research focuses on human color vision and color image processing. He is a fellow of the Optical Society of America and the Association for Psychological Science.


Dr. Michael Kriss

November 8, 2011 4:15 pm to 5:15 pm

Location: Packard 204

Talk Title: Pixel Wars, past, present and in the future

Talk Abstract: In the late 1970s and the early 1980s it became apparent that electronic imaging was displacing film-based motion imaging in areas like news, sports and home movies. The Mavica camera by Sony offered an easy way to capture still images, as compared to film, but with much lower quality. Studies within the Kodak Research Laboratories indicated that high quality 4” x 6” prints could be made with 6 million pixels using the Bayer Color Filter Array and appropriate image processing. By 2005 this milestone of 6 million pixels was easily surpassed and color film usage has since greatly declined. The competition between digital camera manufacturers has driven them to offer greater and greater pixel counts without considering the overall quality to the customer. This talk will review the history of quality as a function of pixel count and the current state of image quality based on experiment and modeling reflecting pixel count/size, and offer some suggestions on camera design for a range of well-defined uses.

Speaker's Biography: Dr. Kriss received his Ph.D. in Physics from UCLA in 1969 in the field of liquid helium superfluidity. He then joined the Eastman Kodak Research Laboratories, where he spent 24 years working on both film and digital imaging. Most of his scientific work focused on analytical and computer models of film and digital imaging systems with a strong focus on image quality. This work led him into image processing, where he headed a laboratory that created very sophisticated image processing algorithms for film and sensors. He took early retirement in late 1992 and joined the University of Rochester, where he was the executive director of the Center for Electronic Imaging Systems and taught special courses in digital imaging systems through the Computer and Electrical Engineering Department. In 1999 he accepted a position at Sharp Laboratories of America, where he managed a group of talented scientists and engineers working on advanced image processing methods for a variety of color hardcopy devices. He continues to model digital imaging systems and color electro-photographic and ink jet halftone systems (for fun). He also works with IS&T and Wiley on a series of texts on imaging science and technology.


Phillip Corriveau


October 25, 2011 4:15 pm to 5:15 pm

Location: Packard 204

Talk Title: User Experience and 3D Technologies

Talk Abstract: Some would say that Stereo 3D is the flavor of the month, while others argue that S3D is here to stay. Today’s insights into the penetration of 3D into the home show a hockey-stick adoption curve that surpasses that of HDTV at its launch. Based on this trend and the pervasiveness of S3D in other devices in the consumer and business setting, we are left with one important question: What is the User Experience?
Various technology ecosystems, from consumer electronics manufacturers to cinema content producers, have embraced and are driving this technology into our everyday lives. With this shift of technology, consumers have access to new visual experiences that are beyond traditional 2D media and which provide a more immersive experience. The human factors component of this “new” experience is at the top of everyone’s mind. With joint efforts of industry, government, and academia, significant research efforts have been initiated and are delivering meaningful data about the impact on consumers' experiential quality.
In this talk we will discuss why User Experience is important, what is being done to quantify the experience and my personal pain points with problems yet to be solved.

Speaker's Biography: Philip Corriveau is a Principal Engineer in the technology arm of the Interaction & Experience Research group in Intel Labs. Philip received his Bachelor of Science with Honors from Carleton University, Ottawa, Canada, in 1990. He immediately started his career at the Canadian Government Communications Research Center performing end-user subjective testing in support of the ATSC HD standard for North America. In January 2009 he was awarded a National Academy of Television Arts & Sciences Technology & Engineering Emmy® Award for User Experience Research for the Standardization of the ATSC Digital System.
Philip moved to Intel in 2001 to seed a research capability called the Media and Acoustics Perception Laboratory designed to address fundamental perceptual aspects of platform and product design. He now manages a team of human factors engineers in the Experience Metrics & Quality Evaluation group conducting user experience research across Intel technologies, platforms and product lines.
Philip is currently the Chair of Steering Team 5 for 3D@Home, addressing Human Factors issues surrounding the development of 3D technologies for end-users. Philip is an Adjunct Professor at the Pacific University School of Optometry, integrating user experience and vision. He also founded and still participates in the Video Quality Experts Group, aimed at developing, testing, and recommending objective video quality metrics for standardization.


Dr. Stefan Winkler

Interactive Digital Media Program at the University of Illinois’ Advanced Digital Sciences Center (ADSC) in Singapore

October 13, 2011 4:15 pm to 5:15 pm

Location: Packard Room 312

Talk Title: Image Quality in the Social Media Era

Talk Abstract: With the widespread use of digital cameras, imaging software, photo-sharing sites, social networks, and other related technologies, media production and consumption have become much more multi-directional, varied, and complex than they used to be. As a result, the popular concept of “Quality of Experience” (QoE) must also be looked at from a different perspective.
In this talk, I’ll give an overview of the current state of the art in image and video quality assessment methods, including the main standardization efforts as well as the impact of 3D. I’ll then address some of the issues with traditional approaches, discussing some of their shortcomings and unsolved problems. Finally, I’ll present some new angles on quality for social media.

Speaker's Biography: Dr. Stefan Winkler is Principal Scientist and Director of the Interactive Digital Media Program at the University of Illinois’ Advanced Digital Sciences Center (ADSC) in Singapore. He also serves as Scientific Advisor to Cheetah Technologies. He has co-founded a start-up, worked in large corporations, and held faculty positions.
Dr. Winkler has a Ph.D. degree from the Ecole Polytechnique Fédérale de Lausanne, Switzerland, and an M.Eng./B.Eng. degree from the University of Technology Vienna, Austria. He has published over 70 papers and the book “Digital Video Quality”. He has also contributed to standards in ITU, VQEG, ATIS, VSF, and SCTE. His research interests include video processing, computer vision, perception, and human-computer interaction.


Professor Nicholas George

The Institute of Optics, University of Rochester

October 11, 2011 4:15 pm to 5:15 pm

Talk Title: Extended Depth of Field for Digital Cameras

Talk Abstract: During the past few years the Rochester team has developed a highly effective digital camera system that has a small f/# and an extended depth of field. It provides a high-quality image from 4 or 6 inches to infinity without the need to focus. The generalized Fourier optical design will be described including novel analysis for aspheric lenses and the details of the digital processing. Illustrative image results are presented together with a method for tailoring the camera design for superior performance close in or at extreme distances.

Speaker's Biography: Nicholas George is the Wilson Chair Professor of Electronic Imaging and Professor of Optics at The Institute of Optics, University of Rochester. He has 30 years of experience at the University of Rochester, and previously 20 years as a Professor of Applied Physics and Electrical Engineering at the California Institute of Technology. He obtained a BS degree at UC Berkeley, graduating with highest honors, the MS at Maryland, and the PhD from the California Institute of Technology. He has many firsts and near-firsts in the field of modern optics, including the holographic diffraction grating, the holographic stereogram, the ring-wedge detector robotic vision system, the laser heterodyne for pollution sensing of nitrogen oxides, the infrared digital hologram, the theory and experiments for the wavelength dependence of speckle, the FM-FM laser line scan system for remote contouring of aerial maps, and the present research into EDOF for digital cameras.


Dr. Torbjorn Skauli

Norwegian Defense Research, FFI

October 4, 2011 4:15 pm to 5:15 pm

Talk Title: Hyperspectral Imaging Technology and Systems

Talk Abstract: Hyperspectral imaging, or imaging spectroscopy, records detailed spectral information in each image pixel. The spectral information enables a growing range of applications in fields such as medical imaging, industrial process monitoring, remote sensing and military reconnaissance. Many different technologies exist for recording hyperspectral images, but they all involve some form of painful compromise. Any hyperspectral imaging system requires image processing as an indispensable part, to extract the desired information from the spectral data.

The talk will give a brief introduction to the concept of hyperspectral imaging, and review some of the technologies used for recording spectral images. Important characteristics of spectral imaging sensors will be discussed: spectral and spatial resolution, signal to noise ratio and spatial coregistration of spectral bands. Spectral image processing will be briefly reviewed, with some emphasis on the need to go beyond pure pattern recognition techniques and also exploit available physical knowledge. Example results will be shown from an airborne hyperspectral imaging system with real-time processing developed at FFI.
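As one concrete example of extracting information from per-pixel spectra (a standard technique in the field, not necessarily part of FFI's system), the spectral angle between each pixel and a reference material is insensitive to illumination scaling:

```python
import numpy as np

def spectral_angle(pixels, reference):
    """Angle (radians) between each pixel spectrum and a reference spectrum.

    Small angles indicate similar materials; because the measure depends
    only on spectral shape, uniformly scaling a pixel's illumination
    leaves its angle unchanged.
    """
    p = np.asarray(pixels, float)
    p = p / np.linalg.norm(p, axis=-1, keepdims=True)
    r = np.asarray(reference, float)
    r = r / np.linalg.norm(r)
    return np.arccos(np.clip(p @ r, -1.0, 1.0))
```

Thresholding this angle gives a simple material classifier, one instance of the kind of per-pixel processing that any hyperspectral system must perform to turn spectra into information.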

Speaker's Biography: Torbjorn Skauli holds a PhD in physics from the University of Oslo. At FFI, the Norwegian Defense Research establishment, Skauli is head of the hyperspectral imaging group. He also teaches a course on optical imaging and detection at the University of Oslo, and is engaged in various science outreach activities. In the past, Skauli has worked with development of detectors for infrared imaging, as well as growth and characterization of Mercury cadmium telluride for use in IR detectors. Currently, Skauli is starting his second sabbatical year at Stanford, this time affiliated with SCIEN.


Dr. Jelena Kovacevic

Bell Laboratories, Lucent Technologies

March 21, 2001 2:00 pm to 3:00 pm

Talk Title: Quantized Frame Expansions with Erasures

Talk Abstract: A large fraction of the information that flows across today's networks is useful even in a degraded condition. Examples include speech, audio, still images and video. When this information is subject to packet losses and retransmission is impossible due to real-time constraints, superior performance with respect to total transmitted rate, distortion, and delay may sometimes be achieved by adding redundancy to the bit stream rather than repeating lost packets. In multiple description coding, the data is broken into several streams with some redundancy among the streams. When all the streams are received, one can guarantee low distortion at the expense of having a slightly higher bit rate than a system designed purely for compression. On the other hand, when only some of the streams are received, the quality of the reconstruction degrades gracefully, which is very unlikely to happen with a system designed purely for compression. I will describe a scheme that achieves redundancy by using overcomplete expansions -- frames. We then discuss frame design issues in the presence of quantization and losses.
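The smallest interesting instance of such a frame makes the erasure robustness concrete: three unit vectors 120 degrees apart form a tight frame for R², so a 2-D signal is carried by three coefficients and any single one can be lost (a sketch ignoring quantization; variable names are illustrative):

```python
import numpy as np

# Three unit vectors 120 degrees apart: a tight frame for R^2
angles = np.deg2rad([90.0, 210.0, 330.0])
F = np.stack([np.cos(angles), np.sin(angles)], axis=1)  # 3x2 analysis matrix

x = np.array([0.7, -1.2])       # the signal
y = F @ x                       # redundant expansion: 3 coefficients for a 2-D x

# With all coefficients: tight-frame reconstruction, x = (2/3) F^T y
x_all = (2.0 / 3.0) * F.T @ y

# One erasure (say the last coefficient is lost): the remaining rows of F
# still span R^2, so a least-squares inverse recovers x exactly
x_erased = np.linalg.pinv(F[:2]) @ y[:2]
```

With quantized coefficients the reconstruction is no longer exact, and choosing frames that degrade gracefully under both quantization and erasures is exactly the design problem the talk addresses.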


Dr. Thomas Wiegand

Heinrich-Hertz-Institute, Berlin, Germany

September 21, 2001 2:00 pm to 3:00 pm

Talk Title: Context-based Adaptive Coding and the Emerging H.26L Video Compression Standard

Talk Abstract: H.26L is the current project of the ITU-T Video Coding Experts Group. The main goals of the new ITU-T H.26L standardization effort are a simple and straightforward video design to achieve enhanced compression performance and provision of a "network-friendly" packet-based video representation addressing "conversational" (i.e., video telephony) and "non-conversational" (i.e., storage, broadcast, or streaming) applications.
H.26L contains a new entropy coding scheme that is based on context-based adaptive binary arithmetic coding. In this entropy coding scheme, context models are utilized for efficient prediction of the coding symbols. The novel binary adaptive arithmetic coding technique is employed to match the conditional entropy of the coding symbols given the context model estimates. The adaptation is also employed to keep track of non-stationary symbol statistics. By using the new entropy coding scheme instead of the single variable-length-code approach of the current test model (TML), large bit-rate savings of up to 32% can be achieved. As a remarkable outcome of our experiments, we observed that high gains are reached not only at high bit-rates, but also at very low rates.
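The adaptive part of the scheme can be illustrated without a full arithmetic coder: an arithmetic code spends about -log2(p) bits per symbol, so accumulating that quantity under an adaptive probability estimate shows the gain on skewed sources (a toy, context-free sketch using the Krichevsky-Trofimov estimator, not the H.26L design):

```python
import numpy as np

def adaptive_code_length(bits):
    """Ideal code length (bits) of an adaptive binary arithmetic coder.

    Uses the Krichevsky-Trofimov estimate p(next = s) = (n_s + 0.5) / (n + 1),
    updated after every symbol, so the model tracks the source statistics
    without any side information being transmitted.
    """
    counts = np.zeros(2)
    total = 0.0
    for b in bits:
        p = (counts[b] + 0.5) / (counts.sum() + 1.0)
        total += -np.log2(p)   # bits an arithmetic coder would spend
        counts[b] += 1
    return total
```

A heavily skewed 100-bit sequence (90 ones) costs roughly 50 bits this way, versus 100 bits for a fixed one-bit-per-symbol code, while a balanced sequence still costs only about 100 bits; context modeling, as in H.26L, sharpens the estimates further by conditioning on neighboring symbols.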


Peter Schroeder

California Institute of Technology, Pasadena

November 13, 2001 2:00 pm to 3:00 pm

Talk Title: What is the entropy of a 2-manifold graph?

Talk Abstract: With the increasing availability of 3D scanning methodologies surfaces are emerging as a new multimedia datatype. Surface scans can be quite detailed and are typically given as a mesh, i.e., a set of samples on the surface together with neighborhood relations ("triangles"). Finding efficient representations for such surface descriptions is a subject of ongoing research.
In this talk I will consider a particular problem that arises in the compression of meshes: How efficiently can we represent the connectivity of a general polyhedral mesh? We have recently developed an algorithm, based on entropy coding the valence (number of neighbors of a vertex) and degree (number of vertices in a face) streams, which is near optimal according to a census of all planar graphs. I will discuss the development of algorithms of this kind and how the optimality property is proven.
Time permitting I will speculate on the larger (and much harder) question of how to define the entropy of a surface.
Joint work with Andrei Khodakovsky, Pierre Alliez, and Mathieu Desbrun.
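A zeroth-order illustration of why valence streams compress well: most vertices of a typical triangle mesh have valence 6, so the empirical entropy of the stream is far below a naive fixed-length code (the numbers below are made up for illustration):

```python
import math
from collections import Counter

def stream_entropy(symbols):
    """Empirical (zeroth-order) entropy in bits per symbol: a lower bound
    on what an ideal memoryless entropy coder spends on the stream."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

# A mostly regular mesh: 80% of vertices have valence 6
valences = [6] * 80 + [5] * 10 + [7] * 10
```

Here the entropy is about 0.92 bits per vertex, versus 2 bits for a fixed code over {5, 6, 7}; the near-optimality result in the talk concerns how close such coders can get to the true entropy of planar graph connectivity.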


Henry Wu and Damian Tan

School of Computer Science and Software Engineering, Monash University, Australia

November 14, 2001 2:00 pm to 3:00 pm

Talk Title: Perceptual Video Distortion Metrics And Coding

Talk Abstract: Quantitative quality and impairment metrics based on the human visual system have remained one of the most critical issues in the field of digital video coding and compression. The progress made on this issue significantly affects research activities in at least the following three areas: design of new high performance coding/compression algorithm, quality/impairment measurements for digital video coding algorithms and products, and quantitative definition of "psychovisual redundancy".
This talk will discuss the vision model which we use in devising our blocking and ringing impairment metrics for digital video quality assessment [1], in comparison with those used in proposals in the forum of VQEG (Video Quality Experts Group). The performance of our vision-model-based impairment metrics has been evaluated, showing high correlations with corresponding subjective test results.
A new perceptual image coder [2] will also be described which adopts the coding structure of the EBCOT with the proposed perceptual distortion measure in place of the MSE and the CVIS. Examples will be given to compare the performance of the new perceptual coder with that of the EBCOT with the MSE and CVIS.
[1] Z. Yu, H.R. Wu, S. Winkler and T. Chen, "Objective assessment of blocking artifacts for digital video with a vision model", Proceedings of the IEEE, November 2001.
[2] D. Tan, H.R. Wu and Z. Yu, "Perceptual coding of digital colour images", Proceedings of IEEE International Symposium on Intelligent Signal Processing and Communication Systems, November 2001.



Professor Shih-Fu Chang

Digital Video and Multimedia Research Lab Columbia University

December 12, 2001 2:00 pm to 3:00 pm

Talk Title: Audio-Visual Content Indexing, Filtering, and Adaptation

Talk Abstract: Recently, researchers have been very active in developing new techniques and standards for audio-visual content description. Such descriptions can be used in innovative applications such as multimedia search engines, personalized media filters, and intelligent video navigators. In our research, we are particularly interested in emerging applications in two environments: personal media server (also called personal video recorder) and mobile multimedia device.
In this talk, I will first give an overview of technical issues arising in exciting applications mentioned above. Then I will present our recent research in two specific areas.
The first area involves real-time event detection in specific application domains, such as sports. Such techniques are needed especially for filtering live broadcast programs in a mobile, time-sensitive environment. We will present a real-time system for sports event parsing and detection, combining approaches of machine learning, compressed-domain processing, and domain rule modeling. Demos of real-time performance will be shown. In addition, we will introduce a novel video streaming framework in which the video bitrate is adapted dynamically according to how well detected events match the user's preferences. Such adaptation enables improved video quality and user experience by allocating scarce resources to video segments with important content. We will discuss interesting system-level issues inspired by such a content-adaptive streaming framework.
The second area involves parsing and summarizing high-level content in long unstructured programs, such as films. Here we will present our work on audio-visual scene segmentation using a multimedia integrative framework. Psychological memory-based models are used for detecting long-term scene boundaries. Structural patterns such as dialog and anchoring are analyzed. Multimedia cues (such as speech or silence) are incorporated for aligning audio-visual scenes. We will report promising results from experiments with films of different genres. Given the scene structure, we have also developed a unique approach to audio-visual content skimming by incorporating production syntax and perceptual complexity of video.
At the end, I will briefly review the emerging MPEG-7 standard. I will discuss the relationships between MPEG-7 and above research tools. I will also briefly describe our work in indexing medical video in Columbia's Digital Library Project.


Dr. Joachim Eggers

University of Erlangen-Nuremberg, Telecommunications Laboratory

January 24, 2002 12:30 pm to 1:30 pm

Talk Title: Information Embedding and Digital Watermarking

Talk Abstract: The ease of perfect copying, distribution, and manipulation of digital data has become a significant problem for copyright protection and integrity verification of digitized multimedia content. Digital watermarking has been proposed as one possible method to combat these problems, and information embedding and digital watermarking have therefore gained a lot of attention in recent years. The many publications on information embedding and digital watermarking show a mutual improvement of watermarking schemes and attacks against embedded watermarks. Recently, however, research on theoretical performance limits of digital watermarking has intensified as well. In this presentation, theoretical and experimental results for several more or less restricted watermarking scenarios are discussed. The focus is on blind watermarking, where the receiver does not have access to the original non-watermarked data. In this case, digital watermarking can be considered communication with side information about the original signal at the encoder. Further, robust digital watermarking schemes should be designed by considering digital watermarking a game between the watermark embedder and attacker. The presentation closes with example results for image watermarking.
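One simple way to make blind embedding concrete is quantization index modulation (QIM), a lattice-quantization approach closely related to this line of work though not necessarily the speaker's exact construction: each sample is quantized to one of two interleaved lattices chosen by the message bit, and the blind detector simply picks the nearer lattice.

```python
import numpy as np

def qim_embed(x, bits, delta=1.0):
    """Embed one bit per sample: bit 0 -> nearest multiple of delta,
    bit 1 -> nearest multiple of delta shifted by delta/2."""
    offset = np.asarray(bits) * (delta / 2.0)
    return np.round((np.asarray(x) - offset) / delta) * delta + offset

def qim_detect(y, delta=1.0):
    """Blind detection: choose the lattice closest to each received sample
    (no access to the original, non-watermarked signal)."""
    y = np.asarray(y)
    d0 = np.abs(y - np.round(y / delta) * delta)
    shifted = np.round((y - delta / 2.0) / delta) * delta + delta / 2.0
    d1 = np.abs(y - shifted)
    return (d1 < d0).astype(int)
```

Detection remains correct under any perturbation smaller than delta/4, so the quantizer step directly trades embedding distortion against robustness to attacks, the game-theoretic tension described above.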


Professor Heinrich Niemann

University of Erlangen-Nuremberg, Germany

March 17, 2003 1:00 pm to 2:00 pm

Talk Title: Using Lightfields in Image Processing

Talk Abstract: A short introduction is given to the concept of a lightfield (LF) and to its recording by a single hand- or robot-manipulated camera. The need for at least rough scene geometry for good rendering quality is pointed out. A so-called free-form LF is used, in which the camera positions can be arbitrary and, in particular, need not be on a regular grid. The main part of the talk is devoted to potential uses of the LF for various tasks in image processing.
It is shown that the purely image based modeling of an object or a scene by a LF may be used to track an object and to self-localize a robot in an indoor environment. Both tasks basically amount, in a probabilistic framework, to a recursive state estimation which is performed by particle filters. For the tracking task the accuracy of state vector estimation was investigated for different resolutions and particle numbers. Self-localization was performed with a mobile platform.
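The recursive state estimation used for both tasks can be sketched, for a toy 1-D random-walk state observed in Gaussian noise, as a bootstrap particle filter (illustrative parameters, far simpler than the lightfield-based observation models in the talk):

```python
import numpy as np

def particle_filter(observations, n_particles=1000, q=0.1, r=0.5, rng=None):
    """Bootstrap particle filter: predict with the motion model, weight by
    the observation likelihood, resample, and report the posterior mean."""
    rng = rng or np.random.default_rng(0)
    particles = rng.normal(0.0, 2.0, n_particles)   # samples from the prior
    estimates = []
    for z in observations:
        particles = particles + rng.normal(0.0, q, n_particles)  # predict
        w = np.exp(-0.5 * ((z - particles) / r) ** 2) + 1e-12    # likelihood
        w /= w.sum()
        particles = particles[rng.choice(n_particles, n_particles, p=w)]
        estimates.append(particles.mean())           # posterior mean estimate
    return np.array(estimates)
```

In the talk's setting the state is a pose (object or robot) and the likelihood compares camera images against renderings from the lightfield, but the predict-weight-resample loop is the same, which is why the accuracy depends on the particle count and image resolution as studied in the experiments.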
Another application is to provide 3-D recording, rendering, and augmentation of a (laparoscopic) surgery situation by a LF. The recording is done by either a hand-held or a robot-manipulated camera. Sufficiently stable prominent points can be detected and tracked on different organs so that a camera calibration is possible. It was shown experimentally that the LF is useful for removing the effect of specular reflections from images taken by an endoscope during surgery. Ongoing work is directed towards augmenting the LF with vessels hidden from the camera and with organs from preoperative images.


Professor Thrasyvoulos Pappas

Electrical and Computer Engineering, Northwestern University, Evanston, Illinois

April 24, 2003 12:20 pm

Talk Title: Adaptive Image Segmentation Based on Perceptual Color and Texture Features

Talk Abstract: We propose an image segmentation algorithm that is based on spatially adaptive color and texture features. The color features are based on the estimation of spatially adaptive dominant colors, which, on the one hand, reflect the fact that the human visual system cannot simultaneously perceive a large number of colors and, on the other, the fact that image colors are spatially varying. The spatially adaptive dominant colors are obtained using a previously developed adaptive clustering algorithm for color segmentation. The (spatial) texture features are based on a steerable filter decomposition, which offers an efficient and flexible approximation of early processing in the human visual system. In contrast to texture analysis/synthesis techniques that use a large number of parameters to describe texture, our segmentation algorithm relies on only a few parameters to segment the image into simple yet meaningful texture categories. Since texture feature estimation requires a finite neighborhood that limits spatial resolution, the proposed algorithm combines texture with color information to obtain accurate and precise edge localization. The performance of the proposed algorithm is demonstrated in the domain of photographic images, including low resolution, degraded, and compressed images.
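The notion of dominant colors can be illustrated, in a much-simplified and non-adaptive form, by clustering pixel colors with a toy k-means. This sketch is not the spatially adaptive clustering algorithm the abstract refers to; the deterministic seed-spreading initialization is an assumption made so the example is reproducible.

```python
import numpy as np

def dominant_colors(pixels, k=2, iters=20):
    """Toy k-means in RGB space: reduce an image to k dominant colors.
    Seeds are spread along the red channel -- a deterministic heuristic
    chosen for this sketch, not a general-purpose initialization."""
    pixels = pixels.astype(float)
    order = np.argsort(pixels[:, 0])
    seeds = order[np.linspace(0, len(pixels) - 1, k).astype(int)]
    centers = pixels[seeds].copy()
    for _ in range(iters):
        # Assign each pixel to its nearest center, then recompute the means.
        dists = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = pixels[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return centers, labels

# Synthetic "image": 100 reddish and 100 bluish pixels with slight noise.
rng = np.random.default_rng(0)
red = rng.normal([200, 30, 30], 5, size=(100, 3))
blue = rng.normal([30, 30, 200], 5, size=(100, 3))
px = np.vstack([red, blue])
centers, labels = dominant_colors(px, k=2)  # recovers the two color modes
```

The adaptive algorithm in the talk goes further by letting the dominant colors vary across the image and by fusing the result with texture features for edge localization.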


Professor Alexandros Eleftheriadis

Department of Electrical Engineering, Columbia University

February 26, 2004 12:00 pm to 1:00 pm

Talk Title: Media Representation - From Software Tools to Theory

Talk Abstract: Media representation encompasses the various ways to digitally represent audio and video signals for the purposes of efficient creation, processing, storage, or transmission. It is a cross section of multimedia software, signal processing and compression, communication systems, and information theory. In this talk I briefly review some of our work in rate shaping, model-assisted video coding, mobile broadcast video, and semi-automatic video segmentation, and discuss how seemingly simple questions and the need to bridge the gap between low-level signal properties and high-level authoring structures have led us to the design of the Flavor language and the development of Complexity Distortion Theory. Flavor is a software tool for codec designers, whereas Complexity Distortion is a new theoretical framework for media representation based on complexity theory. I will also talk about our participation and the lessons learned in MPEG-4, an important milestone in media representation standards, as well as our current design philosophy in new software tools for next-generation media applications.


Dean Messing, Louis Kerofsky and Scott Daly

Sharp Labs of America

January 21, 2005 12:30 pm to 1:30 pm

Talk Title: Subpixel Rendering on Colour Matrix Displays

Talk Abstract: It is well known that the luminance resolution of a colour matrix display exceeds the resolution implied by the spatial sampling frequency of the display pixels. This is due to the spatially discrete nature of the subpixels that comprise each pixel. By treating the subpixels as individual luminance-contributing elements, it is possible to enhance the resolution of the display. However, there are practical and theoretical difficulties in attaining this additional resolution, the most important of which is the attendant colour aliasing. The first half of our talk discusses the resolution enhancement problem and presents an image processing algorithm that increases the resolution of a flat-panel display having the usual one-dimensional "striped" subpixel geometry. By a clever use of properties of the Human Visual System (HVS), the added resolution is obtained without the usual annoying colour artifacts. As a coda to this half of the presentation, we also discuss some interesting results related to multi-frame resolution enhancement and the role played by the image acquisition system's colour filter array (CFA). As it turns out, CFA interpolation and subpixel rendering are dual problems.
The rendering algorithm presented in the first half is specific to panels that have striped RGB subpixel geometry. Until recently this geometry has been nearly universal. However, a new generation of colour matrix panels is on the horizon. These panels have both two-dimensional subpixel patterns and subpixels that do not (necessarily) use the standard RGB colour primaries. The second half of our talk discusses the rendering problem in the context of such panels and presents our recently proposed Optimal Rendering Framework for rendering imagery on such displays. This Framework uses the spatial sensitivities of the HVS and the subpixel geometry of the display to pose a Constrained Optimisation Problem, the solution of which yields an array of 2D optimal rendering filters. The constraints are determined by the display panel, and the Cost Function is determined by the HVS. This Framework is very general. It can support displays with quite arbitrary 2D geometries and colour primaries. We provide examples showing how to apply this framework to several different 2D subpixel geometries and to displays having more than three primaries. We conclude with a novel and elegant application of the Framework to a problem unrelated to its original intent.
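The core idea from the first half of the talk — treating each R, G, B stripe as an individual luminance sample while low-pass filtering to hold down colour fringing — can be sketched in one dimension as follows. The filter weights and input signal here are invented for illustration; the actual algorithm's HVS-based filter design is far more sophisticated.

```python
import numpy as np

def subpixel_render_1d(hi_res, weights=(0.25, 0.5, 0.25)):
    """Map a 1-D luminance signal sampled at 3x the pixel rate onto the
    R, G, B subpixel drives of a striped panel. Each subpixel takes a
    sample centred on its own position (tripling the effective luminance
    sampling rate); the small low-pass filter trades a little of that
    gain for reduced colour fringing."""
    n = len(hi_res) // 3                      # number of whole pixels
    padded = np.pad(np.asarray(hi_res, float), 1, mode='edge')
    filtered = np.convolve(padded, weights, mode='valid')
    return filtered[:3 * n].reshape(n, 3)     # columns: R, G, B drives

edge = np.repeat([0.0, 1.0], 6)               # a luminance step at subpixel rate
drives = subpixel_render_1d(edge)             # one RGB triple per pixel: (4, 3)
```

A naive pixel-rate rendering would place the step on a pixel boundary; here the transition lands partway through a pixel's subpixels, which is exactly the extra luminance resolution the talk exploits, at the cost of driving the three primaries unequally near the edge.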


Dr. Kohji Mitani

Science and Technical Research Laboratories, Japan Broadcasting Corporation (NHK)

March 10, 2005 10:00 am to 11:00 am

Talk Title: From VeggieVision to PeopleVision

Talk Abstract: We are investigating an ultrahigh-definition video system that can provide viewers with a greater sensation of reality than HDTV. The present target is to develop a system with 4000 scanning lines. This system should play a major role in any application that requires high resolution, such as cinema, virtual museums, video archives, and telemedicine. A video camera and a projection display, together with a disc recorder system, have been developed as experimental devices for the 4000-scanning-line system. At the present time, the number of panel pixels is limited to 2k x 4k for both CCD and LCD.

Due to this resolution constraint, four panels (two green, one red, and one blue) are combined to produce a resolution of 4k x 8k pixels (16 times that of HDTV) in both the camera and the display. The two green panels are arranged by the diagonal-pixel-offset method to achieve the above resolution. This talk will describe the development of these devices and a new system to be exhibited at EXPO 2005, which will be held in Aichi Prefecture, Japan.


Previous SCIEN Colloquia

To see a list of previous SCIEN colloquia, please click here.