Dr. Jan Kautz

NVIDIA

November 7, 2018 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Image Domain Transfer

Talk Abstract: Image domain transfer includes methods that transform an image based on an example, commonly used in photorealistic and artistic style transfer, as well as learning-based methods that learn a transfer function from a training set. The latter are usually based on generative adversarial networks (GANs) and can be supervised or unsupervised, unimodal or multimodal. I will present a number of our recent methods in this space, which can be used to translate, for instance, a label map to a realistic street image, a daytime street image to a nighttime one, or a dog to different cat breeds.
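
To make the setup concrete, here is a minimal PyTorch sketch of the kind of objective used by unsupervised image translation methods: an adversarial term that pushes translated images to look like the target domain, plus a cycle-consistency term. The networks, loss weight, and shapes below are toy placeholders for illustration, not the speaker's actual models.

```python
# Minimal sketch of an unsupervised image-to-image translation objective
# (adversarial loss plus cycle consistency). All networks are toy
# stand-ins; real systems use deep encoder-decoder generators and
# convolutional discriminators, trained alternately with a discriminator loss.
import torch
import torch.nn as nn

G_ab = nn.Conv2d(3, 3, 3, padding=1)  # toy generator, domain A -> B
G_ba = nn.Conv2d(3, 3, 3, padding=1)  # toy generator, domain B -> A
D_b  = nn.Conv2d(3, 1, 3, padding=1)  # toy discriminator on domain B

bce = nn.BCEWithLogitsLoss()
l1 = nn.L1Loss()

def generator_loss(x_a):
    """Adversarial term: fool D_b; cycle term: A -> B -> A reconstructs A."""
    fake_b = G_ab(x_a)
    logits = D_b(fake_b)
    adv = bce(logits, torch.ones_like(logits))  # generator wants "real"
    cyc = l1(G_ba(fake_b), x_a)                 # cycle consistency
    return adv + 10.0 * cyc                     # weight is illustrative

x_a = torch.randn(4, 3, 64, 64)  # random stand-in for "day" images
loss = generator_loss(x_a)
loss.backward()
```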

Speaker's Biography: Jan is VP of Learning and Perception Research at NVIDIA. He leads the Learning & Perception Research team, working predominantly on computer vision problems (from low-level vision through geometric vision to high-level vision), as well as machine learning problems (including deep reinforcement learning, generative models, and efficient deep learning). Before joining NVIDIA in 2013, Jan was a tenured faculty member at University College London. He holds a BSc in Computer Science from the University of Erlangen-Nürnberg (1999), an MMath from the University of Waterloo (1999), received his PhD from the Max-Planck-Institut für Informatik (2003), and worked as a post-doctoral researcher at the Massachusetts Institute of Technology (2003-2006).

 

Professor Liang Gao

University of Illinois Urbana-Champaign

December 5, 2018 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Plenoptic Medical Cameras

Talk Abstract: Optical imaging probes such as otoscopes and laryngoscopes are essential tools that doctors use to see deep into the human body. Until now, they have been limited to two-dimensional (2D) views of tissue lesions in vivo, which frequently jeopardizes their diagnostic usefulness. Depth imaging is critically needed in medical diagnostics because most tissue lesions manifest as abnormal 3D structural changes. In this talk, I will describe our recent efforts to develop three-dimensional (3D) plenoptic imaging tools that promise to transform diagnosis with unprecedented sensitivity and specificity in the images produced. In particular, I will discuss two plenoptic medical cameras, a plenoptic otoscope and a plenoptic laryngoscope, and their applications for in-vivo imaging.

More Information: https://iopticslab.ece.illinois.edu/

Speaker's Biography: Dr. Liang Gao is an Assistant Professor in the Department of Electrical and Computer Engineering at the University of Illinois Urbana-Champaign, where he is also affiliated with the Beckman Institute for Advanced Science and Technology. His primary research interests encompass multidimensional optical imaging, including hyperspectral and ultrafast imaging; photoacoustic tomography and microscopy; and cost-effective, high-performance optics for diagnostics. He is the author of more than 40 peer-reviewed publications in top-tier journals such as Nature, Science Advances, Physics Reports, and Annual Review of Biomedical Engineering. He received his BS in Physics from Tsinghua University in 2005 and his PhD in Applied Physics and Bioengineering from Rice University in 2011. He is a recipient of the NSF CAREER Award (2017) and the NIH MIRA Award for Early-Stage Investigators (2018).

 

Michael Broxton

Google

October 31, 2018 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Wavefront coding techniques and resolution limits for light field microscopy

Talk Abstract: Light field microscopy is a rapid, scan-less volume imaging technique that requires only a standard wide-field fluorescence microscope and a microlens array. Unlike scanning microscopes, which collect volumetric information over time, the light field microscope captures volumes synchronously in a single photographic exposure, at speeds limited only by the frame rate of the image sensor. This is made possible by the microlens array, which focuses light onto the camera sensor so that each position in the volume is mapped onto the sensor as a unique light intensity pattern. These intensity patterns are the position-dependent point response functions of the light field microscope. With prior knowledge of these point response functions, it is possible to “decode” 3-D information from a raw light field image and computationally reconstruct a full volume.
In this talk I present an optical model for light field microscopy, based on wave optics, that accurately models light field point response functions. I describe a GPU-accelerated iterative algorithm that solves for volumes, and discuss priors that are useful for reconstructing biological specimens. I then explore the diffraction limit that applies to light field microscopy and how it gives rise to position-dependent resolution limits for this microscope. I’ll explain how these limits differ from more familiar resolution metrics commonly used in 3-D scanning microscopy, such as the Rayleigh limit and the optical transfer function (OTF). Using this theory of resolution limits, I explore new wavefront coding techniques that can modify the light field resolution limits and, at least to a degree, address certain common reconstruction artifacts. The resulting resolution trade-offs suggest that light field microscopy is just one of potentially many useful forms of computational microscopy. Finally, I describe our application of light field microscopy in neuroscience, where we have used it to record calcium activity in populations of neurons within the brains of awake, behaving animals.
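
As a concrete illustration of the reconstruction step, the NumPy sketch below runs a Richardson-Lucy-style multiplicative update under a linear forward model y = A x, where each column of A holds the sensor response (point response function) of one voxel. The matrix sizes and data are toy stand-ins; the solver described in the talk is GPU-accelerated and operates on measured point response functions.

```python
# Toy Richardson-Lucy-style iterative reconstruction under a linear
# forward model y = A @ x. A is a nonnegative stand-in PSF matrix.
import numpy as np

rng = np.random.default_rng(0)
n_pixels, n_voxels = 200, 50
A = rng.random((n_pixels, n_voxels))       # toy PSF matrix (nonnegative)
x_true = rng.random(n_voxels)              # unknown volume
y = rng.poisson(A @ x_true * 50) / 50.0    # noisy light field measurement

x = np.ones(n_voxels)                      # nonnegative initial estimate
At1 = A.T @ np.ones(n_pixels)              # normalization term A^T 1
for _ in range(100):
    ratio = y / np.maximum(A @ x, 1e-12)   # data / current prediction
    x *= (A.T @ ratio) / At1               # multiplicative RL update

print("relative error:", np.linalg.norm(x - x_true) / np.linalg.norm(x_true))
```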

Speaker's Biography: Michael Broxton grew up in Los Alamos, NM, where he had his first exposure to scientific computing systems at Los Alamos National Laboratory and learned the value of shared scientific enterprise. He attended MIT, where he earned his Bachelor's and Master's degrees in EE/CS and did research in the MIT Media Laboratory. Michael then moved to California to work for six years at NASA Ames Research Center on robotics and computer vision, with a particular focus on building automated software pipelines for processing satellite imagery of Mars and the Moon. During that time he collaborated with Google to release the Moon and Mars modes for Google Earth 5.0. He then entered the Stanford PhD program in Computer Science, transitioning from the macrocosmos to the microcosmos: developing theory for light field microscopy, improving its performance, and proving out its applications in neuroscience. He recently graduated, and his PhD research is the subject of this talk. Michael now works as a research scientist at Google.

 

Petr Kellnhofer

MIT CSAIL

November 28, 2018 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Perceptual modeling with multi-modal sensing

Talk Abstract: Research on human perception has enabled many visual applications in computer graphics that efficiently utilize computational resources to deliver a high-quality experience within the limitations of the hardware. Beyond vision, humans perceive their surroundings using a variety of senses to build a mental model of the world and act upon it. This mental image is often incomplete or incorrect, which can have safety implications. As we cannot directly see inside the head, we need to read the indirect signals projected outside of it. In the first part of the talk, I will show how perceptual modeling can be used to overcome and exploit the limitations of one specific human sense: vision. Then, I will describe how we can build sensors to observe other human interactions, first those connected with physical touch and then those connected with eye gaze patterns. Finally, I will outline how such readings can be used to teach computers to understand human behavior, to predict it, and to provide assistance or safety.

Speaker's Biography: Dr. Petr Kellnhofer completed his PhD at the Max Planck Institute for Informatics in Germany under the supervision of Prof. Hans-Peter Seidel and Prof. Karol Myszkowski. His thesis on perceptual modeling of human vision for stereoscopy was awarded the Eurographics PhD Award. Since graduating, he has been a postdoc in the group of Prof. Wojciech Matusik at MIT CSAIL, working on topics related to human sensing such as eye tracking. Dr. Kellnhofer’s current research interest is combining perceptual modeling and machine learning to utilize data gathered from various types of sensors and to learn about human perception and higher-level behavior.

 

Professor Hany Farid

University of California at Berkeley

November 14, 2018 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Photo Forensics from JPEG Coding Artifacts

Talk Abstract: The past few years have seen a startling and troubling rise in the fake-news phenomenon, in which actors ranging from individuals to state-sponsored entities produce and distribute misinformation, which is then widely promoted and disseminated on social media. The implications of fake news range from a misinformed public to horrific violence and an existential threat to democracy. At the same time, recent and rapid advances in machine learning are making it easier than ever to create sophisticated and compelling fake images and videos, making the fake-news phenomenon even more powerful and dangerous. I will start by providing a broad overview of the field of image and video forensics, and then I will describe in detail a suite of image forensic techniques that explicitly detect inconsistencies in JPEG coding artifacts.
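
As a flavor of what an inconsistency in JPEG coding artifacts can look like, the toy sketch below estimates the 8x8 JPEG block-grid offset from blocking artifacts: compression leaves slightly larger pixel differences across block boundaries than within blocks, and a spliced region whose grid is misaligned with the rest of the image is a red flag. This is an illustrative simplification, not the specific techniques presented in the talk.

```python
# Toy JPEG-forensics cue: estimate the 8x8 block-grid offset from
# blocking artifacts in a grayscale image (values in [0, 1]).
import numpy as np

def grid_strength(gray, dx, dy):
    """Mean absolute difference along the candidate grid lines at (dx, dy)."""
    h = np.abs(np.diff(gray, axis=1))[:, dx + 7 :: 8].mean()  # vertical edges
    v = np.abs(np.diff(gray, axis=0))[dy + 7 :: 8, :].mean()  # horizontal edges
    return h + v

def estimate_grid(gray):
    """Return the (dx, dy) offset with the strongest blocking signature."""
    scores = {(dx, dy): grid_strength(gray, dx, dy)
              for dx in range(8) for dy in range(8)}
    return max(scores, key=scores.get)

# Usage idea: compare the offset estimated on the whole image against
# offsets estimated on local windows; a mismatch flags a suspect region.
gray = np.random.rand(256, 256)  # stand-in for a decoded JPEG luminance channel
print(estimate_grid(gray))
```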

Speaker's Biography: I am the Albert Bradley 1915 Third Century Professor of Computer Science at Dartmouth. My research focuses on digital forensics, image analysis, and human perception. I received my undergraduate degree in Computer Science and Applied Mathematics from the University of Rochester in 1989 and my Ph.D. in Computer Science from the University of Pennsylvania in 1997. Following a two-year post-doctoral fellowship in Brain and Cognitive Sciences at MIT, I joined the faculty at Dartmouth in 1999. I am the recipient of an Alfred P. Sloan Fellowship and a John Simon Guggenheim Fellowship, and I am a Fellow of the National Academy of Inventors.

 

Professor Michael Zollhöfer

Stanford University

October 24, 2018 4:30 pm to 5:30 pm

Location: Packard 101

Talk Title: Is it real? Deep Neural Face Reconstruction and Rendering

Talk Abstract: A broad range of applications in visual effects, computer animation, autonomous driving, and man-machine interaction depends heavily on robust and fast algorithms that obtain high-quality reconstructions of our physical world in terms of geometry, motion, reflectance, and illumination. In particular, with the increasing popularity of virtual, augmented, and mixed reality devices comes a rising demand for real-time, low-latency solutions.
This talk covers data-parallel optimization and state-of-the-art machine learning techniques to tackle the underlying 3D and 4D reconstruction problems based on novel mathematical models and fast algorithms. The particular focus of this talk is on self-supervised face reconstruction from a collection of unlabeled in-the-wild images. The proposed approach can be trained end-to-end without dense annotations by fusing a convolutional encoder with a differentiable expert-designed renderer and a self-supervised training loss.
The resulting reconstructions are the foundation for advanced video editing effects, such as photo-realistic re-animation of portrait videos. The core of the proposed approach is a generative rendering-to-video translation network that takes computer graphics renderings as input and generates photo-realistic modified target videos that mimic the source content. With the ability to freely control the underlying parametric face model, we are able to demonstrate a large variety of video rewrite applications. For instance, we can reenact the full head using interactive user-controlled editing and realize high-fidelity visual dubbing.
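
The self-supervised training loop described above can be sketched in a few lines: an encoder maps an image to face-model parameters, a differentiable renderer maps the parameters back to an image, and a photometric loss closes the loop without dense annotations. The toy encoder and renderer below are placeholders, not the convolutional encoder and expert-designed renderer from the talk.

```python
# Minimal sketch of a self-supervised reconstruction loss: encoder ->
# parameters -> differentiable renderer -> photometric comparison.
import torch
import torch.nn as nn

class ToyEncoder(nn.Module):
    def __init__(self, n_params=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 8, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, n_params))
    def forward(self, img):
        return self.net(img)  # predicted face-model parameters

def toy_renderer(params, size=64):
    """Stand-in differentiable 'renderer': any differentiable map from
    parameters to an image keeps the loss end-to-end trainable."""
    img = params.mean(dim=1, keepdim=True)           # (B, 1)
    return img[:, :, None, None].expand(-1, 3, size, size)

encoder = ToyEncoder()
img = torch.rand(2, 3, 64, 64)                       # unlabeled input images
rendered = toy_renderer(encoder(img))
photometric_loss = (rendered - img).abs().mean()     # self-supervised L1
photometric_loss.backward()                          # trains encoder end-to-end
```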

Speaker's Biography: Michael Zollhöfer is a Visiting Assistant Professor at Stanford University. His stay at Stanford is funded by a postdoctoral fellowship of the Max Planck Center for Visual Computing and Communication (MPC-VCC), which he received for his work in the fields of computer vision, computer graphics, and machine learning. Before joining Stanford University, Michael was a Postdoctoral Researcher at the Max Planck Institute for Informatics working with Christian Theobalt. He received his PhD from the University of Erlangen-Nuremberg for his work on real-time reconstruction of static and dynamic scenes. During his PhD, he was an intern at Microsoft Research Cambridge working with Shahram Izadi on data-parallel optimization for real-time template-based surface reconstruction. The primary goal of his research is to teach computers to reconstruct and analyze our world at frame rate based on visual input. To this end, he develops key technology to invert the image formation models of computer graphics based on data-parallel optimization and state-of-the-art deep learning techniques. The reconstructed intrinsic scene properties, such as geometry, motion, reflectance, and illumination are the foundation for a broad range of applications not only in virtual and augmented reality, visual effects, computer animation, autonomous driving, and man-machine interaction, but also in other fields such as medicine and biomechanics.

 

Previous SCIEN Colloquia

To see a list of previous SCIEN colloquia, please click here.