2019 SCIEN Affiliates Meeting Poster Presentations

Check back for updates to this list. 

HDR Imaging & High Speed Imaging for a Programmable Vision Sensor with In-pixel Processing Capabilities by Julien Martel, Stephen Carey, Piotr Dudek, Gordon Wetzstein

Global topology optimization based on generative neural networks for metasurfaces design by Jiaqi Jiang and Jonathan Fan

Deep Optics for Single-shot High-dynamic-range Imaging by Christopher A. Metzler, Hayato Ikoma, Yifan Peng, Gordon Wetzstein

User-adapting Haptic Interaction from Holdable Kinesthetic Devices by Julie M. Walker, Michael Raitor, Patrick Slade, Arec Jamgochian, Mykel Kochenderfer, Allison M. Okamura

Autofocals: Gaze-contingent eyeglasses for presbyopes by Nitish Padmanaban, Robert Konrad, Gordon Wetzstein

A Unified Push Memory for Dataflow Accelerator Generation by Qiaoyi Liu, Jeff Setter, Kathleen Feng, Xuan Yang, Teguh Hofstee, Mark Horowitz, and Priyanka Raina

Gaze-Contingent Ocular Parallax Rendering for Virtual Reality by Robert Konrad, Anastasios Angelopoulos, Gordon Wetzstein

Perceptual Accuracy of a Mixed-Reality System for MR-Guided Breast Surgical Planning in the Operating Room by Stephanie L Perkins, Michael A Lin, Subashini Srinivasan, Amanda J Wheeler, Brian A Hargreaves, Bruce L Daniel

High-Fidelity Calibration and Characterization of a Hyperspectral Computed Tomography System by Isabel O. Gallegos, Gabriella M. Dalton, Adriana M. Stohn, Srivathsan Koundinyan, Kyle R. Thompson, Edward S. Jimenez

Multifocal panoptic recording of cross-cortical neuronal dynamics in behaving mice by Isaac V. Kauvar*, Timothy A. Machado*, Elle Yuen, John Kochalka, Minseung Choi, William E. Allen, Cephra Raja, Nandini Pichamoorthy, Gordon Wetzstein, Karl Deisseroth

Shared Autonomy in Soft-Robot Teleoperation by Fabio Stroppa, Ming Luo, Allison M. Okamura

Validating neuroimaging software: the case of population receptive fields by Garikoitz Lerma-Usabiaga, Noah Benson, Jonathan Winawer, Brian A. Wandell

Improving Monocular Depth Estimation with Global Depth Histogram Matching using a Single SPAD Transient by Mark Nishimura, David Lindell, Chris Metzler, Gordon Wetzstein

Wave-based non-line-of-sight Imaging using fast f−k migration by David B. Lindell, Gordon Wetzstein, Matthew O'Toole

Learned Large Field-of-View Imaging With Thin-Plate Optics by Yifan (Evan) Peng, Qilin Sun, Xiong Dun, Gordon Wetzstein, Wolfgang Heidrich, Felix Heide

Factored Occlusion: Single Spatial Light Modulator Occlusion-capable Optical See-through Augmented Reality Display by Brooke Krajancich, Nitish Padmanaban, Gordon Wetzstein

Solving Vision Problems via Filtering by Sean I. Young, Aous T. Naman, Bernd Girod, David Taubman

Non-line-of-sight Surface Reconstruction Using the Directional Light-cone Transform by Sean I. Young, David B. Lindell, Bernd Girod, David Taubman, Gordon Wetzstein

Towards Adaptive Sampling for Depth Completion by Alexander Bergman, David Lindell, Gordon Wetzstein

A Simulation Environment for Creating Synthetic Datasets by Zheng Lyu, Zhenyi Liu, Brian Wandell, Joyce Farrell

Renal Transplantation Visualization with Augmented Reality by Elizabeth Nguyen, Dr. Bruce Daniel, Dr. Stephan Busque, and Dr. Marc L. Melcher

Scene Representation Networks: Continuous 3D-Structure-Aware Neural Scene Representations by Vincent Sitzmann, Michael Zollhöfer, Gordon Wetzstein


Abstracts

Title: Autofocals: Gaze-contingent eyeglasses for presbyopes

Authors: Nitish Padmanaban, Robert Konrad, Gordon Wetzstein

Abstract: As humans age, they gradually lose the ability to accommodate, or refocus, to near distances because of the stiffening of the crystalline lens. This condition, known as presbyopia, affects nearly 20% of people worldwide. We design and build a new presbyopia correction, autofocals, to externally mimic the natural accommodation response, combining eye tracker and depth sensor data to automatically drive focus-tunable lenses. We evaluated 19 users on visual acuity, contrast sensitivity, and a refocusing task. Autofocals exhibit better visual acuity when compared to monovision and progressive lenses while maintaining similar contrast sensitivity. On the refocusing task, autofocals are faster and, compared to progressives, also significantly more accurate. In a separate study, a majority of users (23 of 37) ranked autofocals as the best correction in terms of ease of refocusing. Our work demonstrates the superiority of autofocals over current forms of presbyopia correction and could affect the lives of millions.

Bio: Nitish is a fifth year PhD student working in the Computational Imaging Lab at Stanford. His research focuses on computational and optical techniques for improving focus cues in virtual and augmented reality and the real world.


Title: A Unified Push Memory for Dataflow Accelerator Generation

Authors: Qiaoyi Liu, Jeff Setter, Kathleen Feng, Xuan Yang, Teguh Hofstee, Mark Horowitz, and Priyanka Raina 

Abstract: Hardware accelerators for many application domains rely on custom on-chip buffer hierarchies to exploit parallelism and locality to obtain efficiency. Usually these memories “push” data to the compute (or other memory units) by associating the address generation with the buffer and exploiting the application’s deterministic access patterns. While push memories are nearly universal in accelerators, the type of push memory depends on the application domain (e.g., line buffers for image pipelines, double buffers for DNNs). Systems that create accelerators often use a single buffer model and thus are limited. This paper describes a general push memory abstraction, called a unified buffer, which can describe any push memory. Using this unified buffer enabled us to modify a system that generated image processing accelerators from the Halide DSL to allow it to span many application domains. We created a set of rewrite rules for optimizing the generated buffers and mapping them to FPGAs and coarse-grained reconfigurable arrays (CGRAs). Using the unified buffer abstraction, the resulting system can handle applications that include image processing, DNNs, and hybrid applications like MobileNet.
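
The sketch below illustrates the push-memory idea in plain Python: the buffer owns a deterministic (affine) address generator and streams operands out in the order the compute expects, rather than serving addresses on demand. The class name, the extent/stride schedule, and the sliding-window example are illustrative assumptions, not the paper's implementation.

```python
from dataclasses import dataclass
from typing import Iterator, List

@dataclass
class UnifiedBuffer:
    """Toy push memory: the buffer owns the address generator and streams
    operands to the consumer on a fixed, compile-time-known schedule."""
    data: List[int]
    extents: List[int]   # loop bounds of the deterministic access pattern
    strides: List[int]   # affine stride per loop level
    offset: int = 0

    def push(self) -> Iterator[int]:
        # Enumerate the nested affine loops and emit values in the order
        # the downstream compute unit expects them.
        def walk(level: int, addr: int) -> Iterator[int]:
            if level == len(self.extents):
                yield self.data[addr]
                return
            for i in range(self.extents[level]):
                yield from walk(level + 1, addr + i * self.strides[level])
        yield from walk(0, self.offset)

# A line-buffer-like schedule: 3-wide sliding windows over an 8-element row.
buf = UnifiedBuffer(data=list(range(8)), extents=[6, 3], strides=[1, 1])
print(list(buf.push()))  # [0, 1, 2, 1, 2, 3, 2, 3, 4, ...]
```

Changing only the extents and strides reproduces other push-memory schedules (for example, the ping-pong order of a double buffer), which is the sense in which a single abstraction can span application domains.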

Bio: Kathleen is a PhD student in electrical engineering, advised by Priyanka Raina. She is interested in designing and developing new computer architectures for computer graphics, vision, and related applications.


Title: Gaze-Contingent Ocular Parallax Rendering for Virtual Reality

Authors: Robert Konrad, Anastasios Angelopoulos, Gordon Wetzstein

Abstract: Immersive computer graphics systems strive to generate perceptually realistic user experiences. Current-generation virtual reality (VR) displays are successful in accurately rendering many perceptually important effects, including perspective, disparity, motion parallax, and other depth cues. In this paper we introduce ocular parallax rendering, a technology that accurately renders small amounts of gaze-contingent parallax capable of improving depth perception and realism in VR. Ocular parallax describes the small amounts of depth-dependent image shifts on the retina that are created as the eye rotates. The effect occurs because the centers of rotation and projection of the eye are not the same. We study the perceptual implications of ocular parallax rendering by designing and conducting a series of user experiments. Specifically, we estimate perceptual detection and discrimination thresholds for this effect and demonstrate that it is clearly visible in most VR applications. Additionally, we show that ocular parallax rendering provides an effective ordinal depth cue and it improves the impression of realistic depth in VR.
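
As a back-of-envelope illustration of the effect size, the depth-dependent shift can be estimated from the small translation of the center of projection that a gaze rotation induces. The constants below (offset between the centers, gaze angle, object depths) are assumptions for illustration, not values taken from the paper.

```python
import numpy as np

d_cp = 0.006                      # assumed ~6 mm offset between the centers of projection and rotation
theta = np.deg2rad(15.0)          # a 15-degree gaze shift
baseline = d_cp * np.sin(theta)   # effective translation of the center of projection

z_near, z_far = 0.5, 5.0          # depths of two objects, in meters
parallax = baseline * (1.0 / z_near - 1.0 / z_far)   # relative angular shift, radians
print(np.rad2deg(parallax) * 60.0, "arcmin")         # roughly 9.6 arcmin for these numbers
```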

Bio: I am a 6th year PhD candidate in the Electrical Engineering Department at Stanford University, advised by Professor Gordon Wetzstein, as part of the Stanford Computational Imaging Lab. My research interests lie at the intersection of computational displays and human physiology, with a specific focus on virtual and augmented reality systems. For such systems, I have worked on supporting various depth cues, with a particular interest in focus cues and ocular parallax.


Title: High-Fidelity Calibration and Characterization of a Hyperspectral Computed Tomography System

Authors: Isabel O. Gallegos, Gabriella M. Dalton, Adriana M. Stohn, Srivathsan Koundinyan, Kyle R. Thompson, Edward S. Jimenez

Abstract: This work presents a numerical method to characterize the nonlinear encoding operator of the world's first hyperspectral x-ray computed tomography (H-CT) system as a sequence of discrete-to-discrete, linear imaging system matrices across unique and narrow energy windows. H-CT has various applications in the non-destructive analysis of materials and objects in fields such as national security, industry, and medicine, but acquiring physical H-CT data requires significant time and money. Additionally, many approaches to CT make gross assumptions about the image formation process in order to apply post-processing and reconstruction techniques that lead to inferior data, resulting in faulty measurements, assessments, and quantifications. Through the analysis of the point source response for each energy channel at each location in the field of view, we present a linear model that describes the H-CT system. This work presents the numerical method used to produce the model through the collection of data needed to describe the system; the parameterization used to compress the model; and the decompression of the model for computation. By using this linear model, large amounts of accurate synthetic H-CT data can be efficiently produced, greatly reducing the costs associated with physical H-CT scans. Successfully approximating the encoding operator for the H-CT system through a point spread distribution enables quick assessment of H-CT behavior for various applications in high-performance reconstruction, sensitivity analysis, and machine learning. This project was conducted at Sandia National Laboratories (SNL). SNL is managed and operated by NTESS under DOE NNSA contract DE-NA0003525.

Bio: I am a second-year undergraduate student at Stanford University studying computer science, with a plan to pursue a PhD in computer science. My research has focused on high-energy hyperspectral x-ray computed tomography, and I have a patent pending, two technical disclosures for patents, and three publications as a result of this work. At Stanford, I am the Corporate Liaison for the Society of Latinx Engineers (SOLE), a member of the Outreach Board for Women in Computer Science (WiCS), and a member of Code the Change.


Title: Multifocal panoptic recording of cross-cortical neuronal dynamics in behaving mice

Authors: Isaac V. Kauvar*, Timothy A. Machado*, Elle Yuen, John Kochalka, Minseung Choi, William E. Allen, Cephra Raja, Nandini Pichamoorthy, Gordon Wetzstein, Karl Deisseroth

Abstract: We present a large field of view, multifocal optical system for simultaneously recording from over a thousand neuronal sources distributed across the entirety of mouse dorsal cortex at over 25 Hz.

Bio: Isaac Kauvar is a final-year graduate student co-advised by Gordon Wetzstein and Karl Deisseroth. He works at the intersection of neuroscience, machine learning, and optics. www.ivk.io


Title: Wave-based non-line-of-sight Imaging using fast f−k migration

Authors: David B. Lindell, Gordon Wetzstein, Matthew O'Toole

Abstract: Imaging objects outside a camera’s direct line of sight has important applications in robotic vision, remote sensing, and many other domains. Time-of-flight-based non-line-of-sight (NLOS) imaging systems have recently demonstrated impressive results, but several challenges remain. Image formation and inversion models have been slow or limited by the types of hidden surfaces that can be imaged. Moreover, non-planar sampling surfaces and non-confocal scanning methods have not been supported by efficient NLOS algorithms. With this work, we introduce a wave-based image formation model for the problem of NLOS imaging. Inspired by inverse methods used in seismology, we adapt a frequency-domain method, f-k migration, for solving the inverse NLOS problem. Unlike existing NLOS algorithms, f-k migration is both fast and memory efficient, it is robust to specular and other complex reflectance properties, and we show how it can be used with non-confocally scanned measurements as well as for non-planar sampling surfaces. f-k migration is more robust to measurement noise than alternative methods, generally produces better quality reconstructions, and is easy to implement. We experimentally validate our algorithms with a new NLOS imaging system that records room-sized scenes outdoors under indirect sunlight, and scans persons wearing retroreflective clothing at interactive rates.
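
For readers unfamiliar with f-k (Stolt) migration, the sketch below shows the core frequency-domain remapping on a toy 2D measurement (one spatial axis plus time). It omits the resampling and normalization steps of the full method and is a schematic illustration only, not the authors' implementation.

```python
import numpy as np
from scipy.interpolate import interp1d

def fk_migrate_2d(meas, dx, dt, c=3e8):
    """Schematic Stolt-style remapping: FFT the (x, t) measurement, resample each
    spatial-frequency slice from temporal frequency f onto c*sqrt(kx^2 + kz^2),
    then inverse FFT to obtain a reconstruction over (x, z)."""
    nx, nt = meas.shape
    D = np.fft.fftshift(np.fft.fft2(meas))
    kx = np.fft.fftshift(np.fft.fftfreq(nx, d=dx))
    f = np.fft.fftshift(np.fft.fftfreq(nt, d=dt))
    kz = f / c                                  # reuse the frequency grid as depth wavenumbers
    out = np.zeros_like(D)
    for i, kxi in enumerate(kx):
        f_src = c * np.sign(kz) * np.sqrt(kxi ** 2 + kz ** 2)   # Stolt frequency mapping
        re = interp1d(f, D[i].real, bounds_error=False, fill_value=0.0)
        im = interp1d(f, D[i].imag, bounds_error=False, fill_value=0.0)
        out[i] = re(f_src) + 1j * im(f_src)
    return np.real(np.fft.ifft2(np.fft.ifftshift(out)))

recon = fk_migrate_2d(np.random.rand(64, 128), dx=0.01, dt=4e-12)
```

Because the entire inversion reduces to FFTs plus a one-dimensional interpolation per frequency slice, the method is both fast and memory efficient, which is the property the abstract emphasizes.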

Bio: David is a 4th year PhD candidate in the Electrical Engineering Department at Stanford University advised by Gordon Wetzstein. His recent work focuses on developing computational algorithms for non-line-of-sight imaging, single-photon imaging, and 3D imaging with sensor fusion. He received his bachelor's and master's degrees in 2015 and 2016 from Brigham Young University where he worked on remote sensing algorithms for satellite-based radar.


Title: Shared Autonomy in Soft-Robot Teleoperation

Authors: Fabio Stroppa, Ming Luo, Allison M. Okamura

Abstract: We present a user-friendly interface for teleoperating a soft-robot manipulator in a complex environment. The system consists of a growing manipulator with a grasping end-effector and a gesture-based controller. The project aims to explore the spectrum of interactions between human-controlled teleoperation and autonomous behavior. In these shared-autonomy scenarios, the human operator and robot intelligence are both necessary for task completion.

Bio: Dr. Fabio Stroppa is a Software Engineer and received his Ph.D. in Perceptual Robotics from Scuola Superiore Sant’Anna in 2018. He is currently a postdoctoral scholar in the Collaborative Haptics and Robotics in Medicine (CHARM) Lab at Stanford University, working with Prof. Allison Okamura in the Department of Mechanical Engineering, and, by courtesy, Computer Science.


Title: Learned Large Field-of-View Imaging With Thin-Plate Optics

Authors: Yifan (Evan) Peng, Qilin Sun, Xiong Dun, Gordon Wetzstein, Wolfgang Heidrich, Felix Heide

Abstract: Typical camera optics consist of a system of individual elements that are designed to compensate for the aberrations of a single lens. Recent computational cameras shift some of this correction task from the optics to post-capture processing, reducing the imaging optics to only a few optical elements. However, these systems only achieve reasonable image quality by limiting the field of view (FOV) to a few degrees — effectively ignoring severe off-axis aberrations with blur sizes of several hundred pixels.
In this work, we propose a lens design and learned reconstruction architecture that lift this limitation and provide an order of magnitude increase in field of view using only a single thin-plate lens element. Specifically, we design a lens to produce spatially shift-invariant point spread functions, over the full FOV, that are tailored to the proposed reconstruction architecture. We achieve this with a mixture PSF, consisting of a peak and a low-pass component, which provides residual contrast instead of a small spot size as in traditional lens designs. To perform the reconstruction, we train a deep network on captured data from a display lab setup, eliminating the need for manual acquisition of training data in the field. We assess the proposed method in simulation and experimentally with a prototype camera system. We compare our system against existing single-element designs, including an aspherical lens and a pinhole, and we compare against a complex multi-element lens, validating high-quality large field-of-view (i.e., 53°) imaging performance using only a single thin-plate element.

Bio: Yifan (Evan) Peng is a Postdoc Research Fellow in Stanford Electrical Engineering. He received his PhD in Computer Science at The University of British Columbia. He was a Visiting Research Student in the Computational Imaging Group at Stanford University, and a Remote (Visiting) Researcher at the Visual Computing Center, King Abdullah University of Science and Technology. Prior to joining UBC, he worked at Lenovo Research as a Display Tech Researcher. Before that, he obtained his MSc and BE degrees, both in Optical Science and Engineering, from the State Key Lab of Modern Optical Instrumentation, Zhejiang University. His research interests span the interdisciplinary fields of optics/photonics, graphics, and vision. Much of his work concerns computational imaging solutions, on both the capture and display ends, as well as AR/MR enhancement.


Title: User-adapting Haptic Interaction from Holdable Kinesthetic Devices

Authors: Julie M. Walker, Michael Raitor, Patrick Slade, Arec Jamgochian, Mykel Kochenderfer, Allison M. Okamura

Abstract: Augmented and virtual reality have many potential applications that could benefit from haptic (touch) interaction. Most consumer haptic interfaces rely primarily on vibration feedback. This often feels unnatural and can’t provide intuitive directional force feedback. We have developed two holdable haptic devices that provide directional, kinesthetic haptic cues. The first device uses gyroscopic effects to generate salient torques on a user’s hand in three degrees of freedom. The second device is a holdable haptic gripper for manipulating virtual objects and provides intuitive forces to the user’s fingertips in four degrees of freedom. Using both systems, we have explored how different users perceive and respond to haptic guidance. We use model-based reinforcement learning and optimal control techniques to tailor the haptic controller to each user, improving their performance in virtual guidance tasks.

Bio:  Julie M. Walker is a sixth-year Mechanical Engineering Ph.D. candidate in Allison Okamura's Collaborative Haptics and Robotics in Medicine Lab (CHARM Lab). Her research focuses on haptic guidance for applications such as surgical training, virtual reality, and teleoperation of robotic systems. She is finishing her Ph.D. in early 2020, and is looking for a job!


Title: Global topology optimization based on generative neural networks for metasurfaces design

Authors: Jiaqi Jiang and Jonathan Fan

Abstract: Metasurfaces are subwavelength-structured artificial media that can shape and localize electromagnetic waves in unique ways. The inverse design of these devices is a non-convex optimization problem in a high-dimensional space, making global optimization a major challenge. We present a new type of population-based global optimization algorithm for metasurfaces that is enabled by the training of a generative neural network. The loss function used for backpropagation depends on the generated pattern layouts, their efficiencies, and efficiency gradients, which are calculated by the adjoint variables method using forward and adjoint electromagnetic simulations. We observe that the distribution of devices generated by the network continuously shifts towards high-performance design-space regions over the course of optimization. Upon training completion, the best generated devices have efficiencies comparable to or exceeding the best devices designed using standard topology optimization. Our proposed global optimization algorithm can be applied generally to other gradient-based optimization problems in optics, mechanics, and electronics.
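
A minimal PyTorch sketch of this population-based training loop follows. The generator architecture, the toy quadratic stand-in for the forward/adjoint electromagnetic simulations, and the hyperparameters are assumptions for illustration only; the structure of the loss (efficiency-weighted gradients over a generated population) is the point being shown.

```python
import torch
import torch.nn as nn

# Toy stand-in for the forward + adjoint EM simulations: returns each device's
# "efficiency" and the gradient of that efficiency with respect to its pattern.
def efficiency_and_gradient(x):
    target = torch.linspace(-1.0, 1.0, x.shape[-1])
    eff = 1.0 - ((x - target) ** 2).mean(dim=-1)
    grad = -2.0 * (x - target) / x.shape[-1]
    return eff.detach(), grad.detach()

generator = nn.Sequential(nn.Linear(16, 64), nn.ReLU(),
                          nn.Linear(64, 128), nn.Tanh())  # noise -> device pattern
opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
sigma = 0.1  # temperature that biases the population toward high-efficiency devices

for step in range(200):
    z = torch.randn(32, 16)                  # a population of latent codes
    x = generator(z)                         # candidate metasurface patterns
    eff, grad = efficiency_and_gradient(x)   # from the (simulated) adjoint method
    w = torch.exp(eff / sigma)
    # Push each generated pattern along its own efficiency gradient, weighting
    # the best devices in the batch most heavily.
    loss = -(w.unsqueeze(-1) * grad * x).sum() / w.sum()
    opt.zero_grad(); loss.backward(); opt.step()
```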

Bio: I am Jiaqi Jiang, a 3rd-year PhD student in the Department of Electrical Engineering. My research interests focus on the application of machine learning techniques to the optimization of photonics design.


Title: Deep Optics for Single-shot High-dynamic-range Imaging

Authors: Christopher A. Metzler, Hayato Ikoma, Yifan Peng, Gordon Wetzstein

Abstract: We use end-to-end learning to develop a single-shot high-dynamic-range (HDR) imaging system. Our system can be thought of as an electro-optical auto-encoder: an optical phase mask encodes an HDR image into a low-dynamic-range image that is recorded by a sensor and then decoded into an HDR image by a convolutional neural network (CNN). This framework allows us to automatically optimize the phase mask using training data and back-propagation. We fabricate this optimized optical element and attach it as a hardware add-on to a conventional camera during inference. In extensive simulations and with a physical prototype, we demonstrate that this end-to-end deep optical imaging approach to single-shot HDR imaging outperforms both purely CNN-based approaches and other PSF engineering approaches.
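
A simplified PyTorch sketch of the electro-optical auto-encoder structure is shown below. It replaces the physically modeled phase-mask PSF with a learnable convolution kernel and uses hard clipping for sensor saturation, so it illustrates the joint optics/decoder training setup rather than the authors' optical model.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DeepOpticsHDR(nn.Module):
    def __init__(self, ksize=15):
        super().__init__()
        # Learnable blur kernel standing in for the PSF induced by the phase mask.
        self.psf_logits = nn.Parameter(torch.zeros(1, 1, ksize, ksize))
        # CNN decoder that maps the saturated sensor image back to an HDR estimate.
        self.decoder = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1), nn.Softplus())

    def forward(self, hdr):
        psf = torch.softmax(self.psf_logits.flatten(), dim=0).view_as(self.psf_logits)
        encoded = F.conv2d(hdr, psf, padding=self.psf_logits.shape[-1] // 2)
        ldr = encoded.clamp(0.0, 1.0)   # sensor saturation: the recorded low-dynamic-range image
        return self.decoder(ldr)

model = DeepOpticsHDR()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
hdr = torch.rand(4, 1, 64, 64) * 10.0   # toy HDR batch with highlights above the sensor range
loss = F.l1_loss(torch.log1p(model(hdr)), torch.log1p(hdr))
loss.backward()
opt.step()
```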

Bios:
Chris Metzler: I am a postdoctoral researcher in Gordon Wetzstein's Computational Imaging Lab. Prior to that, I was a PhD student in the Machine Learning, Digital Signal Processing, and Computational Imaging labs at Rice University, where I worked under the direction of professors Richard Baraniuk and Ashok Veeraraghavan. My research focuses on the development, application, and analysis of new signal processing algorithms. I am especially interested in algorithm design related to problems in computational imaging, machine learning, and communications.
Hayato Ikoma: I am a Ph.D. student in Gordon Wetzstein's Computational Imaging Group. My research focuses on the application of computational imaging to fluorescence optical microscopy, where I develop new computational imaging techniques and fabricate custom optical elements.


Title: Factored Occlusion: Single Spatial Light Modulator Occlusion-capable Optical See-through Augmented Reality Display

Authors: Brooke Krajancich, Nitish Padmanaban, Gordon Wetzstein

Abstract: Occlusion is a powerful visual cue that is crucial for depth perception and realism in optical see-through augmented reality (OST-AR). However, existing OST-AR systems additively overlay physical and digital content with beam combiners – an approach that does not easily support mutual occlusion, resulting in virtual objects that appear semi-transparent and unrealistic. In this work, we propose a new type of occlusion-capable OST-AR system. Rather than additively combining the real and virtual worlds, we employ a single digital micromirror device (DMD) to merge the respective light paths in a multiplicative manner. This unique approach allows us to simultaneously block light incident from the physical scene on a pixel-by-pixel basis while also modulating the light emitted by a light emitting diode (LED) to display digital content. Our technique builds on mixed binary/continuous factorization algorithms to optimize time-multiplexed binary DMD patterns and their corresponding LED colors to approximate a target augmented reality (AR) scene. In simulations and with a prototype benchtop display, we demonstrate hard-edge occlusions, plausible shadows, and also gaze-contingent optimization of this novel display mode, which only requires a single spatial light modulator.
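
The numpy sketch below gives a rough idea of the mixed binary/continuous factorization: approximate a target image as a sum over time-multiplexed frames, each the product of a binary DMD pattern and a single LED color. The alternating update scheme and the rounding rule here are illustrative assumptions, not the authors' solver.

```python
import numpy as np

def factorize(target, T=8, iters=50):
    """Approximate an H x W x 3 target as a sum over T time-multiplexed frames,
    each the outer product of a binary DMD pattern (H x W) and an LED color (3,)."""
    H, W, _ = target.shape
    B = (np.random.rand(T, H, W) > 0.5).astype(float)   # binary spatial patterns
    C = np.random.rand(T, 3)                             # continuous LED colors
    for _ in range(iters):
        for t in range(T):
            # Residual with frame t removed from the current approximation.
            approx = np.einsum('thw,tc->hwc', B, C)
            residual = target - approx + B[t][..., None] * C[t]
            # Continuous step: least-squares LED color over the lit pixels.
            denom = B[t].sum() + 1e-8
            C[t] = np.clip((residual * B[t][..., None]).sum(axis=(0, 1)) / denom, 0.0, 1.0)
            # Binary step: switch a micromirror on only if it lowers the pixel error.
            err_on = ((residual - C[t]) ** 2).sum(axis=-1)
            err_off = (residual ** 2).sum(axis=-1)
            B[t] = (err_on < err_off).astype(float)
    return B, C

B, C = factorize(np.random.rand(32, 32, 3), T=4, iters=10)
```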

Bio: Brooke is a second year electrical engineering PhD candidate working in the Computational Imaging Lab at Stanford, supervised by Professor Gordon Wetzstein. Her research focuses on developing computational techniques that leverage the co-design of optical elements, image processing algorithms and principles imposed by the human visual system for improving current-generation virtual and augmented reality displays.


Title: Towards Adaptive Sampling for Depth Completion

Authors: Alexander Bergman, David Lindell, Gordon Wetzstein

Abstract: Estimating dense depth images from sparse samples and aligned RGB images (depth completion) is necessary for imaging modalities such as LiDAR, where it is infeasible to scan every point in a scene for depth. We introduce a new deep-learning method for depth completion which performs well at very sparse sampling densities. Additionally, we propose a method that adaptively determines the sparse sampling locations that best inform the depth completion task. Optimized together, these components approach an end-to-end adaptive sampling system for dense depth estimation.

Bio: Alex is a second year PhD student working in the Computational Imaging Lab at Stanford. His research interests are in 3D imaging and computational imaging systems for autonomous agents where the agent can both adaptively image the world and optimally act according to its goal.


Title: Improving Monocular Depth Estimation with Global Depth Histogram Matching using a Single SPAD Transient

Authors: Mark Nishimura, David Lindell, Chris Metzler, Gordon Wetzstein

Abstract: Existing monocular depth estimation algorithms successfully predict the relative depth order of objects in a scene. However, because of the fundamental scale ambiguity associated with monocular images, these algorithms fail at correctly predicting an object’s true metric depth. In this work, we demonstrate how a depth histogram of the scene, which can be readily captured using a single-pixel diffused single-photon avalanche diode (SPAD), can be fused with the output of existing monocular depth estimation algorithms to resolve the depth ambiguity problem. We validate this novel sensor fusion technique experimentally and in extensive simulation. We show that it dramatically improves the performance of several state-of-the-art monocular depth estimation algorithms.
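
To make the fusion idea concrete, here is a hedged numpy sketch of one way a relative monocular depth map could be rescaled with a SPAD-derived global depth histogram, using a simple quantile-matching step. The authors' method performs the fusion differently and is not reproduced here; this only illustrates how a global histogram can resolve the scale ambiguity.

```python
import numpy as np

def rescale_with_spad_histogram(mono_depth, spad_counts, bin_edges):
    """Remap a relative monocular depth map so its global depth distribution
    matches the histogram measured by a single diffused SPAD pixel."""
    spad_cdf = np.cumsum(spad_counts).astype(float)
    spad_cdf /= spad_cdf[-1]
    bin_centers = 0.5 * (bin_edges[:-1] + bin_edges[1:])
    flat = mono_depth.ravel()
    ranks = flat.argsort().argsort() / (flat.size - 1)   # preserves the predicted depth ordering
    metric = np.interp(ranks, spad_cdf, bin_centers)     # quantile matching to metric depth
    return metric.reshape(mono_depth.shape)

# Toy usage with synthetic inputs.
mono = np.random.rand(240, 320)                          # relative depth from a monocular network
counts, edges = np.histogram(np.random.gamma(3.0, 1.0, 10000), bins=64, range=(0.0, 20.0))
depth_metric = rescale_with_spad_histogram(mono, counts, edges)
```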

Bio: Mark Nishimura is a second-year PhD student in the Computational Imaging Lab. His main research interests include machine learning and optimization for single-photon imaging, 3D imaging, and inverse rendering.


Title: Non-line-of-sight Surface Reconstruction Using the Directional Light-cone Transform

Authors: Sean I. Young, David B. Lindell, Bernd Girod, David Taubman, Gordon Wetzstein

Abstract: We propose a joint albedo-normal approach to non-line-of-sight (NLOS) surface reconstruction using the directional light-cone transform (D-LCT). While current NLOS imaging methods reconstruct either the albedo or surface normals of the hidden scene, the two quantities provide complementary information about the scene, so an efficient method to estimate both simultaneously is desirable. We formulate the recovery of the two quantities as a vector deconvolution problem, and solve it via Cholesky-Wiener decomposition. We demonstrate that surfaces fitted non-parametrically using our recovered normals are more accurate than those produced by recently proposed NLOS surface reconstruction methods, and are 1,000 times faster to compute than inverse-rendering methods.
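
For orientation, the scalar analogue of the deconvolution step is the standard frequency-domain Wiener filter sketched below. The paper's D-LCT extends this to a vector-valued unknown (albedo plus normals) via a Cholesky-Wiener decomposition, which this sketch does not implement; the kernel is assumed to be origin-centered and the SNR value is an assumption.

```python
import numpy as np

def wiener_deconvolve(measurement, kernel, snr=100.0):
    """Scalar frequency-domain Wiener deconvolution with an assumed SNR."""
    M = np.fft.fftn(measurement)
    K = np.fft.fftn(kernel, s=measurement.shape)   # blur kernel, assumed centered at the origin
    W = np.conj(K) / (np.abs(K) ** 2 + 1.0 / snr)  # regularized inverse filter
    return np.real(np.fft.ifftn(W * M))
```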

Bio: Dr. Sean Young is a postdoctoral researcher with the IVMS Lab (led by Bernd Girod) and the Computational Imaging Lab (led by Gordon Wetzstein) at Stanford University. Previously, he was a PhD candidate in the IVMP Lab at the University of New South Wales, Australia, advised by Prof. David Taubman. His research interests are signal processing and compression, and inverse problems in imaging.


Title: Validating neuroimaging software: the case of population receptive fields

Authors: Garikoitz Lerma-Usabiaga, Noah Benson, Jonathan Winawer, Brian A. Wandell

Abstract: Neuroimaging software methods are complex, making it a near certainty that some implementations will contain errors. It is difficult, nay impossible, for researchers to check the validity of software by reading source code - even when the code is open and shared. For this reason, the community should establish methods to evaluate implementation validity. We describe a computational approach for validating and sharing software implementations and apply it to a particular application: population receptive field (pRF) methods for functional MRI data. The methods can be extended to many other critical neuroimaging algorithms. Having unit and system testing protocols can help (a) developers to build new software, (b) research scientists to verify the software’s accuracy, and (c) reviewers to evaluate the methods used in publications and grants.
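
The style of test the abstract advocates can be as simple as the following sketch: synthesize BOLD time series from known pRF parameters and assert that the implementation under test recovers them within tolerance. The fit_prf function, the Gaussian pRF model, the stimulus format, and the tolerances are hypothetical stand-ins for illustration.

```python
import numpy as np

def gaussian_prf(x, y, x0, y0, sigma):
    return np.exp(-((x - x0) ** 2 + (y - y0) ** 2) / (2.0 * sigma ** 2))

def synthesize_timeseries(stim, x0, y0, sigma):
    """One response sample per stimulus frame: the stimulus aperture weighted by the pRF."""
    xs, ys = np.meshgrid(np.linspace(-10, 10, stim.shape[1]),
                         np.linspace(-10, 10, stim.shape[2]))
    rf = gaussian_prf(xs, ys, x0, y0, sigma)
    return (stim * rf).sum(axis=(1, 2))

def test_recovers_ground_truth(fit_prf):
    """fit_prf is the implementation under test; it is assumed to return a dict
    with keys 'x0', 'y0', and 'sigma'."""
    stim = (np.random.rand(200, 64, 64) > 0.7).astype(float)   # synthetic aperture stimulus
    truth = dict(x0=2.0, y0=-1.5, sigma=1.2)
    ts = synthesize_timeseries(stim, **truth)
    est = fit_prf(stim, ts)
    assert abs(est["x0"] - truth["x0"]) < 0.2
    assert abs(est["y0"] - truth["y0"]) < 0.2
    assert abs(est["sigma"] - truth["sigma"]) < 0.2
```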

Bio: Originally an EE and MBA, I worked in management consultancy and created a technology and marketing company in Spain. After that, I obtained a PhD in Cognitive Neuroscience, doing reading research using multimodal MRI methods. I work in Brian Wandell's lab on white matter and population receptive field methods.


Title: Solving Vision Problems via Filtering

Authors: Sean I. Young, Aous T. Naman, Bernd Girod, David Taubman

Abstract: We propose a new filtering approach for solving a large number of regularized inverse problems commonly found in computer vision. Traditionally, such problems are solved by finding the solution to the system of equations that expresses the first-order optimality conditions of the problem. This can be slow if the system of equations is dense due to the use of nonlocal regularization, necessitating iterative solvers such as successive over-relaxation or conjugate gradients. In this paper, we show that similar solutions can be obtained more easily via filtering, obviating the need to solve a potentially dense system of equations using slow iterative methods. Our filtered solutions are very similar to the true ones, but up to 10 times faster to compute.
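
The toy 1D example below illustrates the contrast the abstract draws, but is not the paper's formulation: it solves the first-order optimality system of a simple quadratic smoothing problem exactly, then approximates the same smoothing with a single Gaussian filtering pass whose bandwidth is matched heuristically.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d
from scipy.sparse import diags, identity
from scipy.sparse.linalg import spsolve

n, lam = 512, 20.0
b = np.sign(np.sin(np.linspace(0.0, 6.0 * np.pi, n))) + 0.3 * np.random.randn(n)

# Exact solution of min_x ||x - b||^2 + lam * ||Dx||^2 via its optimality
# system (I + lam * L) x = b, with L the 1D graph Laplacian.
L = diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n))
x_exact = spsolve((identity(n) + lam * L).tocsc(), b)

# Filtering surrogate: one pass of Gaussian smoothing with a matched bandwidth.
x_filtered = gaussian_filter1d(b, sigma=np.sqrt(lam))

print("mean abs. difference:", np.abs(x_exact - x_filtered).mean())
```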

Bio: Dr. Sean Young is a postdoctoral researcher with the IVMS Lab (led by Bernd Girod) and the Computational Imaging Lab (led by Gordon Wetzstein) at Stanford University. Previously, he was a PhD candidate in the IVMP Lab at the University of New South Wales, Australia, advised by Prof. David Taubman. His research interests are signal processing and compression, and inverse problems in imaging.


Title: Perceptual Accuracy of a Mixed-Reality System for MR-Guided Breast Surgical Planning in the Operating Room

Authors: Stephanie L Perkins, Michael A Lin, Subashini Srinivasan, Amanda J Wheeler, Brian A Hargreaves, Bruce L Daniel

Abstract: One quarter of women who undergo breast lumpectomy to treat early-stage breast cancer in the United States undergo a repeat surgery due to concerns that residual tumor was left behind. This has led to a significant increase in women choosing mastectomy operations in the United States. We have developed a mixed-reality system that projects a 3D “hologram” of images from a breast MRI onto a patient using the Microsoft HoloLens. The goal of this system is to reduce the number of repeated surgeries by improving surgeons’ ability to determine tumor extent. We are conducting a pilot study in patients with palpable tumors that tests a surgeon’s ability to accurately identify the tumor location via mixed-reality visualization during surgical planning. Although early results are promising, it is critical but not straightforward to align holograms to the breast and to account for tissue deformations. More work is needed to improve the registration and holographic display at arm’s-length working distance. Nonetheless, first results from breast cancer surgeries have shown that mixed-reality guidance can indeed provide information about tumor location, and that this exciting new use for AR has the potential to improve the lives of many patients.

Bio: Steffi Perkins is a late-stage PhD student in the Department of Bioengineering at Stanford. She works under the mentorship of Drs. Brian Hargreaves and Bruce Daniel, where her research interests focus on the intersection of medical imaging and mixed reality. She has experience developing methods to image breast cancer with MRI, developing and testing mixed-reality systems for breast and thoracic surgical planning in the OR, and collaborating with both engineers and clinicians.


Title: A Simulation Environment for Creating Synthetic Datasets

Authors: Zheng Lyu, Zhenyi Liu, Brian Wandell, Joyce Farrell

Abstract: We describe a collection of software tools for generating labeled, ground-truth image data (ISET3d and ISETCam). The software uses physically based ray tracing (PBRT) to model scene radiance and the transformation through multi-element lenses to the sensor irradiance. The ISET3d software also creates pixel-level labels for depth and object type. Complex scenes are assembled from a database of objects and materials stored in Flywheel.io. Camera images are created using the ISETCam software that converts the sensor spectral irradiance to RGB images. The simulation environment makes it possible to precisely specify the properties of scenes, optics, sensors and image processing pipelines and to associate the camera images with ground-truth depth maps and labeled objects. The simulation environment can adjust the position and number of images for camera array applications. We describe a specific synthetic dataset intended to support the development and evaluation of algorithms for depth estimation and image alignment.

Bio: Zheng Lyu is a PhD candidate in the Electrical Engineering Department at Stanford. He is working with Prof. Brian Wandell and Dr. Joyce Farrell. His research interests include camera simulation and design, digital/medical imaging, and 3D graphics simulation. Prior to joining Stanford, he earned a bachelor's degree in engineering from Tsinghua University in 2016.


Title: Renal Transplantation Visualization with Augmented Reality

Authors: Elizabeth Nguyen, Dr. Bruce Daniel, Dr. Stephan Busque, and Dr. Marc L. Melcher

Abstract: Kidney shortages remain a rampant issue in healthcare, with many people dying every day while waiting for a transplant. Transplant surgeries remain high-risk because of large anatomical variance between structures. Since surgeries are risky and kidneys are scarce, there are many efforts to use pre-operative planning strategies to minimize surgical complications. Today, surgeons often view kidneys using flat paper or computer displays. Recently, a few doctors have seen the benefits of viewing anatomy using 3D-printed models. We propose augmented reality as a cost-efficient viewing modality that allows high-quality viewing of a large volume of kidneys at once. Our viewing application, designed for the Microsoft HoloLens, allows the user to use voice commands and hand gestures to rotate models and enlarge them for optimal viewing. We prototyped our application with two transplant surgeons in order to develop an application that is easy to use and has the features needed for pre-operative planning. Our models use MRI data that have been segmented using Horos and then made into 3D models. We are currently conducting a study with transplant surgeons and about 60 kidney cases. With this augmented reality application, we demonstrate a new viewing modality for the field of radiology that could greatly impact current healthcare practices. (Demo included)

Bio: Elizabeth Nguyen is a current undeclared sophomore studying computer science and spatial design. She is currently the President of Stanford XR and is interested in exploring the intersections of mixed reality and medicine.


Title: HDR Imaging & High Speed Imaging for a Programmable Vision Sensor with In-pixel Processing Capabilities

Authors: Julien Martel, Stephen Carey, Piotr Dudek, Gordon Wetzstein

Abstract: HDR imaging and high-speed imaging are challenging problems for conventional vision sensors. Here we use an emerging vision sensor technology that integrates sensing and processing directly inside each pixel. Our sensor collocates a simple processor and a few memories next to each pixel's photo-sensitive element. We show how new algorithms can be tailored to these devices, leveraging their fine-grained massive parallelism, and demonstrate how they can be used to alleviate typical shortcomings of conventional (APS) vision sensors, in particular for HDR and high-speed imaging.

Bio: I obtained my Ph.D. at ETH Zurich in 2019 under the supervision of Matthew Cook. I am a postdoctoral research fellow in the computational imaging lab at Stanford under the supervision of Gordon Wetzstein.


Title: Scene Representation Networks: Continuous 3D-Structure-Aware Neural Scene Representations

Authors: Vincent Sitzmann, Michael Zollhöfer, Gordon Wetzstein

Abstract: Unsupervised learning with generative models has the potential of discovering rich representations of 3D scenes. While geometric deep learning has explored 3D-structure-aware representations of scene geometry, these models typically require explicit 3D supervision. Emerging neural scene representations can be trained only with posed 2D images, but existing methods ignore the three-dimensional structure of scenes. We propose Scene Representation Networks (SRNs), a continuous, 3D-structure-aware scene representation that encodes both geometry and appearance. SRNs represent scenes as continuous functions that map world coordinates to a feature representation of local scene properties. By formulating the image formation as a differentiable ray-marching algorithm, SRNs can be trained end-to-end from only 2D images and their camera poses, without access to depth or shape. This formulation naturally generalizes across scenes, learning powerful geometry and appearance priors in the process. We demonstrate the potential of SRNs by evaluating them for novel view synthesis, few-shot reconstruction, joint shape and appearance interpolation, and unsupervised discovery of a non-rigid face model.
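
A heavily simplified PyTorch sketch of the two core ingredients (a coordinate-to-feature MLP and a differentiable ray-marching loop) is given below. The layer sizes, the step-length predictor, and the fixed number of marching steps are assumptions for illustration, not the published architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Coordinate-to-feature MLP: the scene is a continuous function of world position.
phi = nn.Sequential(nn.Linear(3, 128), nn.ReLU(),
                    nn.Linear(128, 128), nn.ReLU(),
                    nn.Linear(128, 64))
step_net = nn.Linear(64, 1)   # predicts the next marching step length from the local feature
rgb_net = nn.Linear(64, 3)    # maps the feature at the final point to a pixel color

# One ray per pixel: march each ray forward by a learned, feature-dependent amount.
origins = torch.zeros(1024, 3)
dirs = F.normalize(torch.randn(1024, 3), dim=-1)
depth = torch.full((1024, 1), 0.5)

for _ in range(10):           # fixed number of differentiable ray-marching steps
    feats = phi(origins + depth * dirs)
    depth = depth + torch.sigmoid(step_net(feats))   # strictly positive forward steps

rgb = torch.sigmoid(rgb_net(phi(origins + depth * dirs)))   # rendered colors, supervisable from 2D images
```

Because every operation in this loop is differentiable, a rendering loss against posed 2D images propagates back into phi, which is how such a representation can be trained without depth or shape supervision.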

Bio: Vincent is a fifth-year Ph.D. student in the Stanford Computational Imaging Laboratory, advised by Prof. Gordon Wetzstein. His research interest lies in 3D-structure-aware neural scene representations - a novel way for AI to represent information about our 3D world. His goal is to allow AI to perform intelligent 3D reasoning, such as inferring a complete model of a scene with information on geometry, material, lighting, etc. from only a few observations, a task that is simple for humans, but currently impossible for AI. He has previously worked on differentiable camera pipelines, VR, and human perception.