Psych 267 Final Project Proposal
Mike Lin and Mike Harville
We plan to implement an interactive image segmentation and compositing
tool based on the paper "Intelligent scissors for image compositing",
by E. N. Mortensen and W. A. Barrett (Proceedings SIGGRAPH,
1995, pp. 191-198). The software described in the original paper
includes the following components:
- calculation of cost functions for linking neighboring pixels into
boundaries, using a weighted sum of Laplacian zero crossing, gradient
magnitude, and gradient direction information
- a two-dimensional dynamic programming method for finding the optimal
(lowest cost) path from a "seed" pixel in the image (presumably near
an image edge) to other pixels in the image
- drawing the optimal path to the current "free" point (the position
of the mouse) interactively, so the user can see it while moving the mouse
- a "boundary cooling" algorithm for automatically freezing stable
portions of the boundary (that is, if movement of the mouse over some short
period of time leaves some portion of the path back to the seed pixel
unchanged, then freeze that section of the boundary and generate a new
seed pixel at the end of it)
- an interactive, dynamic training algorithm for causing the
boundary to preferentially adhere to edges similar to those already in
the "frozen" portion of the boundary
- an edge filtering ("live wire masking") mechanism for minimizing
jaggies and background pixel contamination along the computed boundary
- a "spatial frequency and contrast matching" algorithm to help
make compositions of a scissored image with another image appear more
seamless
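
To make the first two components concrete, here is a minimal sketch in
Python (standard library only). It is our own simplification, not the
paper's implementation: the cost uses only the gradient magnitude term
(the Laplacian zero crossing and gradient direction terms are left out),
and the optimal paths are found with a plain Dijkstra search over
8-connected pixels. The names gradient_cost, shortest_paths, and trace
are hypothetical, and the image is assumed to be a list of rows of
grayscale values.

```python
import heapq
import math

def gradient_cost(img):
    """Per-pixel cost that is low where the gradient magnitude is high."""
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Central differences, clamped at the image border.
            gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
            gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
            mag[y][x] = math.hypot(gx, gy)
    peak = max(max(row) for row in mag) or 1.0  # avoid dividing by zero
    return [[1.0 - m / peak for m in row] for row in mag]

def shortest_paths(cost, seed):
    """Dijkstra search from the seed; returns a back-pointer per pixel."""
    h, w = len(cost), len(cost[0])
    dist = {seed: 0.0}
    parent = {seed: None}
    heap = [(0.0, seed)]
    done = set()
    while heap:
        d, (y, x) = heapq.heappop(heap)
        if (y, x) in done:
            continue
        done.add((y, x))
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (dy or dx) and 0 <= ny < h and 0 <= nx < w:
                    # Diagonal links are scaled by their longer length.
                    nd = d + cost[ny][nx] * math.hypot(dy, dx)
                    if nd < dist.get((ny, nx), math.inf):
                        dist[(ny, nx)] = nd
                        parent[(ny, nx)] = (y, x)
                        heapq.heappush(heap, (nd, (ny, nx)))
    return parent

def trace(parent, free_point):
    """Follow back-pointers from the free point to the seed."""
    path = []
    node = free_point
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]  # ordered from seed to free point
```

In an interactive tool, shortest_paths would run once per seed pixel and
trace would run on every mouse move, since following back-pointers is
cheap compared with the search itself.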
Although the authors argue that their tool is better than most other
segmentation algorithms, we think there is probably room for
improvement. Therefore, in addition to implementing the above
components, we will consider, and possibly implement, the following
enhancements:
- The authors appear to use only one resolution for
computing image Laplacian and gradient information. We would like
to try incorporating a multiresolution scheme into our
algorithm. If this succeeds, we would probably also try to
incorporate multiresolution information into the dynamic interactive
training algorithm they describe.
- We would like to substitute the "multiresolution image splining"
technique implemented in Homework #2 for their "spatial frequency and
contrast matching" algorithm during compositing.
- We might try to improve on the dynamic training algorithm itself,
depending on how effective we find their relatively crude
implementation to be.
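
As a rough illustration of the multiresolution idea, the sketch below
builds a coarse-to-fine image pyramid in Python. It is only one possible
starting point: it averages 2x2 blocks (a box filter) instead of using
a proper Gaussian kernel, and the names downsample and build_pyramid are
hypothetical. Gradient and Laplacian information could then be computed
at each level.

```python
def downsample(img):
    """Halve each dimension by averaging non-overlapping 2x2 blocks."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [
        [
            (img[2 * y][2 * x] + img[2 * y][2 * x + 1]
             + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) / 4.0
            for x in range(w)
        ]
        for y in range(h)
    ]

def build_pyramid(img, levels):
    """Return [full resolution, half, quarter, ...], at most `levels` long."""
    pyramid = [img]
    for _ in range(levels - 1):
        top = pyramid[-1]
        # Stop early once the coarsest level can no longer be halved.
        if len(top) < 2 or len(top[0]) < 2:
            break
        pyramid.append(downsample(top))
    return pyramid
```

For example, a 4x4 image with three requested levels yields images of
size 4x4, 2x2, and 1x1.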
We would like to implement the project in Matlab, but two issues may
force us to do some portions of the project in C++: 1) Matlab may
perform the computations too slowly for the tool to remain
interactive, and 2) Matlab may not allow us to
program the type of user interface we need (i.e., an event-driven
program with callbacks). Even if we have to write some of the project in
C++, we would still like to be able to tie it to Matlab so that we
would have access to the extensive image processing tools it makes
available to us. If we have access to these functions (both in the
Matlab core and in the various Stanford and Psych 267 toolkits), we
will more easily be able to experiment with new, possibly complex
enhancements.
One reason we chose this project is that it has many not-too-complex,
somewhat independent pieces, so that we should be able to
incrementally build it and make sure that the project is proceeding
smoothly. We can also add as many extra features as our available
time allows. "Extra features" include the above enhancements, as
well as tools for allowing the user to rotate, resize, or otherwise
manipulate a segmented image piece before compositing it with another
image.