A model was developed that recovers valuable data lost from images and video that have been “collapsed” into lower dimensions. The model could be used to recreate video from motion-blurred images or from new types of cameras that capture a person’s movement around corners but only as vague, onedimensional lines. The approach could be used to convert 2D medical images into more informative — but more expensive — 3D body scans, which could benefit medical imaging in poorer nations.
Captured visual data often collapses data of multiple dimensions of time and space into one or two dimensions, called projections. X-rays, for example, collapse three-dimensional data about anatomical structures into a flat image. Corner cameras detect moving people around corners. These could be useful for firefighters finding people in burning buildings. But the cameras currently only produce projections that resemble blurry, squiggly lines, corresponding to a person’s trajectory and speed.
The “visual deprojection” model uses a neural network to “learn” patterns that match low-dimensional projections to their original high-dimensional images and videos. Given new projections, the model uses what it’s learned to recreate all the original data from a projection. In experiments, the model synthesized accurate video frames showing people walking by extracting information from single, one-dimensional lines similar to those produced by corner cameras. The model also recovered video frames from single, motion-blurred projections of digits moving around a screen.
Digital cameras capturing long-exposure shots, for instance, will basically aggregate photons over a period of time on each pixel. In capturing an object’s movement over time, the camera will take the average value of the movement-capturing pixels. Then, it applies those average values to corresponding heights and widths of a still image, which creates the signature blurry streaks of the object’s trajectory. By calculating some variations in pixel intensity, the movement can theoretically be recreated.
The model, based on a convolutional neural network (CNN) — a machine-learning model used for image-processing tasks — captures clues about any lost dimension in averaged pixels. When shown previously unseen projections, the model notes the pixel patterns and follows the blueprints to all possible signals that could have produced that projection. Then, it synthesizes new images that combine all data from the projection and all data from the signal. This recreates the high-dimensional signal.