John L. Barron, Allan D. Jepson, John K. Tsotsos
We address the problem of interpreting image velocity fields generated by a moving monocular observer viewing a stationary environment under perspective projection, in order to obtain 3-D information about the relative motion of the observer (egomotion) and the relative depth of environmental surface points (environmental layout). The algorithm presented in this paper computes motion and structure from a spatio-temporal distribution of image velocities that are hypothesized to belong to the same 3-D planar surface. However, the main result of this paper is not just another motion and structure algorithm with some novel features, but rather an extensive error analysis of the algorithm’s performance for various types of noise in the image velocities. Waxman and Ullman have devised an algorithm for computing motion and structure using image velocity and its 1st and 2nd order spatial derivatives at one image point. We generalize this result to include derivative information in time as well. Further, we show the approximate equivalence of reconstruction algorithms that use only image velocities and those that use one image velocity and its 1st and/or 2nd order spatio-temporal derivatives at one image point. The main question addressed in this paper is: "How accurate do the input image velocities have to be?" or, equivalently, "How accurate do the input image velocity and its 1st and 2nd order derivatives have to be?" The answer to this question involves worst-case error analysis. We end the paper by drawing some conclusions about the feasibility of motion and structure calculations in general.