Figure 11: Mini Golf demo. 
6.1  Performance 
The interface was designed to be sufficiently accurate to provide intuitive control and to create a convincing visual integration between real and rendered objects. The accuracy of wand motion and rotation was tested under favourable lighting conditions. Measurements were made at a range of distances from the camera, with each observation repeated multiple times and compared to ground-truth data. The results in Table 1 show, predictably, that the uncertainty of the position estimates increases with distance from the camera. Linear movement estimates are reasonably accurate, compensating well for the effect of perspective on motion both parallel and perpendicular to the camera. However, x-axis rotation estimates show a substantial degree of inaccuracy at greater distances from the camera.
Table 1: Mean values and standard deviation for estimates of movement in 3D space.

                       Far (75 cm)     Mid (55 cm)     Near (40 cm)
10 cm X translation    10.03 ± 0.09    10.02 ± 0.06    10.03 ± 0.06
10 cm Z translation     9.98 ± 0.59     8.37 ± 0.20     9.91 ± 0.09
45° Z rotation         44.96 ± 0.32    44.94 ± 0.23    45.68 ± 0.20
45° X rotation         45.28 ± 4.61    39.76 ± 2.76    45.39 ± 1.06
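The figures in Table 1 are straightforward sample statistics over the repeated observations. As a minimal sketch of how such values are obtained, assuming the repeated estimates for one condition are collected in an array (the data values and variable names below are hypothetical, not the paper's measurements):

import numpy as np

# Hypothetical repeated estimates of a single 45° Z rotation, in degrees.
estimates = np.array([44.9, 45.2, 44.7, 45.1, 44.9])
ground_truth = 45.0

mean = estimates.mean()          # mean estimate, as reported in Table 1
std = estimates.std(ddof=1)      # sample standard deviation (the ± term)
bias = mean - ground_truth       # systematic offset from ground truth

print(f"{mean:.2f} ± {std:.2f} (bias {bias:+.2f})")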
In less favourable conditions, the presence of bright light sources, shadows on the tracked features, or noise due to low light can degrade performance. However, a sufficient degree of control was achieved in most of the realistic indoor situations in which the system was tested. Because the tracking system is based on colour, tracking problems are most noticeable when the colour of the tracked features is also present over substantial areas of the background or on the user.
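As an illustration of this failure mode, the sketch below segments a marker by hue in HSV space. The use of OpenCV, the input file name, and the specific threshold values are assumptions chosen for illustration, not the authors' implementation:

import cv2
import numpy as np

frame = cv2.imread("frame.png")                  # hypothetical input frame
hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)     # hue varies little with brightness

# Illustrative bounds for a saturated green marker.
lower = np.array([45, 80, 60])
upper = np.array([75, 255, 255])
mask = cv2.inRange(hsv, lower, upper)

# Any background or clothing in the same hue range passes the threshold
# too, producing the distractor regions described above; a morphological
# opening suppresses small speckle but not large same-coloured areas.
mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((3, 3), np.uint8))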
7  CONCLUSION 
This paper has described methods of tracking a known object in 3D from a single camera, using properties of recognized features. This tracking is used to create genuine 3D user interfaces for direct interaction with 3D environments that integrate real and simulated objects. These interfaces are suitable for use in a home environment with current computing hardware.
The current system is limited in the range of objects that can be tracked, requiring markers to provide identifiable features for tracking that incorporates rotation. However, the scope of the fundamental system is broad enough that it could be substantially extended in future development, further demonstrating the implementation and use of original forms of human-computer interaction.