Tracking capture in video motion.
Baritz, Mihaela ; Cristea, Luciana
1. INTRODUCTION
Motion capture of the human body can be an effective method of
creating realistic human motion for animation or for various technical
or medical studies. Unfortunately, the quality required for animation
places challenging demands on a capture and visualization system.
Capture solutions that meet these demands have required specialized
hardware that is invasive and expensive. Computer vision could make
animation data much easier to obtain than other methods and, in
addition, can supply information for use in other fields. This paper
presents a methodology for analyzing 2D motion from image observations;
we use it as a tool for understanding the problem and for surveying the
evolution of the human body at different moments of the walking
process.
Synthetic experiments confirm that these situations would arise in
practice. The experiments show how even simple visual tracking
information can be used to characterize the gait of different people or
to create a standard human motion, using video tracking alone.
(Bregler C., Malik, J. 2002)
Motion capture is an integrated method for creating movement for
computer animation or for various technical or medical studies, but it
also has its share of weaknesses.
New and improved methods of editing and processing motion capture
data have made great strides in making motion capture a more viable
tool for animation production and for understanding biological
processes and technical and medical structures.
Another problem of motion capture has been the practical challenge
of acquiring data. While research has made progress on using the data,
capture techniques have evolved slowly and still suffer from delays in
acquisition and visualization. Special tracking technologies, based
either on mechanical or magnetic sensors or on specially designed
cameras viewing light markers, are required to create the observations
that are processed into motion data. While these systems have improved
in their dimensions, reliability, precision, and range, they are still
generally expensive and sometimes hard to use. Motion capture must
therefore be performed in dedicated places that provide specific
environments for the recordings. Ideally, any "performer" could be
captured in any desired setting. An integrated setup that uses
standard, high-speed, or thermal video cameras under special recording
conditions could meet these goals.
The use of a single camera is a particular case, suitable only for
a static position and small or medium movements.
[FIGURE 1 OMITTED]
A single camera offers the lowest cost, a simplified setup, the
potential use of natural or artificial light sources, and a space that
needs no modification. Creating motion capture data from a single video
stream therefore seems plausible, but only for small and medium
movements (short distances).
2. THE MOTION CAPTURE PROBLEM
The goal of motion capture is to record the movement of a
"performer" (typically, but not always, human) in a compact,
usable, repeatable manner from which the movements can be rebuilt.
For this paper, we are concerned with the gross motion of the body
over a short distance (3 m), because the specific capture of facial or
hand motion poses a different set of problems, such as skin color, 3D
images and vibration stability. In computer graphics and computer
vision studies the human body is usually divided into a small number of
rigid segments that rotate relative to one another. This is only an
approximation, because human knees, elbows and ankles do not have a
single pivot point. The true motions of more complex joints are
likewise sometimes represented by kinematic approximations. The motion
capture problem we consider therefore has the following form: given a
single stream of video observations of a person with a normal or
impaired gait, track each point and compute a 3D skeletal
representation of the motion of sufficient quality to be useful for
animation. For medical studies of human body movement, such as in
Parkinson's disease, however, it is sufficient to treat the body parts
as rigid. (Gleicher M., Ferrier, N 2003)
3. EXPERIMENTAL SETUP
Virtual humans are articulated figures modeled with multiple
layers: a virtual skin is usually attached to an underlying skeleton,
which animates the whole body. The skeleton is a hierarchically
organized set of joints, and this set depends on the animation
requirements and the field of application. Morphologically correct
skeletons would be preferable, but creating them can turn out to be
quite costly. Real humans have so many degrees of freedom that virtual
models frequently have to omit some of them or to minimize their
importance in the movement.
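A hierarchically organized set of joints can be sketched as a small
tree in which each joint stores its rotation relative to its parent.
The sketch below is illustrative only and is not the model used in this
work: the class name Joint, the function world_positions, the planar
(2D) simplification and the segment lengths are all assumptions.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Joint:
    """One node of a hypothetical planar skeleton hierarchy."""
    name: str
    length: float              # distance from the parent joint (assumed metres)
    angle: float = 0.0         # rotation relative to the parent segment (radians)
    parent: Optional["Joint"] = None
    children: List["Joint"] = field(default_factory=list)

    def add_child(self, child: "Joint") -> "Joint":
        """Attach a child joint and return it so chains are easy to build."""
        child.parent = self
        self.children.append(child)
        return child


def world_positions(root: Joint,
                    origin: Tuple[float, float] = (0.0, 0.0)
                    ) -> Dict[str, Tuple[float, float]]:
    """Walk the hierarchy and return the 2D position of every joint."""
    positions: Dict[str, Tuple[float, float]] = {}

    def visit(joint: Joint, parent_pos, parent_angle: float) -> None:
        # Relative rotations accumulate down the chain.
        total_angle = parent_angle + joint.angle
        x = parent_pos[0] + joint.length * math.cos(total_angle)
        y = parent_pos[1] + joint.length * math.sin(total_angle)
        positions[joint.name] = (x, y)
        for child in joint.children:
            visit(child, (x, y), total_angle)

    visit(root, origin, 0.0)
    return positions


if __name__ == "__main__":
    # Assumed leg chain: the hip is the root, the thigh points straight down.
    hip = Joint("hip", 0.0)
    knee = hip.add_child(Joint("knee", 0.45, -math.pi / 2))
    ankle = knee.add_child(Joint("ankle", 0.42))
    print(world_positions(hip))
```

Omitting a degree of freedom then amounts to leaving a joint out of the
tree or keeping its angle fixed.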
[FIGURE 2 OMITTED]
The capture problem is sometimes difficult: the articulated model
does not accurately reflect the real person, articulations lead to
self-occlusions, even articulated models contain many degrees of
freedom, the skeleton is internal and therefore cannot be observed
directly, and clothing strongly affects the observation and recording
of the movement images. Our information sources are two-dimensional,
and some occlusion is possible during recording (such as hands passing
in front of the hip joints). In addition, the optical medium for
recordings provides a finite resolution (spatially and temporally), and
the parameters of real cameras are difficult to obtain precisely
because of environmental factors (light, vibration, humidity). Given
these limitations, practical approaches to motion capture for tracking
the human body generally rely on methods that work around them. For
example, if we observe a point on the person in an image created by a
camera, we cannot determine the position of the point; we can only
constrain its location to lie along a ray. We assume an idealized
pinhole camera model, in which the ray is defined by the camera's focal
point and the point on the image plane. In practice, a video camera has
a finite resolution, so observations are only localized to a region of
the image plane, and the human body movement must stay within this
region. (Monzani, J.S., 2002)
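The ray constraint of the idealized pinhole model can be illustrated
with a short sketch. It assumes a camera at the origin looking along
the z axis, a focal length expressed in pixels and a principal point
(cx, cy); the function names and all numeric values are hypothetical
and do not come from the experimental setup.

```python
import math


def pixel_to_ray(u, v, focal_length, cx, cy):
    """Unit direction of the ray from the focal point through pixel (u, v)."""
    x = (u - cx) / focal_length
    y = (v - cy) / focal_length
    norm = math.sqrt(x * x + y * y + 1.0)
    return (x / norm, y / norm, 1.0 / norm)


def point_on_ray(direction, depth):
    """3D point on the ray whose z coordinate equals the given depth."""
    scale = depth / direction[2]
    return tuple(scale * c for c in direction)


def project(point, focal_length, cx, cy):
    """Pinhole projection of a 3D point (camera coordinates) to a pixel."""
    x, y, z = point
    return (focal_length * x / z + cx, focal_length * y / z + cy)


if __name__ == "__main__":
    # Assumed intrinsics: focal length 800 px, principal point (320, 240).
    ray = pixel_to_ray(400.0, 300.0, 800.0, 320.0, 240.0)
    for depth in (1.0, 2.0, 5.0):
        candidate = point_on_ray(ray, depth)
        # Every candidate depth reprojects to the same observed pixel,
        # so one image alone cannot fix the position along the ray.
        print(depth, candidate, project(candidate, 800.0, 320.0, 240.0))
```

The sketch ignores lens distortion and finite pixel size; with finite
resolution the ray widens into a narrow cone of possible positions.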
Additional information is needed to determine the position of a
point in space, such as a coordinate system and calibration. A variety
of sources can be utilized in various computer vision techniques, and a
few can be applied to motion reconstruction. Most methods assume strong
models to place further restrictions on possible poses. There is a wide
range of visual tracking techniques in practice, ranging from edge
feature based to region based tracking, and from brute-force search
methods to differential approaches. Edge feature based tracking
techniques usually require clean data with high contrast object
boundaries. On human bodies such features are very difficult to
obtain: clothes have many folds, ambient light is sometimes inconstant,
and the image background has different colors in different directions.
Also, if the left and right legs have the same color and overlap,
they are separated only by low contrast boundaries.
Region based techniques can track objects with arbitrary texture
and attempt to match areas between consecutive frames.
[FIGURE 3 OMITTED]
[FIGURE 4 OMITTED]
This matching requires the estimation of the 6 free parameters that
describe the deformation of a region between frames (x/y translation,
x/y scaling, rotation, and shear).
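The six parameters listed above are those of a 2D affine
transformation. The sketch below composes them into a 2x3 matrix and
applies it to image points. The order of composition (scale, then
shear, then rotation, then translation) is one convention among
several and is an assumption, as are the function names; the sketch
shows only the parameterization, not how the parameters are estimated
from image data.

```python
import math


def affine_matrix(tx, ty, sx, sy, rotation, shear):
    """Build a 2x3 affine matrix from 6 parameters.

    Assumed convention: A = R(rotation) * Shear(shear) * Scale(sx, sy),
    followed by the translation (tx, ty).
    """
    c, s = math.cos(rotation), math.sin(rotation)
    a11 = c * sx
    a12 = c * shear * sy - s * sy
    a21 = s * sx
    a22 = s * shear * sy + c * sy
    return ((a11, a12, tx),
            (a21, a22, ty))


def apply_affine(matrix, point):
    """Map one (x, y) point through the 2x3 affine matrix."""
    (a11, a12, tx), (a21, a22, ty) = matrix
    x, y = point
    return (a11 * x + a12 * y + tx, a21 * x + a22 * y + ty)


if __name__ == "__main__":
    # Hypothetical region corners (pixels) and hypothetical parameters.
    corners = [(0.0, 0.0), (10.0, 0.0), (10.0, 20.0), (0.0, 20.0)]
    m = affine_matrix(tx=2.0, ty=3.0, sx=1.1, sy=0.9,
                      rotation=math.radians(5.0), shear=0.05)
    print([apply_affine(m, p) for p in corners])
```

A region based tracker would search for the parameter values that
minimize the intensity difference between the warped region in one
frame and the image in the next.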
4. RESULTS AND CONCLUSIONS
In our research we used two video cameras, one high-speed and the
other normal-speed, to record human body movements between an initial
point and an end point 3 meters apart. Initially the subject, wearing
black clothing, walks at a normal speed over this distance (the ground
forces developed during gait are also measured with a force plate), and
we record images of all the points attached to the leg joints. Using
the trajectories of the leg joints, we can transfer the motion to a
virtual human model to analyze the movements and the joint positions at
different moments of the movement, and also to establish a database for
each person. (www.adobe.com)
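As one example of how tracked joint positions could be analyzed, the
sketch below computes the angle at the knee from the 2D positions of
the hip, knee and ankle points in each frame. The function name and
the coordinates are hypothetical and are not measurements from the
recordings described here.

```python
import math


def joint_angle(a, b, c):
    """Angle at point b, in degrees, between segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    # atan2 on (|cross|, dot) stays numerically stable near 0 and 180 degrees.
    return math.degrees(math.atan2(abs(cross), dot))


if __name__ == "__main__":
    # Hypothetical (hip, knee, ankle) pixel coordinates for three frames.
    frames = [
        ((100.0, 50.0), (100.0, 150.0), (100.0, 250.0)),
        ((110.0, 50.0), (120.0, 150.0), (105.0, 245.0)),
        ((120.0, 50.0), (140.0, 145.0), (115.0, 240.0)),
    ]
    for hip, knee, ankle in frames:
        print(round(joint_angle(hip, knee, ankle), 1))
```

Because the points are 2D image coordinates, the result is the angle
projected onto the image plane, which matches the true joint angle only
when the leg moves parallel to that plane.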
These recordings can also be used to simulate a movement and to
establish its limits, or to calculate the forces developed in muscles
or joints during the gait process.
[FIGURE 5 OMITTED]
5. ACKNOWLEDGMENT
This research is part of Grant A1088 with CNCSIS Romania, and the
investigations were carried out with apparatus from the Research
Platform "SAVAT", University Transylvania Brasov.
6. REFERENCES
Bregler C., Malik, J. 2002, Video Motion Capture, UCB/CSD 97-973;
Gleicher M., Ferrier, N 2003, Evaluating Video-Based Motion
Capture;
Monzani, J.S., 2002, An architecture for behavioral animation of
virtual humans, Ecole Polytechnique Federale de Lausanne, Suisse;
www.adobe.com, accessed 2008-06-15.