Motion Information Retrieval

Okay… we can visualize our data as (a series of) figure plots. We can also transform it into sound and make ourselves ‘ring’. But to be honest: that is not what we usually intend to do when we capture motion or activity data with a set of sensors. What we want is knowledge and information on the captured motion for subsequent analysis. Here, we generally hope to acquire information that is not known to us and that we cannot discover on our own, or information that we would not be able to quantify by ourselves. This data mining process can be considerably facilitated and automated if we base it on additional, artificial motion knowledge learned by motion information retrieval methods. Ideally, such artificial motion knowledge is similar to the complex biological motor perception of the human brain.

Basic Machine Learning Principles

The human brain acquires the ability and knowledge to evaluate motion performances over years of practice and experience. So how can we learn similar knowledge from just a (probably pretty small) collection of inertial sensor data? The first task is to make sense of the data stream. For this, it is beneficial to analyze the biomechanics and technical characteristics of a motion. Sometimes it can also help to investigate the grading and scoring conventions of a sport to learn more about its most important aspects. Once you know the important (and discriminative) data characteristics, the main steps of a motion information retrieval are: (a) the transformation of the augmented motion data into meaningful feature representations, (b) the learning of artificial motion knowledge from these feature transformations and (c) the retrieval of significant aspects from an arbitrary data stream on the basis of the learned motion knowledge.

Ground Truth Measure

To implement any information retrieval method, it is necessary to define a ground truth measure that describes all relevant motion information in the collected data. This ground truth measure is used to ‘teach’ your analysis system how certain relevant motion information is represented within the data streams. Generally, two sources of ground truth specifications are available: input knowledge from biomechanical measures and input knowledge from official judging conventions. The former usually comprises information on ideal motion parameters and beneficial or advantageous motion execution. The latter, on the other hand, usually comprises information on performance errors and disadvantageous motion execution. In both cases, the respective data descriptors then constitute the reference for the detection of similar properties in unknown data streams.

Kinematic Feature Transformations

One of the most important premises for the learning of artificial motion knowledge is to have your sensor data available in a significant and meaningful data representation. The estimation of accurate body kinematics was a first step towards such data significance and should now be extended by feature transformations. Research on learning scenarios for activity recognition and motion pattern analysis has introduced various feature extractors to emphasize key signal properties. These mainly belong to one of the following groups: statistical raw-signal based features, event-based features, multilevel features derived from clustered statistical occurrences and kinematic body motion information. To equalize every feature’s influence on the following motion information retrieval process and to exclude the influence of varying anthropometrics between sensor-equipped athletes, all features should furthermore be scaled by either normalization or rescaling.
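As a rough illustration of the first group, statistical raw-signal based features, the following sketch computes a few common descriptive measures for one window of a one-dimensional sensor signal. The function name `statistical_features`, the choice of measures and the sample window are assumptions made for this example, not a fixed recipe.

```python
import math

def statistical_features(signal):
    """Return descriptive statistics for one window of a 1-D sensor signal."""
    n = len(signal)
    mean = sum(signal) / n
    # Population variance and standard deviation of the window.
    var = sum((x - mean) ** 2 for x in signal) / n
    std = math.sqrt(var)
    # Standardized third and fourth moments; set to 0.0 for a constant signal.
    skew = sum((x - mean) ** 3 for x in signal) / (n * std ** 3) if std > 0 else 0.0
    kurt = sum((x - mean) ** 4 for x in signal) / (n * std ** 4) if std > 0 else 0.0
    # Mean crossing rate: share of consecutive sample pairs that lie on
    # opposite sides of the window mean.
    crossings = sum(
        1 for a, b in zip(signal, signal[1:]) if (a - mean) * (b - mean) < 0
    )
    mcr = crossings / (n - 1) if n > 1 else 0.0
    return {
        "mean": mean,
        "std": std,
        "skewness": skew,
        "kurtosis": kurt,
        "mean_crossing_rate": mcr,
    }

# Hypothetical window of sensor samples, for illustration only.
window = [2.0, 0.0, 2.0, 0.0]
features = statistical_features(window)
```

In practice such a function would probably be applied per sensor axis and per time window, and the resulting values scaled before any learning step.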

Machine Learning Methods

In computer science, there are many methodologies available to learn artificial knowledge from transformed feature data. Common methods are Support Vector Machines (SVMs), Hidden Markov Models (HMMs), classifiers like decision trees or k-nearest neighbor computations, clustering algorithms and probabilistic models. For time-series motion data, further strategies are possible that evaluate similarities between sequences, like Dynamic Time Warping (DTW). In the current context I am not going to talk about any of them in further detail. But if you are really interested, the spheres of the WWW provide sufficient information on their fundamental algorithmic principles…
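To give at least a feeling for sequence similarity, here is a minimal sketch of the textbook DTW distance between two one-dimensional sequences, using the absolute difference as local cost. The function name `dtw_distance` and the toy sequences are assumptions for this example; real implementations usually add windowing constraints and handle multi-dimensional samples.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]

# The same motion shape executed at two different speeds (toy values).
slow = [0.0, 0.0, 1.0, 1.0, 2.0, 2.0, 1.0, 1.0, 0.0, 0.0]
fast = [0.0, 1.0, 2.0, 1.0, 0.0]
# A different shape of the same length as `fast`.
other = [2.0, 1.0, 0.0, 1.0, 2.0]

same_shape = dtw_distance(slow, fast)
different_shape = dtw_distance(fast, other)
```

The point of the warping is that `slow` and `fast` are treated as the same motion despite their different durations, while `other` stays clearly apart.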

[Figure: information retrieval]

Sample Motion Information Retrieval Setting

Wow, that was a lot of abstract information on this page so far! To make it more concrete, let me now introduce you to my work of the last two years…
Large parts of my PhD were inspired by a real-life problem that I then fit into the structure of a general motion information retrieval system. And this might be very similar to what you plan or want to do yourself, too (even if it might not end in a PhD). Concretely, it all started approximately 20 years ago, when a young figure-skating me was disappointed by some competition judging results that I perceived as unfair. And despite the progress in technical knowledge and motion sensing devices, no remedy to this problem or similar controversies has been presented so far. I therefore decided to address this issue and to design and implement a system that automatically discovers, classifies and evaluates style errors within a motion performance: a motion style assessment system.

[Figure: motion performance evaluation]

I chose ski jumping as the fundamental sport for the system engineering. Ski jumping is a very technical motion that depends significantly on biomechanical and physical laws. Even small changes in body pose or deviations from the ideal motion execution can considerably influence the outcome of the performance. Flight posture, for example, immediately influences aerodynamic forces such as drag and lift and hence the length of flight and the final performance. However, it is also a sport whose overall competition results depend on the style evaluation of judges. Consequently, the environment for a system implementation was well defined and relatively simple to translate into machine specifications.

Ground Truth Measure

Since the application purpose was to evaluate motion style, I chose to use judging conventions as the system’s ground truth measure. An evaluation training sheet for judges of the Japanese Skiing Association served as the fundamental source of knowledge. An experienced ski jump judge was then instructed to annotate the style of every measured ski jump under the given style error definitions. These judge scores were collected on paper in real time during the data capture sessions and under real judging conditions from the judge’s tower, to obtain a ground truth measure as close to actual judging as possible. After data acquisition, all score sheets were digitized. In conformance with the official guidelines, I lastly derived ten basic style error categories from the collected judge annotations for the subsequent data mining tasks.

Feature Transformations

I chose to use n statistical signal features F_{D} built from descriptive standard measures (for example the mean, standard deviation, skewness, kurtosis, spectral composition, mean crossing rate) and m body model features F_{C} built from the augmented body kinematics (for example joint angles, distances between joints, orientations of body segments). All F_{D} features were normalized as F_{Dn}^\prime = \frac{F_{Dn}}{\left|F_{Dn}\right|} and all F_{C} features rescaled as F_{Cm}^\prime = \frac{F_{Cm}-\min(F_{Cm})}{\max(F_{Cm})-\min(F_{Cm})} before use in the motion information retrieval step.
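The two scaling rules can be sketched in a few lines. This is only an illustration under one assumption: |F_{Dn}| is read here as the Euclidean norm of the feature vector, which the formula itself does not spell out. The function names `normalize` and `rescale` and the sample values are hypothetical.

```python
import math

def normalize(values):
    """Divide a feature vector by its Euclidean norm (assumed reading of |F|)."""
    norm = math.sqrt(sum(v * v for v in values))
    if norm == 0.0:
        # An all-zero feature carries no information; keep it at zero.
        return [0.0 for _ in values]
    return [v / norm for v in values]

def rescale(values):
    """Min-max rescale a feature vector to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature has no spread; map it to zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical feature values, for illustration only.
f_d = [3.0, 4.0]
f_c = [10.0, 20.0, 40.0]

f_d_scaled = normalize(f_d)
f_c_scaled = rescale(f_c)
```

After normalization the feature vector has unit length, and after rescaling the smallest value maps to 0 and the largest to 1, so features with very different physical units end up on comparable scales.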

Machine Learning

The principal idea of the intended error recognition was to determine commonalities between the feature transformations of two or more motion performances. For each error category, every feature transformation was labeled as either error jump (EJ) or non-error jump (NJ) according to the judge’s ground truth annotations.

[Figure: ski jump evaluation]

The general problem definition could easily be represented as a binary classification problem. The respective knowledge of the different motion errors was then learned from the F_{D} and F_{C} features with DTW and an SVM. To verify the applicability of the system design to the present data, the learned artificial error knowledge was next validated for every error category. For this, all data captures were randomly assigned to a training and a testing data set, and all jumps in the testing data were classified as either EJ or NJ. The classifications were then compared to the ground truth annotations to determine the classification accuracy.
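The validation procedure for a single error category can be sketched as follows. Note that this is not the system described above: as a simpler stand-in for the combination of DTW and an SVM, the sketch uses a 1-nearest-neighbor classifier under DTW distance. All function names, the split fraction, the random seed and the toy sequences are assumptions made for illustration; the sequences are synthetic and not real jump data.

```python
import random

def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D sequences."""
    inf = float("inf")
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[len(a)][len(b)]

def split_train_test(samples, test_fraction=0.5, seed=0):
    """Randomly assign labeled samples to a training and a testing set."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def classify(sequence, training):
    """Return the label of the training sequence closest under DTW."""
    nearest = min(training, key=lambda item: dtw_distance(sequence, item[0]))
    return nearest[1]

def accuracy(testing, training):
    """Share of testing samples whose predicted label matches the annotation."""
    correct = sum(1 for seq, label in testing if classify(seq, training) == label)
    return correct / len(testing)

# Synthetic sequences for one hypothetical error category:
# "NJ" = non-error jump, "EJ" = error jump.
samples = [
    ([0, 1, 2, 1, 0], "NJ"),
    ([0, 0, 1, 2, 1, 0], "NJ"),
    ([0, 1, 2, 2, 1, 0], "NJ"),
    ([0, 1, 1, 2, 1, 0, 0], "NJ"),
    ([0, 3, 0, 3, 0], "EJ"),
    ([0, 0, 3, 0, 3, 0], "EJ"),
    ([0, 3, 0, 0, 3, 0], "EJ"),
    ([0, 3, 3, 0, 3, 0, 0], "EJ"),
]

training, testing = split_train_test(samples, test_fraction=0.5, seed=0)
acc = accuracy(testing, training)
```

With real data, this loop would be repeated once per error category, and a purely random split can leave one class underrepresented in the training set, so a stratified split or repeated runs may give a more reliable accuracy estimate.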