Leg Motion Classifier

Numerical Methods, Spring 2021

For this project, I worked in a team of four to optimize leg motion classification in MATLAB using self-collected accelerometer data. We trained a deep convolutional neural network (CNN) and a Gauss Naïve Bayes model with peak location optimization, then compared how well each classified the leg motions of walking, running, and jumping. A more detailed writeup and the project's .m files can be found here.

The data for this project was collected with the Phyphox mobile application, which used the phone's accelerometer to measure lower body acceleration during walking, running, and jumping, without accounting for the effects of gravitational acceleration. The recorded time series include acceleration in the x-direction (horizontal phone screen axis), the y-direction (vertical phone screen axis), and the z-direction (axis from the front to the back of the screen), as well as absolute acceleration. The mobile device was mounted on the thigh with tight pockets, straps, or adhesives to capture as much leg activity per person as possible. Each trial of walking, running, or jumping followed the same procedure: set a timer for 60 seconds and perform the activity on flat terrain (either indoors or outdoors).

Machine Learning Methods

Two machine learning algorithms were trained on the same input data we collected. Because the sample size was very small, the reported accuracies only reflect how well the models classify leg movements for the six people we collected data from. With our trained models, the test error for an arbitrary sample of people would likely be much higher. To gauge the maximum achievable accuracy, we first analyzed a deep convolutional neural network. We then applied the Gauss-Naïve Bayes classifier and compared it with the CNN to identify scenarios in which one is more effective than the other.

Deep Convolutional Neural Network

The deep CNN method is relatively complex but yields highly accurate results when properly tuned, which makes it a good benchmark for the achievable accuracy. In our model, the CNN identified common features in the signals (our acceleration data over time) in order to predict the class: walking, running, or jumping. To develop the model, we adjusted the hyperparameters listed across the bottom of the network architecture until we found the most accurate configuration. When trained on raw data enhanced by convolution with filters such as Gaussian, Sobel, and Difference of Gaussians, the model reached an accuracy of about 91.46%.

Network architecture for deep convolutional neural net on our leg data

Gauss-Naïve Bayes

We also explored the Gauss Naïve Bayes classification method using extracted features and compared its accuracy with that of the CNN. The raw data collected from Phyphox was pre-processed into a readable format. We found that peaks in the signal were the most accurate and discriminating feature for classifying the three movements. Using peaks from the raw data, the classifier achieved 86.1% accuracy.

Raw data classified in Gauss Naïve Bayes model using peaks as feature
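As a rough illustration of how a Gaussian Naïve Bayes classifier works on peak features, here is a minimal Python sketch using only the standard library. It is not the project's MATLAB code: the function names `fit_gnb` and `predict_gnb`, the choice of two features (peak height and peak spacing), and all of the numbers are assumptions made up for this example.

```python
import math
from collections import defaultdict


def fit_gnb(features, labels):
    """Estimate a prior, per-feature mean, and per-feature variance for each class."""
    grouped = defaultdict(list)
    for row, label in zip(features, labels):
        grouped[label].append(row)
    model = {}
    for label, rows in grouped.items():
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # A small floor keeps every variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9
            for col, m in zip(zip(*rows), means)
        ]
        model[label] = (n / len(features), means, variances)
    return model


def predict_gnb(model, row):
    """Return the class with the highest log-posterior for one feature vector."""
    def log_posterior(params):
        prior, means, variances = params
        score = math.log(prior)
        # Naive assumption: features are independent Gaussians within a class.
        for x, m, var in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (x - m) ** 2 / (2 * var)
        return score
    return max(model, key=lambda label: log_posterior(model[label]))


# Made-up (peak height, peak spacing in seconds) pairs, for illustration only.
X = [[2.0, 0.55], [2.2, 0.50], [1.9, 0.60],
     [6.0, 0.30], [6.5, 0.28], [5.8, 0.33],
     [9.0, 0.80], [9.4, 0.85], [8.8, 0.75]]
y = ["walking"] * 3 + ["running"] * 3 + ["jumping"] * 3

model = fit_gnb(X, y)
print(predict_gnb(model, [6.1, 0.31]))
```

Training only requires a mean and a variance per class and feature, which is why this kind of model fits so much faster than a deep network.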

While an accuracy of 86.1% was sufficient, we wanted to improve it further, so we implemented a more rigorous peak-finding algorithm. First, smoothing out noise in the data increased accuracy by a wide margin. For the peak optimization process, a Gaussian smoothing filter was applied to the acceleration values in each axial direction with a window size of 50, so that each value is replaced by a Gaussian-weighted average over a fifty-element sliding window.

Absolute acceleration of walking dataset for raw (left) and smoothed (right) acceleration
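The smoothing step can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the project's MATLAB code: the function name `gaussian_smooth` is hypothetical, the standard deviation of one fifth of the window is an assumed choice (the write-up gives only the window size), and the test signal is synthetic.

```python
import math


def gaussian_smooth(signal, window=50, sigma=None):
    """Gaussian-weighted moving average; the window is clipped at the signal edges."""
    if sigma is None:
        sigma = window / 5.0  # assumed width, not taken from the project
    half = window // 2  # a centred window spans 2 * half + 1 samples
    smoothed = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        weights = [math.exp(-0.5 * ((j - i) / sigma) ** 2) for j in range(lo, hi)]
        total = sum(w * signal[j] for w, j in zip(weights, range(lo, hi)))
        smoothed.append(total / sum(weights))  # renormalise so the weights sum to 1
    return smoothed


# Synthetic example: a slow oscillation plus alternating high-frequency noise.
raw = [math.sin(2 * math.pi * t / 200) + 0.3 * (-1) ** t for t in range(400)]
smooth = gaussian_smooth(raw)
```

Dividing by the sum of the weights at every position keeps the output on the same scale as the input, including near the ends of the recording where part of the window falls outside the data.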

The following three algorithms were applied in conjunction to improve peak-finding:

Gauss-Newton Algorithm for a Quadratic Regression. The Gauss-Newton method is an iterative, gradient-based optimization algorithm often used for nonlinear regression. Each Gauss-Newton step requires the Jacobian matrix of partial derivatives of the fit equation with respect to its coefficients. For our peaks, we fitted a quadratic model, which serves as the basis for the Gaussian fit. The scheme went as follows: on each iteration, until the maximum number of iterations or the tolerance was reached, we computed the partials in the Jacobian and used a least-squares solution to update the array of approximation coefficients.
Caruana’s Algorithm to Fit a Gaussian Function. A Gaussian function is parameterized by its mean and standard deviation, and these parameters can be recovered from quadratic coefficients. This works because a Gaussian function is the exponential of a quadratic function, so the logarithm of a Gaussian is a quadratic. The quadratic coefficients were obtained with the Gauss-Newton algorithm, converted to the Gaussian parameters, and passed to the peak-finding algorithm to approximate the peak locations.
Finding Local Peaks Algorithm. The last component of this numerical process is the local peak-finder, which identifies local peaks in the smoothed dataset for every motion and axial direction. The algorithm scans the dataset and appends a peak location to an array when the criteria for a peak are met (zero slope and a maximum value above an amplitude threshold). When a peak is found, it calls the Gaussian fit described above to output the x and y values of the peak location (the mode of the fitted Gaussian).
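The three steps above can be sketched together in standard-library Python. This is a hedged illustration, not the project's .m files: all function names, the `half_width` of the fitting window, and the threshold are assumptions, and the "zero slope" test is approximated by comparing each sample with its two neighbours. Because a quadratic is linear in its coefficients, the Gauss-Newton loop here settles after its first step.

```python
import math


def solve3(A, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, 3):
            factor = M[r][col] / M[col][col]
            for c in range(col, 4):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * 3
    for r in (2, 1, 0):
        x[r] = (M[r][3] - sum(M[r][c] * x[c] for c in range(r + 1, 3))) / M[r][r]
    return x


def gauss_newton_quadratic(xs, ys, max_iter=20, tol=1e-10):
    """Fit y = a*x^2 + b*x + c by Gauss-Newton; returns [a, b, c]."""
    coeffs = [0.0, 0.0, 0.0]
    jac = [[x * x, x, 1.0] for x in xs]  # partials with respect to a, b, c
    for _ in range(max_iter):
        resid = [y - sum(j * p for j, p in zip(row, coeffs)) for row, y in zip(jac, ys)]
        # Normal equations (J^T J) step = J^T r give the least-squares step.
        JtJ = [[sum(row[i] * row[k] for row in jac) for k in range(3)] for i in range(3)]
        Jtr = [sum(row[i] * r for row, r in zip(jac, resid)) for i in range(3)]
        step = solve3(JtJ, Jtr)
        coeffs = [p + s for p, s in zip(coeffs, step)]
        if max(abs(s) for s in step) < tol:
            break
    return coeffs


def caruana_gaussian(xs, ys):
    """Fit ln(y) with a quadratic and convert it to (amplitude, mean, sigma)."""
    a, b, c = gauss_newton_quadratic(xs, [math.log(y) for y in ys])
    if a >= 0:
        return None  # the fit is not concave, so there is no Gaussian peak
    mean = -b / (2 * a)
    sigma = math.sqrt(-1 / (2 * a))
    amplitude = math.exp(c - b * b / (4 * a))
    return amplitude, mean, sigma


def find_peaks(signal, threshold, half_width=5):
    """Return refined (location, height) pairs for local maxima above threshold."""
    peaks = []
    offsets = list(range(-half_width, half_width + 1))
    for i in range(half_width, len(signal) - half_width):
        is_max = signal[i - 1] < signal[i] >= signal[i + 1]
        if not is_max or signal[i] <= threshold:
            continue
        window = [signal[i + d] for d in offsets]
        if min(window) <= 0:
            continue  # the log transform needs positive samples
        # Fitting on offsets from i keeps the normal equations well conditioned.
        fit = caruana_gaussian(offsets, window)
        if fit is not None:
            amplitude, mean, _ = fit
            peaks.append((i + mean, amplitude))
    return peaks


# Synthetic bump whose true centre (20.3) lies between two samples.
signal = [3.0 * math.exp(-0.5 * ((t - 20.3) / 4.0) ** 2) for t in range(41)]
peaks = find_peaks(signal, threshold=1.0)
```

The benefit of the Gaussian fit is sub-sample resolution: the discrete maximum of the synthetic bump sits at index 20, while the fitted mode recovers a location between samples. The log transform restricts this sketch to positive-valued windows, which suits absolute acceleration but would need an offset for signed axial data.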

Peak locations using refined peak finding algorithm for walking, running, and jumping data

Optimized data classified in Gauss Naïve Bayes model using refined and smoothed peaks as feature

The optimized peaks improved on the raw data in both training time and accuracy, yielding 90.6% overall accuracy for the model. In practice, the Gauss Naïve Bayes classifier with peak location optimization may prove more useful in time-pressured settings, such as real-time classification for fitness trackers, because it needs much less training time, while the deep CNN is better suited to settings where accuracy is the higher priority.