Recap From Day 062
I said that in the coming days we would see methods that can capture how we are performing a gesture, while we are performing it.
Let’s get into it.
Working With Time
The first method that we will see, starting today, is a real-time version of DTW. What does that mean? DTW aligns an input sequence onto a template in order to compute a similarity measure between the two sequences. This first method, Gesture Follower, is able to align the incoming sequence onto the template and compute their similarity on the fly, that is, at each new incoming feature value, while we are performing the gesture.
As soon as we start performing a gesture, the method is able to recognize which gesture it is by giving us its index or label, and it is able to align it to the recognized gesture by giving us a continuous value corresponding to the progression of the executed gesture within the template. The method is called Gesture Follower because it operates as if it were following the gesture while it is being performed.
This method was developed at IRCAM in Paris by Frederic Bevilacqua and colleagues. The Gesture Follower is a system for real-time following and recognition of time profiles. In the example in the video below, the Gesture Follower learns three gestures, i.e. drawings made with the mouse, while simultaneously recording voice data.
In the video above, during the “performance”, the Gesture Follower recognizes which gesture is being performed and plays the corresponding sound, time-stretched or compressed depending on the pace of the gesture.
Let’s inspect how Gesture Follower works. Like in DTW, a gesture is represented as a sequence of feature vectors. Each feature vector in the sequence is a point of the gesture trajectory, which means a snapshot of the gesture at a certain time. GF stores each gesture template as such a sequence of feature vectors. Comparing each new incoming feature vector against a stored template then gives the alignment of the incoming gesture onto that template, as seen below.
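To make the idea concrete, here is a minimal sketch of incremental alignment onto one template. This is not Gesture Follower’s actual implementation (which is probabilistic, HMM-based); it is an online DTW-style column update, assumed here for illustration: each incoming feature vector updates one column of accumulated costs, and the best cell tells us how far into the template we are.

```python
import math

def euclidean(a, b):
    """Distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

class TemplateAligner:
    """Incrementally aligns an incoming feature stream onto one template,
    using an online DTW-style column update (an illustrative stand-in for
    Gesture Follower's probabilistic alignment)."""

    def __init__(self, template):
        self.template = template  # list of feature vectors
        self.col = None           # accumulated-cost column, one cell per template point

    def step(self, x):
        """Consume one incoming feature vector x.
        Returns (progression in (0, 1], accumulated cost of the best alignment)."""
        n = len(self.template)
        d = [euclidean(x, t) for t in self.template]
        new = [0.0] * n
        if self.col is None:
            # First input frame: the alignment must start at the template's start;
            # reaching template point j means the first frame covered points 0..j.
            new[0] = d[0]
            for j in range(1, n):
                new[j] = d[j] + new[j - 1]
        else:
            # Standard DTW recurrence, computed one column at a time.
            new[0] = d[0] + self.col[0]
            for j in range(1, n):
                new[j] = d[j] + min(self.col[j], self.col[j - 1], new[j - 1])
        self.col = new
        j_best = min(range(n), key=new.__getitem__)
        return (j_best + 1) / n, new[j_best]
```

Feeding the template’s own points back in, frame by frame, yields a progression climbing toward 1.0 at zero cost, which is exactly the “continuous progression value” described above.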
So Gesture Follower gives an alignment for every template, but what we want in the end is the likeliest one. To do so, GF takes the distances between the incoming points and each template, given by the previous alignment, and builds a probability distribution over the templates. This probability distribution tells us which template the input gesture most likely resembles, and the template with the highest probability becomes the outcome of the classification.
That’s all for day 063. I hope you found this informative. Thank you for taking time out of your schedule and allowing me to be your guide on this journey. And until next time, be legendary.