Cornell Does It Again: Sonar+AI for eye-tracking

If you remember, Cornell has previously applied sonar+AI to upper-body tracking (PoseSonic) and face tracking (EchoSpeech).

Now Cornell has released another paper, on GazeTrak, which uses sonar acoustics with AI to track eye movements. From the abstract:

Our system only needs one speaker and four microphones attached to each side of the glasses. These acoustic sensors capture the formations of the eyeballs and the surrounding areas by emitting encoded inaudible sound towards eyeballs and receiving the reflected signals. These reflected signals are further processed to calculate the echo profiles, which are fed to a customized deep learning pipeline to continuously infer the gaze position. In a user study with 20 participants, GazeTrak achieves an accuracy of 3.6° within the same remounting session and 4.9° across different sessions with a refreshing rate of 83.3 Hz and a power signature of 287.9 mW.
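The quoted pipeline (encoded inaudible sound, reflected signals, echo profiles, deep learning) isn't spelled out in code in the excerpt, so here is a minimal sketch of what computing an "echo profile" could look like. It assumes a cross-correlation of each microphone frame against an FMCW-style sweep; the sample rate, sweep band, frame length, and function names are all hypothetical, not taken from the paper.

```python
import numpy as np
from scipy.signal import chirp, correlate

# Hypothetical parameters -- not taken from the GazeTrak paper.
FS = 50_000               # microphone sample rate (Hz)
SWEEP_LEN = 0.012         # one encoded sweep (s); one profile per sweep would land near 83 Hz
F0, F1 = 18_000, 21_500   # inaudible sweep band (Hz)

def probe_sweep() -> np.ndarray:
    """Generate one inaudible linear frequency sweep used as the probe signal."""
    t = np.arange(0, SWEEP_LEN, 1 / FS)
    return chirp(t, f0=F0, f1=F1, t1=SWEEP_LEN, method="linear")

def echo_profile(mic_frame: np.ndarray, probe: np.ndarray) -> np.ndarray:
    """Cross-correlate a received microphone frame with the probe.

    Peaks in the result correspond to reflections arriving at different
    delays (eyeball surface, eyelid, surrounding skin) -- an echo profile.
    """
    profile = correlate(mic_frame, probe, mode="valid")
    return profile / (np.linalg.norm(probe) + 1e-12)  # normalize for comparability

def differential_profile(prev: np.ndarray, curr: np.ndarray) -> np.ndarray:
    """Subtract consecutive profiles so static reflections cancel out,
    leaving the motion-related changes a learned model would consume."""
    return curr - prev
```

One could then stack such profiles across the microphones and over time and feed them to a learned regressor that outputs gaze position, along the lines of the customized deep learning pipeline the abstract mentions; the paper's actual signal design and model may differ.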

There is a major drawback, however, as summarized by Mixed News:

Because the shape of the eyeball differs from person to person, the AI model used by GazeTrak has to be trained separately for each user. To commercialize the eye-tracking sonar, enough data would have to be collected to create a universal model.

Still, Cornell has now published research touting sonar+AI as a replacement for camera sensors (visible and infrared) for body, face, and now eye tracking. That opens the door to VR and AR hardware that is smaller, more energy efficient, and more privacy friendly. I'm excited for this work.

Video of GazeTrak (eye-tracking)

Video of PoseSonic (upper-body tracking)

Video of EchoSpeech (face-tracking)
