Inspiration

As we transition back to in-person meetings, we often find ourselves in situations where most team members can meet physically while others must join virtually. These hybrid meetings are often challenging because of the disconnect between the two groups. RoboTuber was developed to address these challenges by providing a seamless medium for the two groups to communicate.

What it does

RoboTuber is an advanced webcam stand-in interface. It shares what it “sees” with virtual members and uses facial tracking to replicate their facial motions and expressions in the room. By conveying eye blinks, mouth motion, head nods and shakes, and eyebrow furrows, it opens nonverbal avenues of communication that make these meetings much simpler and more enjoyable. Other features, such as remote pan and tilt, make virtual participants far more independent and let them stay engaged with a moving conversation.

How we built it

We built RoboTuber with OpenSeeFace, an open-source facial tracking library that let us track facial feature movements such as blinking, mouth opening, and eyebrow movement. We sent this information to an Elixir server over a UDP connection; the server processed the data, transformed it into commands for the physical interface, and pushed those commands out over WebSockets. Unfortunately, network lag and delay kept us from fully integrating the system, but we were able to fully track facial motion and turn it into commands that the physical interface could read separately.
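
A minimal sketch of the tracker-side hook might look like the following. The six-float feature layout and the server address are our illustrative assumptions, not OpenSeeFace’s actual wire format; the real fork pulls these values out of OpenSeeFace’s per-frame face data:

```python
import socket
import struct

# Hypothetical address of the Elixir backend's UDP listener.
SERVER_ADDR = ("127.0.0.1", 9999)

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

def send_frame(left_eye, right_eye, mouth_open, eyebrow, yaw, pitch):
    """Pack one frame of tracked features (floats normalized to
    0.0-1.0) into a fixed 24-byte binary frame and send it over UDP."""
    payload = struct.pack("<6f", left_eye, right_eye, mouth_open,
                          eyebrow, yaw, pitch)
    sock.sendto(payload, SERVER_ADDR)

# Example frame: eyes open, mouth closed, neutral brows, head centered.
send_frame(1.0, 1.0, 0.0, 0.5, 0.5, 0.5)
```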

RoboTuber’s physical interface consists of a Raspberry Pi connected to an Arduino. The Pi picks up commands from the Elixir server and relays them over serial to the Arduino, which maps the values based on the type of command. For panning and tilting, it maps a decimal value onto the total range of motion: a value of 0.5 makes the robot face directly forward, while a value of 0 makes it look fully to the right.
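
A sketch of the Pi-side relay, assuming a hypothetical WebSocket endpoint and serial device path; the servo mapping described in the comment is what the Arduino firmware performs on its end:

```python
import serial                             # pyserial
from websocket import create_connection   # websocket-client package

# Hypothetical server endpoint and serial device path.
WS_URL = "ws://robotuber.local:4000/commands"
arduino = serial.Serial("/dev/ttyACM0", baudrate=115200)

ws = create_connection(WS_URL)
while True:
    # Each message is a command such as "PAN 0.50". The Pi relays it
    # verbatim; the Arduino maps the 0.0-1.0 value onto the servo's
    # range of motion (roughly angle = value * 180, so 0.5 -> 90,
    # facing forward, and 0.0 -> fully right).
    command = ws.recv()
    arduino.write(command.encode() + b"\n")
```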

Challenges we ran into

One of the biggest challenges we ran into was interfacing the OpenSeeFace library with our backend server. Since OpenSeeFace does not expose a public API for this, we had to fork the original repo and modify it to send tracking data to our own custom backend. Another challenge was getting facial tracking data across the network to the physical interface as quickly as possible. To solve this, we encoded the data in a compact binary format and sent it over a UDP connection to eliminate as many delays as possible.
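
For a sense of why raw binary won out, here is an illustrative comparison in Python (the actual decoding happens in the Elixir server; the six-float layout matches the sketch above and is our assumption):

```python
import json
import struct

features = {"left_eye": 1.0, "right_eye": 1.0, "mouth_open": 0.0,
            "eyebrow": 0.5, "yaw": 0.5, "pitch": 0.5}

# Text encoding: roughly 100 bytes per frame, plus string parsing
# on the receiving end.
as_json = json.dumps(features).encode()

# Binary encoding: a fixed 24-byte frame, decoded with a single unpack.
as_binary = struct.pack("<6f", *features.values())
left_eye, right_eye, mouth_open, eyebrow, yaw, pitch = struct.unpack(
    "<6f", as_binary)

print(len(as_json), len(as_binary))  # 96 24
```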

Accomplishments that we're proud of

We are most proud that we were able to use facial tracking software to accurately determine important features: whether the eyes and mouth are open or closed, head position, and eyebrow position.

What we learned

We learned a lot about optimizing data transfer and streamlining command formats to reduce the bandwidth needed to share information. With three main nodes (the Elixir server, the Raspberry Pi, and the Arduino), this was key to making RoboTuber effective.

What's next for RoboTuber

We would like RoboTuber to replicate the facial motions of remote users with much lower latency, and to make it possible to “call” the device remotely. That way, meeting members could answer via RoboTuber the same way they would an intercom or landline, without connecting to a dedicated computer.

Built With

Arduino, Elixir, OpenSeeFace, Raspberry Pi, UDP, WebSockets
