Inspiration

With our eyes set on joining the Snap Spectacles track and given the theme of “Illumination”, we immediately thought of augmented reality applications oriented towards education and learning. We were inspired by the Japanese language learning game Shashingo, where players take photos of objects around a fictional city and learn vocabulary. We wanted to implement something similar, but in reality instead.

What it does

The app starts with a prompt asking which language the user wants to learn, answered via voice input so no typing is needed. From there, the user can enter one of two modes: Free Roam and Recall. In Free Roam, the user walks around pinching and scanning objects, and the word in the target language, its pronunciation, and the English word pop up. Each word is added to the user's learned word bank, which can be practiced in Recall mode. There, our little fluff ball Maxwell gives the player a prompt, and the player has to scan an object that matches the word. Succeed and Maxwell is happy; fail and he detonates. This repeats until the player has gone through their entire word bank, after which they return to Free Roam to add more words to remember.

How we built it

We built our program in Snap's Lens Studio. The built-in VoiceML module detects the user's voice for the language selection menu, and user interactions are handled with the SpectaclesInteractionKit. Our cropping system is based on Snap's existing Crop sample, modified to use Gemini and return a structured JSON output consumed by the UI and game logic. We also created a typewriter-like interface that displays detailed definitions of each word. In addition, we created the music and sound effects for our project in FL Studio.
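As a rough illustration of the two pieces around the Gemini call, here is a minimal TypeScript sketch: parsing the model's JSON response into a word entry, and generating the frames for a typewriter-style reveal. The `WordEntry` field names and the function names are illustrative assumptions, not the exact schema or identifiers used in our lens.

```typescript
// Hypothetical shape of the JSON we ask Gemini to return for a scanned
// object; the actual field names in our lens may differ.
interface WordEntry {
  word: string;          // word in the target language
  pronunciation: string; // romanized pronunciation
  english: string;       // English translation
}

// Parse the model's raw text response, tolerating a missing or garbled payload.
function parseWordEntry(raw: string): WordEntry | null {
  try {
    const data = JSON.parse(raw);
    if (
      typeof data.word === "string" &&
      typeof data.pronunciation === "string" &&
      typeof data.english === "string"
    ) {
      return data as WordEntry;
    }
  } catch (_e) {
    // fall through to the null return below
  }
  return null;
}

// Typewriter-style reveal: the successive substrings the UI would
// display one tick at a time.
function typewriterFrames(text: string): string[] {
  const frames: string[] = [];
  for (let i = 1; i <= text.length; i++) {
    frames.push(text.slice(0, i));
  }
  return frames;
}
```

For example, a response like `{"word":"kissa","pronunciation":"kis-sa","english":"cat"}` parses into an entry whose definition text is then revealed character by character.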

Challenges we ran into

The major challenge was working on an entirely new platform, Lens Studio. It resembles other game engines, but it has its own quirks and issues. Notably, we spent a lot of time trying to get the connection to the device to work; even after following the documented steps, we had a hard time getting things to pair. Coming from Unity, most of us were also working in TypeScript for the first time and had to work around being limited to a single scene. There were frequent issues with cached project files causing the project to work in simulation but not on the headset itself.

On the ideation side, we initially wanted to use the speech-to-text system to create a live transcription of foreign languages and let the user read the definition of a selected word. We decided against it: the decreased eye contact, plus the automatic noise suppression of the internal microphone (which is optimized for listening to the user rather than the environment), would hurt the user experience. We also wanted to give Maxwell text-to-speech to help pronounce words, but Gemini does not currently support generating audio files, and the other APIs looked too involved to integrate in such a time crunch.

Towards the end, we realized that we had not planned our version control well, so we also had to spend hours merging all of our separate commits into one app.

Accomplishments that we're proud of

  • This was our first time working with Lens Studio, and since we only had prior experience in Unity, learning a whole new platform was a real feat.
  • We communicated well and built the project as a team, working through issues together all the way to the final crunch.
  • Having fun with the workshops and events, and meeting like-minded and creative individuals!

What we learned

  • Using Gemini API Calls
  • Programming with TypeScript
  • Using Lens Studio
  • To set up a shared GitHub repository from the start, instead of working on files individually and merging them later.

What's next for Kääntää

  • Add text-to-speech using Google's Voice API to help with learning via verbal language
  • Add speech-to-text for pronunciation practice
  • Memory recall system similar to Anki with a more advanced algorithm

This project was based on the Crop sample project provided by Snap, so we would like to give our gratitude to its creator(s) as well.
