Inspiration

Have you ever had a major assignment you needed to finish but couldn't figure out where to start? People often break large, complex projects into a series of smaller tasks - a process we call fragmentation. Working this way provides the internal validation that comes with checking off action items, which in turn builds the motivation to finish the project efficiently. The problem with current ways of fragmenting complex tasks is that people rarely stick to the splits: they skip ahead, overwork themselves, and reinforce poor work habits, which creates unnecessary stress. In virtual reality (VR), those rules can actually be enforced, so individuals follow healthy work patterns instead of unhealthy ones. This led us to develop a 3D mixed-reality (MR) simulation where individuals continue doing their work in the real world while benefiting from advanced visualizations in VR. To keep the user from overworking, their heart rate can be tracked through any standard smartwatch, and the VR environment can prevent them from working when it detects an abnormally high heart rate (an indicator of stress). On top of this, a virtual mechanism for fragmenting tasks lets us choose the work environment itself, and it is well-supported that individuals focus better in calm, natural settings.

What it does

Individuals experience the simulation through a head-mounted display (HMD) with built-in hand tracking, allowing them to interact with their virtual environment. The MR aspect is enabled through a window inside the virtual environment that passes through to the real world, so individuals can complete tasks like writing essays or building PowerPoint presentations on their computers while wearing the headset. The individual is placed in a forest environment and presented with an OpenAI Whisper window where they can dictate what they would like to say to the large language model (LLM). When prompted, the LLM first asks the individual what task they would like to complete. Given a prompt, the model fragments the task into 3-10 smaller tasks, detailing the title, description, and expected duration of each one. These are presented in a task bar that the individual can scroll through to view the list of items they need to complete. From there, the individual works through their list, checking items off as they progress through the simulation. The person's heart rate is tracked through the HypeRate API and passed into the VR simulation; once the simulation detects a heart rate over 100 beats per minute, it prevents the individual from seeing the real world, stopping them from working.
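
As a rough sketch of how that gate might be wired up in Unity (the class and field names here, like `HeartRateGate` and `passthroughWindow`, are illustrative rather than the project's exact code, and the HypeRate integration is assumed to deliver integer readings), a simple threshold check can toggle the passthrough window:

```csharp
using UnityEngine;

// Sketch of a heart-rate gate: while the latest reading stays above the
// stress threshold (100 bpm), the real-world passthrough window is hidden
// so the user is nudged to rest instead of continuing to work.
// Names are illustrative; the HypeRate integration is assumed to call
// OnHeartRate with each new reading.
public class HeartRateGate : MonoBehaviour
{
    [SerializeField] private GameObject passthroughWindow; // the MR "window" into the real world
    [SerializeField] private int stressThresholdBpm = 100;

    // Called whenever a new heart-rate reading arrives (e.g. from HypeRate).
    public void OnHeartRate(int bpm)
    {
        // Hide the work window while stressed; show it again once calm.
        passthroughWindow.SetActive(bpm <= stressThresholdBpm);
    }
}
```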

How we built it

The bulk of our simulation was built in Unity using the Universal Render Pipeline (URP). The OpenAI API was used to integrate Whisper for speech-to-text and ChatGPT to fragment a large user-given task into smaller tasks. The Meta XR All-in-One SDK was used to create a window in the virtual environment that passes through to the real world.
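
For a sense of how the ChatGPT side can be called from Unity, here is a minimal sketch that talks to the plain REST endpoint with `UnityWebRequest` rather than any SDK wrapper; the prompt wording, model name, and callback are illustrative assumptions, not the exact code we shipped:

```csharp
using System.Collections;
using System.Text;
using UnityEngine;
using UnityEngine.Networking;

// Minimal sketch of a ChatGPT request from Unity via the REST endpoint.
public class TaskFragmenter : MonoBehaviour
{
    private const string Endpoint = "https://api.openai.com/v1/chat/completions";
    [SerializeField] private string apiKey; // supplied via the inspector or a config file

    public IEnumerator FragmentTask(string userTask, System.Action<string> onResponseJson)
    {
        // Ask the model for strict JSON so the result can be parsed in Unity.
        string body = JsonUtility.ToJson(new ChatRequest
        {
            model = "gpt-3.5-turbo",
            messages = new[]
            {
                new ChatMessage { role = "system", content =
                    "Split the user's task into 3-10 subtasks. Respond only with JSON: " +
                    "{\"tasks\":[{\"title\":\"...\",\"description\":\"...\",\"expectedMinutes\":0}]}" },
                new ChatMessage { role = "user", content = userTask }
            }
        });

        using (var request = new UnityWebRequest(Endpoint, "POST"))
        {
            request.uploadHandler = new UploadHandlerRaw(Encoding.UTF8.GetBytes(body));
            request.downloadHandler = new DownloadHandlerBuffer();
            request.SetRequestHeader("Content-Type", "application/json");
            request.SetRequestHeader("Authorization", "Bearer " + apiKey);

            yield return request.SendWebRequest();

            if (request.result == UnityWebRequest.Result.Success)
                onResponseJson(request.downloadHandler.text);
            else
                Debug.LogError("ChatGPT request failed: " + request.error);
        }
    }

    [System.Serializable] private class ChatRequest { public string model; public ChatMessage[] messages; }
    [System.Serializable] private class ChatMessage { public string role; public string content; }
}
```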

Challenges we ran into

Initially, engineering the prompt for the model was difficult: the LLM would return the list without the required fields or would misunderstand the task itself. Finding a way to chain the output of OpenAI Whisper into ChatGPT was also tricky, since the original ChatGPT API was designed specifically for text. Once we had recognizable output from ChatGPT, integrating it into the virtual environment was another challenge, because we could not pre-program any objects and had to develop a better understanding of how to store and parse the output ChatGPT returned.
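
One concrete wrinkle: Unity's `JsonUtility` cannot parse a bare top-level JSON array, so a workable approach (sketched below with illustrative names, not necessarily our exact schema) is to ask the model for a wrapper object and mirror it with serializable classes:

```csharp
using System;
using UnityEngine;

// Sketch of parsing the model's JSON reply into task objects. The prompt
// asks for a wrapper object ({"tasks":[...]}) that maps onto TaskListWrapper.
public static class TaskParser
{
    [Serializable]
    public class ParsedTask
    {
        public string title;
        public string description;
        public int expectedMinutes;
    }

    [Serializable]
    public class TaskListWrapper
    {
        public ParsedTask[] tasks;
    }

    public static ParsedTask[] Parse(string modelContent)
    {
        // modelContent is the assistant message's "content" string, e.g.
        // {"tasks":[{"title":"Outline","description":"...","expectedMinutes":15}]}
        TaskListWrapper wrapper = JsonUtility.FromJson<TaskListWrapper>(modelContent);
        return wrapper != null ? wrapper.tasks : Array.Empty<ParsedTask>();
    }
}
```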

Accomplishments that we're proud of

This was our first hackathon project to integrate virtual reality with an LLM. We were also able to make meaningful changes to our environment based on the LLM's output, which was well worth the effort it took to get working. Overall, it's a product that could reasonably improve productivity while discouraging poor work habits.

What we learned

We learned different methods of prompt engineering, particularly the importance of word choice when a very specific output is needed from ChatGPT. We made use of a singleton object that stores each of the 3-10 items returned by ChatGPT for use in other parts of the scene. Lastly, we figured out how to lay out the item list using Unity's Horizontal Layout Group component.
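
A sketch of that singleton, with illustrative names (`TaskStore`, `taskItemPrefab`) and reusing the `ParsedTask` type from the parsing sketch above; the task bar's parent object carries the Horizontal Layout Group component, so instantiating one card per subtask under it lays the list out automatically:

```csharp
using System.Collections.Generic;
using UnityEngine;

// One persistent object holds the parsed tasks so any scene component
// (the task bar, the check-off logic) can read them.
public class TaskStore : MonoBehaviour
{
    public static TaskStore Instance { get; private set; }

    public List<TaskParser.ParsedTask> Tasks = new List<TaskParser.ParsedTask>();

    [SerializeField] private Transform taskBarRoot;     // parent with a Horizontal Layout Group
    [SerializeField] private GameObject taskItemPrefab; // one UI card per subtask

    private void Awake()
    {
        if (Instance != null && Instance != this) { Destroy(gameObject); return; }
        Instance = this;
        DontDestroyOnLoad(gameObject);
    }

    // Rebuild the scrollable task bar; the layout group spaces the cards.
    public void Populate(TaskParser.ParsedTask[] parsed)
    {
        Tasks.Clear();
        Tasks.AddRange(parsed);

        foreach (Transform child in taskBarRoot) Destroy(child.gameObject);
        foreach (var task in Tasks)
        {
            // A view script on the prefab (not shown) would display the
            // title, description, and expected duration.
            Instantiate(taskItemPrefab, taskBarRoot);
        }
    }
}
```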

What's next for LockInVision

Since we were busy building the product itself, we haven't yet tested how well it handles a variety of task types (writing reports versus writing actual code), so the next step is to test it and find its weaknesses. Beyond that, we hope to add a gaze detector that can alert individuals when they have been distracted for longer periods of time.
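
A very rough sketch of how such a gaze check might begin, using nothing more sophisticated than the headset's forward direction relative to the work window (all thresholds and names here are placeholders):

```csharp
using UnityEngine;

// Planned attention alert: if the headset has been pointed away from the
// work window for too long, nudge the user.
public class GazeDistractionMonitor : MonoBehaviour
{
    [SerializeField] private Transform headset;      // e.g. the center-eye camera
    [SerializeField] private Transform workWindow;   // the MR passthrough window
    [SerializeField] private float lookAwaySeconds = 30f;
    [SerializeField] private float facingDotThreshold = 0.5f; // roughly a 60-degree cone

    private float lookAwayTimer;

    private void Update()
    {
        Vector3 toWindow = (workWindow.position - headset.position).normalized;
        bool facingWindow = Vector3.Dot(headset.forward, toWindow) > facingDotThreshold;

        lookAwayTimer = facingWindow ? 0f : lookAwayTimer + Time.deltaTime;

        if (lookAwayTimer > lookAwaySeconds)
        {
            Debug.Log("User appears distracted; show a gentle reminder.");
            lookAwayTimer = 0f;
        }
    }
}
```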

Built With

Unity (URP), C#, OpenAI (ChatGPT + Whisper), Meta XR All-in-One SDK, HypeRate API