Inspiration

As technology rapidly advances, more and more of our daily lives are becoming digitized. Technology has brought great gains in efficiency to nearly every part of daily life, and we seldom need to pick up a pen and paper anymore.

However, there is one exception: math. Even today, when our entire lives can be lived virtually without leaving our rooms, inputting complex mathematical symbols on a keyboard is still painfully slow. One of our team members ran into this problem during remote learning, when he was required to submit his work in LaTeX. Without much prior coding experience, he struggled to type out complex mathematical expressions and yearned for a faster way to submit his work. This problem is faced not only by our team, but also by thousands of passionate mathematicians around the world. We set out to tackle it using state-of-the-art technologies such as machine learning, paving the way for future innovation in math-to-code translation.

What it does

Our project aims to efficiently and reliably convert handwritten mathematical expressions into LaTeX. Expressions can be drawn directly in the interactive tool on the website, or uploaded as an image. The website then uses a machine learning model trained on thousands of images to decipher the handwritten expression and convert it into LaTeX.

How we built it

We researched and experimented with many artificial intelligence techniques, including heuristic methods, support vector machines, convolutional neural networks, and recurrent neural networks, before settling on a sequence-to-sequence attention-based transformer model. This is the same type of model that is responsible for recent major innovations in machine learning, such as the famous GPT-3 and DALL·E. We based our transformer on the TrOCR architecture described in this paper: link.
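To give a sense of the kind of model we chose: the defining operation of an attention-based transformer is scaled dot-product attention, in which each query computes a similarity-weighted average over the values. Below is a minimal pure-Python sketch of that single operation (illustrative only; our actual model stacks many multi-head attention layers over image and token embeddings, using a tensor library):

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention: for each query, return a
    weighted average of the values, weighted by the query's
    (scaled) dot-product similarity to each key."""
    d = len(queries[0])  # key/query dimension, used for scaling
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out
```

A query that is similar to a given key pulls the output toward that key's value, which is how the decoder learns to "look at" the relevant part of the input when emitting each output token.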

Due to a relative shortage of training data for mathematical handwriting, we first trained our model on handwritten English text, for which data is available in abundance. After a couple of hours of training, our model worked well on English text, at which point we stopped feeding it English text and instead started fine-tuning the model on mathematical expressions. This technique is known as “transfer learning” and is very commonly used in the field.
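The shape of that transfer-learning schedule can be sketched with a toy stand-in model: pretrain on an abundant source task, then continue training the resulting weights on a scarce target task at a lower learning rate. Everything here (the one-parameter model and both datasets) is hypothetical, purely to show the two-stage schedule:

```python
def train(weight, data, lr, steps):
    """One-parameter least-squares model y ≈ weight * x,
    trained by plain gradient descent."""
    for _ in range(steps):
        for x, y in data:
            grad = 2 * (weight * x - y) * x
            weight -= lr * grad
    return weight

# Abundant source task: y = 2x (stands in for English handwriting).
source = [(x, 2.0 * x) for x in [0.5, 1.0, 1.5, 2.0]]
# Scarce target task: y = 2.2x (stands in for math expressions).
target = [(1.0, 2.2), (2.0, 4.4)]

w_pre = train(0.0, source, lr=0.05, steps=200)  # pre-training
w = train(w_pre, target, lr=0.01, steps=50)     # fine-tuning
```

Because the fine-tuning stage starts from weights that are already close to a good solution, the small target dataset only has to nudge the model, not teach it from scratch.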

After several hours of training, we had a working model for transcribing mathematical text, and we moved on to building an application around it. We wrote a simple backend in Python using Starlette to handle requests containing images, feed those images through our trained model, and send back the transcribed text. We then built a frontend website in TypeScript using Svelte, which lets the user either draw an expression or upload an image, communicates with the backend, and displays the transcribed result. The result is a fully functional, easy-to-use website that accepts either images or direct handwriting as input and returns the LaTeX output from the machine learning model.

Challenges we ran into

As none of our members had extensive experience with artificial intelligence and machine learning, it took a lot of research and experimentation to arrive at a working model. We also had to search extensively to obtain quality datasets. Most of the datasets we found were extremely limited in scope or too low-quality for our purposes. Ultimately, we settled on a Pearson dataset containing high-quality images of mathematical expressions commonly found in calculus, but no other topics. Though this was not as broad a dataset as we had hoped for, it was acceptable given the hackathon's time constraints.

Accomplishments that we're proud of

As this is the first hackathon for most of our team members, we are very proud that we could work together to create a working product. Our team members are also very new to machine learning, and we are happy that we built a working model trained on a fairly limited dataset. Overall, although there are still many things we could improve about the product, we are proud to have created a complete product with fully functional frontend and backend within the time limit.

What we learned

We learned to evaluate different machine learning models and choose the one best suited to our dataset. We also learned how to train our model efficiently while avoiding overfitting. Lastly, we learned to collaborate as a team despite schedule conflicts and on-site Wi-Fi problems. Our only prior experience with machine learning was superficial, so it was very rewarding to dive deeper into the field and understand how some of the newest innovations work under the hood. Building TeXify in 24 hours was a very enriching learning experience, and we hope to continue improving our project in the future.
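One standard guard against overfitting is early stopping: monitor validation loss during training and halt once it has stopped improving for a set number of epochs. A minimal sketch of the idea (the loss values below are a made-up sequence, purely to illustrate the classic fall-then-rise overfitting curve):

```python
def early_stop_epoch(val_losses, patience=2):
    """Return the epoch index at which training would stop: the
    first epoch where validation loss has failed to improve for
    `patience` consecutive epochs, or the last epoch otherwise."""
    best = float("inf")
    bad_epochs = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            bad_epochs = 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                return epoch
    return len(val_losses) - 1

# Validation loss falls, then rises as the model starts to memorize.
losses = [1.0, 0.6, 0.4, 0.45, 0.5, 0.6]
```

With `patience=2`, training on this curve would stop at epoch 4, two epochs after the best validation loss at epoch 2.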

What's next for TeXify

The website we built during CalHacks is simply an exploration of the underlying technology we’ve created. In the future, we’d like to build different applications of this technology. For example, one application we envision is a tablet app that lets you handwrite math with a stylus while transcribing it in real time to your computer. This would be very helpful for students typing up homework solutions, or even for real-time note-taking during lectures, where efficiency is especially important.

As for the technology itself, we’d like to expand the domains of mathematics our model can handle. As mentioned above, our model currently only works well with calculus-related expressions, since that is what the vast majority of our training data covered. However, TeXify would be especially useful for topics like linear algebra, where it may be necessary to write out large matrices, and we’d really like to extend our model to handle them.

Another possible application of this technology that we are very excited about is a kind of “mathematical spellcheck,” so that typos and accidental mistakes can be identified and corrected with ease.
