SmartCheck
Inspiration
The inspiration for SmartCheck came from our own experiences and observations as students, backed by research. We witnessed the inherent flaws and biases in the traditional manual evaluation of handwritten exam answer sheets. Specifically for students in India, we noticed the following:
Teachers generally don’t read the entire answer sheet; instead they skim through looking for main points, and so are prone to missing important information the student has written. Sources: 1 2
Teachers are generally biased towards better and neater handwriting rather than the actual content and quality of the answer. Sources: 1 2
Manual checking is slow, so given the time constraints teachers face, faster checking is prioritised and quality is sacrificed. Sources: 1
Our Personal Anecdotes:
"I am currently a college student. Since the first semester in my college, our teachers have been explaining to us the optimal way to give an exam in order to score high marks. They advised us to fill as many pages as possible in our answer sheets, as teachers often award marks based on the length rather than thoroughly checking the answers."
"We were also instructed to write neatly and in bullet points, as teachers tend to skim through answer sheets, making it easier for them to spot key points without missing information."
"I have personally witnessed our 10th-grade English teacher checking our answer sheets. The teacher had to evaluate an answer sheet which took students 3 hours to write within 2-3 minutes. Consequently, the teacher often overlooked the quality of answers and preferred neater handwriting instead. This is the bias we aim to remove."
Learning Experience
Through this project, we delved into the exciting realms of computer vision, natural language processing, and cutting-edge AI technologies. We learned about optical character recognition (OCR) techniques for text extraction, transformer-based language models such as Gemini for extracting key points, and object detection models like YOLO for diagram recognition. We also gained insights into prompt engineering and into building streamlined user interfaces for efficient grading.
Something we found truly astounding was the zero-shot capability of the Gemini language and vision models, which saved us a lot of time!
Project Development
- Data Collection: We gathered past exam answer sheets from our college faculty, covering various question formats, and digitized them using scanners or phone cameras.
- Diagram Extraction & Question seperation: We trained a YOLOv8 models, specifically trained on student answer sheets, to recognize and extract diagrams drawn by students, and separate the start and ending of questions.
- Text and Key Point Extraction: We used the Gemini Vision API's OCR capabilities to extract handwritten text from the scanned images and to output the main points written by the student (see the OCR sketch after this list).
- Interface Development: We built a user interface with Streamlit that presents teachers with a structured view of the question, the extracted key points, and the recognized diagrams, plus the option to view the original answer sheet if needed (see the interface sketch after this list).
- Database Integration: We also aim to streamline the process of updating university databases with the evaluated scores, ensuring a seamless record-keeping workflow.
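For illustration, here is a minimal sketch of the diagram extraction and question separation step using the Ultralytics YOLOv8 API. The weights filename and the class labels ("diagram", "question_start") are assumptions about the fine-tuned model, not our exact setup:

```python
# Sketch: detect diagrams and question boundaries on one scanned page.
# Weights file and class names below are illustrative assumptions.
from ultralytics import YOLO
from PIL import Image

model = YOLO("answersheet_yolov8.pt")  # hypothetical fine-tuned weights

def extract_regions(page_path: str):
    """Crop detected diagrams and collect question-start positions."""
    results = model(page_path)[0]          # single-image inference
    page = Image.open(page_path)
    diagrams, question_marks = [], []
    for box in results.boxes:
        cls_name = results.names[int(box.cls)]
        x1, y1, x2, y2 = map(int, box.xyxy[0].tolist())
        if cls_name == "diagram":
            diagrams.append(page.crop((x1, y1, x2, y2)))
        elif cls_name == "question_start":
            question_marks.append(y1)      # y-coordinate marks a boundary
    return diagrams, sorted(question_marks)
```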
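The text and key point extraction step could look like the following sketch against the google-generativeai SDK; the model name and prompt wording here are assumptions, not the exact ones we used:

```python
# Sketch: transcribe handwriting and extract key points with Gemini.
import google.generativeai as genai
from PIL import Image

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-1.5-flash")  # model choice is illustrative

def transcribe_and_summarize(page_path: str) -> str:
    """Return a transcription followed by a bulleted list of key points."""
    page = Image.open(page_path)
    prompt = (
        "Transcribe the handwritten answer in this image, then list the "
        "main points the student makes, one per line prefixed with '-'."
    )
    response = model.generate_content([prompt, page])
    return response.text
```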
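And a rough sketch of the Streamlit grading view; the widget layout and field names are illustrative rather than our exact interface:

```python
# Sketch: structured grading view for one question.
import streamlit as st

def render_answer(question: str, key_points: list[str], diagrams, original_scan):
    st.header(question)
    st.subheader("Extracted key points")
    for point in key_points:
        st.markdown(f"- {point}")
    if diagrams:
        st.subheader("Detected diagrams")
        for img in diagrams:
            st.image(img)
    with st.expander("View original answer sheet"):
        st.image(original_scan)
    # Teacher enters the score here; range is an illustrative assumption.
    return st.number_input("Marks awarded", min_value=0, max_value=10)
```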
Challenges Faced
One of the primary challenges we faced was fine-tuning the AI models to achieve high accuracy in text extraction and key point identification, given the variability in handwriting styles and subject-specific terminology. Additionally, creating a metric to verify the quality of the key points Gemini extracted from an answer sheet was difficult; one candidate is sketched below.
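One plausible metric, sketched here under the assumption that teachers supply a rubric of expected points, scores the extracted key points by embedding similarity. The sentence-transformers model and the 0.6 threshold are illustrative guesses, not a validated grading formula:

```python
# Sketch: how well do extracted key points cover a teacher's rubric?
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # model choice is an assumption

def coverage_score(extracted: list[str], rubric: list[str]) -> float:
    """Fraction of rubric points matched by at least one extracted key point."""
    rub_emb = model.encode(rubric, convert_to_tensor=True)
    ext_emb = model.encode(extracted, convert_to_tensor=True)
    sims = util.cos_sim(rub_emb, ext_emb)        # rubric x extracted matrix
    matched = sims.max(dim=1).values > 0.6       # 0.6 threshold is a guess
    return matched.float().mean().item()
```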