Inspiration

Have you heard of Interview Coder, or maybe Parakeet AI? The rise of AI tools like these challenges the integrity of technical interviews. They let candidates cheat, which can lead to less qualified individuals being hired over honest, capable engineers. This undermines fair evaluation and can hurt engineering teams. TrueTalent was created to address this issue head-on by giving interviewers the tools they need to assess a candidate's true abilities fairly and accurately.

What it does

TrueTalent is an advanced technical interviewing platform with integrated multi-modal AI detection. It gives interviewers real-time insights and post-interview analysis to identify candidates who may be using unauthorized aids. Its core features analyze multiple data streams:

- Visual Behavior Monitoring: Uses computer vision AI to analyze the interviewee's video stream for anomalies commonly associated with cheating, including:
  - Gaze Tracking: Detects sustained periods of looking away from the screen (up, down, side) that are inconsistent with typical thinking patterns.
  - Eye Movement Analysis: Identifies rapid or unnatural eye movements indicative of reading external material.
  - Interaction Detection: Flags suspicious hand movements or posture changes that suggest interaction with hidden devices.
- Auditory Pattern Recognition: Processes audio input to:
  - Identify Recitation: Detects speech patterns that may indicate reading from a script or receiving external audio cues.
  - Assess Understanding: Flags responses that sound overly rehearsed or lack markers of genuine comprehension.
  - Generate Follow-up Questions: Suggests relevant technical questions based on the candidate's verbal explanations to probe deeper understanding.
- Code Authenticity Analysis: Examines submitted code for:
  - AI Generation Likelihood: Assesses coding style, complexity, use of uncommon patterns, and comments for markers typical of LLM-generated code.
  - Plagiarism Detection: Optionally compares snippets against known online sources or databases.
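
To make the "sustained periods of looking away" idea under Gaze Tracking concrete, here is a minimal sketch. It assumes some upstream step has already labeled each sampled frame as on-screen or off-screen; the function name, the `LookAwayEvent` type, and the 3-second threshold are hypothetical choices for illustration, not TrueTalent's actual logic.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LookAwayEvent:
    """One continuous stretch of off-screen gaze, in seconds from the start."""
    start_s: float
    end_s: float

    @property
    def duration_s(self) -> float:
        return self.end_s - self.start_s


def find_sustained_look_away(
    on_screen: List[bool],
    fps: float = 1.0,             # assumed sampling rate of the labels
    min_duration_s: float = 3.0,  # assumed threshold, would need tuning
) -> List[LookAwayEvent]:
    """Return runs of consecutive off-screen frames lasting at least min_duration_s."""
    events: List[LookAwayEvent] = []
    run_start = None
    # A trailing True acts as a sentinel so a run that reaches the last frame is closed.
    for i, looking in enumerate(list(on_screen) + [True]):
        if not looking and run_start is None:
            run_start = i
        elif looking and run_start is not None:
            start_s, end_s = run_start / fps, i / fps
            if end_s - start_s >= min_duration_s:
                events.append(LookAwayEvent(start_s, end_s))
            run_start = None
    return events


if __name__ == "__main__":
    labels = [True, False, True, False, False, False, False, True, False, False]
    for event in find_sustained_look_away(labels):
        print(f"off-screen from {event.start_s}s to {event.end_s}s")
```

Short glances (one or two frames here) are ignored, which is one simple way to avoid flagging ordinary thinking pauses.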

How we built it

TrueTalent's detection engine is powered by Google's Gemini Pro (or Gemini 1.5 Flash) and uses its multi-modal processing strengths:

- Video Analysis (Gemini Vision): The frontend captures video (e.g., using the MediaRecorder API in MP4/WebM format) and sends it to a backend service (e.g., Python Flask with OpenCV). OpenCV extracts frames at regular intervals (e.g., 1 frame per second). These image frames are base64 encoded and sent as a sequence to the Gemini API, along with a prompt instructing it to look for visual cheating indicators (gaze, eye movement, hidden interaction). Gemini's response provides a likelihood score ('Low', 'Medium', 'High') and its reasoning.
- Audio Analysis (Gemini Audio/Speech-to-Text): The audio stream is captured and sent to a backend service, where it can be processed in chunks. Depending on the implementation, the audio might first be transcribed by a Speech-to-Text service, with the transcript then sent to Gemini for linguistic analysis (checking for unnatural phrasing, generating follow-ups). Alternatively, with a model that understands audio directly, the audio chunks could be sent to Gemini with prompts to analyze speech patterns or content.
- Code Analysis (Gemini Text): The code the candidate writes in the shared editor is sent to the Gemini API (likely via a dedicated backend endpoint). The prompt asks Gemini to evaluate how likely the code is to be AI-generated, considering factors like style, comments, complexity, and potential source similarity, and to return an assessment.

By integrating these distinct analysis streams, TrueTalent provides a robust, layered approach to verifying the authenticity of a candidate's performance during remote technical interviews.
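
The video step above (sample about one frame per second, base64 encode, send with a prompt) can be sketched with the standard library alone. This is an illustration under assumptions: the frames are taken to be JPEG bytes already produced by something like OpenCV, and the payload layout in `build_analysis_request` is a hypothetical shape, not the Gemini API's real request schema.

```python
import base64
from typing import Dict, List


def sample_frame_indices(
    total_frames: int,
    video_fps: float,
    samples_per_second: float = 1.0,
) -> List[int]:
    """Indices of the frames to keep so roughly samples_per_second frames survive."""
    if total_frames <= 0 or video_fps <= 0 or samples_per_second <= 0:
        return []
    step = max(1, round(video_fps / samples_per_second))
    return list(range(0, total_frames, step))


def encode_frame(jpeg_bytes: bytes) -> str:
    """Base64 encode one frame so it can travel inside a JSON body."""
    return base64.b64encode(jpeg_bytes).decode("ascii")


def build_analysis_request(frames: List[bytes], prompt: str) -> Dict:
    """Bundle the prompt and encoded frames.

    Hypothetical payload layout for illustration only; the real request
    format is defined by whichever Gemini client or endpoint is used.
    """
    return {
        "prompt": prompt,
        "frames": [
            {"mime_type": "image/jpeg", "data": encode_frame(frame)}
            for frame in frames
        ],
    }


if __name__ == "__main__":
    # A 3-second clip at 30 fps keeps frames 0, 30 and 60.
    print(sample_frame_indices(total_frames=90, video_fps=30.0))
    fake_frames = [b"frame-0", b"frame-1"]  # stand-ins for real JPEG bytes
    request = build_analysis_request(
        fake_frames, "Rate cheating likelihood as Low, Medium, or High."
    )
    print(len(request["frames"]), "frames in request")
```

Sampling before encoding keeps the request small, which matters given the latency the team ran into when processing video.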

Challenges we ran into

First, we had trouble retrieving text from the coding section of the screen. Initially, we used Tesseract OCR on screenshots captured every 5 seconds and sent the recognized text to the Gemini API. However, semicolons were sometimes read as 1's or i's, and the mouse cursor was also misread as a 1 or an i. To fix this, we read the text directly from the HTML elements that make up the coding section.

Second, while processing audio, we struggled to find a suitable library for speech recognition and transcription. Audio was harder to process than plain text and required an additional step before it could be fed into the Gemini API. Our team also accidentally deleted a section of the audio code due to a GitHub mistake, which was a valuable learning experience. Always push onto branches before merging the code! After transcribing the audio to text, we spent a substantial amount of time getting the text into the AI Assistant window.

Finally, we experimented with Gemini prompt engineering to analyze the interviewee's mannerisms from video input and identify what looks questionable. Speaking with Google representatives helped us refine our Gemini prompt. We also ran into latency problems while processing video, which led us to record the entire interview instead of having Gemini analyze the video live.
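
The move from OCR to reading the editor's HTML, described above, could look roughly like the sketch below. In the browser this would normally be done against the DOM; this version uses Python's standard `html.parser` to show the idea on the backend side. The element id `code-editor` and the names `EditorTextExtractor` and `extract_editor_text` are hypothetical.

```python
from html.parser import HTMLParser


class EditorTextExtractor(HTMLParser):
    """Collect the text inside the element whose id matches target_id."""

    # Tags that never have a closing tag, so they must not change nesting depth.
    VOID_TAGS = {"br", "img", "input", "hr", "meta", "link"}

    def __init__(self, target_id: str):
        super().__init__(convert_charrefs=True)
        self.target_id = target_id
        self.depth = 0  # 0 means we are outside the target element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.depth > 0:
            if tag == "br":
                self.chunks.append("\n")  # treat <br> as a line break
            if tag not in self.VOID_TAGS:
                self.depth += 1
        elif tag not in self.VOID_TAGS and dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth > 0 and tag not in self.VOID_TAGS:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0:
            self.chunks.append(data)


def extract_editor_text(html: str, target_id: str = "code-editor") -> str:
    """Return the exact characters in the editor element, with no OCR step."""
    parser = EditorTextExtractor(target_id)
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks)


if __name__ == "__main__":
    page = (
        '<div id="problem">Two Sum</div>'
        '<pre id="code-editor"><span>int x = 1;</span>\n<span>return x;</span></pre>'
        "<div>footer</div>"
    )
    print(extract_editor_text(page))
```

Because the characters come straight from the markup, semicolons and cursor position can no longer be misread the way they were in screenshots.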

Accomplishments that we're proud of

We're very proud of our front-facing user interface (UI). Our team brainstormed how to create an aesthetically pleasing, minimalist web application that rivals traditional technical interviewing tools like CodePair and CodeShare. We made two versions of our webpage, one for the interviewer and one for the interviewee. Interviewers can analyze the interviewee's code with Gemini by clicking the analyze code button, detect abnormalities in the interviewee's speech through audio analysis, and record the interviewee's body language, facial expressions, and eye contact by starting the video capture feature. TrueTalent has three distinct screens: a section integrated with the most popular technical problem site, LeetCode; a coding platform with syntax highlighting; and a fully working AI Assistant page. We improve the interviewer's experience by helping them hire true talent.

What we learned

- Juggling video, audio, and text analysis with Gemini is super powerful, but man, getting it all working together smoothly was complex!
- Turns out, getting the AI prompts just right is key. Asking the right questions makes a huge difference in catching suspicious stuff without flagging normal behavior.
- Handling live video and audio streams in the browser and getting them processed quickly on the backend? Yeah, that was definitely a challenge, with lots of tricky bits.
- It's actually pretty hard to tell if someone's just thinking hard or sneakily reading notes off-screen. We learned we need to be really careful not to jump to conclusions (minimize those false alarms!).

What's next for TrueTalent

- Make it Smarter: Keep tweaking the AI prompts and models to get even better at spotting real cheating while ignoring nervous habits.
- Speed it Up: Work on making the real-time analysis faster so the feedback feels more instant for the interviewer.
- More Detection Tricks: Maybe look into other clues, like someone suddenly pasting a huge chunk of code, or analyze the audio more deeply.
- Better Feedback: Improve how we show the analysis results by making them clearer and maybe pointing out when something suspicious happened.
- Keep it Fair: Always be thinking about fairness and making sure the tool isn't biased. That's super important.
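
As a rough idea of how the "suddenly pastes a huge chunk of code" clue might work, here is a small sketch that compares successive snapshots of the editor text. Nothing like this exists in TrueTalent yet; `flag_large_insertions`, `PasteFlag`, and both thresholds are hypothetical, and comparing only text length is a deliberate simplification.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class PasteFlag:
    """A moment where the editor grew faster than plausible typing."""
    timestamp_s: float
    chars_added: int


def flag_large_insertions(
    snapshots: List[Tuple[float, str]],  # (timestamp in seconds, editor text)
    max_chars_per_second: float = 15.0,  # assumed ceiling for typing speed
    min_chars: int = 80,                 # assumed minimum size worth flagging
) -> List[PasteFlag]:
    """Flag snapshot pairs where the text grew too much, too quickly."""
    flags: List[PasteFlag] = []
    for (t0, previous), (t1, current) in zip(snapshots, snapshots[1:]):
        elapsed = t1 - t0
        added = len(current) - len(previous)
        if elapsed <= 0 or added < min_chars:
            continue  # ignore bad timestamps, deletions, and small edits
        if added / elapsed > max_chars_per_second:
            flags.append(PasteFlag(timestamp_s=t1, chars_added=added))
    return flags


if __name__ == "__main__":
    history = [
        (0.0, ""),
        (5.0, "x" * 20),               # normal typing
        (10.0, "x" * 20 + "y" * 300),  # 300 characters in 5 seconds
        (15.0, "x" * 20 + "y" * 310),  # back to normal typing
    ]
    for flag in flag_large_insertions(history):
        print(f"{flag.chars_added} chars appeared by t={flag.timestamp_s}s")
```

A flag like this would only be a prompt for the interviewer to ask a follow-up question, since pasting a snippet from one's own notes or an earlier attempt is not cheating by itself.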
