InterviewPrepApp
InterviewPrepApp is a full-stack voice interview simulator built as a Final Year Project. Users create interviews by providing a job title and description — the system generates relevant questions, then conducts a live voice interview using Deepgram for speech-to-text and GPT-4o for intelligent conversation. Difficulty adapts dynamically based on response quality. After each session, a full AI-generated feedback report with scores, category breakdowns, and improvement suggestions is available to download as a PDF.
The Problem
Most interview prep platforms are passive — you read solutions or watch videos, but never practice the actual pressure of a live interview. The few platforms that offer mock interviews rely on text chat, lack adaptive difficulty, and give generic feedback. InterviewPrepApp was built to simulate a real interview: live voice, questions that adjust to your performance in real time, and structured feedback you can act on.
Key Engineering Decisions
Client-Side Adaptive Selection
Next question selection happens on the client via GPT-4o function calling — eliminates a server round-trip per question, keeping voice interaction latency low.
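The function-calling contract can be sketched as a tool definition like the one below. The name `select_next_question` and its fields are illustrative, not the app's actual schema; the point is that every parameter is typed, bounded, and described, which is what keeps the model from returning unusable values.

```typescript
// Hypothetical client-side tool definition for next-question selection.
// Tight enums/bounds and explicit descriptions keep the model's arguments usable.
export const selectNextQuestionTool = {
  type: "function" as const,
  function: {
    name: "select_next_question",
    description:
      "Pick the next interview question based on the candidate's last answer.",
    parameters: {
      type: "object",
      properties: {
        difficulty: {
          type: "integer",
          minimum: 1,
          maximum: 5,
          description: "Target difficulty for the next question (1=easy, 5=hard).",
        },
        topic: {
          type: "string",
          description: "Topic keyword drawn from the job description.",
        },
        rationale: {
          type: "string",
          description: "One sentence explaining the difficulty adjustment.",
        },
      },
      required: ["difficulty", "topic", "rationale"],
      additionalProperties: false,
    },
  },
};
```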
Deepgram WebSocket Voice
Streaming STT and TTS over a single bidirectional WebSocket — real-time conversational flow without buffering full audio clips between turns.
Question Bank + AI Fallback
The MongoDB question bank is queried first for cost efficiency; GPT-4o generation kicks in only when the bank lacks coverage for the job description's keywords.
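The bank-first, generate-on-miss strategy can be sketched as below. `QuestionBank` and the `generate` callback are hypothetical stand-ins for the app's MongoDB query and GPT-4o call.

```typescript
// Illustrative sketch of bank-first selection with LLM fallback.
interface Question {
  text: string;
  difficulty: number;
  keyword: string;
}

type QuestionBank = Map<string, Question[]>;

async function pickQuestion(
  bank: QuestionBank,
  keywords: string[],
  difficulty: number,
  generate: (keyword: string, difficulty: number) => Promise<Question>,
): Promise<Question> {
  // Cheap path: serve from the stored bank when any keyword has coverage.
  for (const kw of keywords) {
    const match = bank.get(kw)?.find((q) => q.difficulty === difficulty);
    if (match) return match;
  }
  // Expensive path: fall back to LLM generation only on a bank miss.
  return generate(keywords[0], difficulty);
}
```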
Resume Parsing Pipeline
PDF/DOCX → raw text → GPT-4o-mini → typed ResumeData injected into the system prompt, enabling personalized questions without sending binary to the LLM.
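A sketch of what the typed payload and prompt injection might look like; the actual `ResumeData` fields in the app may differ, and `buildSystemPrompt` is a hypothetical helper. The key property is that only extracted text ever reaches the LLM, never the binary file.

```typescript
// Hypothetical shape of the typed resume payload.
interface ResumeData {
  name: string;
  skills: string[];
  experience: { company: string; role: string; years: number }[];
  education: string[];
}

// Illustrative prompt injection: the parsed object is serialized into the
// system prompt so questions can reference the candidate's background.
function buildSystemPrompt(resume: ResumeData, jobTitle: string): string {
  return [
    `You are interviewing a candidate for: ${jobTitle}.`,
    `Candidate skills: ${resume.skills.join(", ")}.`,
    `Experience: ${resume.experience
      .map((e) => `${e.role} at ${e.company} (${e.years}y)`)
      .join("; ")}.`,
  ].join("\n");
}
```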
Isolated Mass Interview Instances
Each candidate gets their own adaptive interview instance from a shared join link. All results aggregate to a single dashboard for the interviewer.
GKE CI/CD Pipeline
Push to main → lint/typecheck/build → Docker image to GHCR → deploy to GKE via GitHub Actions. Forced disciplined environment management from day one.
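A condensed sketch of such a workflow is shown below; job names, the cluster name, and secret names are illustrative, not the project's actual file.

```yaml
# Hypothetical GitHub Actions workflow: lint/typecheck/build → GHCR → GKE.
name: deploy
on:
  push:
    branches: [main]
jobs:
  build-and-deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci && npm run lint && npm run typecheck && npm run build
      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - uses: docker/build-push-action@v5
        with:
          push: true
          tags: ghcr.io/${{ github.repository }}:${{ github.sha }}
      - uses: google-github-actions/auth@v2
        with:
          credentials_json: ${{ secrets.GCP_SA_KEY }}   # hypothetical secret name
      - uses: google-github-actions/get-gke-credentials@v2
        with:
          cluster_name: interview-prep-cluster          # hypothetical cluster
          location: us-central1
      - run: kubectl set image deployment/app app=ghcr.io/${{ github.repository }}:${{ github.sha }}
```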
Key Highlights
- Real-time bidirectional voice interaction using Deepgram Nova-3 STT and Aura TTS over WebSocket streaming.
- Adaptive difficulty engine — questions scale from level 1 to 5 based on response quality assessment by GPT-4o.
- Intelligent question selection — extracts keywords from job descriptions, queries a MongoDB question bank, falls back to OpenAI generation.
- AI-generated feedback with overall score, category breakdowns (Technical, Communication, Problem Solving), and per-question assessments.
- Full coding interview mode — Monaco editor (VS Code engine) with JavaScript, Python, and C++ support, visible and hidden test cases, and LLM-generated feedback.
- Mass interview mode — create a session, share a link, review all candidate results and download individual PDF reports.
- Deployed on both Vercel and Google Kubernetes Engine (GKE) via a full GitHub Actions CI/CD pipeline (lint → build → Docker → GHCR → GKE); auth handled by Clerk.
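The 1-to-5 difficulty scaling above can be sketched as a simple update rule, assuming GPT-4o returns a 0–10 quality score per answer; the thresholds are illustrative, not the app's actual tuning.

```typescript
// Minimal sketch of an adaptive difficulty update rule.
function nextDifficulty(current: number, answerScore: number): number {
  const clamp = (d: number) => Math.min(5, Math.max(1, d));
  if (answerScore >= 8) return clamp(current + 1); // strong answer: step up
  if (answerScore <= 4) return clamp(current - 1); // weak answer: step down
  return current; // middling answer: hold steady
}
```

Clamping at both ends keeps the engine stable: a run of strong answers saturates at level 5 rather than drifting out of range.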
Tech Stack
- AI: GPT-4o (conversation, adaptive selection via function calling), GPT-4o-mini (resume parsing)
- Voice: Deepgram Nova-3 (STT), Deepgram Aura (TTS) over WebSocket
- Data: MongoDB question bank
- Coding mode: Monaco editor (JavaScript, Python, C++)
- Auth: Clerk
- Infra: Docker, GHCR, GitHub Actions CI/CD, GKE, Vercel
Key Takeaways
Real-time voice AI demands careful WebSocket lifecycle management — a mid-interview disconnect needs graceful session recovery, or the user loses their entire progress.
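A recovery policy like the one described can be sketched as capped exponential backoff plus a resume token, so a reconnect rejoins the existing session instead of restarting the interview. The WebSocket wiring is elided and the helper names are hypothetical, not the app's code.

```typescript
// Capped exponential backoff: 500ms, 1s, 2s, 4s, 8s, 8s, ...
function backoffMs(attempt: number, baseMs = 500, capMs = 8000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Retry the connection, passing the stored session ID so the server can
// resume mid-interview state instead of discarding progress.
async function reconnectWithResume(
  connect: (sessionId: string) => Promise<boolean>,
  sessionId: string,
  maxAttempts = 5,
): Promise<boolean> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    if (await connect(sessionId)) return true; // resumed successfully
    await new Promise((r) => setTimeout(r, backoffMs(attempt)));
  }
  return false; // give up and surface an error rather than fail silently
}
```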
Adaptive difficulty requires a consistent scoring model across question types; conflating communication quality with technical correctness produces noisy signals that push difficulty in the wrong direction.
Streaming LLM responses feel dramatically faster than waiting for a full completion, even when total token count is identical — perceived latency matters as much as actual latency.
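The effect is easy to see with a toy stream: the UI can render as soon as the first chunk arrives, long before the full completion finishes. The generator below simulates an LLM stream with artificial delays; it is not real API behavior.

```typescript
// Simulated LLM stream: yields chunks with a fixed delay between each.
async function* fakeStream(chunks: string[], delayMs: number) {
  for (const c of chunks) {
    await new Promise((r) => setTimeout(r, delayMs));
    yield c;
  }
}

// Measures time-to-first-chunk — the latency the user actually perceives,
// as opposed to the time for the whole completion.
async function timeToFirstChunk(stream: AsyncGenerator<string>): Promise<number> {
  const start = Date.now();
  await stream.next(); // UI can start rendering here
  return Date.now() - start;
}
```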
GPT-4o function calling is powerful for structured decision-making, but the JSON schema for the function must be airtight — vague parameter descriptions cause the model to hallucinate values at the worst possible moment.
Wiring up CI/CD to Kubernetes early in the project forced disciplined environment and secret management from day one, paying dividends every time the production config diverged from local.