Inspiration
Mapping textbook sections to educational standards is vital for curriculum design, but doing it by hand is slow, subjective, and nearly impossible to scale. We were curious whether natural language processing could step in as a helpful co-pilot, automating alignment while keeping curricula consistent and up to date.
In short, we wanted to turn a tedious task into something fast, reliable, and actually enjoyable.
What it does
OpenStaxAlign predicts the most relevant educational standard for each textbook section. It processes hierarchical OpenStax JSON files, preserves parent-section context, and produces standard predictions for unseen books in a submission-ready CSV format.
The result: curriculum designers can review, validate, and iterate in minutes instead of hours.
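To make the flattening step concrete, here is a minimal sketch of how a nested book could be turned into one row per section while keeping parent titles, then written out as CSV. The field names (`id`, `title`, `text`, `children`), the CSV columns, and the `predict` callback are assumptions for illustration; the actual OpenStax JSON schema and submission format may differ.

```python
import csv
import io


def flatten_sections(node, ancestors=()):
    """Yield one row per node, carrying the titles of all its ancestors.

    The keys "id", "title", "text" and "children" are hypothetical;
    adjust them to the real JSON schema.
    """
    path = ancestors + (node.get("title", ""),)
    yield {
        "id": node.get("id", ""),
        # Prepend the title path so a short section still carries its context.
        "text": " > ".join(path) + " | " + node.get("text", ""),
    }
    for child in node.get("children", []):
        yield from flatten_sections(child, path)


def write_predictions(rows, predict):
    """Return CSV text with one predicted standard per flattened row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["id", "standard"])  # assumed column names
    for row in rows:
        writer.writerow([row["id"], predict(row["text"])])
    return buffer.getvalue()


# Toy book with two levels of nesting.
book = {
    "id": "b1",
    "title": "Biology",
    "text": "",
    "children": [
        {
            "id": "c1",
            "title": "Cells",
            "text": "Cells are the unit of life.",
            "children": [
                {"id": "s1", "title": "Mitosis", "text": "Cells divide."}
            ],
        }
    ],
}

rows = list(flatten_sections(book))
# A stand-in predictor; a trained model's predict function would go here.
csv_text = write_predictions(rows, lambda text: "STD-1")
```

Carrying the title path in the text field is one simple way to preserve context; the same idea works if the parent's body text is appended instead of its title.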
How we built it
We extracted structured text fields from nested JSON, transformed them into TF-IDF features, and trained Logistic Regression and Linear SVM models. We tuned hyperparameters with stratified cross-validation, applied class weighting to handle rare standards, and evaluated performance using classification accuracy and visualization tools.
Our focus was on building models that were not only effective, but also transparent and quick to iterate on.
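The modelling setup described above could look roughly like the following scikit-learn sketch. The toy texts, the label codes, the grid of `C` values, and the two-fold split are assumptions chosen so the example runs on a handful of rows; they are not the settings we actually used.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Toy data: flattened section text paired with a hypothetical standard code.
texts = [
    "Biology > Cells | cells divide by mitosis",
    "Biology > Cells | the cell membrane controls transport",
    "Biology > Genetics | genes are made of dna",
    "Biology > Genetics | cells copy dna before division",
    "Physics > Motion | velocity is the rate of change of position",
    "Physics > Motion | acceleration changes velocity over time",
    "Physics > Forces | force equals mass times acceleration",
    "Physics > Forces | friction opposes motion between surfaces",
]
labels = ["BIO.1"] * 4 + ["PHY.1"] * 4

pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("clf", LogisticRegression(max_iter=1000, class_weight="balanced")),
])

# Search over both model families; class_weight="balanced" upweights rare labels.
param_grid = [
    {
        "clf": [LogisticRegression(max_iter=1000, class_weight="balanced")],
        "clf__C": [0.1, 1.0, 10.0],
    },
    {
        "clf": [LinearSVC(class_weight="balanced")],
        "clf__C": [0.1, 1.0, 10.0],
    },
]

# Stratified folds keep the label proportions similar in every split.
cv = StratifiedKFold(n_splits=2, shuffle=True, random_state=0)
search = GridSearchCV(pipeline, param_grid, cv=cv, scoring="accuracy")
search.fit(texts, labels)

prediction = search.predict(["Biology > Cells | mitosis and cell division"])
```

After fitting, `search.best_params_` shows which model family and `C` value scored best, and `search.best_score_` gives the mean cross-validated accuracy for that setting.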
Challenges we ran into
Preserving hierarchical context during flattening turned out to be essential for strong predictions, and class imbalance required careful filtering and weighting strategies.
API limits and the one-day timeline meant every experiment had to count, pushing us toward fast, reliable approaches over heavier architectures.
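As an illustration of the filtering and weighting idea mentioned above, the sketch below drops standards with too few examples and then computes inverse-frequency weights for the rest. The `min_count` threshold and the toy labels are assumptions. The weight formula, `n_samples / (n_classes * count)`, is the same heuristic scikit-learn applies for `class_weight="balanced"`.

```python
from collections import Counter


def filter_rare(texts, labels, min_count=2):
    """Keep only examples whose label appears at least min_count times."""
    counts = Counter(labels)
    kept = [(t, y) for t, y in zip(texts, labels) if counts[y] >= min_count]
    return [t for t, _ in kept], [y for _, y in kept]


def balanced_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {
        label: n_samples / (n_classes * count)
        for label, count in counts.items()
    }


# Toy labels: "A" is common, "B" is less common, "C" appears only once.
labels = ["A", "A", "A", "A", "B", "B", "C"]
texts = [f"section {i}" for i in range(len(labels))]

kept_texts, kept_labels = filter_rare(texts, labels, min_count=2)
weights = balanced_weights(kept_labels)
```

Filtering matters for stratified cross-validation in particular: a standard with fewer examples than folds cannot appear in every fold, so it either has to be dropped or handled separately.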
Accomplishments that we're proud of
In a single day, we delivered a complete end-to-end pipeline and reached around 0.75 validation accuracy with lightweight, interpretable models.
We also produced polished visualizations and clear documentation so others could easily explore and build on our work.
What we learned
We learned just how much preprocessing and contextual features influence NLP performance, how to make smart trade-offs under time pressure, and how to present technical results clearly to both technical and non-technical audiences.
What's next for OpenStaxAlign
Next up: transformer-based models, multi-label prediction, hierarchical classification, multilingual support, and an educator-facing review interface that puts humans in the loop.
This is only the beginning 🚀
