In the Data4Good Competition (2025), our team worked on a socially impactful challenge: reducing misinformation risk in educational AI by verifying whether model-generated answers are factual with respect to supporting context.
Our work, “Building Trust in Educational AI through Factuality Verification,” frames the task as a Natural Language Inference (NLI) problem and combines transformer-based reasoning with LLM arbitration.
(Georgia Institute of Technology, Atlanta, USA)
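The NLI framing above can be sketched in miniature: the supporting context plays the role of the premise and the model-generated answer the hypothesis. The lexical-overlap scorer below is purely a toy stand-in for a transformer NLI model (the real system's models are not specified here), used only to make the premise/hypothesis framing concrete.

```python
import re

def _tokens(text: str) -> set[str]:
    """Lowercase word tokens, punctuation stripped."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def nli_label(premise: str, hypothesis: str, threshold: float = 0.5) -> str:
    """Label a (context, answer) pair as 'entailed' or 'not_entailed'.

    Toy stand-in: scores the pair by the fraction of answer tokens that
    appear in the context. A real verifier would use a transformer NLI model.
    """
    hyp = _tokens(hypothesis)
    if not hyp:
        return "not_entailed"
    overlap = len(_tokens(premise) & hyp) / len(hyp)
    return "entailed" if overlap >= threshold else "not_entailed"

context = "Water boils at 100 degrees Celsius at sea level."
print(nli_label(context, "Water boils at 100 degrees Celsius."))  # entailed
print(nli_label(context, "The moon is made of cheese"))           # not_entailed
```

The key design point is the pairing itself: the verifier never judges the answer in isolation, only relative to the retrieved supporting context.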
Educational LLM tools are widely used, but fluent yet incorrect responses can reinforce misconceptions and reduce learning quality.
This project targets safer AI deployment by classifying whether each generated answer is factually supported by its accompanying context.
Reliable factuality verification helps platforms decide when an answer can be shown to learners directly, and when it should be flagged or routed for human review.
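One way such a decision could be wired up is a simple threshold policy over the verifier's confidence. The thresholds and action names below are illustrative assumptions, not values from the competition system.

```python
def gate_response(entailment_confidence: float) -> str:
    """Map a verifier's entailment confidence to a platform action.

    Hypothetical policy: thresholds (0.90, 0.60) are illustrative only.
    """
    if entailment_confidence >= 0.90:
        return "show"          # well supported: deliver to the learner
    if entailment_confidence >= 0.60:
        return "flag"          # deliver with a caution or a citation request
    return "human_review"      # hold for a reviewer before learners see it

print(gate_response(0.95))  # show
print(gate_response(0.40))  # human_review
```

A policy like this is what keeps review effort "small and targeted": only the low-confidence tail of answers reaches a human.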
We developed an ensemble pipeline that combines transformer-based NLI models with LLM arbitration over retrieved supporting context.
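The ensemble idea can be sketched as a two-tier verifier: a fast NLI scorer handles confident cases, and the more expensive LLM arbiter is consulted only in the uncertain band. All components below are hypothetical stubs, assumed here only to show the control flow.

```python
def verify(context: str, answer: str, nli_score, llm_arbiter,
           low: float = 0.3, high: float = 0.7) -> str:
    """Return 'factual' or 'not_factual' for a (context, answer) pair.

    nli_score: callable -> float in [0, 1]  (stand-in for a transformer NLI model)
    llm_arbiter: callable -> str            (stand-in for an LLM judge)
    The (low, high) band defining "uncertain" is an illustrative assumption.
    """
    score = nli_score(context, answer)
    if score >= high:
        return "factual"
    if score <= low:
        return "not_factual"
    # Uncertain band: defer to the LLM arbiter.
    return llm_arbiter(context, answer)

# Deterministic stubs standing in for the real models.
confident_yes = lambda c, a: 0.9
uncertain = lambda c, a: 0.5
arbiter = lambda c, a: "not_factual"

print(verify("ctx", "ans", confident_yes, arbiter))  # factual
print(verify("ctx", "ans", uncertain, arbiter))      # not_factual (via arbiter)
```

Routing only the uncertain band to the LLM keeps arbitration cost proportional to how often the cheap scorer is unsure.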
Key insight: retrieval improves performance only when integrated consistently in both training and inference.
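The consistency point can be made concrete: a single retrieval function should build the (context, answer) inputs for both the training-data builder and the inference path, so the verifier never faces a context distribution at test time that it was not trained on. The word-overlap retriever below is a toy stand-in; the real system's retriever is not specified here.

```python
def retrieve(question: str, corpus: list[str], k: int = 1) -> list[str]:
    """Return the k passages sharing the most word tokens with the question."""
    q = set(question.lower().split())
    return sorted(corpus, key=lambda p: -len(q & set(p.lower().split())))[:k]

def build_example(question: str, answer: str, corpus: list[str]) -> dict:
    # Used verbatim in BOTH the training-data builder and the inference
    # path; this shared code is what "integrated consistently" amounts to.
    return {"context": " ".join(retrieve(question, corpus)), "answer": answer}

corpus = ["Paris is the capital of France.", "The Nile is a river in Africa."]
ex = build_example("What is the capital of France?", "Paris", corpus)
print(ex["context"])  # Paris is the capital of France.
```

If training instead used gold contexts while inference used retrieved ones, the verifier would be evaluated under a train/test mismatch, which is the failure mode this insight guards against.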
Our results indicate the pipeline is reliable enough to gate educational AI responses before they reach learners.
This work supports trustworthy AI in education and aligns with UN Sustainable Development Goal 4 (Quality Education).
At scale, factuality verification can substantially reduce learner exposure to incorrect AI explanations while keeping human review effort small and targeted.