Sitemap
A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.
Pages
Posts
Scalable Influence and Fact Tracing for Large Language Model Pretraining
Figure: Difference between classical lexical retrieval and influence-based retrieval for large language models.
Why Language Models Hallucinate: The Epidemic of Penalizing Uncertainty
Figure: Binary grading makes “guess when unsure” optimal → higher hallucinations.
Confidence-aware grading (penalize wrong answers; allow IDK) makes abstention rational → lower hallucinations.
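The grading argument above can be checked with a little arithmetic. The sketch below is illustrative only: it assumes a correct answer scores 1, an abstention ("IDK") scores 0, and a wrong answer costs a hypothetical `wrong_penalty`; the function names are made up for this example and are not taken from the post.

```python
def expected_score(p_correct: float, wrong_penalty: float) -> float:
    """Expected score of answering when the answer is right with probability p_correct.

    Assumed scoring: +1 if correct, -wrong_penalty if wrong, 0 for abstaining.
    """
    return p_correct * 1.0 - (1.0 - p_correct) * wrong_penalty


def best_action(p_correct: float, wrong_penalty: float) -> str:
    """Answer only when the expected score beats the abstention score of 0."""
    return "answer" if expected_score(p_correct, wrong_penalty) > 0.0 else "abstain"


# Binary grading (no penalty): guessing is optimal even at 30% confidence.
print(best_action(0.3, wrong_penalty=0.0))
# Penalized grading: the same 30% guess now has negative expected score.
print(best_action(0.3, wrong_penalty=1.0))
# A confident answer is still worth giving under the penalty.
print(best_action(0.8, wrong_penalty=1.0))
```

Under these assumptions, answering pays off only when `p_correct > wrong_penalty / (1 + wrong_penalty)`, so raising the penalty raises the confidence needed before guessing is rational.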
Teaching Humanoids Without MoCap: Inside TWIST2’s Portable Data Collection System
Motivation: How do we collect humanlike motion data for robots without a $100K motion-capture studio?
What I Learned from Hackathons (and Losing One!)
Hackathons have been among the best learning experiences of my career.
5 Books That Changed How I Think About Machine Learning and Research
Books have shaped how I approach ML — not just as a technical field, but as a way of thinking.
What is Data Shapley? Measuring the True Value of Data
We often focus on model architectures — but what if the most valuable part of your ML system is your data?
Data Shapley assigns a contribution score to each training point, measuring its impact on model performance.
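As a rough illustration of the idea, the sketch below computes exact Shapley values for a three-point training set by averaging each point's marginal contribution over every ordering. Everything concrete here is an assumption made for the example: the 1-D toy data, the 1-nearest-neighbour utility, and the function names. Exact enumeration only works for tiny datasets; practical Data Shapley methods rely on approximations such as sampling permutations.

```python
from itertools import permutations
from math import factorial


def knn_accuracy(train_subset, validation):
    """Toy utility: validation accuracy of a 1-nearest-neighbour classifier."""
    if not train_subset:
        return 0.0  # assumption: an empty training set scores zero
    correct = 0
    for x, label in validation:
        # Predict the label of the closest training point.
        _, predicted = min(train_subset, key=lambda point: abs(point[0] - x))
        if predicted == label:
            correct += 1
    return correct / len(validation)


def data_shapley(points, utility):
    """Exact Shapley value of each training point under the given utility."""
    n = len(points)
    values = [0.0] * n
    for order in permutations(range(n)):
        subset = []
        previous = utility(subset)
        for i in order:
            subset.append(points[i])
            current = utility(subset)
            values[i] += current - previous  # marginal contribution of point i
            previous = current
    return [v / factorial(n) for v in values]


# Hypothetical data: (feature, label). The third point sits near class 1 but is labelled 0.
train = [(0.0, 0), (1.0, 1), (0.9, 0)]
validation = [(0.1, 0), (0.2, 0), (1.1, 1), (1.2, 1)]

values = data_shapley(train, lambda subset: knn_accuracy(subset, validation))
print(values)
```

The scores sum to the utility of the full training set minus the utility of the empty set, and in this toy setup the correctly labelled point near the class-1 validation examples receives the largest score.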
Enhancing Cybersecurity Risk Assessment using Temporal Knowledge Graphs
My recent publication in Decision Support Systems (Elsevier, 2025) presents a temporal knowledge graph-based explainable decision support system (DSS) for cybersecurity risk assessment.
Explaining SENE: Manifold Learning for Distracted Driving Analysis
My first research paper, published in Engineering Applications of Artificial Intelligence (2023), proposed SENE — a novel manifold learning technique for analyzing distracted driving.
Hackathons
Data4Good 2025: Building Trust in Educational AI through Factuality Verification
Built an ensemble factuality-verification pipeline for educational AI responses, achieving 99.03% balanced accuracy on a held-out competition test set.
Aligned with UN SDG 4 through safer and more trustworthy AI-assisted learning.
Open IIT Data Analytics: Sponsored by Brillio
Predicted the popularity of 4,000+ songs using ensemble models; secured 1st place out of 48 teams.
HackGT 12 (Crypt of Data): BackpackMate AI
Developed BackpackMate AI, a travel-planning phone agent built with the Mastra Framework and LLM-based retrieval pipelines.
Portfolio
EV Charging Network Optimization Dashboard
Optimizing EV charging infrastructure across urban regions using spatial clustering and demand analysis.
Publications
SENE: A novel manifold learning approach for distracted driving analysis with spatio-temporal and driver praxeological features
Published in Engineering Applications of Artificial Intelligence, 2023
Although many studies have been conducted on distracted driving, the growing number of accidents on roads demands further serious attention. Most real-world distracted driving data are unlabeled and high-dimensional, making analyses complex. There is a lack of proper indices to understand the perilousness of distracted driving, making it difficult to identify roads or neighborhoods with higher risk of accidents. Previous studies focused either on spatio-temporal or praxeological factors separately, but did not consider both together. Furthermore, crisp rule extraction and interpretation are largely missing in the literature.
Enhancing cybersecurity risk assessment using temporal knowledge graph-based explainable decision support system
Published in Decision Support Systems, 2025
Assessing cybersecurity policies is crucial for organisations to combat evolving cyber threats. The absence of comprehensive datasets has prevented prior studies from analysing cybersecurity policy risks. Past studies also neglected temporal information in policies, and attention-based analyses often lack automated determination of optimal attention units. Furthermore, the absence of interpretability in cybersecurity studies creates a barrier to understanding policy vulnerabilities and developing targeted solutions.
