Projects | Khurram Khalil

Projects

Selected code and research prototypes aligned with my two main directions: GenAI optimization and AI hardware security.

GenAI Optimization

TOGGLE
Safe LLM compression framework · Python

Temporal-logic-guided compression for edge LLM deployment, focused on balancing efficiency with behavioral constraints.

FairCompress
Fairness- and energy-aware LLM compression · Python

Research prototype for compression methods that account for efficiency, fairness, and deployability rather than size alone.

VeriNAIS
Signal-temporal-logic-guided neural architecture search · Python

Verified architecture search for AI systems where deployment behavior must satisfy formal constraints.

OSPA Transformer
Orthogonal Subspace Projection Attention · Python

Attention mechanism that aims to improve transformer efficiency through structured subspace projections.

flipRL
Reinforcement-learning-driven bit-flip attack research · Python

Prototype for studying efficient bit-flip attacks and assessing accelerator-level vulnerabilities in LLM systems.
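For readers unfamiliar with the fault model, the sketch below shows what a single bit flip does to one quantized weight. It is a generic illustration, not flipRL's code: the helper name `flip_bit_int8` and the int8 two's-complement weight format are assumptions made for the example, and the reinforcement-learning search over which bits to target is not shown.

```python
def flip_bit_int8(value: int, bit: int) -> int:
    """Return the int8 value obtained by flipping one bit of `value`.

    Hypothetical helper for illustration only. Bit 0 is the least
    significant bit; bit 7 is the two's-complement sign bit.
    """
    if not -128 <= value <= 127:
        raise ValueError("value must fit in a signed 8-bit integer")
    if not 0 <= bit <= 7:
        raise ValueError("bit must be in the range 0..7")
    raw = (value & 0xFF) ^ (1 << bit)        # flip the bit in the raw byte
    return raw - 256 if raw >= 128 else raw  # reinterpret the byte as signed


if __name__ == "__main__":
    weight = 3
    for bit in range(8):
        # Print how far each possible single-bit fault moves the weight.
        print(bit, flip_bit_int8(weight, bit))
```

Flipping bit 0 of the weight 3 gives 2, while flipping the sign bit gives -125, which is why the position of a fault matters far more than the number of faults.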

VERMITHOR
Runtime orchestration for thermally safe edge inference · Python

Formal runtime control for edge cyber-physical inference where thermal and reliability constraints are first-class requirements.

Approximate Computing
Approximate DNN and hardware fault studies · C

Low-level experiments on approximate computation, fault behavior, and the security-reliability tradeoffs of efficient AI.

Neural-Pulse
Runtime verification for LLM behavior · Python

Signal Temporal Logic monitoring for hallucination-like behavior and semantic attack patterns in LLMs.
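As a rough idea of what Signal Temporal Logic monitoring computes, the sketch below evaluates the standard quantitative (robustness) semantics for two simple formulas over a finite trace. It is not Neural-Pulse's code: the function names and the notion of a per-token "risk score" signal are assumptions made for the example.

```python
from typing import Sequence


def robustness_always_below(signal: Sequence[float], threshold: float) -> float:
    """Robustness of G(x < threshold) over a finite trace.

    Positive means the property holds with that margin at every step;
    negative means it is violated at least once.
    """
    if not signal:
        raise ValueError("signal must contain at least one sample")
    return min(threshold - x for x in signal)


def robustness_eventually_below(signal: Sequence[float], threshold: float) -> float:
    """Robustness of F(x < threshold): the best margin reached at any step."""
    if not signal:
        raise ValueError("signal must contain at least one sample")
    return max(threshold - x for x in signal)


if __name__ == "__main__":
    # Hypothetical per-token risk scores emitted while a model generates text.
    risk = [0.2, 0.5, 0.9, 1.3, 0.7]
    margin = robustness_always_below(risk, threshold=1.0)
    print("violated" if margin < 0 else "satisfied", margin)
```

The sign of the result gives the verdict and its magnitude gives the margin, so a monitor can raise an alarm as the margin shrinks rather than only after a violation.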

GhostTrack
Mechanistic hypothesis and semantic tracking for LLMs · Python

Framework for tracing competing thought trajectories and identifying hallucination risks before final output.

Attention Sink Analysis
Transformer attention analysis · Python

Tools for analyzing first-token dominance and attention sink behavior in long-context language models.
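One simple way to quantify first-token dominance is to measure how much attention later queries place on key position 0. The sketch below does this on a toy causal attention matrix in plain Python. It is an illustrative assumption about the kind of metric involved, not the project's tooling, and the names `causal_attention` and `first_token_mass` are hypothetical.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over one row of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def causal_attention(scores: List[List[float]]) -> List[List[float]]:
    """Row-wise softmax of a square score matrix under a causal mask.

    Query i only attends to keys 0..i; masked positions get weight 0.
    """
    n = len(scores)
    return [softmax(row[: i + 1]) + [0.0] * (n - i - 1)
            for i, row in enumerate(scores)]


def first_token_mass(attn: List[List[float]], skip: int = 1) -> float:
    """Mean attention weight on key 0 for queries at positions >= skip.

    Query 0 is skipped by default because it can only attend to itself.
    """
    rows = attn[skip:]
    if not rows:
        raise ValueError("no query rows left after skipping")
    return sum(row[0] for row in rows) / len(rows)


if __name__ == "__main__":
    n = 6
    # Toy scores in which every query strongly prefers the first key.
    sink_scores = [[4.0 if j == 0 else 0.0 for j in range(n)] for _ in range(n)]
    print(first_token_mass(causal_attention(sink_scores)))
```

With uniform scores the same metric falls back to the average of 1/(i+1) over the query positions, which provides a baseline against which sink behavior can be compared.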