Research | Khurram Khalil

Research

My research focuses on the internal geometry of LLM representations, efficient generative AI, and hardware-aware security for deployed AI systems.

Branch 1
Geometry

Predictive Geometry of LLM Representations

I am interested in whether the geometry of hidden states, attention flows, and semantic trajectories can forecast LLM behavior before output tokens are produced. This includes attention sinks, first-token dominance, hallucination precursors, and mechanistic tracking of competing internal hypotheses.
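
As a toy illustration of what "first-token dominance" means in an attention map, the sketch below computes causal softmax attention from a small score matrix and reports the average weight that later positions place on position 0. It is a minimal sketch and not the method used in this research: the function names `causal_softmax` and `first_token_mass` and the example score matrices are assumptions made up for illustration, and real analyses would read attention weights from a trained model.

```python
import math


def causal_softmax(scores):
    """Row-wise softmax over a square score matrix with a causal mask."""
    n = len(scores)
    attn = []
    for i in range(n):
        row = scores[i][: i + 1]  # query i may only attend to keys 0..i
        m = max(row)  # subtract the max for numerical stability
        exps = [math.exp(s - m) for s in row]
        z = sum(exps)
        attn.append([e / z for e in exps] + [0.0] * (n - i - 1))
    return attn


def first_token_mass(attn):
    """Mean attention weight on position 0, skipping query 0 itself.

    Query 0 can only attend to itself under a causal mask, so including
    it would inflate the statistic.
    """
    rows = attn[1:]
    return sum(r[0] for r in rows) / len(rows)


# Hypothetical inputs: uniform scores, and scores with a strong key at position 0.
uniform_scores = [[0.0] * 4 for _ in range(4)]
sink_scores = [[10.0, 0.0, 0.0, 0.0] for _ in range(4)]

uniform_mass = first_token_mass(causal_softmax(uniform_scores))
sink_mass = first_token_mass(causal_softmax(sink_scores))
```

With uniform scores, the weight on position 0 simply decays as more keys become visible. With a large score in column 0, almost all of the weight collapses onto the first token, which is the pattern usually described as an attention sink.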

Read the Predictive Geometry page

Branch 2
GenAI

GenAI Optimization

This branch studies how generative AI systems can become smaller, faster, cheaper, and more reliable without losing the qualities that make them useful.

LLM and VLM compression for edge deployment.
Temporal-logic-guided compression, search, and orchestration.
Tradeoffs among accuracy, energy, fairness, latency, and thermal safety.
Efficient attention mechanisms and alternative language architectures.

Keywords: LLM compression, VLM compression, neural architecture search, edge AI, formal verification, efficient attention.

Read the GenAI Optimization page

Branch 3
Hardware Security

AI Hardware Security

This branch studies how AI systems fail under hardware faults, accelerator-level attacks, and deployment-time reliability constraints, especially for LLMs, VLMs, and approximate DNNs.

Efficient bit-flip attacks on multimodal LLMs.
Accelerator-level model fault assessment with reinforcement learning.
Fault mitigation in approximate deep neural networks.
Security-aware evaluation for edge and hardware-constrained AI systems.
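
To make the notion of a bit-flip fault concrete, the sketch below flips a single bit in the IEEE 754 float32 encoding of one weight. It is only an illustration of why the position of the flipped bit matters, not an implementation of the attacks or mitigations studied in this branch. The helper name `flip_bit` and the example weight `0.5` are assumptions chosen for illustration.

```python
import struct


def flip_bit(value, bit):
    """Flip one bit of a float32 value (0 = least significant, 31 = sign)."""
    if not 0 <= bit <= 31:
        raise ValueError("bit must be in 0..31")
    # Reinterpret the float32 bytes as an unsigned 32-bit integer.
    (as_int,) = struct.unpack("<I", struct.pack("<f", value))
    # XOR toggles exactly the requested bit; then reinterpret back as float32.
    (flipped,) = struct.unpack("<f", struct.pack("<I", as_int ^ (1 << bit)))
    return flipped


weight = 0.5  # hypothetical weight value
sign_flip = flip_bit(weight, 31)      # toggles the sign bit
exponent_flip = flip_bit(weight, 30)  # toggles the top exponent bit
mantissa_flip = flip_bit(weight, 0)   # toggles the lowest mantissa bit
```

Here the sign flip gives -0.5, the lowest mantissa flip changes the value by less than one part in a million, and the top exponent flip turns 0.5 into 2^127. That asymmetry is one reason a small number of well-placed flips can do disproportionate damage.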

Keywords: AI hardware security, bit-flip attacks, LLM accelerator faults, approximate DNN faults, hardware-aware robustness, fault mitigation.

Read the AI Hardware Security page

Inventions and Intellectual Property

Fairness-Aware LLM Compression Using Temporal Logic
U.S. Provisional Patent Application No. 63/942,261. Patent pending.

A Framework for Formally Verified, Energy-aware Compression of Vision-Language Models
University of Missouri invention disclosure.

A Universal, Tokenizer-Free Language Architecture with Continuous Interaction Fields
University of Missouri invention disclosure.

Attention Mechanisms with Orthogonal Subspace Projections and Geometric Clustering
University of Missouri invention disclosures.