Research

My research focuses on the internal geometry of LLM representations, efficient generative AI, and hardware-aware security for deployed AI systems.

Branch 1
Geometry

Predictive Geometry of LLM Representations

I am interested in whether the geometry of hidden states, attention flows, and semantic trajectories can forecast LLM behavior before output tokens are produced. This includes attention sinks, first-token dominance, hallucination precursors, and mechanistic tracking of competing internal hypotheses.
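As a rough illustration of one quantity mentioned above, first-token dominance can be measured as the share of attention weight that later positions place on position 0. The sketch below is a toy example on a single synthetic attention head, not the method used in this research; the function names `causal_attention` and `first_token_mass` are hypothetical and chosen only for this example.

```python
import numpy as np


def causal_attention(scores):
    """Row-wise softmax over a causal (lower-triangular) mask."""
    n = scores.shape[-1]
    mask = np.tril(np.ones((n, n), dtype=bool))
    # Future positions get -inf so they receive zero weight after softmax.
    masked = np.where(mask, scores, -np.inf)
    # Subtract the row maximum for numerical stability.
    masked = masked - masked.max(axis=-1, keepdims=True)
    weights = np.exp(masked)
    return weights / weights.sum(axis=-1, keepdims=True)


def first_token_mass(attn):
    """Mean attention weight on position 0, skipping the first query row.

    Row 0 is skipped because, under a causal mask, it can only attend
    to itself and would always contribute a weight of 1.
    """
    return float(attn[1:, 0].mean())


# Hypothetical toy scores: a uniform head versus one biased toward token 0.
uniform_scores = np.zeros((4, 4))
sink_scores = uniform_scores.copy()
sink_scores[:, 0] += 5.0

uniform_mass = first_token_mass(causal_attention(uniform_scores))
sink_mass = first_token_mass(causal_attention(sink_scores))
```

In a real model the scores would come from the attention weights of a chosen layer and head; a mass close to 1 on position 0 is one simple signature of an attention sink.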

Read the Predictive Geometry page

Branch 2
GenAI

GenAI Optimization

This branch studies how generative AI systems can become smaller, faster, cheaper, and more reliable without losing the qualities that make them useful.

  • LLM and VLM compression for edge deployment.
  • Temporal-logic-guided compression, search, and orchestration.
  • Tradeoffs among accuracy, energy, fairness, latency, and thermal safety.
  • Efficient attention mechanisms and alternative language architectures.

Keywords: LLM compression, VLM compression, neural architecture search, edge AI, formal verification, efficient attention.

Read the GenAI Optimization page

Branch 3
Hardware Security

AI Hardware Security

This branch studies how AI systems fail under hardware faults, accelerator-level attacks, and deployment-time reliability constraints, especially for LLMs, VLMs, and approximate DNNs.

  • Efficient bit-flip attacks on multimodal LLMs.
  • Accelerator-level model fault assessment with reinforcement learning.
  • Fault mitigation in approximate deep neural networks.
  • Security-aware evaluation for edge and hardware-constrained AI systems.
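
To make the bit-flip threat concrete, the sketch below flips a single bit in the IEEE 754 float32 encoding of one weight. It is only a minimal illustration of why one corrupted bit can matter, assuming weights stored as float32; it is not the attack or assessment method developed in this branch, and the helper name `flip_bit` is hypothetical.

```python
import struct


def flip_bit(value, bit):
    """Flip one bit (0 = least significant) of a float32 and return the result."""
    if not 0 <= bit < 32:
        raise ValueError("bit must be in [0, 31]")
    # Reinterpret the float32 as an unsigned 32-bit integer.
    (raw,) = struct.unpack("<I", struct.pack("<f", value))
    # Toggle the chosen bit and reinterpret the result as a float32.
    (flipped,) = struct.unpack("<f", struct.pack("<I", raw ^ (1 << bit)))
    return flipped


weight = 0.5
sign_flip = flip_bit(weight, 31)      # bit 31 is the sign bit
exponent_flip = flip_bit(weight, 30)  # bit 30 is the top exponent bit
mantissa_flip = flip_bit(weight, 0)   # bit 0 is the lowest mantissa bit
```

Flipping the lowest mantissa bit changes the weight by a tiny amount, while flipping the top exponent bit turns 0.5 into 2^127. That asymmetry is why fault and attack studies usually care which bits are hit, not only how many.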

Keywords: AI hardware security, bit-flip attacks, LLM accelerator faults, approximate DNN faults, hardware-aware robustness, fault mitigation.

Read the AI Hardware Security page

Inventions and Intellectual Property

Fairness-Aware LLM Compression Using Temporal Logic
A Framework for Formally Verified, Energy-aware Compression of Vision-Language Models
A Universal, Tokenizer-Free Language Architecture with Continuous Interaction Fields
Attention Mechanisms with Orthogonal Subspace Projections and Geometric Clustering