Pranav Kumaar pranavkumaarofficial

Pranav Kumaar

Software Engineer · Machine Learning Research · Production ML Systems

Multi-Agent LLM Systems

Retrieval-Augmented Generation

Local & Cost-Aware Inference

Production ML Infrastructure

I am a software engineer who builds production-grade AI systems, with a focus on how models, data, and infrastructure behave under real-world constraints such as latency, cost, scale, and reliability.

Portfolio · LinkedIn · Email

Research & Publications

ICMLC 2026

When Graph Structure Hurts: Lightweight Path Ranking for Dense KG-RAG
93.9% AUC · 13× fewer parameters than GNN baselines · Designed for dense, production-scale knowledge graphs

Selected Systems & Public Projects

Channel AI
Conversational BI Platform

Results
– Reduced enterprise reporting cycles from days to minutes
– Deployed across 4 enterprise pilots and 12 SMB environments
– Sub-20s latency on multi-million-row analytical workloads

System
Multi-agent LangGraph orchestration over Apache Iceberg

Stack
LangGraph · OpenAI Agents SDK · Iceberg · RAG · LlamaIndex · Qdrant · WhatsApp API

→ https://github.com/pranavkumaarofficial/newdhatu-enterprise

NLCLI Wizard
Local LLM Tooling

Results
– 83.3% accuracy translating natural language to shell commands
– Fully offline CPU inference (810 MB quantized model)
– ~1.5s latency with zero external dependencies

System
Gemma 3 1B fine-tuned via QLoRA and quantized to GGUF

Data
1,500 manually verified command mappings

→ https://github.com/pranavkumaarofficial/nlcli-wizard

Production Case Studies (No Public Repository)

OneSKU
Hybrid Retrieval System

Implemented within a client-facing production environment; source code not publicly releasable.

Results
– 94% precision on catalog-matching benchmarks
– Sub-15s query latency across multi-million SKU inventories
– Rolled out across 20+ vendor catalogs

Engineering Notes
– Hybrid BM25 + dense retrieval outperformed purely neural approaches on noisy catalogs
– Explicit separation of categorical (exact-match) and numerical (range-aware) attributes
– Vendor-specific schema reconciliation logic
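
The first note above, combining BM25 with dense retrieval, can be illustrated with a minimal sketch. This is not the OneSKU implementation, which is not public. The fusion rule (a weighted sum of min-max normalized scores), the `alpha` weight, the toy two-dimensional embeddings, and every function name here are assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Score each tokenized document against the query with Okapi BM25."""
    n_docs = len(docs_tokens)
    avg_len = sum(len(d) for d in docs_tokens) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs_tokens for term in set(d))
    scores = []
    for doc in docs_tokens:
        tf = Counter(doc)
        score = 0.0
        for term in query_tokens:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def min_max(values):
    """Rescale scores to [0, 1] so lexical and dense scores are comparable."""
    lo, hi = min(values), max(values)
    return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in values]


def hybrid_rank(query_tokens, query_vec, docs_tokens, doc_vecs, alpha=0.5):
    """Return document indices ordered by the fused score, best first.

    alpha weights the lexical (BM25) side; 1 - alpha weights the dense side.
    This weighted-sum fusion is an assumption, not the production method.
    """
    lexical = min_max(bm25_scores(query_tokens, docs_tokens))
    dense = min_max([cosine(query_vec, v) for v in doc_vecs])
    fused = [alpha * l + (1 - alpha) * d for l, d in zip(lexical, dense)]
    return sorted(range(len(fused)), key=lambda i: fused[i], reverse=True)


# Toy catalog: token lists plus made-up 2-d embeddings (hypothetical data).
DOCS = [
    ["red", "cotton", "shirt", "xl"],
    ["blue", "denim", "jeans"],
    ["red", "wool", "sweater"],
]
DOC_VECS = [[0.9, 0.1], [0.1, 0.9], [0.7, 0.3]]

ranking = hybrid_rank(["red", "shirt"], [1.0, 0.0], DOCS, DOC_VECS)
print(ranking)
```

On noisy catalogs, the lexical side rewards exact token overlap such as SKU fragments or size codes, while the dense side can still surface near matches that share no tokens. Tuning `alpha` shifts the balance between the two.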

Systems in Progress

– Efficient Agent Routing: Cost-aware agent selection for tool-heavy LLM workflows under strict latency budgets
– Small Language Models for Analytics: Local inference, quantization, and structured reasoning for domain-specific business intelligence
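
As a rough illustration of what cost-aware agent selection under a latency budget could look like, the sketch below picks the cheapest agent that fits the budget and clears a quality floor. It is not taken from the project itself. The `Agent` fields, the `route` function, and all numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Agent:
    name: str
    cost_per_call: float       # hypothetical cost units
    expected_latency_s: float  # hypothetical latency estimate in seconds
    expected_quality: float    # hypothetical quality score in [0, 1]


def route(agents: List[Agent], latency_budget_s: float,
          min_quality: float) -> Optional[Agent]:
    """Return the cheapest agent that satisfies both constraints, or None."""
    eligible = [
        a for a in agents
        if a.expected_latency_s <= latency_budget_s
        and a.expected_quality >= min_quality
    ]
    if not eligible:
        return None
    # Cheapest first; break cost ties by preferring the faster agent.
    return min(eligible, key=lambda a: (a.cost_per_call, a.expected_latency_s))


# Made-up agent profiles for illustration only.
AGENTS = [
    Agent("small-local", 0.001, 0.4, 0.70),
    Agent("medium-hosted", 0.010, 1.2, 0.85),
    Agent("large-hosted", 0.050, 3.0, 0.95),
]

choice = route(AGENTS, latency_budget_s=2.0, min_quality=0.8)
print(choice.name if choice else "no agent fits the budget")
```

Returning `None` when nothing qualifies leaves the fallback decision (relax the quality floor, extend the budget, or fail fast) to the caller.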

Technical Focus Areas

AI / ML Systems
Multi-agent orchestration · RAG · PEFT · Quantization · Model optimization

Data & Infrastructure
Apache Iceberg · PostgreSQL · Vector databases · Docker · Kubernetes · Cloud platforms

Production Engineering
FastAPI · Python · TypeScript · OAuth2 · PKI · HL7 / FHIR interoperability

📫 Connect

Portfolio · LinkedIn · Email

Software engineer interested in scalable AI systems, local LLM deployment, and production ML infrastructure
