rahult18/README.md

💫 About Me

👋 Hi, I'm Rahul

GenAI Engineer building production-grade LLM systems, agentic workflows, and AI infrastructure automation.

Currently working on large-scale telecom AI systems and cloud optimization platforms. I focus on designing reliable, observable, and cost-efficient AI systems that operate beyond notebooks.


🚀 What I Work On

  • 🤖 Agentic AI systems using LangGraph, MCP, and multi-agent orchestration
  • 🔎 Graph + Hybrid RAG pipelines for high-precision retrieval
  • 📊 LLMOps & observability with LangFuse, MLflow, and Triton
  • ☁️ Cloud cost optimization & GPU utilization analytics
  • ⚡ Low-latency inference systems with quantized and fine-tuned models

🧠 Recent Impact

  • Automated 60% of telecom network triage using multi-agent RAG
  • Built a Kubernetes AI debugger that reduced infra triage time by 60%
  • Delivered $1.5M+ annual cloud savings through GPU observability
  • Optimized LLM inference latency by 35% via LoRA + Triton deployment
  • Improved outage detection across 5K+ daily alarms using predictive modeling

🛠 Core Stack

LLM & Agents
LangGraph • LangChain • MCP • CrewAI • Autogen • LlamaIndex

RAG & Retrieval
Hybrid RAG • Graph RAG • Neo4j • Elastic Vector DB • pgvector • Cohere Rerank

LLMOps & Infra
Ray • ONNX • NVIDIA Triton • MLflow • LangFuse • Kubeflow • Run:AI

Data Engineering
Spark • Airflow • Kafka • Snowflake • TimescaleDB • dbt

Backend & Cloud
Python • FastAPI • Java • AWS • GCP • Kubernetes • Terraform • Docker


🧪 Featured Projects

ApplyAI
LLM-powered job automation system with async processing and multi-model routing.
Stack: FastAPI • Next.js • Celery • Redis • Vector embeddings

SpringCommerce
Microservices e-commerce platform with polyglot persistence and observability stack.

AtmoFlow
Batch + streaming data pipelines using Spark, Airflow, and GCP.


🌱 Interests

  • Production AI systems
  • Agent orchestration standards (MCP / A2A)
  • Scalable RAG architectures
  • Efficient LLM inference
  • AI infra reliability

📫 Connect

Pinned

  1. Parallelisation-of-DES-Algorithm (Public)

    This project focuses on enhancing the efficiency of the DES cryptographic algorithm by parallelizing its implementation using OpenMP. By dividing the plain text into substrings of length '8' and le…

    C++

  2. Stock-Market-Prediction-and-Forecasting (Public)

    This project utilizes advanced machine learning techniques, including LSTM-based recurrent neural networks and the Random Forest Model, to predict stock market trends.

    HTML

  3. NYC-Yellow-Taxi-Trip-Data-Pipeline (Public)

    This is an end-to-end data pipeline that processes and analyzes NYC Yellow Taxi trip data. It includes data ingestion, cleaning, feature engineering, machine learning model training, and a REST API…

    Jupyter Notebook 1

  4. Story-Generation-using-LSTM-and-GRU (Public)

    Story Generation using LSTM & GRU applies Natural Language Processing (NLP) techniques to generate stories autonomously.

    HTML 2