osamaaltaf-pk/README.md

LinkedIn Email HuggingFace WhatsApp


👋 About Me

I'm an AI Engineer specializing in production-grade LLM systems, real-time voice AI, and inference optimization. I don't just prototype — I build and ship systems that run in the real world under real constraints.

  • 🏛️ Built a government-grade real-time translation system handling 30+ languages with sub-1500ms latency (Morocco GovTech)
  • 🎤 Designed scalable Voice AI platforms with inbound/outbound telephony via SIP routing (US project)
  • ⚡ Currently deploying and optimizing LLMs at scale using vLLM, KV cache, and quantization for a London-based AI company
  • 🤖 Experienced with multi-agent orchestration (LangGraph, CrewAI) and with AIBrix, ByteDance's open-source serving infrastructure for vLLM
  • 🔬 Fine-tuned LLMs and Vision-Language Models on proprietary datasets for domain-specific tasks
  • 🌍 Remote AI Engineer working across UK, US, and international GovTech projects

"I care about AI that works in production — low latency, high reliability, real impact."


🛠️ Tech Stack

🧠 LLM & AI Frameworks

LangChain LangGraph CrewAI vLLM HuggingFace Unsloth

🎤 Voice AI

PipeCat LiveKit Whisper ElevenLabs SIP/Telephony

⚡ Inference & Optimization

vLLM LMCache Quantization KV Cache Redis

🔧 Backend & Deployment

Python FastAPI Docker AWS TypeScript

📊 Fine-Tuning & MLOps

LoRA W&B MLflow Pinecone RAGAS


🚀 What I've Built (Production Systems)

🏛️  Gov Real-Time Translation    →  30+ languages · sub-1500ms · LiveKit + PipeCat + Whisper
🎤  Voice AI Platform (US)       →  SIP telephony · no-code agent builder · LangGraph agents  
⚡  LLM Inference at Scale       →  vLLM · KV cache · FastAPI · LMCache via Redis
🔬  Vision-Language Fine-Tuning  →  Llama 3.2 Vision · Unsloth · medical/domain datasets
🤖  Multi-Agent Orchestration    →  AIBrix · CrewAI · LangGraph · tool-use agents

📌 Featured Projects

🧠 LLMs-Unsloth

Fine-tuning pipeline using Unsloth for efficient LLM training with LoRA/QLoRA. Domain-specific fine-tuning with experiment tracking.

Python Jupyter Unsloth LoRA
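As a rough illustration of why LoRA keeps this kind of fine-tuning cheap, the sketch below counts the trainable parameters a rank-r adapter adds to one frozen weight matrix. The layer size (4096 x 4096) and rank (16) are hypothetical values chosen for the arithmetic, not settings taken from the repository, and the helper names are made up for this example.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in the low-rank pair A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


def full_params(d_in: int, d_out: int) -> int:
    """Parameters in the frozen dense weight W (d_out x d_in)."""
    return d_in * d_out


# Hypothetical projection layer and adapter rank (assumptions for illustration).
d_in = d_out = 4096
rank = 16

lora = lora_trainable_params(d_in, d_out, rank)
full = full_params(d_in, d_out)
fraction = lora / full

print(f"LoRA: {lora:,} trainable vs {full:,} frozen ({fraction:.2%})")
# LoRA: 131,072 trainable vs 16,777,216 frozen (0.78%)
```

Under these assumptions the adapter trains well under 1% of the layer's weights. QLoRA applies the same adapters on top of a quantized frozen base model, which cuts memory further.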

🩺 Llama Vision Fine-Tuning

Fine-tuned Llama 3.2 Vision on radiography datasets for medical image analysis using Unsloth optimization.

Python Vision-LM Medical AI Unsloth

🔊 Pocket TTS

Lightweight Text-to-Speech pipeline with multiple voice engine support and real-time audio processing.

Python TTS Voice AI Audio Processing

🔍 Research Assistant

AI-powered research assistant with document retrieval, synthesis, and structured output generation.

TypeScript RAG LLM Agents


🏆 Experience Highlights

| Role | Company | Location | Period |
| --- | --- | --- | --- |
| 🤖 AI Engineer (LLM Systems) | Confidential | London, UK 🇬🇧 | Jul 2025 – Present |
| 🌍 AI Voice Agent Developer | GovTech Project | Morocco 🇲🇦 | Feb – Jun 2025 |
| 🎤 AI Engineer (Voice AI Platform) | US Project | Miami, USA 🇺🇸 | Nov 2024 – Feb 2025 |
| 🔬 Junior AI Engineer | Iaxon Software | Pakistan 🇵🇰 | Apr – Oct 2023 |

🌟 Specializations

specializations = {
    "Voice AI":           ["PipeCat", "LiveKit", "Whisper ASR", "SIP Telephony", "TTS Pipelines"],
    "LLM Inference":      ["vLLM", "PagedAttention", "KV Cache", "GPTQ", "AWQ", "Quantization"],
    "Fine-Tuning":        ["LoRA", "QLoRA", "RLHF", "Unsloth", "SFTTrainer", "Vision-LMs"],
    "Agentic Systems":    ["LangGraph", "CrewAI", "AIBrix", "Tool Use", "Multi-Agent Orchestration"],
    "RAG & Retrieval":    ["LangChain", "Pinecone", "Weaviate", "RAGAS", "Hybrid Search"],
    "Languages Spoken":   ["Urdu 🇵🇰", "English 🇬🇧 (IELTS 7.5 / C1)"],
}

📫 Let's Work Together

I'm open to remote AI engineering roles and consulting contracts with US/UK/EU companies.

Specializing in: Voice AI systems · LLM deployment & optimization · Fine-tuning pipelines · Multi-agent architectures

LinkedIn Email WhatsApp


"Building AI systems that deliver government scale, voice-grade latency, and production reliability."

Pinned

  1. LLMs-Unsloth (Public)

    LLM fine-tuning pipeline using Unsloth, LoRA and QLoRA for efficient domain-specific training

    Jupyter Notebook

  2. OrpheusAssistant (Public)

    AI assistant built with LLM orchestration, tool use, and conversational memory

    Python 1

  3. Pocket_TTS (Public)

    Lightweight Python TTS pipeline with multi-engine voice support and real-time audio processing

    Python

  4. davidbrowne17/csm-streaming (Public)

    Forked from SesameAILabs/csm

    Realtime demo, Streaming and Finetuning code for CSM

    Python 450 71