HarshTomar1234/README.md



LinkedIn X Portfolio


About

AI/ML Engineer passionate about building end-to-end AI systems. I believe in understanding AI by implementing it from scratch, from Vision Transformers to multi-agent architectures.

Focus: Computer Vision • Research Implementations • GenAI & Agents • MLOps & LLMOps • Cloud Deployment


Tech Stack

Programming Languages

ML/AI Frameworks

HuggingFace

Computer Vision

Roboflow

Supervision YOLOv5-v8 Object Detection Image Segmentation DeepSORT ByteTrack Optical Flow

GenAI & LLM

HuggingFace LlamaIndex CrewAI OpenAI Anthropic

LangGraph AG2 (AutoGen) Agno RAG DeepSeek Prompt Engineering OpenAI API

Data Science

NumPy Pandas Matplotlib

Seaborn Scikit-learn Statistical Analysis Feature Engineering A/B Testing

Development Tools

Jupyter

Streamlit FastAPI Flask Gradio

Cloud & Deployment

EC2 S3 Lambda CI/CD MLflow DVC

Databases

Pinecone Chroma FAISS Vector Databases


Featured Projects

Real-time tennis match analysis system with advanced computer vision

  • YOLOv8 custom-trained on 1000+ annotated frames
  • ByteTrack for robust multi-object tracking
  • CNN-based court homography detection
  • Shot classification: 85%+ accuracy
  • Player detection: 95% | Ball tracking: 88%
  • Mini-court tactical visualization
  • Real-time speed & distance analytics

Python YOLOv8 OpenCV PyTorch ByteTrack Supervision

Demo
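The speed and distance analytics listed above can be illustrated with a minimal sketch. It assumes player positions have already been mapped to court coordinates in metres (for example via the court homography) and that the frame rate is known. The function name `distance_and_speeds`, the sample track, and the frame rate are hypothetical and not taken from the project code.

```python
import math


def distance_and_speeds(positions_m, fps):
    """Return total distance (m) and per-step speeds (km/h) for one track.

    positions_m: list of (x, y) court coordinates in metres, one per frame.
    fps: video frame rate in frames per second.
    """
    total = 0.0
    speeds_kmh = []
    for (x0, y0), (x1, y1) in zip(positions_m, positions_m[1:]):
        step = math.hypot(x1 - x0, y1 - y0)  # metres moved between two frames
        total += step
        speeds_kmh.append(step * fps * 3.6)  # m/frame -> m/s -> km/h
    return total, speeds_kmh


# Hypothetical track: the player moves 0.5 m per frame at 10 frames per second.
track = [(0.0, 0.0), (0.3, 0.4), (0.6, 0.8)]
total, speeds = distance_and_speeds(track, fps=10)
print(f"distance: {total:.2f} m, speeds: {[round(s, 1) for s in speeds]} km/h")
```

In practice, per-frame speeds are noisy, so averaging over a window of frames is a common design choice.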

Multi-sport CV analysis pipeline for football/soccer

  • YOLOv8 + DeepSORT tracking pipeline
  • K-means team identification using HSV color
  • Optical flow camera compensation
  • Tactical heatmap generation
  • Player possession & movement analysis
  • Perspective transformation for top-view
  • Ball control & pass detection

Python YOLO DeepSORT OpenCV Supervision

Demo
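The K-means team identification step can be sketched with the standard library alone. This is an illustration under assumptions, not the project's implementation: it takes one average jersey colour per player as an RGB tuple, converts it to HSV, and runs k-means with k=2. The function names and sample colours are hypothetical, and the sketch ignores the fact that hue is circular, which matters for red jerseys near the wrap-around point.

```python
import colorsys


def rgb_to_hsv_vec(rgb):
    """Convert an (R, G, B) tuple in 0..255 to an (h, s, v) tuple in 0..1."""
    r, g, b = (c / 255.0 for c in rgb)
    return colorsys.rgb_to_hsv(r, g, b)


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_two_teams(jersey_rgbs, iters=10):
    """Assign each jersey colour to team 0 or 1 using k-means (k=2) in HSV."""
    points = [rgb_to_hsv_vec(c) for c in jersey_rgbs]
    # Deterministic init: the first point, then the point farthest from it.
    centroids = [points[0], max(points, key=lambda p: sq_dist(p, points[0]))]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [min((0, 1), key=lambda k: sq_dist(p, centroids[k]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        for k in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = tuple(sum(dim) / len(members)
                                     for dim in zip(*members))
    return labels


# Hypothetical average jersey colours: two reddish and two bluish players.
jerseys = [(200, 30, 30), (30, 40, 210), (220, 50, 40), (20, 60, 190)]
labels = kmeans_two_teams(jerseys)
print(labels)
```

A real pipeline would typically crop the upper half of each player's bounding box first so that grass and shorts do not dominate the colour estimate.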

Deep learning medical imaging system for cancer detection

  • Binary classification: benign/malignant
  • Grad-CAM heatmaps for model explainability
  • OpenCV feature extraction pipeline
  • Auto-generated PDF diagnostic reports
  • Transfer learning with pretrained CNNs
  • Flask web interface for predictions

TensorFlow OpenCV Flask Grad-CAM Keras

GitHub

AI-powered molecular research & drug discovery platform

  • NVIDIA MolMIM for molecule generation
  • RDKit 3D molecular visualization
  • PubChem API integration for data
  • CMA-ES optimization + QED scoring
  • Interactive molecular property explorer
  • Firebase backend for user sessions

TypeScript Next.js Firebase RDKit NVIDIA NIMs

GitHub

Conversational AI with real-time web search capabilities

  • LangGraph agentic architecture
  • Real-time web search integration
  • GPT-4 powered intelligent responses
  • Live search visualization UI
  • Multi-turn conversation memory
  • FastAPI backend + Next.js frontend

LangGraph Next.js FastAPI GPT-4 OpenAI API

GitHub

Production-ready Deepfake Detection MLOps Pipeline

  • EfficientNet-based image classifier
  • DVC pipeline for data versioning
  • MLflow experiment tracking on DagsHub
  • Automated CI/CD with GitHub Actions
  • Docker containerized deployment
  • Deployed on Hugging Face Spaces

TensorFlow DVC MLflow Docker CI/CD AWS S3 & EC2 EKS Deployment Prometheus Grafana

Demo


Research Implementations

Building architectures from scratch to truly understand them.

| Paper | Implementation | Key Details |
|---|---|---|
| Vision Transformer | ViT | 16×16 patch embedding, 12-layer encoder, multi-head attention |
| LoRA & QLoRA | PyTorch-LoRA | Low-rank adaptation, 4-bit quantization, 83% VRAM reduction |
| Vision-Language Models | VLMverse | PaliGemma, SigLIP, cross-modal attention fusion |
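The low-rank adaptation idea in the LoRA row can be shown in a few lines of plain Python. This is a sketch of the general technique, not the code from the PyTorch-LoRA repo: a frozen weight `W` (d × k) receives an update `(alpha / r) * B @ A`, where `B` is d × r and `A` is r × k, so only `r * (d + k)` values are trained. The helper names and the 768 × 768 layer size are assumptions chosen for illustration, and the sketch does not cover the 4-bit quantization used by QLoRA.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * B @ A for frozen W, trainable B and A."""
    r = len(a)                # rank = number of rows in A
    scale = alpha / r
    delta = matmul(b, a)      # d x k low-rank update
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


def trainable_params(d, k, r):
    """Number of trainable values in B (d x r) plus A (r x k)."""
    return r * (d + k)


# Tiny worked example with rank 1.
w = [[1, 0], [0, 1]]
b = [[1], [2]]
a = [[3, 4]]
merged = lora_effective_weight(w, a, b, alpha=2)

# Parameter count for an assumed 768 x 768 projection with rank 8.
d, k, r = 768, 768, 8
full = d * k
lora = trainable_params(d, k, r)
print(merged, full, lora)
```

For the assumed layer size, the adapter trains 12,288 values instead of 589,824, which is 1/48 of the full matrix.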

Multi-Agent Systems

| Project | Description |
|---|---|
| AgentForge (★ 2) | Multi-agent orchestration with CrewAI, LangGraph, PhiData |
| Travel Planner | 4-agent system: Flight, Hotel, Itinerary, Budget |
| Google ADK | Production agent patterns with Google ADK |

Learning Repos

Comprehensive implementations from fundamentals to SOTA:


GitHub Stats



3D Contributions


Let's Connect

Open to: AI/ML Engineering • Computer Vision • GenAI/LLMOps


LinkedIn X Email Portfolio


"Building AI systems by understanding them from first principles"


Pinned

  1. Tennis-Vision (Public)

    Tennis Detection and Visualization System: An advanced computer vision system for tennis match analysis that tracks players and ball movement with high precision. The system uses YOLOv8 and custom-t…

    Jupyter Notebook · 23 stars · 1 fork

  2. BBoxMaskPose (Public)

    Forked from MiraPurkrabek/BBoxMaskPose

    [ICCV 25] The official repository of paper 'Detection, Pose Estimation and Segmentation for Multiple Bodies: Closing the Virtuous Circle'

    Python

  3. DeepGuard-MLOps-Pipeline (Public)

    A production-grade MLOps capstone project demonstrating the complete machine learning lifecycle—from data versioning with DVC, experiment tracking with MLflow/DagsHub, containerization with Docker,…

    Jupyter Notebook

  4. MoleCuQuest (Public)

    Connecting biologists, medical researchers & molecule nerds — because science is ❤️........

    TypeScript

  5. decifra (Public)

    🔍 MLOps Fraud Detection Pipeline with Explainable AI - Featuring ZenML, MLflow, DVC, BentoML, SHAP & LIME for transparent predictions

    Python

  6. kubrick-ai (Public)

    Forked from the-ai-merge/multimodal-agents-course

    An MCP Multimodal AI Agent with eyes and ears!

    Python