mjsushanth/README.md

Hello, I'm Joel M

AI enthusiast & Data Engineer | M.S. in Artificial Intelligence @ Northeastern University

I enjoy AI/ML research, enterprise-scale data engineering, and building systems in deep learning, NLP, and computer vision. Please visit: https://mjsushanth.github.io/


Portfolio:

  • Do visit my portfolio here

Study Notes:

  • My study notes are categorized and collected here!
  • Credit to Obsidian, an excellent app for taking notes. Expect deep dives into the math, clear mental models, intuition, and research notes from my projects.

Research & Academic Projects

  • FinRAG/FinSights: Production-Grade Financial Intelligence System. Hybrid dual-path architecture combining structured queries (DuckDB/SQL dimension tables) with semantic retrieval. Reduces 72M sentences to 1M via stratified sampling with temporal weighting across regulatory eras. Check this out!
    • Data Engineering Pipeline: DuckDB stratified sampling (30+ SQL scripts) with weighted multi-objective scoring, fuzzy-matched integrations, conditional temporal stratification.
    • Advanced RAG Engineering: Sentence-level embeddings, multi-query expansion with window-hopping retrieval, citation provenance via document headers for exact traceability. Polars/Parquet logging, serverless-ready architecture. Costs $0.017-0.025 per query, with no managed DB overhead.
  • Text-to-Pose Diffusion: Built a CLIP-conditioned diffusion model with cross-attention + anatomical loss for 3D pose generation.
    • Covers in-depth research on motion/3D data (pose representation, N-joint hierarchical mapping, kinematic chains, pelvis-spine-extremity validation) and the architecture: Hybrid CNN-Transformer Diffusion, CLIP Semantic Encoding & Projection, Dual-Pass CFG, and Anatomical Constraint Enforcement. See Report here. See Design here.
  • Multi-View 3D Scene Analysis: Created a 10k+ LOC pipeline for multi-view scene analysis with pose-guided filtering, occlusion handling, and RANSAC validation on ETH3D. See Design Flow.
  • Protein Structure Prediction: Implemented HMM, CRF, BiLSTM; CRF reached 67% accuracy on CB513 using evolutionary + context features. See Report here.
  • SocrAItic Circle: Multi-agent LLM debate workflow, designed with multi-phase debate cycles, iterative refinement, YAML-driven orchestration, and judge modules.
  • Artist Classification: Compared SVM-SIFT-BoVW, CAEs, VAEs, and CNNs; SVM achieved 89% accuracy on a 50-class dataset.

Other works:

  • Multi-vector ViT+CLIP with LoRA and ColBERT-style MaxSim retrieval. Demo Notebook.
  • An example ML-serving workflow using GitHub CI/CD with AWS Lambda and SAM infrastructure. Src Code. Notes here. Study notes.
  • Optuna and MLflow usage with a synthetic time-series generator. Src Code.
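
For readers unfamiliar with the ColBERT-style MaxSim scoring mentioned in the first item above, here is a minimal pure-Python sketch of the general idea: each query vector is matched to its most similar document vector, and those maxima are summed. It is not the code from the demo notebook. The function names (`l2_normalize`, `maxsim_score`, `rank_documents`) and the toy 2-D vectors are hypothetical, and a real pipeline would likely use batched tensor operations over ViT/CLIP patch or token embeddings.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def maxsim_score(query_vecs, doc_vecs):
    """Late-interaction score: for each query vector, take the maximum
    cosine similarity over all document vectors, then sum the maxima.
    Both inputs must be non-empty lists of equal-dimension vectors."""
    q = [l2_normalize(v) for v in query_vecs]
    d = [l2_normalize(v) for v in doc_vecs]
    total = 0.0
    for qv in q:
        total += max(sum(a * b for a, b in zip(qv, dv)) for dv in d)
    return total


def rank_documents(query_vecs, docs):
    """Return (doc_id, score) pairs sorted by descending MaxSim score.
    `docs` maps a document id to its list of embedding vectors."""
    scored = [(doc_id, maxsim_score(query_vecs, vecs)) for doc_id, vecs in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy example with made-up 2-D "embeddings" (illustrative values only).
query = [[1.0, 0.0], [0.0, 1.0]]
docs = {
    "doc_a": [[1.0, 0.0], [0.0, 1.0]],  # covers both query directions
    "doc_b": [[1.0, 0.0], [1.0, 0.0]],  # covers only the first
}
ranking = rank_documents(query, docs)
```

Because every query vector picks its own best match, a document that covers all parts of the query (`doc_a`) outranks one that matches only a single part strongly (`doc_b`).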

📫 Connect with Me

LinkedIn
Email
GitHub

Pinned

  1. Finsights-MLOps/FinSights Public

    Group project for coursework (MLOps IE7374), Northeastern University.

    Jupyter Notebook 1 4

  2. Multi_Agent_LLM_Debater Public

    A modular framework for orchestrating structured debates between multiple large language models (LLMs) with specialized judge evaluation. This project implements an adversarial training approach to…

    Jupyter Notebook 2

  3. CLIP-Conditioned-Diffusion-T2Pose-Generation Public

    Dataset - HumanML3D. Large Pipeline with research oriented implementation and exploration for T2P from static pose; ConditionedUNetModels, Anatomical Awareness, Text embedding conditioning, Cross-a…

    Jupyter Notebook 1

  4. ML_Protein_Structure_Prediction Public

    Probabilistic approaches for protein secondary structure prediction using Hidden Markov Models and Conditional Random Fields (CS 6140)

    Jupyter Notebook 1

  5. mlops-labs-portfolio Public

    Lab submissions for MLOps (IE7374). Will likely include other practice work!

    Jupyter Notebook

  6. MultiView_Image_Analysis_CS5330 Public

    A research project that attempts to understand elements for complete 3D reconstruction from static images: Feature correspondences, Scene Adaptiveness, Reliability, Camera Intrinsics, Distances, Ov…

    Jupyter Notebook