Kartikey Joshi aidendorian

Kartikey Joshi

B.Tech CSE student specializing in AI & ML, graduating in 2027. I'm interested in the more mathematical and low-level side of deep learning: building models from scratch, understanding what's inside them, and working on problems where physics and neural networks intersect.

Currently learning CUDA and working on 3D Gaussian Splatting.

Projects

Marcella ★ 2
A ~60M-parameter decoder-only transformer built from scratch in PyTorch, without Hugging Face libraries. Implements RoPE, RMSNorm, a SwiGLU FFN, Flash SDP attention with custom causal masking, and a per-layer KV cache backed by pre-allocated fixed-size tensors, so decoding does not reallocate memory at each step. Trained on a weighted mix of FineWeb-Edu, Wikipedia, and SlimPajama with a custom SentencePiece tokenizer (32K vocab). Instruction-finetuned with response-only loss masking. Reaches a perplexity of 32.87 on a held-out split. Ships with a FastAPI streaming backend and a Svelte chat UI.
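As a rough illustration of how response-only loss masking and perplexity relate, here is a minimal, dependency-free sketch. It is not Marcella's training code: the function names and the toy loss values are hypothetical, and a PyTorch version would apply the same mask to a tensor of per-token cross-entropy values.

```python
import math


def masked_mean_nll(token_nlls, response_mask):
    """Average per-token negative log-likelihood over response tokens only.

    token_nlls: per-token NLL values in nats (toy values here).
    response_mask: 1 for response tokens, 0 for prompt tokens.
    """
    if len(token_nlls) != len(response_mask):
        raise ValueError("token_nlls and response_mask must have equal length")
    kept = [nll for nll, keep in zip(token_nlls, response_mask) if keep]
    if not kept:
        raise ValueError("mask selects no tokens")
    return sum(kept) / len(kept)


def perplexity(mean_nll):
    """Perplexity is the exponential of the mean NLL (in nats)."""
    return math.exp(mean_nll)


# Toy example: three prompt tokens (ignored) followed by three response tokens.
nlls = [5.0, 4.0, 6.0, 2.0, 3.0, 4.0]
mask = [0, 0, 0, 1, 1, 1]

loss = masked_mean_nll(nlls, mask)  # mean of 2.0, 3.0, 4.0
ppl = perplexity(loss)              # exp(3.0)
```

By the same relation, a perplexity of 32.87 corresponds to a mean loss of about 3.49 nats per token.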

FWI ★ 13
Physics-Informed GAN for Elastic Full Waveform Inversion — reconstructing subsurface Earth properties (Vp, Vs, density, Poisson's ratio, Young's modulus) from multi-component seismic waveforms. The generator is a U-Net that maps waveform inputs [B, 10, 1000, 70] to 70×70 subsurface grids; a Fourier Neural Operator acts as the differentiable elastic wave solver; a WGAN discriminator enforces realism. Total loss combines adversarial, data misfit (MSE), and PDE residual terms. Uses the ECFB dataset from the SMILE team. (In progress)
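One plausible way to write the combined generator objective is sketched below. The weights $\lambda_{\mathrm{adv}}$, $\lambda_{\mathrm{data}}$, $\lambda_{\mathrm{pde}}$ and the exact form of each term are assumptions for illustration and are not taken from the repository: the adversarial term is the standard WGAN generator loss, the data term is an MSE over $N$ waveform samples (assumed here to compare simulated waveforms $\hat{d}_i$ against observed waveforms $d_i$), and $\mathcal{L}_{\mathrm{pde}}$ stands for the elastic wave equation residual, left abstract.

```latex
\mathcal{L}_{G}
  = \lambda_{\mathrm{adv}}\,\mathcal{L}_{\mathrm{adv}}
  + \lambda_{\mathrm{data}}\,\mathcal{L}_{\mathrm{data}}
  + \lambda_{\mathrm{pde}}\,\mathcal{L}_{\mathrm{pde}},
\qquad
\mathcal{L}_{\mathrm{adv}} = -\,\mathbb{E}_{d}\big[D(G(d))\big],
\qquad
\mathcal{L}_{\mathrm{data}} = \frac{1}{N}\sum_{i=1}^{N}\big(\hat{d}_i - d_i\big)^2
```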

Vision-Transformer-for-DeepFake-Detection ★ 1
ViT-based deepfake detector with self-supervised pretraining via masked image modeling on CelebA, finetuned on DFDC. Achieves ~85% accuracy and ROC-AUC ~0.93. Includes Grad-CAM to validate that detections focus on manipulated facial regions rather than background artifacts.
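Accuracy and ROC-AUC measure different things, which is why the two numbers above differ: accuracy depends on a decision threshold, while ROC-AUC only depends on how well fakes are ranked above real images. A minimal, dependency-free sketch with made-up labels and scores (not results from this project; the 0.5 threshold is an assumption):

```python
def roc_auc(labels, scores):
    """ROC-AUC as the probability that a random positive outscores a random negative.

    Ties count as half. This is the O(P * N) pairwise form, fine for small examples.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of examples classified correctly at a fixed threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(int(p == y) for p, y in zip(preds, labels)) / len(labels)


# Made-up data: 1 = fake, 0 = real.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]

auc = roc_auc(labels, scores)   # 8 of 9 positive/negative pairs ranked correctly
acc = accuracy(labels, scores)  # 4 of 6 correct at threshold 0.5
```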

4x-Upscaler-ESRGAN
ESRGAN for 4× image super-resolution. Two-phase training: PSNR-optimised first, then adversarial + VGG perceptual loss. RRDBNet with 23 RRDB blocks. Tile-based inference reduces peak GPU memory from ~4.5 GB to ~500 MB (≈89% reduction) without degrading output quality.
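The idea behind tile-based inference can be sketched as follows: split the input into overlapping fixed-size tiles, upscale each tile on its own, and write each result into the 4× output canvas. The tile size (256) and overlap (16) below are assumptions, not the values used in the repository, and the sketch only computes tile coordinates rather than running a model.

```python
def tile_starts(length, tile, overlap):
    """Start offsets of tiles along one axis so that the whole axis is covered."""
    if overlap >= tile:
        raise ValueError("overlap must be smaller than tile")
    if length <= tile:
        return [0]
    stride = tile - overlap
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # last tile sits flush with the far edge
    return starts


def tile_boxes(height, width, tile=256, overlap=16):
    """Return (top, left, bottom, right) boxes covering a height x width image."""
    return [
        (top, left, min(top + tile, height), min(left + tile, width))
        for top in tile_starts(height, tile, overlap)
        for left in tile_starts(width, tile, overlap)
    ]


def scale_box(box, scale=4):
    """Map an input-space box to the corresponding box in the upscaled output."""
    return tuple(v * scale for v in box)


boxes = tile_boxes(height=600, width=1000)

# Memory figures quoted above: ~4.5 GB without tiling, ~0.5 GB with tiling.
reduction = 1 - 0.5 / 4.5
```

Peak memory then scales with the tile size rather than the full image size, which is where a reduction of roughly 89% can come from. The overlap gives room to crop or blend tile borders so that seams do not show.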

NeuralStyleTransfer
Neural style transfer using VGG19 with multi-scale pyramid optimisation and L-BFGS refinement, balancing content, style, and total variation loss.
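To make the balance between the three terms concrete, here is a minimal plain-Python sketch of a total variation penalty and a weighted sum of losses. The weights are hypothetical and the content and style terms are passed in as plain numbers; the actual project would compute them from VGG19 feature maps on tensors.

```python
def total_variation(image):
    """Anisotropic total variation of a 2D grid: sum of absolute neighbour differences."""
    rows, cols = len(image), len(image[0])
    tv = 0.0
    for r in range(rows):
        for c in range(cols):
            if r + 1 < rows:
                tv += abs(image[r + 1][c] - image[r][c])  # vertical neighbour
            if c + 1 < cols:
                tv += abs(image[r][c + 1] - image[r][c])  # horizontal neighbour
    return tv


def total_loss(content, style, tv, w_content=1.0, w_style=1e4, w_tv=1e-2):
    """Weighted sum of content, style, and total variation terms (weights are assumptions)."""
    return w_content * content + w_style * style + w_tv * tv


flat = [[0.5, 0.5], [0.5, 0.5]]   # smooth image, no variation
noisy = [[0.0, 1.0], [1.0, 0.0]]  # checkerboard, maximal variation

tv_flat = total_variation(flat)
tv_noisy = total_variation(noisy)
combined = total_loss(2.0, 0.5, tv_noisy, w_content=1.0, w_style=2.0, w_tv=0.5)
```

The total variation term is zero for the flat image and largest for the checkerboard, so raising its weight pushes the optimiser toward smoother outputs at the cost of fine texture.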

Currently

Learning CUDA — kernels, memory hierarchies, warp-level operations
Exploring 3D Gaussian Splatting for real-time radiance field rendering

Stack

Python · PyTorch · CUDA · C++ · FastAPI

Computer Vision · Language Modelling · Physics-Informed Neural Networks · Fourier Neural Operators · GANs · Self-Supervised Learning · Flash Attention · KV Cache · SentencePiece
