Sample code for my CUDA programming book
Updated Dec 14, 2025 (CUDA)
GPU-accelerated Levenberg-Marquardt curve fitting in CUDA
Reverse-engineered NVIDIA SASS instruction dictionary, with kernel audits and pattern recognition across GPU architectures.
Accelerated General (FP32) Matrix Multiplication from scratch in CUDA
CUDA programming practice project: hands-on CUDA kernels and performance optimization, covering GEMM, FlashAttention, Tensor Cores, CUTLASS, quantization, KV cache, NCCL, and profiling.
CUDA kernel author's tools
An extension library for the WMMA API (Tensor Core API)
A curated set of C++ examples for optimization-based elastodynamic contact simulation using CUDA, emphasizing algorithmic convergence and penetration-free, inversion-free conditions. Designed for readability and understanding, this tutorial helps beginners learn how to write simple GPU code for efficient solid simulations.
Personal CUDA learning repo, built step by step from scratch.
Companion code for the bilibili video series 【CUDA 12.x 并行编程入门(C++版)】 (Introduction to CUDA 12.x Parallel Programming, C++ edition)
General Matrix Multiplication using NVIDIA Tensor Cores
Graphics Processing Units Genetic Algorithm
Efficient implementations of Merge Sort and Bitonic Sort algorithms using CUDA for GPU parallel processing, resulting in accelerated sorting of large arrays. Includes both CPU and GPU versions, along with a performance comparison.
Get started with CUDA programming
CUDA Finite Difference Library
CUDA Implementation of Parallel Matrix Factorization Algorithm for Recommender Systems
GPU Parallel Computing software solution examples with CUDA
Case studies constitute a modern interdisciplinary and valuable teaching practice which plays a critical and fundamental role in the development of new skills and the formation of new knowledge. This research studies the behavior and performance of two interdisciplinary and widely adopted scientific kernels, a Fast Fourier Transform and Matrix M…