Productive, portable, and performant GPU programming in Python.
Updated Jun 9, 2026 - C++
Open3D: A Modern Library for 3D Data Processing
CUDA Templates and Python DSLs for High-Performance Linear Algebra
A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.
Mesh optimization library that makes meshes smaller and faster to render
A language for fast, portable data-parallel computation
HarfBuzz text shaping engine
A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.
The fastest and most memory efficient lattice Boltzmann CFD software, running on all GPUs and CPUs via OpenCL. Free for non-commercial use.
[ARCHIVED] The C++ parallel algorithms library. See https://github.com/NVIDIA/cccl
ArrayFire: a general purpose GPU library.
Optimized primitives for collective multi-GPU communication
MegEngine is a fast, scalable, and easy-to-use deep learning framework with support for automatic differentiation
Lemonade helps users discover and run local AI apps by serving optimized LLMs right from their own GPUs and NPUs. Join our discord: https://discord.gg/5xXzkMu8Zk
Lightning fast C++/CUDA neural network framework
Open source neural network chess engine with GPU acceleration and broad hardware support.
HeavyDB (formerly MapD/OmniSciDB)
CV-CUDA™ is an open-source, GPU accelerated library for cloud-scale image processing and computer vision.