NVIDIA Corporation
- 20.6k followers
- 2788 San Tomas Expressway, Santa Clara, CA, 95051
- https://nvidia.com
Pinned Repositories
- cuda-tile Public
CUDA Tile IR is an MLIR-based intermediate representation and compiler infrastructure for CUDA kernel optimization, focusing on tile-based computation patterns and optimizations targeting NVIDIA Tensor Core units.
- doca-platform Public
DOCA Platform manages provisioning and service orchestration for NVIDIA BlueField DPUs.
- TensorRT-LLM Public
TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT LLM also contains components to create Python and C++ runtimes that orchestrate the inference execution in a performant way.
- cuda-quantum Public
C++ and Python support for the CUDA Quantum programming model for heterogeneous quantum-classical workflows.
- Model-Optimizer Public
A unified library of state-of-the-art model optimization techniques, such as quantization, pruning, distillation, and speculative decoding. It compresses deep learning models for downstream deployment frameworks, such as TensorRT-LLM, TensorRT, and vLLM, to optimize inference speed.
- nv-ingest Public
NeMo Retriever extraction is a scalable, performance-oriented document content and metadata extraction microservice. It uses specialized NVIDIA NIM microservices to find, contextualize, and extract text, tables, charts, and images for use in downstream generative applications.
- TransformerEngine Public
A library for accelerating Transformer models on NVIDIA GPUs, including 8-bit and 4-bit floating point (FP8 and FP4) precision on Hopper, Ada, and Blackwell GPUs, providing better performance with lower memory utilization in both training and inference.