NVIDIA Developer Forums

Accelerated Computing CUDA

Topic	Replies	Views	Activity
CUDA Green Context API | Memory Footprint CUDA Programming and Performance cuda , driver	2	53	December 17, 2025
TensorFlow + RTX 5090 + WSL: CUDA 12 Installed in WSL but Windows Driver Uses CUDA 13 CUDA on Windows Subsystem for Linux cuda , tensorflow , gpu	2	59	December 17, 2025
Looking for advice for CUDA performance tracking in CI/CD pipelines CUDA Programming and Performance cuda	2	17	December 17, 2025
Pinned memory throughput significantly lower on Ubuntu than on Windows CUDA Programming and Performance	22	183	December 17, 2025
Displaymodeselector + RTX PRO 6000 blackwell workstation edition CUDA Setup and Installation	4	133	December 16, 2025
Double4 is deprecated, but the preferred double4_32a is unrecognized? CUDA Programming and Performance	6	24	December 16, 2025
How to sync Cuda and Vulkan? CUDA Programming and Performance	2	21	December 16, 2025
Nvcc, syntax error in cuda.h(7451): error: expected a ")" CUDA Programming and Performance gtc	3	49	December 16, 2025
Wmma vs Wgmma On H100 GPU CUDA Programming and Performance cublas	4	29	December 15, 2025
Thrust device allocator vs std allocator CUDA Programming and Performance	3	38	December 15, 2025
Architectural insights needed: Why is the MIG 3g.71gb instance consistently the "Efficiency Sweet Spot" on H200? CUDA Programming and Performance llama	4	72	December 15, 2025
Weekend project: Very accurate double-precision sincos() implementation for a restricted domain CUDA Programming and Performance	0	23	December 14, 2025
Pixel Shader vs NPP - Which is faster for batch processing NV12 to RGB conversions and display directly to screen? CUDA Programming and Performance npp	5	63	December 14, 2025
Fedora 43 and NVCC / Cuda13.1 error "exception specification is incompatible" rsqrt / rsqrtf CUDA Setup and Installation	0	46	December 13, 2025
Register usage spike in SASS with division slow/full path CUDA Programming and Performance cuda	13	204	December 12, 2025
RTX 5090 not working with PyTorch and Stable Diffusion (sm_120 unsupported) CUDA Setup and Installation	10	8062	December 12, 2025
Need a Windows 11 Driver for a M10 CUDA Setup and Installation	3	25	December 12, 2025
Question about the cacheConfig value in nsight systems CUDA Programming and Performance nsight	6	52	December 12, 2025
Is the CUDA tile kernel submitted to GPU still using the cuLaunchKernel? CUDA Programming and Performance	2	50	December 12, 2025
Unexpected results on cub::DeviceRadixSort::SortKeys and SortPairs with 128 bit keys CUDA Programming and Performance	5	22	December 12, 2025
How many tensor cores to execute the wmma.mma.sync.aligned.{alayout}.{blayout}.m16n16k16 instruction? CUDA Programming and Performance cuda	23	117	December 12, 2025
Cuda runfile won't extract CUDA Setup and Installation	4	128	December 11, 2025
Compiling magma on Jetson Thor CUDA NVCC Compiler	0	13	December 11, 2025
__frsqrt_rn is not accurate 0.5ulp? I found a number CUDA Programming and Performance cuda , gpu-computing	4	42	December 10, 2025
FFMA with Uniform register CUDA Programming and Performance	3	72	December 9, 2025
Can't install CUDA and Nsight - Visual Studio or what? (Updated) CUDA Setup and Installation	4	132	December 9, 2025
Is it possible having compressible memory & memory pools over the same array on device? CUDA Programming and Performance cuda	0	28	December 9, 2025
CUDA-Q kernel crashes on Tesla V100 (Driver 570.133 / CUDA 12.8) when running VQE CUDA Setup and Installation cuda	0	20	December 9, 2025
cudaMemcpyBatchAsync cannot aggregate D2D copy operations CUDA Programming and Performance	13	111	December 9, 2025
Training YOLO in the background CUDA Programming and Performance cuda , yolo , python	1	44	December 8, 2025