GPU Compute Engineer

Family: Low-level & domain-heavy

Programs GPUs for general-purpose workloads (ML training, scientific simulation, and data processing), squeezing maximum throughput from parallel hardware.

Day to day

Writes CUDA or ROCm kernels, profiles memory bandwidth and compute utilization, optimizes data layouts for coalesced access, and integrates GPU kernels into training frameworks.
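
To make "coalesced access" concrete, here is a minimal sketch that counts how many memory sectors one warp touches under two data layouts. It is a simplified model, not a profiler measurement. The 32-thread warp, 32-byte sector, 4-byte float, and 16-byte struct are assumptions chosen for illustration, and the function names are hypothetical.

```python
def sectors_touched(stride_bytes, warp_size=32, elem_bytes=4, sector_bytes=32):
    """Count distinct memory sectors one warp touches when thread t reads
    elem_bytes starting at byte offset t * stride_bytes.

    Assumed model: 32 threads per warp, 32-byte sectors, 4-byte elements.
    """
    sectors = set()
    for t in range(warp_size):
        first = t * stride_bytes
        last = first + elem_bytes - 1
        # A read may straddle a sector boundary, so record every sector it covers.
        for s in range(first // sector_bytes, last // sector_bytes + 1):
            sectors.add(s)
    return len(sectors)


def efficiency(n_sectors, warp_size=32, elem_bytes=4, sector_bytes=32):
    """Fraction of the bytes moved that the warp actually uses."""
    return (warp_size * elem_bytes) / (n_sectors * sector_bytes)


# Structure-of-arrays: consecutive threads read consecutive floats (stride 4).
soa = sectors_touched(stride_bytes=4)

# Array-of-structures: each thread reads one float out of a hypothetical
# 16-byte struct, so consecutive threads are 16 bytes apart.
aos = sectors_touched(stride_bytes=16)

print(soa, efficiency(soa))
print(aos, efficiency(aos))
```

Under these assumptions the unit-stride layout moves only the bytes the warp needs, while the strided layout moves four times as much data for the same useful work. That gap is the reason data-layout changes are often among the first optimizations tried on a bandwidth-bound kernel.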

Core skills

Adjacent roles
