NVIDIA's new CUDA Tile programming model brings a high-level, portable abstraction for writing high-performance GPU kernels. In this webinar, we will introduce CUDA Tile and show how it has been brought to Julia as cuTile.jl. We will explore tile-based GPU programming through real-world examples spanning linear algebra routines, AI inference kernels, and HPC algorithms, demonstrating how CUDA Tile makes Tensor Core programming accessible for both AI and scientific computing workloads.
Learn how cuTile.jl brings CUDA Tile into the Julia ecosystem, enabling faster, more intuitive kernel development for AI and scientific computing.
In this webinar:
How accelerator hardware trends are shaping modern GPU programming
Efficient management and utilization of Tensor Cores
The role of the new MLIR-based software stack
Building tile-based abstractions in Julia
Real-World Applications
See CUDA Tile in action across:
Linear algebra: Matrix-vector operations and DGEMM
AI inference: RoPE, SwiGLU, and Flash Attention
HPC algorithms: Heat equation solvers and sparse matrix-vector products
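To give a flavor of the tile-based style the webinar covers, here is a minimal CPU sketch in plain Julia of the core idea: decomposing a matrix multiply into fixed-size tiles that are loaded, multiplied, and accumulated as units. This is an illustrative sketch only; it does not use the cuTile.jl API, and the function name, tile size, and loop structure are assumptions for illustration.

```julia
# Illustrative tile-decomposed matrix multiply (CPU sketch, not cuTile.jl).
# Each (ii, jj) output tile is accumulated from products of A and B tiles,
# mirroring how a tile-based GPU kernel assigns one output tile per block.
function tiled_matmul(A::AbstractMatrix, B::AbstractMatrix; ts::Int = 4)
    m, k = size(A)
    k2, n = size(B)
    @assert k == k2 "inner dimensions must match"
    C = zeros(promote_type(eltype(A), eltype(B)), m, n)
    for i in 1:ts:m, j in 1:ts:n
        ii = i:min(i + ts - 1, m)   # rows of this output tile
        jj = j:min(j + ts - 1, n)   # columns of this output tile
        for l in 1:ts:k
            ll = l:min(l + ts - 1, k)
            # Accumulate the product of one A tile and one B tile.
            @views C[ii, jj] .+= A[ii, ll] * B[ll, jj]
        end
    end
    return C
end

A = rand(6, 5)
B = rand(5, 7)
@assert tiled_matmul(A, B) ≈ A * B
```

On a GPU, the same decomposition lets each thread block own one output tile and feed Tensor Cores with tile-sized operands, which is the pattern CUDA Tile abstracts over.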