sambath.narayanan@dataeverconsulting.com
What is cuDNN?
The NVIDIA CUDA Deep Neural Network library (cuDNN) is a GPU-accelerated library of primitives for deep neural networks.
It provides highly tuned implementations of routines that arise frequently in DNN applications, such as forward and backward convolution, pooling, normalization, and activation layers.
Recent cuDNN releases added FP8 fused multi-head attention for training and inference, targeting BERT on NVIDIA Hopper GPUs.
They also added Flash Attention support for training and inference of transformer models in the cuDNN runtime fusion engine.
Benefits - Performance Improvement

How to enable cuDNN in the code?
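The notes do not say which framework is in use, so the following is only a minimal sketch that assumes PyTorch. It uses the flags in torch.backends.cudnn; the helper name configure_cudnn is hypothetical and exists only for illustration. In PyTorch, cuDNN is normally used automatically when a CUDA build with cuDNN is installed, so the sketch mainly makes the setting explicit and reports what is available.

```python
def configure_cudnn(benchmark: bool = True) -> dict:
    """Enable cuDNN in PyTorch (if installed) and report the resulting state.

    Hypothetical helper for illustration; PyTorch is an assumed framework.
    """
    try:
        import torch  # assumption: PyTorch is the framework in use
    except ImportError:
        # PyTorch is not installed, so there is nothing to configure.
        return {"torch_installed": False, "cudnn_available": False}

    cudnn = torch.backends.cudnn
    cudnn.enabled = True         # already the default; set explicitly for clarity
    cudnn.benchmark = benchmark  # let cuDNN autotune convolution algorithms per input shape

    available = cudnn.is_available()
    return {
        "torch_installed": True,
        "cudnn_available": available,
        "cudnn_version": cudnn.version() if available else None,
        "benchmark": cudnn.benchmark,
    }


if __name__ == "__main__":
    print(configure_cudnn())
```

Benchmark mode tends to help when input shapes stay fixed between iterations. With widely varying shapes, the repeated autotuning can cost more than it saves, so configure_cudnn(benchmark=False) may be the better choice there.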

Additional Reading
https://docs.nvidia.com/deeplearning/cudnn/index.html