cuDNN 8.3: The Purpose

Accelerate Machine Learning with the cuDNN Deep Neural ...

Nvidia has built an extensive AI ecosystem. In fact, it pioneered the use of the GPU as a general-purpose compute accelerator back in 2007 with the launch of CUDA, with successive architectures moving beyond GPUs built purely for graphics. Today Nvidia provides cuDNN, a library of primitives tuned for deep learning that runs on its latest accelerators and lets developers focus on their software instead of hand-tuning performance for each individual GPU.

Highlights of cuDNN 8.3 features:

Tensor Core acceleration for all popular convolutions, including 2D, 3D, grouped, depthwise-separable, and dilated, with NHWC and NCHW inputs and outputs
Optimized kernels for computer vision and speech models, including ResNet, ResNeXt, EfficientNet, EfficientDet, SSD, Mask R-CNN, U-Net, V-Net, BERT, GPT-2, Tacotron 2, and WaveGlow
Support for the FP32, FP16, BF16, and TF32 floating-point formats and the INT8 and UINT8 integer formats
Arbitrary dimension ordering, striding, and sub-regions for 4D tensors, allowing easy integration into any neural-net implementation
Speedups for fused operations on any CNN architecture
cuDNN is supported on Windows and Linux with the Ampere, Turing, Volta, Pascal, Maxwell, and Kepler GPU architectures
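The "arbitrary dimension ordering, striding, and sub-regions" point deserves a closer look: cuDNN describes a 4D tensor through explicit per-dimension strides (e.g., via cudnnSetTensor4dDescriptorEx), which is exactly what makes NCHW vs. NHWC ordering and sub-region views possible without copying data. As a rough illustration of the same idea in NumPy (not cuDNN itself; this is a conceptual sketch, not the cuDNN API), the same memory buffer can be viewed in either layout just by permuting strides:

```python
import numpy as np

# A batch of 2 images, 3 channels, 4x5 spatial size, stored contiguously as NCHW.
n, c, h, w = 2, 3, 4, 5
nchw = np.arange(n * c * h * w, dtype=np.float32).reshape(n, c, h, w)

# Element strides for contiguous NCHW: one step along N skips C*H*W elements,
# along C skips H*W, along H skips W, along W skips 1.
print(tuple(s // nchw.itemsize for s in nchw.strides))  # (60, 20, 5, 1)

# Reinterpreting the buffer as NHWC is just a permutation of the strides --
# no data is copied, much as a cuDNN tensor descriptor would express the
# alternate layout with different per-dimension strides.
nhwc = nchw.transpose(0, 2, 3, 1)
print(nhwc.shape)  # (2, 4, 5, 3)
print(tuple(s // nhwc.itemsize for s in nhwc.strides))  # (60, 5, 1, 20)

# Sub-regions are likewise just an offset plus strides: a 2x3 spatial crop of
# channel 0 of image 0 shares storage with the original tensor.
crop = nchw[0, 0, 1:3, 2:5]
assert crop.base is not None  # a view, not a copy
```

The practical upshot is the one the feature list states: a framework can hand cuDNN tensors in whatever layout it already uses, describing them with strides instead of reformatting them first.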

Links
https://developer.nvidia.com/cudnn