AMD Talks AI Capabilities of RDNA 3 GPUs & XDNA NPU: Radeon RX 7900 XT Up To 8X Faster Than Ryzen 7 8700G

AMD has shared some interesting data regarding the capabilities of its RDNA 3 GPU and XDNA NPU hardware for consumer-centric AI workloads.

AMD's RDNA 3 GPUs and XDNA NPU deliver a robust suite of consumer-centric AI capabilities on PC platforms.

There is no doubt that AMD has been at the forefront of bringing AI capabilities to a wider PC audience through the implementation of the XDNA NPU on its Ryzen APUs. The first NPU debuted in 2023 with the Phoenix "Ryzen 7040" APUs and was recently updated with the Hawk Point "Ryzen 8040" series. In addition to the NPU, AMD's RDNA 3 GPU architecture also includes a large number of dedicated AI cores that can handle these workloads, backed by the company's ROCm software suite, which continues to build momentum.

During the latest "Meet the Experts" webinar, AMD discussed how its Radeon graphics lineup, such as the RDNA 3 series, provides gamers, creators, and developers with a broad range of AI-accelerated workloads. These include:

  • Video quality enhancement
  • Background noise removal
  • Text to Image (GenAI)
  • Generative Language Models (GenAI)
  • Photo editing
  • Video editing
  • Upscaling
  • Model Training (Linux)
  • ROCm Platform (Linux)

Starting with the AMD RDNA 3 graphics architecture, the latest GPUs featured in Radeon RX 7000 graphics cards and Ryzen 7000/8000 APUs deliver a more than 2x gen-over-gen AI performance increase.

These GPU products offer up to 192 AI accelerators optimized for FP16 workloads, are supported by multiple ML frameworks such as Microsoft DirectML, Nod.AI SHARK, and ROCm, and feature large pools of VRAM (up to 48 GB) to accommodate big models, along with higher effective bandwidth enhanced by Infinity Cache technology.

According to AMD, most AI use cases on the PC platform involve LLM and diffusion models that depend primarily on the FP16 compute and memory capabilities of the hardware they're running on. Some models, such as SDXL (diffusion), are compute bound and require around 4-16 GB of memory, while LLMs such as Llama2-13B and Mistral 8x7B are memory bound and can use up to 23 GB of memory.
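The memory figures above can be sanity-checked with a back-of-the-envelope estimate: weight memory is roughly parameter count times bytes per parameter (2 bytes at FP16). A minimal sketch (the function name is illustrative, not from AMD's material):

```python
def model_weight_gib(num_params: float, bytes_per_param: int = 2) -> float:
    """Rough VRAM needed for model weights alone (FP16 = 2 bytes/param),
    ignoring activations, KV cache, and framework overhead."""
    return num_params * bytes_per_param / 2**30

# Llama2-13B at FP16: roughly 24 GiB for the weights alone,
# in the same ballpark as the ~23 GB figure quoted for memory-bound LLMs.
print(f"{model_weight_gib(13e9):.1f} GiB")  # → 24.2 GiB
```

This also illustrates why VRAM capacity, not raw compute, tends to be the limiting factor for local LLM inference.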

As mentioned earlier, AMD has a wide range of hardware with dedicated AI acceleration. Even the company's Radeon RX 7600 XT, a $329 US graphics card, has 16 GB of VRAM and offers a 3.6x performance increase over the Ryzen 7 8700G in LM Studio, while the RX 7900 XT is up to 8x faster than the 8700G.


LM Studio Performance (higher is better):

  • Ryzen 7 8700G NPU: 11 tokens/sec
  • RX 7600 XT 16 GB: 40 tokens/sec
  • RX 7900 XT 20 GB: 85 tokens/sec

Amuse diffusion performance (lower is better):

  • Ryzen 7 8700G NPU: 2.6 seconds/image
  • RX 7600 XT 16 GB: 0.97 sec/image
  • RX 7900 XT 20 GB: 0.6 sec/image
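The speedup claims can be verified directly from the figures listed above; note that the tokens-per-second ratios are straight divisions, while the seconds-per-image ratios must be inverted since lower is better there. A quick check:

```python
# AMD's published figures: LM Studio throughput (tokens/sec, higher is better)
# and Amuse diffusion latency (sec/image, lower is better).
lm_studio = {"Ryzen 7 8700G NPU": 11, "RX 7600 XT": 40, "RX 7900 XT": 85}
amuse = {"Ryzen 7 8700G NPU": 2.6, "RX 7600 XT": 0.97, "RX 7900 XT": 0.6}

base = "Ryzen 7 8700G NPU"
for gpu in ("RX 7600 XT", "RX 7900 XT"):
    tok_speedup = lm_studio[gpu] / lm_studio[base]  # throughput ratio
    img_speedup = amuse[base] / amuse[gpu]          # latency ratio (inverted)
    print(f"{gpu}: {tok_speedup:.1f}x in LM Studio, {img_speedup:.1f}x in Amuse")
```

This reproduces the 3.6x figure for the RX 7600 XT exactly, and the 7.7x for the RX 7900 XT rounds to the "up to 8x" headline claim.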

AMD also makes a brief comparison against NVIDIA's GeForce RTX "Premium AI PC" platform. Both lineups offer similar support, but AMD shows how its 16 GB GPUs start at a low price of $329 US (RX 7600 XT), while NVIDIA's most affordable 16 GB GPU costs around $500 US (RTX 4060 Ti 16 GB). The company also has a high-end stack that scales up to 48 GB of memory. AMD has previously shown strong performance against Intel's Core Ultra in AI at a better price.

Moving forward, AMD talks about how ROCm 6.0 is developing and how the open-source stack will support consumer-grade hardware such as the Radeon RX 7900 XTX, 7900 XT, 7900 GRE, PRO W7900, and PRO W7800. ROCm 6.0 supports both PyTorch and ONNX Runtime ML models and algorithms on Ubuntu 22.04.3 (Linux) and improves interoperability by adding INT8 support for more complex models.

The company is also looking to make ROCm even more open by offering developers a broader range of software stack and hardware documentation.

AMD and its ROCm suite are competing against NVIDIA's dominant CUDA and TensorRT stack, while Intel is also gaining ground with its oneAPI AI stack. These are the three forces to watch when it comes to AI workloads on the PC platform, so expect a lot of innovation and optimization for current and next-generation hardware in the future.
