
    Intel Xeon and AI PCs Accelerate Meta Llama 3 Generative AI Workloads

    [Original by Zhongguancun Online] Author: eleven

    Zhongguancun Online reports that Meta today launched its next-generation large language model (LLM), Meta Llama 3. At launch, Intel announced that it has optimized and validated the 8-billion- and 70-billion-parameter Llama 3 models on Intel Xeon processors, Intel Gaudi accelerators, Intel Core Ultra processors, and Intel Arc graphics cards.

    Wei Li, Intel vice president and general manager of AI software engineering, said: "Intel has been working closely with leading companies in the AI software ecosystem and is committed to delivering solutions that combine performance with ease of use. Meta Llama 3 marks an important new iteration in the wave of AI large language models. As a leader in AI software and hardware innovation, Intel is pleased to collaborate with Meta to fully unlock the potential of models such as Llama 3 and to help ecosystem partners build leading AI applications."

    Why it matters: In line with its vision of bringing AI everywhere, Intel continues to invest in its software stack and AI ecosystem to ensure its products can keep pace with fast-changing innovation in the AI field. In the data center, Intel Xeon processors with the built-in Intel Advanced Matrix Extensions (Intel AMX) acceleration engine, together with Intel Gaudi accelerators, give customers more options to meet changing and diverse needs.

    Intel Core Ultra processors and Intel Arc graphics cards not only give developers local development tools but also provide comprehensive software framework and tooling support for deployment across millions of devices. This includes PyTorch and Intel Extension for PyTorch for local research and development, as well as the OpenVINO toolkit for model development and inference.

    Running Llama 3 on Intel products: In its initial testing and performance evaluation of the 8-billion- and 70-billion-parameter Llama 3 models on its own products, Intel used open-source software including PyTorch, DeepSpeed, the Optimum Habana library, and Intel Extension for PyTorch, with the latest software optimizations applied.

    Intel Xeon processors can run demanding end-to-end AI workloads, and Intel has continued to optimize large language model inference to reduce latency. The Intel Xeon 6 processor with Performance-cores (code-named Granite Rapids) delivers a 2x improvement in inference latency on the 8-billion-parameter Llama 3 model compared with fourth-generation Intel Xeon processors, and can run inference on the 70-billion-parameter Llama 3 model at under 100 milliseconds per generated token.
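    As a back-of-the-envelope illustration of what a sub-100ms token latency means in practice, the figure can be converted into generation throughput. The 0.75 words-per-token ratio below is a common rough approximation for English text, not a number from the article:

    ```python
    # Convert the quoted per-token latency bound into throughput.
    token_latency_s = 0.100          # upper bound quoted for Llama 3 70B on Xeon 6
    tokens_per_second = 1 / token_latency_s

    # Assumed average for English text (illustrative, not from the article).
    words_per_token = 0.75
    words_per_minute = tokens_per_second * words_per_token * 60

    print(f"{tokens_per_second:.0f} tokens/s ≈ {words_per_minute:.0f} words/min")
    # → 10 tokens/s ≈ 450 words/min
    ```

    Since a typical adult reads roughly 200-300 words per minute, generation at this latency comfortably outpaces reading speed.
    
    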

    The Intel Gaudi 2 accelerator already has optimized performance on the 7-billion-, 13-billion-, and 70-billion-parameter Llama 2 models, and initial performance measurements on the new Llama 3 model are now available. Thanks to updates to the Intel Gaudi software suite, Intel can easily run the new Llama 3 model and generate results for both inference and fine-tuning. The recently announced Intel Gaudi 3 accelerator also supports running Llama 3.

    Intel Core Ultra processors and Intel Arc graphics cards also show strong performance running Llama 3. In preliminary tests, Intel Core Ultra processors already generate output well above typical human reading speed. In addition, the Intel Arc A770 graphics card combines Xe Matrix eXtensions (XMX) AI acceleration in its Xe cores with 16GB of video memory, delivering excellent performance for large language model workloads.

    Going forward, Meta plans to add new capabilities, more model sizes, and improved performance, and Intel will continue to enhance the performance of its AI products to support this new large language model.

    This is an original article. If reproduced, please credit the source: Intel Xeon and AI PCs Accelerate Meta Llama 3 Generative AI Workloads https://server.zol.com.cn/866/8669118.html
