
(installation-arm)=

# Installation for ARM CPUs

vLLM has been adapted to work on ARM64 CPUs with NEON support, leveraging the CPU backend initially developed for the x86 platform. This guide provides installation instructions specific to ARM (these also apply to Apple Silicon; see [Installation for macOS](#installation-apple) for more). For additional details on supported features, refer to the [x86 CPU documentation](#installation-x86), which covers:

- CPU backend inference capabilities
- Relevant runtime environment variables
- Performance optimization tips

The ARM CPU backend currently supports the Float32, FP16, and BFloat16 data types.

Contents:

1. [Requirements](#arm-backend-requirements)
2. [Quick Start with Dockerfile](#arm-backend-quick-start-dockerfile)
3. [Building from Source](#build-arm-backend-from-source)

(arm-backend-requirements)=

## Requirements

- **Operating System**: Linux or macOS
- **Compilers**: `gcc/g++ >= 12.3.0` (optional, but recommended) or `Apple Clang >= 15.0.0` for macOS
- **Instruction Set Architecture (ISA)**: NEON support is required
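
The requirements above can be checked with a short script before building. The following is a minimal sketch for Linux only, not an official vLLM tool. The helper names `has_neon` and `version_ge` are hypothetical, and the script assumes GNU `sort -V` is available. AArch64 Linux kernels list NEON as `asimd` in `/proc/cpuinfo`.

```shell
#!/bin/sh
# Sketch: check the ARM backend requirements on a Linux host.

# has_neon FEATURES: succeed if a /proc/cpuinfo feature list advertises NEON.
# AArch64 kernels report it as "asimd"; 32-bit ARM kernels report "neon".
has_neon() {
    case " $1 " in
        *" asimd "*|*" neon "*) return 0 ;;
        *) return 1 ;;
    esac
}

# version_ge A B: succeed if dotted version A >= B (uses GNU `sort -V`).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

echo "Architecture: $(uname -m)"

if [ -r /proc/cpuinfo ]; then
    # Take the first "Features" line; it is empty on non-ARM hosts.
    features=$(grep -m 1 -i '^Features' /proc/cpuinfo | cut -d: -f2)
    if has_neon "$features"; then
        echo "NEON: found"
    else
        echo "NEON: not found"
    fi
fi

if command -v gcc >/dev/null 2>&1; then
    gcc_version=$(gcc -dumpfullversion -dumpversion 2>/dev/null || true)
    if version_ge "$gcc_version" "12.3.0"; then
        echo "gcc $gcc_version: meets the recommended minimum"
    else
        echo "gcc $gcc_version: older than the recommended 12.3.0"
    fi
fi
```

On a supported machine such as a Graviton instance, `uname -m` is expected to print `aarch64`.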

(arm-backend-quick-start-dockerfile)=

## Quick Start with Dockerfile

You can quickly set up vLLM on ARM using Docker:

```console
$ docker build -f Dockerfile.arm -t vllm-cpu-env --shm-size=4g .
$ docker run -it \
             --rm \
             --network=host \
             --cpuset-cpus=<cpu-id-list, optional> \
             --cpuset-mems=<memory-node, optional> \
             vllm-cpu-env
```
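
Both `--cpuset-cpus` and `--cpuset-mems` are optional. The sketch below shows how the `docker run` command changes when they are set or left out. `build_run_cmd` is a hypothetical helper that only prints the command, and the values `0-31` and `0` are placeholders; `lscpu` reports the actual CPU list for each NUMA node on your host.

```shell
#!/bin/sh
# build_run_cmd CPUS MEMS: print the quick-start `docker run` command,
# adding each pinning flag only when its value is non-empty.
build_run_cmd() {
    cmd="docker run -it --rm --network=host"
    if [ -n "$1" ]; then
        cmd="$cmd --cpuset-cpus=$1"
    fi
    if [ -n "$2" ]; then
        cmd="$cmd --cpuset-mems=$2"
    fi
    echo "$cmd vllm-cpu-env"
}

# Placeholder values: cores 0-31 with memory from NUMA node 0.
build_run_cmd "0-31" "0"

# No pinning: the container may use every core and memory node.
build_run_cmd "" ""
```

Keeping the cores and the memory node on the same NUMA node avoids cross-node memory access, which is the usual reason to set both flags together.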

(build-arm-backend-from-source)=

## Building from Source

To build vLLM from source on Ubuntu 22.04 or another Linux distribution, follow a process similar to the one described in the [x86 CPU documentation](#installation-x86). The ARM build has been tested on AWS Graviton3 instances.
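
As a rough sketch, the build might look like the following on Ubuntu 22.04, assuming the x86 steps carry over unchanged. The package names, the `requirements-cpu.txt` file name, and the `VLLM_TARGET_DEVICE=cpu` setting follow the x86 instructions and may differ in your vLLM version, so check that guide first. The `run` wrapper is a hypothetical helper: it prints each command by default and only executes them when `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch of a source build, to be run from the root of a vLLM checkout.
# Assumption: the x86 CPU build steps apply to ARM without changes.
DRY_RUN="${DRY_RUN:-1}"

# run CMD...: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Compiler and NUMA headers.
run sudo apt-get update -y
run sudo apt-get install -y gcc-12 g++-12 libnuma-dev

# Python build and runtime dependencies for the CPU backend.
run pip install --upgrade pip
run pip install -v -r requirements-cpu.txt

# Build and install vLLM for the CPU target.
run env VLLM_TARGET_DEVICE=cpu python setup.py install
```

Reviewing the printed commands before rerunning with `DRY_RUN=0` makes it easier to spot steps that do not match your distribution.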