NXP Ara240 M.2 Module

Discrete M.2 AI accelerator module delivering 40 eTOPS for plug-in generative AI at the edge

40 eTOPS | 16GB LPDDR4 onboard | M.2 2280 M-Key | PCIe Gen4 x4 | ~6.6W

Overview

The NXP Ara240 M.2 module is a discrete neural processing unit (DNPU) packaged in the standard M.2 2280 M-Key form factor, delivering 40 eTOPS of AI acceleration via PCIe Gen4 x4. With 16GB of onboard LPDDR4 memory, it can run models up to 30 billion parameters (INT4) independently of host memory. At approximately 6.6W typical power, it enables fanless industrial deployments.
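A quick back-of-the-envelope check (an illustrative calculation, not taken from NXP documentation) shows why 30 billion parameters at INT4 is a plausible ceiling for 16GB of onboard memory:

```python
# Rough estimate of on-device memory for an INT4-quantized model.
# Illustrative only: real deployments also need room for activations,
# KV cache, and runtime overhead, which is why 30B is a ceiling, not a target.

def int4_weight_gb(params: float) -> float:
    """INT4 packs two weights per byte, i.e. 0.5 bytes per parameter."""
    return params * 0.5 / 1e9

weights_gb = int4_weight_gb(30e9)   # 15.0 GB of weights
onboard_gb = 16.0                   # LPDDR4 on the module

print(f"INT4 weights for 30B params: {weights_gb:.1f} GB")
print(f"Headroom in 16GB LPDDR4:     {onboard_gb - weights_gb:.1f} GB")
```

Because the weights live entirely in the module's own LPDDR4, inference does not consume host DRAM, which is what lets the accelerator run "independently of host memory."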

Specifications

Specification    | Value                                                   | Notes
-----------------|---------------------------------------------------------|----------------------------
NPU              | NXP Ara240                                              | Multiple NPU + VPU cores
AI Performance   | 40 eTOPS                                                | INT4, INT8, mixed precision
Onboard Memory   | 16GB LPDDR4                                             |
Host Interface   | PCIe Gen4 x4                                            | ~8 GB/s
Form Factor      | M.2 2280 M-Key (NGFF)                                   |
Power            | ~6.6W typical                                           |
Max Model Size   | Up to 30B parameters (INT4)                             |
Frameworks       | TensorFlow, PyTorch, ONNX                               |
Software         | NXP Ara SDK                                             |
Status           | Preproduction                                           |
Compatibility    | FRDM i.MX 8M Plus, FRDM i.MX 95, other M.2 M-Key hosts  |
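The ~8 GB/s figure for the host interface follows directly from the PCIe Gen4 line rate. As an illustrative calculation (16 GT/s per lane with 128b/130b encoding, before packet and protocol overhead):

```python
# PCIe Gen4 link bandwidth, per direction, before protocol overhead.

GT_PER_S = 16          # gigatransfers per second per lane (Gen4)
ENCODING = 128 / 130   # 128b/130b line-encoding efficiency
LANES = 4              # x4 link, as on this module

gbits_per_s = GT_PER_S * ENCODING * LANES  # ~63.0 Gbit/s
gbytes_per_s = gbits_per_s / 8             # ~7.88 GB/s

print(f"PCIe Gen4 x4 raw bandwidth: {gbytes_per_s:.2f} GB/s")
```

Effective throughput is somewhat lower once TLP headers and flow control are accounted for, but since model weights stay in the module's onboard LPDDR4, the link mainly carries inputs and results rather than bulk weight traffic.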

Use Cases

  • Adding generative AI (LLM/VLM) inference to existing embedded systems via standard M.2 slot
  • Accelerating computer vision and neural network workloads on NXP i.MX or other host platforms
  • Low-power edge AI inference in industrial, medical, and retail devices without requiring a GPU