Installing YOLO on ARM Architecture Devices

This guide provides detailed instructions for installing and running YOLOv8 on ARM-based NVIDIA devices like the Jetson, Orin, and Xavier series. We’ll cover setup requirements, installation steps, and optimization tips.

Prerequisites

Before starting the installation, ensure your Jetson device is running the appropriate JetPack version:

JetPack 4: Jetson Nano, TX2
JetPack 5: Xavier NX, AGX Xavier, Orin NX, AGX Orin
JetPack 6: Orin series

Initial System Setup

First, let’s prepare the system by enabling maximum performance and installing essential packages:

# Enable maximum power mode
sudo nvpmodel -m 0

# Enable maximum clock speeds
sudo jetson_clocks

# Update system packages
sudo apt update
sudo apt install -y python3-pip python3-dev build-essential
pip3 install --upgrade pip

Installation Process

1. Install Required Dependencies

# Install system libraries
sudo apt-get install -y libopenmpi-dev libopenblas-base libomp-dev

# Install Python packages
pip3 install numpy==1.23.5  # Specific version for compatibility

2. Install PyTorch for ARM

Different JetPack versions require specific PyTorch installations:

For JetPack 6.0:

pip3 install https://github.com/ultralytics/assets/releases/download/v0.0.0/torch-2.3.0-cp310-cp310-linux_aarch64.whl
pip3 install https://github.com/ultralytics/assets/releases/download/v0.0.0/torchvision-0.18.0a0+6043bc2-cp310-cp310-linux_aarch64.whl

For JetPack 5.1.x:

wget https://developer.download.nvidia.com/compute/redist/jp/v512/pytorch/torch-2.1.0a0+41361538.nv23.06-cp38-cp38-linux_aarch64.whl
pip3 install torch-2.1.0a0+41361538.nv23.06-cp38-cp38-linux_aarch64.whl

# Install torchvision from source
sudo apt install -y libjpeg-dev zlib1g-dev
git clone https://github.com/pytorch/vision torchvision
cd torchvision
git checkout v0.16.2
python3 setup.py install --user

3. Install ONNX Runtime GPU

For JetPack 6.0 (Python 3.10):

wget https://nvidia.box.com/shared/static/48dtuob7meiw6ebgfsfqakc9vse62sg4.whl -O onnxruntime_gpu-1.18.0-cp310-cp310-linux_aarch64.whl
pip3 install onnxruntime_gpu-1.18.0-cp310-cp310-linux_aarch64.whl

For JetPack 5.x (Python 3.8)

wget https://nvidia.box.com/shared/static/zostg6agm00fb6t5uisw51qi6kpcuwzd.whl -O onnxruntime_gpu-1.17.0-cp38-cp38-linux_aarch64.whl
pip3 install onnxruntime_gpu-1.17.0-cp38-cp38-linux_aarch64.whl

4. Install YOLOv8

# Install ultralytics with export dependencies
pip3 install ultralytics[export]

# Verify installation
python3 -c "from ultralytics import YOLO; model = YOLO('yolov8n.pt')"

Optimizing YOLOv8 for Jetson

Convert to TensorRT for Best Performance

from ultralytics import YOLO

# Load your model
model = YOLO('yolov8n.pt')

# Export to TensorRT FP16 for better performance
model.export(format='engine', half=True)  # Creates 'yolov8n.engine'

# For Jetson devices with DLA cores (Orin, Xavier series)
model.export(format='engine', device='dla:0', half=True)

# Test the exported model
trt_model = YOLO('yolov8n.engine')
results = trt_model('path/to/image.jpg')

Monitoring System Performance

Install Jetson Stats to Monitor System Metrics

sudo pip3 install jetson-stats
sudo reboot
jtop  # Launch monitoring interface

Common Issues and Solutions

Memory Errors

If you encounter CUDA out-of-memory errors:

Reduce batch size.
Use a smaller model variant (e.g., nano or small).
Enable FP16 precision.

Performance Issues

Always use TensorRT for inference.
Enable maximum power mode and clock speeds.
Monitor thermal throttling using jtop.

Testing the Installation

Run this simple test to verify everything works:

from ultralytics import YOLO

# Load a model
model = YOLO('yolov8n.pt')

# Run inference
results = model('https://ultralytics.com/images/bus.jpg')
results[0].show()  # Display results

Performance Optimization Tips

Use TensorRT: Always convert models to TensorRT format for deployment.
Enable DLA: Use DLA cores when available (Orin and Xavier series).
Batch Processing: Process multiple frames together when possible.
Monitor Thermals: Keep an eye on thermal throttling using jtop.