Posts

Showing posts with the label PyTorch

How to Fix 'PyTorch not compiled with ROCm' on AMD GPUs

If you are moving model training or inference to an AMD GPU with PyTorch, you have likely hit an immediate roadblock. When you try to move a tensor to the GPU using .to('cuda') or .cuda(), the interpreter raises an exception indicating that PyTorch was not compiled with ROCm or CUDA enabled. The hardware is physically present, and the system drivers may be installed correctly, but the Python runtime refuses to use the GPU. Resolving this requires replacing the default PyTorch binaries with a build compiled against AMD’s ROCm (Radeon Open Compute) stack.

Understanding the Root Cause

To fix this error on AMD hardware, you must understand how Python package distribution works. When you run a standard pip install torch, pip pulls from the default Python Package Index (PyPI). Due to package size limits and historical dominance, the official ...
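Before reinstalling anything, it can help to confirm which kind of build is actually installed. The following is a minimal sketch, not the post's own method: the helper names classify_build and inspect_installed_torch are hypothetical, and it assumes that torch.version.hip is set only on ROCm builds and torch.version.cuda only on CUDA builds.

```python
from typing import Optional


def classify_build(cuda_version: Optional[str], hip_version: Optional[str]) -> str:
    """Classify a PyTorch build from its reported toolkit versions.

    Hypothetical helper: takes the values of torch.version.cuda and
    torch.version.hip so the logic can be checked without a GPU.
    """
    if hip_version:
        return "rocm"      # built against AMD's ROCm/HIP stack
    if cuda_version:
        return "cuda"      # built against Nvidia's CUDA toolkit
    return "cpu-only"      # neither toolkit was compiled in


def inspect_installed_torch() -> str:
    """Report the build type of the locally installed torch, if any."""
    try:
        import torch
    except ImportError:
        return "torch-not-installed"
    # getattr guards against builds that lack one of the attributes.
    return classify_build(
        getattr(torch.version, "cuda", None),
        getattr(torch.version, "hip", None),
    )


if __name__ == "__main__":
    print(inspect_installed_torch())
```

If this reports cpu-only or cuda on a machine with an AMD GPU, the installed wheel was probably not a ROCm build, which matches the symptom described above.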

How to Fix PyTorch and MLX GPU Utilization Issues on Apple Silicon (M1/M2/M3)

Running a local LLM on Apple Silicon should be fast given the memory bandwidth of modern Mac hardware. Yet developers frequently see inference speeds of just 1-2 tokens per second, accompanied by maxed-out CPU cores and an entirely idle GPU. This bottleneck occurs because standard Python environments and machine learning libraries do not default to Apple's Metal API. Resolving this requires explicitly configuring your code to use the Metal Performance Shaders (MPS) backend from Python, or adopting MLX, Apple's specialized array framework.

The Root Cause: Why macOS Defaults to CPU

In the established AI/ML ecosystem, Nvidia's CUDA is the default backend for hardware acceleration. When a framework like PyTorch cannot locate a CUDA-enabled GPU, it falls back to the CPU. Apple Silicon uses a different GPU stack, exposed to PyTorch through Metal Performance Shaders (MPS). PyTorch does support MPS, but it requires specific build parameters, an...
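The CPU fallback described above can be avoided by choosing the device explicitly. Here is a minimal sketch under stated assumptions: pick_device and detect_device are hypothetical helper names, and the check relies on torch.backends.mps.is_available(), which is only present in PyTorch releases that ship the MPS backend.

```python
def pick_device(cuda_ok: bool, mps_ok: bool) -> str:
    """Choose a device string, preferring CUDA, then MPS, then CPU."""
    if cuda_ok:
        return "cuda"
    if mps_ok:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Probe the local torch install and return the preferred device."""
    try:
        import torch
    except ImportError:
        return "cpu"  # nothing to accelerate without torch
    # Older PyTorch releases have no torch.backends.mps module.
    mps_backend = getattr(torch.backends, "mps", None)
    mps_ok = mps_backend is not None and mps_backend.is_available()
    return pick_device(torch.cuda.is_available(), mps_ok)


if __name__ == "__main__":
    # With torch installed, tensors and models can then be moved
    # with .to(detect_device()) instead of a hard-coded 'cuda'.
    print(detect_device())
```

On an Apple Silicon Mac with an MPS-enabled build this should print mps. If it prints cpu, the installed build or the macOS version likely does not expose the backend.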