Backend Tolerance & Debugging

accel-gpu is designed to produce numerically consistent results across backend modes:

What "consistent" means

Core ops are expected to match within floating-point tolerance, not bit-exact equality.
Typical tolerances used in tests are 1e-4 to 2e-4 for real-valued outputs.
Reduction-heavy or iterative operations can drift slightly more than simple elementwise math.

Initialization order by default:

Override in tests or production diagnostics:

const cpu = await init({ forceCPU: true });
const webgl = await init({ forceWebGL: true });
const auto = await init();

When validating model/data pipelines:

Use WebGPU where available for large matrix and convolution workloads.
Keep arrays alive only as long as needed; wrap temporary tensors in gpu.tidy(...).
Prefer subpath imports for smaller bundles:

import { matmul } from "accel-gpu/linalg";
import { fft } from "accel-gpu/signal";