The test suite validates micrograd's gradient computations by comparing them against PyTorch as a ground-truth oracle. This chapter explains the testing strategy, what the two test functions cover, and what edge cases they expose about the autograd engine's correctness.
The testing strategy in `test_engine.py` is called **oracle testing** or **reference implementation testing**: rather than hand-computing expected gradient values, the tests run the same computation in both micrograd and PyTorch and assert that the results match. This is powerful because PyTorch's autograd is battle-tested; any discrepancy almost certainly points to a bug in micrograd rather than in the reference.
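The oracle idea can be sketched without PyTorch at all. In this simplified stand-in (a hypothetical function `f`, not taken from the test suite), a central finite difference plays the role of the trusted reference, and a hand-derived gradient plays the role of the implementation under test:

```python
def f(x):
    # Hypothetical scalar function for illustration.
    return x * x + 3.0 * x

def analytic_grad(x):
    # Hand-derived df/dx -- the "implementation under test".
    return 2.0 * x + 3.0

def numeric_grad(fn, x, h=1e-6):
    # Central finite difference -- the "oracle" we trust.
    return (fn(x + h) - fn(x - h)) / (2.0 * h)

x = 1.7
assert abs(analytic_grad(x) - numeric_grad(f, x)) < 1e-5
```

The real tests replace the finite-difference oracle with PyTorch, which is both more precise and exercises the same backpropagation structure as micrograd.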
For each test, the same arithmetic expression is written twice — once using `Value` objects and once using `torch.tensor` objects with `requires_grad=True`. Both forward passes produce equivalent scalar outputs, and both call `.backward()`. The test then compares `.grad` from the micrograd `Value`s against `.grad.item()` from the PyTorch tensors. PyTorch tensors are created with `dtype=torch.float64` to match Python's native float precision, reducing numerical noise in the comparison.
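The comparison pattern can be illustrated with a minimal `Value` sketch, assuming only `+` and `*` support (micrograd's real class covers more operations); here a hand-computed analytic gradient stands in for the PyTorch reference:

```python
class Value:
    """Minimal scalar autograd node (sketch; add/mul only)."""
    def __init__(self, data, _children=()):
        self.data = data
        self.grad = 0.0
        self._backward = lambda: None
        self._prev = set(_children)

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def _backward():
            self.grad += out.grad
            other.grad += out.grad
        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def _backward():
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

    def backward(self):
        # Topological sort, then apply the chain rule output-to-input.
        topo, visited = [], set()
        def build(v):
            if v not in visited:
                visited.add(v)
                for child in v._prev:
                    build(child)
                topo.append(v)
        build(self)
        self.grad = 1.0
        for v in reversed(topo):
            v._backward()

x = Value(-4.0)
y = x * x + x * 2.0   # y = x^2 + 2x, so dy/dx = 2x + 2
y.backward()
assert x.grad == 2.0 * -4.0 + 2.0   # the known gradient plays PyTorch's role
```

In the actual tests the reference value is not hand-derived but produced by running the identical expression through `torch.tensor(..., dtype=torch.float64, requires_grad=True)` and calling `.backward()` on the result.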
The `test_more_ops` function exercises a richer set of operations including division, negative exponents, and `relu` on negative values (the 'dead neuron' case). It also demonstrates **gradient accumulation at shared nodes**: the variable `d` is used in two separate sub-expressions, so its final gradient must be the sum of two contributions. The test verifies this is handled correctly by micrograd's `+=` accumulation in the backward closures.
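Why the accumulation must be a sum can be traced by hand. In this illustrative example (hypothetical values, not those used in `test_more_ops`), `d` feeds two sub-expressions of `z = d*a + d*b`, and the backward pass adds one contribution per use:

```python
a, b, d = 3.0, 5.0, 2.0
grad = {"d": 0.0}
# Backward through z = u + v, where u = d*a and v = d*b.
# Addition passes dz/du = dz/dv = 1 to both branches.
grad["d"] += a * 1.0   # contribution from u = d*a
grad["d"] += b * 1.0   # contribution from v = d*b
assert grad["d"] == a + b   # the sum of both contributions, not either alone
```

If the backward closures assigned with `=` instead of accumulating with `+=`, the second contribution would overwrite the first and the gradient would be wrong for any node used more than once.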
The assertions in `test_more_ops` use `abs(val - ref) < tol` with `tol=1e-5` rather than exact equality. This tolerance is necessary because floating-point arithmetic is not associative — evaluating the same mathematical expression in a different order (as micrograd and PyTorch may do) can produce results that differ in the least significant bits. The tolerance is tight enough to catch real bugs while being loose enough to tolerate normal floating-point rounding. The `test_sanity_check` function, by contrast, uses exact equality checks, which works because the expressions are simple enough that both implementations follow the same evaluation path.
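A classic demonstration of non-associativity (the specific values are illustrative, not drawn from the tests) shows why the tolerance-based comparison is the right choice:

```python
# The same mathematical sum, evaluated in two different orders.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
assert left != right             # differs in the least significant bits
assert abs(left - right) < 1e-5  # the test suite's style of comparison passes
```

An exact-equality check here would fail even though both results are "correct" to within rounding, which is exactly the situation `test_more_ops` guards against with its tolerance.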