One very important step is missing from the training cycle.

Apr 16, 2020

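The missing step is clearing the gradients by calling zero_grad() once per iteration, before the backward() call (or, equivalently, after the optimizer step that follows it). Calling it between backward() and the optimizer step would wipe the gradients before they are used.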

zero_grad() clears the old gradients left over from the previous step. Without it, the gradients keep accumulating, because every backward() call adds to the stored gradients instead of overwriting them.
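To make the accumulation concrete, here is a minimal sketch, assuming PyTorch and a toy setup that is not from the original post: a single hypothetical parameter `w`, a made-up loss `3 * w` whose gradient is always 3, and a helper named `grads_over_steps` invented for illustration. It records the gradient after each backward() call, with and without clearing.

```python
import torch

# Hypothetical toy setup: one parameter and a plain SGD optimizer.
w = torch.nn.Parameter(torch.tensor([1.0]))
optimizer = torch.optim.SGD([w], lr=0.1)


def grads_over_steps(clear_gradients, steps=3):
    """Return the value of w.grad recorded after each backward() call."""
    recorded = []
    optimizer.zero_grad()  # start every experiment from a clean state
    for _ in range(steps):
        if clear_gradients:
            optimizer.zero_grad()  # drop gradients from the previous step
        loss = (3.0 * w).sum()     # d(loss)/dw is 3 for any value of w
        loss.backward()            # adds the new gradient to w.grad
        recorded.append(w.grad.item())
        optimizer.step()           # update w using the current w.grad
    return recorded


with_clearing = grads_over_steps(clear_gradients=True)
without_clearing = grads_over_steps(clear_gradients=False)
print(with_clearing)     # [3.0, 3.0, 3.0]
print(without_clearing)  # [3.0, 6.0, 9.0]
```

With clearing, each step sees only its own gradient. Without it, the second and third updates use the sum of all gradients so far, so the effective step size grows on every iteration.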

Written by Milind Deore