PyTorch - What's the difference between a model's training mode and evaluation mode?

📅 Published: 2026/6/19 1:07:10

In PyTorch, a model can operate in two main modes:

Training mode, set with `model.train()`
Evaluation mode, set with `model.eval()`

These two modes change how certain layers behave. They do not change the computation graph, and they do not turn gradient tracking on or off.
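As a minimal sketch of that point, the snippet below switches a small model between the two modes and checks that gradients still flow in evaluation mode. The model architecture and the variable names (`model`, `grad_in_eval`) are made up for illustration; only `train()`, `eval()`, and the `training` attribute come from PyTorch itself.

```python
import torch
from torch import nn

# Hypothetical toy model, used only to illustrate the mode flag.
model = nn.Sequential(nn.Linear(4, 8), nn.Dropout(p=0.5), nn.Linear(8, 1))

started_in_training = model.training   # modules start in training mode

model.eval()
flag_after_eval = model.training       # the flag is now False
child_flag = model[1].training         # the flag propagates to submodules

# eval() does not disable autograd: backward() still produces gradients.
x = torch.randn(2, 4)
model(x).sum().backward()
grad_in_eval = model[0].weight.grad is not None

model.train()                          # switch back before training again
```

The mode is just a boolean flag on every module, so switching is cheap and can be done as often as needed.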

Here’s the detailed difference 👇

`model.train()` sets the model to training mode.

Used during training (i.e., when you call `loss.backward()` and `optimizer.step()`).
Some layers behave differently in training mode:

Layer type	Behavior in `train()`
Dropout	Randomly zeroes each element with probability `p` and scales the surviving elements by `1 / (1 - p)`. The added noise helps prevent overfitting.
BatchNorm	Uses mini-batch statistics (mean and variance of the current batch) to normalize activations and updates running averages.

So during training, Dropout is stochastic, while BatchNorm normalizes with the statistics of the current batch and updates its running statistics on every forward pass.
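The training-mode behavior described above can be observed directly on standalone layers. This is a small sketch with made-up tensor sizes; it assumes the default BatchNorm settings (`momentum=0.1`, running mean initialized to zero).

```python
import torch
from torch import nn

torch.manual_seed(0)

# Dropout in training mode: elements are zeroed or scaled by 1 / (1 - p).
drop = nn.Dropout(p=0.5)
drop.train()
y = drop(torch.ones(1000))
unique_vals = sorted(y.unique().tolist())   # only 0.0 and 2.0 appear

# BatchNorm in training mode: normalizes with the current batch statistics
# and moves its running mean toward the batch mean.
bn = nn.BatchNorm1d(3)
bn.train()
batch = torch.randn(16, 3) + 5.0            # features with mean around 5
out = bn(batch)

batch_mean = batch.mean(dim=0)
# With the default momentum of 0.1 and an initial running mean of 0:
expected_running_mean = 0.1 * batch_mean
```

After the forward pass, `out` has a per-feature mean close to zero, and `bn.running_mean` matches `expected_running_mean`, even though no optimizer step was taken.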

`model.eval()` sets the model to evaluation (inference) mode.

Used during validation or testing.
You typically also wrap evaluation code in `torch.no_grad()` to save memory and speed up computation, since `model.eval()` by itself does not disable gradient tracking.
Some layers change behavior:

Layer type	Behavior in `eval()`
Dropout	Disabled: the layer passes its input through unchanged, so all neurons are active.
BatchNorm	Uses the running (moving-average) statistics collected during training, not the current batch's mean and variance.

This makes the output for a given input deterministic and independent of the other samples in the batch.
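A short sketch of what this looks like in practice: after a few training-mode forward passes (so BatchNorm has running statistics), the same input gives the same output in evaluation mode, regardless of which other samples share the batch. The model and the tensor shapes here are illustrative assumptions.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical toy model containing both mode-dependent layers.
model = nn.Sequential(
    nn.Linear(4, 8), nn.BatchNorm1d(8), nn.ReLU(),
    nn.Dropout(p=0.5), nn.Linear(8, 2),
)

# A few forward passes in training mode let BatchNorm collect running stats.
model.train()
for _ in range(5):
    model(torch.randn(32, 4))

model.eval()
x = torch.randn(6, 4)
with torch.no_grad():            # no graph is built, which saves memory
    out1 = model(x)
    out2 = model(x)              # same input, same mode
    out_single = model(x[:1])    # first sample alone, batch size 1

same_output = torch.equal(out1, out2)
batch_independent = torch.allclose(out1[:1], out_single, atol=1e-5)
tracks_grad = out1.requires_grad
```

Note that a batch of size 1 works here only because BatchNorm uses its stored statistics in evaluation mode; in training mode it would need more than one sample to compute a batch variance.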

Mode	Command	Dropout	BatchNorm	Use case
Training	`model.train()`	Active (random neuron drop)	Uses batch stats & updates running stats	When training
Evaluation	`model.eval()`	Inactive	Uses stored running stats	When validating/testing
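Putting the two modes together, a typical loop switches to `model.train()` before each training phase and to `model.eval()` before each validation phase. The sketch below uses random toy data and an arbitrary small model, optimizer, and loss as stand-ins; it is one common pattern, not the only way to structure a loop.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical model and random data, for illustration only.
model = nn.Sequential(
    nn.Linear(4, 16), nn.BatchNorm1d(16), nn.ReLU(),
    nn.Dropout(p=0.2), nn.Linear(16, 1),
)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

x_train, y_train = torch.randn(64, 4), torch.randn(64, 1)
x_val, y_val = torch.randn(16, 4), torch.randn(16, 1)

for epoch in range(3):
    # Training phase: Dropout active, BatchNorm uses and updates batch stats.
    model.train()
    optimizer.zero_grad()
    loss = loss_fn(model(x_train), y_train)
    loss.backward()
    optimizer.step()

    # Validation phase: Dropout off, BatchNorm uses stored running stats.
    model.eval()
    with torch.no_grad():
        val_loss = loss_fn(model(x_val), y_val).item()
```

Calling `model.train()` at the top of every epoch matters: without it, the model would stay in evaluation mode after the first validation pass.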
