How to train a deep neural network?

Aug. 14, 2021, 11:29 a.m.
- large batch size
- Adam or SGD
- learning rate
- data auto augmentation
- ResNeSt > ResNet
- circle loss
- weight decay:
    WEIGHT_DECAY: 0.0005
    WEIGHT_DECAY_BIAS: 0.
...
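The two weight decay values can be read as "decay the weights, but not the biases". Below is a minimal sketch of that split, not a definitive implementation. The helper name `build_param_groups` is hypothetical, and treating any parameter whose name ends in "bias" as a bias is an assumption. The output is a list of dicts with `params` and `weight_decay` keys, which is the parameter-group layout that `torch.optim` optimizers accept.

```python
WEIGHT_DECAY = 0.0005      # value taken from the config in the post
WEIGHT_DECAY_BIAS = 0.0    # value taken from the config in the post


def build_param_groups(named_params,
                       weight_decay=WEIGHT_DECAY,
                       weight_decay_bias=WEIGHT_DECAY_BIAS):
    """Split (name, param) pairs into two groups with separate weight decay.

    Hypothetical helper for illustration only.
    """
    decay, no_decay = [], []
    for name, param in named_params:
        # Assumption: biases are recognised by a name ending in "bias",
        # which matches the naming used by PyTorch's named_parameters().
        if name.endswith("bias"):
            no_decay.append(param)
        else:
            decay.append(param)
    return [
        {"params": decay, "weight_decay": weight_decay},
        {"params": no_decay, "weight_decay": weight_decay_bias},
    ]


if __name__ == "__main__":
    # Plain strings stand in for real tensors so the sketch runs
    # without any deep learning framework installed.
    demo = [("conv1.weight", "w1"), ("conv1.bias", "b1"), ("fc.weight", "w2")]
    for group in build_param_groups(demo):
        print(len(group["params"]), group["weight_decay"])
```

With PyTorch, the same groups could be built from `model.named_parameters()` and passed as the first argument to an optimizer such as `torch.optim.SGD` or `torch.optim.Adam`, along with a learning rate of your choice.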