Loss Scaling And Step Size In Deep Learning Optimization