/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/sklearn/preprocessing/_encoders.py:868: FutureWarning: `sparse` was renamed to `sparse_output` in version 1.2 and will be removed in 1.4. `sparse_output` is ignored unless you leave `sparse` to its default value.
  warnings.warn(
No GPU/TPU found, falling back to CPU. (Set TF_CPP_MIN_LOG_LEVEL=0 and rerun for more info.)
0%| | 0/96 [00:00<?, ?batch/s]Epoch 0: 0%| | 0/96 [00:00<?, ?batch/s]Epoch 0: 0%| | 0/96 [00:01<?, ?batch/s, train/train_loss_1=0.158]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.158]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.154]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.138]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.136]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.125]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.123]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.119]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.113]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.11] Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.111]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.106]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.101]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.104]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.107]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.0987]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.0959]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.0987]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.0898]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.109] Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.0954]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.117] Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.0982]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.098] Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.093]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.0908]Epoch 0: 1%|1 | 1/96 
[00:01<01:57, 1.24s/batch, train/train_loss_1=0.0933]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.086] Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.101]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.0823]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.0866]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.089] Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.0899]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.0892]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.0966]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.0858]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.0853]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.0837]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.0833]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.0794]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.0993]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.0754]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.0733]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.0883]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.0838]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.094] Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.0934]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.0807]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.0858]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.0777]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.084] Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.0884]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.0826]Epoch 0: 1%|1 | 1/96 
[00:01<01:57, 1.24s/batch, train/train_loss_1=0.0877]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.0824]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.0726]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.0949]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.0745]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.0831]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.0671]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.0836]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.0839]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.072] Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.0805]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.0871]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.0739]Epoch 0: 1%|1 | 1/96 [00:01<01:57, 1.24s/batch, train/train_loss_1=0.0682]Epoch 0: 69%|######8 | 66/96 [00:01<00:00, 67.87batch/s, train/train_loss_1=0.0682]Epoch 0: 69%|######8 | 66/96 [00:01<00:00, 67.87batch/s, train/train_loss_1=0.0902]Epoch 0: 69%|######8 | 66/96 [00:01<00:00, 67.87batch/s, train/train_loss_1=0.0821]Epoch 0: 69%|######8 | 66/96 [00:01<00:00, 67.87batch/s, train/train_loss_1=0.0739]Epoch 0: 69%|######8 | 66/96 [00:01<00:00, 67.87batch/s, train/train_loss_1=0.0863]Epoch 0: 69%|######8 | 66/96 [00:01<00:00, 67.87batch/s, train/train_loss_1=0.0854]Epoch 0: 69%|######8 | 66/96 [00:01<00:00, 67.87batch/s, train/train_loss_1=0.0758]Epoch 0: 69%|######8 | 66/96 [00:01<00:00, 67.87batch/s, train/train_loss_1=0.0779]Epoch 0: 69%|######8 | 66/96 [00:01<00:00, 67.87batch/s, train/train_loss_1=0.0685]Epoch 0: 69%|######8 | 66/96 [00:01<00:00, 67.87batch/s, train/train_loss_1=0.0686]Epoch 0: 69%|######8 | 66/96 [00:01<00:00, 67.87batch/s, train/train_loss_1=0.0764]Epoch 0: 69%|######8 | 66/96 [00:01<00:00, 67.87batch/s, 
train/train_loss_1=0.077] Epoch 0: 69%|######8 | 66/96 [00:01<00:00, 67.87batch/s, train/train_loss_1=0.0934]Epoch 0: 69%|######8 | 66/96 [00:01<00:00, 67.87batch/s, train/train_loss_1=0.074] Epoch 0: 69%|######8 | 66/96 [00:01<00:00, 67.87batch/s, train/train_loss_1=0.0816]Epoch 0: 69%|######8 | 66/96 [00:01<00:00, 67.87batch/s, train/train_loss_1=0.0769]Epoch 0: 69%|######8 | 66/96 [00:01<00:00, 67.87batch/s, train/train_loss_1=0.067] Epoch 0: 69%|######8 | 66/96 [00:01<00:00, 67.87batch/s, train/train_loss_1=0.085]Epoch 0: 69%|######8 | 66/96 [00:01<00:00, 67.87batch/s, train/train_loss_1=0.0774]Epoch 0: 69%|######8 | 66/96 [00:01<00:00, 67.87batch/s, train/train_loss_1=0.0883]Epoch 0: 69%|######8 | 66/96 [00:01<00:00, 67.87batch/s, train/train_loss_1=0.0784]Epoch 0: 69%|######8 | 66/96 [00:01<00:00, 67.87batch/s, train/train_loss_1=0.0864]Epoch 0: 69%|######8 | 66/96 [00:01<00:00, 67.87batch/s, train/train_loss_1=0.0678]Epoch 0: 69%|######8 | 66/96 [00:01<00:00, 67.87batch/s, train/train_loss_1=0.0874]Epoch 0: 69%|######8 | 66/96 [00:01<00:00, 67.87batch/s, train/train_loss_1=0.0842]Epoch 0: 69%|######8 | 66/96 [00:01<00:00, 67.87batch/s, train/train_loss_1=0.0831]Epoch 0: 69%|######8 | 66/96 [00:01<00:00, 67.87batch/s, train/train_loss_1=0.0828]Epoch 0: 69%|######8 | 66/96 [00:01<00:00, 67.87batch/s, train/train_loss_1=0.0803]Epoch 0: 69%|######8 | 66/96 [00:01<00:00, 67.87batch/s, train/train_loss_1=0.0756]Epoch 0: 69%|######8 | 66/96 [00:01<00:00, 67.87batch/s, train/train_loss_1=0.0697]Epoch 0: 69%|######8 | 66/96 [00:02<00:00, 67.87batch/s, train/train_loss_1=0.0881] 0%| | 0/96 [00:00<?, ?batch/s]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0755]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.073] Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0809]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0707]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, 
train/train_loss_1=0.0751]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0824]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0761]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0893]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0757]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.073] Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0773]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0737]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0754]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.075] Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0694]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0792]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0829]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0819]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0716]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0684]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0799]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0671]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.079] Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0732]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.074] Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.085]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0644]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0767]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0671]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0667]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0768]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0629]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0694]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0798]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, 
train/train_loss_1=0.0775]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0696]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0735]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.064] Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0775]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0706]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0745]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0717]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0684]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0692]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0637]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0601]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0733]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0632]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0687]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0811]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0696]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0677]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0632]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0867]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0749]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0645]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0681]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0645]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0608]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0744]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0659]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0713]Epoch 1: 65%|######4 | 62/96 [00:00<00:00, 618.03batch/s, train/train_loss_1=0.0713]Epoch 1: 65%|######4 | 62/96 [00:00<00:00, 618.03batch/s, train/train_loss_1=0.0662]Epoch 1: 65%|######4 | 
62/96 [00:00<00:00, 618.03batch/s, train/train_loss_1=0.0719]Epoch 1: 65%|######4 | 62/96 [00:00<00:00, 618.03batch/s, train/train_loss_1=0.0848]Epoch 1: 65%|######4 | 62/96 [00:00<00:00, 618.03batch/s, train/train_loss_1=0.0738]Epoch 1: 65%|######4 | 62/96 [00:00<00:00, 618.03batch/s, train/train_loss_1=0.0644]Epoch 1: 65%|######4 | 62/96 [00:00<00:00, 618.03batch/s, train/train_loss_1=0.0771]Epoch 1: 65%|######4 | 62/96 [00:00<00:00, 618.03batch/s, train/train_loss_1=0.0641]Epoch 1: 65%|######4 | 62/96 [00:00<00:00, 618.03batch/s, train/train_loss_1=0.0692]Epoch 1: 65%|######4 | 62/96 [00:00<00:00, 618.03batch/s, train/train_loss_1=0.0607]Epoch 1: 65%|######4 | 62/96 [00:00<00:00, 618.03batch/s, train/train_loss_1=0.0635]Epoch 1: 65%|######4 | 62/96 [00:00<00:00, 618.03batch/s, train/train_loss_1=0.0792]Epoch 1: 65%|######4 | 62/96 [00:00<00:00, 618.03batch/s, train/train_loss_1=0.0684]Epoch 1: 65%|######4 | 62/96 [00:00<00:00, 618.03batch/s, train/train_loss_1=0.0677]Epoch 1: 65%|######4 | 62/96 [00:00<00:00, 618.03batch/s, train/train_loss_1=0.0579]Epoch 1: 65%|######4 | 62/96 [00:00<00:00, 618.03batch/s, train/train_loss_1=0.0653]Epoch 1: 65%|######4 | 62/96 [00:00<00:00, 618.03batch/s, train/train_loss_1=0.0681]Epoch 1: 65%|######4 | 62/96 [00:00<00:00, 618.03batch/s, train/train_loss_1=0.0694]Epoch 1: 65%|######4 | 62/96 [00:00<00:00, 618.03batch/s, train/train_loss_1=0.0603]Epoch 1: 65%|######4 | 62/96 [00:00<00:00, 618.03batch/s, train/train_loss_1=0.0658]Epoch 1: 65%|######4 | 62/96 [00:00<00:00, 618.03batch/s, train/train_loss_1=0.0715]Epoch 1: 65%|######4 | 62/96 [00:00<00:00, 618.03batch/s, train/train_loss_1=0.0652]Epoch 1: 65%|######4 | 62/96 [00:00<00:00, 618.03batch/s, train/train_loss_1=0.0648]Epoch 1: 65%|######4 | 62/96 [00:00<00:00, 618.03batch/s, train/train_loss_1=0.0618]Epoch 1: 65%|######4 | 62/96 [00:00<00:00, 618.03batch/s, train/train_loss_1=0.0605]Epoch 1: 65%|######4 | 62/96 [00:00<00:00, 618.03batch/s, train/train_loss_1=0.0601]Epoch 
1: 65%|######4 | 62/96 [00:00<00:00, 618.03batch/s, train/train_loss_1=0.062] Epoch 1: 65%|######4 | 62/96 [00:00<00:00, 618.03batch/s, train/train_loss_1=0.0538]Epoch 1: 65%|######4 | 62/96 [00:00<00:00, 618.03batch/s, train/train_loss_1=0.0749]Epoch 1: 65%|######4 | 62/96 [00:00<00:00, 618.03batch/s, train/train_loss_1=0.0698]Epoch 1: 65%|######4 | 62/96 [00:00<00:00, 618.03batch/s, train/train_loss_1=0.0658]Epoch 1: 65%|######4 | 62/96 [00:00<00:00, 618.03batch/s, train/train_loss_1=0.07] Epoch 1: 65%|######4 | 62/96 [00:00<00:00, 618.03batch/s, train/train_loss_1=0.0733]Epoch 1: 65%|######4 | 62/96 [00:00<00:00, 618.03batch/s, train/train_loss_1=0.0543]Epoch 1: 65%|######4 | 62/96 [00:00<00:00, 618.03batch/s, train/train_loss_1=0.0828] 0%| | 0/96 [00:00<?, ?batch/s]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0826]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0676]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0633]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0667]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0619]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0653]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0746]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0654]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0738]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0664]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0756]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0692]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0635]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0568]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0618]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0607]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0705]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, 
train/train_loss_1=0.0713]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0578]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0567]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0595]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0636]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0683]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0611]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0765]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0667]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0658]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0685]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0732]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0603]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0639]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0588]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0684]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0688]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0628]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0671]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0669]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0629]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0694]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0715]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0655]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0689]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0504]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0557]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0685]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0682]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0765]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, 
train/train_loss_1=0.0577]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0508]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0487]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0557]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0772]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0561]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0755]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.063] Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0674]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0802]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0618]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0717]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0642]Epoch 2: 62%|######2 | 60/96 [00:00<00:00, 594.34batch/s, train/train_loss_1=0.0642]Epoch 2: 62%|######2 | 60/96 [00:00<00:00, 594.34batch/s, train/train_loss_1=0.0724]Epoch 2: 62%|######2 | 60/96 [00:00<00:00, 594.34batch/s, train/train_loss_1=0.0563]Epoch 2: 62%|######2 | 60/96 [00:00<00:00, 594.34batch/s, train/train_loss_1=0.0578]Epoch 2: 62%|######2 | 60/96 [00:00<00:00, 594.34batch/s, train/train_loss_1=0.0725]Epoch 2: 62%|######2 | 60/96 [00:00<00:00, 594.34batch/s, train/train_loss_1=0.0606]Epoch 2: 62%|######2 | 60/96 [00:00<00:00, 594.34batch/s, train/train_loss_1=0.0652]Epoch 2: 62%|######2 | 60/96 [00:00<00:00, 594.34batch/s, train/train_loss_1=0.0517]Epoch 2: 62%|######2 | 60/96 [00:00<00:00, 594.34batch/s, train/train_loss_1=0.0595]Epoch 2: 62%|######2 | 60/96 [00:00<00:00, 594.34batch/s, train/train_loss_1=0.0643]Epoch 2: 62%|######2 | 60/96 [00:00<00:00, 594.34batch/s, train/train_loss_1=0.0595]Epoch 2: 62%|######2 | 60/96 [00:00<00:00, 594.34batch/s, train/train_loss_1=0.0697]Epoch 2: 62%|######2 | 60/96 [00:00<00:00, 594.34batch/s, train/train_loss_1=0.0634]Epoch 2: 62%|######2 | 60/96 [00:00<00:00, 594.34batch/s, train/train_loss_1=0.0641]Epoch 
2: 62%|######2 | 60/96 [00:00<00:00, 594.34batch/s, train/train_loss_1=0.0658]Epoch 2: 62%|######2 | 60/96 [00:00<00:00, 594.34batch/s, train/train_loss_1=0.0598]Epoch 2: 62%|######2 | 60/96 [00:00<00:00, 594.34batch/s, train/train_loss_1=0.0666]Epoch 2: 62%|######2 | 60/96 [00:00<00:00, 594.34batch/s, train/train_loss_1=0.0722]Epoch 2: 62%|######2 | 60/96 [00:00<00:00, 594.34batch/s, train/train_loss_1=0.0616]Epoch 2: 62%|######2 | 60/96 [00:00<00:00, 594.34batch/s, train/train_loss_1=0.0639]Epoch 2: 62%|######2 | 60/96 [00:00<00:00, 594.34batch/s, train/train_loss_1=0.0602]Epoch 2: 62%|######2 | 60/96 [00:00<00:00, 594.34batch/s, train/train_loss_1=0.0684]Epoch 2: 62%|######2 | 60/96 [00:00<00:00, 594.34batch/s, train/train_loss_1=0.0616]Epoch 2: 62%|######2 | 60/96 [00:00<00:00, 594.34batch/s, train/train_loss_1=0.0671]Epoch 2: 62%|######2 | 60/96 [00:00<00:00, 594.34batch/s, train/train_loss_1=0.0707]Epoch 2: 62%|######2 | 60/96 [00:00<00:00, 594.34batch/s, train/train_loss_1=0.0527]Epoch 2: 62%|######2 | 60/96 [00:00<00:00, 594.34batch/s, train/train_loss_1=0.0542]Epoch 2: 62%|######2 | 60/96 [00:00<00:00, 594.34batch/s, train/train_loss_1=0.0672]Epoch 2: 62%|######2 | 60/96 [00:00<00:00, 594.34batch/s, train/train_loss_1=0.0601]Epoch 2: 62%|######2 | 60/96 [00:00<00:00, 594.34batch/s, train/train_loss_1=0.0714]Epoch 2: 62%|######2 | 60/96 [00:00<00:00, 594.34batch/s, train/train_loss_1=0.0648]Epoch 2: 62%|######2 | 60/96 [00:00<00:00, 594.34batch/s, train/train_loss_1=0.0744]Epoch 2: 62%|######2 | 60/96 [00:00<00:00, 594.34batch/s, train/train_loss_1=0.0606]Epoch 2: 62%|######2 | 60/96 [00:00<00:00, 594.34batch/s, train/train_loss_1=0.0588]Epoch 2: 62%|######2 | 60/96 [00:00<00:00, 594.34batch/s, train/train_loss_1=0.0581]Epoch 2: 62%|######2 | 60/96 [00:00<00:00, 594.34batch/s, train/train_loss_1=0.0702]Epoch 2: 62%|######2 | 60/96 [00:00<00:00, 594.34batch/s, train/train_loss_1=0.0871] 0%| | 0/96 [00:00<?, ?batch/s]Epoch 3: 0%| | 0/96 [00:00<?, 
?batch/s]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0662]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0593]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0675]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0625]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0574]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0572]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0599]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0623]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0631]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0604]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.064] Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0695]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0746]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0606]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0747]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.065] Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0607]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0691]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0657]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0638]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0663]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0693]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0615]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0749]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0606]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0639]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0543]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0565]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0597]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0524]Epoch 3: 
0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0674]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0646]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0574]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0633]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0601]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0697]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.061] Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0605]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0722]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0625]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0633]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0697]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0697]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.073] Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0777]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0679]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0552]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0645]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0607]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0506]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0632]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0684]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0553]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0577]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0599]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0637]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0571]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0533]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0586]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0666]Epoch 3: 0%| | 0/96 
[00:00<?, ?batch/s, train/train_loss_1=0.0543]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0492]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.056] Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0617]Epoch 3: 67%|######6 | 64/96 [00:00<00:00, 631.97batch/s, train/train_loss_1=0.0617]Epoch 3: 67%|######6 | 64/96 [00:00<00:00, 631.97batch/s, train/train_loss_1=0.0625]Epoch 3: 67%|######6 | 64/96 [00:00<00:00, 631.97batch/s, train/train_loss_1=0.0582]Epoch 3: 67%|######6 | 64/96 [00:00<00:00, 631.97batch/s, train/train_loss_1=0.0552]Epoch 3: 67%|######6 | 64/96 [00:00<00:00, 631.97batch/s, train/train_loss_1=0.0652]Epoch 3: 67%|######6 | 64/96 [00:00<00:00, 631.97batch/s, train/train_loss_1=0.0625]Epoch 3: 67%|######6 | 64/96 [00:00<00:00, 631.97batch/s, train/train_loss_1=0.072] Epoch 3: 67%|######6 | 64/96 [00:00<00:00, 631.97batch/s, train/train_loss_1=0.0666]Epoch 3: 67%|######6 | 64/96 [00:00<00:00, 631.97batch/s, train/train_loss_1=0.0583]Epoch 3: 67%|######6 | 64/96 [00:00<00:00, 631.97batch/s, train/train_loss_1=0.0587]Epoch 3: 67%|######6 | 64/96 [00:00<00:00, 631.97batch/s, train/train_loss_1=0.0569]Epoch 3: 67%|######6 | 64/96 [00:00<00:00, 631.97batch/s, train/train_loss_1=0.0687]Epoch 3: 67%|######6 | 64/96 [00:00<00:00, 631.97batch/s, train/train_loss_1=0.0667]Epoch 3: 67%|######6 | 64/96 [00:00<00:00, 631.97batch/s, train/train_loss_1=0.0565]Epoch 3: 67%|######6 | 64/96 [00:00<00:00, 631.97batch/s, train/train_loss_1=0.0705]Epoch 3: 67%|######6 | 64/96 [00:00<00:00, 631.97batch/s, train/train_loss_1=0.0721]Epoch 3: 67%|######6 | 64/96 [00:00<00:00, 631.97batch/s, train/train_loss_1=0.068] Epoch 3: 67%|######6 | 64/96 [00:00<00:00, 631.97batch/s, train/train_loss_1=0.0636]Epoch 3: 67%|######6 | 64/96 [00:00<00:00, 631.97batch/s, train/train_loss_1=0.056] Epoch 3: 67%|######6 | 64/96 [00:00<00:00, 631.97batch/s, train/train_loss_1=0.0638]Epoch 3: 67%|######6 | 64/96 [00:00<00:00, 631.97batch/s, 
train/train_loss_1=0.062] Epoch 3: 67%|######6 | 64/96 [00:00<00:00, 631.97batch/s, train/train_loss_1=0.0673]Epoch 3: 67%|######6 | 64/96 [00:00<00:00, 631.97batch/s, train/train_loss_1=0.0621]Epoch 3: 67%|######6 | 64/96 [00:00<00:00, 631.97batch/s, train/train_loss_1=0.067] Epoch 3: 67%|######6 | 64/96 [00:00<00:00, 631.97batch/s, train/train_loss_1=0.0672]Epoch 3: 67%|######6 | 64/96 [00:00<00:00, 631.97batch/s, train/train_loss_1=0.0647]Epoch 3: 67%|######6 | 64/96 [00:00<00:00, 631.97batch/s, train/train_loss_1=0.0628]Epoch 3: 67%|######6 | 64/96 [00:00<00:00, 631.97batch/s, train/train_loss_1=0.0631]Epoch 3: 67%|######6 | 64/96 [00:00<00:00, 631.97batch/s, train/train_loss_1=0.0588]Epoch 3: 67%|######6 | 64/96 [00:00<00:00, 631.97batch/s, train/train_loss_1=0.0561]Epoch 3: 67%|######6 | 64/96 [00:00<00:00, 631.97batch/s, train/train_loss_1=0.0694]Epoch 3: 67%|######6 | 64/96 [00:00<00:00, 631.97batch/s, train/train_loss_1=0.0608]Epoch 3: 67%|######6 | 64/96 [00:00<00:00, 631.97batch/s, train/train_loss_1=0.0562] 0%| | 0/96 [00:00<?, ?batch/s]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0544]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0632]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0588]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0633]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0704]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0617]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.059] Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0596]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0551]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0632]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0599]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.059] Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0574]Epoch 4: 0%| | 0/96 [00:00<?, 
?batch/s, train/train_loss_1=0.0631]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0741]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0564]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0609]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0606]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.064] Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0661]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0658]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0607]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0555]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0567]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0632]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0647]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0657]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0536]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0592]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0575]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0705]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0488]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.072] Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.069]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0682]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0696]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0665]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0668]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0611]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0603]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.051] Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0608]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0535]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, 
train/train_loss_1=0.059] Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0648]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0763]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0678]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0632]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0658]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0731]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0661]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0596]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0731]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0729]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0634]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0694]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0612]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0638]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0621]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.06] Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0646]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0574]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0628]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0639]Epoch 4: 67%|######6 | 64/96 [00:00<00:00, 639.52batch/s, train/train_loss_1=0.0639]Epoch 4: 67%|######6 | 64/96 [00:00<00:00, 639.52batch/s, train/train_loss_1=0.0615]Epoch 4: 67%|######6 | 64/96 [00:00<00:00, 639.52batch/s, train/train_loss_1=0.0703]Epoch 4: 67%|######6 | 64/96 [00:00<00:00, 639.52batch/s, train/train_loss_1=0.0664]Epoch 4: 67%|######6 | 64/96 [00:00<00:00, 639.52batch/s, train/train_loss_1=0.06] Epoch 4: 67%|######6 | 64/96 [00:00<00:00, 639.52batch/s, train/train_loss_1=0.0648]Epoch 4: 67%|######6 | 64/96 [00:00<00:00, 639.52batch/s, train/train_loss_1=0.0559]Epoch 4: 67%|######6 | 64/96 [00:00<00:00, 639.52batch/s, 
train/train_loss_1=0.0682]Epoch 4: 67%|######6 | 64/96 [00:00<00:00, 639.52batch/s, train/train_loss_1=0.0543]Epoch 4: 67%|######6 | 64/96 [00:00<00:00, 639.52batch/s, train/train_loss_1=0.0622]Epoch 4: 67%|######6 | 64/96 [00:00<00:00, 639.52batch/s, train/train_loss_1=0.0678]Epoch 4: 67%|######6 | 64/96 [00:00<00:00, 639.52batch/s, train/train_loss_1=0.056] Epoch 4: 67%|######6 | 64/96 [00:00<00:00, 639.52batch/s, train/train_loss_1=0.0574]Epoch 4: 67%|######6 | 64/96 [00:00<00:00, 639.52batch/s, train/train_loss_1=0.0578]Epoch 4: 67%|######6 | 64/96 [00:00<00:00, 639.52batch/s, train/train_loss_1=0.0557]Epoch 4: 67%|######6 | 64/96 [00:00<00:00, 639.52batch/s, train/train_loss_1=0.0558]Epoch 4: 67%|######6 | 64/96 [00:00<00:00, 639.52batch/s, train/train_loss_1=0.0533]Epoch 4: 67%|######6 | 64/96 [00:00<00:00, 639.52batch/s, train/train_loss_1=0.0581]Epoch 4: 67%|######6 | 64/96 [00:00<00:00, 639.52batch/s, train/train_loss_1=0.0586]Epoch 4: 67%|######6 | 64/96 [00:00<00:00, 639.52batch/s, train/train_loss_1=0.0673]Epoch 4: 67%|######6 | 64/96 [00:00<00:00, 639.52batch/s, train/train_loss_1=0.0545]Epoch 4: 67%|######6 | 64/96 [00:00<00:00, 639.52batch/s, train/train_loss_1=0.0574]Epoch 4: 67%|######6 | 64/96 [00:00<00:00, 639.52batch/s, train/train_loss_1=0.0646]Epoch 4: 67%|######6 | 64/96 [00:00<00:00, 639.52batch/s, train/train_loss_1=0.0589]Epoch 4: 67%|######6 | 64/96 [00:00<00:00, 639.52batch/s, train/train_loss_1=0.0671]Epoch 4: 67%|######6 | 64/96 [00:00<00:00, 639.52batch/s, train/train_loss_1=0.0607]Epoch 4: 67%|######6 | 64/96 [00:00<00:00, 639.52batch/s, train/train_loss_1=0.0622]Epoch 4: 67%|######6 | 64/96 [00:00<00:00, 639.52batch/s, train/train_loss_1=0.0785]Epoch 4: 67%|######6 | 64/96 [00:00<00:00, 639.52batch/s, train/train_loss_1=0.0615]Epoch 4: 67%|######6 | 64/96 [00:00<00:00, 639.52batch/s, train/train_loss_1=0.0624]Epoch 4: 67%|######6 | 64/96 [00:00<00:00, 639.52batch/s, train/train_loss_1=0.064] Epoch 4: 67%|######6 | 64/96 
[00:00<00:00, 639.52batch/s, train/train_loss_1=0.0685]Epoch 4: 67%|######6 | 64/96 [00:00<00:00, 639.52batch/s, train/train_loss_1=0.0439] 0%| | 0/96 [00:00<?, ?batch/s]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0611]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0673]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.069] Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0548]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0601]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.058] Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0611]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0634]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0604]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0651]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0629]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0644]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0637]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0653]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0557]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.062] Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0697]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.056] Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0495]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0632]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0592]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0646]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0698]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0602]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0691]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0592]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0642]Epoch 5: 
0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0556]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0663]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0625]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.056] Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0629]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0489]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.062] Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0651]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0561]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0567]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0551]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0587]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0627]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0692]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.064] Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0603]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0678]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0559]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0583]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0575]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0617]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0601]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0637]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0601]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0562]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0688]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0538]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0568]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0649]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0675]Epoch 5: 0%| | 0/96 
[00:00<?, ?batch/s, train/train_loss_1=0.0633]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0658]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0544]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0684]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0591]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0675]Epoch 5: 66%|######5 | 63/96 [00:00<00:00, 623.19batch/s, train/train_loss_1=0.0675]Epoch 5: 66%|######5 | 63/96 [00:00<00:00, 623.19batch/s, train/train_loss_1=0.0654]Epoch 5: 66%|######5 | 63/96 [00:00<00:00, 623.19batch/s, train/train_loss_1=0.0684]Epoch 5: 66%|######5 | 63/96 [00:00<00:00, 623.19batch/s, train/train_loss_1=0.0601]Epoch 5: 66%|######5 | 63/96 [00:00<00:00, 623.19batch/s, train/train_loss_1=0.0578]Epoch 5: 66%|######5 | 63/96 [00:00<00:00, 623.19batch/s, train/train_loss_1=0.0719]Epoch 5: 66%|######5 | 63/96 [00:00<00:00, 623.19batch/s, train/train_loss_1=0.0655]Epoch 5: 66%|######5 | 63/96 [00:00<00:00, 623.19batch/s, train/train_loss_1=0.0649]Epoch 5: 66%|######5 | 63/96 [00:00<00:00, 623.19batch/s, train/train_loss_1=0.0529]Epoch 5: 66%|######5 | 63/96 [00:00<00:00, 623.19batch/s, train/train_loss_1=0.0642]Epoch 5: 66%|######5 | 63/96 [00:00<00:00, 623.19batch/s, train/train_loss_1=0.0626]Epoch 5: 66%|######5 | 63/96 [00:00<00:00, 623.19batch/s, train/train_loss_1=0.0646]Epoch 5: 66%|######5 | 63/96 [00:00<00:00, 623.19batch/s, train/train_loss_1=0.0633]Epoch 5: 66%|######5 | 63/96 [00:00<00:00, 623.19batch/s, train/train_loss_1=0.0592]Epoch 5: 66%|######5 | 63/96 [00:00<00:00, 623.19batch/s, train/train_loss_1=0.0596]Epoch 5: 66%|######5 | 63/96 [00:00<00:00, 623.19batch/s, train/train_loss_1=0.0592]Epoch 5: 66%|######5 | 63/96 [00:00<00:00, 623.19batch/s, train/train_loss_1=0.0638]Epoch 5: 66%|######5 | 63/96 [00:00<00:00, 623.19batch/s, train/train_loss_1=0.0664]Epoch 5: 66%|######5 | 63/96 [00:00<00:00, 623.19batch/s, train/train_loss_1=0.0659]Epoch 5: 66%|######5 | 
63/96 [00:00<00:00, 623.19batch/s, train/train_loss_1=0.0676]Epoch 5: 66%|######5 | 63/96 [00:00<00:00, 623.19batch/s, train/train_loss_1=0.0703]Epoch 5: 66%|######5 | 63/96 [00:00<00:00, 623.19batch/s, train/train_loss_1=0.0584]Epoch 5: 66%|######5 | 63/96 [00:00<00:00, 623.19batch/s, train/train_loss_1=0.0506]Epoch 5: 66%|######5 | 63/96 [00:00<00:00, 623.19batch/s, train/train_loss_1=0.0662]Epoch 5: 66%|######5 | 63/96 [00:00<00:00, 623.19batch/s, train/train_loss_1=0.0522]Epoch 5: 66%|######5 | 63/96 [00:00<00:00, 623.19batch/s, train/train_loss_1=0.0559]Epoch 5: 66%|######5 | 63/96 [00:00<00:00, 623.19batch/s, train/train_loss_1=0.0619]Epoch 5: 66%|######5 | 63/96 [00:00<00:00, 623.19batch/s, train/train_loss_1=0.0481]Epoch 5: 66%|######5 | 63/96 [00:00<00:00, 623.19batch/s, train/train_loss_1=0.0656]Epoch 5: 66%|######5 | 63/96 [00:00<00:00, 623.19batch/s, train/train_loss_1=0.0558]Epoch 5: 66%|######5 | 63/96 [00:00<00:00, 623.19batch/s, train/train_loss_1=0.058] Epoch 5: 66%|######5 | 63/96 [00:00<00:00, 623.19batch/s, train/train_loss_1=0.069]Epoch 5: 66%|######5 | 63/96 [00:00<00:00, 623.19batch/s, train/train_loss_1=0.0612]Epoch 5: 66%|######5 | 63/96 [00:00<00:00, 623.19batch/s, train/train_loss_1=0.0525] 0%| | 0/96 [00:00<?, ?batch/s]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0604]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0558]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0566]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0531]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0696]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0661]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0574]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0608]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0647]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0733]Epoch 6: 0%| | 0/96 [00:00<?, 
?batch/s, train/train_loss_1=0.0609]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0625]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0518]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0566]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0616]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0658]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0666]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0648]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0617]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0611]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0626]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0602]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0586]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0556]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0545]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0556]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0607]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.05] Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0564]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0477]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0661]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0638]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0594]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0659]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0732]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0548]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0654]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0502]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0547]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.06] Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, 
train/train_loss_1=0.062]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0672]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.054] Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0693]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0608]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0605]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0675]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0522]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0613]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0639]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0573]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0632]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0514]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0632]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0599]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0609]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.059] Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0636]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0682]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0679]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.056] Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0583]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0698]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.06] Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.052]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0682]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0709]Epoch 6: 70%|######9 | 67/96 [00:00<00:00, 660.58batch/s, train/train_loss_1=0.0709]Epoch 6: 70%|######9 | 67/96 [00:00<00:00, 660.58batch/s, train/train_loss_1=0.0518]Epoch 6: 70%|######9 | 67/96 [00:00<00:00, 660.58batch/s, train/train_loss_1=0.0613]Epoch 6: 
70%|######9 | 67/96 [00:00<00:00, 660.58batch/s, train/train_loss_1=0.0667]Epoch 6: 70%|######9 | 67/96 [00:00<00:00, 660.58batch/s, train/train_loss_1=0.0572]Epoch 6: 70%|######9 | 67/96 [00:00<00:00, 660.58batch/s, train/train_loss_1=0.0631]Epoch 6: 70%|######9 | 67/96 [00:00<00:00, 660.58batch/s, train/train_loss_1=0.0697]Epoch 6: 70%|######9 | 67/96 [00:00<00:00, 660.58batch/s, train/train_loss_1=0.0557]Epoch 6: 70%|######9 | 67/96 [00:00<00:00, 660.58batch/s, train/train_loss_1=0.0677]Epoch 6: 70%|######9 | 67/96 [00:00<00:00, 660.58batch/s, train/train_loss_1=0.0761]Epoch 6: 70%|######9 | 67/96 [00:00<00:00, 660.58batch/s, train/train_loss_1=0.0581]Epoch 6: 70%|######9 | 67/96 [00:00<00:00, 660.58batch/s, train/train_loss_1=0.0545]Epoch 6: 70%|######9 | 67/96 [00:00<00:00, 660.58batch/s, train/train_loss_1=0.0638]Epoch 6: 70%|######9 | 67/96 [00:00<00:00, 660.58batch/s, train/train_loss_1=0.0489]Epoch 6: 70%|######9 | 67/96 [00:00<00:00, 660.58batch/s, train/train_loss_1=0.0572]Epoch 6: 70%|######9 | 67/96 [00:00<00:00, 660.58batch/s, train/train_loss_1=0.0697]Epoch 6: 70%|######9 | 67/96 [00:00<00:00, 660.58batch/s, train/train_loss_1=0.0609]Epoch 6: 70%|######9 | 67/96 [00:00<00:00, 660.58batch/s, train/train_loss_1=0.0745]Epoch 6: 70%|######9 | 67/96 [00:00<00:00, 660.58batch/s, train/train_loss_1=0.0707]Epoch 6: 70%|######9 | 67/96 [00:00<00:00, 660.58batch/s, train/train_loss_1=0.0564]Epoch 6: 70%|######9 | 67/96 [00:00<00:00, 660.58batch/s, train/train_loss_1=0.0612]Epoch 6: 70%|######9 | 67/96 [00:00<00:00, 660.58batch/s, train/train_loss_1=0.063] Epoch 6: 70%|######9 | 67/96 [00:00<00:00, 660.58batch/s, train/train_loss_1=0.0652]Epoch 6: 70%|######9 | 67/96 [00:00<00:00, 660.58batch/s, train/train_loss_1=0.0573]Epoch 6: 70%|######9 | 67/96 [00:00<00:00, 660.58batch/s, train/train_loss_1=0.0682]Epoch 6: 70%|######9 | 67/96 [00:00<00:00, 660.58batch/s, train/train_loss_1=0.0626]Epoch 6: 70%|######9 | 67/96 [00:00<00:00, 660.58batch/s, 
train/train_loss_1=0.0519]Epoch 6: 70%|######9 | 67/96 [00:00<00:00, 660.58batch/s, train/train_loss_1=0.0551]Epoch 6: 70%|######9 | 67/96 [00:00<00:00, 660.58batch/s, train/train_loss_1=0.0431]Epoch 6: 70%|######9 | 67/96 [00:00<00:00, 660.58batch/s, train/train_loss_1=0.0678] 0%| | 0/96 [00:00<?, ?batch/s]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0562]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0731]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0526]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0536]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0632]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0548]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0568]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0619]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0652]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0734]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0645]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0627]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0583]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0498]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0555]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0642]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0625]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0607]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0585]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0594]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0668]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0688]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0629]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0646]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, 
train/train_loss_1=0.0606]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0628]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0691]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0751]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0574]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0713]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0525]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0502]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0626]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0633]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0492]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0608]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0542]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0486]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0586]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0579]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0588]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0709]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0665]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0639]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0549]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0679]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0622]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0604]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0698]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.054] Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0611]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0606]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0696]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.057] Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, 
train/train_loss_1=0.0624]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0662]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0601]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0645]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0657]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.047] Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0591]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.057] Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0632]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0623]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0538]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0623]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0553]Epoch 7: 70%|######9 | 67/96 [00:00<00:00, 664.57batch/s, train/train_loss_1=0.0553]Epoch 7: 70%|######9 | 67/96 [00:00<00:00, 664.57batch/s, train/train_loss_1=0.0633]Epoch 7: 70%|######9 | 67/96 [00:00<00:00, 664.57batch/s, train/train_loss_1=0.0696]Epoch 7: 70%|######9 | 67/96 [00:00<00:00, 664.57batch/s, train/train_loss_1=0.0569]Epoch 7: 70%|######9 | 67/96 [00:00<00:00, 664.57batch/s, train/train_loss_1=0.0584]Epoch 7: 70%|######9 | 67/96 [00:00<00:00, 664.57batch/s, train/train_loss_1=0.0613]Epoch 7: 70%|######9 | 67/96 [00:00<00:00, 664.57batch/s, train/train_loss_1=0.0584]Epoch 7: 70%|######9 | 67/96 [00:00<00:00, 664.57batch/s, train/train_loss_1=0.0556]Epoch 7: 70%|######9 | 67/96 [00:00<00:00, 664.57batch/s, train/train_loss_1=0.0689]Epoch 7: 70%|######9 | 67/96 [00:00<00:00, 664.57batch/s, train/train_loss_1=0.062] Epoch 7: 70%|######9 | 67/96 [00:00<00:00, 664.57batch/s, train/train_loss_1=0.0618]Epoch 7: 70%|######9 | 67/96 [00:00<00:00, 664.57batch/s, train/train_loss_1=0.0535]Epoch 7: 70%|######9 | 67/96 [00:00<00:00, 664.57batch/s, train/train_loss_1=0.0531]Epoch 7: 70%|######9 | 67/96 [00:00<00:00, 664.57batch/s, train/train_loss_1=0.0632]Epoch 
7: 70%|######9 | 67/96 [00:00<00:00, 664.57batch/s, train/train_loss_1=0.075] Epoch 7: 70%|######9 | 67/96 [00:00<00:00, 664.57batch/s, train/train_loss_1=0.0601]Epoch 7: 70%|######9 | 67/96 [00:00<00:00, 664.57batch/s, train/train_loss_1=0.0733]Epoch 7: 70%|######9 | 67/96 [00:00<00:00, 664.57batch/s, train/train_loss_1=0.0588]Epoch 7: 70%|######9 | 67/96 [00:00<00:00, 664.57batch/s, train/train_loss_1=0.0626]Epoch 7: 70%|######9 | 67/96 [00:00<00:00, 664.57batch/s, train/train_loss_1=0.0652]Epoch 7: 70%|######9 | 67/96 [00:00<00:00, 664.57batch/s, train/train_loss_1=0.0532]Epoch 7: 70%|######9 | 67/96 [00:00<00:00, 664.57batch/s, train/train_loss_1=0.0622]Epoch 7: 70%|######9 | 67/96 [00:00<00:00, 664.57batch/s, train/train_loss_1=0.0535]Epoch 7: 70%|######9 | 67/96 [00:00<00:00, 664.57batch/s, train/train_loss_1=0.0475]Epoch 7: 70%|######9 | 67/96 [00:00<00:00, 664.57batch/s, train/train_loss_1=0.0646]Epoch 7: 70%|######9 | 67/96 [00:00<00:00, 664.57batch/s, train/train_loss_1=0.0705]Epoch 7: 70%|######9 | 67/96 [00:00<00:00, 664.57batch/s, train/train_loss_1=0.0509]Epoch 7: 70%|######9 | 67/96 [00:00<00:00, 664.57batch/s, train/train_loss_1=0.0528]Epoch 7: 70%|######9 | 67/96 [00:00<00:00, 664.57batch/s, train/train_loss_1=0.0509]Epoch 7: 70%|######9 | 67/96 [00:00<00:00, 664.57batch/s, train/train_loss_1=0.0603] 0%| | 0/96 [00:00<?, ?batch/s]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0527]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0598]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0681]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.049] Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.063]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0647]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0488]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.055] Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, 
train/train_loss_1=0.0598]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0631]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0648]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0576]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0528]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0624]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0587]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0666]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0611]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0665]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0463]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0664]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.054] Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0599]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0591]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0672]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.064] Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0539]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0549]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0545]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0659]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0557]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0739]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0665]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0578]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0608]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0545]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0529]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0615]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0682]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, 
train/train_loss_1=0.0658]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0606]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0534]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0569]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0661]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0614]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0688]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0661]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0503]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0599]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0645]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.073] Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0605]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0602]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.058] Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0649]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.057] Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0551]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.055] Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0596]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0633]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.065] Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0567]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0613]Epoch 8: 65%|######4 | 62/96 [00:00<00:00, 617.42batch/s, train/train_loss_1=0.0613]Epoch 8: 65%|######4 | 62/96 [00:00<00:00, 617.42batch/s, train/train_loss_1=0.0572]Epoch 8: 65%|######4 | 62/96 [00:00<00:00, 617.42batch/s, train/train_loss_1=0.0626]Epoch 8: 65%|######4 | 62/96 [00:00<00:00, 617.42batch/s, train/train_loss_1=0.0565]Epoch 8: 65%|######4 | 62/96 [00:00<00:00, 617.42batch/s, train/train_loss_1=0.0591]Epoch 8: 65%|######4 | 62/96 
[00:00<00:00, 617.42batch/s, train/train_loss_1=0.0637]Epoch 8: 65%|######4 | 62/96 [00:00<00:00, 617.42batch/s, train/train_loss_1=0.0676]Epoch 8: 65%|######4 | 62/96 [00:00<00:00, 617.42batch/s, train/train_loss_1=0.0602]Epoch 8: 65%|######4 | 62/96 [00:00<00:00, 617.42batch/s, train/train_loss_1=0.0718]Epoch 8: 65%|######4 | 62/96 [00:00<00:00, 617.42batch/s, train/train_loss_1=0.0521]Epoch 8: 65%|######4 | 62/96 [00:00<00:00, 617.42batch/s, train/train_loss_1=0.0637]Epoch 8: 65%|######4 | 62/96 [00:00<00:00, 617.42batch/s, train/train_loss_1=0.0569]Epoch 8: 65%|######4 | 62/96 [00:00<00:00, 617.42batch/s, train/train_loss_1=0.064] Epoch 8: 65%|######4 | 62/96 [00:00<00:00, 617.42batch/s, train/train_loss_1=0.0674]Epoch 8: 65%|######4 | 62/96 [00:00<00:00, 617.42batch/s, train/train_loss_1=0.0587]Epoch 8: 65%|######4 | 62/96 [00:00<00:00, 617.42batch/s, train/train_loss_1=0.0595]Epoch 8: 65%|######4 | 62/96 [00:00<00:00, 617.42batch/s, train/train_loss_1=0.0726]Epoch 8: 65%|######4 | 62/96 [00:00<00:00, 617.42batch/s, train/train_loss_1=0.0716]Epoch 8: 65%|######4 | 62/96 [00:00<00:00, 617.42batch/s, train/train_loss_1=0.0634]Epoch 8: 65%|######4 | 62/96 [00:00<00:00, 617.42batch/s, train/train_loss_1=0.0618]Epoch 8: 65%|######4 | 62/96 [00:00<00:00, 617.42batch/s, train/train_loss_1=0.064] Epoch 8: 65%|######4 | 62/96 [00:00<00:00, 617.42batch/s, train/train_loss_1=0.0586]Epoch 8: 65%|######4 | 62/96 [00:00<00:00, 617.42batch/s, train/train_loss_1=0.0524]Epoch 8: 65%|######4 | 62/96 [00:00<00:00, 617.42batch/s, train/train_loss_1=0.0554]Epoch 8: 65%|######4 | 62/96 [00:00<00:00, 617.42batch/s, train/train_loss_1=0.0612]Epoch 8: 65%|######4 | 62/96 [00:00<00:00, 617.42batch/s, train/train_loss_1=0.0616]Epoch 8: 65%|######4 | 62/96 [00:00<00:00, 617.42batch/s, train/train_loss_1=0.054] Epoch 8: 65%|######4 | 62/96 [00:00<00:00, 617.42batch/s, train/train_loss_1=0.0525]Epoch 8: 65%|######4 | 62/96 [00:00<00:00, 617.42batch/s, train/train_loss_1=0.0717]Epoch 8: 
65%|######4 | 62/96 [00:00<00:00, 617.42batch/s, train/train_loss_1=0.0785]Epoch 8: 65%|######4 | 62/96 [00:00<00:00, 617.42batch/s, train/train_loss_1=0.0615]Epoch 8: 65%|######4 | 62/96 [00:00<00:00, 617.42batch/s, train/train_loss_1=0.0527]Epoch 8: 65%|######4 | 62/96 [00:00<00:00, 617.42batch/s, train/train_loss_1=0.059] Epoch 8: 65%|######4 | 62/96 [00:00<00:00, 617.42batch/s, train/train_loss_1=0.0464]Epoch 8: 65%|######4 | 62/96 [00:00<00:00, 617.42batch/s, train/train_loss_1=0.0529] 0%| | 0/96 [00:00<?, ?batch/s]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0552]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0628]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0611]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.055] Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0559]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0591]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0676]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0664]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0538]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0515]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0551]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0649]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.07] Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0635]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0593]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0701]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0616]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0563]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0582]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0651]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0624]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, 
train/train_loss_1=0.0616]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0626]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0602]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0645]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0546]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0566]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.054] Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0631]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0596]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.059] Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0686]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.059] Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0533]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0598]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0504]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0707]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0672]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.071] Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0561]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0585]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0668]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0611]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0566]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0666]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0601]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0622]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0605]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0587]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0578]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0581]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, 
train/train_loss_1=0.0633]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0561]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0482]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0611]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0518]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0584]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.058] Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0633]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0551]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0606]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0488]Epoch 9: 65%|######4 | 62/96 [00:00<00:00, 615.51batch/s, train/train_loss_1=0.0488]Epoch 9: 65%|######4 | 62/96 [00:00<00:00, 615.51batch/s, train/train_loss_1=0.0654]Epoch 9: 65%|######4 | 62/96 [00:00<00:00, 615.51batch/s, train/train_loss_1=0.0669]Epoch 9: 65%|######4 | 62/96 [00:00<00:00, 615.51batch/s, train/train_loss_1=0.0607]Epoch 9: 65%|######4 | 62/96 [00:00<00:00, 615.51batch/s, train/train_loss_1=0.0643]Epoch 9: 65%|######4 | 62/96 [00:00<00:00, 615.51batch/s, train/train_loss_1=0.0656]Epoch 9: 65%|######4 | 62/96 [00:00<00:00, 615.51batch/s, train/train_loss_1=0.062] Epoch 9: 65%|######4 | 62/96 [00:00<00:00, 615.51batch/s, train/train_loss_1=0.0462]Epoch 9: 65%|######4 | 62/96 [00:00<00:00, 615.51batch/s, train/train_loss_1=0.0634]Epoch 9: 65%|######4 | 62/96 [00:00<00:00, 615.51batch/s, train/train_loss_1=0.0518]Epoch 9: 65%|######4 | 62/96 [00:00<00:00, 615.51batch/s, train/train_loss_1=0.057] Epoch 9: 65%|######4 | 62/96 [00:00<00:00, 615.51batch/s, train/train_loss_1=0.0613]Epoch 9: 65%|######4 | 62/96 [00:00<00:00, 615.51batch/s, train/train_loss_1=0.0565]Epoch 9: 65%|######4 | 62/96 [00:00<00:00, 615.51batch/s, train/train_loss_1=0.0639]Epoch 9: 65%|######4 | 62/96 [00:00<00:00, 615.51batch/s, train/train_loss_1=0.0576]Epoch 9: 65%|######4 | 62/96 [00:00<00:00, 
615.51batch/s, train/train_loss_1=0.0733]Epoch 9: 65%|######4 | 62/96 [00:00<00:00, 615.51batch/s, train/train_loss_1=0.0557]Epoch 9: 65%|######4 | 62/96 [00:00<00:00, 615.51batch/s, train/train_loss_1=0.066] Epoch 9: 65%|######4 | 62/96 [00:00<00:00, 615.51batch/s, train/train_loss_1=0.0634]Epoch 9: 65%|######4 | 62/96 [00:00<00:00, 615.51batch/s, train/train_loss_1=0.0625]Epoch 9: 65%|######4 | 62/96 [00:00<00:00, 615.51batch/s, train/train_loss_1=0.0612]Epoch 9: 65%|######4 | 62/96 [00:00<00:00, 615.51batch/s, train/train_loss_1=0.0586]Epoch 9: 65%|######4 | 62/96 [00:00<00:00, 615.51batch/s, train/train_loss_1=0.058] Epoch 9: 65%|######4 | 62/96 [00:00<00:00, 615.51batch/s, train/train_loss_1=0.0615]Epoch 9: 65%|######4 | 62/96 [00:00<00:00, 615.51batch/s, train/train_loss_1=0.0589]Epoch 9: 65%|######4 | 62/96 [00:00<00:00, 615.51batch/s, train/train_loss_1=0.0513]Epoch 9: 65%|######4 | 62/96 [00:00<00:00, 615.51batch/s, train/train_loss_1=0.0626]Epoch 9: 65%|######4 | 62/96 [00:00<00:00, 615.51batch/s, train/train_loss_1=0.0628]Epoch 9: 65%|######4 | 62/96 [00:00<00:00, 615.51batch/s, train/train_loss_1=0.0591]Epoch 9: 65%|######4 | 62/96 [00:00<00:00, 615.51batch/s, train/train_loss_1=0.0637]Epoch 9: 65%|######4 | 62/96 [00:00<00:00, 615.51batch/s, train/train_loss_1=0.067] Epoch 9: 65%|######4 | 62/96 [00:00<00:00, 615.51batch/s, train/train_loss_1=0.0612]Epoch 9: 65%|######4 | 62/96 [00:00<00:00, 615.51batch/s, train/train_loss_1=0.0573]Epoch 9: 65%|######4 | 62/96 [00:00<00:00, 615.51batch/s, train/train_loss_1=0.0547]Epoch 9: 65%|######4 | 62/96 [00:00<00:00, 615.51batch/s, train/train_loss_1=0.0706]Epoch 9: 100%|##########| 96/96 [00:00<00:00, 569.03batch/s, train/train_loss_1=0.0706]
/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/sklearn/preprocessing/_encoders.py:868: FutureWarning: `sparse` was renamed to `sparse_output` in version 1.2 and will be removed in 1.4. `sparse_output` is ignored unless you leave `sparse` to its default value.
warnings.warn(
0%| | 0/31 [00:00<?, ?batch/s]Epoch 0: 0%| | 0/31 [00:00<?, ?batch/s]Epoch 0: 0%| | 0/31 [00:01<?, ?batch/s, train/train_loss_1=0.135]Epoch 0: 3%|3 | 1/31 [00:01<00:34, 1.14s/batch, train/train_loss_1=0.135]Epoch 0: 3%|3 | 1/31 [00:01<00:34, 1.14s/batch, train/train_loss_1=0.132]Epoch 0: 3%|3 | 1/31 [00:01<00:34, 1.14s/batch, train/train_loss_1=0.123]Epoch 0: 3%|3 | 1/31 [00:01<00:34, 1.14s/batch, train/train_loss_1=0.126]Epoch 0: 3%|3 | 1/31 [00:01<00:34, 1.14s/batch, train/train_loss_1=0.124]Epoch 0: 3%|3 | 1/31 [00:01<00:34, 1.14s/batch, train/train_loss_1=0.123]Epoch 0: 3%|3 | 1/31 [00:01<00:34, 1.14s/batch, train/train_loss_1=0.123]Epoch 0: 3%|3 | 1/31 [00:01<00:34, 1.14s/batch, train/train_loss_1=0.126]Epoch 0: 3%|3 | 1/31 [00:01<00:34, 1.14s/batch, train/train_loss_1=0.123]Epoch 0: 3%|3 | 1/31 [00:01<00:34, 1.14s/batch, train/train_loss_1=0.122]Epoch 0: 3%|3 | 1/31 [00:01<00:34, 1.14s/batch, train/train_loss_1=0.123]Epoch 0: 3%|3 | 1/31 [00:01<00:34, 1.14s/batch, train/train_loss_1=0.12] Epoch 0: 3%|3 | 1/31 [00:01<00:34, 1.14s/batch, train/train_loss_1=0.122]Epoch 0: 3%|3 | 1/31 [00:01<00:34, 1.14s/batch, train/train_loss_1=0.119]Epoch 0: 3%|3 | 1/31 [00:01<00:34, 1.14s/batch, train/train_loss_1=0.124]Epoch 0: 3%|3 | 1/31 [00:01<00:34, 1.14s/batch, train/train_loss_1=0.119]Epoch 0: 3%|3 | 1/31 [00:01<00:34, 1.14s/batch, train/train_loss_1=0.119]Epoch 0: 3%|3 | 1/31 [00:01<00:34, 1.14s/batch, train/train_loss_1=0.121]Epoch 0: 3%|3 | 1/31 [00:01<00:34, 1.14s/batch, train/train_loss_1=0.116]Epoch 0: 3%|3 | 1/31 [00:01<00:34, 1.14s/batch, train/train_loss_1=0.118]Epoch 0: 3%|3 | 1/31 [00:01<00:34, 1.14s/batch, train/train_loss_1=0.123]Epoch 0: 3%|3 | 1/31 [00:01<00:34, 1.14s/batch, train/train_loss_1=0.121]Epoch 0: 3%|3 | 1/31 [00:01<00:34, 1.14s/batch, train/train_loss_1=0.124]Epoch 0: 3%|3 | 1/31 [00:01<00:34, 1.14s/batch, train/train_loss_1=0.119]Epoch 0: 3%|3 | 1/31 [00:01<00:34, 1.14s/batch, train/train_loss_1=0.117]Epoch 0: 3%|3 | 1/31 [00:01<00:34, 
1.14s/batch, train/train_loss_1=0.113]Epoch 0: 3%|3 | 1/31 [00:01<00:34, 1.14s/batch, train/train_loss_1=0.118]Epoch 0: 3%|3 | 1/31 [00:01<00:34, 1.14s/batch, train/train_loss_1=0.119]Epoch 0: 3%|3 | 1/31 [00:01<00:34, 1.14s/batch, train/train_loss_1=0.121]Epoch 0: 3%|3 | 1/31 [00:01<00:34, 1.14s/batch, train/train_loss_1=0.116]Epoch 0: 3%|3 | 1/31 [00:02<00:34, 1.14s/batch, train/train_loss_1=0.116]Epoch 0: 100%|##########| 31/31 [00:02<00:00, 14.90batch/s, train/train_loss_1=0.116] 0%| | 0/31 [00:00<?, ?batch/s]Epoch 1: 0%| | 0/31 [00:00<?, ?batch/s]Epoch 1: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.11]Epoch 1: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.119]Epoch 1: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.12] Epoch 1: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.118]Epoch 1: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.118]Epoch 1: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.118]Epoch 1: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.122]Epoch 1: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.113]Epoch 1: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.116]Epoch 1: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.117]Epoch 1: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.114]Epoch 1: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.111]Epoch 1: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.123]Epoch 1: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.115]Epoch 1: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.107]Epoch 1: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.119]Epoch 1: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.116]Epoch 1: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.114]Epoch 1: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.116]Epoch 1: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.114]Epoch 1: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.112]Epoch 1: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.112]Epoch 1: 0%| 
| 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.11] Epoch 1: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.117]Epoch 1: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.115]Epoch 1: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.111]Epoch 1: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.118]Epoch 1: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.112]Epoch 1: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.111]Epoch 1: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.11] Epoch 1: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.108] 0%| | 0/31 [00:00<?, ?batch/s]Epoch 2: 0%| | 0/31 [00:00<?, ?batch/s]Epoch 2: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.107]Epoch 2: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.108]Epoch 2: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.111]Epoch 2: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.112]Epoch 2: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.117]Epoch 2: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.111]Epoch 2: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.11] Epoch 2: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.109]Epoch 2: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.111]Epoch 2: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.104]Epoch 2: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.105]Epoch 2: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.111]Epoch 2: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.103]Epoch 2: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.112]Epoch 2: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.107]Epoch 2: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.105]Epoch 2: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.103]Epoch 2: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.111]Epoch 2: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.115]Epoch 2: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.109]Epoch 2: 0%| | 0/31 [00:00<?, ?batch/s, 
train/train_loss_1=0.113]Epoch 2: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.115]Epoch 2: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.111]Epoch 2: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.11] Epoch 2: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.102]Epoch 2: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.105]Epoch 2: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.11] Epoch 2: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.107]Epoch 2: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.108]Epoch 2: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0993]Epoch 2: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.119] 0%| | 0/31 [00:00<?, ?batch/s]Epoch 3: 0%| | 0/31 [00:00<?, ?batch/s]Epoch 3: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.11]Epoch 3: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.111]Epoch 3: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.116]Epoch 3: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.107]Epoch 3: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.108]Epoch 3: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.113]Epoch 3: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.106]Epoch 3: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.108]Epoch 3: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.104]Epoch 3: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.102]Epoch 3: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.103]Epoch 3: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.101]Epoch 3: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.11] Epoch 3: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.109]Epoch 3: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.108]Epoch 3: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.111]Epoch 3: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.107]Epoch 3: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.1] Epoch 3: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.107]Epoch 3: 0%| | 0/31 
[00:00<?, ?batch/s, train/train_loss_1=0.0995]Epoch 3: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.113] Epoch 3: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.094]Epoch 3: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.107]Epoch 3: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.109]Epoch 3: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.1] Epoch 3: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0928]Epoch 3: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.117] Epoch 3: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.104]Epoch 3: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.108]Epoch 3: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.103]Epoch 3: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0999] 0%| | 0/31 [00:00<?, ?batch/s]Epoch 4: 0%| | 0/31 [00:00<?, ?batch/s]Epoch 4: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.104]Epoch 4: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.102]Epoch 4: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.111]Epoch 4: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.102]Epoch 4: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0997]Epoch 4: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0994]Epoch 4: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0987]Epoch 4: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.105] Epoch 4: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.101]Epoch 4: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.104]Epoch 4: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.105]Epoch 4: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.103]Epoch 4: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.107]Epoch 4: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.116]Epoch 4: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.108]Epoch 4: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.107]Epoch 4: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.109]Epoch 4: 0%| | 0/31 [00:00<?, ?batch/s, 
train/train_loss_1=0.112]Epoch 4: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0938]Epoch 4: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.103] Epoch 4: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.118]Epoch 4: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.115]Epoch 4: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.105]Epoch 4: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.103]Epoch 4: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.097]Epoch 4: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0969]Epoch 4: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0968]Epoch 4: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0984]Epoch 4: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.11] Epoch 4: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0958]Epoch 4: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.102] 0%| | 0/31 [00:00<?, ?batch/s]Epoch 5: 0%| | 0/31 [00:00<?, ?batch/s]Epoch 5: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0953]Epoch 5: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0958]Epoch 5: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0985]Epoch 5: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.104] Epoch 5: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0914]Epoch 5: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0965]Epoch 5: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.106] Epoch 5: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.104]Epoch 5: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.107]Epoch 5: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.111]Epoch 5: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0998]Epoch 5: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.105] Epoch 5: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.105]Epoch 5: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0957]Epoch 5: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.11] Epoch 5: 0%| | 0/31 [00:00<?, ?batch/s, 
train/train_loss_1=0.105]Epoch 5: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.101]Epoch 5: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0964]Epoch 5: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0983]Epoch 5: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0954]Epoch 5: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0992]Epoch 5: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.1] Epoch 5: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.09]Epoch 5: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0922]Epoch 5: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.113] Epoch 5: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0991]Epoch 5: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.113] Epoch 5: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.115]Epoch 5: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.104]Epoch 5: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0909]Epoch 5: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0972] 0%| | 0/31 [00:00<?, ?batch/s]Epoch 6: 0%| | 0/31 [00:00<?, ?batch/s]Epoch 6: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.11]Epoch 6: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.107]Epoch 6: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0982]Epoch 6: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.115] Epoch 6: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.102]Epoch 6: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0972]Epoch 6: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.1] Epoch 6: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.108]Epoch 6: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.105]Epoch 6: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.103]Epoch 6: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0939]Epoch 6: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0989]Epoch 6: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0984]Epoch 6: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.116] Epoch 
6: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.106]Epoch 6: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.106]Epoch 6: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0988]Epoch 6: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0963]Epoch 6: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.109] Epoch 6: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0974]Epoch 6: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.1] Epoch 6: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.105]Epoch 6: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0963]Epoch 6: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.096] Epoch 6: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0862]Epoch 6: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0966]Epoch 6: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.101] Epoch 6: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0985]Epoch 6: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0978]Epoch 6: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.111] Epoch 6: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0908] 0%| | 0/31 [00:00<?, ?batch/s]Epoch 7: 0%| | 0/31 [00:00<?, ?batch/s]Epoch 7: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.106]Epoch 7: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.103]Epoch 7: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0964]Epoch 7: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.102] Epoch 7: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.092]Epoch 7: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.095]Epoch 7: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0998]Epoch 7: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.101] Epoch 7: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.104]Epoch 7: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.107]Epoch 7: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.101]Epoch 7: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.105]Epoch 7: 0%| | 0/31 [00:00<?, 
?batch/s, train/train_loss_1=0.101]Epoch 7: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.104]Epoch 7: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0955]Epoch 7: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.112] Epoch 7: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0941]Epoch 7: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.112] Epoch 7: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.107]Epoch 7: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.093]Epoch 7: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.101]Epoch 7: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0993]Epoch 7: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.094] Epoch 7: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0942]Epoch 7: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0985]Epoch 7: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.098] Epoch 7: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0994]Epoch 7: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.103] Epoch 7: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.102]Epoch 7: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.105]Epoch 7: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.107] 0%| | 0/31 [00:00<?, ?batch/s]Epoch 8: 0%| | 0/31 [00:00<?, ?batch/s]Epoch 8: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0951]Epoch 8: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.109] Epoch 8: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.106]Epoch 8: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.102]Epoch 8: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0972]Epoch 8: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.104] Epoch 8: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0905]Epoch 8: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.1] Epoch 8: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.111]Epoch 8: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0923]Epoch 8: 0%| | 0/31 [00:00<?, ?batch/s, 
train/train_loss_1=0.0951]Epoch 8: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.102] Epoch 8: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.104]Epoch 8: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.098]Epoch 8: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0967]Epoch 8: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0928]Epoch 8: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.106] Epoch 8: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0998]Epoch 8: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.105] Epoch 8: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.103]Epoch 8: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.106]Epoch 8: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0942]Epoch 8: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.104] Epoch 8: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0968]Epoch 8: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0944]Epoch 8: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.1] Epoch 8: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.102]Epoch 8: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.102]Epoch 8: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.102]Epoch 8: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0937]Epoch 8: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0958] 0%| | 0/31 [00:00<?, ?batch/s]Epoch 9: 0%| | 0/31 [00:00<?, ?batch/s]Epoch 9: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0943]Epoch 9: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0994]Epoch 9: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0973]Epoch 9: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.101] Epoch 9: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0989]Epoch 9: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0977]Epoch 9: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0948]Epoch 9: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0991]Epoch 9: 0%| | 0/31 [00:00<?, ?batch/s, 
train/train_loss_1=0.105] Epoch 9: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.1] Epoch 9: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0989]Epoch 9: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0983]Epoch 9: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.102] Epoch 9: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.105]Epoch 9: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0912]Epoch 9: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.107] Epoch 9: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.103]Epoch 9: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0984]Epoch 9: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.111] Epoch 9: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0908]Epoch 9: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0914]Epoch 9: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0951]Epoch 9: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0961]Epoch 9: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0982]Epoch 9: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0953]Epoch 9: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0933]Epoch 9: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0836]Epoch 9: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.11] Epoch 9: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.0964]Epoch 9: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.106] Epoch 9: 0%| | 0/31 [00:00<?, ?batch/s, train/train_loss_1=0.117]Epoch 9: 100%|##########| 31/31 [00:00<00:00, 575.46batch/s, train/train_loss_1=0.117]
/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/sklearn/preprocessing/_encoders.py:868: FutureWarning: `sparse` was renamed to `sparse_output` in version 1.2 and will be removed in 1.4. `sparse_output` is ignored unless you leave `sparse` to its default value.
warnings.warn(
0%| | 0/96 [00:00<?, ?batch/s]Epoch 0: 0%| | 0/96 [00:00<?, ?batch/s]Epoch 0: 0%| | 0/96 [00:01<?, ?batch/s, train/train_loss_1=0.127]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.127]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.13] Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.127]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.121]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.127]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.123]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.129]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.126]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.127]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.125]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.123]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.125]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.124]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.122]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.123]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.12] Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.122]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.121]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.118]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.12] Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.122]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.121]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.121]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.117]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.121]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 
1.11s/batch, train/train_loss_1=0.121]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.114]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.116]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.114]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.117]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.122]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.118]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.117]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.116]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.112]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.111]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.118]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.118]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.112]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.11] Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.111]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.115]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.109]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.112]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.113]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.113]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.108]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.113]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.106]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.102]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.106]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.104]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, 
train/train_loss_1=0.102]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.108]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.105]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.0984]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.103] Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.105]Epoch 0: 1%|1 | 1/96 [00:01<01:45, 1.11s/batch, train/train_loss_1=0.0919]Epoch 0: 61%|######1 | 59/96 [00:01<00:00, 66.70batch/s, train/train_loss_1=0.0919]Epoch 0: 61%|######1 | 59/96 [00:01<00:00, 66.70batch/s, train/train_loss_1=0.0942]Epoch 0: 61%|######1 | 59/96 [00:01<00:00, 66.70batch/s, train/train_loss_1=0.0993]Epoch 0: 61%|######1 | 59/96 [00:01<00:00, 66.70batch/s, train/train_loss_1=0.0995]Epoch 0: 61%|######1 | 59/96 [00:01<00:00, 66.70batch/s, train/train_loss_1=0.101] Epoch 0: 61%|######1 | 59/96 [00:01<00:00, 66.70batch/s, train/train_loss_1=0.104]Epoch 0: 61%|######1 | 59/96 [00:01<00:00, 66.70batch/s, train/train_loss_1=0.0979]Epoch 0: 61%|######1 | 59/96 [00:01<00:00, 66.70batch/s, train/train_loss_1=0.097] Epoch 0: 61%|######1 | 59/96 [00:01<00:00, 66.70batch/s, train/train_loss_1=0.107]Epoch 0: 61%|######1 | 59/96 [00:01<00:00, 66.70batch/s, train/train_loss_1=0.0906]Epoch 0: 61%|######1 | 59/96 [00:01<00:00, 66.70batch/s, train/train_loss_1=0.0939]Epoch 0: 61%|######1 | 59/96 [00:01<00:00, 66.70batch/s, train/train_loss_1=0.0969]Epoch 0: 61%|######1 | 59/96 [00:01<00:00, 66.70batch/s, train/train_loss_1=0.0998]Epoch 0: 61%|######1 | 59/96 [00:01<00:00, 66.70batch/s, train/train_loss_1=0.0957]Epoch 0: 61%|######1 | 59/96 [00:01<00:00, 66.70batch/s, train/train_loss_1=0.094] Epoch 0: 61%|######1 | 59/96 [00:01<00:00, 66.70batch/s, train/train_loss_1=0.0956]Epoch 0: 61%|######1 | 59/96 [00:01<00:00, 66.70batch/s, train/train_loss_1=0.0905]Epoch 0: 61%|######1 | 59/96 [00:01<00:00, 66.70batch/s, train/train_loss_1=0.0938]Epoch 0: 61%|######1 | 59/96 
[00:01<00:00, 66.70batch/s, train/train_loss_1=0.0971]Epoch 0: 61%|######1 | 59/96 [00:01<00:00, 66.70batch/s, train/train_loss_1=0.0882]Epoch 0: 61%|######1 | 59/96 [00:01<00:00, 66.70batch/s, train/train_loss_1=0.0867]Epoch 0: 61%|######1 | 59/96 [00:01<00:00, 66.70batch/s, train/train_loss_1=0.0958]Epoch 0: 61%|######1 | 59/96 [00:01<00:00, 66.70batch/s, train/train_loss_1=0.0905]Epoch 0: 61%|######1 | 59/96 [00:01<00:00, 66.70batch/s, train/train_loss_1=0.0864]Epoch 0: 61%|######1 | 59/96 [00:01<00:00, 66.70batch/s, train/train_loss_1=0.0788]Epoch 0: 61%|######1 | 59/96 [00:01<00:00, 66.70batch/s, train/train_loss_1=0.0858]Epoch 0: 61%|######1 | 59/96 [00:01<00:00, 66.70batch/s, train/train_loss_1=0.0863]Epoch 0: 61%|######1 | 59/96 [00:01<00:00, 66.70batch/s, train/train_loss_1=0.0868]Epoch 0: 61%|######1 | 59/96 [00:01<00:00, 66.70batch/s, train/train_loss_1=0.0837]Epoch 0: 61%|######1 | 59/96 [00:01<00:00, 66.70batch/s, train/train_loss_1=0.0874]Epoch 0: 61%|######1 | 59/96 [00:01<00:00, 66.70batch/s, train/train_loss_1=0.0812]Epoch 0: 61%|######1 | 59/96 [00:01<00:00, 66.70batch/s, train/train_loss_1=0.0903]Epoch 0: 61%|######1 | 59/96 [00:01<00:00, 66.70batch/s, train/train_loss_1=0.0885]Epoch 0: 61%|######1 | 59/96 [00:01<00:00, 66.70batch/s, train/train_loss_1=0.0849]Epoch 0: 61%|######1 | 59/96 [00:01<00:00, 66.70batch/s, train/train_loss_1=0.084] Epoch 0: 61%|######1 | 59/96 [00:01<00:00, 66.70batch/s, train/train_loss_1=0.0843]Epoch 0: 61%|######1 | 59/96 [00:01<00:00, 66.70batch/s, train/train_loss_1=0.0761]Epoch 0: 61%|######1 | 59/96 [00:02<00:00, 66.70batch/s, train/train_loss_1=0.0858]Epoch 0: 100%|##########| 96/96 [00:02<00:00, 38.35batch/s, train/train_loss_1=0.0858] 0%| | 0/96 [00:00<?, ?batch/s]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0858]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0778]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0773]Epoch 1: 0%| | 
0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0859]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0856]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0758]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0832]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0816]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0794]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0814]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0783]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0769]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0699]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.078] Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0769]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0717]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0726]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0706]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0695]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0655]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0678]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0726]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0665]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0837]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0727]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0687]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0904]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0675]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0661]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0741]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0708]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0734]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0639]Epoch 1: 0%| | 0/96 [00:00<?, 
?batch/s, train/train_loss_1=0.0654]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0589]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0645]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0721]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0645]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0711]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0651]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0695]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.067] Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0622]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0639]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0685]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0673]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.069] Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0524]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0563]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0652]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0657]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0523]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0607]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.067] Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0553]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0598]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.061] Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0676]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0635]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0637]Epoch 1: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0611]Epoch 1: 64%|######3 | 61/96 [00:00<00:00, 608.37batch/s, train/train_loss_1=0.0611]Epoch 1: 64%|######3 | 61/96 [00:00<00:00, 608.37batch/s, train/train_loss_1=0.0595]Epoch 1: 
64%|######3 | 61/96 [00:00<00:00, 608.37batch/s, train/train_loss_1=0.057] Epoch 1: 64%|######3 | 61/96 [00:00<00:00, 608.37batch/s, train/train_loss_1=0.053]Epoch 1: 64%|######3 | 61/96 [00:00<00:00, 608.37batch/s, train/train_loss_1=0.0665]Epoch 1: 64%|######3 | 61/96 [00:00<00:00, 608.37batch/s, train/train_loss_1=0.0654]Epoch 1: 64%|######3 | 61/96 [00:00<00:00, 608.37batch/s, train/train_loss_1=0.0628]Epoch 1: 64%|######3 | 61/96 [00:00<00:00, 608.37batch/s, train/train_loss_1=0.0606]Epoch 1: 64%|######3 | 61/96 [00:00<00:00, 608.37batch/s, train/train_loss_1=0.0578]Epoch 1: 64%|######3 | 61/96 [00:00<00:00, 608.37batch/s, train/train_loss_1=0.0568]Epoch 1: 64%|######3 | 61/96 [00:00<00:00, 608.37batch/s, train/train_loss_1=0.0585]Epoch 1: 64%|######3 | 61/96 [00:00<00:00, 608.37batch/s, train/train_loss_1=0.0657]Epoch 1: 64%|######3 | 61/96 [00:00<00:00, 608.37batch/s, train/train_loss_1=0.0535]Epoch 1: 64%|######3 | 61/96 [00:00<00:00, 608.37batch/s, train/train_loss_1=0.0546]Epoch 1: 64%|######3 | 61/96 [00:00<00:00, 608.37batch/s, train/train_loss_1=0.0536]Epoch 1: 64%|######3 | 61/96 [00:00<00:00, 608.37batch/s, train/train_loss_1=0.056] Epoch 1: 64%|######3 | 61/96 [00:00<00:00, 608.37batch/s, train/train_loss_1=0.0613]Epoch 1: 64%|######3 | 61/96 [00:00<00:00, 608.37batch/s, train/train_loss_1=0.0446]Epoch 1: 64%|######3 | 61/96 [00:00<00:00, 608.37batch/s, train/train_loss_1=0.0571]Epoch 1: 64%|######3 | 61/96 [00:00<00:00, 608.37batch/s, train/train_loss_1=0.0523]Epoch 1: 64%|######3 | 61/96 [00:00<00:00, 608.37batch/s, train/train_loss_1=0.0551]Epoch 1: 64%|######3 | 61/96 [00:00<00:00, 608.37batch/s, train/train_loss_1=0.0597]Epoch 1: 64%|######3 | 61/96 [00:00<00:00, 608.37batch/s, train/train_loss_1=0.0567]Epoch 1: 64%|######3 | 61/96 [00:00<00:00, 608.37batch/s, train/train_loss_1=0.0552]Epoch 1: 64%|######3 | 61/96 [00:00<00:00, 608.37batch/s, train/train_loss_1=0.0444]Epoch 1: 64%|######3 | 61/96 [00:00<00:00, 608.37batch/s, 
train/train_loss_1=0.0373]Epoch 1: 64%|######3 | 61/96 [00:00<00:00, 608.37batch/s, train/train_loss_1=0.0521]Epoch 1: 64%|######3 | 61/96 [00:00<00:00, 608.37batch/s, train/train_loss_1=0.0478]Epoch 1: 64%|######3 | 61/96 [00:00<00:00, 608.37batch/s, train/train_loss_1=0.0579]Epoch 1: 64%|######3 | 61/96 [00:00<00:00, 608.37batch/s, train/train_loss_1=0.0629]Epoch 1: 64%|######3 | 61/96 [00:00<00:00, 608.37batch/s, train/train_loss_1=0.0578]Epoch 1: 64%|######3 | 61/96 [00:00<00:00, 608.37batch/s, train/train_loss_1=0.0552]Epoch 1: 64%|######3 | 61/96 [00:00<00:00, 608.37batch/s, train/train_loss_1=0.0669]Epoch 1: 64%|######3 | 61/96 [00:00<00:00, 608.37batch/s, train/train_loss_1=0.0598]Epoch 1: 64%|######3 | 61/96 [00:00<00:00, 608.37batch/s, train/train_loss_1=0.0551]Epoch 1: 64%|######3 | 61/96 [00:00<00:00, 608.37batch/s, train/train_loss_1=0.0502] 0%| | 0/96 [00:00<?, ?batch/s]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0573]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0523]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0542]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0511]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0546]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0557]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0494]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0423]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0532]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0411]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.047] Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0492]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0462]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0577]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0523]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.043] Epoch 
2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.046]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0482]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0634]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0584]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0499]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0519]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.055] Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0357]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0471]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0399]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0523]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0592]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0503]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0542]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0551]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0471]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0439]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0449]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.043] Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0486]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0519]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0388]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.048] Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0473]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0477]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0551]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0554]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0593]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.045] Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0466]Epoch 2: 0%| | 0/96 
[00:00<?, ?batch/s, train/train_loss_1=0.0433]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.039] Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.04] Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0501]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0417]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0491]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0473]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.043] Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0412]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0615]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0531]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0521]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0327]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0446]Epoch 2: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0549]Epoch 2: 64%|######3 | 61/96 [00:00<00:00, 603.58batch/s, train/train_loss_1=0.0549]Epoch 2: 64%|######3 | 61/96 [00:00<00:00, 603.58batch/s, train/train_loss_1=0.0404]Epoch 2: 64%|######3 | 61/96 [00:00<00:00, 603.58batch/s, train/train_loss_1=0.0565]Epoch 2: 64%|######3 | 61/96 [00:00<00:00, 603.58batch/s, train/train_loss_1=0.0519]Epoch 2: 64%|######3 | 61/96 [00:00<00:00, 603.58batch/s, train/train_loss_1=0.0471]Epoch 2: 64%|######3 | 61/96 [00:00<00:00, 603.58batch/s, train/train_loss_1=0.0496]Epoch 2: 64%|######3 | 61/96 [00:00<00:00, 603.58batch/s, train/train_loss_1=0.0441]Epoch 2: 64%|######3 | 61/96 [00:00<00:00, 603.58batch/s, train/train_loss_1=0.0511]Epoch 2: 64%|######3 | 61/96 [00:00<00:00, 603.58batch/s, train/train_loss_1=0.0389]Epoch 2: 64%|######3 | 61/96 [00:00<00:00, 603.58batch/s, train/train_loss_1=0.0479]Epoch 2: 64%|######3 | 61/96 [00:00<00:00, 603.58batch/s, train/train_loss_1=0.0466]Epoch 2: 64%|######3 | 61/96 [00:00<00:00, 603.58batch/s, train/train_loss_1=0.0605]Epoch 2: 64%|######3 | 
61/96 [00:00<00:00, 603.58batch/s, train/train_loss_1=0.0543]Epoch 2: 64%|######3 | 61/96 [00:00<00:00, 603.58batch/s, train/train_loss_1=0.0508]Epoch 2: 64%|######3 | 61/96 [00:00<00:00, 603.58batch/s, train/train_loss_1=0.0519]Epoch 2: 64%|######3 | 61/96 [00:00<00:00, 603.58batch/s, train/train_loss_1=0.0421]Epoch 2: 64%|######3 | 61/96 [00:00<00:00, 603.58batch/s, train/train_loss_1=0.0462]Epoch 2: 64%|######3 | 61/96 [00:00<00:00, 603.58batch/s, train/train_loss_1=0.0538]Epoch 2: 64%|######3 | 61/96 [00:00<00:00, 603.58batch/s, train/train_loss_1=0.0375]Epoch 2: 64%|######3 | 61/96 [00:00<00:00, 603.58batch/s, train/train_loss_1=0.039] Epoch 2: 64%|######3 | 61/96 [00:00<00:00, 603.58batch/s, train/train_loss_1=0.0453]Epoch 2: 64%|######3 | 61/96 [00:00<00:00, 603.58batch/s, train/train_loss_1=0.0376]Epoch 2: 64%|######3 | 61/96 [00:00<00:00, 603.58batch/s, train/train_loss_1=0.0459]Epoch 2: 64%|######3 | 61/96 [00:00<00:00, 603.58batch/s, train/train_loss_1=0.0572]Epoch 2: 64%|######3 | 61/96 [00:00<00:00, 603.58batch/s, train/train_loss_1=0.0394]Epoch 2: 64%|######3 | 61/96 [00:00<00:00, 603.58batch/s, train/train_loss_1=0.0416]Epoch 2: 64%|######3 | 61/96 [00:00<00:00, 603.58batch/s, train/train_loss_1=0.0435]Epoch 2: 64%|######3 | 61/96 [00:00<00:00, 603.58batch/s, train/train_loss_1=0.0397]Epoch 2: 64%|######3 | 61/96 [00:00<00:00, 603.58batch/s, train/train_loss_1=0.0379]Epoch 2: 64%|######3 | 61/96 [00:00<00:00, 603.58batch/s, train/train_loss_1=0.0517]Epoch 2: 64%|######3 | 61/96 [00:00<00:00, 603.58batch/s, train/train_loss_1=0.0353]Epoch 2: 64%|######3 | 61/96 [00:00<00:00, 603.58batch/s, train/train_loss_1=0.0477]Epoch 2: 64%|######3 | 61/96 [00:00<00:00, 603.58batch/s, train/train_loss_1=0.0515]Epoch 2: 64%|######3 | 61/96 [00:00<00:00, 603.58batch/s, train/train_loss_1=0.04] Epoch 2: 64%|######3 | 61/96 [00:00<00:00, 603.58batch/s, train/train_loss_1=0.0474]Epoch 2: 64%|######3 | 61/96 [00:00<00:00, 603.58batch/s, train/train_loss_1=0.0403] 0%| | 
0/96 [00:00<?, ?batch/s]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0438]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.038] Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.037]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.05] Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0497]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0458]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0295]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.044] Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0405]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0381]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0456]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0467]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0358]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0484]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0395]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0421]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0435]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.041] Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0476]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0406]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0432]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0409]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0336]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0395]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0373]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0461]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0383]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0474]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0424]Epoch 3: 0%| | 0/96 
[00:00<?, ?batch/s, train/train_loss_1=0.0389]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0496]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0391]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0374]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0542]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0537]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0308]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0402]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0369]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0429]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0442]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.046] Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0293]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0488]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.036] Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0468]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0325]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0408]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0467]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0419]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0335]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0536]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0362]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0403]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0448]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0317]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0362]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0426]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0416]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0441]Epoch 3: 0%| | 0/96 [00:00<?, ?batch/s, 
train/train_loss_1=0.0385]Epoch 3: 62%|######2 | 60/96 [00:00<00:00, 593.03batch/s, train/train_loss_1=0.0385]Epoch 3: 62%|######2 | 60/96 [00:00<00:00, 593.03batch/s, train/train_loss_1=0.0387]Epoch 3: 62%|######2 | 60/96 [00:00<00:00, 593.03batch/s, train/train_loss_1=0.0347]Epoch 3: 62%|######2 | 60/96 [00:00<00:00, 593.03batch/s, train/train_loss_1=0.0498]Epoch 3: 62%|######2 | 60/96 [00:00<00:00, 593.03batch/s, train/train_loss_1=0.0417]Epoch 3: 62%|######2 | 60/96 [00:00<00:00, 593.03batch/s, train/train_loss_1=0.0334]Epoch 3: 62%|######2 | 60/96 [00:00<00:00, 593.03batch/s, train/train_loss_1=0.0429]Epoch 3: 62%|######2 | 60/96 [00:00<00:00, 593.03batch/s, train/train_loss_1=0.0286]Epoch 3: 62%|######2 | 60/96 [00:00<00:00, 593.03batch/s, train/train_loss_1=0.0469]Epoch 3: 62%|######2 | 60/96 [00:00<00:00, 593.03batch/s, train/train_loss_1=0.0331]Epoch 3: 62%|######2 | 60/96 [00:00<00:00, 593.03batch/s, train/train_loss_1=0.0433]Epoch 3: 62%|######2 | 60/96 [00:00<00:00, 593.03batch/s, train/train_loss_1=0.0405]Epoch 3: 62%|######2 | 60/96 [00:00<00:00, 593.03batch/s, train/train_loss_1=0.0415]Epoch 3: 62%|######2 | 60/96 [00:00<00:00, 593.03batch/s, train/train_loss_1=0.0391]Epoch 3: 62%|######2 | 60/96 [00:00<00:00, 593.03batch/s, train/train_loss_1=0.035] Epoch 3: 62%|######2 | 60/96 [00:00<00:00, 593.03batch/s, train/train_loss_1=0.0395]Epoch 3: 62%|######2 | 60/96 [00:00<00:00, 593.03batch/s, train/train_loss_1=0.0387]Epoch 3: 62%|######2 | 60/96 [00:00<00:00, 593.03batch/s, train/train_loss_1=0.0389]Epoch 3: 62%|######2 | 60/96 [00:00<00:00, 593.03batch/s, train/train_loss_1=0.0401]Epoch 3: 62%|######2 | 60/96 [00:00<00:00, 593.03batch/s, train/train_loss_1=0.0334]Epoch 3: 62%|######2 | 60/96 [00:00<00:00, 593.03batch/s, train/train_loss_1=0.0395]Epoch 3: 62%|######2 | 60/96 [00:00<00:00, 593.03batch/s, train/train_loss_1=0.034] Epoch 3: 62%|######2 | 60/96 [00:00<00:00, 593.03batch/s, train/train_loss_1=0.0418]Epoch 3: 62%|######2 | 60/96 
[00:00<00:00, 593.03batch/s, train/train_loss_1=0.0319]Epoch 3: 62%|######2 | 60/96 [00:00<00:00, 593.03batch/s, train/train_loss_1=0.0353]Epoch 3: 62%|######2 | 60/96 [00:00<00:00, 593.03batch/s, train/train_loss_1=0.0451]Epoch 3: 62%|######2 | 60/96 [00:00<00:00, 593.03batch/s, train/train_loss_1=0.0421]Epoch 3: 62%|######2 | 60/96 [00:00<00:00, 593.03batch/s, train/train_loss_1=0.0418]Epoch 3: 62%|######2 | 60/96 [00:00<00:00, 593.03batch/s, train/train_loss_1=0.0401]Epoch 3: 62%|######2 | 60/96 [00:00<00:00, 593.03batch/s, train/train_loss_1=0.0476]Epoch 3: 62%|######2 | 60/96 [00:00<00:00, 593.03batch/s, train/train_loss_1=0.043] Epoch 3: 62%|######2 | 60/96 [00:00<00:00, 593.03batch/s, train/train_loss_1=0.0313]Epoch 3: 62%|######2 | 60/96 [00:00<00:00, 593.03batch/s, train/train_loss_1=0.0336]Epoch 3: 62%|######2 | 60/96 [00:00<00:00, 593.03batch/s, train/train_loss_1=0.0449]Epoch 3: 62%|######2 | 60/96 [00:00<00:00, 593.03batch/s, train/train_loss_1=0.0382]Epoch 3: 62%|######2 | 60/96 [00:00<00:00, 593.03batch/s, train/train_loss_1=0.0334]Epoch 3: 62%|######2 | 60/96 [00:00<00:00, 593.03batch/s, train/train_loss_1=0.0291] 0%| | 0/96 [00:00<?, ?batch/s]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0311]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0365]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0356]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0397]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0441]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0399]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0397]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0429]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0498]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0328]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0275]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, 
train/train_loss_1=0.0413]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0416]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0361]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0521]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0338]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0486]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0401]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0404]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0398]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0473]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0302]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.037] Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0449]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0388]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0517]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0319]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0421]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.033] Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0421]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0359]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0403]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0301]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0453]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0371]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0351]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0386]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0399]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0399]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.043] Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0279]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, 
train/train_loss_1=0.0422]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0442]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0417]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0321]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0397]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0321]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.045] Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0434]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0314]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0357]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0471]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0504]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0279]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0341]Epoch 4: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0406]Epoch 4: 58%|#####8 | 56/96 [00:00<00:00, 558.51batch/s, train/train_loss_1=0.0406]Epoch 4: 58%|#####8 | 56/96 [00:00<00:00, 558.51batch/s, train/train_loss_1=0.0349]Epoch 4: 58%|#####8 | 56/96 [00:00<00:00, 558.51batch/s, train/train_loss_1=0.0319]Epoch 4: 58%|#####8 | 56/96 [00:00<00:00, 558.51batch/s, train/train_loss_1=0.0499]Epoch 4: 58%|#####8 | 56/96 [00:00<00:00, 558.51batch/s, train/train_loss_1=0.036] Epoch 4: 58%|#####8 | 56/96 [00:00<00:00, 558.51batch/s, train/train_loss_1=0.0346]Epoch 4: 58%|#####8 | 56/96 [00:00<00:00, 558.51batch/s, train/train_loss_1=0.0392]Epoch 4: 58%|#####8 | 56/96 [00:00<00:00, 558.51batch/s, train/train_loss_1=0.0318]Epoch 4: 58%|#####8 | 56/96 [00:00<00:00, 558.51batch/s, train/train_loss_1=0.0373]Epoch 4: 58%|#####8 | 56/96 [00:00<00:00, 558.51batch/s, train/train_loss_1=0.0297]Epoch 4: 58%|#####8 | 56/96 [00:00<00:00, 558.51batch/s, train/train_loss_1=0.0306]Epoch 4: 58%|#####8 | 56/96 [00:00<00:00, 558.51batch/s, train/train_loss_1=0.044] Epoch 4: 58%|#####8 | 56/96 [00:00<00:00, 
558.51batch/s, train/train_loss_1=0.0307]Epoch 4: 58%|#####8 | 56/96 [00:00<00:00, 558.51batch/s, train/train_loss_1=0.0307]Epoch 4: 58%|#####8 | 56/96 [00:00<00:00, 558.51batch/s, train/train_loss_1=0.042] Epoch 4: 58%|#####8 | 56/96 [00:00<00:00, 558.51batch/s, train/train_loss_1=0.0285]Epoch 4: 58%|#####8 | 56/96 [00:00<00:00, 558.51batch/s, train/train_loss_1=0.0376]Epoch 4: 58%|#####8 | 56/96 [00:00<00:00, 558.51batch/s, train/train_loss_1=0.0336]Epoch 4: 58%|#####8 | 56/96 [00:00<00:00, 558.51batch/s, train/train_loss_1=0.0351]Epoch 4: 58%|#####8 | 56/96 [00:00<00:00, 558.51batch/s, train/train_loss_1=0.0316]Epoch 4: 58%|#####8 | 56/96 [00:00<00:00, 558.51batch/s, train/train_loss_1=0.0283]Epoch 4: 58%|#####8 | 56/96 [00:00<00:00, 558.51batch/s, train/train_loss_1=0.03] Epoch 4: 58%|#####8 | 56/96 [00:00<00:00, 558.51batch/s, train/train_loss_1=0.0389]Epoch 4: 58%|#####8 | 56/96 [00:00<00:00, 558.51batch/s, train/train_loss_1=0.0392]Epoch 4: 58%|#####8 | 56/96 [00:00<00:00, 558.51batch/s, train/train_loss_1=0.0374]Epoch 4: 58%|#####8 | 56/96 [00:00<00:00, 558.51batch/s, train/train_loss_1=0.0288]Epoch 4: 58%|#####8 | 56/96 [00:00<00:00, 558.51batch/s, train/train_loss_1=0.0313]Epoch 4: 58%|#####8 | 56/96 [00:00<00:00, 558.51batch/s, train/train_loss_1=0.033] Epoch 4: 58%|#####8 | 56/96 [00:00<00:00, 558.51batch/s, train/train_loss_1=0.0383]Epoch 4: 58%|#####8 | 56/96 [00:00<00:00, 558.51batch/s, train/train_loss_1=0.0277]Epoch 4: 58%|#####8 | 56/96 [00:00<00:00, 558.51batch/s, train/train_loss_1=0.0327]Epoch 4: 58%|#####8 | 56/96 [00:00<00:00, 558.51batch/s, train/train_loss_1=0.0387]Epoch 4: 58%|#####8 | 56/96 [00:00<00:00, 558.51batch/s, train/train_loss_1=0.0364]Epoch 4: 58%|#####8 | 56/96 [00:00<00:00, 558.51batch/s, train/train_loss_1=0.0402]Epoch 4: 58%|#####8 | 56/96 [00:00<00:00, 558.51batch/s, train/train_loss_1=0.0278]Epoch 4: 58%|#####8 | 56/96 [00:00<00:00, 558.51batch/s, train/train_loss_1=0.0393]Epoch 4: 58%|#####8 | 56/96 [00:00<00:00, 
558.51batch/s, train/train_loss_1=0.0332]Epoch 4: 58%|#####8 | 56/96 [00:00<00:00, 558.51batch/s, train/train_loss_1=0.035] Epoch 4: 58%|#####8 | 56/96 [00:00<00:00, 558.51batch/s, train/train_loss_1=0.0344]Epoch 4: 58%|#####8 | 56/96 [00:00<00:00, 558.51batch/s, train/train_loss_1=0.0292]Epoch 4: 58%|#####8 | 56/96 [00:00<00:00, 558.51batch/s, train/train_loss_1=0.0464] 0%| | 0/96 [00:00<?, ?batch/s]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0364]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0344]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0344]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0305]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0273]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0311]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0242]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0479]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0382]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0314]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0378]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.024] Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0426]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0277]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0379]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0416]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0469]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0365]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.04] Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0319]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0326]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0265]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0461]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, 
train/train_loss_1=0.0417]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0241]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0355]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0319]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0275]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0223]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.041] Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0279]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0405]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0337]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0285]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0456]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0335]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0258]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0326]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0321]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0365]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0346]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0383]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0415]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0247]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0363]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0363]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.037] Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0307]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0455]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0468]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0311]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0281]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0381]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, 
train/train_loss_1=0.0489]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0251]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0294]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0288]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0332]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0403]Epoch 5: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0346]Epoch 5: 62%|######2 | 60/96 [00:00<00:00, 597.18batch/s, train/train_loss_1=0.0346]Epoch 5: 62%|######2 | 60/96 [00:00<00:00, 597.18batch/s, train/train_loss_1=0.0469]Epoch 5: 62%|######2 | 60/96 [00:00<00:00, 597.18batch/s, train/train_loss_1=0.0407]Epoch 5: 62%|######2 | 60/96 [00:00<00:00, 597.18batch/s, train/train_loss_1=0.0411]Epoch 5: 62%|######2 | 60/96 [00:00<00:00, 597.18batch/s, train/train_loss_1=0.0332]Epoch 5: 62%|######2 | 60/96 [00:00<00:00, 597.18batch/s, train/train_loss_1=0.0316]Epoch 5: 62%|######2 | 60/96 [00:00<00:00, 597.18batch/s, train/train_loss_1=0.0282]Epoch 5: 62%|######2 | 60/96 [00:00<00:00, 597.18batch/s, train/train_loss_1=0.0315]Epoch 5: 62%|######2 | 60/96 [00:00<00:00, 597.18batch/s, train/train_loss_1=0.0383]Epoch 5: 62%|######2 | 60/96 [00:00<00:00, 597.18batch/s, train/train_loss_1=0.0368]Epoch 5: 62%|######2 | 60/96 [00:00<00:00, 597.18batch/s, train/train_loss_1=0.0316]Epoch 5: 62%|######2 | 60/96 [00:00<00:00, 597.18batch/s, train/train_loss_1=0.024] Epoch 5: 62%|######2 | 60/96 [00:00<00:00, 597.18batch/s, train/train_loss_1=0.0353]Epoch 5: 62%|######2 | 60/96 [00:00<00:00, 597.18batch/s, train/train_loss_1=0.046] Epoch 5: 62%|######2 | 60/96 [00:00<00:00, 597.18batch/s, train/train_loss_1=0.0437]Epoch 5: 62%|######2 | 60/96 [00:00<00:00, 597.18batch/s, train/train_loss_1=0.0357]Epoch 5: 62%|######2 | 60/96 [00:00<00:00, 597.18batch/s, train/train_loss_1=0.0256]Epoch 5: 62%|######2 | 60/96 [00:00<00:00, 597.18batch/s, train/train_loss_1=0.0278]Epoch 5: 62%|######2 | 60/96 [00:00<00:00, 597.18batch/s, 
train/train_loss_1=0.0289]Epoch 5: 62%|######2 | 60/96 [00:00<00:00, 597.18batch/s, train/train_loss_1=0.0354]Epoch 5: 62%|######2 | 60/96 [00:00<00:00, 597.18batch/s, train/train_loss_1=0.0404]Epoch 5: 62%|######2 | 60/96 [00:00<00:00, 597.18batch/s, train/train_loss_1=0.0272]Epoch 5: 62%|######2 | 60/96 [00:00<00:00, 597.18batch/s, train/train_loss_1=0.0295]Epoch 5: 62%|######2 | 60/96 [00:00<00:00, 597.18batch/s, train/train_loss_1=0.0386]Epoch 5: 62%|######2 | 60/96 [00:00<00:00, 597.18batch/s, train/train_loss_1=0.0375]Epoch 5: 62%|######2 | 60/96 [00:00<00:00, 597.18batch/s, train/train_loss_1=0.0353]Epoch 5: 62%|######2 | 60/96 [00:00<00:00, 597.18batch/s, train/train_loss_1=0.0274]Epoch 5: 62%|######2 | 60/96 [00:00<00:00, 597.18batch/s, train/train_loss_1=0.0365]Epoch 5: 62%|######2 | 60/96 [00:00<00:00, 597.18batch/s, train/train_loss_1=0.0288]Epoch 5: 62%|######2 | 60/96 [00:00<00:00, 597.18batch/s, train/train_loss_1=0.0411]Epoch 5: 62%|######2 | 60/96 [00:00<00:00, 597.18batch/s, train/train_loss_1=0.0441]Epoch 5: 62%|######2 | 60/96 [00:00<00:00, 597.18batch/s, train/train_loss_1=0.0289]Epoch 5: 62%|######2 | 60/96 [00:00<00:00, 597.18batch/s, train/train_loss_1=0.0381]Epoch 5: 62%|######2 | 60/96 [00:00<00:00, 597.18batch/s, train/train_loss_1=0.0321]Epoch 5: 62%|######2 | 60/96 [00:00<00:00, 597.18batch/s, train/train_loss_1=0.0281]Epoch 5: 62%|######2 | 60/96 [00:00<00:00, 597.18batch/s, train/train_loss_1=0.0275]Epoch 5: 62%|######2 | 60/96 [00:00<00:00, 597.18batch/s, train/train_loss_1=0.0456] 0%| | 0/96 [00:00<?, ?batch/s]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0238]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0312]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0341]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0384]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0291]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, 
train/train_loss_1=0.0493]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0411]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0227]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0334]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0353]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.024] Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0309]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0367]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0383]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0316]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0299]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.023] Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0303]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0353]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0357]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.03] Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0374]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0316]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0399]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0266]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0262]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0438]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0334]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0414]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0294]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0334]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.038] Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.033]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0384]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0334]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, 
train/train_loss_1=0.0381]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0402]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0391]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0255]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0401]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0325]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0338]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0306]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0376]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0358]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.029] Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0301]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0345]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.04] Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0322]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.024] Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0372]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0322]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0239]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0304]Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.036] Epoch 6: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0264]Epoch 6: 59%|#####9 | 57/96 [00:00<00:00, 569.90batch/s, train/train_loss_1=0.0264]Epoch 6: 59%|#####9 | 57/96 [00:00<00:00, 569.90batch/s, train/train_loss_1=0.0229]Epoch 6: 59%|#####9 | 57/96 [00:00<00:00, 569.90batch/s, train/train_loss_1=0.0334]Epoch 6: 59%|#####9 | 57/96 [00:00<00:00, 569.90batch/s, train/train_loss_1=0.0344]Epoch 6: 59%|#####9 | 57/96 [00:00<00:00, 569.90batch/s, train/train_loss_1=0.0321]Epoch 6: 59%|#####9 | 57/96 [00:00<00:00, 569.90batch/s, train/train_loss_1=0.0287]Epoch 6: 59%|#####9 | 57/96 [00:00<00:00, 569.90batch/s, train/train_loss_1=0.0267]Epoch 
6: 59%|#####9 | 57/96 [00:00<00:00, 569.90batch/s, train/train_loss_1=0.0239]Epoch 6: 59%|#####9 | 57/96 [00:00<00:00, 569.90batch/s, train/train_loss_1=0.0285]Epoch 6: 59%|#####9 | 57/96 [00:00<00:00, 569.90batch/s, train/train_loss_1=0.0355]Epoch 6: 59%|#####9 | 57/96 [00:00<00:00, 569.90batch/s, train/train_loss_1=0.0331]Epoch 6: 59%|#####9 | 57/96 [00:00<00:00, 569.90batch/s, train/train_loss_1=0.0337]Epoch 6: 59%|#####9 | 57/96 [00:00<00:00, 569.90batch/s, train/train_loss_1=0.0305]Epoch 6: 59%|#####9 | 57/96 [00:00<00:00, 569.90batch/s, train/train_loss_1=0.0355]Epoch 6: 59%|#####9 | 57/96 [00:00<00:00, 569.90batch/s, train/train_loss_1=0.0279]Epoch 6: 59%|#####9 | 57/96 [00:00<00:00, 569.90batch/s, train/train_loss_1=0.0449]Epoch 6: 59%|#####9 | 57/96 [00:00<00:00, 569.90batch/s, train/train_loss_1=0.0334]Epoch 6: 59%|#####9 | 57/96 [00:00<00:00, 569.90batch/s, train/train_loss_1=0.0341]Epoch 6: 59%|#####9 | 57/96 [00:00<00:00, 569.90batch/s, train/train_loss_1=0.0335]Epoch 6: 59%|#####9 | 57/96 [00:00<00:00, 569.90batch/s, train/train_loss_1=0.0343]Epoch 6: 59%|#####9 | 57/96 [00:00<00:00, 569.90batch/s, train/train_loss_1=0.0388]Epoch 6: 59%|#####9 | 57/96 [00:00<00:00, 569.90batch/s, train/train_loss_1=0.0259]Epoch 6: 59%|#####9 | 57/96 [00:00<00:00, 569.90batch/s, train/train_loss_1=0.0412]Epoch 6: 59%|#####9 | 57/96 [00:00<00:00, 569.90batch/s, train/train_loss_1=0.0356]Epoch 6: 59%|#####9 | 57/96 [00:00<00:00, 569.90batch/s, train/train_loss_1=0.0244]Epoch 6: 59%|#####9 | 57/96 [00:00<00:00, 569.90batch/s, train/train_loss_1=0.0315]Epoch 6: 59%|#####9 | 57/96 [00:00<00:00, 569.90batch/s, train/train_loss_1=0.0372]Epoch 6: 59%|#####9 | 57/96 [00:00<00:00, 569.90batch/s, train/train_loss_1=0.0402]Epoch 6: 59%|#####9 | 57/96 [00:00<00:00, 569.90batch/s, train/train_loss_1=0.0305]Epoch 6: 59%|#####9 | 57/96 [00:00<00:00, 569.90batch/s, train/train_loss_1=0.0289]Epoch 6: 59%|#####9 | 57/96 [00:00<00:00, 569.90batch/s, train/train_loss_1=0.0299]Epoch 6: 
59%|#####9 | 57/96 [00:00<00:00, 569.90batch/s, train/train_loss_1=0.0433]Epoch 6: 59%|#####9 | 57/96 [00:00<00:00, 569.90batch/s, train/train_loss_1=0.0356]Epoch 6: 59%|#####9 | 57/96 [00:00<00:00, 569.90batch/s, train/train_loss_1=0.0267]Epoch 6: 59%|#####9 | 57/96 [00:00<00:00, 569.90batch/s, train/train_loss_1=0.0279]Epoch 6: 59%|#####9 | 57/96 [00:00<00:00, 569.90batch/s, train/train_loss_1=0.0329]Epoch 6: 59%|#####9 | 57/96 [00:00<00:00, 569.90batch/s, train/train_loss_1=0.0336]Epoch 6: 59%|#####9 | 57/96 [00:00<00:00, 569.90batch/s, train/train_loss_1=0.0282]Epoch 6: 59%|#####9 | 57/96 [00:00<00:00, 569.90batch/s, train/train_loss_1=0.0386]Epoch 6: 59%|#####9 | 57/96 [00:00<00:00, 569.90batch/s, train/train_loss_1=0.0487] 0%| | 0/96 [00:00<?, ?batch/s]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0308]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0297]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.028] Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0395]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0422]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0245]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0274]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0399]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0355]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0398]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.026] Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0283]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0369]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0303]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.027] Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0323]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0344]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, 
train/train_loss_1=0.0267]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0317]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0257]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0354]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0401]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0259]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0359]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0373]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.026] Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0384]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0327]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.028] Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0276]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0342]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0273]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0309]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0312]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0193]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0367]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0311]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0263]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0338]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0323]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0342]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0267]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0365]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0373]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0309]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0229]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.036] Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, 
train/train_loss_1=0.0208]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0228]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0361]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0236]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0469]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0307]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0286]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0334]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0329]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0375]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0344]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0186]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0377]Epoch 7: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0275]Epoch 7: 64%|######3 | 61/96 [00:00<00:00, 606.23batch/s, train/train_loss_1=0.0275]Epoch 7: 64%|######3 | 61/96 [00:00<00:00, 606.23batch/s, train/train_loss_1=0.0254]Epoch 7: 64%|######3 | 61/96 [00:00<00:00, 606.23batch/s, train/train_loss_1=0.0282]Epoch 7: 64%|######3 | 61/96 [00:00<00:00, 606.23batch/s, train/train_loss_1=0.0279]Epoch 7: 64%|######3 | 61/96 [00:00<00:00, 606.23batch/s, train/train_loss_1=0.0229]Epoch 7: 64%|######3 | 61/96 [00:00<00:00, 606.23batch/s, train/train_loss_1=0.0367]Epoch 7: 64%|######3 | 61/96 [00:00<00:00, 606.23batch/s, train/train_loss_1=0.0346]Epoch 7: 64%|######3 | 61/96 [00:00<00:00, 606.23batch/s, train/train_loss_1=0.0391]Epoch 7: 64%|######3 | 61/96 [00:00<00:00, 606.23batch/s, train/train_loss_1=0.0289]Epoch 7: 64%|######3 | 61/96 [00:00<00:00, 606.23batch/s, train/train_loss_1=0.0343]Epoch 7: 64%|######3 | 61/96 [00:00<00:00, 606.23batch/s, train/train_loss_1=0.0275]Epoch 7: 64%|######3 | 61/96 [00:00<00:00, 606.23batch/s, train/train_loss_1=0.0235]Epoch 7: 64%|######3 | 61/96 [00:00<00:00, 606.23batch/s, train/train_loss_1=0.0366]Epoch 7: 64%|######3 | 
61/96 [00:00<00:00, 606.23batch/s, train/train_loss_1=0.0305]Epoch 7: 64%|######3 | 61/96 [00:00<00:00, 606.23batch/s, train/train_loss_1=0.0286]Epoch 7: 64%|######3 | 61/96 [00:00<00:00, 606.23batch/s, train/train_loss_1=0.0302]Epoch 7: 64%|######3 | 61/96 [00:00<00:00, 606.23batch/s, train/train_loss_1=0.0305]Epoch 7: 64%|######3 | 61/96 [00:00<00:00, 606.23batch/s, train/train_loss_1=0.0348]Epoch 7: 64%|######3 | 61/96 [00:00<00:00, 606.23batch/s, train/train_loss_1=0.0365]Epoch 7: 64%|######3 | 61/96 [00:00<00:00, 606.23batch/s, train/train_loss_1=0.0226]Epoch 7: 64%|######3 | 61/96 [00:00<00:00, 606.23batch/s, train/train_loss_1=0.0309]Epoch 7: 64%|######3 | 61/96 [00:00<00:00, 606.23batch/s, train/train_loss_1=0.0379]Epoch 7: 64%|######3 | 61/96 [00:00<00:00, 606.23batch/s, train/train_loss_1=0.029] Epoch 7: 64%|######3 | 61/96 [00:00<00:00, 606.23batch/s, train/train_loss_1=0.0337]Epoch 7: 64%|######3 | 61/96 [00:00<00:00, 606.23batch/s, train/train_loss_1=0.0392]Epoch 7: 64%|######3 | 61/96 [00:00<00:00, 606.23batch/s, train/train_loss_1=0.0349]Epoch 7: 64%|######3 | 61/96 [00:00<00:00, 606.23batch/s, train/train_loss_1=0.0247]Epoch 7: 64%|######3 | 61/96 [00:00<00:00, 606.23batch/s, train/train_loss_1=0.0285]Epoch 7: 64%|######3 | 61/96 [00:00<00:00, 606.23batch/s, train/train_loss_1=0.0296]Epoch 7: 64%|######3 | 61/96 [00:00<00:00, 606.23batch/s, train/train_loss_1=0.0256]Epoch 7: 64%|######3 | 61/96 [00:00<00:00, 606.23batch/s, train/train_loss_1=0.0298]Epoch 7: 64%|######3 | 61/96 [00:00<00:00, 606.23batch/s, train/train_loss_1=0.0367]Epoch 7: 64%|######3 | 61/96 [00:00<00:00, 606.23batch/s, train/train_loss_1=0.0364]Epoch 7: 64%|######3 | 61/96 [00:00<00:00, 606.23batch/s, train/train_loss_1=0.0229]Epoch 7: 64%|######3 | 61/96 [00:00<00:00, 606.23batch/s, train/train_loss_1=0.033] Epoch 7: 64%|######3 | 61/96 [00:00<00:00, 606.23batch/s, train/train_loss_1=0.0243] 0%| | 0/96 [00:00<?, ?batch/s]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s]Epoch 8: 0%| | 0/96 
[00:00<?, ?batch/s, train/train_loss_1=0.0348]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0362]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0299]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0375]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0276]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0301]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0254]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.032] Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0339]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0262]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0268]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0265]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0362]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0282]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0309]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0349]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0369]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0273]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.029] Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0376]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0423]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0289]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.024] Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0172]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0333]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0261]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0306]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.031] Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0414]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0255]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, 
train/train_loss_1=0.0178]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0295]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0326]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0242]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0335]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0284]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0313]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0209]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0335]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0247]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0441]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0318]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0313]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0302]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0314]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0302]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0336]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0309]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0453]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0396]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0334]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0386]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0361]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0384]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0274]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.031] Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0319]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0389]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0268]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0236]Epoch 8: 0%| | 0/96 [00:00<?, ?batch/s, 
train/train_loss_1=0.0275]Epoch 8: 64%|######3 | 61/96 [00:00<00:00, 602.37batch/s, train/train_loss_1=0.0275]Epoch 8: 64%|######3 | 61/96 [00:00<00:00, 602.37batch/s, train/train_loss_1=0.0306]Epoch 8: 64%|######3 | 61/96 [00:00<00:00, 602.37batch/s, train/train_loss_1=0.0364]Epoch 8: 64%|######3 | 61/96 [00:00<00:00, 602.37batch/s, train/train_loss_1=0.0271]Epoch 8: 64%|######3 | 61/96 [00:00<00:00, 602.37batch/s, train/train_loss_1=0.0313]Epoch 8: 64%|######3 | 61/96 [00:00<00:00, 602.37batch/s, train/train_loss_1=0.0316]Epoch 8: 64%|######3 | 61/96 [00:00<00:00, 602.37batch/s, train/train_loss_1=0.0406]Epoch 8: 64%|######3 | 61/96 [00:00<00:00, 602.37batch/s, train/train_loss_1=0.025] Epoch 8: 64%|######3 | 61/96 [00:00<00:00, 602.37batch/s, train/train_loss_1=0.0235]Epoch 8: 64%|######3 | 61/96 [00:00<00:00, 602.37batch/s, train/train_loss_1=0.0361]Epoch 8: 64%|######3 | 61/96 [00:00<00:00, 602.37batch/s, train/train_loss_1=0.0252]Epoch 8: 64%|######3 | 61/96 [00:00<00:00, 602.37batch/s, train/train_loss_1=0.0272]Epoch 8: 64%|######3 | 61/96 [00:00<00:00, 602.37batch/s, train/train_loss_1=0.0346]Epoch 8: 64%|######3 | 61/96 [00:00<00:00, 602.37batch/s, train/train_loss_1=0.0326]Epoch 8: 64%|######3 | 61/96 [00:00<00:00, 602.37batch/s, train/train_loss_1=0.0286]Epoch 8: 64%|######3 | 61/96 [00:00<00:00, 602.37batch/s, train/train_loss_1=0.0331]Epoch 8: 64%|######3 | 61/96 [00:00<00:00, 602.37batch/s, train/train_loss_1=0.0302]Epoch 8: 64%|######3 | 61/96 [00:00<00:00, 602.37batch/s, train/train_loss_1=0.0267]Epoch 8: 64%|######3 | 61/96 [00:00<00:00, 602.37batch/s, train/train_loss_1=0.0365]Epoch 8: 64%|######3 | 61/96 [00:00<00:00, 602.37batch/s, train/train_loss_1=0.0327]Epoch 8: 64%|######3 | 61/96 [00:00<00:00, 602.37batch/s, train/train_loss_1=0.027] Epoch 8: 64%|######3 | 61/96 [00:00<00:00, 602.37batch/s, train/train_loss_1=0.0346]Epoch 8: 64%|######3 | 61/96 [00:00<00:00, 602.37batch/s, train/train_loss_1=0.0297]Epoch 8: 64%|######3 | 61/96 
[00:00<00:00, 602.37batch/s, train/train_loss_1=0.0308]Epoch 8: 64%|######3 | 61/96 [00:00<00:00, 602.37batch/s, train/train_loss_1=0.0345]Epoch 8: 64%|######3 | 61/96 [00:00<00:00, 602.37batch/s, train/train_loss_1=0.0239]Epoch 8: 64%|######3 | 61/96 [00:00<00:00, 602.37batch/s, train/train_loss_1=0.0375]Epoch 8: 64%|######3 | 61/96 [00:00<00:00, 602.37batch/s, train/train_loss_1=0.0372]Epoch 8: 64%|######3 | 61/96 [00:00<00:00, 602.37batch/s, train/train_loss_1=0.0327]Epoch 8: 64%|######3 | 61/96 [00:00<00:00, 602.37batch/s, train/train_loss_1=0.0393]Epoch 8: 64%|######3 | 61/96 [00:00<00:00, 602.37batch/s, train/train_loss_1=0.0347]Epoch 8: 64%|######3 | 61/96 [00:00<00:00, 602.37batch/s, train/train_loss_1=0.0266]Epoch 8: 64%|######3 | 61/96 [00:00<00:00, 602.37batch/s, train/train_loss_1=0.0355]Epoch 8: 64%|######3 | 61/96 [00:00<00:00, 602.37batch/s, train/train_loss_1=0.0273]Epoch 8: 64%|######3 | 61/96 [00:00<00:00, 602.37batch/s, train/train_loss_1=0.0314]Epoch 8: 64%|######3 | 61/96 [00:00<00:00, 602.37batch/s, train/train_loss_1=0.0293] 0%| | 0/96 [00:00<?, ?batch/s]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0252]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0246]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0264]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0294]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0287]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0379]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0313]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0244]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0301]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0272]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0323]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0342]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0269]Epoch 9: 
0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0435]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0369]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0271]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0333]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.026] Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0255]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0235]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0256]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0265]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0219]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0254]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0306]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0338]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0319]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0358]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0285]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0246]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0307]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0239]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0321]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0305]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0282]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0238]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0363]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0267]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0223]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0383]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0307]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0312]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0386]Epoch 9: 0%| | 0/96 
[00:00<?, ?batch/s, train/train_loss_1=0.0303]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0212]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0193]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0323]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0332]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0348]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0302]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0411]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0259]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0265]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0427]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0265]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0248]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0284]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0389]Epoch 9: 0%| | 0/96 [00:00<?, ?batch/s, train/train_loss_1=0.0252]Epoch 9: 61%|######1 | 59/96 [00:00<00:00, 584.12batch/s, train/train_loss_1=0.0252]Epoch 9: 61%|######1 | 59/96 [00:00<00:00, 584.12batch/s, train/train_loss_1=0.0286]Epoch 9: 61%|######1 | 59/96 [00:00<00:00, 584.12batch/s, train/train_loss_1=0.0382]Epoch 9: 61%|######1 | 59/96 [00:00<00:00, 584.12batch/s, train/train_loss_1=0.0289]Epoch 9: 61%|######1 | 59/96 [00:00<00:00, 584.12batch/s, train/train_loss_1=0.0279]Epoch 9: 61%|######1 | 59/96 [00:00<00:00, 584.12batch/s, train/train_loss_1=0.0288]Epoch 9: 61%|######1 | 59/96 [00:00<00:00, 584.12batch/s, train/train_loss_1=0.0279]Epoch 9: 61%|######1 | 59/96 [00:00<00:00, 584.12batch/s, train/train_loss_1=0.0229]Epoch 9: 61%|######1 | 59/96 [00:00<00:00, 584.12batch/s, train/train_loss_1=0.0334]Epoch 9: 61%|######1 | 59/96 [00:00<00:00, 584.12batch/s, train/train_loss_1=0.0309]Epoch 9: 61%|######1 | 59/96 [00:00<00:00, 584.12batch/s, train/train_loss_1=0.023] Epoch 9: 61%|######1 | 59/96 
[00:00<00:00, 584.12batch/s, train/train_loss_1=0.0247]Epoch 9: 61%|######1 | 59/96 [00:00<00:00, 584.12batch/s, train/train_loss_1=0.035] Epoch 9: 61%|######1 | 59/96 [00:00<00:00, 584.12batch/s, train/train_loss_1=0.0224]Epoch 9: 61%|######1 | 59/96 [00:00<00:00, 584.12batch/s, train/train_loss_1=0.0286]Epoch 9: 61%|######1 | 59/96 [00:00<00:00, 584.12batch/s, train/train_loss_1=0.0377]Epoch 9: 61%|######1 | 59/96 [00:00<00:00, 584.12batch/s, train/train_loss_1=0.039] Epoch 9: 61%|######1 | 59/96 [00:00<00:00, 584.12batch/s, train/train_loss_1=0.0303]Epoch 9: 61%|######1 | 59/96 [00:00<00:00, 584.12batch/s, train/train_loss_1=0.0299]Epoch 9: 61%|######1 | 59/96 [00:00<00:00, 584.12batch/s, train/train_loss_1=0.0238]Epoch 9: 61%|######1 | 59/96 [00:00<00:00, 584.12batch/s, train/train_loss_1=0.0371]Epoch 9: 61%|######1 | 59/96 [00:00<00:00, 584.12batch/s, train/train_loss_1=0.0375]Epoch 9: 61%|######1 | 59/96 [00:00<00:00, 584.12batch/s, train/train_loss_1=0.0304]Epoch 9: 61%|######1 | 59/96 [00:00<00:00, 584.12batch/s, train/train_loss_1=0.0372]Epoch 9: 61%|######1 | 59/96 [00:00<00:00, 584.12batch/s, train/train_loss_1=0.0341]Epoch 9: 61%|######1 | 59/96 [00:00<00:00, 584.12batch/s, train/train_loss_1=0.0333]Epoch 9: 61%|######1 | 59/96 [00:00<00:00, 584.12batch/s, train/train_loss_1=0.0306]Epoch 9: 61%|######1 | 59/96 [00:00<00:00, 584.12batch/s, train/train_loss_1=0.0266]Epoch 9: 61%|######1 | 59/96 [00:00<00:00, 584.12batch/s, train/train_loss_1=0.0331]Epoch 9: 61%|######1 | 59/96 [00:00<00:00, 584.12batch/s, train/train_loss_1=0.0328]Epoch 9: 61%|######1 | 59/96 [00:00<00:00, 584.12batch/s, train/train_loss_1=0.0273]Epoch 9: 61%|######1 | 59/96 [00:00<00:00, 584.12batch/s, train/train_loss_1=0.0338]Epoch 9: 61%|######1 | 59/96 [00:00<00:00, 584.12batch/s, train/train_loss_1=0.0248]Epoch 9: 61%|######1 | 59/96 [00:00<00:00, 584.12batch/s, train/train_loss_1=0.0284]Epoch 9: 61%|######1 | 59/96 [00:00<00:00, 584.12batch/s, train/train_loss_1=0.0297]Epoch 9: 
61%|######1 | 59/96 [00:00<00:00, 584.12batch/s, train/train_loss_1=0.0208]Epoch 9: 61%|######1 | 59/96 [00:00<00:00, 584.12batch/s, train/train_loss_1=0.0321]Epoch 9: 61%|######1 | 59/96 [00:00<00:00, 584.12batch/s, train/train_loss_1=0.033] Epoch 9: 100%|##########| 96/96 [00:00<00:00, 528.02batch/s, train/train_loss_1=0.033]
/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/sklearn/preprocessing/_encoders.py:868: FutureWarning: `sparse` was renamed to `sparse_output` in version 1.2 and will be removed in 1.4. `sparse_output` is ignored unless you leave `sparse` to its default value.
warnings.warn(
0%| | 0/88 [00:00<?, ?batch/s]Epoch 0: 0%| | 0/88 [00:00<?, ?batch/s]Epoch 0: 0%| | 0/88 [00:01<?, ?batch/s, train/train_loss_1=0.143]Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.143]Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.136]Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.13] Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.123]Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.12] Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.114]Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.11] Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.11]Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.104]Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.101]Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.0964]Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.113] Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.094]Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.0965]Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.0958]Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.095] Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.101]Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.0947]Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.0947]Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.109] Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.0898]Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.0977]Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.0973]Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.108] Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.0934]Epoch 0: 1%|1 | 1/88 
[00:01<01:36, 1.10s/batch, train/train_loss_1=0.0965]Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.0847]Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.087] Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.0969]Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.104] Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.0935]Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.105] Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.103]Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.0946]Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.0967]Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.0892]Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.0775]Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.0837]Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.0882]Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.0783]Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.095] Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.078]Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.0828]Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.09] Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.0864]Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.0844]Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.0876]Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.0921]Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.0848]Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.0986]Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.082] Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.0854]Epoch 0: 1%|1 | 1/88 
[00:01<01:36, 1.10s/batch, train/train_loss_1=0.0848]Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.0944]Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.0921]Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.0963]Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.0895]Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.0912]Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.0847]Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.0789]Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.0826]Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.0825]Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.0957]Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.0796]Epoch 0: 1%|1 | 1/88 [00:01<01:36, 1.10s/batch, train/train_loss_1=0.0712]Epoch 0: 74%|#######3 | 65/88 [00:01<00:00, 74.09batch/s, train/train_loss_1=0.0712]Epoch 0: 74%|#######3 | 65/88 [00:01<00:00, 74.09batch/s, train/train_loss_1=0.0822]Epoch 0: 74%|#######3 | 65/88 [00:01<00:00, 74.09batch/s, train/train_loss_1=0.0873]Epoch 0: 74%|#######3 | 65/88 [00:01<00:00, 74.09batch/s, train/train_loss_1=0.0972]Epoch 0: 74%|#######3 | 65/88 [00:01<00:00, 74.09batch/s, train/train_loss_1=0.0886]Epoch 0: 74%|#######3 | 65/88 [00:01<00:00, 74.09batch/s, train/train_loss_1=0.0895]Epoch 0: 74%|#######3 | 65/88 [00:01<00:00, 74.09batch/s, train/train_loss_1=0.0784]Epoch 0: 74%|#######3 | 65/88 [00:01<00:00, 74.09batch/s, train/train_loss_1=0.0925]Epoch 0: 74%|#######3 | 65/88 [00:01<00:00, 74.09batch/s, train/train_loss_1=0.0716]Epoch 0: 74%|#######3 | 65/88 [00:01<00:00, 74.09batch/s, train/train_loss_1=0.0845]Epoch 0: 74%|#######3 | 65/88 [00:01<00:00, 74.09batch/s, train/train_loss_1=0.0896]Epoch 0: 74%|#######3 | 65/88 [00:01<00:00, 74.09batch/s, train/train_loss_1=0.0938]Epoch 0: 74%|#######3 | 65/88 [00:01<00:00, 
74.09batch/s, train/train_loss_1=0.0821]Epoch 0: 74%|#######3 | 65/88 [00:01<00:00, 74.09batch/s, train/train_loss_1=0.0788]Epoch 0: 74%|#######3 | 65/88 [00:01<00:00, 74.09batch/s, train/train_loss_1=0.0818]Epoch 0: 74%|#######3 | 65/88 [00:01<00:00, 74.09batch/s, train/train_loss_1=0.0829]Epoch 0: 74%|#######3 | 65/88 [00:01<00:00, 74.09batch/s, train/train_loss_1=0.0819]Epoch 0: 74%|#######3 | 65/88 [00:01<00:00, 74.09batch/s, train/train_loss_1=0.0915]Epoch 0: 74%|#######3 | 65/88 [00:01<00:00, 74.09batch/s, train/train_loss_1=0.082] Epoch 0: 74%|#######3 | 65/88 [00:01<00:00, 74.09batch/s, train/train_loss_1=0.0826]Epoch 0: 74%|#######3 | 65/88 [00:01<00:00, 74.09batch/s, train/train_loss_1=0.0888]Epoch 0: 74%|#######3 | 65/88 [00:01<00:00, 74.09batch/s, train/train_loss_1=0.0929]Epoch 0: 74%|#######3 | 65/88 [00:01<00:00, 74.09batch/s, train/train_loss_1=0.0832]Epoch 0: 74%|#######3 | 65/88 [00:02<00:00, 74.09batch/s, train/train_loss_1=0.0837] 0%| | 0/88 [00:00<?, ?batch/s]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0793]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0821]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0911]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0934]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0898]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0905]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0991]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0813]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0838]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0832]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0786]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0872]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0891]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0923]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, 
train/train_loss_1=0.082] Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0929]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0899]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0842]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0708]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0819]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0852]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0709]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0778]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0893]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0755]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0789]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0747]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0792]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0766]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0849]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.076] Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0854]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0785]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0969]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0829]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0811]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0859]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0831]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0793]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0812]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0788]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0949]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0829]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0804]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, 
train/train_loss_1=0.0795]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.085] Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0772]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0809]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0738]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0908]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0783]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0836]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0877]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0773]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.085] Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0849]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0779]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.079] Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0803]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0821]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0855]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0835]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.082] Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.088]Epoch 1: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0769]Epoch 1: 74%|#######3 | 65/88 [00:00<00:00, 648.91batch/s, train/train_loss_1=0.0769]Epoch 1: 74%|#######3 | 65/88 [00:00<00:00, 648.91batch/s, train/train_loss_1=0.0886]Epoch 1: 74%|#######3 | 65/88 [00:00<00:00, 648.91batch/s, train/train_loss_1=0.0785]Epoch 1: 74%|#######3 | 65/88 [00:00<00:00, 648.91batch/s, train/train_loss_1=0.0946]Epoch 1: 74%|#######3 | 65/88 [00:00<00:00, 648.91batch/s, train/train_loss_1=0.0784]Epoch 1: 74%|#######3 | 65/88 [00:00<00:00, 648.91batch/s, train/train_loss_1=0.0793]Epoch 1: 74%|#######3 | 65/88 [00:00<00:00, 648.91batch/s, train/train_loss_1=0.0844]Epoch 1: 74%|#######3 | 65/88 [00:00<00:00, 648.91batch/s, 
train/train_loss_1=0.0819]Epoch 1: 74%|#######3 | 65/88 [00:00<00:00, 648.91batch/s, train/train_loss_1=0.0777]Epoch 1: 74%|#######3 | 65/88 [00:00<00:00, 648.91batch/s, train/train_loss_1=0.0695]Epoch 1: 74%|#######3 | 65/88 [00:00<00:00, 648.91batch/s, train/train_loss_1=0.0919]Epoch 1: 74%|#######3 | 65/88 [00:00<00:00, 648.91batch/s, train/train_loss_1=0.0659]Epoch 1: 74%|#######3 | 65/88 [00:00<00:00, 648.91batch/s, train/train_loss_1=0.0848]Epoch 1: 74%|#######3 | 65/88 [00:00<00:00, 648.91batch/s, train/train_loss_1=0.0875]Epoch 1: 74%|#######3 | 65/88 [00:00<00:00, 648.91batch/s, train/train_loss_1=0.071] Epoch 1: 74%|#######3 | 65/88 [00:00<00:00, 648.91batch/s, train/train_loss_1=0.0737]Epoch 1: 74%|#######3 | 65/88 [00:00<00:00, 648.91batch/s, train/train_loss_1=0.0785]Epoch 1: 74%|#######3 | 65/88 [00:00<00:00, 648.91batch/s, train/train_loss_1=0.0719]Epoch 1: 74%|#######3 | 65/88 [00:00<00:00, 648.91batch/s, train/train_loss_1=0.0862]Epoch 1: 74%|#######3 | 65/88 [00:00<00:00, 648.91batch/s, train/train_loss_1=0.086] Epoch 1: 74%|#######3 | 65/88 [00:00<00:00, 648.91batch/s, train/train_loss_1=0.068]Epoch 1: 74%|#######3 | 65/88 [00:00<00:00, 648.91batch/s, train/train_loss_1=0.0752]Epoch 1: 74%|#######3 | 65/88 [00:00<00:00, 648.91batch/s, train/train_loss_1=0.0842]Epoch 1: 74%|#######3 | 65/88 [00:00<00:00, 648.91batch/s, train/train_loss_1=0.0672] 0%| | 0/88 [00:00<?, ?batch/s]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0806]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0703]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0858]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0888]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.101] Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0802]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0747]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0878]Epoch 2: 0%| | 
0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0694]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0735]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0862]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0846]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0781]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0731]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0906]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0732]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0872]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0759]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0814]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0789]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0874]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0728]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0841]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0609]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0846]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0732]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0799]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0833]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0799]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0795]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0663]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0649]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0817]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0855]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0845]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0744]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0808]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0778]Epoch 2: 0%| | 0/88 [00:00<?, 
?batch/s, train/train_loss_1=0.0695]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0781]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0827]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0851]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0789]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.068] Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0817]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0897]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0895]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0815]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0881]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0963]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0926]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0825]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0679]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0704]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0793]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0827]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0735]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0748]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0791]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0726]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0732]Epoch 2: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0705]Epoch 2: 70%|####### | 62/88 [00:00<00:00, 619.52batch/s, train/train_loss_1=0.0705]Epoch 2: 70%|####### | 62/88 [00:00<00:00, 619.52batch/s, train/train_loss_1=0.0823]Epoch 2: 70%|####### | 62/88 [00:00<00:00, 619.52batch/s, train/train_loss_1=0.0707]Epoch 2: 70%|####### | 62/88 [00:00<00:00, 619.52batch/s, train/train_loss_1=0.0654]Epoch 2: 70%|####### | 62/88 [00:00<00:00, 619.52batch/s, train/train_loss_1=0.077] Epoch 2: 70%|####### | 
62/88 [00:00<00:00, 619.52batch/s, train/train_loss_1=0.0789]Epoch 2: 70%|####### | 62/88 [00:00<00:00, 619.52batch/s, train/train_loss_1=0.075] Epoch 2: 70%|####### | 62/88 [00:00<00:00, 619.52batch/s, train/train_loss_1=0.0819]Epoch 2: 70%|####### | 62/88 [00:00<00:00, 619.52batch/s, train/train_loss_1=0.0802]Epoch 2: 70%|####### | 62/88 [00:00<00:00, 619.52batch/s, train/train_loss_1=0.0858]Epoch 2: 70%|####### | 62/88 [00:00<00:00, 619.52batch/s, train/train_loss_1=0.0907]Epoch 2: 70%|####### | 62/88 [00:00<00:00, 619.52batch/s, train/train_loss_1=0.0822]Epoch 2: 70%|####### | 62/88 [00:00<00:00, 619.52batch/s, train/train_loss_1=0.0856]Epoch 2: 70%|####### | 62/88 [00:00<00:00, 619.52batch/s, train/train_loss_1=0.0795]Epoch 2: 70%|####### | 62/88 [00:00<00:00, 619.52batch/s, train/train_loss_1=0.0756]Epoch 2: 70%|####### | 62/88 [00:00<00:00, 619.52batch/s, train/train_loss_1=0.0703]Epoch 2: 70%|####### | 62/88 [00:00<00:00, 619.52batch/s, train/train_loss_1=0.0774]Epoch 2: 70%|####### | 62/88 [00:00<00:00, 619.52batch/s, train/train_loss_1=0.0804]Epoch 2: 70%|####### | 62/88 [00:00<00:00, 619.52batch/s, train/train_loss_1=0.079] Epoch 2: 70%|####### | 62/88 [00:00<00:00, 619.52batch/s, train/train_loss_1=0.0717]Epoch 2: 70%|####### | 62/88 [00:00<00:00, 619.52batch/s, train/train_loss_1=0.0662]Epoch 2: 70%|####### | 62/88 [00:00<00:00, 619.52batch/s, train/train_loss_1=0.0756]Epoch 2: 70%|####### | 62/88 [00:00<00:00, 619.52batch/s, train/train_loss_1=0.0629]Epoch 2: 70%|####### | 62/88 [00:00<00:00, 619.52batch/s, train/train_loss_1=0.0732]Epoch 2: 70%|####### | 62/88 [00:00<00:00, 619.52batch/s, train/train_loss_1=0.0685]Epoch 2: 70%|####### | 62/88 [00:00<00:00, 619.52batch/s, train/train_loss_1=0.0653]Epoch 2: 70%|####### | 62/88 [00:00<00:00, 619.52batch/s, train/train_loss_1=0.0778] 0%| | 0/88 [00:00<?, ?batch/s]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0852]Epoch 3: 0%| | 0/88 [00:00<?, 
?batch/s, train/train_loss_1=0.0754]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0716]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0876]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0862]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0875]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0787]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0779]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0834]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0802]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0826]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0661]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0885]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0648]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0846]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0898]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0751]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0666]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0661]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0818]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0657]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0733]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0798]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0713]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0747]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0695]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0764]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0642]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0774]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0739]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0754]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, 
train/train_loss_1=0.0818]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0826]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0788]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0702]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.077] Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0795]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0817]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0677]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.073] Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0757]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0719]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0672]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0585]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0803]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0766]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0848]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0851]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0849]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0767]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0743]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0607]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0729]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.076] Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0779]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0635]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0821]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0908]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0809]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0849]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0674]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, 
train/train_loss_1=0.0626]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0781]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0724]Epoch 3: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0906]Epoch 3: 74%|#######3 | 65/88 [00:00<00:00, 640.90batch/s, train/train_loss_1=0.0906]Epoch 3: 74%|#######3 | 65/88 [00:00<00:00, 640.90batch/s, train/train_loss_1=0.0802]Epoch 3: 74%|#######3 | 65/88 [00:00<00:00, 640.90batch/s, train/train_loss_1=0.0808]Epoch 3: 74%|#######3 | 65/88 [00:00<00:00, 640.90batch/s, train/train_loss_1=0.0661]Epoch 3: 74%|#######3 | 65/88 [00:00<00:00, 640.90batch/s, train/train_loss_1=0.0584]Epoch 3: 74%|#######3 | 65/88 [00:00<00:00, 640.90batch/s, train/train_loss_1=0.074] Epoch 3: 74%|#######3 | 65/88 [00:00<00:00, 640.90batch/s, train/train_loss_1=0.0725]Epoch 3: 74%|#######3 | 65/88 [00:00<00:00, 640.90batch/s, train/train_loss_1=0.0831]Epoch 3: 74%|#######3 | 65/88 [00:00<00:00, 640.90batch/s, train/train_loss_1=0.0752]Epoch 3: 74%|#######3 | 65/88 [00:00<00:00, 640.90batch/s, train/train_loss_1=0.0681]Epoch 3: 74%|#######3 | 65/88 [00:00<00:00, 640.90batch/s, train/train_loss_1=0.0772]Epoch 3: 74%|#######3 | 65/88 [00:00<00:00, 640.90batch/s, train/train_loss_1=0.0826]Epoch 3: 74%|#######3 | 65/88 [00:00<00:00, 640.90batch/s, train/train_loss_1=0.0793]Epoch 3: 74%|#######3 | 65/88 [00:00<00:00, 640.90batch/s, train/train_loss_1=0.0801]Epoch 3: 74%|#######3 | 65/88 [00:00<00:00, 640.90batch/s, train/train_loss_1=0.0672]Epoch 3: 74%|#######3 | 65/88 [00:00<00:00, 640.90batch/s, train/train_loss_1=0.0692]Epoch 3: 74%|#######3 | 65/88 [00:00<00:00, 640.90batch/s, train/train_loss_1=0.0707]Epoch 3: 74%|#######3 | 65/88 [00:00<00:00, 640.90batch/s, train/train_loss_1=0.0546]Epoch 3: 74%|#######3 | 65/88 [00:00<00:00, 640.90batch/s, train/train_loss_1=0.0638]Epoch 3: 74%|#######3 | 65/88 [00:00<00:00, 640.90batch/s, train/train_loss_1=0.0756]Epoch 3: 74%|#######3 | 65/88 [00:00<00:00, 640.90batch/s, 
train/train_loss_1=0.072] Epoch 3: 74%|#######3 | 65/88 [00:00<00:00, 640.90batch/s, train/train_loss_1=0.0814]Epoch 3: 74%|#######3 | 65/88 [00:00<00:00, 640.90batch/s, train/train_loss_1=0.0722]Epoch 3: 74%|#######3 | 65/88 [00:00<00:00, 640.90batch/s, train/train_loss_1=0.0844] 0%| | 0/88 [00:00<?, ?batch/s]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0818]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0773]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0607]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0639]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0821]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0705]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0791]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0622]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0612]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0853]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0801]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0658]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0649]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0816]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0809]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0758]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0707]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0754]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.073] Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0781]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0731]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0791]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.072] Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0736]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, 
train/train_loss_1=0.0612]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0729]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0671]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0697]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0718]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0846]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0727]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0774]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0691]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0785]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0798]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0813]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.071] Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.104]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0687]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0788]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0746]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.076] Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0682]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0735]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0853]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0645]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0696]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0849]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0799]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0716]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0885]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0824]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0674]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0621]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, 
train/train_loss_1=0.0623]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0709]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0805]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0721]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0762]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0814]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0685]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0856]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.069] Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0794]Epoch 4: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0655]Epoch 4: 74%|#######3 | 65/88 [00:00<00:00, 645.90batch/s, train/train_loss_1=0.0655]Epoch 4: 74%|#######3 | 65/88 [00:00<00:00, 645.90batch/s, train/train_loss_1=0.0717]Epoch 4: 74%|#######3 | 65/88 [00:00<00:00, 645.90batch/s, train/train_loss_1=0.0868]Epoch 4: 74%|#######3 | 65/88 [00:00<00:00, 645.90batch/s, train/train_loss_1=0.055] Epoch 4: 74%|#######3 | 65/88 [00:00<00:00, 645.90batch/s, train/train_loss_1=0.076]Epoch 4: 74%|#######3 | 65/88 [00:00<00:00, 645.90batch/s, train/train_loss_1=0.0829]Epoch 4: 74%|#######3 | 65/88 [00:00<00:00, 645.90batch/s, train/train_loss_1=0.0632]Epoch 4: 74%|#######3 | 65/88 [00:00<00:00, 645.90batch/s, train/train_loss_1=0.071] Epoch 4: 74%|#######3 | 65/88 [00:00<00:00, 645.90batch/s, train/train_loss_1=0.0756]Epoch 4: 74%|#######3 | 65/88 [00:00<00:00, 645.90batch/s, train/train_loss_1=0.079] Epoch 4: 74%|#######3 | 65/88 [00:00<00:00, 645.90batch/s, train/train_loss_1=0.072]Epoch 4: 74%|#######3 | 65/88 [00:00<00:00, 645.90batch/s, train/train_loss_1=0.0814]Epoch 4: 74%|#######3 | 65/88 [00:00<00:00, 645.90batch/s, train/train_loss_1=0.0625]Epoch 4: 74%|#######3 | 65/88 [00:00<00:00, 645.90batch/s, train/train_loss_1=0.0698]Epoch 4: 74%|#######3 | 65/88 [00:00<00:00, 645.90batch/s, train/train_loss_1=0.0911]Epoch 4: 74%|#######3 | 65/88 
[00:00<00:00, 645.90batch/s, train/train_loss_1=0.0688]Epoch 4: 74%|#######3 | 65/88 [00:00<00:00, 645.90batch/s, train/train_loss_1=0.0723]Epoch 4: 74%|#######3 | 65/88 [00:00<00:00, 645.90batch/s, train/train_loss_1=0.0709]Epoch 4: 74%|#######3 | 65/88 [00:00<00:00, 645.90batch/s, train/train_loss_1=0.0748]Epoch 4: 74%|#######3 | 65/88 [00:00<00:00, 645.90batch/s, train/train_loss_1=0.0806]Epoch 4: 74%|#######3 | 65/88 [00:00<00:00, 645.90batch/s, train/train_loss_1=0.0623]Epoch 4: 74%|#######3 | 65/88 [00:00<00:00, 645.90batch/s, train/train_loss_1=0.0642]Epoch 4: 74%|#######3 | 65/88 [00:00<00:00, 645.90batch/s, train/train_loss_1=0.0746]Epoch 4: 74%|#######3 | 65/88 [00:00<00:00, 645.90batch/s, train/train_loss_1=0.0687] 0%| | 0/88 [00:00<?, ?batch/s]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0835]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0614]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0808]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.07] Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0789]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0639]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0696]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.072] Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0693]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0643]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0612]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0715]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0784]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0791]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0664]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0656]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0846]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0871]Epoch 
5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0804]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0761]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.067] Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0759]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0734]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0813]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0714]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0688]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0809]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.079] Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0773]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0678]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0585]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0768]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0669]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0622]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0806]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0743]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0658]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0802]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0726]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0592]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0672]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0707]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0768]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0722]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0706]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0677]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0708]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0796]Epoch 5: 0%| | 0/88 
[00:00<?, ?batch/s, train/train_loss_1=0.0726]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0667]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0578]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0794]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0665]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0635]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0737]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0588]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0733]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0728]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.075] Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0895]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0843]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0741]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0688]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0896]Epoch 5: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0716]Epoch 5: 74%|#######3 | 65/88 [00:00<00:00, 649.48batch/s, train/train_loss_1=0.0716]Epoch 5: 74%|#######3 | 65/88 [00:00<00:00, 649.48batch/s, train/train_loss_1=0.079] Epoch 5: 74%|#######3 | 65/88 [00:00<00:00, 649.48batch/s, train/train_loss_1=0.0766]Epoch 5: 74%|#######3 | 65/88 [00:00<00:00, 649.48batch/s, train/train_loss_1=0.0787]Epoch 5: 74%|#######3 | 65/88 [00:00<00:00, 649.48batch/s, train/train_loss_1=0.0795]Epoch 5: 74%|#######3 | 65/88 [00:00<00:00, 649.48batch/s, train/train_loss_1=0.0845]Epoch 5: 74%|#######3 | 65/88 [00:00<00:00, 649.48batch/s, train/train_loss_1=0.0752]Epoch 5: 74%|#######3 | 65/88 [00:00<00:00, 649.48batch/s, train/train_loss_1=0.0724]Epoch 5: 74%|#######3 | 65/88 [00:00<00:00, 649.48batch/s, train/train_loss_1=0.0721]Epoch 5: 74%|#######3 | 65/88 [00:00<00:00, 649.48batch/s, train/train_loss_1=0.0801]Epoch 5: 74%|#######3 | 65/88 [00:00<00:00, 
649.48batch/s, train/train_loss_1=0.0684]Epoch 5: 74%|#######3 | 65/88 [00:00<00:00, 649.48batch/s, train/train_loss_1=0.0743]Epoch 5: 74%|#######3 | 65/88 [00:00<00:00, 649.48batch/s, train/train_loss_1=0.0779]Epoch 5: 74%|#######3 | 65/88 [00:00<00:00, 649.48batch/s, train/train_loss_1=0.0808]Epoch 5: 74%|#######3 | 65/88 [00:00<00:00, 649.48batch/s, train/train_loss_1=0.0744]Epoch 5: 74%|#######3 | 65/88 [00:00<00:00, 649.48batch/s, train/train_loss_1=0.0671]Epoch 5: 74%|#######3 | 65/88 [00:00<00:00, 649.48batch/s, train/train_loss_1=0.0572]Epoch 5: 74%|#######3 | 65/88 [00:00<00:00, 649.48batch/s, train/train_loss_1=0.0774]Epoch 5: 74%|#######3 | 65/88 [00:00<00:00, 649.48batch/s, train/train_loss_1=0.0617]Epoch 5: 74%|#######3 | 65/88 [00:00<00:00, 649.48batch/s, train/train_loss_1=0.0691]Epoch 5: 74%|#######3 | 65/88 [00:00<00:00, 649.48batch/s, train/train_loss_1=0.0819]Epoch 5: 74%|#######3 | 65/88 [00:00<00:00, 649.48batch/s, train/train_loss_1=0.0683]Epoch 5: 74%|#######3 | 65/88 [00:00<00:00, 649.48batch/s, train/train_loss_1=0.0787]Epoch 5: 74%|#######3 | 65/88 [00:00<00:00, 649.48batch/s, train/train_loss_1=0.0834] 0%| | 0/88 [00:00<?, ?batch/s]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0668]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0853]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0827]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0712]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.065] Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.074]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0699]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0727]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0776]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0824]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.074] Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, 
train/train_loss_1=0.0713]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0849]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.076] Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0647]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0761]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0705]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0666]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0694]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0722]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0663]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0657]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0654]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.084] Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0736]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.067] Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0693]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0741]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0615]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0624]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0865]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0825]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0728]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0669]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0666]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0698]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0644]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0717]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0795]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.075] Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0643]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, 
train/train_loss_1=0.0748]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0809]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.066] Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0701]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0773]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0877]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0689]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0797]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0836]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0732]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0774]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0668]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0727]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0688]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.059] Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0681]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0767]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0681]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0828]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0772]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0652]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0506]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0624]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0838]Epoch 6: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0803]Epoch 6: 75%|#######5 | 66/88 [00:00<00:00, 653.74batch/s, train/train_loss_1=0.0803]Epoch 6: 75%|#######5 | 66/88 [00:00<00:00, 653.74batch/s, train/train_loss_1=0.0824]Epoch 6: 75%|#######5 | 66/88 [00:00<00:00, 653.74batch/s, train/train_loss_1=0.072] Epoch 6: 75%|#######5 | 66/88 [00:00<00:00, 653.74batch/s, train/train_loss_1=0.0631]Epoch 6: 75%|#######5 | 66/88 [00:00<00:00, 
653.74batch/s, train/train_loss_1=0.0733]Epoch 6: 75%|#######5 | 66/88 [00:00<00:00, 653.74batch/s, train/train_loss_1=0.07] Epoch 6: 75%|#######5 | 66/88 [00:00<00:00, 653.74batch/s, train/train_loss_1=0.0755]Epoch 6: 75%|#######5 | 66/88 [00:00<00:00, 653.74batch/s, train/train_loss_1=0.0754]Epoch 6: 75%|#######5 | 66/88 [00:00<00:00, 653.74batch/s, train/train_loss_1=0.0691]Epoch 6: 75%|#######5 | 66/88 [00:00<00:00, 653.74batch/s, train/train_loss_1=0.0663]Epoch 6: 75%|#######5 | 66/88 [00:00<00:00, 653.74batch/s, train/train_loss_1=0.0806]Epoch 6: 75%|#######5 | 66/88 [00:00<00:00, 653.74batch/s, train/train_loss_1=0.0784]Epoch 6: 75%|#######5 | 66/88 [00:00<00:00, 653.74batch/s, train/train_loss_1=0.0817]Epoch 6: 75%|#######5 | 66/88 [00:00<00:00, 653.74batch/s, train/train_loss_1=0.0692]Epoch 6: 75%|#######5 | 66/88 [00:00<00:00, 653.74batch/s, train/train_loss_1=0.0556]Epoch 6: 75%|#######5 | 66/88 [00:00<00:00, 653.74batch/s, train/train_loss_1=0.0742]Epoch 6: 75%|#######5 | 66/88 [00:00<00:00, 653.74batch/s, train/train_loss_1=0.0673]Epoch 6: 75%|#######5 | 66/88 [00:00<00:00, 653.74batch/s, train/train_loss_1=0.0747]Epoch 6: 75%|#######5 | 66/88 [00:00<00:00, 653.74batch/s, train/train_loss_1=0.0681]Epoch 6: 75%|#######5 | 66/88 [00:00<00:00, 653.74batch/s, train/train_loss_1=0.0696]Epoch 6: 75%|#######5 | 66/88 [00:00<00:00, 653.74batch/s, train/train_loss_1=0.0843]Epoch 6: 75%|#######5 | 66/88 [00:00<00:00, 653.74batch/s, train/train_loss_1=0.0769]Epoch 6: 75%|#######5 | 66/88 [00:00<00:00, 653.74batch/s, train/train_loss_1=0.0877] 0%| | 0/88 [00:00<?, ?batch/s]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.078]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0765]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0697]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0691]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0682]Epoch 7: 0%| | 0/88 [00:00<?, 
?batch/s, train/train_loss_1=0.0772]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0854]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.06] Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0734]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0735]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0636]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0761]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0786]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0764]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0659]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0751]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0879]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0837]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0679]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0651]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0723]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0783]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0773]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0776]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0796]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0637]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0675]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0683]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0703]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0662]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0599]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0554]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0742]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0809]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0673]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, 
train/train_loss_1=0.0841]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0809]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0654]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0874]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0715]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.07] Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0544]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0661]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0685]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.068] Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0692]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.074] Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0697]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0739]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0663]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0816]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0793]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0692]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0772]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0805]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0632]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0667]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0696]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0711]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.059] Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0703]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0723]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0688]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0759]Epoch 7: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0684]Epoch 7: 74%|#######3 | 65/88 [00:00<00:00, 642.88batch/s, 
train/train_loss_1=0.0684]Epoch 7: 74%|#######3 | 65/88 [00:00<00:00, 642.88batch/s, train/train_loss_1=0.0756]Epoch 7: 74%|#######3 | 65/88 [00:00<00:00, 642.88batch/s, train/train_loss_1=0.0735]Epoch 7: 74%|#######3 | 65/88 [00:00<00:00, 642.88batch/s, train/train_loss_1=0.0799]Epoch 7: 74%|#######3 | 65/88 [00:00<00:00, 642.88batch/s, train/train_loss_1=0.0562]Epoch 7: 74%|#######3 | 65/88 [00:00<00:00, 642.88batch/s, train/train_loss_1=0.0716]Epoch 7: 74%|#######3 | 65/88 [00:00<00:00, 642.88batch/s, train/train_loss_1=0.066] Epoch 7: 74%|#######3 | 65/88 [00:00<00:00, 642.88batch/s, train/train_loss_1=0.0722]Epoch 7: 74%|#######3 | 65/88 [00:00<00:00, 642.88batch/s, train/train_loss_1=0.0758]Epoch 7: 74%|#######3 | 65/88 [00:00<00:00, 642.88batch/s, train/train_loss_1=0.0787]Epoch 7: 74%|#######3 | 65/88 [00:00<00:00, 642.88batch/s, train/train_loss_1=0.0738]Epoch 7: 74%|#######3 | 65/88 [00:00<00:00, 642.88batch/s, train/train_loss_1=0.0739]Epoch 7: 74%|#######3 | 65/88 [00:00<00:00, 642.88batch/s, train/train_loss_1=0.0692]Epoch 7: 74%|#######3 | 65/88 [00:00<00:00, 642.88batch/s, train/train_loss_1=0.072] Epoch 7: 74%|#######3 | 65/88 [00:00<00:00, 642.88batch/s, train/train_loss_1=0.0677]Epoch 7: 74%|#######3 | 65/88 [00:00<00:00, 642.88batch/s, train/train_loss_1=0.073] Epoch 7: 74%|#######3 | 65/88 [00:00<00:00, 642.88batch/s, train/train_loss_1=0.0726]Epoch 7: 74%|#######3 | 65/88 [00:00<00:00, 642.88batch/s, train/train_loss_1=0.0705]Epoch 7: 74%|#######3 | 65/88 [00:00<00:00, 642.88batch/s, train/train_loss_1=0.0694]Epoch 7: 74%|#######3 | 65/88 [00:00<00:00, 642.88batch/s, train/train_loss_1=0.0718]Epoch 7: 74%|#######3 | 65/88 [00:00<00:00, 642.88batch/s, train/train_loss_1=0.078] Epoch 7: 74%|#######3 | 65/88 [00:00<00:00, 642.88batch/s, train/train_loss_1=0.0799]Epoch 7: 74%|#######3 | 65/88 [00:00<00:00, 642.88batch/s, train/train_loss_1=0.0691]Epoch 7: 74%|#######3 | 65/88 [00:00<00:00, 642.88batch/s, train/train_loss_1=0.0802] 0%| | 0/88 
[00:00<?, ?batch/s]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0732]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0699]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0569]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0707]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0687]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0731]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0717]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0749]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0745]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0783]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0709]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0651]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0661]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0694]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0715]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0856]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0722]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0778]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0879]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0748]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0689]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0627]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0715]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0665]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0662]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0667]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0632]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0775]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.081] Epoch 8: 0%| | 0/88 
[00:00<?, ?batch/s, train/train_loss_1=0.0706]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0852]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0706]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.072] Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0796]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0731]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0749]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0887]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0594]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0746]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0825]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0766]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.076] Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.073]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0646]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0806]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0688]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0756]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0666]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0763]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0728]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0809]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0635]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0692]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0678]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0619]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0741]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0502]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0851]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.07] Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, 
train/train_loss_1=0.0715]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0789]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0611]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0741]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0629]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0829]Epoch 8: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0831]Epoch 8: 75%|#######5 | 66/88 [00:00<00:00, 652.66batch/s, train/train_loss_1=0.0831]Epoch 8: 75%|#######5 | 66/88 [00:00<00:00, 652.66batch/s, train/train_loss_1=0.0762]Epoch 8: 75%|#######5 | 66/88 [00:00<00:00, 652.66batch/s, train/train_loss_1=0.0767]Epoch 8: 75%|#######5 | 66/88 [00:00<00:00, 652.66batch/s, train/train_loss_1=0.078] Epoch 8: 75%|#######5 | 66/88 [00:00<00:00, 652.66batch/s, train/train_loss_1=0.0873]Epoch 8: 75%|#######5 | 66/88 [00:00<00:00, 652.66batch/s, train/train_loss_1=0.078] Epoch 8: 75%|#######5 | 66/88 [00:00<00:00, 652.66batch/s, train/train_loss_1=0.0672]Epoch 8: 75%|#######5 | 66/88 [00:00<00:00, 652.66batch/s, train/train_loss_1=0.0719]Epoch 8: 75%|#######5 | 66/88 [00:00<00:00, 652.66batch/s, train/train_loss_1=0.0607]Epoch 8: 75%|#######5 | 66/88 [00:00<00:00, 652.66batch/s, train/train_loss_1=0.0648]Epoch 8: 75%|#######5 | 66/88 [00:00<00:00, 652.66batch/s, train/train_loss_1=0.0605]Epoch 8: 75%|#######5 | 66/88 [00:00<00:00, 652.66batch/s, train/train_loss_1=0.0762]Epoch 8: 75%|#######5 | 66/88 [00:00<00:00, 652.66batch/s, train/train_loss_1=0.0742]Epoch 8: 75%|#######5 | 66/88 [00:00<00:00, 652.66batch/s, train/train_loss_1=0.0602]Epoch 8: 75%|#######5 | 66/88 [00:00<00:00, 652.66batch/s, train/train_loss_1=0.0611]Epoch 8: 75%|#######5 | 66/88 [00:00<00:00, 652.66batch/s, train/train_loss_1=0.0753]Epoch 8: 75%|#######5 | 66/88 [00:00<00:00, 652.66batch/s, train/train_loss_1=0.0601]Epoch 8: 75%|#######5 | 66/88 [00:00<00:00, 652.66batch/s, train/train_loss_1=0.0576]Epoch 8: 75%|#######5 | 66/88 [00:00<00:00, 
652.66batch/s, train/train_loss_1=0.0787]Epoch 8: 75%|#######5 | 66/88 [00:00<00:00, 652.66batch/s, train/train_loss_1=0.0678]Epoch 8: 75%|#######5 | 66/88 [00:00<00:00, 652.66batch/s, train/train_loss_1=0.0733]Epoch 8: 75%|#######5 | 66/88 [00:00<00:00, 652.66batch/s, train/train_loss_1=0.0824]Epoch 8: 75%|#######5 | 66/88 [00:00<00:00, 652.66batch/s, train/train_loss_1=0.0531] 0%| | 0/88 [00:00<?, ?batch/s]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0744]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0652]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0747]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0771]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0688]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0742]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0684]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0694]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0648]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0777]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0741]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0711]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0547]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0677]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.063] Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0663]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0672]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0736]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0696]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0717]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0814]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0787]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0596]Epoch 9: 0%| | 0/88 [00:00<?, 
?batch/s, train/train_loss_1=0.0621]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0893]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0716]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.069] Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0813]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0805]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0712]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0558]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0657]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0691]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0708]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0659]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0872]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0717]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0742]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0774]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0856]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0661]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0798]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0659]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0779]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0764]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0681]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0691]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0675]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.072] Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0704]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0558]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0579]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0706]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, 
train/train_loss_1=0.0587]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0658]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0718]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0689]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0808]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0776]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0718]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0793]Epoch 9: 0%| | 0/88 [00:00<?, ?batch/s, train/train_loss_1=0.0631]Epoch 9: 70%|####### | 62/88 [00:00<00:00, 613.00batch/s, train/train_loss_1=0.0631]Epoch 9: 70%|####### | 62/88 [00:00<00:00, 613.00batch/s, train/train_loss_1=0.0666]Epoch 9: 70%|####### | 62/88 [00:00<00:00, 613.00batch/s, train/train_loss_1=0.0672]Epoch 9: 70%|####### | 62/88 [00:00<00:00, 613.00batch/s, train/train_loss_1=0.0774]Epoch 9: 70%|####### | 62/88 [00:00<00:00, 613.00batch/s, train/train_loss_1=0.0711]Epoch 9: 70%|####### | 62/88 [00:00<00:00, 613.00batch/s, train/train_loss_1=0.0891]Epoch 9: 70%|####### | 62/88 [00:00<00:00, 613.00batch/s, train/train_loss_1=0.0589]Epoch 9: 70%|####### | 62/88 [00:00<00:00, 613.00batch/s, train/train_loss_1=0.0751]Epoch 9: 70%|####### | 62/88 [00:00<00:00, 613.00batch/s, train/train_loss_1=0.0825]Epoch 9: 70%|####### | 62/88 [00:00<00:00, 613.00batch/s, train/train_loss_1=0.0631]Epoch 9: 70%|####### | 62/88 [00:00<00:00, 613.00batch/s, train/train_loss_1=0.0646]Epoch 9: 70%|####### | 62/88 [00:00<00:00, 613.00batch/s, train/train_loss_1=0.0772]Epoch 9: 70%|####### | 62/88 [00:00<00:00, 613.00batch/s, train/train_loss_1=0.0693]Epoch 9: 70%|####### | 62/88 [00:00<00:00, 613.00batch/s, train/train_loss_1=0.0723]Epoch 9: 70%|####### | 62/88 [00:00<00:00, 613.00batch/s, train/train_loss_1=0.0686]Epoch 9: 70%|####### | 62/88 [00:00<00:00, 613.00batch/s, train/train_loss_1=0.0579]Epoch 9: 70%|####### | 62/88 [00:00<00:00, 613.00batch/s, train/train_loss_1=0.0629]Epoch 9: 
70%|####### | 62/88 [00:00<00:00, 613.00batch/s, train/train_loss_1=0.0767]Epoch 9: 70%|####### | 62/88 [00:00<00:00, 613.00batch/s, train/train_loss_1=0.0749]Epoch 9: 70%|####### | 62/88 [00:00<00:00, 613.00batch/s, train/train_loss_1=0.0855]Epoch 9: 70%|####### | 62/88 [00:00<00:00, 613.00batch/s, train/train_loss_1=0.084] Epoch 9: 70%|####### | 62/88 [00:00<00:00, 613.00batch/s, train/train_loss_1=0.0827]Epoch 9: 70%|####### | 62/88 [00:00<00:00, 613.00batch/s, train/train_loss_1=0.0679]Epoch 9: 70%|####### | 62/88 [00:00<00:00, 613.00batch/s, train/train_loss_1=0.0835]Epoch 9: 70%|####### | 62/88 [00:00<00:00, 613.00batch/s, train/train_loss_1=0.0772]Epoch 9: 70%|####### | 62/88 [00:00<00:00, 613.00batch/s, train/train_loss_1=0.0776]Epoch 9: 70%|####### | 62/88 [00:00<00:00, 613.00batch/s, train/train_loss_1=0.0869]Epoch 9: 100%|##########| 88/88 [00:00<00:00, 536.30batch/s, train/train_loss_1=0.0869]
0%| | 0/14 [00:00<?, ?batch/s]Epoch 0: 0%| | 0/14 [00:00<?, ?batch/s]Epoch 0: 0%| | 0/14 [00:01<?, ?batch/s, train/train_loss_1=0.136]Epoch 0: 7%|7 | 1/14 [00:01<00:13, 1.06s/batch, train/train_loss_1=0.136]Epoch 0: 7%|7 | 1/14 [00:01<00:13, 1.06s/batch, train/train_loss_1=0.129]Epoch 0: 7%|7 | 1/14 [00:01<00:13, 1.06s/batch, train/train_loss_1=0.121]Epoch 0: 7%|7 | 1/14 [00:01<00:13, 1.06s/batch, train/train_loss_1=0.127]Epoch 0: 7%|7 | 1/14 [00:01<00:13, 1.06s/batch, train/train_loss_1=0.115]Epoch 0: 7%|7 | 1/14 [00:01<00:13, 1.06s/batch, train/train_loss_1=0.111]Epoch 0: 7%|7 | 1/14 [00:01<00:13, 1.06s/batch, train/train_loss_1=0.115]Epoch 0: 7%|7 | 1/14 [00:01<00:13, 1.06s/batch, train/train_loss_1=0.106]Epoch 0: 7%|7 | 1/14 [00:01<00:13, 1.06s/batch, train/train_loss_1=0.131]Epoch 0: 7%|7 | 1/14 [00:01<00:13, 1.06s/batch, train/train_loss_1=0.114]Epoch 0: 7%|7 | 1/14 [00:01<00:13, 1.06s/batch, train/train_loss_1=0.124]Epoch 0: 7%|7 | 1/14 [00:01<00:13, 1.06s/batch, train/train_loss_1=0.124]Epoch 0: 7%|7 | 1/14 [00:01<00:13, 1.06s/batch, train/train_loss_1=0.114]Epoch 0: 7%|7 | 1/14 [00:02<00:13, 1.06s/batch, train/train_loss_1=0.138]Epoch 0: 100%|##########| 14/14 [00:02<00:00, 7.59batch/s, train/train_loss_1=0.138] 0%| | 0/14 [00:00<?, ?batch/s]Epoch 1: 0%| | 0/14 [00:00<?, ?batch/s]Epoch 1: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.109]Epoch 1: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.119]Epoch 1: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.1] Epoch 1: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.108]Epoch 1: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.121]Epoch 1: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.113]Epoch 1: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.104]Epoch 1: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0894]Epoch 1: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0994]Epoch 1: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0998]Epoch 1: 0%| | 0/14 [00:00<?, 
?batch/s, train/train_loss_1=0.0809]Epoch 1: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0917]Epoch 1: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0934]Epoch 1: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0959] 0%| | 0/14 [00:00<?, ?batch/s]Epoch 2: 0%| | 0/14 [00:00<?, ?batch/s]Epoch 2: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.071]Epoch 2: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0782]Epoch 2: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0771]Epoch 2: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.106] Epoch 2: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0737]Epoch 2: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0908]Epoch 2: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0723]Epoch 2: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0949]Epoch 2: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0716]Epoch 2: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0876]Epoch 2: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0578]Epoch 2: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0625]Epoch 2: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0585]Epoch 2: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.086] 0%| | 0/14 [00:00<?, ?batch/s]Epoch 3: 0%| | 0/14 [00:00<?, ?batch/s]Epoch 3: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0631]Epoch 3: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0718]Epoch 3: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0402]Epoch 3: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0555]Epoch 3: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.055] Epoch 3: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0527]Epoch 3: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0695]Epoch 3: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0861]Epoch 3: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0708]Epoch 3: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0511]Epoch 3: 0%| | 0/14 [00:00<?, ?batch/s, 
train/train_loss_1=0.041] Epoch 3: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0651]Epoch 3: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0402]Epoch 3: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0664] 0%| | 0/14 [00:00<?, ?batch/s]Epoch 4: 0%| | 0/14 [00:00<?, ?batch/s]Epoch 4: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.04]Epoch 4: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0405]Epoch 4: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0261]Epoch 4: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0688]Epoch 4: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0443]Epoch 4: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0642]Epoch 4: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0677]Epoch 4: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.051] Epoch 4: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0476]Epoch 4: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.058] Epoch 4: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0322]Epoch 4: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0829]Epoch 4: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0456]Epoch 4: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0147] 0%| | 0/14 [00:00<?, ?batch/s]Epoch 5: 0%| | 0/14 [00:00<?, ?batch/s]Epoch 5: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0356]Epoch 5: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0393]Epoch 5: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0349]Epoch 5: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0427]Epoch 5: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0495]Epoch 5: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0487]Epoch 5: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.031] Epoch 5: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0398]Epoch 5: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0348]Epoch 5: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0335]Epoch 5: 0%| | 0/14 [00:00<?, ?batch/s, 
train/train_loss_1=0.0344]Epoch 5: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0543]Epoch 5: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0167]Epoch 5: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.04] 0%| | 0/14 [00:00<?, ?batch/s]Epoch 6: 0%| | 0/14 [00:00<?, ?batch/s]Epoch 6: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0353]Epoch 6: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0314]Epoch 6: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0487]Epoch 6: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0218]Epoch 6: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0414]Epoch 6: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0316]Epoch 6: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0547]Epoch 6: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0464]Epoch 6: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0789]Epoch 6: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0394]Epoch 6: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0575]Epoch 6: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0444]Epoch 6: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0266]Epoch 6: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0169] 0%| | 0/14 [00:00<?, ?batch/s]Epoch 7: 0%| | 0/14 [00:00<?, ?batch/s]Epoch 7: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0412]Epoch 7: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.04] Epoch 7: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0511]Epoch 7: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.023] Epoch 7: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0272]Epoch 7: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0469]Epoch 7: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0376]Epoch 7: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0291]Epoch 7: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0449]Epoch 7: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0253]Epoch 7: 0%| | 0/14 [00:00<?, ?batch/s, 
train/train_loss_1=0.0225]Epoch 7: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0433]Epoch 7: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0379]Epoch 7: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0245] 0%| | 0/14 [00:00<?, ?batch/s]Epoch 8: 0%| | 0/14 [00:00<?, ?batch/s]Epoch 8: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0559]Epoch 8: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.045] Epoch 8: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0394]Epoch 8: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0189]Epoch 8: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.033] Epoch 8: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0354]Epoch 8: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0277]Epoch 8: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0358]Epoch 8: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0504]Epoch 8: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0284]Epoch 8: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0242]Epoch 8: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0464]Epoch 8: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.02] Epoch 8: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0314] 0%| | 0/14 [00:00<?, ?batch/s]Epoch 9: 0%| | 0/14 [00:00<?, ?batch/s]Epoch 9: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0263]Epoch 9: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0262]Epoch 9: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0301]Epoch 9: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0457]Epoch 9: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0573]Epoch 9: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0381]Epoch 9: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0312]Epoch 9: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.00837]Epoch 9: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.00966]Epoch 9: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.027] Epoch 9: 0%| | 0/14 [00:00<?, ?batch/s, 
train/train_loss_1=0.031]Epoch 9: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0196]Epoch 9: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.033] Epoch 9: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0509]Epoch 9: 100%|##########| 14/14 [00:00<00:00, 944.10batch/s, train/train_loss_1=0.0509]
/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/sklearn/preprocessing/_encoders.py:868: FutureWarning: `sparse` was renamed to `sparse_output` in version 1.2 and will be removed in 1.4. `sparse_output` is ignored unless you leave `sparse` to its default value.
warnings.warn(
0%| | 0/16 [00:00<?, ?batch/s]Epoch 0: 0%| | 0/16 [00:00<?, ?batch/s]Epoch 0: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.173]Epoch 0: 6%|6 | 1/16 [00:00<00:14, 1.02batch/s, train/train_loss_1=0.173]Epoch 0: 6%|6 | 1/16 [00:00<00:14, 1.02batch/s, train/train_loss_1=0.119]Epoch 0: 6%|6 | 1/16 [00:00<00:14, 1.02batch/s, train/train_loss_1=0.131]Epoch 0: 6%|6 | 1/16 [00:00<00:14, 1.02batch/s, train/train_loss_1=0.116]Epoch 0: 6%|6 | 1/16 [00:00<00:14, 1.02batch/s, train/train_loss_1=0.121]Epoch 0: 6%|6 | 1/16 [00:00<00:14, 1.02batch/s, train/train_loss_1=0.0977]Epoch 0: 6%|6 | 1/16 [00:00<00:14, 1.02batch/s, train/train_loss_1=0.118] Epoch 0: 6%|6 | 1/16 [00:00<00:14, 1.02batch/s, train/train_loss_1=0.104]Epoch 0: 6%|6 | 1/16 [00:00<00:14, 1.02batch/s, train/train_loss_1=0.1] Epoch 0: 6%|6 | 1/16 [00:00<00:14, 1.02batch/s, train/train_loss_1=0.107]Epoch 0: 6%|6 | 1/16 [00:00<00:14, 1.02batch/s, train/train_loss_1=0.103]Epoch 0: 6%|6 | 1/16 [00:00<00:14, 1.02batch/s, train/train_loss_1=0.0889]Epoch 0: 6%|6 | 1/16 [00:00<00:14, 1.02batch/s, train/train_loss_1=0.133] Epoch 0: 6%|6 | 1/16 [00:00<00:14, 1.02batch/s, train/train_loss_1=0.108]Epoch 0: 6%|6 | 1/16 [00:00<00:14, 1.02batch/s, train/train_loss_1=0.0993]Epoch 0: 6%|6 | 1/16 [00:02<00:14, 1.02batch/s, train/train_loss_1=0.141] Epoch 0: 100%|##########| 16/16 [00:02<00:00, 8.85batch/s, train/train_loss_1=0.141] 0%| | 0/16 [00:00<?, ?batch/s]Epoch 1: 0%| | 0/16 [00:00<?, ?batch/s]Epoch 1: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0872]Epoch 1: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0991]Epoch 1: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0854]Epoch 1: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0584]Epoch 1: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0998]Epoch 1: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.117] Epoch 1: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.112]Epoch 1: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.105]Epoch 1: 0%| 
| 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.107]Epoch 1: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.109]Epoch 1: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0761]Epoch 1: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.121] Epoch 1: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.103]Epoch 1: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0884]Epoch 1: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0942]Epoch 1: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0387] 0%| | 0/16 [00:00<?, ?batch/s]Epoch 2: 0%| | 0/16 [00:00<?, ?batch/s]Epoch 2: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0828]Epoch 2: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0838]Epoch 2: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.107] Epoch 2: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0938]Epoch 2: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0941]Epoch 2: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0641]Epoch 2: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0861]Epoch 2: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0834]Epoch 2: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.105] Epoch 2: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0735]Epoch 2: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0735]Epoch 2: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.102] Epoch 2: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0822]Epoch 2: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0732]Epoch 2: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0764]Epoch 2: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0421] 0%| | 0/16 [00:00<?, ?batch/s]Epoch 3: 0%| | 0/16 [00:00<?, ?batch/s]Epoch 3: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.102]Epoch 3: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0773]Epoch 3: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0699]Epoch 3: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0761]Epoch 3: 0%| | 0/16 
[00:00<?, ?batch/s, train/train_loss_1=0.0662]Epoch 3: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0919]Epoch 3: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0685]Epoch 3: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0536]Epoch 3: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.073] Epoch 3: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0768]Epoch 3: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0807]Epoch 3: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0647]Epoch 3: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0498]Epoch 3: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0932]Epoch 3: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0938]Epoch 3: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0672] 0%| | 0/16 [00:00<?, ?batch/s]Epoch 4: 0%| | 0/16 [00:00<?, ?batch/s]Epoch 4: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0767]Epoch 4: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0841]Epoch 4: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0376]Epoch 4: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0553]Epoch 4: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0603]Epoch 4: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0665]Epoch 4: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0787]Epoch 4: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0459]Epoch 4: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0626]Epoch 4: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0528]Epoch 4: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0428]Epoch 4: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.034] Epoch 4: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.035]Epoch 4: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.053]Epoch 4: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.07] Epoch 4: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0257] 0%| | 0/16 [00:00<?, ?batch/s]Epoch 5: 0%| | 0/16 [00:00<?, ?batch/s]Epoch 5: 0%| | 0/16 [00:00<?, 
?batch/s, train/train_loss_1=0.0697]Epoch 5: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0585]Epoch 5: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0521]Epoch 5: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0256]Epoch 5: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0797]Epoch 5: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.03] Epoch 5: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0545]Epoch 5: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0477]Epoch 5: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0312]Epoch 5: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0438]Epoch 5: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0419]Epoch 5: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0845]Epoch 5: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0572]Epoch 5: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0456]Epoch 5: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0569]Epoch 5: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0323] 0%| | 0/16 [00:00<?, ?batch/s]Epoch 6: 0%| | 0/16 [00:00<?, ?batch/s]Epoch 6: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.032]Epoch 6: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.047]Epoch 6: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0648]Epoch 6: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0521]Epoch 6: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0501]Epoch 6: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0305]Epoch 6: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0248]Epoch 6: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0566]Epoch 6: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0594]Epoch 6: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0227]Epoch 6: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0552]Epoch 6: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0708]Epoch 6: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0444]Epoch 6: 0%| | 0/16 [00:00<?, ?batch/s, 
train/train_loss_1=0.0469]Epoch 6: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0408]Epoch 6: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0155] 0%| | 0/16 [00:00<?, ?batch/s]Epoch 7: 0%| | 0/16 [00:00<?, ?batch/s]Epoch 7: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0337]Epoch 7: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0505]Epoch 7: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0355]Epoch 7: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0541]Epoch 7: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0216]Epoch 7: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0588]Epoch 7: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.058] Epoch 7: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.04] Epoch 7: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0597]Epoch 7: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0384]Epoch 7: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0525]Epoch 7: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0177]Epoch 7: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0379]Epoch 7: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0151]Epoch 7: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0365]Epoch 7: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0297] 0%| | 0/16 [00:00<?, ?batch/s]Epoch 8: 0%| | 0/16 [00:00<?, ?batch/s]Epoch 8: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0557]Epoch 8: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0282]Epoch 8: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0369]Epoch 8: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0303]Epoch 8: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0248]Epoch 8: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0353]Epoch 8: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0452]Epoch 8: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0458]Epoch 8: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0467]Epoch 8: 0%| | 0/16 [00:00<?, ?batch/s, 
train/train_loss_1=0.0391]Epoch 8: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0267]Epoch 8: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0576]Epoch 8: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0428]Epoch 8: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0373]Epoch 8: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0304]Epoch 8: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0221] 0%| | 0/16 [00:00<?, ?batch/s]Epoch 9: 0%| | 0/16 [00:00<?, ?batch/s]Epoch 9: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.066]Epoch 9: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0292]Epoch 9: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0635]Epoch 9: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0227]Epoch 9: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0278]Epoch 9: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0239]Epoch 9: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0459]Epoch 9: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0494]Epoch 9: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0494]Epoch 9: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0294]Epoch 9: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0219]Epoch 9: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0207]Epoch 9: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0146]Epoch 9: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0188]Epoch 9: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.0157]Epoch 9: 0%| | 0/16 [00:00<?, ?batch/s, train/train_loss_1=0.129] Epoch 9: 100%|##########| 16/16 [00:00<00:00, 884.96batch/s, train/train_loss_1=0.129]
/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/sklearn/preprocessing/_encoders.py:868: FutureWarning: `sparse` was renamed to `sparse_output` in version 1.2 and will be removed in 1.4. `sparse_output` is ignored unless you leave `sparse` to its default value.
warnings.warn(
0%| | 0/11 [00:00<?, ?batch/s]Epoch 0: 0%| | 0/11 [00:00<?, ?batch/s]Epoch 0: 0%| | 0/11 [00:01<?, ?batch/s, train/train_loss_1=0.156]Epoch 0: 9%|9 | 1/11 [00:01<00:11, 1.20s/batch, train/train_loss_1=0.156]Epoch 0: 9%|9 | 1/11 [00:01<00:11, 1.20s/batch, train/train_loss_1=0.138]Epoch 0: 9%|9 | 1/11 [00:01<00:11, 1.20s/batch, train/train_loss_1=0.141]Epoch 0: 9%|9 | 1/11 [00:01<00:11, 1.20s/batch, train/train_loss_1=0.135]Epoch 0: 9%|9 | 1/11 [00:01<00:11, 1.20s/batch, train/train_loss_1=0.124]Epoch 0: 9%|9 | 1/11 [00:01<00:11, 1.20s/batch, train/train_loss_1=0.126]Epoch 0: 9%|9 | 1/11 [00:01<00:11, 1.20s/batch, train/train_loss_1=0.127]Epoch 0: 9%|9 | 1/11 [00:01<00:11, 1.20s/batch, train/train_loss_1=0.12] Epoch 0: 9%|9 | 1/11 [00:01<00:11, 1.20s/batch, train/train_loss_1=0.127]Epoch 0: 9%|9 | 1/11 [00:01<00:11, 1.20s/batch, train/train_loss_1=0.111]Epoch 0: 9%|9 | 1/11 [00:02<00:11, 1.20s/batch, train/train_loss_1=0.122]Epoch 0: 100%|##########| 11/11 [00:02<00:00, 5.54batch/s, train/train_loss_1=0.122] 0%| | 0/11 [00:00<?, ?batch/s]Epoch 1: 0%| | 0/11 [00:00<?, ?batch/s]Epoch 1: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.115]Epoch 1: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.123]Epoch 1: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.131]Epoch 1: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.111]Epoch 1: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.123]Epoch 1: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.117]Epoch 1: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.119]Epoch 1: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.113]Epoch 1: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.106]Epoch 1: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.121]Epoch 1: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0913] 0%| | 0/11 [00:00<?, ?batch/s]Epoch 2: 0%| | 0/11 [00:00<?, ?batch/s]Epoch 2: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.107]Epoch 2: 0%| | 0/11 [00:00<?, ?batch/s, 
train/train_loss_1=0.108]Epoch 2: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.11] Epoch 2: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.104]Epoch 2: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.103]Epoch 2: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.119]Epoch 2: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.117]Epoch 2: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.118]Epoch 2: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.109]Epoch 2: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.105]Epoch 2: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0937] 0%| | 0/11 [00:00<?, ?batch/s]Epoch 3: 0%| | 0/11 [00:00<?, ?batch/s]Epoch 3: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.108]Epoch 3: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.107]Epoch 3: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.115]Epoch 3: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0903]Epoch 3: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.103] Epoch 3: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0806]Epoch 3: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.102] Epoch 3: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0838]Epoch 3: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.104] Epoch 3: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0972]Epoch 3: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0815] 0%| | 0/11 [00:00<?, ?batch/s]Epoch 4: 0%| | 0/11 [00:00<?, ?batch/s]Epoch 4: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.11]Epoch 4: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.1] Epoch 4: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0809]Epoch 4: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0902]Epoch 4: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.102] Epoch 4: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0783]Epoch 4: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0868]Epoch 4: 0%| | 0/11 [00:00<?, ?batch/s, 
train/train_loss_1=0.0863]Epoch 4: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0928]Epoch 4: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.109] Epoch 4: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.101] 0%| | 0/11 [00:00<?, ?batch/s]Epoch 5: 0%| | 0/11 [00:00<?, ?batch/s]Epoch 5: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0899]Epoch 5: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.075] Epoch 5: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0887]Epoch 5: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.11] Epoch 5: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.126]Epoch 5: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.109]Epoch 5: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0979]Epoch 5: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.111] Epoch 5: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0756]Epoch 5: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0752]Epoch 5: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0798] 0%| | 0/11 [00:00<?, ?batch/s]Epoch 6: 0%| | 0/11 [00:00<?, ?batch/s]Epoch 6: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0778]Epoch 6: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0721]Epoch 6: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0895]Epoch 6: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.107] Epoch 6: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0664]Epoch 6: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0976]Epoch 6: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.091] Epoch 6: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.08] Epoch 6: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0844]Epoch 6: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0866]Epoch 6: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0826] 0%| | 0/11 [00:00<?, ?batch/s]Epoch 7: 0%| | 0/11 [00:00<?, ?batch/s]Epoch 7: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0624]Epoch 7: 0%| | 0/11 [00:00<?, ?batch/s, 
train/train_loss_1=0.0828]Epoch 7: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.111] Epoch 7: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0775]Epoch 7: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.103] Epoch 7: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0782]Epoch 7: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0641]Epoch 7: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.106] Epoch 7: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.065]Epoch 7: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0739]Epoch 7: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.074] 0%| | 0/11 [00:00<?, ?batch/s]Epoch 8: 0%| | 0/11 [00:00<?, ?batch/s]Epoch 8: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0927]Epoch 8: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0816]Epoch 8: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.101] Epoch 8: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0646]Epoch 8: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0782]Epoch 8: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0771]Epoch 8: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.066] Epoch 8: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0668]Epoch 8: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0631]Epoch 8: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0742]Epoch 8: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0767] 0%| | 0/11 [00:00<?, ?batch/s]Epoch 9: 0%| | 0/11 [00:00<?, ?batch/s]Epoch 9: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0586]Epoch 9: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.069] Epoch 9: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0685]Epoch 9: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0687]Epoch 9: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0769]Epoch 9: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.121] Epoch 9: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0803]Epoch 9: 0%| | 0/11 [00:00<?, ?batch/s, 
train/train_loss_1=0.0675]Epoch 9: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0674]Epoch 9: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0638]Epoch 9: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0687]Epoch 9: 100%|##########| 11/11 [00:00<00:00, 723.87batch/s, train/train_loss_1=0.0687]
/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/sklearn/preprocessing/_encoders.py:868: FutureWarning: `sparse` was renamed to `sparse_output` in version 1.2 and will be removed in 1.4. `sparse_output` is ignored unless you leave `sparse` to its default value.
warnings.warn(
0%| | 0/12 [00:00<?, ?batch/s]Epoch 0: 0%| | 0/12 [00:00<?, ?batch/s]Epoch 0: 0%| | 0/12 [00:01<?, ?batch/s, train/train_loss_1=0.147]Epoch 0: 8%|8 | 1/12 [00:01<00:12, 1.12s/batch, train/train_loss_1=0.147]Epoch 0: 8%|8 | 1/12 [00:01<00:12, 1.12s/batch, train/train_loss_1=0.126]Epoch 0: 8%|8 | 1/12 [00:01<00:12, 1.12s/batch, train/train_loss_1=0.119]Epoch 0: 8%|8 | 1/12 [00:01<00:12, 1.12s/batch, train/train_loss_1=0.112]Epoch 0: 8%|8 | 1/12 [00:01<00:12, 1.12s/batch, train/train_loss_1=0.117]Epoch 0: 8%|8 | 1/12 [00:01<00:12, 1.12s/batch, train/train_loss_1=0.121]Epoch 0: 8%|8 | 1/12 [00:01<00:12, 1.12s/batch, train/train_loss_1=0.101]Epoch 0: 8%|8 | 1/12 [00:01<00:12, 1.12s/batch, train/train_loss_1=0.114]Epoch 0: 8%|8 | 1/12 [00:01<00:12, 1.12s/batch, train/train_loss_1=0.109]Epoch 0: 8%|8 | 1/12 [00:01<00:12, 1.12s/batch, train/train_loss_1=0.119]Epoch 0: 8%|8 | 1/12 [00:01<00:12, 1.12s/batch, train/train_loss_1=0.123]Epoch 0: 8%|8 | 1/12 [00:02<00:12, 1.12s/batch, train/train_loss_1=0.119]Epoch 0: 100%|##########| 12/12 [00:02<00:00, 6.05batch/s, train/train_loss_1=0.119] 0%| | 0/12 [00:00<?, ?batch/s]Epoch 1: 0%| | 0/12 [00:00<?, ?batch/s]Epoch 1: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.109]Epoch 1: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.124]Epoch 1: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0921]Epoch 1: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.102] Epoch 1: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.104]Epoch 1: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.105]Epoch 1: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.107]Epoch 1: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.102]Epoch 1: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0982]Epoch 1: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.114] Epoch 1: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.145]Epoch 1: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0696] 0%| | 0/12 [00:00<?, ?batch/s]Epoch 2: 0%| | 0/12 
[00:00<?, ?batch/s]Epoch 2: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.107]Epoch 2: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.109]Epoch 2: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0961]Epoch 2: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0895]Epoch 2: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.108] Epoch 2: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0888]Epoch 2: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.111] Epoch 2: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0837]Epoch 2: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.108] Epoch 2: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.12] Epoch 2: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.112]Epoch 2: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.105] 0%| | 0/12 [00:00<?, ?batch/s]Epoch 3: 0%| | 0/12 [00:00<?, ?batch/s]Epoch 3: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0938]Epoch 3: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.103] Epoch 3: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.11] Epoch 3: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0973]Epoch 3: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0894]Epoch 3: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0921]Epoch 3: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0966]Epoch 3: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.106] Epoch 3: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.11] Epoch 3: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.105]Epoch 3: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.101]Epoch 3: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0981] 0%| | 0/12 [00:00<?, ?batch/s]Epoch 4: 0%| | 0/12 [00:00<?, ?batch/s]Epoch 4: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0987]Epoch 4: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.082] Epoch 4: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0702]Epoch 4: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.101] 
Epoch 4: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.101]Epoch 4: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0849]Epoch 4: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.103] Epoch 4: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.103]Epoch 4: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.112]Epoch 4: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0962]Epoch 4: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.118] Epoch 4: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0757] 0%| | 0/12 [00:00<?, ?batch/s]Epoch 5: 0%| | 0/12 [00:00<?, ?batch/s]Epoch 5: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.09]Epoch 5: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0875]Epoch 5: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.102] Epoch 5: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0966]Epoch 5: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.102] Epoch 5: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0675]Epoch 5: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0768]Epoch 5: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0976]Epoch 5: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.127] Epoch 5: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0804]Epoch 5: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0905]Epoch 5: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0964] 0%| | 0/12 [00:00<?, ?batch/s]Epoch 6: 0%| | 0/12 [00:00<?, ?batch/s]Epoch 6: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0979]Epoch 6: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0934]Epoch 6: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0912]Epoch 6: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0842]Epoch 6: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.08] Epoch 6: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0872]Epoch 6: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0848]Epoch 6: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.079] Epoch 6: 0%| | 
0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0792]Epoch 6: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.125] Epoch 6: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0884]Epoch 6: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0898] 0%| | 0/12 [00:00<?, ?batch/s]Epoch 7: 0%| | 0/12 [00:00<?, ?batch/s]Epoch 7: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0929]Epoch 7: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0981]Epoch 7: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0913]Epoch 7: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0971]Epoch 7: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0856]Epoch 7: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.105] Epoch 7: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0928]Epoch 7: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0733]Epoch 7: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0863]Epoch 7: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.083] Epoch 7: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.101]Epoch 7: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0914] 0%| | 0/12 [00:00<?, ?batch/s]Epoch 8: 0%| | 0/12 [00:00<?, ?batch/s]Epoch 8: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0941]Epoch 8: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0699]Epoch 8: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0676]Epoch 8: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0838]Epoch 8: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.084] Epoch 8: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.104]Epoch 8: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0841]Epoch 8: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0811]Epoch 8: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0958]Epoch 8: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0796]Epoch 8: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0812]Epoch 8: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0676] 0%| | 0/12 [00:00<?, 
?batch/s]Epoch 9: 0%| | 0/12 [00:00<?, ?batch/s]Epoch 9: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0941]Epoch 9: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0828]Epoch 9: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.109] Epoch 9: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0728]Epoch 9: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0775]Epoch 9: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0934]Epoch 9: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.083] Epoch 9: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.087]Epoch 9: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0929]Epoch 9: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.094] Epoch 9: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.0488]Epoch 9: 0%| | 0/12 [00:00<?, ?batch/s, train/train_loss_1=0.057] Epoch 9: 100%|##########| 12/12 [00:00<00:00, 757.28batch/s, train/train_loss_1=0.057]
0%| | 0/14 [00:00<?, ?batch/s]Epoch 0: 0%| | 0/14 [00:00<?, ?batch/s]Epoch 0: 0%| | 0/14 [00:01<?, ?batch/s, train/train_loss_1=0.126]Epoch 0: 7%|7 | 1/14 [00:01<00:14, 1.12s/batch, train/train_loss_1=0.126]Epoch 0: 7%|7 | 1/14 [00:01<00:14, 1.12s/batch, train/train_loss_1=0.127]Epoch 0: 7%|7 | 1/14 [00:01<00:14, 1.12s/batch, train/train_loss_1=0.125]Epoch 0: 7%|7 | 1/14 [00:01<00:14, 1.12s/batch, train/train_loss_1=0.124]Epoch 0: 7%|7 | 1/14 [00:01<00:14, 1.12s/batch, train/train_loss_1=0.123]Epoch 0: 7%|7 | 1/14 [00:01<00:14, 1.12s/batch, train/train_loss_1=0.121]Epoch 0: 7%|7 | 1/14 [00:01<00:14, 1.12s/batch, train/train_loss_1=0.122]Epoch 0: 7%|7 | 1/14 [00:01<00:14, 1.12s/batch, train/train_loss_1=0.12] Epoch 0: 7%|7 | 1/14 [00:01<00:14, 1.12s/batch, train/train_loss_1=0.121]Epoch 0: 7%|7 | 1/14 [00:01<00:14, 1.12s/batch, train/train_loss_1=0.118]Epoch 0: 7%|7 | 1/14 [00:01<00:14, 1.12s/batch, train/train_loss_1=0.118]Epoch 0: 7%|7 | 1/14 [00:01<00:14, 1.12s/batch, train/train_loss_1=0.116]Epoch 0: 7%|7 | 1/14 [00:01<00:14, 1.12s/batch, train/train_loss_1=0.119]Epoch 0: 7%|7 | 1/14 [00:02<00:14, 1.12s/batch, train/train_loss_1=0.112]Epoch 0: 100%|##########| 14/14 [00:02<00:00, 6.83batch/s, train/train_loss_1=0.112] 0%| | 0/14 [00:00<?, ?batch/s]Epoch 1: 0%| | 0/14 [00:00<?, ?batch/s]Epoch 1: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.113]Epoch 1: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.112]Epoch 1: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.104]Epoch 1: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.111]Epoch 1: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.109]Epoch 1: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.104]Epoch 1: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0977]Epoch 1: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0964]Epoch 1: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0976]Epoch 1: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0988]Epoch 1: 0%| | 0/14 [00:00<?, 
?batch/s, train/train_loss_1=0.0944]Epoch 1: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0937]Epoch 1: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0864]Epoch 1: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0934] 0%| | 0/14 [00:00<?, ?batch/s]Epoch 2: 0%| | 0/14 [00:00<?, ?batch/s]Epoch 2: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0806]Epoch 2: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0864]Epoch 2: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0771]Epoch 2: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0747]Epoch 2: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0774]Epoch 2: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.078] Epoch 2: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.069]Epoch 2: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0706]Epoch 2: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0691]Epoch 2: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0727]Epoch 2: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0664]Epoch 2: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0664]Epoch 2: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0591]Epoch 2: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0623] 0%| | 0/14 [00:00<?, ?batch/s]Epoch 3: 0%| | 0/14 [00:00<?, ?batch/s]Epoch 3: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0573]Epoch 3: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0542]Epoch 3: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0498]Epoch 3: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0557]Epoch 3: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0526]Epoch 3: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0584]Epoch 3: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0468]Epoch 3: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0601]Epoch 3: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0629]Epoch 3: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.062] Epoch 3: 0%| | 0/14 [00:00<?, ?batch/s, 
train/train_loss_1=0.0525]Epoch 3: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0608]Epoch 3: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0525]Epoch 3: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0492] 0%| | 0/14 [00:00<?, ?batch/s]Epoch 4: 0%| | 0/14 [00:00<?, ?batch/s]Epoch 4: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0496]Epoch 4: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0602]Epoch 4: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0435]Epoch 4: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0498]Epoch 4: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0428]Epoch 4: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0484]Epoch 4: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0608]Epoch 4: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0465]Epoch 4: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0493]Epoch 4: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0452]Epoch 4: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0407]Epoch 4: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0467]Epoch 4: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.05] Epoch 4: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0313] 0%| | 0/14 [00:00<?, ?batch/s]Epoch 5: 0%| | 0/14 [00:00<?, ?batch/s]Epoch 5: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0493]Epoch 5: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0317]Epoch 5: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0482]Epoch 5: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0299]Epoch 5: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0393]Epoch 5: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0397]Epoch 5: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0409]Epoch 5: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.05] Epoch 5: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0395]Epoch 5: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0471]Epoch 5: 0%| | 0/14 [00:00<?, ?batch/s, 
train/train_loss_1=0.0456]Epoch 5: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0422]Epoch 5: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0358]Epoch 5: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0582] 0%| | 0/14 [00:00<?, ?batch/s]Epoch 6: 0%| | 0/14 [00:00<?, ?batch/s]Epoch 6: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0472]Epoch 6: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0449]Epoch 6: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0363]Epoch 6: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0361]Epoch 6: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0287]Epoch 6: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0407]Epoch 6: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0342]Epoch 6: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0541]Epoch 6: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0458]Epoch 6: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.042] Epoch 6: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0332]Epoch 6: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0415]Epoch 6: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0383]Epoch 6: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0426] 0%| | 0/14 [00:00<?, ?batch/s]Epoch 7: 0%| | 0/14 [00:00<?, ?batch/s]Epoch 7: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.032]Epoch 7: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0367]Epoch 7: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.039] Epoch 7: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0322]Epoch 7: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0332]Epoch 7: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0254]Epoch 7: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0483]Epoch 7: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0419]Epoch 7: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0365]Epoch 7: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0306]Epoch 7: 0%| | 0/14 [00:00<?, ?batch/s, 
train/train_loss_1=0.0304]Epoch 7: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0402]Epoch 7: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0335]Epoch 7: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0444] 0%| | 0/14 [00:00<?, ?batch/s]Epoch 8: 0%| | 0/14 [00:00<?, ?batch/s]Epoch 8: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0416]Epoch 8: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0429]Epoch 8: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0419]Epoch 8: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0301]Epoch 8: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0392]Epoch 8: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0343]Epoch 8: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0328]Epoch 8: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0341]Epoch 8: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0348]Epoch 8: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.025] Epoch 8: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0326]Epoch 8: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0358]Epoch 8: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0393]Epoch 8: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0426] 0%| | 0/14 [00:00<?, ?batch/s]Epoch 9: 0%| | 0/14 [00:00<?, ?batch/s]Epoch 9: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0331]Epoch 9: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0336]Epoch 9: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0359]Epoch 9: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0369]Epoch 9: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0296]Epoch 9: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0311]Epoch 9: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0319]Epoch 9: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0323]Epoch 9: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0282]Epoch 9: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0338]Epoch 9: 0%| | 0/14 [00:00<?, ?batch/s, 
train/train_loss_1=0.0268]Epoch 9: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0323]Epoch 9: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0345]Epoch 9: 0%| | 0/14 [00:00<?, ?batch/s, train/train_loss_1=0.0338]Epoch 9: 100%|##########| 14/14 [00:00<00:00, 548.25batch/s, train/train_loss_1=0.0338]
0%| | 0/8 [00:00<?, ?batch/s]Epoch 0: 0%| | 0/8 [00:00<?, ?batch/s]Epoch 0: 0%| | 0/8 [00:01<?, ?batch/s, train/train_loss_1=0.151]Epoch 0: 12%|#2 | 1/8 [00:01<00:07, 1.12s/batch, train/train_loss_1=0.151]Epoch 0: 12%|#2 | 1/8 [00:01<00:07, 1.12s/batch, train/train_loss_1=0.127]Epoch 0: 12%|#2 | 1/8 [00:01<00:07, 1.12s/batch, train/train_loss_1=0.107]Epoch 0: 12%|#2 | 1/8 [00:01<00:07, 1.12s/batch, train/train_loss_1=0.0926]Epoch 0: 12%|#2 | 1/8 [00:01<00:07, 1.12s/batch, train/train_loss_1=0.0833]Epoch 0: 12%|#2 | 1/8 [00:01<00:07, 1.12s/batch, train/train_loss_1=0.0707]Epoch 0: 12%|#2 | 1/8 [00:01<00:07, 1.12s/batch, train/train_loss_1=0.0707]Epoch 0: 12%|#2 | 1/8 [00:02<00:07, 1.12s/batch, train/train_loss_1=0.0788]Epoch 0: 100%|##########| 8/8 [00:02<00:00, 3.83batch/s, train/train_loss_1=0.0788] 0%| | 0/8 [00:00<?, ?batch/s]Epoch 1: 0%| | 0/8 [00:00<?, ?batch/s]Epoch 1: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.062]Epoch 1: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0457]Epoch 1: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.051] Epoch 1: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.056]Epoch 1: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0512]Epoch 1: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0334]Epoch 1: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0339]Epoch 1: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0378] 0%| | 0/8 [00:00<?, ?batch/s]Epoch 2: 0%| | 0/8 [00:00<?, ?batch/s]Epoch 2: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0359]Epoch 2: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0358]Epoch 2: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0415]Epoch 2: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0422]Epoch 2: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0412]Epoch 2: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0308]Epoch 2: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0301]Epoch 2: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.044] 0%| | 0/8 
[00:00<?, ?batch/s]Epoch 3: 0%| | 0/8 [00:00<?, ?batch/s]Epoch 3: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0364]Epoch 3: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0283]Epoch 3: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0336]Epoch 3: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0373]Epoch 3: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0282]Epoch 3: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0391]Epoch 3: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0312]Epoch 3: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0401] 0%| | 0/8 [00:00<?, ?batch/s]Epoch 4: 0%| | 0/8 [00:00<?, ?batch/s]Epoch 4: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0407]Epoch 4: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0209]Epoch 4: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0417]Epoch 4: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0312]Epoch 4: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0355]Epoch 4: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0281]Epoch 4: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0431]Epoch 4: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0204] 0%| | 0/8 [00:00<?, ?batch/s]Epoch 5: 0%| | 0/8 [00:00<?, ?batch/s]Epoch 5: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0318]Epoch 5: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0383]Epoch 5: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.021] Epoch 5: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0362]Epoch 5: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0186]Epoch 5: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0381]Epoch 5: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0405]Epoch 5: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0338] 0%| | 0/8 [00:00<?, ?batch/s]Epoch 6: 0%| | 0/8 [00:00<?, ?batch/s]Epoch 6: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0199]Epoch 6: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.024] Epoch 6: 0%| | 0/8 [00:00<?, ?batch/s, 
train/train_loss_1=0.0424]Epoch 6: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0198]Epoch 6: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0433]Epoch 6: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.044] Epoch 6: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0328]Epoch 6: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0239] 0%| | 0/8 [00:00<?, ?batch/s]Epoch 7: 0%| | 0/8 [00:00<?, ?batch/s]Epoch 7: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0257]Epoch 7: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0254]Epoch 7: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0463]Epoch 7: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0222]Epoch 7: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.04] Epoch 7: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0238]Epoch 7: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0294]Epoch 7: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0441] 0%| | 0/8 [00:00<?, ?batch/s]Epoch 8: 0%| | 0/8 [00:00<?, ?batch/s]Epoch 8: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0276]Epoch 8: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0282]Epoch 8: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0199]Epoch 8: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0415]Epoch 8: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0327]Epoch 8: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0455]Epoch 8: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0255]Epoch 8: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0232] 0%| | 0/8 [00:00<?, ?batch/s]Epoch 9: 0%| | 0/8 [00:00<?, ?batch/s]Epoch 9: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0314]Epoch 9: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0263]Epoch 9: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0365]Epoch 9: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0277]Epoch 9: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0399]Epoch 9: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0279]Epoch 9: 0%| | 
0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0359]Epoch 9: 0%| | 0/8 [00:00<?, ?batch/s, train/train_loss_1=0.0232]Epoch 9: 100%|##########| 8/8 [00:00<00:00, 561.45batch/s, train/train_loss_1=0.0232]
0%| | 0/7 [00:00<?, ?batch/s]Epoch 0: 0%| | 0/7 [00:00<?, ?batch/s]Epoch 0: 0%| | 0/7 [00:01<?, ?batch/s, train/train_loss_1=0.171]Epoch 0: 14%|#4 | 1/7 [00:01<00:08, 1.38s/batch, train/train_loss_1=0.171]Epoch 0: 14%|#4 | 1/7 [00:01<00:08, 1.38s/batch, train/train_loss_1=0.147]Epoch 0: 14%|#4 | 1/7 [00:01<00:08, 1.38s/batch, train/train_loss_1=0.136]Epoch 0: 14%|#4 | 1/7 [00:01<00:08, 1.38s/batch, train/train_loss_1=0.127]Epoch 0: 14%|#4 | 1/7 [00:01<00:08, 1.38s/batch, train/train_loss_1=0.118]Epoch 0: 14%|#4 | 1/7 [00:01<00:08, 1.38s/batch, train/train_loss_1=0.112]Epoch 0: 14%|#4 | 1/7 [00:02<00:08, 1.38s/batch, train/train_loss_1=0.129]Epoch 0: 100%|##########| 7/7 [00:02<00:00, 3.26batch/s, train/train_loss_1=0.129] 0%| | 0/7 [00:00<?, ?batch/s]Epoch 1: 0%| | 0/7 [00:00<?, ?batch/s]Epoch 1: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.114]Epoch 1: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.109]Epoch 1: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.11] Epoch 1: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.115]Epoch 1: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.112]Epoch 1: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.094]Epoch 1: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.105] 0%| | 0/7 [00:00<?, ?batch/s]Epoch 2: 0%| | 0/7 [00:00<?, ?batch/s]Epoch 2: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.116]Epoch 2: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.0976]Epoch 2: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.108] Epoch 2: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.106]Epoch 2: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.107]Epoch 2: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.101]Epoch 2: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.132] 0%| | 0/7 [00:00<?, ?batch/s]Epoch 3: 0%| | 0/7 [00:00<?, ?batch/s]Epoch 3: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.115]Epoch 3: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.109]Epoch 3: 0%| | 0/7 [00:00<?, ?batch/s, 
train/train_loss_1=0.103]Epoch 3: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.109]Epoch 3: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.0983]Epoch 3: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.0878]Epoch 3: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.0899] 0%| | 0/7 [00:00<?, ?batch/s]Epoch 4: 0%| | 0/7 [00:00<?, ?batch/s]Epoch 4: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.104]Epoch 4: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.0976]Epoch 4: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.092] Epoch 4: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.111]Epoch 4: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.0937]Epoch 4: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.103] Epoch 4: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.0908] 0%| | 0/7 [00:00<?, ?batch/s]Epoch 5: 0%| | 0/7 [00:00<?, ?batch/s]Epoch 5: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.096]Epoch 5: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.0983]Epoch 5: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.102] Epoch 5: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.0868]Epoch 5: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.104] Epoch 5: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.0901]Epoch 5: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.125] 0%| | 0/7 [00:00<?, ?batch/s]Epoch 6: 0%| | 0/7 [00:00<?, ?batch/s]Epoch 6: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.0924]Epoch 6: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.101] Epoch 6: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.0939]Epoch 6: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.0893]Epoch 6: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.101] Epoch 6: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.0971]Epoch 6: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.071] 0%| | 0/7 [00:00<?, ?batch/s]Epoch 7: 0%| | 0/7 [00:00<?, ?batch/s]Epoch 7: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.0913]Epoch 7: 0%| | 0/7 
[00:00<?, ?batch/s, train/train_loss_1=0.0845]Epoch 7: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.093] Epoch 7: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.0826]Epoch 7: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.0866]Epoch 7: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.0901]Epoch 7: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.0857] 0%| | 0/7 [00:00<?, ?batch/s]Epoch 8: 0%| | 0/7 [00:00<?, ?batch/s]Epoch 8: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.0895]Epoch 8: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.0865]Epoch 8: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.083] Epoch 8: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.0878]Epoch 8: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.0883]Epoch 8: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.0897]Epoch 8: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.0795] 0%| | 0/7 [00:00<?, ?batch/s]Epoch 9: 0%| | 0/7 [00:00<?, ?batch/s]Epoch 9: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.0908]Epoch 9: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.0827]Epoch 9: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.0872]Epoch 9: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.0898]Epoch 9: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.0759]Epoch 9: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.0781]Epoch 9: 0%| | 0/7 [00:00<?, ?batch/s, train/train_loss_1=0.1] Epoch 9: 100%|##########| 7/7 [00:00<00:00, 639.63batch/s, train/train_loss_1=0.1]
Epoch 0: 100%|##########| 11/11 [00:02<00:00, 4.74batch/s, train/train_loss_1=0.116]
Epoch 1: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.105]
Epoch 2: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0957]
Epoch 3: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0973]
Epoch 4: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0851]
Epoch 5: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0749]
Epoch 6: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0722]
Epoch 7: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0773]
Epoch 8: 0%| | 0/11 [00:00<?, ?batch/s, train/train_loss_1=0.0578]
Epoch 9: 100%|##########| 11/11 [00:00<00:00, 225.54batch/s, train/train_loss_1=0.0545]
/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/sklearn/preprocessing/_encoders.py:868: FutureWarning: `sparse` was renamed to `sparse_output` in version 1.2 and will be removed in 1.4. `sparse_output` is ignored unless you leave `sparse` to its default value.
warnings.warn(
Epoch 0: 100%|##########| 21/21 [00:02<00:00, 10.16batch/s, train/train_loss_1=0.0912]
Epoch 1: 0%| | 0/21 [00:00<?, ?batch/s, train/train_loss_1=0.0734]
Epoch 2: 0%| | 0/21 [00:00<?, ?batch/s, train/train_loss_1=0.0861]
Epoch 3: 0%| | 0/21 [00:00<?, ?batch/s, train/train_loss_1=0.0738]
Epoch 4: 0%| | 0/21 [00:00<?, ?batch/s, train/train_loss_1=0.0892]
Epoch 5: 0%| | 0/21 [00:00<?, ?batch/s, train/train_loss_1=0.0822]
Epoch 6: 0%| | 0/21 [00:00<?, ?batch/s, train/train_loss_1=0.0742]
Epoch 7: 0%| | 0/21 [00:00<?, ?batch/s, train/train_loss_1=0.0655]
Epoch 8: 0%| | 0/21 [00:00<?, ?batch/s, train/train_loss_1=0.0776]
Epoch 9: 100%|##########| 21/21 [00:00<00:00, 555.71batch/s, train/train_loss_1=0.0746]
/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/sklearn/preprocessing/_encoders.py:868: FutureWarning: `sparse` was renamed to `sparse_output` in version 1.2 and will be removed in 1.4. `sparse_output` is ignored unless you leave `sparse` to its default value.
warnings.warn(
Epoch 0: 21%|##1 | 139/655 [00:01<00:03, 162.81batch/s, 
train/train_loss_1=0.112]Epoch 0: 21%|##1 | 139/655 [00:01<00:03, 162.81batch/s, train/train_loss_1=0.109]Epoch 0: 21%|##1 | 139/655 [00:01<00:03, 162.81batch/s, train/train_loss_1=0.102]Epoch 0: 21%|##1 | 139/655 [00:01<00:03, 162.81batch/s, train/train_loss_1=0.11] Epoch 0: 21%|##1 | 139/655 [00:01<00:03, 162.81batch/s, train/train_loss_1=0.103]Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.103]Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.0913]Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.111] Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.0866]Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.111] Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.105]Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.0963]Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.0982]Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.099] Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.0968]Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.117] Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.115]Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.111]Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.112]Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.111]Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.114]Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.0993]Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.111] Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.108]Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, 
train/train_loss_1=0.114]Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.101]Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.105]Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.109]Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.107]Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.0969]Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.113] Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.113]Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.115]Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.0991]Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.111] Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.111]Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.111]Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.0959]Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.105] Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.109]Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.0985]Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.12] Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.103]Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.108]Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.108]Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.11] Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.104]Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.117]Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, 
train/train_loss_1=0.105]Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.0989]Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.103] Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.1] Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.109]Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.11] Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.108]Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.113]Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.106]Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.108]Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.113]Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.102]Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.1] Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.1]Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.102]Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.0998]Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.108] Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.0929]Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.109] Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.1] Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.121]Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.101]Epoch 0: 32%|###1 | 207/655 [00:01<00:01, 248.24batch/s, train/train_loss_1=0.0981]Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.0981]Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.11] 
Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.0987]Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.104] Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.103]Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.115]Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.0995]Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.114] Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.117]Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.108]Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.102]Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.1] Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.108]Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.104]Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.103]Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.101]Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.0983]Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.108] Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.111]Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.111]Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.0973]Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.117] Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.0969]Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.0982]Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.113] Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, 
train/train_loss_1=0.0943]Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.122] Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.106]Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.111]Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.104]Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.0806]Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.11] Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.115]Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.101]Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.104]Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.0902]Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.0963]Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.105] Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.0863]Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.0964]Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.096] Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.0976]Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.109] Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.0944]Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.104] Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.0951]Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.102] Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.105]Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.0944]Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 
324.81batch/s, train/train_loss_1=0.0983]Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.0987]Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.0964]Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.109] Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.102]Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.0955]Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.11] Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.0978]Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.104] Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.0958]Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.0992]Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.11] Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.1] Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.0987]Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.0934]Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.111] Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.104]Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.103]Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.101]Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.111]Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.103]Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.113]Epoch 0: 42%|####1 | 272/655 [00:01<00:01, 324.81batch/s, train/train_loss_1=0.0989]Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.0989]Epoch 0: 52%|#####2 | 343/655 
[00:01<00:00, 406.53batch/s, train/train_loss_1=0.102] Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.0925]Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.0917]Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.111] Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.0935]Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.108] Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.105]Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.109]Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.0908]Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.0974]Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.0864]Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.106] Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.103]Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.0972]Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.0967]Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.0946]Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.0972]Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.1] Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.105]Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.0966]Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.11] Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.117]Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.11] Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, 
train/train_loss_1=0.0984]Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.106] Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.0936]Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.101] Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.0958]Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.105] Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.0865]Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.118] Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.106]Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.106]Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.107]Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.109]Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.0891]Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.103] Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.119]Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.106]Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.106]Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.1] Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.113]Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.0987]Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.109] Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.0991]Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.108] Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.0911]Epoch 0: 52%|#####2 | 
343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.0939]Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.112] Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.0897]Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.104] Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.092]Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.103]Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.108]Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.0851]Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.097] Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.0871]Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.113] Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.107]Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.107]Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.088]Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.0932]Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.0952]Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.0913]Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.106] Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.109]Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.097]Epoch 0: 52%|#####2 | 343/655 [00:01<00:00, 406.53batch/s, train/train_loss_1=0.11] Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.11]Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.106]Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, 
train/train_loss_1=0.0937]Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.0972]Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.103] Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.104]Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.12] Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.0898]Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.107] Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.105]Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.116]Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.102]Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.106]Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.106]Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.104]Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.0902]Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.103] Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.11] Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.0959]Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.119] Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.104]Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.111]Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.103]Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.111]Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.0932]Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.0849]Epoch 0: 
63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.102] Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.107]Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.104]Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.102]Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.119]Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.0989]Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.103] Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.0853]Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.115] Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.11] Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.0988]Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.108] Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.107]Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.101]Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.101]Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.125]Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.107]Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.0988]Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.102] Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.0956]Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.113] Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.111]Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.101]Epoch 0: 63%|######2 | 411/655 
[00:01<00:00, 469.89batch/s, train/train_loss_1=0.0961]Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.103] Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.0986]Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.102] Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.0973]Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.103] Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.1] Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.0981]Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.0915]Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.0916]Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.104] Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.101]Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.105]Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.112]Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.117]Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.103]Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.0961]Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.0957]Epoch 0: 63%|######2 | 411/655 [00:01<00:00, 469.89batch/s, train/train_loss_1=0.103] Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.103]Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.0953]Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.0868]Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.0917]Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 
519.48batch/s, train/train_loss_1=0.106] Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.114]Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.0963]Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.0995]Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.102] Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.0976]Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.0912]Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.111] Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.103]Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.0968]Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.103] Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.106]Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.0999]Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.101] Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.103]Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.094]Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.092]Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.0962]Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.0915]Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.0972]Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.116] Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.112]Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.1] Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 
519.48batch/s, train/train_loss_1=0.0938]Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.1] Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.0967]Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.0888]Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.103] Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.11] Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.0876]Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.103] Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.0968]Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.105] Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.09] Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.101]Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.101]Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.097]Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.107]Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.109]Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.112]Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.113]Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.101]Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.0932]Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.102] Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.101]Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.113]Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 
519.48batch/s, train/train_loss_1=0.112]Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.0973]Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.0957]Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.11] Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.103]Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.0885]Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.102] Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.115]Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.101]Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.104]Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.0973]Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.0977]Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.114] Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.0991]Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.1] Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.0807]Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.116] Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.117]Epoch 0: 73%|#######2 | 478/655 [00:01<00:00, 519.48batch/s, train/train_loss_1=0.0973]Epoch 0: 83%|########3 | 546/655 [00:01<00:00, 560.82batch/s, train/train_loss_1=0.0973]Epoch 0: 83%|########3 | 546/655 [00:01<00:00, 560.82batch/s, train/train_loss_1=0.102] Epoch 0: 83%|########3 | 546/655 [00:01<00:00, 560.82batch/s, train/train_loss_1=0.109]Epoch 0: 83%|########3 | 546/655 [00:01<00:00, 560.82batch/s, train/train_loss_1=0.101]Epoch 0: 83%|########3 | 546/655 [00:01<00:00, 
560.82batch/s, train/train_loss_1=0.11] Epoch 0: 83%|########3 | 546/655 [00:01<00:00, 560.82batch/s, train/train_loss_1=0.114]Epoch 0: 83%|########3 | 546/655 [00:01<00:00, 560.82batch/s, train/train_loss_1=0.0997]Epoch 0: 83%|########3 | 546/655 [00:01<00:00, 560.82batch/s, train/train_loss_1=0.0882]Epoch 0: 83%|########3 | 546/655 [00:01<00:00, 560.82batch/s, train/train_loss_1=0.094] Epoch 0: 83%|########3 | 546/655 [00:01<00:00, 560.82batch/s, train/train_loss_1=0.107]Epoch 0: 83%|########3 | 546/655 [00:01<00:00, 560.82batch/s, train/train_loss_1=0.12] Epoch 0: 83%|########3 | 546/655 [00:01<00:00, 560.82batch/s, train/train_loss_1=0.0938]Epoch 0: 83%|########3 | 546/655 [00:01<00:00, 560.82batch/s, train/train_loss_1=0.101] Epoch 0: 83%|########3 | 546/655 [00:01<00:00, 560.82batch/s, train/train_loss_1=0.112]Epoch 0: 83%|########3 | 546/655 [00:01<00:00, 560.82batch/s, train/train_loss_1=0.107]Epoch 0: 83%|########3 | 546/655 [00:01<00:00, 560.82batch/s, train/train_loss_1=0.0966]Epoch 0: 83%|########3 | 546/655 [00:01<00:00, 560.82batch/s, train/train_loss_1=0.103] Epoch 0: 83%|########3 | 546/655 [00:01<00:00, 560.82batch/s, train/train_loss_1=0.096]Epoch 0: 83%|########3 | 546/655 [00:01<00:00, 560.82batch/s, train/train_loss_1=0.0901]Epoch 0: 83%|########3 | 546/655 [00:01<00:00, 560.82batch/s, train/train_loss_1=0.107] Epoch 0: 83%|########3 | 546/655 [00:01<00:00, 560.82batch/s, train/train_loss_1=0.095]Epoch 0: 83%|########3 | 546/655 [00:01<00:00, 560.82batch/s, train/train_loss_1=0.106]Epoch 0: 83%|########3 | 546/655 [00:01<00:00, 560.82batch/s, train/train_loss_1=0.0932]Epoch 0: 83%|########3 | 546/655 [00:01<00:00, 560.82batch/s, train/train_loss_1=0.095] Epoch 0: 83%|########3 | 546/655 [00:01<00:00, 560.82batch/s, train/train_loss_1=0.103]Epoch 0: 83%|########3 | 546/655 [00:01<00:00, 560.82batch/s, train/train_loss_1=0.107]Epoch 0: 83%|########3 | 546/655 [00:01<00:00, 560.82batch/s, train/train_loss_1=0.106]Epoch 0: 83%|########3 | 546/655 
[00:01<00:00, 560.82batch/s, train/train_loss_1=0.0948]Epoch 0: 83%|########3 | 546/655 [00:01<00:00, 560.82batch/s, train/train_loss_1=0.0994]Epoch 0: 83%|########3 | 546/655 [00:01<00:00, 560.82batch/s, train/train_loss_1=0.112] Epoch 0: 83%|########3 | 546/655 [00:01<00:00, 560.82batch/s, train/train_loss_1=0.112]Epoch 0: 83%|########3 | 546/655 [00:01<00:00, 560.82batch/s, train/train_loss_1=0.109]Epoch 0: 83%|########3 | 546/655 [00:01<00:00, 560.82batch/s, train/train_loss_1=0.0929]Epoch 0: 83%|########3 | 546/655 [00:01<00:00, 560.82batch/s, train/train_loss_1=0.109] Epoch 0: 83%|########3 | 546/655 [00:01<00:00, 560.82batch/s, train/train_loss_1=0.0909]Epoch 0: 83%|########3 | 546/655 [00:01<00:00, 560.82batch/s, train/train_loss_1=0.0999]Epoch 0: 83%|########3 | 546/655 [00:01<00:00, 560.82batch/s, train/train_loss_1=0.101] Epoch 0: 83%|########3 | 546/655 [00:01<00:00, 560.82batch/s, train/train_loss_1=0.084]Epoch 0: 83%|########3 | 546/655 [00:01<00:00, 560.82batch/s, train/train_loss_1=0.102]Epoch 0: 83%|########3 | 546/655 [00:01<00:00, 560.82batch/s, train/train_loss_1=0.108]Epoch 0: 83%|########3 | 546/655 [00:01<00:00, 560.82batch/s, train/train_loss_1=0.101]Epoch 0: 83%|########3 | 546/655 [00:01<00:00, 560.82batch/s, train/train_loss_1=0.0942]Epoch 0: 83%|########3 | 546/655 [00:01<00:00, 560.82batch/s, train/train_loss_1=0.0925]Epoch 0: 83%|########3 | 546/655 [00:01<00:00, 560.82batch/s, train/train_loss_1=0.121] Epoch 0: 83%|########3 | 546/655 [00:01<00:00, 560.82batch/s, train/train_loss_1=0.104]Epoch 0: 83%|########3 | 546/655 [00:01<00:00, 560.82batch/s, train/train_loss_1=0.103]Epoch 0: 83%|########3 | 546/655 [00:01<00:00, 560.82batch/s, train/train_loss_1=0.0921]Epoch 0: 83%|########3 | 546/655 [00:01<00:00, 560.82batch/s, train/train_loss_1=0.11] Epoch 0: 83%|########3 | 546/655 [00:01<00:00, 560.82batch/s, train/train_loss_1=0.0887]Epoch 0: 83%|########3 | 546/655 [00:01<00:00, 560.82batch/s, train/train_loss_1=0.102] Epoch 0: 
83%|########3 | 546/655 [00:01<00:00, 560.82batch/s, train/train_loss_1=0.112]Epoch 0: 83%|########3 | 546/655 [00:01<00:00, 560.82batch/s, train/train_loss_1=0.0998]Epoch 0: 83%|########3 | 546/655 [00:01<00:00, 560.82batch/s, train/train_loss_1=0.0884]Epoch 0: 83%|########3 | 546/655 [00:01<00:00, 560.82batch/s, train/train_loss_1=0.104] Epoch 0: 83%|########3 | 546/655 [00:02<00:00, 560.82batch/s, train/train_loss_1=0.104]Epoch 0: 83%|########3 | 546/655 [00:02<00:00, 560.82batch/s, train/train_loss_1=0.0818]Epoch 0: 83%|########3 | 546/655 [00:02<00:00, 560.82batch/s, train/train_loss_1=0.0962]Epoch 0: 83%|########3 | 546/655 [00:02<00:00, 560.82batch/s, train/train_loss_1=0.0989]Epoch 0: 83%|########3 | 546/655 [00:02<00:00, 560.82batch/s, train/train_loss_1=0.099] Epoch 0: 83%|########3 | 546/655 [00:02<00:00, 560.82batch/s, train/train_loss_1=0.106]Epoch 0: 83%|########3 | 546/655 [00:02<00:00, 560.82batch/s, train/train_loss_1=0.102]Epoch 0: 83%|########3 | 546/655 [00:02<00:00, 560.82batch/s, train/train_loss_1=0.111]Epoch 0: 83%|########3 | 546/655 [00:02<00:00, 560.82batch/s, train/train_loss_1=0.0947]Epoch 0: 83%|########3 | 546/655 [00:02<00:00, 560.82batch/s, train/train_loss_1=0.12] Epoch 0: 83%|########3 | 546/655 [00:02<00:00, 560.82batch/s, train/train_loss_1=0.0988]Epoch 0: 83%|########3 | 546/655 [00:02<00:00, 560.82batch/s, train/train_loss_1=0.0972]Epoch 0: 83%|########3 | 546/655 [00:02<00:00, 560.82batch/s, train/train_loss_1=0.102] Epoch 0: 83%|########3 | 546/655 [00:02<00:00, 560.82batch/s, train/train_loss_1=0.0771]Epoch 0: 83%|########3 | 546/655 [00:02<00:00, 560.82batch/s, train/train_loss_1=0.112] Epoch 0: 94%|#########3| 614/655 [00:02<00:00, 591.88batch/s, train/train_loss_1=0.112]Epoch 0: 94%|#########3| 614/655 [00:02<00:00, 591.88batch/s, train/train_loss_1=0.0968]Epoch 0: 94%|#########3| 614/655 [00:02<00:00, 591.88batch/s, train/train_loss_1=0.108] Epoch 0: 94%|#########3| 614/655 [00:02<00:00, 591.88batch/s, 
train/train_loss_1=0.11] Epoch 0: 94%|#########3| 614/655 [00:02<00:00, 591.88batch/s, train/train_loss_1=0.102]Epoch 0: 94%|#########3| 614/655 [00:02<00:00, 591.88batch/s, train/train_loss_1=0.0982]Epoch 0: 94%|#########3| 614/655 [00:02<00:00, 591.88batch/s, train/train_loss_1=0.102] Epoch 0: 94%|#########3| 614/655 [00:02<00:00, 591.88batch/s, train/train_loss_1=0.099]Epoch 0: 94%|#########3| 614/655 [00:02<00:00, 591.88batch/s, train/train_loss_1=0.0985]Epoch 0: 94%|#########3| 614/655 [00:02<00:00, 591.88batch/s, train/train_loss_1=0.0988]Epoch 0: 94%|#########3| 614/655 [00:02<00:00, 591.88batch/s, train/train_loss_1=0.108] Epoch 0: 94%|#########3| 614/655 [00:02<00:00, 591.88batch/s, train/train_loss_1=0.101]Epoch 0: 94%|#########3| 614/655 [00:02<00:00, 591.88batch/s, train/train_loss_1=0.0898]Epoch 0: 94%|#########3| 614/655 [00:02<00:00, 591.88batch/s, train/train_loss_1=0.0966]Epoch 0: 94%|#########3| 614/655 [00:02<00:00, 591.88batch/s, train/train_loss_1=0.0986]Epoch 0: 94%|#########3| 614/655 [00:02<00:00, 591.88batch/s, train/train_loss_1=0.101] Epoch 0: 94%|#########3| 614/655 [00:02<00:00, 591.88batch/s, train/train_loss_1=0.116]Epoch 0: 94%|#########3| 614/655 [00:02<00:00, 591.88batch/s, train/train_loss_1=0.117]Epoch 0: 94%|#########3| 614/655 [00:02<00:00, 591.88batch/s, train/train_loss_1=0.1] Epoch 0: 94%|#########3| 614/655 [00:02<00:00, 591.88batch/s, train/train_loss_1=0.0849]Epoch 0: 94%|#########3| 614/655 [00:02<00:00, 591.88batch/s, train/train_loss_1=0.103] Epoch 0: 94%|#########3| 614/655 [00:02<00:00, 591.88batch/s, train/train_loss_1=0.0928]Epoch 0: 94%|#########3| 614/655 [00:02<00:00, 591.88batch/s, train/train_loss_1=0.102] Epoch 0: 94%|#########3| 614/655 [00:02<00:00, 591.88batch/s, train/train_loss_1=0.103]Epoch 0: 94%|#########3| 614/655 [00:02<00:00, 591.88batch/s, train/train_loss_1=0.0988]Epoch 0: 94%|#########3| 614/655 [00:02<00:00, 591.88batch/s, train/train_loss_1=0.0994]Epoch 0: 94%|#########3| 614/655 [00:02<00:00, 
591.88batch/s, train/train_loss_1=0.109] Epoch 0: 94%|#########3| 614/655 [00:02<00:00, 591.88batch/s, train/train_loss_1=0.0856]Epoch 0: 94%|#########3| 614/655 [00:02<00:00, 591.88batch/s, train/train_loss_1=0.0905]Epoch 0: 94%|#########3| 614/655 [00:02<00:00, 591.88batch/s, train/train_loss_1=0.104] Epoch 0: 94%|#########3| 614/655 [00:02<00:00, 591.88batch/s, train/train_loss_1=0.0889]Epoch 0: 94%|#########3| 614/655 [00:02<00:00, 591.88batch/s, train/train_loss_1=0.0979]Epoch 0: 94%|#########3| 614/655 [00:02<00:00, 591.88batch/s, train/train_loss_1=0.111] Epoch 0: 94%|#########3| 614/655 [00:02<00:00, 591.88batch/s, train/train_loss_1=0.106]Epoch 0: 94%|#########3| 614/655 [00:02<00:00, 591.88batch/s, train/train_loss_1=0.0841]Epoch 0: 94%|#########3| 614/655 [00:02<00:00, 591.88batch/s, train/train_loss_1=0.0893]Epoch 0: 94%|#########3| 614/655 [00:02<00:00, 591.88batch/s, train/train_loss_1=0.0912]Epoch 0: 94%|#########3| 614/655 [00:02<00:00, 591.88batch/s, train/train_loss_1=0.104] Epoch 0: 94%|#########3| 614/655 [00:02<00:00, 591.88batch/s, train/train_loss_1=0.0953]Epoch 0: 94%|#########3| 614/655 [00:02<00:00, 591.88batch/s, train/train_loss_1=0.106] Epoch 0: 94%|#########3| 614/655 [00:02<00:00, 591.88batch/s, train/train_loss_1=0.108]Epoch 0: 94%|#########3| 614/655 [00:03<00:00, 591.88batch/s, train/train_loss_1=0.0848] 0%| | 0/655 [00:00<?, ?batch/s]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0806]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.108] Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.096]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.107]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.101]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.107]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0963]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.102] Epoch 1: 0%| | 0/655 [00:00<?, 
?batch/s, train/train_loss_1=0.0962]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0973]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0809]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.106] Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0975]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.105] Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.103]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.095]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0852]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.101] Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.104]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0874]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0999]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.104] Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0954]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0941]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.106] Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0951]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.107] Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0964]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.109] Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0741]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.098] Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0931]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0989]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.111] Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0869]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.112] Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0971]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.122] Epoch 1: 0%| | 0/655 
[00:00<?, ?batch/s, train/train_loss_1=0.0951]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0938]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.102] Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.102]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.104]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.108]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0934]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0903]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0929]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0986]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0945]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0895]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.105] Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0922]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.106] Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.102]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.104]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0934]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0984]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0918]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0992]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.106] Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0987]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0983]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.102] Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0908]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0915]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0958]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0966]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.109] Epoch 1: 0%| | 
0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0968]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.121] Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.105]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0989]Epoch 1: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.118] Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.118]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.106]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.09] Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.087]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.106]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.0874]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.0946]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.0866]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.0989]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.0926]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.102] Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.0955]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.0993]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.0858]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.0952]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.0972]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.108] Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.101]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.0803]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.135] Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.0987]Epoch 
1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.109] Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.0943]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.105] Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.101]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.102]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.101]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.103]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.0973]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.103] Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.0947]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.0985]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.0848]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.0918]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.115] Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.113]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.104]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.107]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.0942]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.0931]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.102] Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.105]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.0798]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.0994]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.0987]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.101] Epoch 1: 
11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.0952]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.0949]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.104] Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.0972]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.0825]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.1] Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.0866]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.0979]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.101] Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.112]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.0917]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.0957]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.087] Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.108]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.0934]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.0746]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.108] Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.108]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.102]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.116]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.0972]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.108] Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.107]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.0838]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.106] Epoch 1: 11%|#1 
| 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.0902]Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.107] Epoch 1: 11%|#1 | 73/655 [00:00<00:00, 718.60batch/s, train/train_loss_1=0.101]Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.101]Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.0981]Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.105] Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.0924]Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.0991]Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.102] Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.103]Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.0983]Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.116] Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.107]Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.0873]Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.11] Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.102]Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.109]Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.0996]Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.104] Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.102]Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.0951]Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.11] Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.0971]Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.0933]Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, 
train/train_loss_1=0.095] Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.0953]Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.0903]Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.0875]Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.0983]Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.0938]Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.0898]Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.11] Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.105]Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.107]Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.091]Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.102]Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.111]Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.0955]Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.103] Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.102]Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.09] Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.0937]Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.101] Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.104]Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.0909]Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.112] Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.0981]Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.109] Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.109]Epoch 1: 
22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.1] Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.108]Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.0839]Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.0926]Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.0846]Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.103] Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.0994]Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.0949]Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.0994]Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.108] Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.0936]Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.0943]Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.0824]Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.1] Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.0886]Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.107] Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.0988]Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.0931]Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.109] Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.105]Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.108]Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.0961]Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.111] Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.0965]Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 
723.00batch/s, train/train_loss_1=0.106] Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.0934]Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.0938]Epoch 1: 22%|##2 | 146/655 [00:00<00:00, 723.00batch/s, train/train_loss_1=0.0945]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0945]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.096] Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.097]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.102]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0943]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.1] Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.119]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0899]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0831]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.106] Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.105]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0937]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0854]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0806]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0962]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0927]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.109] Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0925]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0823]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.107] Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, 
train/train_loss_1=0.0897]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.098] Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.108]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.099]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.102]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0929]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.102] Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.091]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.101]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.091]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0921]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0941]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.111] Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.101]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.109]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0834]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.099] Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0772]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.091] Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0932]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.114] Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.105]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.082]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.102]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, 
train/train_loss_1=0.0974]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.104] Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0954]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0901]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0926]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.093] Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.102]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.104]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.093]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0939]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0921]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.118] Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0866]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0879]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0893]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0941]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.108] Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0875]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.122] Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0785]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.09] Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.11]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0916]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.094] Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, 
train/train_loss_1=0.0971]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.109] Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0849]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0989]Epoch 1: 33%|###3 | 219/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0995]Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.0995]Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.109] Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.0916]Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.1] Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.102]Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.104]Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.11] Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.109]Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.0997]Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.0925]Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.098] Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.102]Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.0812]Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.0898]Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.105] Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.101]Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.0902]Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.0943]Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.0897]Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 
715.01batch/s, train/train_loss_1=0.0882]Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.0978]Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.109] Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.0965]Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.109] Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.0924]Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.125] Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.0965]Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.106] Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.0962]Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.097] Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.0984]Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.0985]Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.0986]Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.0987]Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.107] Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.102]Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.0993]Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.0917]Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.0937]Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.106] Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.103]Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.0857]Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.103] Epoch 1: 44%|####4 | 291/655 
[00:00<00:00, 715.01batch/s, train/train_loss_1=0.0963]Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.106] Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.108]Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.0853]Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.0783]Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.0979]Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.106] Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.1] Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.0949]Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.106] Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.0924]Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.0924]Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.0933]Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.0919]Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.104] Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.084]Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.104]Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.0953]Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.0806]Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.0948]Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.0865]Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.095] Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.0967]Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.102] Epoch 1: 
44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.0923]Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.0979]Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.107] Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.105]Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.0926]Epoch 1: 44%|####4 | 291/655 [00:00<00:00, 715.01batch/s, train/train_loss_1=0.0952]Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.0952]Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.111] Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.114]Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.0913]Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.102] Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.107]Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.0951]Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.102] Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.0834]Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.0916]Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.0808]Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.1] Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.0925]Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.122] Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.0945]Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.113] Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.115]Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, 
train/train_loss_1=0.107]Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.11] Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.102]Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.105]Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.101]Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.0936]Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.101] Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.0985]Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.0937]Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.0933]Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.103] Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.0944]Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.105] Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.0973]Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.104] Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.0967]Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.102] Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.092]Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.112]Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.0851]Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.0983]Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.095] Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.11] Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.113]Epoch 1: 55%|#####5 | 
363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.1] Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.105]Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.106]Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.106]Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.0882]Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.104] Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.0888]Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.0825]Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.1] Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.0963]Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.091] Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.0931]Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.0963]Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.0978]Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.102] Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.0962]Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.107] Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.0855]Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.0986]Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.0971]Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.106] Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.105]Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.112]Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, 
train/train_loss_1=0.117]Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.0948]Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.085] Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.0944]Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.104] Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.101]Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.108]Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.0864]Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.0992]Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.0967]Epoch 1: 55%|#####5 | 363/655 [00:00<00:00, 711.17batch/s, train/train_loss_1=0.0875]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0875]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0912]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.104] Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0981]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0849]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0989]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.103] Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.104]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.101]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0929]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.104] Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.101]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.104]Epoch 1: 
67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0868]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.1] Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0825]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.116] Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0925]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0874]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.092] Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.105]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0961]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0991]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.115] Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0886]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0986]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0992]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.106] Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.106]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0859]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0879]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0935]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.106] Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.095]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.09] Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.101]Epoch 1: 67%|######6 | 437/655 
[00:00<00:00, 719.35batch/s, train/train_loss_1=0.114]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0991]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0909]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0929]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0862]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.103] Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0933]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0995]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.112] Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0895]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0951]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.108] Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.08] Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.09]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0949]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.082] Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0925]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.107] Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0993]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0973]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0957]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0944]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.114] Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 
719.35batch/s, train/train_loss_1=0.0975]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.103] Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0962]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0982]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0973]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0841]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0998]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0995]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0969]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.108] Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.101]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0946]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0942]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0858]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0798]Epoch 1: 67%|######6 | 437/655 [00:00<00:00, 719.35batch/s, train/train_loss_1=0.0993]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.0993]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.0978]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.0966]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.101] Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.101]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.0948]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.0898]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, 
train/train_loss_1=0.0925]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.103] Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.0938]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.0919]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.101] Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.11] Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.107]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.0932]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.102] Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.089]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.0952]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.101] Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.1] Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.0865]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.0977]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.0932]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.101] Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.0874]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.096] Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.103]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.0966]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.0823]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.1] Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, 
train/train_loss_1=0.0988]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.0804]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.0977]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.127] Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.0934]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.105] Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.0949]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.0926]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.0899]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.106] Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.0915]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.0925]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.0808]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.102] Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.0949]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.0914]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.0968]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.0854]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.0878]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.0982]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.0994]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.0933]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.0938]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 
723.33batch/s, train/train_loss_1=0.101] Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.0849]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.0843]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.106] Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.107]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.0779]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.0954]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.103] Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.113]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.0937]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.102] Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.0926]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.09] Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.0976]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.0977]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.0864]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.0909]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.0811]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.117] Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.0922]Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.1] Epoch 1: 78%|#######8 | 511/655 [00:00<00:00, 723.33batch/s, train/train_loss_1=0.0837]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.0837]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 
727.77batch/s, train/train_loss_1=0.0882]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.079] Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.087]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.102]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.0975]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.0959]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.11] Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.101]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.0944]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.0982]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.0853]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.1] Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.0819]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.0835]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.105] Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.108]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.0965]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.108] Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.0993]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.0849]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.0964]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.0977]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.0967]Epoch 1: 89%|########9 | 
585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.0798]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.0859]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.102] Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.104]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.107]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.0788]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.0887]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.0936]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.0865]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.107] Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.0849]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.0989]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.0903]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.104] Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.106]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.0911]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.0833]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.104] Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.0977]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.0873]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.11] Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.0898]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, 
train/train_loss_1=0.0991]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.11] Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.0956]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.0944]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.0842]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.0926]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.0924]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.0902]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.0917]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.0813]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.0977]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.105] Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.103]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.099]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.0706]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.0852]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.112] Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.0914]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.094] Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.108]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.0907]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.0853]Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.0966]Epoch 1: 89%|########9 | 585/655 
[00:00<00:00, 727.77batch/s, train/train_loss_1=0.103] Epoch 1: 89%|########9 | 585/655 [00:00<00:00, 727.77batch/s, train/train_loss_1=0.0972] 0%| | 0/655 [00:00<?, ?batch/s]Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s]Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0996]Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0963]Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0903]Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0928]Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0907]Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0963]Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0777]Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.108] Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.108]Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.103]Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0899]Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0966]Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0785]Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.095] Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0929]Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0898]Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0959]Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0931]Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0965]Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0851]Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.097] Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0825]Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.104] Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0887]Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.102] Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.107]Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, 
train/train_loss_1=0.0945]Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0943]Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0889]Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0874]Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.1] Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0951]Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.101] Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0967]Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.102] Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0902]Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.108] Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.101]Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.101]Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0899]Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0894]Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0843]Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0973]Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.089] Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0973]Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0967]Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.103] Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0971]Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.094] Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0835]Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.103] Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0945]Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0878]Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0897]Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0848]Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0824]Epoch 2: 0%| | 0/655 [00:00<?, 
?batch/s, train/train_loss_1=0.0883]Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0912]Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0812]Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0969]Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0981]Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0883]Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.099] Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.11] Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0841]Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.101] Epoch 2: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.106]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.106]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.113]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.0949]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.101] Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.102]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.0887]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.103] Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.0944]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.0915]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.0964]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.108] Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.101]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.101]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.0864]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.101] Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.101]Epoch 2: 10%|# | 67/655 [00:00<00:00, 
668.55batch/s, train/train_loss_1=0.101]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.0967]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.106] Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.0907]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.0998]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.0974]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.0847]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.0948]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.092] Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.099]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.0884]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.0965]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.0881]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.0864]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.0936]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.0918]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.0838]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.0967]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.102] Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.108]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.0858]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.103] Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.0903]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.0928]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.0847]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, 
train/train_loss_1=0.0903]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.0943]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.0955]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.0947]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.0848]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.0896]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.0863]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.0795]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.0965]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.098] Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.0889]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.0982]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.0802]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.104] Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.102]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.0883]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.0895]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.0957]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.0976]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.0946]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.0716]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.0767]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.0887]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.0957]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.0861]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, 
train/train_loss_1=0.0933]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.101] Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.0884]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.105] Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.0909]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.0892]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.0988]Epoch 2: 10%|# | 67/655 [00:00<00:00, 668.55batch/s, train/train_loss_1=0.0899]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0899]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0959]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.096] Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0935]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0974]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.122] Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.11] Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0954]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.103] Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0846]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.104] Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0955]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0963]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.102] Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0984]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0918]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0915]Epoch 2: 21%|##1 | 140/655 
[00:00<00:00, 702.81batch/s, train/train_loss_1=0.0884]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0975]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0946]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0805]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.103] Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.087]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0879]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0881]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0983]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.1] Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0801]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0826]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0924]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.102] Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.102]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0948]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0978]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.102] Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0903]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0969]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0955]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0931]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0897]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0909]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, 
train/train_loss_1=0.0989]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0995]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0835]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0954]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0875]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0964]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.103] Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0882]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.082] Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.098]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0831]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0956]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0882]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.105] Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0963]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.103] Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0884]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0898]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0858]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0941]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0992]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0972]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0923]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0998]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0911]Epoch 
2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0769]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0901]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0812]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0791]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.1] Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0827]Epoch 2: 21%|##1 | 140/655 [00:00<00:00, 702.81batch/s, train/train_loss_1=0.0936]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.0936]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.0949]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.0926]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.0949]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.0913]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.102] Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.0904]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.089] Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.0877]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.0984]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.106] Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.0893]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.0796]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.0927]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.1] Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.0988]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.115] Epoch 2: 32%|###2 | 
212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.0832]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.0962]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.0934]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.0895]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.0905]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.0887]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.0868]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.1] Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.111]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.107]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.0946]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.102] Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.0952]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.103] Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.0932]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.0926]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.0863]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.0967]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.11] Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.094]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.094]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.0921]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.0844]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.091] Epoch 2: 32%|###2 | 212/655 
[00:00<00:00, 706.98batch/s, train/train_loss_1=0.102]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.0948]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.104] Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.11] Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.0974]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.11] Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.0952]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.0968]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.103] Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.102]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.109]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.109]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.0949]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.0952]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.0924]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.118] Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.0956]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.0958]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.104] Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.0815]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.104] Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.0843]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.1] Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.0986]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 
706.98batch/s, train/train_loss_1=0.0971]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.0994]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.0955]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.0802]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.0893]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.109] Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.105]Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.1] Epoch 2: 32%|###2 | 212/655 [00:00<00:00, 706.98batch/s, train/train_loss_1=0.107]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.107]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.0774]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.0997]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.0905]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.102] Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.089]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.0955]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.106] Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.0972]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.0922]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.104] Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.0971]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.101] Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.0779]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.104] Epoch 2: 44%|####3 | 285/655 
[00:00<00:00, 713.63batch/s, train/train_loss_1=0.109]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.105]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.0965]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.106] Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.094]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.0947]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.0879]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.098] Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.102]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.104]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.0998]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.0807]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.101] Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.104]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.0965]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.0846]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.0971]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.0768]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.0833]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.101] Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.0919]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.0969]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.0916]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.0961]Epoch 2: 44%|####3 
| 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.0947]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.0963]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.0898]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.0826]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.0933]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.0825]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.0961]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.111] Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.101]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.0958]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.0876]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.0835]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.0799]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.084] Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.0817]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.0858]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.0939]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.0845]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.107] Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.113]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.0985]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.114] Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.113]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.0917]Epoch 
2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.0908]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.101] Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.11] Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.075]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.11] Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.0996]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.0951]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.0978]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.103] Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.0876]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.0965]Epoch 2: 44%|####3 | 285/655 [00:00<00:00, 713.63batch/s, train/train_loss_1=0.0948]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.0948]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.091] Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.103]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.0875]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.0816]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.0969]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.0814]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.0915]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.0792]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.0978]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.0924]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, 
train/train_loss_1=0.0984]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.0927]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.1] Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.106]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.0859]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.0936]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.0835]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.093] Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.101]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.117]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.0906]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.0889]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.0984]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.0923]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.0913]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.106] Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.0816]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.0945]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.0873]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.0923]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.0909]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.0931]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.0958]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.0928]Epoch 2: 55%|#####4 | 
359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.0998]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.0941]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.101] Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.0993]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.0914]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.0853]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.0936]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.0972]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.109] Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.085]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.0912]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.0966]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.0977]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.102] Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.0939]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.0976]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.088] Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.0924]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.101] Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.0941]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.0958]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.0897]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.0852]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, 
train/train_loss_1=0.093] Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.0991]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.092] Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.0942]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.0943]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.0944]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.0933]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.108] Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.115]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.0986]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.103] Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.102]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.0747]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.0824]Epoch 2: 55%|#####4 | 359/655 [00:00<00:00, 719.48batch/s, train/train_loss_1=0.0914]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.0914]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.0846]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.0976]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.108] Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.0997]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.0809]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.0877]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.0877]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.096] Epoch 2: 
66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.0876]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.0882]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.116] Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.11] Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.0955]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.089] Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.0964]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.0816]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.0893]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.102] Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.0948]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.0951]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.078] Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.0975]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.0888]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.0897]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.0804]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.1] Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.112]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.105]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.0976]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.106] Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.0967]Epoch 2: 66%|######5 | 431/655 
[00:00<00:00, 716.22batch/s, train/train_loss_1=0.0936]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.0936]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.0841]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.0939]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.0987]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.0959]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.086] Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.0924]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.099] Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.0938]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.0843]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.0812]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.0791]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.0972]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.0947]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.117] Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.096]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.101]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.0897]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.0823]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.0949]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.0991]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.0938]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 
716.22batch/s, train/train_loss_1=0.094] Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.0944]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.0939]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.0957]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.0879]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.0948]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.0907]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.0935]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.0781]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.0944]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.103] Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.0954]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.094] Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.104]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.104]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.0845]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.0967]Epoch 2: 66%|######5 | 431/655 [00:00<00:00, 716.22batch/s, train/train_loss_1=0.102] Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.102]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.105]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.106]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.0847]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.0911]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, 
train/train_loss_1=0.101] Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.0766]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.0899]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.0981]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.0995]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.0877]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.083] Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.0834]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.0996]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.0993]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.0966]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.1] Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.0855]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.0908]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.0919]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.0938]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.111] Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.0922]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.0887]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.0973]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.101] Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.0976]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.0956]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, 
train/train_loss_1=0.0946]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.104] Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.0909]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.106] Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.103]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.106]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.103]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.0882]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.0977]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.077] Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.0862]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.091] Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.0936]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.109] Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.101]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.115]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.092]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.076]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.109]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.0803]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.1] Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.0967]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.0971]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, 
train/train_loss_1=0.0911]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.0936]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.0926]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.113] Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.0937]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.0846]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.0846]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.0831]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.0932]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.0948]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.0713]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.0957]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.106] Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.0991]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.102] Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.0968]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.0935]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.108] Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.116]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.0886]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.0924]Epoch 2: 77%|#######6 | 503/655 [00:00<00:00, 709.93batch/s, train/train_loss_1=0.0945]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.0945]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 
709.68batch/s, train/train_loss_1=0.0863]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.088] Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.109]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.0796]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.0921]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.0952]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.0877]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.097] Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.0842]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.109] Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.0703]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.0932]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.0856]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.0997]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.0876]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.0872]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.105] Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.0925]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.101] Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.0904]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.0996]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.0919]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.0878]Epoch 2: 88%|########7 
| 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.0803]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.0965]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.0866]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.0821]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.1] Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.0947]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.091] Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.105]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.0997]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.0764]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.0887]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.105] Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.0918]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.0897]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.087] Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.0888]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.107] Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.104]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.0905]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.0779]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.0839]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.0874]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, 
train/train_loss_1=0.0972]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.0985]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.093] Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.0935]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.106] Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.097]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.112]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.0955]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.102] Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.11] Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.0841]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.103] Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.0961]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.0804]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.107] Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.0903]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.0808]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.0901]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.109] Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.0951]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.0927]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.0948]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.0939]Epoch 2: 88%|########7 | 575/655 
[00:00<00:00, 709.68batch/s, train/train_loss_1=0.0816]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.0878]Epoch 2: 88%|########7 | 575/655 [00:00<00:00, 709.68batch/s, train/train_loss_1=0.0781]Epoch 2: 99%|#########8| 646/655 [00:00<00:00, 705.96batch/s, train/train_loss_1=0.0781]Epoch 2: 99%|#########8| 646/655 [00:00<00:00, 705.96batch/s, train/train_loss_1=0.0898]Epoch 2: 99%|#########8| 646/655 [00:00<00:00, 705.96batch/s, train/train_loss_1=0.0946]Epoch 2: 99%|#########8| 646/655 [00:00<00:00, 705.96batch/s, train/train_loss_1=0.0853]Epoch 2: 99%|#########8| 646/655 [00:00<00:00, 705.96batch/s, train/train_loss_1=0.0991]Epoch 2: 99%|#########8| 646/655 [00:00<00:00, 705.96batch/s, train/train_loss_1=0.0861]Epoch 2: 99%|#########8| 646/655 [00:00<00:00, 705.96batch/s, train/train_loss_1=0.0929]Epoch 2: 99%|#########8| 646/655 [00:00<00:00, 705.96batch/s, train/train_loss_1=0.0871]Epoch 2: 99%|#########8| 646/655 [00:00<00:00, 705.96batch/s, train/train_loss_1=0.0923]Epoch 2: 99%|#########8| 646/655 [00:00<00:00, 705.96batch/s, train/train_loss_1=0.0816] 0%| | 0/655 [00:00<?, ?batch/s]Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s]Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0884]Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0854]Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0753]Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.103] Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.101]Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0858]Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0829]Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.103] Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0985]Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.097] Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0874]Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.102] Epoch 3: 0%| 
| 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0836]Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0979]Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.095] Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0985]Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0944]Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.094] Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0842]Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0921]Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0977]Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0965]Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0884]Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0828]Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.09] Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0966]Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0877]Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0909]Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.104] Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0974]Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0791]Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0965]Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0984]Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0839]Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0956]Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.095] Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.111]Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0802]Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.102] Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.102]Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0935]Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0751]Epoch 
3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0997]Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0956]Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0868]Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.095] Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0902]Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0844]Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.103] Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0994]Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.1] Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0866]Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0925]Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0929]Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0871]Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0995]Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0873]Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0976]Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0792]Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0837]Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0818]Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.106] Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0884]Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0864]Epoch 3: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0921]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.0921]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.0914]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.0857]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.0923]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.0901]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, 
train/train_loss_1=0.0935]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.0847]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.0986]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.106] Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.0978]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.0893]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.0946]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.106] Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.0892]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.0972]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.0859]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.0863]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.0773]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.0937]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.0983]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.0927]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.0939]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.0925]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.079] Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.108]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.0986]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.0942]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.104] Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.109]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.102]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, 
train/train_loss_1=0.0928]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.083] Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.0798]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.0812]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.0921]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.095] Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.0851]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.0826]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.0907]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.116] Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.079]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.0877]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.105] Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.0909]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.0888]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.0934]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.098] Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.103]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.0827]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.0778]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.104] Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.0949]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.101] Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.0918]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.0914]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, 
train/train_loss_1=0.0947]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.088] Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.0812]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.0963]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.0854]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.0998]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.103] Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.0894]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.0821]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.0901]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.0922]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.0997]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.096] Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.0995]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.0985]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.0981]Epoch 3: 10%|9 | 65/655 [00:00<00:00, 648.56batch/s, train/train_loss_1=0.0986]Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.0986]Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.0924]Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.0936]Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.0882]Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.108] Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.104]Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.102]Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.086]Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, 
train/train_loss_1=0.0984]Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.094] Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.1] Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.0861]Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.0889]Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.0928]Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.0851]Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.099] Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.0927]Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.0821]Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.0859]Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.09] Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.0901]Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.102] Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.0924]Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.0914]Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.0858]Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.0946]Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.103] Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.0915]Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.0851]Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.107] Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.0911]Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.0997]Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.0813]Epoch 3: 21%|## | 136/655 
[00:00<00:00, 679.78batch/s, train/train_loss_1=0.0953]Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.0872]Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.0846]Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.104] Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.104]Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.0918]Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.0815]Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.0987]Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.0988]Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.0912]Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.0935]Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.0937]Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.1] Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.0818]Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.0872]Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.0882]Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.1] Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.0951]Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.0767]Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.0984]Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.0886]Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.0839]Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.102] Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.0858]Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.0802]Epoch 
3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.108] Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.102]Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.0849]Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.102] Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.0953]Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.0937]Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.0911]Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.0923]Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.0822]Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.083] Epoch 3: 21%|## | 136/655 [00:00<00:00, 679.78batch/s, train/train_loss_1=0.093]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.093]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0847]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0891]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0969]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0851]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0889]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0728]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0904]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0855]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0934]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0896]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0702]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0892]Epoch 3: 31%|###1 | 204/655 
[00:00<00:00, 620.28batch/s, train/train_loss_1=0.0883]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0862]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0913]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.102] Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.096]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0835]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0838]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.105] Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0814]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0849]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.103] Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0822]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0817]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0996]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0966]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0923]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0933]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0755]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.104] Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0851]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0976]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0851]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0748]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.103] Epoch 3: 31%|###1 | 204/655 
[00:00<00:00, 620.28batch/s, train/train_loss_1=0.0853]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0994]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.087] Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0849]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0773]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0704]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.11] Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0917]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0939]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0832]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0786]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0893]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0935]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0752]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0893]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0797]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.109] Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0884]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0904]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.104] Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.1] Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0988]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0816]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0967]Epoch 3: 31%|###1 | 204/655 
[00:00<00:00, 620.28batch/s, train/train_loss_1=0.104] Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0913]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0982]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0947]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0966]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.103] Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0889]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0874]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0832]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0858]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0819]Epoch 3: 31%|###1 | 204/655 [00:00<00:00, 620.28batch/s, train/train_loss_1=0.0861]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0861]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.103] Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0891]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0869]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0897]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0974]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0884]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0975]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0959]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0867]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0996]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.1] Epoch 3: 42%|####2 | 
276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0834]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0874]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.1] Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0856]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0694]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0859]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0963]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0924]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.116] Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0859]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0967]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0981]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.104] Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.104]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0837]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.104] Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0971]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.104] Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0872]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0886]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0911]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0933]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0931]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0991]Epoch 
3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.103] Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0813]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0885]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0892]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0876]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.101] Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0795]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0958]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0959]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0972]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.102] Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.082]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0914]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0889]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0993]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.103] Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0823]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0881]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.091] Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0866]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0849]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0945]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0807]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, 
train/train_loss_1=0.0938]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0952]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0823]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0789]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0984]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0903]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0835]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0916]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0837]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0897]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0986]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0835]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.113] Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0842]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0937]Epoch 3: 42%|####2 | 276/655 [00:00<00:00, 654.70batch/s, train/train_loss_1=0.0824]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0824]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0913]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0835]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0972]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0729]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0781]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.105] Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.113]Epoch 3: 53%|#####3 | 350/655 
[00:00<00:00, 682.83batch/s, train/train_loss_1=0.0985]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0776]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.111] Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0784]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0872]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0964]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0937]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0935]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.106] Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0942]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0947]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0903]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.11] Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0959]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0903]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0921]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.095] Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0937]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0921]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0862]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.091] Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0788]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0991]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, 
train/train_loss_1=0.0925]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0913]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0822]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0938]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.103] Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0984]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.103] Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.104]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0809]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0844]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0828]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.106] Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0914]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0916]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0745]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0837]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0797]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0802]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0758]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.102] Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.095]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0913]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0955]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.112] Epoch 3: 53%|#####3 
| 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0957]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0927]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.102] Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0877]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0961]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0821]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0994]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0931]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0955]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0894]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0844]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.104] Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0988]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0909]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.098] Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0779]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.0827]Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.092] Epoch 3: 53%|#####3 | 350/655 [00:00<00:00, 682.83batch/s, train/train_loss_1=0.102]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.102]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.092]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.0777]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.0767]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, 
train/train_loss_1=0.103] Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.0879]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.0916]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.0876]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.0907]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.109] Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.0827]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.0864]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.0859]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.0925]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.0869]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.099] Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.0904]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.0977]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.0884]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.0862]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.0947]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.089] Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.107]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.0947]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.109] Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.0973]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.0847]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, 
train/train_loss_1=0.0874]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.0903]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.0929]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.0913]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.093] Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.0903]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.104] Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.0927]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.0905]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.083] Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.11] Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.11]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.087]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.092]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.08] Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.1] Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.0964]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.105] Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.0845]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.0961]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.0926]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.0843]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.0983]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, 
train/train_loss_1=0.0858]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.0904]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.0934]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.114] Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.0813]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.0936]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.0936]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.0964]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.09] Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.0912]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.0997]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.1] Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.0921]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.0942]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.113] Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.0919]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.0877]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.0877]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.0814]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.0986]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.0885]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.0874]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, train/train_loss_1=0.0975]Epoch 3: 65%|######4 | 423/655 [00:00<00:00, 695.78batch/s, 
train/train_loss_1=0.0864]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0864]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0871]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0853]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0915]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0979]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.11] Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0935]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0965]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0838]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0952]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0974]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0731]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.108] Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0813]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0857]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0833]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0842]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0869]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.107] Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0884]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0919]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0782]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, 
train/train_loss_1=0.0847]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0931]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.104] Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0904]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0913]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0986]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0957]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.104] Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0829]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0878]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.105] Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0901]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0992]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0959]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0937]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0964]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0943]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0953]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0906]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0923]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0899]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0894]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.103] Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 
704.31batch/s, train/train_loss_1=0.0935]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0885]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0867]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.103] Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0928]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.069] Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0983]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0904]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0769]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.106] Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0981]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0934]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.101] Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0849]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.102] Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0806]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0772]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.095] Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0848]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0966]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0992]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.101] Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0815]Epoch 3: 76%|#######5 | 496/655 
[00:00<00:00, 704.31batch/s, train/train_loss_1=0.0821]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.102] Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.108]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0837]Epoch 3: 76%|#######5 | 496/655 [00:00<00:00, 704.31batch/s, train/train_loss_1=0.0831]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.0831]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.093] Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.0817]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.0985]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.0946]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.0929]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.11] Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.0837]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.085] Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.0908]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.0918]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.0796]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.0986]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.0874]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.0985]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.0933]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.0812]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.0907]Epoch 3: 
87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.1] Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.0762]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.094] Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.0921]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.0792]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.111] Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.0834]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.0911]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.0963]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.0989]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.0898]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.0734]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.0972]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.101] Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.0926]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.0845]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.101] Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.0743]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.0852]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.115] Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.102]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.0804]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, 
train/train_loss_1=0.0885]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.0861]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.0797]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.0845]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.0858]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.0842]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.0901]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.0941]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.0973]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.0942]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.0915]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.0873]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.103] Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.104]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.0797]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.0951]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.102] Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.0802]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.0863]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.0882]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.093] Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.0816]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.09] Epoch 3: 87%|########6 | 568/655 
[00:00<00:00, 708.91batch/s, train/train_loss_1=0.0879]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.0829]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.086] Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.1] Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.0955]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.08] Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.0827]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.0826]Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.12] Epoch 3: 87%|########6 | 568/655 [00:00<00:00, 708.91batch/s, train/train_loss_1=0.0954]Epoch 3: 98%|#########7| 640/655 [00:00<00:00, 680.83batch/s, train/train_loss_1=0.0954]Epoch 3: 98%|#########7| 640/655 [00:00<00:00, 680.83batch/s, train/train_loss_1=0.0899]Epoch 3: 98%|#########7| 640/655 [00:00<00:00, 680.83batch/s, train/train_loss_1=0.1] Epoch 3: 98%|#########7| 640/655 [00:00<00:00, 680.83batch/s, train/train_loss_1=0.0933]Epoch 3: 98%|#########7| 640/655 [00:00<00:00, 680.83batch/s, train/train_loss_1=0.0784]Epoch 3: 98%|#########7| 640/655 [00:00<00:00, 680.83batch/s, train/train_loss_1=0.0814]Epoch 3: 98%|#########7| 640/655 [00:00<00:00, 680.83batch/s, train/train_loss_1=0.102] Epoch 3: 98%|#########7| 640/655 [00:00<00:00, 680.83batch/s, train/train_loss_1=0.0899]Epoch 3: 98%|#########7| 640/655 [00:00<00:00, 680.83batch/s, train/train_loss_1=0.0883]Epoch 3: 98%|#########7| 640/655 [00:00<00:00, 680.83batch/s, train/train_loss_1=0.0958]Epoch 3: 98%|#########7| 640/655 [00:00<00:00, 680.83batch/s, train/train_loss_1=0.0797]Epoch 3: 98%|#########7| 640/655 [00:00<00:00, 680.83batch/s, train/train_loss_1=0.0972]Epoch 3: 98%|#########7| 640/655 [00:00<00:00, 680.83batch/s, train/train_loss_1=0.0896]Epoch 3: 
98%|#########7| 640/655 [00:00<00:00, 680.83batch/s, train/train_loss_1=0.0919]Epoch 3: 98%|#########7| 640/655 [00:00<00:00, 680.83batch/s, train/train_loss_1=0.0955]Epoch 3: 98%|#########7| 640/655 [00:00<00:00, 680.83batch/s, train/train_loss_1=0.0826] 0%| | 0/655 [00:00<?, ?batch/s]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.102]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.1] Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.076]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0947]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0932]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0905]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0896]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0911]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0999]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0974]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0826]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0748]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0789]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0959]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0982]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0971]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0892]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.1] Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0928]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0885]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.071] Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.088]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0889]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0903]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, 
train/train_loss_1=0.0802]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.108] Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0985]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0863]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.094] Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0862]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0803]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0924]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0889]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0808]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.101] Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0879]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0835]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0916]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0962]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.105] Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0801]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0903]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0892]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0809]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0923]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0968]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0895]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0803]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.114] Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0859]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0805]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0831]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0963]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.107] Epoch 4: 0%| | 0/655 [00:00<?, 
?batch/s, train/train_loss_1=0.0883]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.079] Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0716]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0816]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0866]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0976]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0842]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0923]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0989]Epoch 4: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0859]Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.0859]Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.0848]Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.101] Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.0998]Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.0964]Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.0804]Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.103] Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.0835]Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.0858]Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.0953]Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.0847]Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.0937]Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.0987]Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.0908]Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.0891]Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.099] Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.0873]Epoch 4: 10%|9 | 
64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.0897]Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.0928]Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.0876]Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.0852]Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.0935]Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.0887]Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.0918]Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.0837]Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.0813]Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.0967]Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.0863]Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.103] Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.1] Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.0813]Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.097] Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.0916]Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.0837]Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.103] Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.0928]Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.089] Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.1] Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.0892]Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.0826]Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.0854]Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.0886]Epoch 4: 10%|9 | 64/655 [00:00<00:00, 
639.03batch/s, train/train_loss_1=0.0826]Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.0879]Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.0879]Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.0772]Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.092] Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.0925]Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.0832]Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.1] Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.0972]Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.0877]Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.0963]Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.107] Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.0825]Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.0926]Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.0883]Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.073] Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.0936]Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.0835]Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.0867]Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.088] Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.0927]Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.118] Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.11] Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.0907]Epoch 4: 10%|9 | 64/655 [00:00<00:00, 639.03batch/s, train/train_loss_1=0.0848]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, 
train/train_loss_1=0.0848]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.085] Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0948]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0795]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0868]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.1] Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0809]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.095] Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0929]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.09] Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0894]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0899]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0983]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0981]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0922]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0868]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0888]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0783]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0862]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0767]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0931]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0857]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0906]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0897]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0808]Epoch 4: 20%|#9 | 130/655 
[00:00<00:00, 647.58batch/s, train/train_loss_1=0.102] Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0916]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0986]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.084] Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.093]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0959]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0636]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0729]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.09] Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0989]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0851]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0967]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0908]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.09] Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0711]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0896]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0987]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.105] Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.107]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0909]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0883]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0792]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0923]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0911]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, 
train/train_loss_1=0.0897]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0852]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0772]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0934]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0919]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.104] Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.102]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.1] Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0894]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0907]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0833]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0936]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0872]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.089] Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.104]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0816]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0771]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0903]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0885]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0909]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0888]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0821]Epoch 4: 20%|#9 | 130/655 [00:00<00:00, 647.58batch/s, train/train_loss_1=0.0797]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0797]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.093] Epoch 4: 31%|### | 201/655 
[00:00<00:00, 674.40batch/s, train/train_loss_1=0.0862]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0777]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0943]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0877]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0945]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0958]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0862]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0877]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0962]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0781]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0797]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0914]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0805]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.103] Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0792]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0826]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0743]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0829]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0953]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0921]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0971]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0801]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.089] Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0832]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, 
train/train_loss_1=0.0816]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0966]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0936]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0943]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.105] Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0845]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0784]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.081] Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0977]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0952]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0888]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0821]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.102] Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0887]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0942]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0921]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0647]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0868]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0982]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0957]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0969]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0937]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0896]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0939]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0887]Epoch 
4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.103] Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0865]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0792]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0941]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0872]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0855]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0994]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0863]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0899]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0899]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0967]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0903]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0823]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.09] Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0934]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.102] Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0806]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0974]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0952]Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.076] Epoch 4: 31%|### | 201/655 [00:00<00:00, 674.40batch/s, train/train_loss_1=0.0839]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0839]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0805]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0958]Epoch 4: 42%|####1 | 272/655 
[00:00<00:00, 685.87batch/s, train/train_loss_1=0.0923]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0761]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0848]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0888]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0967]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0938]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.101] Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.105]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0809]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0936]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0796]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.079] Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0904]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0928]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.108] Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0972]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0935]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0933]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0895]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0943]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0815]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0836]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.085] Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.103]Epoch 4: 
42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0907]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0857]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0797]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0871]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0865]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0885]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0921]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.098] Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0967]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0875]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.101] Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0828]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0983]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0875]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0907]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0774]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0968]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0833]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0829]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0879]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0821]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.117] Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0962]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, 
train/train_loss_1=0.0897]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0875]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0975]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0772]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0832]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0909]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.105] Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.104]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0819]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.105] Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0979]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0911]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0835]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.103] Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0985]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.093] Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0903]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0903]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0924]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0882]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.1] Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0877]Epoch 4: 42%|####1 | 272/655 [00:00<00:00, 685.87batch/s, train/train_loss_1=0.0985]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0985]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 
695.11batch/s, train/train_loss_1=0.0913]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0874]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0878]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0992]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.105] Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0904]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0861]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0997]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0778]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0793]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0827]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0986]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0994]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.1] Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.083]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.104]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0822]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.086] Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0929]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0875]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.101] Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0909]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0903]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0905]Epoch 
4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0982]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0945]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0903]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0759]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0843]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0887]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0799]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0894]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.105] Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.1] Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0907]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0994]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.101] Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.095]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0877]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.103] Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0998]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0905]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.08] Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0825]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0865]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0942]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0873]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 
695.11batch/s, train/train_loss_1=0.0937]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.101] Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.1] Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0978]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0997]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0941]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.08] Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0927]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0944]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0911]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0838]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.084] Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0845]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0881]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0738]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0854]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0891]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0935]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0949]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0923]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0833]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0845]Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.108] Epoch 4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.0741]Epoch 
4: 53%|#####2 | 344/655 [00:00<00:00, 695.11batch/s, train/train_loss_1=0.104] Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.104]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0834]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0954]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0982]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0915]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0856]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0968]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0986]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0885]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0991]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0927]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0862]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0916]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.079] Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.089]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0812]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0854]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0974]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0815]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0893]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0896]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0962]Epoch 4: 64%|######3 | 416/655 
[00:00<00:00, 701.11batch/s, train/train_loss_1=0.0866]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0829]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0831]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0971]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.081] Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0762]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0947]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0865]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0964]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0977]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0947]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0828]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0873]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.108] Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0873]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0982]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0957]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0805]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0871]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.086] Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0866]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0842]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0909]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 
701.11batch/s, train/train_loss_1=0.0933]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0929]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0853]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.104] Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0773]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0798]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0893]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0825]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0836]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0969]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0912]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0768]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.101] Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.089]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0842]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.109] Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0868]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0895]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0892]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0901]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0914]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0904]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0769]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, 
train/train_loss_1=0.0869]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0958]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0782]Epoch 4: 64%|######3 | 416/655 [00:00<00:00, 701.11batch/s, train/train_loss_1=0.0857]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0857]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0853]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0853]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0895]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0909]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0802]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0954]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0822]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0861]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0901]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0824]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0914]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.108] Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0811]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0946]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0925]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.105] Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.076]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0829]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, 
train/train_loss_1=0.105] Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0869]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.097] Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0932]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0837]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.104] Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0883]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0944]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0948]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0898]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0823]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0857]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0841]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0801]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0906]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0962]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0939]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0868]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0822]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0849]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.095] Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0826]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0866]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 
673.90batch/s, train/train_loss_1=0.086] Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.079]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0935]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.085] Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0809]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0871]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0872]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0837]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0888]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0919]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0826]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0911]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0878]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0801]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0897]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0838]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0865]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0842]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.101] Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0881]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0833]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0784]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0827]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 
673.90batch/s, train/train_loss_1=0.0909]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0979]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0905]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0874]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.092] Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0876]Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.111] Epoch 4: 74%|#######4 | 487/655 [00:00<00:00, 673.90batch/s, train/train_loss_1=0.0852]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0852]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0731]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0799]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0894]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.102] Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.081]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.111]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0908]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.079] Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0974]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0902]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0877]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0826]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0944]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0843]Epoch 4: 85%|########5 | 
559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0962]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0736]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.111] Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0938]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.096] Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0806]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0689]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.092] Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0751]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0909]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0906]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.11] Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.08]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0928]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0869]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0895]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0769]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0877]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0916]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0847]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0871]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0886]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, 
train/train_loss_1=0.0922]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0968]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0937]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0885]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.095] Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0933]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.086] Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0843]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0931]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0952]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0828]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.104] Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.105]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.089]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.093]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0806]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0797]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0775]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0647]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0823]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0905]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0914]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0789]Epoch 4: 85%|########5 | 559/655 
[00:00<00:00, 685.42batch/s, train/train_loss_1=0.0909]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0847]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0861]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0972]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0891]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.104] Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0961]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0819]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0808]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0873]Epoch 4: 85%|########5 | 559/655 [00:00<00:00, 685.42batch/s, train/train_loss_1=0.0879]Epoch 4: 96%|#########6| 629/655 [00:00<00:00, 688.98batch/s, train/train_loss_1=0.0879]Epoch 4: 96%|#########6| 629/655 [00:00<00:00, 688.98batch/s, train/train_loss_1=0.105] Epoch 4: 96%|#########6| 629/655 [00:00<00:00, 688.98batch/s, train/train_loss_1=0.0797]Epoch 4: 96%|#########6| 629/655 [00:00<00:00, 688.98batch/s, train/train_loss_1=0.0981]Epoch 4: 96%|#########6| 629/655 [00:00<00:00, 688.98batch/s, train/train_loss_1=0.112] Epoch 4: 96%|#########6| 629/655 [00:00<00:00, 688.98batch/s, train/train_loss_1=0.0946]Epoch 4: 96%|#########6| 629/655 [00:00<00:00, 688.98batch/s, train/train_loss_1=0.104] Epoch 4: 96%|#########6| 629/655 [00:00<00:00, 688.98batch/s, train/train_loss_1=0.105]Epoch 4: 96%|#########6| 629/655 [00:00<00:00, 688.98batch/s, train/train_loss_1=0.0953]Epoch 4: 96%|#########6| 629/655 [00:00<00:00, 688.98batch/s, train/train_loss_1=0.0856]Epoch 4: 96%|#########6| 629/655 [00:00<00:00, 688.98batch/s, train/train_loss_1=0.102] Epoch 4: 96%|#########6| 629/655 [00:00<00:00, 688.98batch/s, train/train_loss_1=0.109]Epoch 4: 
96%|#########6| 629/655 [00:00<00:00, 688.98batch/s, train/train_loss_1=0.0833]Epoch 4: 96%|#########6| 629/655 [00:00<00:00, 688.98batch/s, train/train_loss_1=0.0974]Epoch 4: 96%|#########6| 629/655 [00:00<00:00, 688.98batch/s, train/train_loss_1=0.0756]Epoch 4: 96%|#########6| 629/655 [00:00<00:00, 688.98batch/s, train/train_loss_1=0.0906]Epoch 4: 96%|#########6| 629/655 [00:00<00:00, 688.98batch/s, train/train_loss_1=0.0967]Epoch 4: 96%|#########6| 629/655 [00:00<00:00, 688.98batch/s, train/train_loss_1=0.0982]Epoch 4: 96%|#########6| 629/655 [00:00<00:00, 688.98batch/s, train/train_loss_1=0.0795]Epoch 4: 96%|#########6| 629/655 [00:00<00:00, 688.98batch/s, train/train_loss_1=0.0866]Epoch 4: 96%|#########6| 629/655 [00:00<00:00, 688.98batch/s, train/train_loss_1=0.0806]Epoch 4: 96%|#########6| 629/655 [00:00<00:00, 688.98batch/s, train/train_loss_1=0.0806]Epoch 4: 96%|#########6| 629/655 [00:00<00:00, 688.98batch/s, train/train_loss_1=0.081] Epoch 4: 96%|#########6| 629/655 [00:00<00:00, 688.98batch/s, train/train_loss_1=0.0852]Epoch 4: 96%|#########6| 629/655 [00:00<00:00, 688.98batch/s, train/train_loss_1=0.0813]Epoch 4: 96%|#########6| 629/655 [00:00<00:00, 688.98batch/s, train/train_loss_1=0.0952]Epoch 4: 96%|#########6| 629/655 [00:00<00:00, 688.98batch/s, train/train_loss_1=0.0791] 0%| | 0/655 [00:00<?, ?batch/s]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0923]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0818]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0867]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.102] Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0867]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0832]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0991]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.101] Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0834]Epoch 5: 0%| 
| 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.104] Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.104]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0951]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0976]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0829]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.112] Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0873]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0815]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0981]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0866]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0736]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0956]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0942]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0968]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0908]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0915]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0898]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0971]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0855]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0938]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0942]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0837]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0828]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0873]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0736]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0889]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0908]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0888]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.102] Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, 
train/train_loss_1=0.0836]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0862]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.094] Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0999]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.085] Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0883]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0945]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0764]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0946]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0974]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0845]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.103] Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0798]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0888]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0822]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0775]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0899]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0938]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0748]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0874]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0903]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0949]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0817]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0931]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0816]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0969]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0828]Epoch 5: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.104] Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.104]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.082]Epoch 5: 
10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0855]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.102] Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0787]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.085] Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0764]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.113] Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0982]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0803]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0815]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.086] Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0952]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0924]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0786]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0958]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.107] Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0904]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.09] Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0874]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0986]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0879]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.091] Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0867]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0898]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0909]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0909]Epoch 5: 10%|# | 66/655 
[00:00<00:00, 656.57batch/s, train/train_loss_1=0.082] Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0966]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0888]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0897]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0827]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0742]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0908]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0894]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0944]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0882]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0832]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0947]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0911]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0965]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.075] Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0876]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.089] Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0802]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0826]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0998]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0884]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0922]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.103] Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0861]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0998]Epoch 5: 10%|# | 66/655 [00:00<00:00, 
656.57batch/s, train/train_loss_1=0.0819]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.088] Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0788]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0845]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0986]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0808]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0927]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0875]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0888]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0904]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0854]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.089] Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.101]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0834]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0928]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0893]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0899]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0954]Epoch 5: 10%|# | 66/655 [00:00<00:00, 656.57batch/s, train/train_loss_1=0.0754]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0754]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0993]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0886]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0788]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.111] Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0968]Epoch 5: 21%|## | 136/655 [00:00<00:00, 
676.79batch/s, train/train_loss_1=0.105] Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0971]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0792]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0876]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0973]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.107] Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.072]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0962]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0867]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0747]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0919]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0921]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.081] Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0887]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.075] Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0876]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0942]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0929]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0778]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0739]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.102] Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0887]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0762]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0937]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0919]Epoch 5: 21%|## 
| 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0992]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.08] Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0819]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0918]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0684]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0909]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0804]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0734]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0792]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0928]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0798]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0915]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0901]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0907]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0729]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0974]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0984]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0713]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.088] Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0836]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0786]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0868]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.1] Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.103]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, 
train/train_loss_1=0.0891]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0946]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.108] Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.081]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0786]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0935]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0806]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.089] Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.103]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0983]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0802]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0821]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0849]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0883]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0918]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0811]Epoch 5: 21%|## | 136/655 [00:00<00:00, 676.79batch/s, train/train_loss_1=0.0825]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0825]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.082] Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0943]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0712]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.103] Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0913]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.101] Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0856]Epoch 5: 
32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0952]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0657]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0871]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0909]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0854]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.091] Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.109]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0856]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0797]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0893]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0927]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0915]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0932]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0885]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0898]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0861]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0799]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0875]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.088] Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0855]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0812]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0787]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0884]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0893]Epoch 5: 32%|###1 
| 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0907]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0846]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0932]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0894]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0924]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0815]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.068] Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0778]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0796]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0739]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0905]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0832]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0988]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.102] Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0947]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0775]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0916]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0933]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0931]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0924]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.101] Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.11] Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.088]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.098]Epoch 5: 32%|###1 | 207/655 
[00:00<00:00, 688.02batch/s, train/train_loss_1=0.0806]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.106] Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0896]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0839]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0676]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0921]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0855]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.095] Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0887]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0825]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0935]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.104] Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0834]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0736]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0992]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0924]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0959]Epoch 5: 32%|###1 | 207/655 [00:00<00:00, 688.02batch/s, train/train_loss_1=0.0911]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0911]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0797]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.104] Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0778]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0845]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0915]Epoch 5: 43%|####2 | 280/655 
[00:00<00:00, 701.22batch/s, train/train_loss_1=0.0936]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0845]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0807]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0848]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0907]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0847]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0926]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0737]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0886]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0842]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0976]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0846]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0898]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.101] Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0949]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0726]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0928]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0945]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0689]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0806]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0939]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0907]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0871]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0945]Epoch 5: 
43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0684]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0881]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0926]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0914]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0985]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0868]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0916]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0847]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0785]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0905]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0878]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0862]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0842]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0756]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0699]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0879]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0801]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0925]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0782]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0689]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0973]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.1] Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0907]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, 
train/train_loss_1=0.0931]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0821]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0721]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0862]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0784]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0765]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0861]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0768]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0835]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0883]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0891]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.107] Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0753]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0694]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.101] Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0921]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0801]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0949]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0854]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.0966]Epoch 5: 43%|####2 | 280/655 [00:00<00:00, 701.22batch/s, train/train_loss_1=0.102] Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.102]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0921]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0794]Epoch 5: 54%|#####3 | 353/655 
[00:00<00:00, 711.00batch/s, train/train_loss_1=0.0893]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0853]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0934]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.101] Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0805]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0827]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0809]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.101] Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0775]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0914]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.083] Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0908]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0795]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0745]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0802]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0857]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.104] Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0887]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0798]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0874]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0869]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0905]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0843]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, 
train/train_loss_1=0.0805]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0777]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0882]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0975]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0976]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0978]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0898]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.094] Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0939]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0858]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0883]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0808]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0863]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0967]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0862]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0878]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0884]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0923]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0975]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0914]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0838]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.094] Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0871]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0939]Epoch 5: 
54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0955]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.093] Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0921]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0653]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0845]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0937]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0917]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0865]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0726]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0855]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0969]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0852]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0748]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0882]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0885]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0897]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0985]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0885]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0917]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.106] Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.107]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0904]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 711.00batch/s, train/train_loss_1=0.0938]Epoch 5: 54%|#####3 | 353/655 [00:00<00:00, 
711.00batch/s, train/train_loss_1=0.0745]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0745]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0915]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0998]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0947]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0788]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0814]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.109] Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0865]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0712]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0834]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0899]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0784]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0879]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0982]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0632]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0867]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.117] Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0808]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0913]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0892]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0958]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0833]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, 
train/train_loss_1=0.0823]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.1] Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0932]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.101] Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0765]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0867]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0997]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.071] Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0823]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0779]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0916]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.094] Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.094]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0851]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.094] Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0941]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0724]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.08] Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0885]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0912]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0971]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.067] Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0949]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, 
train/train_loss_1=0.0852]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0985]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0864]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0933]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0835]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0929]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0996]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0903]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0877]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.087] Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.101]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0942]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0952]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.092] Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0975]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0938]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0794]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0998]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0907]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0811]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.113] Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.101]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0838]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, 
train/train_loss_1=0.0832]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0945]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.085] Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0787]Epoch 5: 65%|######5 | 426/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0961]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0961]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0864]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.111] Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0858]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0992]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.106] Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.09] Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0904]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0844]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0857]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0961]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0802]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0758]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0794]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0874]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.078] Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0939]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0932]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, 
train/train_loss_1=0.0892]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.085] Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0793]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0943]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0788]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0797]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0937]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0818]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0936]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.106] Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0953]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0824]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0757]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0931]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0706]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0954]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0938]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0785]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0843]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0919]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0857]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0854]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0967]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 
715.99batch/s, train/train_loss_1=0.0914]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0874]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0953]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0869]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0927]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0829]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0871]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0949]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0814]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0949]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.102] Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0969]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.092] Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0869]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0838]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0818]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0939]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.102] Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0877]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0898]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0866]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0944]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0784]Epoch 5: 76%|#######6 | 498/655 
[00:00<00:00, 715.99batch/s, train/train_loss_1=0.088] Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0888]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0861]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0825]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0834]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0824]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0829]Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.107] Epoch 5: 76%|#######6 | 498/655 [00:00<00:00, 715.99batch/s, train/train_loss_1=0.0856]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0856]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0885]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0948]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.109] Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0926]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.077] Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.109]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0822]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0814]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0889]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0864]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0843]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0859]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.102] Epoch 5: 
87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.1] Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0882]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.087] Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0823]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.113] Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0914]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0741]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0823]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.084] Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.101]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0924]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0875]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0857]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0959]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0692]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0909]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.102] Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0812]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0909]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0948]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0866]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0784]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, 
train/train_loss_1=0.0996]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.07] Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.1] Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0918]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0909]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0695]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.106] Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0911]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0955]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0871]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0871]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0853]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.101] Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0961]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0915]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0836]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0942]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0869]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0878]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.104] Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0866]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0836]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.103] Epoch 5: 87%|########7 | 570/655 
[00:00<00:00, 716.96batch/s, train/train_loss_1=0.0912]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0849]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.08] Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0805]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0985]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0793]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0684]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0764]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0847]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0971]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0849]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0979]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0843]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0877]Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.111] Epoch 5: 87%|########7 | 570/655 [00:00<00:00, 716.96batch/s, train/train_loss_1=0.0691]Epoch 5: 98%|#########8| 644/655 [00:00<00:00, 721.83batch/s, train/train_loss_1=0.0691]Epoch 5: 98%|#########8| 644/655 [00:00<00:00, 721.83batch/s, train/train_loss_1=0.0733]Epoch 5: 98%|#########8| 644/655 [00:00<00:00, 721.83batch/s, train/train_loss_1=0.103] Epoch 5: 98%|#########8| 644/655 [00:00<00:00, 721.83batch/s, train/train_loss_1=0.0852]Epoch 5: 98%|#########8| 644/655 [00:00<00:00, 721.83batch/s, train/train_loss_1=0.082] Epoch 5: 98%|#########8| 644/655 [00:00<00:00, 721.83batch/s, train/train_loss_1=0.0947]Epoch 5: 98%|#########8| 644/655 [00:00<00:00, 721.83batch/s, train/train_loss_1=0.0889]Epoch 5: 
98%|#########8| 644/655 [00:00<00:00, 721.83batch/s, train/train_loss_1=0.0783]Epoch 5: 98%|#########8| 644/655 [00:00<00:00, 721.83batch/s, train/train_loss_1=0.0759]Epoch 5: 98%|#########8| 644/655 [00:00<00:00, 721.83batch/s, train/train_loss_1=0.0898]Epoch 5: 98%|#########8| 644/655 [00:00<00:00, 721.83batch/s, train/train_loss_1=0.0845]Epoch 5: 98%|#########8| 644/655 [00:00<00:00, 721.83batch/s, train/train_loss_1=0.0937] 0%| | 0/655 [00:00<?, ?batch/s]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0769]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0953]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0842]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0854]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0994]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0976]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.084] Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0833]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0817]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0949]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0825]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0894]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0795]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0834]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0848]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0945]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0917]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0952]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.089] Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0707]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0942]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0781]Epoch 6: 0%| | 0/655 
[00:00<?, ?batch/s, train/train_loss_1=0.0977]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0925]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0798]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0704]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0719]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.105] Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0863]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0806]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0894]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0891]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0945]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0794]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0873]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0925]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0975]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.069] Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0873]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.082] Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.086]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0892]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0849]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.077] Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0892]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0914]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0864]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0963]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0857]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0849]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0952]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0809]Epoch 6: 
0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0801]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0786]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0912]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0885]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0877]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0787]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.103] Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0762]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0965]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0863]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0874]Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.094] Epoch 6: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.083]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.083]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0904]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0793]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0813]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0869]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0883]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0877]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0735]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.102] Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0796]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0907]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0835]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0978]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0847]Epoch 6: 10%|9 | 65/655 
[00:00<00:00, 645.78batch/s, train/train_loss_1=0.0868]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0948]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0848]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0911]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0883]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0885]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0943]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0895]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0815]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0918]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0922]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0798]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0811]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0781]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0823]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0655]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0963]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0833]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0904]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0936]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.105] Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0807]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0854]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0851]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0906]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 
645.78batch/s, train/train_loss_1=0.0939]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0767]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0832]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0848]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0789]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0843]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0852]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.084] Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0945]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.081] Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0705]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0841]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0896]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.084] Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0886]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0932]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0866]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0984]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.1] Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0909]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0909]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.109] Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0779]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.101] Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0917]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, 
train/train_loss_1=0.0832]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0945]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0998]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0936]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0886]Epoch 6: 10%|9 | 65/655 [00:00<00:00, 645.78batch/s, train/train_loss_1=0.0878]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0878]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0873]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0854]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0834]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0804]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0871]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0872]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.101] Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.117]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.085]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0842]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0761]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0934]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0823]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.087] Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0892]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0907]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.103] Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0906]Epoch 6: 20%|## | 134/655 [00:00<00:00, 
671.22batch/s, train/train_loss_1=0.0769]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0945]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0886]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0862]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0743]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0914]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0806]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0848]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0918]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0986]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.095] Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0736]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0819]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0882]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0853]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0828]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.106] Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0984]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0807]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0928]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0961]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0867]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.087] Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0929]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0899]Epoch 6: 
20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0818]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0993]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0824]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.105] Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0781]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0942]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.086] Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0759]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0848]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0923]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0799]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0771]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0814]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0881]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0909]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0875]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.118] Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0682]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0872]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0989]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0748]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0876]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0887]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.11] Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, 
train/train_loss_1=0.0852]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0935]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0671]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0787]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0701]Epoch 6: 20%|## | 134/655 [00:00<00:00, 671.22batch/s, train/train_loss_1=0.0903]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0903]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.086] Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0941]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0833]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0931]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0752]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0806]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0892]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0848]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0847]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0776]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0833]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.077] Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0846]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0737]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0831]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0934]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.103] Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, 
train/train_loss_1=0.0969]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0984]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.084] Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0841]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.097] Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0789]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0845]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0847]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.102] Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0946]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0865]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0872]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0887]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0877]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0777]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0884]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0957]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0779]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0873]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0877]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0887]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0874]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0801]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0932]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, 
train/train_loss_1=0.0791]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0796]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0775]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0849]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0917]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0847]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0852]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0858]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0938]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0651]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0826]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0978]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0996]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.094] Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0881]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0879]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0846]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0702]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0767]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0762]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0632]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0915]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0852]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.108] Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, 
train/train_loss_1=0.0941]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0938]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0864]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0937]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.102] Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0908]Epoch 6: 32%|###1 | 207/655 [00:00<00:00, 697.47batch/s, train/train_loss_1=0.0831]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0831]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0751]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0876]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0885]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0976]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0885]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0953]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0927]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0861]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0966]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0916]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0872]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0743]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0953]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0884]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0808]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.083] Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 
704.85batch/s, train/train_loss_1=0.0898]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0882]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0818]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0931]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0962]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0884]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.095] Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0689]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0833]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0946]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0961]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0829]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0769]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.088] Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0768]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0851]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0787]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0811]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.08] Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0824]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.104] Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.102]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0825]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0737]Epoch 6: 43%|####2 | 279/655 
[00:00<00:00, 704.85batch/s, train/train_loss_1=0.0783]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.096] Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0724]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0921]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0957]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0884]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0812]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0749]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0793]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.101] Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0905]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0744]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0876]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0995]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0772]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.101] Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0786]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0814]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.089] Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0806]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0751]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0962]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0996]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0975]Epoch 6: 
43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0769]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0923]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.104] Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0913]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0877]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0963]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0958]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0711]Epoch 6: 43%|####2 | 279/655 [00:00<00:00, 704.85batch/s, train/train_loss_1=0.0736]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0736]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.1] Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0822]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0772]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0865]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0875]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0975]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0848]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0852]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.074] Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.09] Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0871]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0977]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0854]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, 
train/train_loss_1=0.086] Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.087]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0949]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0928]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0901]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0783]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0838]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.097] Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0821]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0948]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0928]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0904]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0938]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0751]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0798]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0858]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0972]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.08] Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0852]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0841]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0895]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0887]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0905]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0894]Epoch 6: 54%|#####3 
| 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0842]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0861]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0827]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0836]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.091] Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0789]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0938]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0923]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0841]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0913]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0941]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0929]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0798]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.108] Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.087]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0814]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0797]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0827]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0914]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0828]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0709]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0754]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0838]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, 
train/train_loss_1=0.0903]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.1] Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0879]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0873]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0899]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0852]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.078] Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0854]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0959]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0752]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0785]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0889]Epoch 6: 54%|#####3 | 352/655 [00:00<00:00, 711.45batch/s, train/train_loss_1=0.0947]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0947]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0942]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0693]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0881]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0893]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0981]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0905]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0947]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0903]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.102] Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0937]Epoch 6: 
65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0961]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0913]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0848]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0925]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0921]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0913]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0822]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.091] Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0923]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0783]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0855]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0908]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0948]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0719]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0806]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0784]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0965]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.105] Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0908]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0991]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0996]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0826]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.105] Epoch 6: 65%|######4 | 425/655 
[00:00<00:00, 714.47batch/s, train/train_loss_1=0.0889]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0857]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0871]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0802]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0851]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0855]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0918]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0763]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0892]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.102] Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.082]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0819]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0884]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0816]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0869]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0976]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0777]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0857]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0968]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.1] Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0894]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0916]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0803]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 
714.47batch/s, train/train_loss_1=0.0903]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0954]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0827]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.109] Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0774]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0913]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0895]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0805]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.102] Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0793]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0931]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0774]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0911]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0842]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.0793]Epoch 6: 65%|######4 | 425/655 [00:00<00:00, 714.47batch/s, train/train_loss_1=0.098] Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.098]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0869]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0979]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.11] Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0907]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0858]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0855]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, 
train/train_loss_1=0.0725]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0752]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0962]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0848]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0847]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0745]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0668]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0873]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0964]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0905]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0755]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0834]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0939]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0935]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0892]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0902]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0884]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0856]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0835]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0776]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.076] Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0868]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0841]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 
715.30batch/s, train/train_loss_1=0.0952]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0974]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0953]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.093] Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0952]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0871]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0968]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0675]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0993]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0769]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.08] Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.095]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0908]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0908]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0786]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0902]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0924]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0771]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0861]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0916]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0901]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0959]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0902]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 
715.30batch/s, train/train_loss_1=0.0878]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0831]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0908]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0806]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0869]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0792]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0854]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0826]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0892]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0875]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0967]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0913]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0782]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0933]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.105] Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0968]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0823]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.092] Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.082]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0905]Epoch 6: 76%|#######5 | 497/655 [00:00<00:00, 715.30batch/s, train/train_loss_1=0.0888]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0888]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0898]Epoch 6: 87%|########7 | 570/655 
[00:00<00:00, 717.45batch/s, train/train_loss_1=0.103] Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0922]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0894]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0838]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0961]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0979]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0859]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0767]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0851]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0765]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0788]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0833]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0915]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0876]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0876]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.096] Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0828]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.066] Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0999]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0918]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0907]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0882]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0816]Epoch 6: 
87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.091] Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0918]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0866]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0877]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0867]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0919]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0911]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.109] Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0783]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0979]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0705]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0927]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0911]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0939]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.086] Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0838]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0893]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0692]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0852]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0979]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0992]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0921]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, 
train/train_loss_1=0.0876]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.101] Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0909]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0962]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0654]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.089] Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.074]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0974]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.088] Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0802]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0967]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0812]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0795]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0747]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0853]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0951]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0793]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.079] Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0837]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0921]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0901]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0924]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0893]Epoch 6: 87%|########7 | 570/655 
[00:00<00:00, 717.45batch/s, train/train_loss_1=0.0883]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.0874]Epoch 6: 87%|########7 | 570/655 [00:00<00:00, 717.45batch/s, train/train_loss_1=0.102] Epoch 6: 98%|#########8| 642/655 [00:00<00:00, 698.73batch/s, train/train_loss_1=0.102]Epoch 6: 98%|#########8| 642/655 [00:00<00:00, 698.73batch/s, train/train_loss_1=0.0789]Epoch 6: 98%|#########8| 642/655 [00:00<00:00, 698.73batch/s, train/train_loss_1=0.0961]Epoch 6: 98%|#########8| 642/655 [00:00<00:00, 698.73batch/s, train/train_loss_1=0.0726]Epoch 6: 98%|#########8| 642/655 [00:00<00:00, 698.73batch/s, train/train_loss_1=0.0792]Epoch 6: 98%|#########8| 642/655 [00:00<00:00, 698.73batch/s, train/train_loss_1=0.0927]Epoch 6: 98%|#########8| 642/655 [00:00<00:00, 698.73batch/s, train/train_loss_1=0.0756]Epoch 6: 98%|#########8| 642/655 [00:00<00:00, 698.73batch/s, train/train_loss_1=0.0934]Epoch 6: 98%|#########8| 642/655 [00:00<00:00, 698.73batch/s, train/train_loss_1=0.107] Epoch 6: 98%|#########8| 642/655 [00:00<00:00, 698.73batch/s, train/train_loss_1=0.0905]Epoch 6: 98%|#########8| 642/655 [00:00<00:00, 698.73batch/s, train/train_loss_1=0.0938]Epoch 6: 98%|#########8| 642/655 [00:00<00:00, 698.73batch/s, train/train_loss_1=0.0738]Epoch 6: 98%|#########8| 642/655 [00:00<00:00, 698.73batch/s, train/train_loss_1=0.0854]Epoch 6: 98%|#########8| 642/655 [00:00<00:00, 698.73batch/s, train/train_loss_1=0.0807] 0%| | 0/655 [00:00<?, ?batch/s]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.076]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.103]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0825]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0919]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0964]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0925]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, 
train/train_loss_1=0.0579]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0894]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0798]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.076] Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0945]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0787]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0832]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0916]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0702]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0779]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0762]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0991]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0948]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.093] Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0769]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.101] Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0942]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0942]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0941]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.083] Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0735]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.086] Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0885]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0805]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0803]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0932]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0813]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0802]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0721]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0937]Epoch 7: 0%| | 0/655 [00:00<?, 
?batch/s, train/train_loss_1=0.0885]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0998]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0764]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.1] Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0787]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0931]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0887]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.08] Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.114]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.107]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0785]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0812]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0999]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0762]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0804]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0817]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.087] Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.092]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0965]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.082] Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0888]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.102] Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0747]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.104] Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0965]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0954]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0705]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0938]Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.084] Epoch 7: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0899]Epoch 7: 0%| | 0/655 
[00:00<?, ?batch/s, train/train_loss_1=0.0962]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0962]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.101] Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0939]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0758]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0797]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0906]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0906]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0907]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0819]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0838]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0956]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0851]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.082] Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0871]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0811]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0826]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.086] Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0772]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0796]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0748]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0996]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0887]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0853]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0817]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, 
train/train_loss_1=0.101] Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0953]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0813]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0892]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0739]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0872]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0819]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.083] Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0917]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0676]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0747]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0924]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0902]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0826]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0811]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0872]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0796]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0888]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0959]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0762]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0822]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0868]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0862]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0949]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0885]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, 
train/train_loss_1=0.106] Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0912]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0886]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0898]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0835]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0712]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0905]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0852]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0844]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0766]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0834]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0867]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0936]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0776]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0866]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0806]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0909]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0662]Epoch 7: 10%|# | 67/655 [00:00<00:00, 665.41batch/s, train/train_loss_1=0.0774]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.0774]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.0861]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.0781]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.0718]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.0882]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.0833]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, 
train/train_loss_1=0.0839]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.0948]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.097] Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.0814]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.0897]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.0868]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.073] Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.067]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.0802]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.0901]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.0847]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.0888]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.085] Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.0727]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.0905]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.0857]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.0979]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.0829]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.0718]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.089] Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.0859]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.0877]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.08] Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.0871]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.0924]Epoch 7: 20%|## | 134/655 
[00:00<00:00, 657.85batch/s, train/train_loss_1=0.095] Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.1] Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.0775]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.0933]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.0746]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.0775]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.0874]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.0867]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.088] Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.0943]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.104] Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.087]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.0792]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.0885]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.0864]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.0805]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.0787]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.0898]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.0836]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.104] Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.0965]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.0878]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.0888]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.0879]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, 
train/train_loss_1=0.0734]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.0775]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.0861]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.078] Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.0903]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.104] Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.0887]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.0937]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.0845]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.102] Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.0789]Epoch 7: 20%|## | 134/655 [00:00<00:00, 657.85batch/s, train/train_loss_1=0.0877]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0877]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0959]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0847]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0894]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0944]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0761]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0897]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0917]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0969]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0902]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0839]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0875]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0702]Epoch 7: 31%|### 
| 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0833]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.102] Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0974]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0713]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0951]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.087] Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0719]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0651]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0759]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0753]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.086] Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.083]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.109]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.088]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0911]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.079] Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0994]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0798]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0827]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0858]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.1] Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0924]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0838]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0976]Epoch 7: 31%|### | 200/655 [00:00<00:00, 
658.79batch/s, train/train_loss_1=0.086] Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0945]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0914]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0758]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0967]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0807]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0886]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0832]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0904]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0729]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.082] Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0872]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0785]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0839]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0941]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0969]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0807]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0912]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0789]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0984]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0871]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0749]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0898]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0876]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, 
train/train_loss_1=0.104] Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0815]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.094] Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.103]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0821]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0893]Epoch 7: 31%|### | 200/655 [00:00<00:00, 658.79batch/s, train/train_loss_1=0.0962]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0962]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0971]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.093] Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0936]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0814]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0871]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0961]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0837]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0892]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.082] Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0854]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0814]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0922]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0826]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0839]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0738]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0872]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, 
train/train_loss_1=0.0933]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0759]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0776]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0722]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0667]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0777]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0772]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.101] Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0951]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0778]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0979]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0999]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0816]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0885]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0879]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0927]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0742]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0967]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0879]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.096] Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0866]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0862]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0847]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0885]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, 
train/train_loss_1=0.0833]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0897]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0723]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0875]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.09] Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0905]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0811]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0941]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0834]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.101] Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0765]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0853]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0897]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0838]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0899]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0805]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0947]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.078] Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0962]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0935]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0862]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0932]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0806]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0926]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, 
train/train_loss_1=0.0826]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0865]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.083] Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0839]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0749]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.0897]Epoch 7: 41%|#### | 267/655 [00:00<00:00, 660.50batch/s, train/train_loss_1=0.091] Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.091]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.104]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.103]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0915]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0771]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0867]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.106] Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.084]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0832]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.091] Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0895]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0899]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.102] Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0797]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0884]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0835]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0794]Epoch 7: 52%|#####1 | 338/655 
[00:00<00:00, 677.39batch/s, train/train_loss_1=0.083] Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0789]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.107] Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0957]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0838]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.092] Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0834]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.103] Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0999]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0702]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0849]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0977]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0878]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0885]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0987]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0859]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0783]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0834]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0806]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0759]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0881]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0874]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0907]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, 
train/train_loss_1=0.0929]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0877]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.08] Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0857]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.087] Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0835]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0836]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0821]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0872]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.085] Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0865]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0709]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0982]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0883]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0847]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0873]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0894]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0793]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0713]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.08] Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0912]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0856]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0964]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0891]Epoch 7: 52%|#####1 
| 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0819]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0873]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.102] Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0841]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0693]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0837]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0828]Epoch 7: 52%|#####1 | 338/655 [00:00<00:00, 677.39batch/s, train/train_loss_1=0.0913]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0913]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0805]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0747]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0862]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0701]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.079] Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0812]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0823]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0838]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0956]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0892]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0811]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0848]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0871]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0756]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 
687.81batch/s, train/train_loss_1=0.0916]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0858]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0845]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.104] Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.105]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0925]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.079] Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0977]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0819]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.078] Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0901]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0841]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0747]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0863]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0776]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0799]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0965]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0745]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0963]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0845]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0751]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0883]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0882]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, 
train/train_loss_1=0.0909]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.083] Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0842]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0744]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0951]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0952]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.088] Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.082]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0925]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.083] Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.105]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0932]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0892]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0984]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0883]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.089] Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0809]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0814]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0734]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0862]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0863]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.081] Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0841]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, 
train/train_loss_1=0.0625]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0885]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0963]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0987]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0828]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0795]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0733]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0798]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.078] Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0895]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0912]Epoch 7: 62%|######2 | 409/655 [00:00<00:00, 687.81batch/s, train/train_loss_1=0.0955]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0955]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0834]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0794]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0723]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.101] Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0824]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0934]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0835]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0864]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0793]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0837]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, 
train/train_loss_1=0.086] Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0837]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0899]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0865]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0939]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0867]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0951]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.079] Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0783]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0741]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0804]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0905]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0909]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0769]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0791]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0807]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0903]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0832]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0819]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0864]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0845]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0825]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0782]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 
698.14batch/s, train/train_loss_1=0.0843]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0689]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0843]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0935]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.087] Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0936]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0905]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0765]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0893]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0919]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0789]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0931]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0764]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0882]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.092] Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0956]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0739]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0929]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0801]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0834]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0929]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.101] Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0833]Epoch 7: 73%|#######3 | 481/655 
[00:00<00:00, 698.14batch/s, train/train_loss_1=0.0849]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0973]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0897]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0788]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.084] Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0856]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0887]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.087] Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0797]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0854]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0792]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0812]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0734]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.088] Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.081]Epoch 7: 73%|#######3 | 481/655 [00:00<00:00, 698.14batch/s, train/train_loss_1=0.0834]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0834]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0796]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0875]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0789]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0734]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0938]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0828]Epoch 7: 84%|########4 | 
553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0785]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0914]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0836]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.083] Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0791]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0947]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0865]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0862]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.081] Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0905]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0859]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.107] Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0826]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0728]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.105] Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0782]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.068] Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0849]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0835]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0729]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0803]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0856]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, 
train/train_loss_1=0.0906]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0951]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0812]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0978]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0749]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0964]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0887]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0843]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0833]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0843]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0797]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0906]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0877]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0888]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0761]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0855]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0707]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0969]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0893]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.074] Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.076]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0894]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0847]Epoch 7: 84%|########4 | 553/655 
[00:00<00:00, 703.01batch/s, train/train_loss_1=0.073] Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0734]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.103] Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.092]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0766]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0835]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0731]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0805]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0808]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0922]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0919]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.093] Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0867]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0868]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0832]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0887]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0781]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.086] Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0802]Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.086] Epoch 7: 84%|########4 | 553/655 [00:00<00:00, 703.01batch/s, train/train_loss_1=0.0796]Epoch 7: 95%|#########5| 625/655 [00:00<00:00, 706.33batch/s, train/train_loss_1=0.0796]Epoch 7: 95%|#########5| 625/655 [00:00<00:00, 706.33batch/s, train/train_loss_1=0.0861]Epoch 7: 
95%|#########5| 625/655 [00:00<00:00, 706.33batch/s, train/train_loss_1=0.0863]Epoch 7: 95%|#########5| 625/655 [00:00<00:00, 706.33batch/s, train/train_loss_1=0.0898]Epoch 7: 95%|#########5| 625/655 [00:00<00:00, 706.33batch/s, train/train_loss_1=0.0934]Epoch 7: 95%|#########5| 625/655 [00:00<00:00, 706.33batch/s, train/train_loss_1=0.0852]Epoch 7: 95%|#########5| 625/655 [00:00<00:00, 706.33batch/s, train/train_loss_1=0.0849]Epoch 7: 95%|#########5| 625/655 [00:00<00:00, 706.33batch/s, train/train_loss_1=0.0888]Epoch 7: 95%|#########5| 625/655 [00:00<00:00, 706.33batch/s, train/train_loss_1=0.0949]Epoch 7: 95%|#########5| 625/655 [00:00<00:00, 706.33batch/s, train/train_loss_1=0.0924]Epoch 7: 95%|#########5| 625/655 [00:00<00:00, 706.33batch/s, train/train_loss_1=0.097] Epoch 7: 95%|#########5| 625/655 [00:00<00:00, 706.33batch/s, train/train_loss_1=0.101]Epoch 7: 95%|#########5| 625/655 [00:00<00:00, 706.33batch/s, train/train_loss_1=0.0823]Epoch 7: 95%|#########5| 625/655 [00:00<00:00, 706.33batch/s, train/train_loss_1=0.0912]Epoch 7: 95%|#########5| 625/655 [00:00<00:00, 706.33batch/s, train/train_loss_1=0.0903]Epoch 7: 95%|#########5| 625/655 [00:00<00:00, 706.33batch/s, train/train_loss_1=0.0989]Epoch 7: 95%|#########5| 625/655 [00:00<00:00, 706.33batch/s, train/train_loss_1=0.0812]Epoch 7: 95%|#########5| 625/655 [00:00<00:00, 706.33batch/s, train/train_loss_1=0.0858]Epoch 7: 95%|#########5| 625/655 [00:00<00:00, 706.33batch/s, train/train_loss_1=0.0862]Epoch 7: 95%|#########5| 625/655 [00:00<00:00, 706.33batch/s, train/train_loss_1=0.0758]Epoch 7: 95%|#########5| 625/655 [00:00<00:00, 706.33batch/s, train/train_loss_1=0.085] Epoch 7: 95%|#########5| 625/655 [00:00<00:00, 706.33batch/s, train/train_loss_1=0.0853]Epoch 7: 95%|#########5| 625/655 [00:00<00:00, 706.33batch/s, train/train_loss_1=0.0756]Epoch 7: 95%|#########5| 625/655 [00:00<00:00, 706.33batch/s, train/train_loss_1=0.0807]Epoch 7: 95%|#########5| 625/655 [00:00<00:00, 706.33batch/s, 
train/train_loss_1=0.104] Epoch 7: 95%|#########5| 625/655 [00:00<00:00, 706.33batch/s, train/train_loss_1=0.0932]Epoch 7: 95%|#########5| 625/655 [00:00<00:00, 706.33batch/s, train/train_loss_1=0.103] Epoch 7: 95%|#########5| 625/655 [00:00<00:00, 706.33batch/s, train/train_loss_1=0.0891]Epoch 7: 95%|#########5| 625/655 [00:00<00:00, 706.33batch/s, train/train_loss_1=0.0829]Epoch 7: 95%|#########5| 625/655 [00:00<00:00, 706.33batch/s, train/train_loss_1=0.0915]Epoch 7: 95%|#########5| 625/655 [00:00<00:00, 706.33batch/s, train/train_loss_1=0.0782] 0%| | 0/655 [00:00<?, ?batch/s]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.076]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0884]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0912]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.105] Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0878]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0947]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.091] Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0907]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0872]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0962]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.074] Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0819]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0814]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0965]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.073] Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0872]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0769]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0769]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0732]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0993]Epoch 8: 0%| | 0/655 [00:00<?, 
?batch/s, train/train_loss_1=0.089] Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0986]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0756]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0842]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.077] Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0811]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.064] Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0913]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0966]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0876]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0754]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0776]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0864]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0823]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0943]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0884]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0978]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0884]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0796]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0801]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0818]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0905]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0788]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0864]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0958]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0879]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0903]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0879]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.082] Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.094]Epoch 8: 0%| | 0/655 
[00:00<?, ?batch/s, train/train_loss_1=0.0782]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0847]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0857]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.081] Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0869]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0956]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0826]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0811]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0835]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0888]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0885]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0881]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0817]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.077] Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0873]Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.075] Epoch 8: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0772]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0772]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.103] Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0862]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0799]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0921]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0913]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0975]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0741]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0694]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0825]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0758]Epoch 8: 
10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0796]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0758]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.086] Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0924]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0764]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0781]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0847]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0916]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0925]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0975]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.084] Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.101]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0763]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0869]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.101] Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0981]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0914]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0785]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0817]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0909]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.083] Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0861]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0791]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0839]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0951]Epoch 8: 10%|# | 67/655 
[00:00<00:00, 665.99batch/s, train/train_loss_1=0.0891]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0858]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0819]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0843]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0689]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0886]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0845]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0749]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0767]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0769]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0719]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0915]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0877]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0734]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0893]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0864]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0892]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0892]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0915]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.086] Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0893]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0843]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0835]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0705]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0901]Epoch 8: 10%|# | 67/655 [00:00<00:00, 
665.99batch/s, train/train_loss_1=0.0926]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.076] Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0909]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0913]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0945]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0979]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0877]Epoch 8: 10%|# | 67/655 [00:00<00:00, 665.99batch/s, train/train_loss_1=0.0888]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0888]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.105] Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0878]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0847]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0983]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.088] Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0868]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0886]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0834]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0802]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0861]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0802]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.102] Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0762]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0848]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0822]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0932]Epoch 8: 21%|## | 135/655 
[00:00<00:00, 670.13batch/s, train/train_loss_1=0.0854]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0849]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0838]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.095] Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0958]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0822]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0838]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0907]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0919]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0807]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0864]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0919]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0899]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0873]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0773]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0807]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0795]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0827]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.09] Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.09]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0893]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.075] Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0847]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0757]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, 
train/train_loss_1=0.0913]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0937]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0911]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0853]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0763]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0787]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0756]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0965]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0937]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0826]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0911]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.072] Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0893]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0931]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0758]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0789]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0823]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0941]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0961]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0869]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0935]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0823]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0889]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0924]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0883]Epoch 8: 21%|## | 135/655 
[00:00<00:00, 670.13batch/s, train/train_loss_1=0.0835]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0937]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0878]Epoch 8: 21%|## | 135/655 [00:00<00:00, 670.13batch/s, train/train_loss_1=0.0871]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0871]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0956]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0878]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0778]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0871]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0942]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.08] Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.089]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.103]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0844]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0801]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0741]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0959]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0754]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0883]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0916]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0917]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0942]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0818]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0853]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 
675.19batch/s, train/train_loss_1=0.077] Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0871]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0946]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0888]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.096] Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0832]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0909]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0779]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0686]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0837]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0772]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0838]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0852]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0805]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0871]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0844]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0955]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0837]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0936]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0864]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0876]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0715]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.085] Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0865]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 
675.19batch/s, train/train_loss_1=0.083] Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0903]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0929]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.071] Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0883]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0885]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0906]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0777]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.085] Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0944]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0814]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0905]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0835]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0855]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0799]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.085] Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0747]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0785]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0844]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0912]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0889]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0741]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0865]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 675.19batch/s, train/train_loss_1=0.0875]Epoch 8: 31%|###1 | 204/655 [00:00<00:00, 
675.19batch/s, train/train_loss_1=0.0832]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0832]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0713]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0819]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0825]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0875]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0884]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0972]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0952]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0828]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.105] Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0914]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0882]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0789]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0798]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.1] Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0906]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0869]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0853]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0922]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0897]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0889]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0801]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0869]Epoch 8: 42%|####1 | 272/655 
[00:00<00:00, 668.86batch/s, train/train_loss_1=0.0908]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0883]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0807]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0843]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0847]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0848]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0858]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0855]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0991]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0835]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0843]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.081] Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.091]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0734]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0822]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0968]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0914]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0746]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0873]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0967]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.072] Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0844]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0919]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0925]Epoch 8: 
42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0924]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0943]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0896]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0848]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.089] Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.078]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0854]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0871]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0765]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0772]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0807]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0734]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0767]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0857]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.085] Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.103]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.092]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0861]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0803]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0833]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0816]Epoch 8: 42%|####1 | 272/655 [00:00<00:00, 668.86batch/s, train/train_loss_1=0.0846]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0846]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, 
train/train_loss_1=0.097] Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0884]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0775]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0751]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.078] Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0958]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0924]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0868]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0916]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0825]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0843]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0748]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0944]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0846]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0843]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0891]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0894]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0901]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.082] Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0887]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0854]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0925]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0841]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0713]Epoch 8: 
52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0889]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0807]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.085] Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0803]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0912]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0923]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.092] Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0808]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0966]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0799]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0893]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0883]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.087] Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.087]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0818]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.083] Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0816]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0854]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0801]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0949]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0795]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0903]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.079] Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 
671.34batch/s, train/train_loss_1=0.0903]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0773]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0782]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0858]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.102] Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0799]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0763]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0852]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0858]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.102] Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0818]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0798]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.068] Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0888]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.08] Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0818]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0757]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.081] Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0861]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.101] Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0883]Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.091] Epoch 8: 52%|#####1 | 340/655 [00:00<00:00, 671.34batch/s, train/train_loss_1=0.0901]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, 
train/train_loss_1=0.0901]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0857]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.106] Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0934]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0786]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0888]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0815]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0955]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0971]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0809]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0806]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0788]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0912]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.076] Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0895]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0806]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0861]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0834]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0869]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0863]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0796]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0794]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.112] Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, 
train/train_loss_1=0.0952]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.105] Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0712]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0983]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0665]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.081] Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0825]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0767]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0956]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.081] Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0804]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.082] Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0825]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0762]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0759]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0869]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0815]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0821]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0896]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0807]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0832]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0865]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0989]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, 
train/train_loss_1=0.0723]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.101] Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0763]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.099] Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0832]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0861]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0789]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0741]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0815]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0876]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0895]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0742]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0694]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0806]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0736]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0773]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0864]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0851]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0834]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0852]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0831]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0979]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0924]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, 
train/train_loss_1=0.086] Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0811]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0785]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0801]Epoch 8: 63%|######2 | 410/655 [00:00<00:00, 680.61batch/s, train/train_loss_1=0.0681]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0681]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0723]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0785]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0816]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.101] Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0856]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.104] Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0914]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0909]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0844]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0709]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0772]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0959]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.089] Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0935]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0667]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.102] Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0926]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, 
train/train_loss_1=0.0857]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0929]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0858]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.112] Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0866]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.08] Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0928]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0836]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0869]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0933]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0889]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0793]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.071] Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0857]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0855]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0969]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0863]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0661]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0754]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0983]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0802]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0925]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0753]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, 
train/train_loss_1=0.0795]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0877]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.098] Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0842]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0988]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0846]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0851]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0846]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0941]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0727]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.104] Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0928]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0957]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0847]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0832]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.08] Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0776]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0944]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0817]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0997]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0789]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.065] Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0841]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, 
train/train_loss_1=0.0807]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0803]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0747]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0871]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0902]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0895]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0874]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0853]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0767]Epoch 8: 74%|#######3 | 483/655 [00:00<00:00, 693.75batch/s, train/train_loss_1=0.0963]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0963]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0914]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0939]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0815]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0947]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0784]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0936]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0827]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0835]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0792]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.104] Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0928]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0895]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 
704.79batch/s, train/train_loss_1=0.0856]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0873]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0816]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0816]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0911]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0883]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.101] Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0956]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0813]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0826]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0669]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0812]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0987]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0849]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.11] Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0847]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.101] Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0879]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.106] Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0865]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0894]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0874]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0911]Epoch 8: 85%|########4 
| 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0863]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0782]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0908]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0856]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0966]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0923]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0864]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0833]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.112] Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0769]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0854]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0973]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0733]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0847]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0777]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0928]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0913]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0865]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0896]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0798]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0816]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0967]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, 
train/train_loss_1=0.0754]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0865]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0808]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0888]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0873]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.084] Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0882]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0693]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0869]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0876]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0852]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0926]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.081] Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0835]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0797]Epoch 8: 85%|########4 | 556/655 [00:00<00:00, 704.79batch/s, train/train_loss_1=0.0793]Epoch 8: 96%|#########6| 629/655 [00:00<00:00, 712.21batch/s, train/train_loss_1=0.0793]Epoch 8: 96%|#########6| 629/655 [00:00<00:00, 712.21batch/s, train/train_loss_1=0.0928]Epoch 8: 96%|#########6| 629/655 [00:00<00:00, 712.21batch/s, train/train_loss_1=0.0696]Epoch 8: 96%|#########6| 629/655 [00:00<00:00, 712.21batch/s, train/train_loss_1=0.105] Epoch 8: 96%|#########6| 629/655 [00:00<00:00, 712.21batch/s, train/train_loss_1=0.0811]Epoch 8: 96%|#########6| 629/655 [00:00<00:00, 712.21batch/s, train/train_loss_1=0.0902]Epoch 8: 96%|#########6| 629/655 [00:00<00:00, 712.21batch/s, train/train_loss_1=0.0874]Epoch 8: 96%|#########6| 629/655 
[00:00<00:00, 712.21batch/s, train/train_loss_1=0.0946]Epoch 8: 96%|#########6| 629/655 [00:00<00:00, 712.21batch/s, train/train_loss_1=0.0804]Epoch 8: 96%|#########6| 629/655 [00:00<00:00, 712.21batch/s, train/train_loss_1=0.0809]Epoch 8: 96%|#########6| 629/655 [00:00<00:00, 712.21batch/s, train/train_loss_1=0.0885]Epoch 8: 96%|#########6| 629/655 [00:00<00:00, 712.21batch/s, train/train_loss_1=0.0828]Epoch 8: 96%|#########6| 629/655 [00:00<00:00, 712.21batch/s, train/train_loss_1=0.08] Epoch 8: 96%|#########6| 629/655 [00:00<00:00, 712.21batch/s, train/train_loss_1=0.0912]Epoch 8: 96%|#########6| 629/655 [00:00<00:00, 712.21batch/s, train/train_loss_1=0.0979]Epoch 8: 96%|#########6| 629/655 [00:00<00:00, 712.21batch/s, train/train_loss_1=0.0798]Epoch 8: 96%|#########6| 629/655 [00:00<00:00, 712.21batch/s, train/train_loss_1=0.0858]Epoch 8: 96%|#########6| 629/655 [00:00<00:00, 712.21batch/s, train/train_loss_1=0.0862]Epoch 8: 96%|#########6| 629/655 [00:00<00:00, 712.21batch/s, train/train_loss_1=0.0803]Epoch 8: 96%|#########6| 629/655 [00:00<00:00, 712.21batch/s, train/train_loss_1=0.0864]Epoch 8: 96%|#########6| 629/655 [00:00<00:00, 712.21batch/s, train/train_loss_1=0.0747]Epoch 8: 96%|#########6| 629/655 [00:00<00:00, 712.21batch/s, train/train_loss_1=0.0834]Epoch 8: 96%|#########6| 629/655 [00:00<00:00, 712.21batch/s, train/train_loss_1=0.0941]Epoch 8: 96%|#########6| 629/655 [00:00<00:00, 712.21batch/s, train/train_loss_1=0.0953]Epoch 8: 96%|#########6| 629/655 [00:00<00:00, 712.21batch/s, train/train_loss_1=0.0801]Epoch 8: 96%|#########6| 629/655 [00:00<00:00, 712.21batch/s, train/train_loss_1=0.0803]Epoch 8: 96%|#########6| 629/655 [00:00<00:00, 712.21batch/s, train/train_loss_1=0.0699] 0%| | 0/655 [00:00<?, ?batch/s]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0977]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0908]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, 
train/train_loss_1=0.0965]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0873]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0923]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0758]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0747]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0788]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.103] Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0849]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.102] Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0902]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0916]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0863]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.079] Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0934]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0852]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0822]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0832]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0887]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0908]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0795]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0761]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0701]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0651]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.098] Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0857]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.086] Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0784]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0712]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0683]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0984]Epoch 9: 0%| | 0/655 [00:00<?, 
?batch/s, train/train_loss_1=0.0767]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0811]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0896]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0857]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0852]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0878]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0978]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0896]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0896]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0811]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0787]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0817]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0927]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0822]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0866]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0814]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0783]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0859]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0825]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0772]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0754]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0899]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0889]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0798]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0776]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0814]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0904]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0863]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0898]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0817]Epoch 9: 0%| | 0/655 
[00:00<?, ?batch/s, train/train_loss_1=0.0892]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0809]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0851]Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.085] Epoch 9: 0%| | 0/655 [00:00<?, ?batch/s, train/train_loss_1=0.0849]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0849]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0884]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0893]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0871]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0992]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0951]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0891]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0873]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0804]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0929]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0985]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0946]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0883]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0784]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0873]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0921]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0814]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0935]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0885]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0828]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0777]Epoch 9: 10%|# | 67/655 
[00:00<00:00, 664.54batch/s, train/train_loss_1=0.075] Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0913]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0967]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0927]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0898]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0857]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0803]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0849]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0787]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0882]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0876]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0852]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0869]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0839]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.097] Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0839]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0921]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0858]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.103] Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0852]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0935]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0811]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0688]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0813]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0832]Epoch 9: 10%|# | 67/655 [00:00<00:00, 
664.54batch/s, train/train_loss_1=0.0916]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.102] Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0751]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0843]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.107] Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0894]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0926]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0978]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0832]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0987]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0819]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0793]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0853]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0667]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0888]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.11] Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0883]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0695]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0756]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0902]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0947]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.0787]Epoch 9: 10%|# | 67/655 [00:00<00:00, 664.54batch/s, train/train_loss_1=0.103] Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.103]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0784]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, 
train/train_loss_1=0.0868]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0941]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0723]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0998]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0831]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0804]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.093] Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0818]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0749]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0888]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0715]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0767]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0787]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.093] Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0856]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0951]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0996]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0687]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.103] Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0892]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0796]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0934]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0977]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0999]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.109] Epoch 9: 21%|## | 135/655 
[00:00<00:00, 667.06batch/s, train/train_loss_1=0.0882]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0726]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0773]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0784]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0868]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0786]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0918]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0935]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.102] Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0864]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0721]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0778]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0792]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0898]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0928]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0704]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0909]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0921]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0821]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0835]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0679]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0803]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.071] Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0973]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, 
train/train_loss_1=0.0817]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0806]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0929]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0746]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0837]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0874]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0852]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0739]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0822]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0927]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0765]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0837]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0904]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0943]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0892]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0872]Epoch 9: 21%|## | 135/655 [00:00<00:00, 667.06batch/s, train/train_loss_1=0.0912]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0912]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0887]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0782]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0857]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0772]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0942]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0787]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0889]Epoch 9: 31%|### | 
202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0977]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0841]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0804]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0731]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0873]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0807]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0882]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.106] Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0771]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.1] Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0776]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0755]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0821]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0956]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0946]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0973]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0886]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0893]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0679]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0715]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0914]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0888]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0704]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0861]Epoch 9: 31%|### | 202/655 [00:00<00:00, 
662.64batch/s, train/train_loss_1=0.0988]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0776]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0768]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0896]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0758]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.08] Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0821]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0771]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0799]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.083] Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0919]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0915]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0686]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0862]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.082] Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0989]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0771]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0806]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0848]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0916]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.084] Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0938]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0727]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0902]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, 
train/train_loss_1=0.0838]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.087] Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0897]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0785]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0887]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0746]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.09] Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0973]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0931]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.106] Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0842]Epoch 9: 31%|### | 202/655 [00:00<00:00, 662.64batch/s, train/train_loss_1=0.0888]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0888]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0744]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0717]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0796]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0906]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0811]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0912]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0944]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0969]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0835]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0799]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0752]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, 
train/train_loss_1=0.0917]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0815]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0826]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0735]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0901]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0903]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0881]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0809]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0672]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0955]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0931]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0836]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.101] Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0878]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.073] Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0795]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0993]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.098] Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0834]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0936]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0752]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0812]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0776]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0877]Epoch 9: 41%|####1 | 269/655 
[00:00<00:00, 663.42batch/s, train/train_loss_1=0.0813]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.107] Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0947]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0984]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0944]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0849]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0874]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0865]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0703]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0821]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0798]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0808]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0846]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0766]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0785]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0967]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.108] Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0786]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0801]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0883]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0802]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0829]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0955]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0844]Epoch 9: 
41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.087] Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0778]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0722]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0858]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0859]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0957]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0633]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.103] Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.0763]Epoch 9: 41%|####1 | 269/655 [00:00<00:00, 663.42batch/s, train/train_loss_1=0.087] Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.087]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0784]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.069] Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0912]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0944]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0875]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0871]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0729]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0798]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.089] Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0738]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0707]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0841]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, 
train/train_loss_1=0.0816]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.085] Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0879]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0733]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0912]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0916]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0845]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0712]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0811]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0925]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0658]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.101] Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.106]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0948]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0777]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0903]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0905]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0798]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0857]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.085] Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.085]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0812]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0799]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0772]Epoch 9: 52%|#####1 
| 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0735]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0793]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0981]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0939]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0874]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0773]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0886]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0856]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0865]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.084] Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0806]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0957]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.09] Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0782]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0947]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0804]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0664]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0769]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0903]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0703]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0951]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0747]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0967]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, 
train/train_loss_1=0.0824]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0857]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0806]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0779]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0682]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0858]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0829]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0775]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.07] Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0694]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0897]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.102] Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0894]Epoch 9: 52%|#####1 | 338/655 [00:00<00:00, 671.28batch/s, train/train_loss_1=0.0842]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0842]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0791]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0854]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0795]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0893]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0856]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0804]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0842]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0984]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0753]Epoch 9: 
63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0662]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0928]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0694]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0719]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0788]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0776]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0737]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0906]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.102] Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.103]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0672]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0926]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0785]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0875]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0809]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0758]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0835]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0761]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0852]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0899]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0752]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0975]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0743]Epoch 9: 63%|######2 | 411/655 
[00:00<00:00, 689.93batch/s, train/train_loss_1=0.0759]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.103] Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0954]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0696]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0895]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0652]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0915]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0842]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0773]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.087] Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0844]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0869]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0942]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.08] Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0731]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0909]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0878]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0762]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0758]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0851]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0861]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0874]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0829]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 
689.93batch/s, train/train_loss_1=0.0852]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0792]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0817]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0814]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0964]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0889]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.1] Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0859]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0876]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.091] Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0937]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0913]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.071] Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0802]Epoch 9: 63%|######2 | 411/655 [00:00<00:00, 689.93batch/s, train/train_loss_1=0.0935]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0935]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0926]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0983]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0849]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0854]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0951]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0826]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0844]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, 
train/train_loss_1=0.0885]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0959]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0866]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0939]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0728]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0783]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0777]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0962]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0837]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0925]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.078] Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0858]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0691]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.102] Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0709]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0891]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0843]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0927]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0893]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0819]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0913]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0882]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.106] Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 
689.15batch/s, train/train_loss_1=0.0906]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0985]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0839]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0993]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0886]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.082] Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0896]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0825]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0983]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0913]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0984]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.09] Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0735]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0902]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.101] Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0901]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0858]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0724]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0843]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.084] Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.08] Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0737]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0937]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 
689.15batch/s, train/train_loss_1=0.0809]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.091] Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.094]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0762]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0885]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0829]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0851]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0905]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0791]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0829]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0956]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0829]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.107] Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0855]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0864]Epoch 9: 73%|#######3 | 481/655 [00:00<00:00, 689.15batch/s, train/train_loss_1=0.0806]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0806]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0793]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0821]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0896]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.079] Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0833]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0857]Epoch 9: 84%|########3 | 550/655 
[00:00<00:00, 682.85batch/s, train/train_loss_1=0.0915]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0961]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0804]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0941]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0924]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0906]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0826]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0838]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0943]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.097] Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0868]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0881]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0919]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0759]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0855]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0741]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0837]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0796]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0828]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0841]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0871]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0891]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0859]Epoch 9: 
84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0729]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0837]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0874]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0858]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0881]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0849]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0963]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0964]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0937]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.102] Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0822]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0803]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0868]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0914]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.083] Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.075]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0826]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0816]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0924]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0874]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0745]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0833]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, 
train/train_loss_1=0.0879]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0703]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0723]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0941]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0772]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0755]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0895]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0993]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0851]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0978]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0792]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.106] Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0773]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0904]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0773]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0777]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0765]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0838]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0824]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.102] Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0801]Epoch 9: 84%|########3 | 550/655 [00:00<00:00, 682.85batch/s, train/train_loss_1=0.0715]Epoch 9: 95%|#########5| 623/655 [00:00<00:00, 696.75batch/s, train/train_loss_1=0.0715]Epoch 9: 95%|#########5| 623/655 
[00:00<00:00, 696.75batch/s, train/train_loss_1=0.0825]Epoch 9: 95%|#########5| 623/655 [00:00<00:00, 696.75batch/s, train/train_loss_1=0.0657]Epoch 9: 95%|#########5| 623/655 [00:00<00:00, 696.75batch/s, train/train_loss_1=0.0835]Epoch 9: 95%|#########5| 623/655 [00:00<00:00, 696.75batch/s, train/train_loss_1=0.0993]Epoch 9: 95%|#########5| 623/655 [00:00<00:00, 696.75batch/s, train/train_loss_1=0.0823]Epoch 9: 95%|#########5| 623/655 [00:00<00:00, 696.75batch/s, train/train_loss_1=0.0825]Epoch 9: 95%|#########5| 623/655 [00:00<00:00, 696.75batch/s, train/train_loss_1=0.0718]Epoch 9: 95%|#########5| 623/655 [00:00<00:00, 696.75batch/s, train/train_loss_1=0.081] Epoch 9: 95%|#########5| 623/655 [00:00<00:00, 696.75batch/s, train/train_loss_1=0.0811]Epoch 9: 95%|#########5| 623/655 [00:00<00:00, 696.75batch/s, train/train_loss_1=0.0774]Epoch 9: 95%|#########5| 623/655 [00:00<00:00, 696.75batch/s, train/train_loss_1=0.0868]Epoch 9: 95%|#########5| 623/655 [00:00<00:00, 696.75batch/s, train/train_loss_1=0.0827]Epoch 9: 95%|#########5| 623/655 [00:00<00:00, 696.75batch/s, train/train_loss_1=0.0764]Epoch 9: 95%|#########5| 623/655 [00:00<00:00, 696.75batch/s, train/train_loss_1=0.087] Epoch 9: 95%|#########5| 623/655 [00:00<00:00, 696.75batch/s, train/train_loss_1=0.0902]Epoch 9: 95%|#########5| 623/655 [00:00<00:00, 696.75batch/s, train/train_loss_1=0.0768]Epoch 9: 95%|#########5| 623/655 [00:00<00:00, 696.75batch/s, train/train_loss_1=0.0874]Epoch 9: 95%|#########5| 623/655 [00:00<00:00, 696.75batch/s, train/train_loss_1=0.0894]Epoch 9: 95%|#########5| 623/655 [00:00<00:00, 696.75batch/s, train/train_loss_1=0.0972]Epoch 9: 95%|#########5| 623/655 [00:00<00:00, 696.75batch/s, train/train_loss_1=0.0896]Epoch 9: 95%|#########5| 623/655 [00:00<00:00, 696.75batch/s, train/train_loss_1=0.0904]Epoch 9: 95%|#########5| 623/655 [00:00<00:00, 696.75batch/s, train/train_loss_1=0.0861]Epoch 9: 95%|#########5| 623/655 [00:00<00:00, 696.75batch/s, train/train_loss_1=0.103] Epoch 9: 
95%|#########5| 623/655 [00:00<00:00, 696.75batch/s, train/train_loss_1=0.0956]Epoch 9: 95%|#########5| 623/655 [00:00<00:00, 696.75batch/s, train/train_loss_1=0.089] Epoch 9: 95%|#########5| 623/655 [00:00<00:00, 696.75batch/s, train/train_loss_1=0.0949]Epoch 9: 95%|#########5| 623/655 [00:00<00:00, 696.75batch/s, train/train_loss_1=0.0862]Epoch 9: 95%|#########5| 623/655 [00:00<00:00, 696.75batch/s, train/train_loss_1=0.0863]Epoch 9: 95%|#########5| 623/655 [00:00<00:00, 696.75batch/s, train/train_loss_1=0.0838]Epoch 9: 95%|#########5| 623/655 [00:00<00:00, 696.75batch/s, train/train_loss_1=0.0872]Epoch 9: 95%|#########5| 623/655 [00:00<00:00, 696.75batch/s, train/train_loss_1=0.0921]Epoch 9: 95%|#########5| 623/655 [00:00<00:00, 696.75batch/s, train/train_loss_1=0.0672]Epoch 9: 100%|##########| 655/655 [00:01<00:00, 647.74batch/s, train/train_loss_1=0.0672]