tinygrad Examples

Output from running the Python scripts in test/.

Note: builds were running at the same time as these examples and loading 128 processors, so the timings below should not be read as benchmarks.

All tinygrad example runs so far used a single GPU.

beautiful_cartpole.py

  0%|          | 0/40 [00:00<?, ?it/s]
sz:    16 steps/s:    4.53 action_loss:  -8.405 entropy_loss:  -0.677 critic_loss:   83.913 reward:  16.00:   0%|          | 0/40 [00:03<?, ?it/s]
sz:    16 steps/s:    4.53 action_loss:  -8.405 entropy_loss:  -0.677 critic_loss:   83.913 reward:  16.00:   2%|▎         | 1/40 [00:03<02:18,  3.54s/it]
sz:    42 steps/s:    9.89 action_loss:  -9.755 entropy_loss:  -0.656 critic_loss:  130.257 reward:  26.00:   2%|▎         | 1/40 [00:04<02:18,  3.54s/it]
sz:    42 steps/s:    9.89 action_loss:  -9.755 entropy_loss:  -0.656 critic_loss:  130.257 reward:  26.00:   5%|▌         | 2/40 [00:04<01:11,  1.88s/it]
sz:    62 steps/s:   12.84 action_loss:  -9.323 entropy_loss:  -0.632 critic_loss:  120.510 reward:  20.00:   5%|▌         | 2/40 [00:04<01:11,  1.88s/it]
sz:    62 steps/s:   12.84 action_loss:  -9.323 entropy_loss:  -0.632 critic_loss:  120.510 reward:  20.00:   8%|▊         | 3/40 [00:04<00:47,  1.28s/it]
sz:   125 steps/s:   22.83 action_loss: -16.834 entropy_loss:  -0.612 critic_loss:  461.137 reward:  63.00:   8%|▊         | 3/40 [00:05<00:47,  1.28s/it]
sz:   125 steps/s:   22.83 action_loss: -16.834 entropy_loss:  -0.612 critic_loss:  461.137 reward:  63.00:  10%|█         | 4/40 [00:05<00:37,  1.03s/it]
sz:   140 steps/s:   22.87 action_loss: -14.485 entropy_loss:  -0.604 critic_loss:  378.145 reward:  15.00:  10%|█         | 4/40 [00:06<00:37,  1.03s/it]
sz:   140 steps/s:   22.87 action_loss: -14.485 entropy_loss:  -0.604 critic_loss:  378.145 reward:  15.00:  12%|█▎        | 5/40 [00:06<00:31,  1.12it/s]
sz:   158 steps/s:   23.51 action_loss: -12.541 entropy_loss:  -0.609 critic_loss:  317.384 reward:  18.00:  12%|█▎        | 5/40 [00:06<00:31,  1.12it/s]
sz:   158 steps/s:   23.51 action_loss: -12.541 entropy_loss:  -0.609 critic_loss:  317.384 reward:  18.00:  15%|█▌        | 6/40 [00:06<00:26,  1.26it/s]
sz:   173 steps/s:   23.68 action_loss: -10.413 entropy_loss:  -0.618 critic_loss:  270.751 reward:  15.00:  15%|█▌        | 6/40 [00:07<00:26,  1.26it/s]
sz:   173 steps/s:   23.68 action_loss: -10.413 entropy_loss:  -0.618 critic_loss:  270.751 reward:  15.00:  18%|█▊        | 7/40 [00:07<00:23,  1.38it/s]
sz:   197 steps/s:   24.93 action_loss:  -9.082 entropy_loss:  -0.623 critic_loss:  249.494 reward:  24.00:  18%|█▊        | 7/40 [00:07<00:23,  1.38it/s]
sz:   197 steps/s:   24.93 action_loss:  -9.082 entropy_loss:  -0.623 critic_loss:  249.494 reward:  24.00:  20%|██        | 8/40 [00:07<00:21,  1.46it/s]
sz:   234 steps/s:   27.33 action_loss:  -8.351 entropy_loss:  -0.608 critic_loss:  237.388 reward:  37.00:  20%|██        | 8/40 [00:08<00:21,  1.46it/s]
sz:   234 steps/s:   27.33 action_loss:  -8.351 entropy_loss:  -0.608 critic_loss:  237.388 reward:  37.00:  22%|██▎       | 9/40 [00:08<00:20,  1.48it/s]
sz:   284 steps/s:   30.63 action_loss:  -8.378 entropy_loss:  -0.566 critic_loss:  238.783 reward:  50.00:  22%|██▎       | 9/40 [00:09<00:20,  1.48it/s]
sz:   284 steps/s:   30.63 action_loss:  -8.378 entropy_loss:  -0.566 critic_loss:  238.783 reward:  50.00:  25%|██▌       | 10/40 [00:09<00:20,  1.46it/s]
sz:   327 steps/s:   32.95 action_loss:  -7.069 entropy_loss:  -0.520 critic_loss:  217.788 reward:  43.00:  25%|██▌       | 10/40 [00:09<00:20,  1.46it/s]
sz:   327 steps/s:   32.95 action_loss:  -7.069 entropy_loss:  -0.520 critic_loss:  217.788 reward:  43.00:  28%|██▊       | 11/40 [00:09<00:19,  1.48it/s]
sz:   356 steps/s:   33.55 action_loss:  -3.947 entropy_loss:  -0.461 critic_loss:  144.527 reward:  29.00:  28%|██▊       | 11/40 [00:10<00:19,  1.48it/s]
sz:   356 steps/s:   33.55 action_loss:  -3.947 entropy_loss:  -0.461 critic_loss:  144.527 reward:  29.00:  30%|███       | 12/40 [00:10<00:19,  1.47it/s]
sz:   479 steps/s:   42.10 action_loss:  -8.053 entropy_loss:  -0.448 critic_loss:  421.731 reward: 123.00:  30%|███       | 12/40 [00:11<00:19,  1.47it/s]
sz:   479 steps/s:   42.10 action_loss:  -8.053 entropy_loss:  -0.448 critic_loss:  421.731 reward: 123.00:  32%|███▎      | 13/40 [00:11<00:19,  1.42it/s]
sz:   645 steps/s:   52.32 action_loss: -15.570 entropy_loss:  -0.464 critic_loss:  810.073 reward: 166.00:  32%|███▎      | 13/40 [00:12<00:19,  1.42it/s]
sz:   645 steps/s:   52.32 action_loss: -15.570 entropy_loss:  -0.464 critic_loss:  810.073 reward: 166.00:  35%|███▌      | 14/40 [00:12<00:20,  1.28it/s]
sz:   792 steps/s:   60.34 action_loss: -15.626 entropy_loss:  -0.414 critic_loss:  871.600 reward: 147.00:  35%|███▌      | 14/40 [00:13<00:20,  1.28it/s]
sz:   792 steps/s:   60.34 action_loss: -15.626 entropy_loss:  -0.414 critic_loss:  871.600 reward: 147.00:  38%|███▊      | 15/40 [00:13<00:19,  1.27it/s]
sz:  1203 steps/s:   83.40 action_loss: -28.025 entropy_loss:  -0.473 critic_loss: 1730.603 reward: 411.00:  38%|███▊      | 15/40 [00:14<00:19,  1.27it/s]
sz:  1203 steps/s:   83.40 action_loss: -28.025 entropy_loss:  -0.473 critic_loss: 1730.603 reward: 411.00:  40%|████      | 16/40 [00:14<00:22,  1.06it/s]
sz:  1435 steps/s:   93.00 action_loss: -25.648 entropy_loss:  -0.458 critic_loss: 1608.802 reward: 232.00:  40%|████      | 16/40 [00:15<00:22,  1.06it/s]
sz:  1435 steps/s:   93.00 action_loss: -25.648 entropy_loss:  -0.458 critic_loss: 1608.802 reward: 232.00:  42%|████▎     | 17/40 [00:15<00:22,  1.04it/s]
sz:  1592 steps/s:   96.96 action_loss: -16.278 entropy_loss:  -0.413 critic_loss: 1147.564 reward: 157.00:  42%|████▎     | 17/40 [00:16<00:22,  1.04it/s]
sz:  1592 steps/s:   96.96 action_loss: -16.278 entropy_loss:  -0.413 critic_loss: 1147.564 reward: 157.00:  45%|████▌     | 18/40 [00:16<00:21,  1.03it/s]
sz:  1789 steps/s:  103.18 action_loss: -19.750 entropy_loss:  -0.398 critic_loss: 1422.291 reward: 197.00:  45%|████▌     | 18/40 [00:17<00:21,  1.03it/s]
sz:  1789 steps/s:  103.18 action_loss: -19.750 entropy_loss:  -0.398 critic_loss: 1422.291 reward: 197.00:  48%|████▊     | 19/40 [00:17<00:20,  1.05it/s]
sz:  1949 steps/s:  106.81 action_loss: -10.565 entropy_loss:  -0.397 critic_loss:  895.935 reward: 160.00:  48%|████▊     | 19/40 [00:18<00:20,  1.05it/s]
sz:  1949 steps/s:  106.81 action_loss: -10.565 entropy_loss:  -0.397 critic_loss:  895.935 reward: 160.00:  50%|█████     | 20/40 [00:18<00:18,  1.06it/s]
sz:  2000 steps/s:  110.75 action_loss:  -9.772 entropy_loss:  -0.385 critic_loss:  871.258 reward: 182.00:  50%|█████     | 20/40 [00:19<00:18,  1.06it/s]
sz:  2000 steps/s:  110.75 action_loss:  -9.772 entropy_loss:  -0.385 critic_loss:  871.258 reward: 182.00:  52%|█████▎    | 21/40 [00:19<00:18,  1.05it/s]
sz:  2000 steps/s:  121.57 action_loss: -11.975 entropy_loss:  -0.382 critic_loss:  808.180 reward: 261.00:  52%|█████▎    | 21/40 [00:19<00:18,  1.05it/s]
sz:  2000 steps/s:  121.57 action_loss: -11.975 entropy_loss:  -0.382 critic_loss:  808.180 reward: 261.00:  55%|█████▌    | 22/40 [00:19<00:14,  1.25it/s]
sz:  2000 steps/s:  133.60 action_loss: -10.460 entropy_loss:  -0.392 critic_loss:  741.236 reward: 296.00:  55%|█████▌    | 22/40 [00:20<00:14,  1.25it/s]
sz:  2000 steps/s:  133.60 action_loss: -10.460 entropy_loss:  -0.392 critic_loss:  741.236 reward: 296.00:  57%|█████▊    | 23/40 [00:20<00:11,  1.44it/s]
sz:  2000 steps/s:  153.39 action_loss:  -9.818 entropy_loss:  -0.397 critic_loss:  624.722 reward: 500.00:  57%|█████▊    | 23/40 [00:20<00:11,  1.44it/s]
sz:  2000 steps/s:  153.39 action_loss:  -9.818 entropy_loss:  -0.397 critic_loss:  624.722 reward: 500.00:  60%|██████    | 24/40 [00:20<00:10,  1.46it/s]
sz:  2000 steps/s:  171.97 action_loss:  -8.829 entropy_loss:  -0.434 critic_loss:  686.303 reward: 500.00:  60%|██████    | 24/40 [00:21<00:10,  1.46it/s]
sz:  2000 steps/s:  171.97 action_loss:  -8.829 entropy_loss:  -0.434 critic_loss:  686.303 reward: 500.00:  62%|██████▎   | 25/40 [00:21<00:10,  1.48it/s]
sz:  2000 steps/s:  189.42 action_loss: -12.795 entropy_loss:  -0.473 critic_loss:  664.423 reward: 500.00:  62%|██████▎   | 25/40 [00:22<00:10,  1.48it/s]
sz:  2000 steps/s:  189.42 action_loss: -12.795 entropy_loss:  -0.473 critic_loss:  664.423 reward: 500.00:  65%|██████▌   | 26/40 [00:22<00:09,  1.49it/s]
sz:  2000 steps/s:  205.86 action_loss:  -7.701 entropy_loss:  -0.469 critic_loss:  597.681 reward: 500.00:  65%|██████▌   | 26/40 [00:22<00:09,  1.49it/s]
sz:  2000 steps/s:  205.86 action_loss:  -7.701 entropy_loss:  -0.469 critic_loss:  597.681 reward: 500.00:  68%|██████▊   | 27/40 [00:22<00:08,  1.49it/s]
sz:  2000 steps/s:  221.37 action_loss:  -2.992 entropy_loss:  -0.482 critic_loss:  482.757 reward: 500.00:  68%|██████▊   | 27/40 [00:23<00:08,  1.49it/s]
sz:  2000 steps/s:  221.37 action_loss:  -2.992 entropy_loss:  -0.482 critic_loss:  482.757 reward: 500.00:  70%|███████   | 28/40 [00:23<00:08,  1.50it/s]
sz:  2000 steps/s:  236.03 action_loss:   1.380 entropy_loss:  -0.453 critic_loss:  538.772 reward: 500.00:  70%|███████   | 28/40 [00:24<00:08,  1.50it/s]
sz:  2000 steps/s:  236.03 action_loss:   1.380 entropy_loss:  -0.453 critic_loss:  538.772 reward: 500.00:  72%|███████▎  | 29/40 [00:24<00:07,  1.50it/s]
sz:  2000 steps/s:  249.51 action_loss:   3.675 entropy_loss:  -0.430 critic_loss:  589.797 reward: 500.00:  72%|███████▎  | 29/40 [00:24<00:07,  1.50it/s]
sz:  2000 steps/s:  249.51 action_loss:   3.675 entropy_loss:  -0.430 critic_loss:  589.797 reward: 500.00:  75%|███████▌  | 30/40 [00:24<00:06,  1.48it/s]
sz:  2000 steps/s:  262.02 action_loss:   1.766 entropy_loss:  -0.456 critic_loss:  630.521 reward: 500.00:  75%|███████▌  | 30/40 [00:25<00:06,  1.48it/s]
sz:  2000 steps/s:  262.02 action_loss:   1.766 entropy_loss:  -0.456 critic_loss:  630.521 reward: 500.00:  78%|███████▊  | 31/40 [00:25<00:06,  1.45it/s]
sz:  2000 steps/s:  274.49 action_loss:   2.491 entropy_loss:  -0.429 critic_loss:  615.586 reward: 500.00:  78%|███████▊  | 31/40 [00:26<00:06,  1.45it/s]
sz:  2000 steps/s:  274.49 action_loss:   2.491 entropy_loss:  -0.429 critic_loss:  615.586 reward: 500.00:  80%|████████  | 32/40 [00:26<00:05,  1.47it/s]
sz:  2000 steps/s:  286.35 action_loss:  -0.218 entropy_loss:  -0.476 critic_loss:  532.887 reward: 500.00:  80%|████████  | 32/40 [00:26<00:05,  1.47it/s]
sz:  2000 steps/s:  286.35 action_loss:  -0.218 entropy_loss:  -0.476 critic_loss:  532.887 reward: 500.00:  82%|████████▎ | 33/40 [00:26<00:04,  1.48it/s]
sz:  2000 steps/s:  297.63 action_loss:  -0.294 entropy_loss:  -0.486 critic_loss:  535.985 reward: 500.00:  82%|████████▎ | 33/40 [00:27<00:04,  1.48it/s]
sz:  2000 steps/s:  297.63 action_loss:  -0.294 entropy_loss:  -0.486 critic_loss:  535.985 reward: 500.00:  85%|████████▌ | 34/40 [00:27<00:04,  1.49it/s]
sz:  2000 steps/s:  308.38 action_loss:  -0.592 entropy_loss:  -0.502 critic_loss:  491.457 reward: 500.00:  85%|████████▌ | 34/40 [00:28<00:04,  1.49it/s]
sz:  2000 steps/s:  308.38 action_loss:  -0.592 entropy_loss:  -0.502 critic_loss:  491.457 reward: 500.00:  88%|████████▊ | 35/40 [00:28<00:03,  1.49it/s]
sz:  2000 steps/s:  318.63 action_loss:   2.020 entropy_loss:  -0.532 critic_loss:  754.644 reward: 500.00:  88%|████████▊ | 35/40 [00:28<00:03,  1.49it/s]
sz:  2000 steps/s:  318.63 action_loss:   2.020 entropy_loss:  -0.532 critic_loss:  754.644 reward: 500.00:  90%|█████████ | 36/40 [00:28<00:02,  1.50it/s]
sz:  2000 steps/s:  328.42 action_loss:  -0.480 entropy_loss:  -0.542 critic_loss:  522.170 reward: 500.00:  90%|█████████ | 36/40 [00:29<00:02,  1.50it/s]
sz:  2000 steps/s:  328.42 action_loss:  -0.480 entropy_loss:  -0.542 critic_loss:  522.170 reward: 500.00:  92%|█████████▎| 37/40 [00:29<00:01,  1.50it/s]
sz:  2000 steps/s:  337.78 action_loss:  -1.788 entropy_loss:  -0.560 critic_loss:  586.351 reward: 500.00:  92%|█████████▎| 37/40 [00:30<00:01,  1.50it/s]
sz:  2000 steps/s:  337.78 action_loss:  -1.788 entropy_loss:  -0.560 critic_loss:  586.351 reward: 500.00:  95%|█████████▌| 38/40 [00:30<00:01,  1.49it/s]
sz:  2000 steps/s:  346.58 action_loss:   1.829 entropy_loss:  -0.512 critic_loss:  649.962 reward: 500.00:  95%|█████████▌| 38/40 [00:30<00:01,  1.49it/s]
sz:  2000 steps/s:  346.58 action_loss:   1.829 entropy_loss:  -0.512 critic_loss:  649.962 reward: 500.00:  98%|█████████▊| 39/40 [00:30<00:00,  1.50it/s]
sz:  2000 steps/s:  355.17 action_loss:  -1.843 entropy_loss:  -0.548 critic_loss:  540.821 reward: 500.00:  98%|█████████▊| 39/40 [00:31<00:00,  1.50it/s]
sz:  2000 steps/s:  355.17 action_loss:  -1.843 entropy_loss:  -0.548 critic_loss:  540.821 reward: 500.00: 100%|██████████| 40/40 [00:31<00:00,  1.50it/s]
sz:  2000 steps/s:  355.17 action_loss:  -1.843 entropy_loss:  -0.548 critic_loss:  540.821 reward: 500.00: 100%|██████████| 40/40 [00:31<00:00,  1.27it/s]
test reward: 500.0
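
Each status line in the output above appears twice, presumably because the progress bar is redrawn once when the status text changes and again when the step counter advances. The following is a minimal sketch of how the log could be reduced to one record per training step, assuming the line format shown above. The helper names (status_of, unique_statuses) are hypothetical and are not part of beautiful_cartpole.py; only the first four log lines are embedded as sample data.

```python
import re

# Sample data: the first four status lines copied from the output above.
LOG = """\
sz:    16 steps/s:    4.53 action_loss:  -8.405 entropy_loss:  -0.677 critic_loss:   83.913 reward:  16.00:   0%|          | 0/40 [00:03<?, ?it/s]
sz:    16 steps/s:    4.53 action_loss:  -8.405 entropy_loss:  -0.677 critic_loss:   83.913 reward:  16.00:   2%|▎         | 1/40 [00:03<02:18,  3.54s/it]
sz:    42 steps/s:    9.89 action_loss:  -9.755 entropy_loss:  -0.656 critic_loss:  130.257 reward:  26.00:   2%|▎         | 1/40 [00:04<02:18,  3.54s/it]
sz:    42 steps/s:    9.89 action_loss:  -9.755 entropy_loss:  -0.656 critic_loss:  130.257 reward:  26.00:   5%|▌         | 2/40 [00:04<01:11,  1.88s/it]
"""

# A "name: number" pair such as "steps/s:    4.53" or "reward:  16.00".
STATUS = re.compile(r"([A-Za-z_/]+):\s+(-?\d+(?:\.\d+)?)")


def status_of(line):
    """Return the status fields of one log line as a tuple of (name, value)."""
    # Everything before "%|" is the status text; the rest is the progress bar.
    prefix = line.split("%|")[0]
    return tuple((name, float(value)) for name, value in STATUS.findall(prefix))


def unique_statuses(log):
    """Collapse consecutive lines that carry the same status text."""
    kept = []
    for line in log.splitlines():
        status = status_of(line)
        # Skip bar-only lines and repeats of the previous status.
        if status and (not kept or kept[-1] != status):
            kept.append(status)
    return [dict(status) for status in kept]


rows = unique_statuses(LOG)
rewards = [row["reward"] for row in rows]
print(rewards)
```

With the four sample lines this yields two records, with rewards 16.0 and 26.0. Run over the full log, the same approach would give one reward per training step, which makes it easier to see where the reward reaches 500.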

beautiful_mnist.py

  0%|          | 0/70 [00:00<?, ?it/s]
loss:   2.85 test_accuracy:   nan%:   0%|          | 0/70 [00:05<?, ?it/s]
loss:   2.85 test_accuracy:   nan%:   1%|▏         | 1/70 [00:05<06:52,  5.98s/it]
loss:   1.75 test_accuracy:   nan%:   1%|▏         | 1/70 [00:07<06:52,  5.98s/it]
loss:   1.75 test_accuracy:   nan%:   3%|▎         | 2/70 [00:07<03:51,  3.41s/it]
loss:   1.33 test_accuracy:   nan%:   3%|▎         | 2/70 [00:07<03:51,  3.41s/it]
loss:   1.00 test_accuracy:   nan%:   3%|▎         | 2/70 [00:07<03:51,  3.41s/it]
loss:   0.83 test_accuracy:   nan%:   3%|▎         | 2/70 [00:07<03:51,  3.41s/it]
loss:   0.68 test_accuracy:   nan%:   3%|▎         | 2/70 [00:07<03:51,  3.41s/it]
loss:   0.68 test_accuracy:   nan%:   9%|▊         | 6/70 [00:07<00:51,  1.24it/s]
loss:   0.55 test_accuracy:   nan%:   9%|▊         | 6/70 [00:07<00:51,  1.24it/s]
loss:   0.50 test_accuracy:   nan%:   9%|▊         | 6/70 [00:07<00:51,  1.24it/s]
loss:   0.44 test_accuracy:   nan%:   9%|▊         | 6/70 [00:07<00:51,  1.24it/s]
loss:   0.41 test_accuracy: 85.98%:   9%|▊         | 6/70 [00:08<00:51,  1.24it/s]
loss:   0.41 test_accuracy: 85.98%:  14%|█▍        | 10/70 [00:08<00:28,  2.09it/s]
loss:   0.38 test_accuracy: 85.98%:  14%|█▍        | 10/70 [00:08<00:28,  2.09it/s]
loss:   0.33 test_accuracy: 85.98%:  14%|█▍        | 10/70 [00:08<00:28,  2.09it/s]
loss:   0.30 test_accuracy: 85.98%:  14%|█▍        | 10/70 [00:08<00:28,  2.09it/s]
loss:   0.27 test_accuracy: 85.98%:  14%|█▍        | 10/70 [00:08<00:28,  2.09it/s]
loss:   0.27 test_accuracy: 85.98%:  20%|██        | 14/70 [00:08<00:16,  3.49it/s]
loss:   0.25 test_accuracy: 85.98%:  20%|██        | 14/70 [00:08<00:16,  3.49it/s]
loss:   0.26 test_accuracy: 85.98%:  20%|██        | 14/70 [00:08<00:16,  3.49it/s]
loss:   0.25 test_accuracy: 85.98%:  20%|██        | 14/70 [00:08<00:16,  3.49it/s]
loss:   0.23 test_accuracy: 85.98%:  20%|██        | 14/70 [00:08<00:16,  3.49it/s]
loss:   0.23 test_accuracy: 85.98%:  26%|██▌       | 18/70 [00:08<00:09,  5.30it/s]
loss:   0.19 test_accuracy: 85.98%:  26%|██▌       | 18/70 [00:08<00:09,  5.30it/s]
loss:   0.19 test_accuracy: 94.39%:  26%|██▌       | 18/70 [00:08<00:09,  5.30it/s]
loss:   0.20 test_accuracy: 94.39%:  26%|██▌       | 18/70 [00:08<00:09,  5.30it/s]
loss:   0.20 test_accuracy: 94.39%:  30%|███       | 21/70 [00:08<00:07,  6.93it/s]
loss:   0.19 test_accuracy: 94.39%:  30%|███       | 21/70 [00:08<00:07,  6.93it/s]
loss:   0.14 test_accuracy: 94.39%:  30%|███       | 21/70 [00:08<00:07,  6.93it/s]
loss:   0.15 test_accuracy: 94.39%:  30%|███       | 21/70 [00:08<00:07,  6.93it/s]
loss:   0.14 test_accuracy: 94.39%:  30%|███       | 21/70 [00:08<00:07,  6.93it/s]
loss:   0.14 test_accuracy: 94.39%:  36%|███▌      | 25/70 [00:08<00:04,  9.69it/s]
loss:   0.14 test_accuracy: 94.39%:  36%|███▌      | 25/70 [00:08<00:04,  9.69it/s]
loss:   0.15 test_accuracy: 94.39%:  36%|███▌      | 25/70 [00:08<00:04,  9.69it/s]
loss:   0.14 test_accuracy: 94.39%:  36%|███▌      | 25/70 [00:08<00:04,  9.69it/s]
loss:   0.11 test_accuracy: 94.39%:  36%|███▌      | 25/70 [00:09<00:04,  9.69it/s]
loss:   0.11 test_accuracy: 94.39%:  41%|████▏     | 29/70 [00:09<00:03, 12.82it/s]
loss:   0.12 test_accuracy: 96.40%:  41%|████▏     | 29/70 [00:09<00:03, 12.82it/s]
loss:   0.15 test_accuracy: 96.40%:  41%|████▏     | 29/70 [00:09<00:03, 12.82it/s]
loss:   0.11 test_accuracy: 96.40%:  41%|████▏     | 29/70 [00:09<00:03, 12.82it/s]
loss:   0.10 test_accuracy: 96.40%:  41%|████▏     | 29/70 [00:09<00:03, 12.82it/s]
loss:   0.10 test_accuracy: 96.40%:  47%|████▋     | 33/70 [00:09<00:02, 15.72it/s]
loss:   0.08 test_accuracy: 96.40%:  47%|████▋     | 33/70 [00:09<00:02, 15.72it/s]
loss:   0.08 test_accuracy: 96.40%:  47%|████▋     | 33/70 [00:09<00:02, 15.72it/s]
loss:   0.11 test_accuracy: 96.40%:  47%|████▋     | 33/70 [00:09<00:02, 15.72it/s]
loss:   0.11 test_accuracy: 96.40%:  47%|████▋     | 33/70 [00:09<00:02, 15.72it/s]
loss:   0.11 test_accuracy: 96.40%:  53%|█████▎    | 37/70 [00:09<00:01, 19.05it/s]
loss:   0.09 test_accuracy: 96.40%:  53%|█████▎    | 37/70 [00:09<00:01, 19.05it/s]
loss:   0.10 test_accuracy: 96.40%:  53%|█████▎    | 37/70 [00:09<00:01, 19.05it/s]
loss:   0.09 test_accuracy: 97.22%:  53%|█████▎    | 37/70 [00:09<00:01, 19.05it/s]
loss:   0.07 test_accuracy: 97.22%:  53%|█████▎    | 37/70 [00:09<00:01, 19.05it/s]
loss:   0.07 test_accuracy: 97.22%:  59%|█████▊    | 41/70 [00:09<00:01, 21.47it/s]
loss:   0.10 test_accuracy: 97.22%:  59%|█████▊    | 41/70 [00:09<00:01, 21.47it/s]
loss:   0.09 test_accuracy: 97.22%:  59%|█████▊    | 41/70 [00:09<00:01, 21.47it/s]
loss:   0.08 test_accuracy: 97.22%:  59%|█████▊    | 41/70 [00:09<00:01, 21.47it/s]
loss:   0.09 test_accuracy: 97.22%:  59%|█████▊    | 41/70 [00:09<00:01, 21.47it/s]
loss:   0.09 test_accuracy: 97.22%:  64%|██████▍   | 45/70 [00:09<00:01, 24.40it/s]
loss:   0.10 test_accuracy: 97.22%:  64%|██████▍   | 45/70 [00:09<00:01, 24.40it/s]
loss:   0.09 test_accuracy: 97.22%:  64%|██████▍   | 45/70 [00:09<00:01, 24.40it/s]
loss:   0.07 test_accuracy: 97.22%:  64%|██████▍   | 45/70 [00:09<00:01, 24.40it/s]
loss:   0.09 test_accuracy: 97.22%:  64%|██████▍   | 45/70 [00:09<00:01, 24.40it/s]
loss:   0.09 test_accuracy: 97.22%:  70%|███████   | 49/70 [00:09<00:00, 26.90it/s]
loss:   0.10 test_accuracy: 97.70%:  70%|███████   | 49/70 [00:09<00:00, 26.90it/s]
loss:   0.06 test_accuracy: 97.70%:  70%|███████   | 49/70 [00:09<00:00, 26.90it/s]
loss:   0.06 test_accuracy: 97.70%:  70%|███████   | 49/70 [00:09<00:00, 26.90it/s]
loss:   0.10 test_accuracy: 97.70%:  70%|███████   | 49/70 [00:09<00:00, 26.90it/s]
loss:   0.10 test_accuracy: 97.70%:  76%|███████▌  | 53/70 [00:09<00:00, 27.73it/s]
loss:   0.08 test_accuracy: 97.70%:  76%|███████▌  | 53/70 [00:09<00:00, 27.73it/s]
loss:   0.08 test_accuracy: 97.70%:  76%|███████▌  | 53/70 [00:09<00:00, 27.73it/s]
loss:   0.06 test_accuracy: 97.70%:  76%|███████▌  | 53/70 [00:09<00:00, 27.73it/s]
loss:   0.06 test_accuracy: 97.70%:  76%|███████▌  | 53/70 [00:09<00:00, 27.73it/s]
loss:   0.06 test_accuracy: 97.70%:  81%|████████▏ | 57/70 [00:09<00:00, 29.62it/s]
loss:   0.08 test_accuracy: 97.70%:  81%|████████▏ | 57/70 [00:09<00:00, 29.62it/s]
loss:   0.08 test_accuracy: 97.70%:  81%|████████▏ | 57/70 [00:09<00:00, 29.62it/s]
loss:   0.07 test_accuracy: 98.14%:  81%|████████▏ | 57/70 [00:09<00:00, 29.62it/s]
loss:   0.05 test_accuracy: 98.14%:  81%|████████▏ | 57/70 [00:09<00:00, 29.62it/s]
loss:   0.05 test_accuracy: 98.14%:  87%|████████▋ | 61/70 [00:09<00:00, 29.67it/s]
loss:   0.07 test_accuracy: 98.14%:  87%|████████▋ | 61/70 [00:10<00:00, 29.67it/s]
loss:   0.06 test_accuracy: 98.14%:  87%|████████▋ | 61/70 [00:10<00:00, 29.67it/s]
loss:   0.09 test_accuracy: 98.14%:  87%|████████▋ | 61/70 [00:10<00:00, 29.67it/s]
loss:   0.07 test_accuracy: 98.14%:  87%|████████▋ | 61/70 [00:10<00:00, 29.67it/s]
loss:   0.07 test_accuracy: 98.14%:  93%|█████████▎| 65/70 [00:10<00:00, 31.13it/s]
loss:   0.06 test_accuracy: 98.14%:  93%|█████████▎| 65/70 [00:10<00:00, 31.13it/s]
loss:   0.06 test_accuracy: 98.14%:  93%|█████████▎| 65/70 [00:10<00:00, 31.13it/s]
loss:   0.06 test_accuracy: 98.14%:  93%|█████████▎| 65/70 [00:10<00:00, 31.13it/s]
loss:   0.08 test_accuracy: 98.14%:  93%|█████████▎| 65/70 [00:10<00:00, 31.13it/s]
loss:   0.08 test_accuracy: 98.14%:  99%|█████████▊| 69/70 [00:10<00:00, 32.21it/s]
loss:   0.06 test_accuracy: 98.42%:  99%|█████████▊| 69/70 [00:10<00:00, 32.21it/s]
loss:   0.06 test_accuracy: 98.42%: 100%|██████████| 70/70 [00:10<00:00,  6.81it/s]

benchmark_train_efficientnet.py

NUM:2 BS:8 CNT:10

  0%|          | 0/10 [00:00<?, ?it/s]
 10%|█         | 1/10 [00:09<01:26,  9.67s/it]
 20%|██        | 2/10 [00:09<00:32,  4.06s/it]
 30%|███       | 3/10 [00:09<00:15,  2.28s/it]
 40%|████      | 4/10 [00:10<00:08,  1.43s/it]
 50%|█████     | 5/10 [00:10<00:04,  1.04it/s]
 60%|██████    | 6/10 [00:10<00:02,  1.47it/s]
 70%|███████   | 7/10 [00:10<00:01,  1.94it/s]
 80%|████████  | 8/10 [00:10<00:00,  2.54it/s]
 90%|█████████ | 9/10 [00:10<00:00,  3.20it/s]
100%|██████████| 10/10 [00:10<00:00,  3.89it/s]
100%|██████████| 10/10 [00:10<00:00,  1.09s/it]
 175.27 ms cpy,  9470.44 ms run,   62.31 ms build, 9347.02 ms realize,   61.11 ms CL,    0.06 loss,  421 tensors, 0.04 GB used,      1.22 GFLOPS
  12.54 ms cpy,   103.18 ms run,   55.31 ms build,   45.06 ms realize,    2.82 ms CL,   -0.02 loss,  421 tensors, 0.04 GB used,    111.65 GFLOPS
  11.01 ms cpy,   142.94 ms run,   53.91 ms build,   86.17 ms realize,    2.85 ms CL,    0.07 loss,  421 tensors, 0.04 GB used,     80.60 GFLOPS
  11.05 ms cpy,   102.45 ms run,   53.68 ms build,   45.98 ms realize,    2.79 ms CL,    0.03 loss,  421 tensors, 0.04 GB used,    112.45 GFLOPS
  11.07 ms cpy,   102.35 ms run,   53.75 ms build,   45.86 ms realize,    2.74 ms CL,    0.07 loss,  421 tensors, 0.04 GB used,    112.56 GFLOPS
  11.14 ms cpy,   101.95 ms run,   53.89 ms build,   45.27 ms realize,    2.78 ms CL,   -0.00 loss,  421 tensors, 0.04 GB used,    113.01 GFLOPS
  11.14 ms cpy,   143.39 ms run,   54.09 ms build,   86.44 ms realize,    2.86 ms CL,    0.03 loss,  421 tensors, 0.04 GB used,     80.34 GFLOPS
  11.97 ms cpy,   103.14 ms run,   54.24 ms build,   46.12 ms realize,    2.78 ms CL,   -0.04 loss,  421 tensors, 0.04 GB used,    111.70 GFLOPS
  11.29 ms cpy,   102.81 ms run,   54.46 ms build,   45.58 ms realize,    2.77 ms CL,    0.04 loss,  421 tensors, 0.04 GB used,    112.06 GFLOPS
  11.15 ms cpy,   103.26 ms run,   54.59 ms build,   45.89 ms realize,    2.77 ms CL,   -0.05 loss,  421 tensors, 0.04 GB used,    111.57 GFLOPS
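
The first timing line above stands out: its run time (9470.44 ms) is far higher than in the later iterations, presumably because it includes one-time setup work. The following is a minimal sketch of how the remaining iterations could be averaged, assuming the line format shown above. The helper names (parse_line, summarize) and the choice to skip exactly one iteration are illustrative assumptions, not part of benchmark_train_efficientnet.py; only the first three lines are embedded as sample data.

```python
import re
from statistics import mean

# Sample data: the first three timing lines copied from the output above.
LOG = """\
 175.27 ms cpy,  9470.44 ms run,   62.31 ms build, 9347.02 ms realize,   61.11 ms CL,    0.06 loss,  421 tensors, 0.04 GB used,      1.22 GFLOPS
  12.54 ms cpy,   103.18 ms run,   55.31 ms build,   45.06 ms realize,    2.82 ms CL,   -0.02 loss,  421 tensors, 0.04 GB used,    111.65 GFLOPS
  11.01 ms cpy,   142.94 ms run,   53.91 ms build,   86.17 ms realize,    2.85 ms CL,    0.07 loss,  421 tensors, 0.04 GB used,     80.60 GFLOPS
"""

# A "number label" pair such as "103.18 ms run" or "111.65 GFLOPS".
FIELD = re.compile(
    r"(-?\d+(?:\.\d+)?)\s+"
    r"(ms cpy|ms run|ms build|ms realize|ms CL|loss|tensors|GB used|GFLOPS)"
)


def parse_line(line):
    """Map each label on one timing line to its numeric value."""
    return {label: float(value) for value, label in FIELD.findall(line)}


def summarize(log, skip=1):
    """Average run time and GFLOPS, ignoring the first `skip` iterations."""
    rows = [parse_line(line) for line in log.splitlines() if "GFLOPS" in line]
    steady = rows[skip:]
    return {
        "iterations": len(steady),
        "mean_run_ms": mean(row["ms run"] for row in steady),
        "mean_gflops": mean(row["GFLOPS"] for row in steady),
    }


summary = summarize(LOG)
print(summary)
```

With the three sample lines, the summary covers two iterations (a mean run time of about 123.06 ms and about 96.1 GFLOPS). Pasting all ten lines into LOG would give a more representative figure, though as noted at the top, these timings were taken on a loaded machine.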

coder.py

create model: 155.25 ms
download weights:  24.86 ms

  0%|          | 0/292 [00:00<?, ?it/s]
ram used:  0.00 GB, layers.0.attention.wq.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  0.00 GB, layers.0.attention.wq.weight                      :   0%|          | 1/292 [00:00<01:35,  3.04it/s]
ram used:  0.03 GB, layers.0.attention.wk.weight                      :   0%|          | 1/292 [00:00<01:35,  3.04it/s]
ram used:  0.04 GB, layers.0.attention.wv.weight                      :   0%|          | 1/292 [00:00<01:35,  3.04it/s]
ram used:  0.05 GB, layers.0.attention.wo.weight                      :   0%|          | 1/292 [00:00<01:35,  3.04it/s]
ram used:  0.05 GB, layers.0.attention.wo.weight                      :   1%|▏         | 4/292 [00:00<00:28, 10.07it/s]
ram used:  0.08 GB, layers.0.feed_forward.w1.weight                   :   1%|▏         | 4/292 [00:00<00:28, 10.07it/s]
ram used:  0.20 GB, layers.0.feed_forward.w2.weight                   :   1%|▏         | 4/292 [00:00<00:28, 10.07it/s]
ram used:  0.20 GB, layers.0.feed_forward.w2.weight                   :   2%|▏         | 6/292 [00:00<00:26, 10.89it/s]
ram used:  0.32 GB, layers.0.feed_forward.w3.weight                   :   2%|▏         | 6/292 [00:00<00:26, 10.89it/s]
ram used:  0.44 GB, layers.0.attention_norm.weight                    :   2%|▏         | 6/292 [00:00<00:26, 10.89it/s]
ram used:  0.44 GB, layers.0.ffn_norm.weight                          :   2%|▏         | 6/292 [00:00<00:26, 10.89it/s]
ram used:  0.44 GB, layers.1.attention.wq.weight                      :   2%|▏         | 6/292 [00:00<00:26, 10.89it/s]
ram used:  0.47 GB, layers.1.attention.wk.weight                      :   2%|▏         | 6/292 [00:00<00:26, 10.89it/s]
ram used:  0.48 GB, layers.1.attention.wv.weight                      :   2%|▏         | 6/292 [00:00<00:26, 10.89it/s]
ram used:  0.48 GB, layers.1.attention.wv.weight                      :   4%|▍         | 12/292 [00:00<00:12, 23.09it/s]
ram used:  0.49 GB, layers.1.attention.wo.weight                      :   4%|▍         | 12/292 [00:00<00:12, 23.09it/s]
ram used:  0.52 GB, layers.1.feed_forward.w1.weight                   :   4%|▍         | 12/292 [00:00<00:12, 23.09it/s]
ram used:  0.64 GB, layers.1.feed_forward.w2.weight                   :   4%|▍         | 12/292 [00:00<00:12, 23.09it/s]
ram used:  0.75 GB, layers.1.feed_forward.w3.weight                   :   4%|▍         | 12/292 [00:00<00:12, 23.09it/s]
ram used:  0.75 GB, layers.1.feed_forward.w3.weight                   :   5%|▌         | 16/292 [00:00<00:11, 24.79it/s]
ram used:  0.87 GB, layers.1.attention_norm.weight                    :   5%|▌         | 16/292 [00:00<00:11, 24.79it/s]
ram used:  0.87 GB, layers.1.ffn_norm.weight                          :   5%|▌         | 16/292 [00:00<00:11, 24.79it/s]
ram used:  0.87 GB, layers.2.attention.wq.weight                      :   5%|▌         | 16/292 [00:00<00:11, 24.79it/s]
ram used:  0.91 GB, layers.2.attention.wk.weight                      :   5%|▌         | 16/292 [00:00<00:11, 24.79it/s]
ram used:  0.91 GB, layers.2.attention.wv.weight                      :   5%|▌         | 16/292 [00:00<00:11, 24.79it/s]
ram used:  0.92 GB, layers.2.attention.wo.weight                      :   5%|▌         | 16/292 [00:00<00:11, 24.79it/s]
ram used:  0.96 GB, layers.2.feed_forward.w1.weight                   :   5%|▌         | 16/292 [00:00<00:11, 24.79it/s]
ram used:  1.07 GB, layers.2.feed_forward.w2.weight                   :   5%|▌         | 16/292 [00:00<00:11, 24.79it/s]
ram used:  1.07 GB, layers.2.feed_forward.w2.weight                   :   8%|▊         | 24/292 [00:01<00:07, 36.49it/s]
ram used:  1.19 GB, layers.2.feed_forward.w3.weight                   :   8%|▊         | 24/292 [00:01<00:07, 36.49it/s]
ram used:  1.31 GB, layers.2.attention_norm.weight                    :   8%|▊         | 24/292 [00:01<00:07, 36.49it/s]
ram used:  1.31 GB, layers.2.ffn_norm.weight                          :   8%|▊         | 24/292 [00:01<00:07, 36.49it/s]
ram used:  1.31 GB, layers.3.attention.wq.weight                      :   8%|▊         | 24/292 [00:01<00:07, 36.49it/s]
ram used:  1.34 GB, layers.3.attention.wk.weight                      :   8%|▊         | 24/292 [00:01<00:07, 36.49it/s]
ram used:  1.35 GB, layers.3.attention.wv.weight                      :   8%|▊         | 24/292 [00:01<00:07, 36.49it/s]
ram used:  1.36 GB, layers.3.attention.wo.weight                      :   8%|▊         | 24/292 [00:01<00:07, 36.49it/s]
ram used:  1.39 GB, layers.3.feed_forward.w1.weight                   :   8%|▊         | 24/292 [00:01<00:07, 36.49it/s]
ram used:  1.39 GB, layers.3.feed_forward.w1.weight                   :  11%|█         | 32/292 [00:01<00:05, 44.77it/s]
ram used:  1.51 GB, layers.3.feed_forward.w2.weight                   :  11%|█         | 32/292 [00:01<00:05, 44.77it/s]
ram used:  1.63 GB, layers.3.feed_forward.w3.weight                   :  11%|█         | 32/292 [00:01<00:05, 44.77it/s]
ram used:  1.74 GB, layers.3.attention_norm.weight                    :  11%|█         | 32/292 [00:01<00:05, 44.77it/s]
ram used:  1.74 GB, layers.3.ffn_norm.weight                          :  11%|█         | 32/292 [00:01<00:05, 44.77it/s]
ram used:  1.74 GB, layers.4.attention.wq.weight                      :  11%|█         | 32/292 [00:01<00:05, 44.77it/s]
ram used:  1.74 GB, layers.4.attention.wq.weight                      :  13%|█▎        | 37/292 [00:01<00:05, 45.67it/s]
ram used:  1.78 GB, layers.4.attention.wk.weight                      :  13%|█▎        | 37/292 [00:01<00:05, 45.67it/s]
ram used:  1.79 GB, layers.4.attention.wv.weight                      :  13%|█▎        | 37/292 [00:01<00:05, 45.67it/s]
ram used:  1.80 GB, layers.4.attention.wo.weight                      :  13%|█▎        | 37/292 [00:01<00:05, 45.67it/s]
ram used:  1.83 GB, layers.4.feed_forward.w1.weight                   :  13%|█▎        | 37/292 [00:01<00:05, 45.67it/s]
ram used:  1.95 GB, layers.4.feed_forward.w2.weight                   :  13%|█▎        | 37/292 [00:01<00:05, 45.67it/s]
ram used:  1.95 GB, layers.4.feed_forward.w2.weight                   :  14%|█▍        | 42/292 [00:01<00:05, 46.03it/s]
ram used:  2.06 GB, layers.4.feed_forward.w3.weight                   :  14%|█▍        | 42/292 [00:01<00:05, 46.03it/s]
ram used:  2.18 GB, layers.4.attention_norm.weight                    :  14%|█▍        | 42/292 [00:01<00:05, 46.03it/s]
ram used:  2.18 GB, layers.4.ffn_norm.weight                          :  14%|█▍        | 42/292 [00:01<00:05, 46.03it/s]
ram used:  2.18 GB, layers.5.attention.wq.weight                      :  14%|█▍        | 42/292 [00:01<00:05, 46.03it/s]
ram used:  2.21 GB, layers.5.attention.wk.weight                      :  14%|█▍        | 42/292 [00:01<00:05, 46.03it/s]
ram used:  2.22 GB, layers.5.attention.wv.weight                      :  14%|█▍        | 42/292 [00:01<00:05, 46.03it/s]
ram used:  2.23 GB, layers.5.attention.wo.weight                      :  14%|█▍        | 42/292 [00:01<00:05, 46.03it/s]
ram used:  2.27 GB, layers.5.feed_forward.w1.weight                   :  14%|█▍        | 42/292 [00:01<00:05, 46.03it/s]
ram used:  2.27 GB, layers.5.feed_forward.w1.weight                   :  17%|█▋        | 50/292 [00:01<00:04, 51.86it/s]
ram used:  2.38 GB, layers.5.feed_forward.w2.weight                   :  17%|█▋        | 50/292 [00:01<00:04, 51.86it/s]
ram used:  2.50 GB, layers.5.feed_forward.w3.weight                   :  17%|█▋        | 50/292 [00:01<00:04, 51.86it/s]
ram used:  2.62 GB, layers.5.attention_norm.weight                    :  17%|█▋        | 50/292 [00:01<00:04, 51.86it/s]
ram used:  2.62 GB, layers.5.ffn_norm.weight                          :  17%|█▋        | 50/292 [00:01<00:04, 51.86it/s]
ram used:  2.62 GB, layers.6.attention.wq.weight                      :  17%|█▋        | 50/292 [00:01<00:04, 51.86it/s]
ram used:  2.65 GB, layers.6.attention.wk.weight                      :  17%|█▋        | 50/292 [00:01<00:04, 51.86it/s]
ram used:  2.65 GB, layers.6.attention.wk.weight                      :  19%|█▉        | 56/292 [00:01<00:04, 53.00it/s]
ram used:  2.66 GB, layers.6.attention.wv.weight                      :  19%|█▉        | 56/292 [00:01<00:04, 53.00it/s]
ram used:  2.67 GB, layers.6.attention.wo.weight                      :  19%|█▉        | 56/292 [00:01<00:04, 53.00it/s]
ram used:  2.70 GB, layers.6.feed_forward.w1.weight                   :  19%|█▉        | 56/292 [00:01<00:04, 53.00it/s]
ram used:  2.82 GB, layers.6.feed_forward.w2.weight                   :  19%|█▉        | 56/292 [00:01<00:04, 53.00it/s]
ram used:  2.94 GB, layers.6.feed_forward.w3.weight                   :  19%|█▉        | 56/292 [00:01<00:04, 53.00it/s]
ram used:  3.05 GB, layers.6.attention_norm.weight                    :  19%|█▉        | 56/292 [00:01<00:04, 53.00it/s]
ram used:  3.05 GB, layers.6.attention_norm.weight                    :  21%|██        | 62/292 [00:01<00:04, 48.91it/s]
ram used:  3.05 GB, layers.6.ffn_norm.weight                          :  21%|██        | 62/292 [00:01<00:04, 48.91it/s]
ram used:  3.05 GB, layers.7.attention.wq.weight                      :  21%|██        | 62/292 [00:01<00:04, 48.91it/s]
ram used:  3.09 GB, layers.7.attention.wk.weight                      :  21%|██        | 62/292 [00:01<00:04, 48.91it/s]
ram used:  3.10 GB, layers.7.attention.wv.weight                      :  21%|██        | 62/292 [00:01<00:04, 48.91it/s]
ram used:  3.10 GB, layers.7.attention.wo.weight                      :  21%|██        | 62/292 [00:01<00:04, 48.91it/s]
ram used:  3.14 GB, layers.7.feed_forward.w1.weight                   :  21%|██        | 62/292 [00:01<00:04, 48.91it/s]
ram used:  3.25 GB, layers.7.feed_forward.w2.weight                   :  21%|██        | 62/292 [00:01<00:04, 48.91it/s]
ram used:  3.25 GB, layers.7.feed_forward.w2.weight                   :  24%|██▎       | 69/292 [00:01<00:04, 51.20it/s]
ram used:  3.37 GB, layers.7.feed_forward.w3.weight                   :  24%|██▎       | 69/292 [00:01<00:04, 51.20it/s]
ram used:  3.49 GB, layers.7.attention_norm.weight                    :  24%|██▎       | 69/292 [00:01<00:04, 51.20it/s]
ram used:  3.49 GB, layers.7.ffn_norm.weight                          :  24%|██▎       | 69/292 [00:01<00:04, 51.20it/s]
ram used:  3.49 GB, layers.8.attention.wq.weight                      :  24%|██▎       | 69/292 [00:01<00:04, 51.20it/s]
ram used:  3.52 GB, layers.8.attention.wk.weight                      :  24%|██▎       | 69/292 [00:01<00:04, 51.20it/s]
ram used:  3.53 GB, layers.8.attention.wv.weight                      :  24%|██▎       | 69/292 [00:01<00:04, 51.20it/s]
ram used:  3.54 GB, layers.8.attention.wo.weight                      :  24%|██▎       | 69/292 [00:01<00:04, 51.20it/s]
ram used:  3.57 GB, layers.8.feed_forward.w1.weight                   :  24%|██▎       | 69/292 [00:01<00:04, 51.20it/s]
ram used:  3.57 GB, layers.8.feed_forward.w1.weight                   :  26%|██▋       | 77/292 [00:01<00:03, 55.18it/s]
ram used:  3.69 GB, layers.8.feed_forward.w2.weight                   :  26%|██▋       | 77/292 [00:01<00:03, 55.18it/s]
ram used:  3.81 GB, layers.8.feed_forward.w3.weight                   :  26%|██▋       | 77/292 [00:02<00:03, 55.18it/s]
ram used:  3.93 GB, layers.8.attention_norm.weight                    :  26%|██▋       | 77/292 [00:02<00:03, 55.18it/s]
ram used:  3.93 GB, layers.8.ffn_norm.weight                          :  26%|██▋       | 77/292 [00:02<00:03, 55.18it/s]
ram used:  3.93 GB, layers.9.attention.wq.weight                      :  26%|██▋       | 77/292 [00:02<00:03, 55.18it/s]
ram used:  3.96 GB, layers.9.attention.wk.weight                      :  26%|██▋       | 77/292 [00:02<00:03, 55.18it/s]
ram used:  3.96 GB, layers.9.attention.wk.weight                      :  28%|██▊       | 83/292 [00:02<00:03, 55.42it/s]
ram used:  3.97 GB, layers.9.attention.wv.weight                      :  28%|██▊       | 83/292 [00:02<00:03, 55.42it/s]
ram used:  3.98 GB, layers.9.attention.wo.weight                      :  28%|██▊       | 83/292 [00:02<00:03, 55.42it/s]
ram used:  4.01 GB, layers.9.feed_forward.w1.weight                   :  28%|██▊       | 83/292 [00:02<00:03, 55.42it/s]
ram used:  4.13 GB, layers.9.feed_forward.w2.weight                   :  28%|██▊       | 83/292 [00:02<00:03, 55.42it/s]
ram used:  4.24 GB, layers.9.feed_forward.w3.weight                   :  28%|██▊       | 83/292 [00:02<00:03, 55.42it/s]
ram used:  4.36 GB, layers.9.attention_norm.weight                    :  28%|██▊       | 83/292 [00:02<00:03, 55.42it/s]
ram used:  4.36 GB, layers.9.attention_norm.weight                    :  30%|███       | 89/292 [00:02<00:04, 50.54it/s]
ram used:  4.36 GB, layers.9.ffn_norm.weight                          :  30%|███       | 89/292 [00:02<00:04, 50.54it/s]
ram used:  4.36 GB, layers.10.attention.wq.weight                     :  30%|███       | 89/292 [00:02<00:04, 50.54it/s]
ram used:  4.40 GB, layers.10.attention.wk.weight                     :  30%|███       | 89/292 [00:02<00:04, 50.54it/s]
ram used:  4.40 GB, layers.10.attention.wv.weight                     :  30%|███       | 89/292 [00:02<00:04, 50.54it/s]
ram used:  4.41 GB, layers.10.attention.wo.weight                     :  30%|███       | 89/292 [00:02<00:04, 50.54it/s]
ram used:  4.45 GB, layers.10.feed_forward.w1.weight                  :  30%|███       | 89/292 [00:02<00:04, 50.54it/s]
ram used:  4.56 GB, layers.10.feed_forward.w2.weight                  :  30%|███       | 89/292 [00:02<00:04, 50.54it/s]
ram used:  4.56 GB, layers.10.feed_forward.w2.weight                  :  33%|███▎      | 96/292 [00:02<00:03, 52.40it/s]
ram used:  4.68 GB, layers.10.feed_forward.w3.weight                  :  33%|███▎      | 96/292 [00:02<00:03, 52.40it/s]
ram used:  4.80 GB, layers.10.attention_norm.weight                   :  33%|███▎      | 96/292 [00:02<00:03, 52.40it/s]
ram used:  4.80 GB, layers.10.ffn_norm.weight                         :  33%|███▎      | 96/292 [00:02<00:03, 52.40it/s]
ram used:  4.80 GB, layers.11.attention.wq.weight                     :  33%|███▎      | 96/292 [00:02<00:03, 52.40it/s]
ram used:  4.83 GB, layers.11.attention.wk.weight                     :  33%|███▎      | 96/292 [00:02<00:03, 52.40it/s]
ram used:  4.84 GB, layers.11.attention.wv.weight                     :  33%|███▎      | 96/292 [00:02<00:03, 52.40it/s]
ram used:  4.85 GB, layers.11.attention.wo.weight                     :  33%|███▎      | 96/292 [00:02<00:03, 52.40it/s]
ram used:  4.88 GB, layers.11.feed_forward.w1.weight                  :  33%|███▎      | 96/292 [00:02<00:03, 52.40it/s]
ram used:  4.88 GB, layers.11.feed_forward.w1.weight                  :  36%|███▌      | 104/292 [00:02<00:03, 55.97it/s]
ram used:  5.00 GB, layers.11.feed_forward.w2.weight                  :  36%|███▌      | 104/292 [00:02<00:03, 55.97it/s]
ram used:  5.12 GB, layers.11.feed_forward.w3.weight                  :  36%|███▌      | 104/292 [00:02<00:03, 55.97it/s]
ram used:  5.23 GB, layers.11.attention_norm.weight                   :  36%|███▌      | 104/292 [00:02<00:03, 55.97it/s]
ram used:  5.23 GB, layers.11.ffn_norm.weight                         :  36%|███▌      | 104/292 [00:02<00:03, 55.97it/s]
ram used:  5.23 GB, layers.12.attention.wq.weight                     :  36%|███▌      | 104/292 [00:02<00:03, 55.97it/s]
ram used:  5.27 GB, layers.12.attention.wk.weight                     :  36%|███▌      | 104/292 [00:02<00:03, 55.97it/s]
ram used:  5.27 GB, layers.12.attention.wk.weight                     :  38%|███▊      | 110/292 [00:02<00:03, 56.03it/s]
ram used:  5.28 GB, layers.12.attention.wv.weight                     :  38%|███▊      | 110/292 [00:02<00:03, 56.03it/s]
ram used:  5.29 GB, layers.12.attention.wo.weight                     :  38%|███▊      | 110/292 [00:02<00:03, 56.03it/s]
ram used:  5.32 GB, layers.12.feed_forward.w1.weight                  :  38%|███▊      | 110/292 [00:02<00:03, 56.03it/s]
ram used:  5.44 GB, layers.12.feed_forward.w2.weight                  :  38%|███▊      | 110/292 [00:02<00:03, 56.03it/s]
ram used:  5.55 GB, layers.12.feed_forward.w3.weight                  :  38%|███▊      | 110/292 [00:02<00:03, 56.03it/s]
ram used:  5.67 GB, layers.12.attention_norm.weight                   :  38%|███▊      | 110/292 [00:02<00:03, 56.03it/s]
ram used:  5.67 GB, layers.12.attention_norm.weight                   :  40%|███▉      | 116/292 [00:02<00:03, 51.03it/s]
ram used:  5.67 GB, layers.12.ffn_norm.weight                         :  40%|███▉      | 116/292 [00:02<00:03, 51.03it/s]
ram used:  5.67 GB, layers.13.attention.wq.weight                     :  40%|███▉      | 116/292 [00:02<00:03, 51.03it/s]
ram used:  5.70 GB, layers.13.attention.wk.weight                     :  40%|███▉      | 116/292 [00:02<00:03, 51.03it/s]
ram used:  5.71 GB, layers.13.attention.wv.weight                     :  40%|███▉      | 116/292 [00:02<00:03, 51.03it/s]
ram used:  5.72 GB, layers.13.attention.wo.weight                     :  40%|███▉      | 116/292 [00:02<00:03, 51.03it/s]
ram used:  5.75 GB, layers.13.feed_forward.w1.weight                  :  40%|███▉      | 116/292 [00:02<00:03, 51.03it/s]
ram used:  5.87 GB, layers.13.feed_forward.w2.weight                  :  40%|███▉      | 116/292 [00:02<00:03, 51.03it/s]
ram used:  5.87 GB, layers.13.feed_forward.w2.weight                  :  42%|████▏     | 123/292 [00:02<00:03, 52.60it/s]
ram used:  5.99 GB, layers.13.feed_forward.w3.weight                  :  42%|████▏     | 123/292 [00:02<00:03, 52.60it/s]
ram used:  6.11 GB, layers.13.attention_norm.weight                   :  42%|████▏     | 123/292 [00:02<00:03, 52.60it/s]
ram used:  6.11 GB, layers.13.ffn_norm.weight                         :  42%|████▏     | 123/292 [00:02<00:03, 52.60it/s]
ram used:  6.11 GB, layers.14.attention.wq.weight                     :  42%|████▏     | 123/292 [00:02<00:03, 52.60it/s]
ram used:  6.14 GB, layers.14.attention.wk.weight                     :  42%|████▏     | 123/292 [00:02<00:03, 52.60it/s]
ram used:  6.15 GB, layers.14.attention.wv.weight                     :  42%|████▏     | 123/292 [00:02<00:03, 52.60it/s]
ram used:  6.16 GB, layers.14.attention.wo.weight                     :  42%|████▏     | 123/292 [00:02<00:03, 52.60it/s]
ram used:  6.19 GB, layers.14.feed_forward.w1.weight                  :  42%|████▏     | 123/292 [00:02<00:03, 52.60it/s]
ram used:  6.19 GB, layers.14.feed_forward.w1.weight                  :  45%|████▍     | 131/292 [00:02<00:02, 55.36it/s]
ram used:  6.31 GB, layers.14.feed_forward.w2.weight                  :  45%|████▍     | 131/292 [00:02<00:02, 55.36it/s]
ram used:  6.43 GB, layers.14.feed_forward.w3.weight                  :  45%|████▍     | 131/292 [00:03<00:02, 55.36it/s]
ram used:  6.54 GB, layers.14.attention_norm.weight                   :  45%|████▍     | 131/292 [00:03<00:02, 55.36it/s]
ram used:  6.54 GB, layers.14.ffn_norm.weight                         :  45%|████▍     | 131/292 [00:03<00:02, 55.36it/s]
ram used:  6.54 GB, layers.15.attention.wq.weight                     :  45%|████▍     | 131/292 [00:03<00:02, 55.36it/s]
ram used:  6.58 GB, layers.15.attention.wk.weight                     :  45%|████▍     | 131/292 [00:03<00:02, 55.36it/s]
ram used:  6.58 GB, layers.15.attention.wk.weight                     :  47%|████▋     | 137/292 [00:03<00:02, 55.27it/s]
ram used:  6.59 GB, layers.15.attention.wv.weight                     :  47%|████▋     | 137/292 [00:03<00:02, 55.27it/s]
ram used:  6.59 GB, layers.15.attention.wo.weight                     :  47%|████▋     | 137/292 [00:03<00:02, 55.27it/s]
ram used:  6.63 GB, layers.15.feed_forward.w1.weight                  :  47%|████▋     | 137/292 [00:03<00:02, 55.27it/s]
ram used:  6.74 GB, layers.15.feed_forward.w2.weight                  :  47%|████▋     | 137/292 [00:03<00:02, 55.27it/s]
ram used:  6.86 GB, layers.15.feed_forward.w3.weight                  :  47%|████▋     | 137/292 [00:03<00:02, 55.27it/s]
ram used:  6.98 GB, layers.15.attention_norm.weight                   :  47%|████▋     | 137/292 [00:03<00:02, 55.27it/s]
ram used:  6.98 GB, layers.15.attention_norm.weight                   :  49%|████▉     | 143/292 [00:03<00:02, 50.26it/s]
ram used:  6.98 GB, layers.15.ffn_norm.weight                         :  49%|████▉     | 143/292 [00:03<00:02, 50.26it/s]
ram used:  6.98 GB, layers.16.attention.wq.weight                     :  49%|████▉     | 143/292 [00:03<00:02, 50.26it/s]
ram used:  7.01 GB, layers.16.attention.wk.weight                     :  49%|████▉     | 143/292 [00:03<00:02, 50.26it/s]
ram used:  7.02 GB, layers.16.attention.wv.weight                     :  49%|████▉     | 143/292 [00:03<00:02, 50.26it/s]
ram used:  7.03 GB, layers.16.attention.wo.weight                     :  49%|████▉     | 143/292 [00:03<00:02, 50.26it/s]
ram used:  7.06 GB, layers.16.feed_forward.w1.weight                  :  49%|████▉     | 143/292 [00:03<00:02, 50.26it/s]
ram used:  7.18 GB, layers.16.feed_forward.w2.weight                  :  49%|████▉     | 143/292 [00:03<00:02, 50.26it/s]
ram used:  7.18 GB, layers.16.feed_forward.w2.weight                  :  51%|█████▏    | 150/292 [00:03<00:02, 51.02it/s]
ram used:  7.30 GB, layers.16.feed_forward.w3.weight                  :  51%|█████▏    | 150/292 [00:03<00:02, 51.02it/s]
ram used:  7.42 GB, layers.16.attention_norm.weight                   :  51%|█████▏    | 150/292 [00:03<00:02, 51.02it/s]
ram used:  7.42 GB, layers.16.ffn_norm.weight                         :  51%|█████▏    | 150/292 [00:03<00:02, 51.02it/s]
ram used:  7.42 GB, layers.17.attention.wq.weight                     :  51%|█████▏    | 150/292 [00:03<00:02, 51.02it/s]
ram used:  7.45 GB, layers.17.attention.wk.weight                     :  51%|█████▏    | 150/292 [00:03<00:02, 51.02it/s]
ram used:  7.46 GB, layers.17.attention.wv.weight                     :  51%|█████▏    | 150/292 [00:03<00:02, 51.02it/s]
ram used:  7.47 GB, layers.17.attention.wo.weight                     :  51%|█████▏    | 150/292 [00:03<00:02, 51.02it/s]
ram used:  7.50 GB, layers.17.feed_forward.w1.weight                  :  51%|█████▏    | 150/292 [00:03<00:02, 51.02it/s]
ram used:  7.50 GB, layers.17.feed_forward.w1.weight                  :  54%|█████▍    | 158/292 [00:03<00:02, 53.90it/s]
ram used:  7.62 GB, layers.17.feed_forward.w2.weight                  :  54%|█████▍    | 158/292 [00:03<00:02, 53.90it/s]
ram used:  7.73 GB, layers.17.feed_forward.w3.weight                  :  54%|█████▍    | 158/292 [00:03<00:02, 53.90it/s]
ram used:  7.85 GB, layers.17.attention_norm.weight                   :  54%|█████▍    | 158/292 [00:03<00:02, 53.90it/s]
ram used:  7.85 GB, layers.17.ffn_norm.weight                         :  54%|█████▍    | 158/292 [00:03<00:02, 53.90it/s]
ram used:  7.85 GB, layers.18.attention.wq.weight                     :  54%|█████▍    | 158/292 [00:03<00:02, 53.90it/s]
ram used:  7.89 GB, layers.18.attention.wk.weight                     :  54%|█████▍    | 158/292 [00:03<00:02, 53.90it/s]
ram used:  7.89 GB, layers.18.attention.wk.weight                     :  56%|█████▌    | 164/292 [00:03<00:02, 54.35it/s]
ram used:  7.89 GB, layers.18.attention.wv.weight                     :  56%|█████▌    | 164/292 [00:03<00:02, 54.35it/s]
ram used:  7.90 GB, layers.18.attention.wo.weight                     :  56%|█████▌    | 164/292 [00:03<00:02, 54.35it/s]
ram used:  7.94 GB, layers.18.feed_forward.w1.weight                  :  56%|█████▌    | 164/292 [00:03<00:02, 54.35it/s]
ram used:  8.05 GB, layers.18.feed_forward.w2.weight                  :  56%|█████▌    | 164/292 [00:03<00:02, 54.35it/s]
ram used:  8.17 GB, layers.18.feed_forward.w3.weight                  :  56%|█████▌    | 164/292 [00:03<00:02, 54.35it/s]
ram used:  8.29 GB, layers.18.attention_norm.weight                   :  56%|█████▌    | 164/292 [00:03<00:02, 54.35it/s]
ram used:  8.29 GB, layers.18.attention_norm.weight                   :  58%|█████▊    | 170/292 [00:03<00:02, 49.50it/s]
ram used:  8.29 GB, layers.18.ffn_norm.weight                         :  58%|█████▊    | 170/292 [00:03<00:02, 49.50it/s]
ram used:  8.29 GB, layers.19.attention.wq.weight                     :  58%|█████▊    | 170/292 [00:03<00:02, 49.50it/s]
ram used:  8.32 GB, layers.19.attention.wk.weight                     :  58%|█████▊    | 170/292 [00:03<00:02, 49.50it/s]
ram used:  8.33 GB, layers.19.attention.wv.weight                     :  58%|█████▊    | 170/292 [00:03<00:02, 49.50it/s]
ram used:  8.34 GB, layers.19.attention.wo.weight                     :  58%|█████▊    | 170/292 [00:03<00:02, 49.50it/s]
ram used:  8.37 GB, layers.19.feed_forward.w1.weight                  :  58%|█████▊    | 170/292 [00:03<00:02, 49.50it/s]
ram used:  8.49 GB, layers.19.feed_forward.w2.weight                  :  58%|█████▊    | 170/292 [00:03<00:02, 49.50it/s]
ram used:  8.49 GB, layers.19.feed_forward.w2.weight                  :  61%|██████    | 177/292 [00:03<00:02, 51.18it/s]
ram used:  8.61 GB, layers.19.feed_forward.w3.weight                  :  61%|██████    | 177/292 [00:03<00:02, 51.18it/s]
ram used:  8.72 GB, layers.19.attention_norm.weight                   :  61%|██████    | 177/292 [00:03<00:02, 51.18it/s]
ram used:  8.72 GB, layers.19.ffn_norm.weight                         :  61%|██████    | 177/292 [00:03<00:02, 51.18it/s]
ram used:  8.72 GB, layers.20.attention.wq.weight                     :  61%|██████    | 177/292 [00:03<00:02, 51.18it/s]
ram used:  8.76 GB, layers.20.attention.wk.weight                     :  61%|██████    | 177/292 [00:03<00:02, 51.18it/s]
ram used:  8.77 GB, layers.20.attention.wv.weight                     :  61%|██████    | 177/292 [00:03<00:02, 51.18it/s]
ram used:  8.77 GB, layers.20.attention.wo.weight                     :  61%|██████    | 177/292 [00:03<00:02, 51.18it/s]
ram used:  8.81 GB, layers.20.feed_forward.w1.weight                  :  61%|██████    | 177/292 [00:03<00:02, 51.18it/s]
ram used:  8.81 GB, layers.20.feed_forward.w1.weight                  :  63%|██████▎   | 185/292 [00:04<00:01, 54.40it/s]
ram used:  8.93 GB, layers.20.feed_forward.w2.weight                  :  63%|██████▎   | 185/292 [00:04<00:01, 54.40it/s]
ram used:  9.04 GB, layers.20.feed_forward.w3.weight                  :  63%|██████▎   | 185/292 [00:04<00:01, 54.40it/s]
ram used:  9.16 GB, layers.20.attention_norm.weight                   :  63%|██████▎   | 185/292 [00:04<00:01, 54.40it/s]
ram used:  9.16 GB, layers.20.ffn_norm.weight                         :  63%|██████▎   | 185/292 [00:04<00:01, 54.40it/s]
ram used:  9.16 GB, layers.21.attention.wq.weight                     :  63%|██████▎   | 185/292 [00:04<00:01, 54.40it/s]
ram used:  9.19 GB, layers.21.attention.wk.weight                     :  63%|██████▎   | 185/292 [00:04<00:01, 54.40it/s]
ram used:  9.19 GB, layers.21.attention.wk.weight                     :  65%|██████▌   | 191/292 [00:04<00:01, 54.59it/s]
ram used:  9.20 GB, layers.21.attention.wv.weight                     :  65%|██████▌   | 191/292 [00:04<00:01, 54.59it/s]
ram used:  9.21 GB, layers.21.attention.wo.weight                     :  65%|██████▌   | 191/292 [00:04<00:01, 54.59it/s]
ram used:  9.24 GB, layers.21.feed_forward.w1.weight                  :  65%|██████▌   | 191/292 [00:04<00:01, 54.59it/s]
ram used:  9.36 GB, layers.21.feed_forward.w2.weight                  :  65%|██████▌   | 191/292 [00:04<00:01, 54.59it/s]
ram used:  9.48 GB, layers.21.feed_forward.w3.weight                  :  65%|██████▌   | 191/292 [00:04<00:01, 54.59it/s]
ram used:  9.60 GB, layers.21.attention_norm.weight                   :  65%|██████▌   | 191/292 [00:04<00:01, 54.59it/s]
ram used:  9.60 GB, layers.21.attention_norm.weight                   :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.60 GB, layers.21.ffn_norm.weight                         :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.60 GB, layers.22.attention.wq.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.63 GB, layers.22.attention.wk.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.64 GB, layers.22.attention.wv.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.65 GB, layers.22.attention.wo.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.22.feed_forward.w1.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.22.feed_forward.w2.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.22.feed_forward.w3.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.22.attention_norm.weight                   :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.22.ffn_norm.weight                         :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.23.attention.wq.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.23.attention.wk.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.23.attention.wv.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.23.attention.wo.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.23.feed_forward.w1.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.23.feed_forward.w2.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.23.feed_forward.w3.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.23.attention_norm.weight                   :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.23.ffn_norm.weight                         :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.24.attention.wq.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.24.attention.wk.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.24.attention.wv.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.24.attention.wo.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.24.feed_forward.w1.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.24.feed_forward.w2.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.24.feed_forward.w3.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.24.attention_norm.weight                   :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.24.ffn_norm.weight                         :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.25.attention.wq.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.25.attention.wk.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.25.attention.wv.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.25.attention.wo.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.25.feed_forward.w1.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.25.feed_forward.w2.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.25.feed_forward.w3.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.25.attention_norm.weight                   :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.25.ffn_norm.weight                         :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.26.attention.wq.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.26.attention.wk.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.26.attention.wv.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.26.attention.wo.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.26.feed_forward.w1.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.26.feed_forward.w2.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.26.feed_forward.w3.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.26.attention_norm.weight                   :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.26.ffn_norm.weight                         :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.27.attention.wq.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.27.attention.wk.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.27.attention.wv.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.27.attention.wo.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.27.feed_forward.w1.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.27.feed_forward.w2.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.27.feed_forward.w3.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.27.attention_norm.weight                   :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.27.ffn_norm.weight                         :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.28.attention.wq.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.28.attention.wk.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.28.attention.wv.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.28.attention.wo.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.28.feed_forward.w1.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.28.feed_forward.w2.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.28.feed_forward.w3.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.28.attention_norm.weight                   :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.28.ffn_norm.weight                         :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.29.attention.wq.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.29.attention.wk.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.29.attention.wv.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.29.attention.wo.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.29.feed_forward.w1.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.29.feed_forward.w2.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.29.feed_forward.w3.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.29.attention_norm.weight                   :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.29.ffn_norm.weight                         :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.30.attention.wq.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.30.attention.wk.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.30.attention.wv.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.30.attention.wo.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.30.feed_forward.w1.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.30.feed_forward.w2.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.30.feed_forward.w3.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.30.attention_norm.weight                   :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.30.ffn_norm.weight                         :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.31.attention.wq.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.31.attention.wk.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.31.attention.wv.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.31.attention.wo.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.31.feed_forward.w1.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.31.feed_forward.w2.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.31.feed_forward.w3.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.31.attention_norm.weight                   :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.31.ffn_norm.weight                         :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, norm.weight                                       :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, tok_embeddings.weight                             :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, tok_embeddings.weight                             :  99%|█████████▉| 290/292 [00:04<00:00, 222.51it/s]
ram used:  9.94 GB, output.weight                                     :  99%|█████████▉| 290/292 [00:04<00:00, 222.51it/s]
ram used:  9.94 GB, freqs_cis                                         :  99%|█████████▉| 290/292 [00:04<00:00, 222.51it/s]
ram used:  9.94 GB, freqs_cis                                         : 100%|██████████| 292/292 [00:04<00:00, 65.78it/s] 
loaded weights in 4442.99 ms, 9.94 GB loaded at 2.24 GB/s
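
Note, the reported rate in the summary line above can be checked from the other two figures on that line. This is a minimal sketch, not code from tinygrad: the function and variable names are made up for illustration, and it assumes the rate is simply the size in GB divided by the elapsed time in seconds.

```python
# Values copied from the summary line in the log above.
loaded_gb = 9.94       # "9.94 GB loaded"
elapsed_ms = 4442.99   # "loaded weights in 4442.99 ms"


def throughput_gb_per_s(size_gb: float, elapsed_ms: float) -> float:
    """Return throughput in GB/s for a size in GB and a duration in ms.

    Hypothetical helper for illustration only; not part of tinygrad.
    """
    return size_gb / (elapsed_ms / 1000.0)


rate = throughput_gb_per_s(loaded_gb, elapsed_ms)
print(f"{rate:.2f} GB/s")
```

Rounded to two decimals, this matches the 2.24 GB/s in the log line.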

  0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.0.attention.wq.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.0.attention.wk.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.0.attention.wv.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.0.attention.wo.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.0.feed_forward.w1.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.0.feed_forward.w2.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.0.feed_forward.w3.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.0.attention_norm.weight                    :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.0.ffn_norm.weight                          :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.1.attention.wq.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.1.attention.wk.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.1.attention.wv.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.1.attention.wo.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.1.feed_forward.w1.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.1.feed_forward.w2.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.1.feed_forward.w3.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.1.attention_norm.weight                    :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.1.ffn_norm.weight                          :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.2.attention.wq.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.2.attention.wk.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.2.attention.wv.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.2.attention.wo.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.2.feed_forward.w1.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.2.feed_forward.w2.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.2.feed_forward.w3.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.2.attention_norm.weight                    :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.2.ffn_norm.weight                          :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.3.attention.wq.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.3.attention.wk.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.3.attention.wv.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.3.attention.wo.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.3.feed_forward.w1.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.3.feed_forward.w2.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.3.feed_forward.w3.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.3.attention_norm.weight                    :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.3.ffn_norm.weight                          :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.4.attention.wq.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.4.attention.wk.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.4.attention.wv.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.4.attention.wo.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.4.feed_forward.w1.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.4.feed_forward.w2.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.4.feed_forward.w3.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.4.attention_norm.weight                    :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.4.ffn_norm.weight                          :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.5.attention.wq.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.5.attention.wk.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.5.attention.wv.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.5.attention.wo.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.5.feed_forward.w1.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.5.feed_forward.w2.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.5.feed_forward.w3.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.5.attention_norm.weight                    :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.5.ffn_norm.weight                          :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.6.attention.wq.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.6.attention.wk.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.6.attention.wv.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.6.attention.wo.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.6.feed_forward.w1.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.6.feed_forward.w2.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.6.feed_forward.w3.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.6.attention_norm.weight                    :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.6.ffn_norm.weight                          :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.7.attention.wq.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.7.attention.wk.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.7.attention.wv.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.7.attention.wo.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.7.feed_forward.w1.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.7.feed_forward.w2.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.7.feed_forward.w3.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.7.attention_norm.weight                    :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.7.ffn_norm.weight                          :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.8.attention.wq.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.8.attention.wk.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.8.attention.wv.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.8.attention.wo.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.8.feed_forward.w1.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.8.feed_forward.w2.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.8.feed_forward.w3.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.8.attention_norm.weight                    :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.8.ffn_norm.weight                          :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.9.attention.wq.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.9.attention.wk.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.9.attention.wv.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.9.attention.wo.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.9.feed_forward.w1.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.9.feed_forward.w2.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.9.feed_forward.w3.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.9.attention_norm.weight                    :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.9.ffn_norm.weight                          :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.10.attention.wq.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.10.attention.wk.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.10.attention.wv.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.10.attention.wo.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.10.feed_forward.w1.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.10.feed_forward.w2.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.10.feed_forward.w3.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.10.attention_norm.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.10.ffn_norm.weight                         :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.11.attention.wq.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.11.attention.wk.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.11.attention.wv.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.11.attention.wo.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.11.feed_forward.w1.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.11.feed_forward.w2.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.11.feed_forward.w3.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.11.attention_norm.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.11.ffn_norm.weight                         :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.12.attention.wq.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.12.attention.wk.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.12.attention.wv.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.12.attention.wo.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.12.feed_forward.w1.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.12.feed_forward.w2.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.12.feed_forward.w3.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.12.attention_norm.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.12.ffn_norm.weight                         :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.13.attention.wq.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.13.attention.wk.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.13.attention.wv.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.13.attention.wo.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.13.feed_forward.w1.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.13.feed_forward.w2.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.13.feed_forward.w3.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.13.attention_norm.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.13.ffn_norm.weight                         :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.14.attention.wq.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.14.attention.wk.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.14.attention.wv.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.14.attention.wo.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.14.feed_forward.w1.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.14.feed_forward.w2.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.14.feed_forward.w3.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.14.attention_norm.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.14.ffn_norm.weight                         :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.15.attention.wq.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.15.attention.wk.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.15.attention.wv.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.15.attention.wo.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.15.feed_forward.w1.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.15.feed_forward.w2.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.15.feed_forward.w3.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.15.attention_norm.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.15.ffn_norm.weight                         :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.16.attention.wq.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.16.attention.wk.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.16.attention.wv.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.16.attention.wo.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.16.feed_forward.w1.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.16.feed_forward.w2.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.16.feed_forward.w3.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.16.attention_norm.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.16.ffn_norm.weight                         :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.17.attention.wq.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.17.attention.wk.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.17.attention.wv.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.17.attention.wo.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.17.feed_forward.w1.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.17.feed_forward.w2.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.17.feed_forward.w3.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.17.attention_norm.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.17.ffn_norm.weight                         :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.18.attention.wq.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.18.attention.wk.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.18.attention.wv.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.18.attention.wo.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.18.feed_forward.w1.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.18.feed_forward.w2.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.18.feed_forward.w3.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.18.attention_norm.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.18.ffn_norm.weight                         :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.19.attention.wq.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.19.attention.wk.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.19.attention.wv.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.19.attention.wo.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.19.feed_forward.w1.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.19.feed_forward.w2.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.19.feed_forward.w3.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.19.attention_norm.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.19.ffn_norm.weight                         :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.20.attention.wq.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.20.attention.wk.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.20.attention.wv.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.20.attention.wo.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.20.feed_forward.w1.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.20.feed_forward.w2.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.20.feed_forward.w3.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.20.attention_norm.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.20.ffn_norm.weight                         :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.21.attention.wq.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.21.attention.wk.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.21.attention.wv.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.21.attention.wo.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.21.feed_forward.w1.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.21.feed_forward.w2.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.21.feed_forward.w3.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.21.attention_norm.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.21.ffn_norm.weight                         :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.22.attention.wq.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.22.attention.wk.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.22.attention.wv.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.22.attention.wo.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.22.feed_forward.w1.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used: 10.06 GB, layers.22.feed_forward.w2.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used: 10.18 GB, layers.22.feed_forward.w3.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used: 10.18 GB, layers.22.feed_forward.w3.weight                  :  70%|███████   | 205/292 [00:00<00:00, 1533.22it/s]
ram used: 10.30 GB, layers.22.attention_norm.weight                   :  70%|███████   | 205/292 [00:00<00:00, 1533.22it/s]
ram used: 10.30 GB, layers.22.ffn_norm.weight                         :  70%|███████   | 205/292 [00:00<00:00, 1533.22it/s]
ram used: 10.30 GB, layers.23.attention.wq.weight                     :  70%|███████   | 205/292 [00:00<00:00, 1533.22it/s]
ram used: 10.33 GB, layers.23.attention.wk.weight                     :  70%|███████   | 205/292 [00:00<00:00, 1533.22it/s]
ram used: 10.34 GB, layers.23.attention.wv.weight                     :  70%|███████   | 205/292 [00:00<00:00, 1533.22it/s]
ram used: 10.35 GB, layers.23.attention.wo.weight                     :  70%|███████   | 205/292 [00:00<00:00, 1533.22it/s]
ram used: 10.38 GB, layers.23.feed_forward.w1.weight                  :  70%|███████   | 205/292 [00:00<00:00, 1533.22it/s]
ram used: 10.50 GB, layers.23.feed_forward.w2.weight                  :  70%|███████   | 205/292 [00:00<00:00, 1533.22it/s]
ram used: 10.61 GB, layers.23.feed_forward.w3.weight                  :  70%|███████   | 205/292 [00:00<00:00, 1533.22it/s]
ram used: 10.73 GB, layers.23.attention_norm.weight                   :  70%|███████   | 205/292 [00:00<00:00, 1533.22it/s]
ram used: 10.73 GB, layers.23.ffn_norm.weight                         :  70%|███████   | 205/292 [00:00<00:00, 1533.22it/s]
ram used: 10.73 GB, layers.24.attention.wq.weight                     :  70%|███████   | 205/292 [00:00<00:00, 1533.22it/s]
ram used: 10.77 GB, layers.24.attention.wk.weight                     :  70%|███████   | 205/292 [00:00<00:00, 1533.22it/s]
ram used: 10.77 GB, layers.24.attention.wv.weight                     :  70%|███████   | 205/292 [00:00<00:00, 1533.22it/s]
ram used: 10.78 GB, layers.24.attention.wo.weight                     :  70%|███████   | 205/292 [00:00<00:00, 1533.22it/s]
ram used: 10.82 GB, layers.24.feed_forward.w1.weight                  :  70%|███████   | 205/292 [00:00<00:00, 1533.22it/s]
ram used: 10.93 GB, layers.24.feed_forward.w2.weight                  :  70%|███████   | 205/292 [00:00<00:00, 1533.22it/s]
ram used: 11.05 GB, layers.24.feed_forward.w3.weight                  :  70%|███████   | 205/292 [00:00<00:00, 1533.22it/s]
ram used: 11.17 GB, layers.24.attention_norm.weight                   :  70%|███████   | 205/292 [00:00<00:00, 1533.22it/s]
ram used: 11.17 GB, layers.24.ffn_norm.weight                         :  70%|███████   | 205/292 [00:00<00:00, 1533.22it/s]
ram used: 11.17 GB, layers.25.attention.wq.weight                     :  70%|███████   | 205/292 [00:00<00:00, 1533.22it/s]
ram used: 11.20 GB, layers.25.attention.wk.weight                     :  70%|███████   | 205/292 [00:00<00:00, 1533.22it/s]
ram used: 11.21 GB, layers.25.attention.wv.weight                     :  70%|███████   | 205/292 [00:00<00:00, 1533.22it/s]
ram used: 11.22 GB, layers.25.attention.wo.weight                     :  70%|███████   | 205/292 [00:00<00:00, 1533.22it/s]
ram used: 11.25 GB, layers.25.feed_forward.w1.weight                  :  70%|███████   | 205/292 [00:00<00:00, 1533.22it/s]
ram used: 11.37 GB, layers.25.feed_forward.w2.weight                  :  70%|███████   | 205/292 [00:01<00:00, 1533.22it/s]
ram used: 11.49 GB, layers.25.feed_forward.w3.weight                  :  70%|███████   | 205/292 [00:01<00:00, 1533.22it/s]
ram used: 11.60 GB, layers.25.attention_norm.weight                   :  70%|███████   | 205/292 [00:01<00:00, 1533.22it/s]
ram used: 11.60 GB, layers.25.ffn_norm.weight                         :  70%|███████   | 205/292 [00:01<00:00, 1533.22it/s]
ram used: 11.60 GB, layers.26.attention.wq.weight                     :  70%|███████   | 205/292 [00:01<00:00, 1533.22it/s]
ram used: 11.64 GB, layers.26.attention.wk.weight                     :  70%|███████   | 205/292 [00:01<00:00, 1533.22it/s]
ram used: 11.65 GB, layers.26.attention.wv.weight                     :  70%|███████   | 205/292 [00:01<00:00, 1533.22it/s]
ram used: 11.65 GB, layers.26.attention.wo.weight                     :  70%|███████   | 205/292 [00:01<00:00, 1533.22it/s]
ram used: 11.69 GB, layers.26.feed_forward.w1.weight                  :  70%|███████   | 205/292 [00:01<00:00, 1533.22it/s]
ram used: 11.81 GB, layers.26.feed_forward.w2.weight                  :  70%|███████   | 205/292 [00:01<00:00, 1533.22it/s]
ram used: 11.92 GB, layers.26.feed_forward.w3.weight                  :  70%|███████   | 205/292 [00:01<00:00, 1533.22it/s]
ram used: 12.04 GB, layers.26.attention_norm.weight                   :  70%|███████   | 205/292 [00:02<00:00, 1533.22it/s]
ram used: 12.04 GB, layers.26.ffn_norm.weight                         :  70%|███████   | 205/292 [00:02<00:00, 1533.22it/s]
ram used: 12.04 GB, layers.27.attention.wq.weight                     :  70%|███████   | 205/292 [00:02<00:00, 1533.22it/s]
ram used: 12.07 GB, layers.27.attention.wk.weight                     :  70%|███████   | 205/292 [00:02<00:00, 1533.22it/s]
ram used: 12.08 GB, layers.27.attention.wv.weight                     :  70%|███████   | 205/292 [00:02<00:00, 1533.22it/s]
ram used: 12.09 GB, layers.27.attention.wo.weight                     :  70%|███████   | 205/292 [00:02<00:00, 1533.22it/s]
ram used: 12.12 GB, layers.27.feed_forward.w1.weight                  :  70%|███████   | 205/292 [00:02<00:00, 1533.22it/s]
ram used: 12.24 GB, layers.27.feed_forward.w2.weight                  :  70%|███████   | 205/292 [00:03<00:00, 1533.22it/s]
ram used: 12.36 GB, layers.27.feed_forward.w3.weight                  :  70%|███████   | 205/292 [00:10<00:00, 1533.22it/s]
ram used: 12.48 GB, layers.27.attention_norm.weight                   :  70%|███████   | 205/292 [00:11<00:00, 1533.22it/s]
ram used: 12.48 GB, layers.27.ffn_norm.weight                         :  70%|███████   | 205/292 [00:13<00:00, 1533.22it/s]
ram used: 12.48 GB, layers.28.attention.wq.weight                     :  70%|███████   | 205/292 [00:13<00:00, 1533.22it/s]
ram used: 12.51 GB, layers.28.attention.wk.weight                     :  70%|███████   | 205/292 [00:13<00:00, 1533.22it/s]
ram used: 12.52 GB, layers.28.attention.wv.weight                     :  70%|███████   | 205/292 [00:13<00:00, 1533.22it/s]
ram used: 12.53 GB, layers.28.attention.wo.weight                     :  70%|███████   | 205/292 [00:13<00:00, 1533.22it/s]
ram used: 12.56 GB, layers.28.feed_forward.w1.weight                  :  70%|███████   | 205/292 [00:13<00:00, 1533.22it/s]
ram used: 12.68 GB, layers.28.feed_forward.w2.weight                  :  70%|███████   | 205/292 [00:13<00:00, 1533.22it/s]
ram used: 12.80 GB, layers.28.feed_forward.w3.weight                  :  70%|███████   | 205/292 [00:13<00:00, 1533.22it/s]
ram used: 12.91 GB, layers.28.attention_norm.weight                   :  70%|███████   | 205/292 [00:13<00:00, 1533.22it/s]
ram used: 12.91 GB, layers.28.ffn_norm.weight                         :  70%|███████   | 205/292 [00:13<00:00, 1533.22it/s]
ram used: 12.91 GB, layers.29.attention.wq.weight                     :  70%|███████   | 205/292 [00:13<00:00, 1533.22it/s]
ram used: 12.95 GB, layers.29.attention.wk.weight                     :  70%|███████   | 205/292 [00:13<00:00, 1533.22it/s]
ram used: 12.95 GB, layers.29.attention.wv.weight                     :  70%|███████   | 205/292 [00:13<00:00, 1533.22it/s]
ram used: 12.96 GB, layers.29.attention.wo.weight                     :  70%|███████   | 205/292 [00:13<00:00, 1533.22it/s]
ram used: 13.00 GB, layers.29.feed_forward.w1.weight                  :  70%|███████   | 205/292 [00:13<00:00, 1533.22it/s]
ram used: 13.11 GB, layers.29.feed_forward.w2.weight                  :  70%|███████   | 205/292 [00:13<00:00, 1533.22it/s]
ram used: 13.23 GB, layers.29.feed_forward.w3.weight                  :  70%|███████   | 205/292 [00:13<00:00, 1533.22it/s]
ram used: 13.35 GB, layers.29.attention_norm.weight                   :  70%|███████   | 205/292 [00:13<00:00, 1533.22it/s]
ram used: 13.35 GB, layers.29.ffn_norm.weight                         :  70%|███████   | 205/292 [00:13<00:00, 1533.22it/s]
ram used: 13.35 GB, layers.30.attention.wq.weight                     :  70%|███████   | 205/292 [00:13<00:00, 1533.22it/s]
ram used: 13.38 GB, layers.30.attention.wk.weight                     :  70%|███████   | 205/292 [00:14<00:00, 1533.22it/s]
ram used: 13.39 GB, layers.30.attention.wv.weight                     :  70%|███████   | 205/292 [00:14<00:00, 1533.22it/s]
ram used: 13.40 GB, layers.30.attention.wo.weight                     :  70%|███████   | 205/292 [00:14<00:00, 1533.22it/s]
ram used: 13.43 GB, layers.30.feed_forward.w1.weight                  :  70%|███████   | 205/292 [00:14<00:00, 1533.22it/s]
ram used: 13.55 GB, layers.30.feed_forward.w2.weight                  :  70%|███████   | 205/292 [00:14<00:00, 1533.22it/s]
ram used: 13.67 GB, layers.30.feed_forward.w3.weight                  :  70%|███████   | 205/292 [00:14<00:00, 1533.22it/s]
ram used: 13.79 GB, layers.30.attention_norm.weight                   :  70%|███████   | 205/292 [00:14<00:00, 1533.22it/s]
ram used: 13.79 GB, layers.30.ffn_norm.weight                         :  70%|███████   | 205/292 [00:14<00:00, 1533.22it/s]
ram used: 13.79 GB, layers.31.attention.wq.weight                     :  70%|███████   | 205/292 [00:14<00:00, 1533.22it/s]
ram used: 13.82 GB, layers.31.attention.wk.weight                     :  70%|███████   | 205/292 [00:14<00:00, 1533.22it/s]
ram used: 13.83 GB, layers.31.attention.wv.weight                     :  70%|███████   | 205/292 [00:14<00:00, 1533.22it/s]
ram used: 13.84 GB, layers.31.attention.wo.weight                     :  70%|███████   | 205/292 [00:14<00:00, 1533.22it/s]
ram used: 13.87 GB, layers.31.feed_forward.w1.weight                  :  70%|███████   | 205/292 [00:14<00:00, 1533.22it/s]
ram used: 13.99 GB, layers.31.feed_forward.w2.weight                  :  70%|███████   | 205/292 [00:14<00:00, 1533.22it/s]
ram used: 14.10 GB, layers.31.feed_forward.w3.weight                  :  70%|███████   | 205/292 [00:14<00:00, 1533.22it/s]
ram used: 14.22 GB, layers.31.attention_norm.weight                   :  70%|███████   | 205/292 [00:14<00:00, 1533.22it/s]
ram used: 14.22 GB, layers.31.ffn_norm.weight                         :  70%|███████   | 205/292 [00:14<00:00, 1533.22it/s]
ram used: 14.22 GB, norm.weight                                       :  70%|███████   | 205/292 [00:14<00:00, 1533.22it/s]
ram used: 14.22 GB, tok_embeddings.weight                             :  70%|███████   | 205/292 [00:14<00:00, 1533.22it/s]
ram used: 14.22 GB, output.weight                                     :  70%|███████   | 205/292 [00:14<00:00, 1533.22it/s]
ram used: 14.48 GB, freqs_cis                                         :  70%|███████   | 205/292 [00:14<00:00, 1533.22it/s]
ram used: 14.48 GB, freqs_cis                                         : 100%|██████████| 292/292 [00:14<00:00, 20.23it/s]  
loaded weights in 14431.85 ms, 4.54 GB loaded at 0.31 GB/s
weights -> model: 18909.50 ms
<|im_start|> system
You are Quentin. Quentin is a useful assistant who writes Python code to answer questions. He keeps the code as short as possible and doesn't read from user input<|im_end|> 
Q: 

compile_efficientnet.py

Traceback (most recent call last):
  File "/home/jebba/devel/tinygrad/tinygrad/examples/compile_efficientnet.py", line 13, in <module>
    prg, inp_sizes, out_sizes, state = export_model(model, mode, Tensor.randn(1,3,224,224))
                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jebba/devel/tinygrad/tinygrad/extra/export_model.py", line 313, in export_model
    assert Device.DEFAULT in EXPORT_SUPPORTED_DEVICE, "only WEBGPU, WEBGL, CLANG, CUDA, GPU, METAL are supported"
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: only WEBGPU, WEBGL, CLANG, CUDA, GPU, METAL are supported

compile_tensorflow.py

2024-02-06 13:09:00.382257: I external/local_tsl/tsl/cuda/cudart_stub.cc:31] Could not find cuda drivers on your machine, GPU will not be used.
2024-02-06 13:09:00.420457: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-02-06 13:09:00.420503: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-02-06 13:09:00.421410: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-02-06 13:09:00.426991: I external/local_tsl/tsl/cuda/cudart_stub.cc:31] Could not find cuda drivers on your machine, GPU will not be used.
2024-02-06 13:09:00.427163: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-02-06 13:09:01.185053: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-02-06 13:09:02.593911: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 0
2024-02-06 13:09:02.594059: I tensorflow/core/grappler/clusters/single_machine.cc:361] Starting new session
2024-02-06 13:09:02.691872: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 0
2024-02-06 13:09:02.692025: I tensorflow/core/grappler/clusters/single_machine.cc:361] Starting new session
tinygrad: [0.29635584354400635, 0.5070338845252991, 0.6352834105491638, 0.15874029695987701]
compiled: [0.296356, 0.507034, 0.635283, 0.15874]
keras:    [0.29635587 0.5070339  0.6352834  0.15874033]
#include <string.h>
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#define max(x,y) ((x>y)?x:y)
#define int64 long
#define half __fp16
#define uchar unsigned char
#include <stdbool.h>

float buf_0[64];
float input0[128];
float buf_1[2048];
float buf_2[64];
float buf_3[128];
float buf_4[2048];
float output0[16];
float buf_5[512];
void r_16_32(float* restrict data0, const float* restrict data1, const float* restrict data2, const float* restrict data3) {
  float val0 = data1[0];
  float val1 = data1[1];
  float val2 = data1[2];
  float val3 = data1[3];
  float val4 = data1[4];
  float val5 = data1[5];
  float val6 = data1[6];
  float val7 = data1[7];
  float val8 = data1[8];
  float val9 = data1[9];
  float val10 = data1[10];
  float val11 = data1[11];
  float val12 = data1[12];
  float val13 = data1[13];
  float val14 = data1[14];
  float val15 = data1[15];
  float val16 = data1[16];
  float val17 = data1[17];
  float val18 = data1[18];
  float val19 = data1[19];
  float val20 = data1[20];
  float val21 = data1[21];
  float val22 = data1[22];
  float val23 = data1[23];
  float val24 = data1[24];
  float val25 = data1[25];
  float val26 = data1[26];
  float val27 = data1[27];
  float val28 = data1[28];
  float val29 = data1[29];
  float val30 = data1[30];
  float val31 = data1[31];
  for (int ridx0 = 0; ridx0 < 16; ridx0++) {
    float acc0 = 0.0f;
    float val32 = data2[ridx0];
    float val33 = data2[ridx0+16];
    float val34 = data2[ridx0+32];
    float val35 = data2[ridx0+48];
    float val36 = data2[ridx0+64];
    float val37 = data2[ridx0+80];
    float val38 = data2[ridx0+96];
    float val39 = data2[ridx0+112];
    float val40 = data2[ridx0+128];
    float val41 = data2[ridx0+144];
    float val42 = data2[ridx0+160];
    float val43 = data2[ridx0+176];
    float val44 = data2[ridx0+192];
    float val45 = data2[ridx0+208];
    float val46 = data2[ridx0+224];
    float val47 = data2[ridx0+240];
    float val48 = data2[ridx0+256];
    float val49 = data2[ridx0+272];
    float val50 = data2[ridx0+288];
    float val51 = data2[ridx0+304];
    float val52 = data2[ridx0+320];
    float val53 = data2[ridx0+336];
    float val54 = data2[ridx0+352];
    float val55 = data2[ridx0+368];
    float val56 = data2[ridx0+384];
    float val57 = data2[ridx0+400];
    float val58 = data2[ridx0+416];
    float val59 = data2[ridx0+432];
    float val60 = data2[ridx0+448];
    float val61 = data2[ridx0+464];
    float val62 = data2[ridx0+480];
    float val63 = data2[ridx0+496];
    float val64 = data3[ridx0];
    float alu0 = max(((val31*val63)+((val30*val62)+((val29*val61)+((val28*val60)+((val27*val59)+((val26*val58)+((val25*val57)+((val24*val56)+((val23*val55)+((val22*val54)+((val21*val53)+((val20*val52)+((val19*val51)+((val18*val50)+((val17*val49)+((val16*val48)+((val15*val47)+((val14*val46)+((val13*val45)+((val12*val44)+((val11*val43)+((val10*val42)+((val9*val41)+((val8*val40)+((val7*val39)+((val6*val38)+((val5*val37)+((val4*val36)+((val3*val35)+((val2*val34)+((val1*val33)+((val0*val32)+acc0)))))))))))))))))))))))))))))))),0.0f);
    data0[ridx0] = (alu0*val64);
  }
}
void r_32_16(float* restrict data0, const float* restrict data1, const float* restrict data2) {
  float val0 = data1[0];
  float val1 = data1[1];
  float val2 = data1[2];
  float val3 = data1[3];
  float val4 = data1[4];
  float val5 = data1[5];
  float val6 = data1[6];
  float val7 = data1[7];
  float val8 = data1[8];
  float val9 = data1[9];
  float val10 = data1[10];
  float val11 = data1[11];
  float val12 = data1[12];
  float val13 = data1[13];
  float val14 = data1[14];
  float val15 = data1[15];
  for (int ridx0 = 0; ridx0 < 32; ridx0++) {
    float acc0 = 0.0f;
    float val16 = data2[ridx0];
    float val17 = data2[ridx0+32];
    float val18 = data2[ridx0+64];
    float val19 = data2[ridx0+96];
    float val20 = data2[ridx0+128];
    float val21 = data2[ridx0+160];
    float val22 = data2[ridx0+192];
    float val23 = data2[ridx0+224];
    float val24 = data2[ridx0+256];
    float val25 = data2[ridx0+288];
    float val26 = data2[ridx0+320];
    float val27 = data2[ridx0+352];
    float val28 = data2[ridx0+384];
    float val29 = data2[ridx0+416];
    float val30 = data2[ridx0+448];
    float val31 = data2[ridx0+480];
    float alu0 = max(((val15*val31)+((val14*val30)+((val13*val29)+((val12*val28)+((val11*val27)+((val10*val26)+((val9*val25)+((val8*val24)+((val7*val23)+((val6*val22)+((val5*val21)+((val4*val20)+((val3*val19)+((val2*val18)+((val1*val17)+((val0*val16)+acc0)))))))))))))))),0.0f);
    data0[ridx0] = alu0;
  }
}
void r_4_32(float* restrict data0, const float* restrict data1, const float* restrict data2) {
  float val0 = data1[0];
  float val1 = data1[1];
  float val2 = data1[2];
  float val3 = data1[3];
  float val4 = data1[4];
  float val5 = data1[5];
  float val6 = data1[6];
  float val7 = data1[7];
  float val8 = data1[8];
  float val9 = data1[9];
  float val10 = data1[10];
  float val11 = data1[11];
  float val12 = data1[12];
  float val13 = data1[13];
  float val14 = data1[14];
  float val15 = data1[15];
  float val16 = data1[16];
  float val17 = data1[17];
  float val18 = data1[18];
  float val19 = data1[19];
  float val20 = data1[20];
  float val21 = data1[21];
  float val22 = data1[22];
  float val23 = data1[23];
  float val24 = data1[24];
  float val25 = data1[25];
  float val26 = data1[26];
  float val27 = data1[27];
  float val28 = data1[28];
  float val29 = data1[29];
  float val30 = data1[30];
  float val31 = data1[31];
  for (int ridx0 = 0; ridx0 < 4; ridx0++) {
    float acc0 = 0.0f;
    float val32 = data2[ridx0];
    float val33 = data2[ridx0+4];
    float val34 = data2[ridx0+8];
    float val35 = data2[ridx0+12];
    float val36 = data2[ridx0+16];
    float val37 = data2[ridx0+20];
    float val38 = data2[ridx0+24];
    float val39 = data2[ridx0+28];
    float val40 = data2[ridx0+32];
    float val41 = data2[ridx0+36];
    float val42 = data2[ridx0+40];
    float val43 = data2[ridx0+44];
    float val44 = data2[ridx0+48];
    float val45 = data2[ridx0+52];
    float val46 = data2[ridx0+56];
    float val47 = data2[ridx0+60];
    float val48 = data2[ridx0+64];
    float val49 = data2[ridx0+68];
    float val50 = data2[ridx0+72];
    float val51 = data2[ridx0+76];
    float val52 = data2[ridx0+80];
    float val53 = data2[ridx0+84];
    float val54 = data2[ridx0+88];
    float val55 = data2[ridx0+92];
    float val56 = data2[ridx0+96];
    float val57 = data2[ridx0+100];
    float val58 = data2[ridx0+104];
    float val59 = data2[ridx0+108];
    float val60 = data2[ridx0+112];
    float val61 = data2[ridx0+116];
    float val62 = data2[ridx0+120];
    float val63 = data2[ridx0+124];
    data0[ridx0] = (1.0f/(1.0f+exp2((((val31*val63)+((val30*val62)+((val29*val61)+((val28*val60)+((val27*val59)+((val26*val58)+((val25*val57)+((val24*val56)+((val23*val55)+((val22*val54)+((val21*val53)+((val20*val52)+((val19*val51)+((val18*val50)+((val17*val49)+((val16*val48)+((val15*val47)+((val14*val46)+((val13*val45)+((val12*val44)+((val11*val43)+((val10*val42)+((val9*val41)+((val8*val40)+((val7*val39)+((val6*val38)+((val5*val37)+((val4*val36)+((val3*val35)+((val2*val34)+((val1*val33)+((val0*val32)+acc0))))))))))))))))))))))))))))))))*(-1.4426950408889634f)))));
  }
}
void net(float* input0, float* output0) {
r_16_32(buf_0, input0, buf_1, buf_2);
r_32_16(buf_3, buf_0, buf_4);
r_4_32(output0, buf_3, buf_5);
}
void initialize(float *weights) {
memcpy(buf_1, weights + 0, 8192);
memcpy(buf_2, weights + 512, 256);
memcpy(buf_4, weights + 528, 8192);
memcpy(buf_5, weights + 1040, 2048);
}
int main(int argc, char *argv[]) {
    // read in the weights from disk
    FILE *f = fopen("/tmp/tf_weights", "rb");
    float *weights = (float *)malloc(4672);
    fread(weights, 1, 4672, f);
    fclose(f);

    // init the net
    initialize(weights);

    // test run
    float input[32];
    float outputs[4];
    for (int i = 0; i < 32; i++) scanf("%f", &input[i]);
    net(input, outputs);
    printf("%f %f %f %f\n", outputs[0], outputs[1], outputs[2], outputs[3]);
  }
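
The generated C above can be read as three dense layers. What follows is a minimal pure-Python sketch of the same arithmetic, not the exporter's own code. The names `dense`, `relu`, `sigmoid_via_exp2`, `net_reference`, `w1`, `scale`, `w2` and `w3` are assumptions introduced here for illustration. The layer shapes (32 to 16, 16 to 32, 32 to 4) are read off the loop bounds and index strides in `r_16_32`, `r_32_16` and `r_4_32`. The constant 1.4426950408889634 is log2(e), so `1/(1+exp2(-z*log2(e)))` is the ordinary sigmoid `1/(1+exp(-z))`.

```python
import math
import random

LOG2_E = 1.4426950408889634  # log2(e); appears negated inside exp2() in r_4_32


def dense(x, w, n_out):
    # y[j] = sum_k x[k] * w[k * n_out + j], the same flat indexing as data2[ridx0 + n_out*k]
    return [sum(x[k] * w[k * n_out + j] for k in range(len(x))) for j in range(n_out)]


def relu(v):
    # Matches max(..., 0.0f) in the generated kernels
    return [max(a, 0.0) for a in v]


def sigmoid_via_exp2(v):
    # 1 / (1 + 2^(-z * log2(e))) is the same value as 1 / (1 + e^(-z))
    return [1.0 / (1.0 + 2.0 ** (z * -LOG2_E)) for z in v]


def net_reference(x, w1, scale, w2, w3):
    # Hypothetical argument names: w1, scale, w2, w3 stand in for buf_1, buf_2, buf_4, buf_5
    h0 = [a * s for a, s in zip(relu(dense(x, w1, 16)), scale)]  # r_16_32
    h1 = relu(dense(h0, w2, 32))                                 # r_32_16
    return sigmoid_via_exp2(dense(h1, w3, 4))                    # r_4_32


if __name__ == "__main__":
    # Random stand-in weights; the real ones live in /tmp/tf_weights
    rng = random.Random(0)
    x = [rng.uniform(-1, 1) for _ in range(32)]
    w1 = [rng.uniform(-0.5, 0.5) for _ in range(32 * 16)]
    scale = [rng.uniform(0.5, 1.5) for _ in range(16)]
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(16 * 32)]
    w3 = [rng.uniform(-0.5, 0.5) for _ in range(32 * 4)]
    print(net_reference(x, w1, scale, w2, w3))
```

With random stand-in weights the four outputs will not match the values logged above; the sketch only mirrors the structure of `net()`.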

conversation.py

[nltk_data] Downloading package punkt to /home/jebba/nltk_data...
[nltk_data]   Package punkt is already up-to-date!

  0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.00 GB, encoder.conv1.weight                              :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.00 GB, encoder.conv1.bias                                :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.00 GB, encoder.conv2.weight                              :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.00 GB, encoder.conv2.bias                                :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.00 GB, encoder.blocks.0.attn.query.weight                :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.00 GB, encoder.blocks.0.attn.query.bias                  :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.00 GB, encoder.blocks.0.attn.key.weight                  :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.00 GB, encoder.blocks.0.attn.value.weight                :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.00 GB, encoder.blocks.0.attn.value.bias                  :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.00 GB, encoder.blocks.0.attn.out.weight                  :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.00 GB, encoder.blocks.0.attn.out.bias                    :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.00 GB, encoder.blocks.0.attn_ln.weight                   :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.00 GB, encoder.blocks.0.attn_ln.bias                     :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.00 GB, encoder.blocks.0.mlp.0.weight                     :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.01 GB, encoder.blocks.0.mlp.0.bias                       :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.01 GB, encoder.blocks.0.mlp.2.weight                     :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.01 GB, encoder.blocks.0.mlp.2.bias                       :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.01 GB, encoder.blocks.0.mlp_ln.weight                    :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.01 GB, encoder.blocks.0.mlp_ln.bias                      :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.01 GB, encoder.blocks.1.attn.query.weight                :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.01 GB, encoder.blocks.1.attn.query.bias                  :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.01 GB, encoder.blocks.1.attn.key.weight                  :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.01 GB, encoder.blocks.1.attn.value.weight                :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.01 GB, encoder.blocks.1.attn.value.bias                  :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.01 GB, encoder.blocks.1.attn.out.weight                  :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.01 GB, encoder.blocks.1.attn.out.bias                    :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.01 GB, encoder.blocks.1.attn_ln.weight                   :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.01 GB, encoder.blocks.1.attn_ln.bias                     :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.01 GB, encoder.blocks.1.mlp.0.weight                     :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.01 GB, encoder.blocks.1.mlp.0.bias                       :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.01 GB, encoder.blocks.1.mlp.2.weight                     :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.1.mlp.2.bias                       :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.1.mlp_ln.weight                    :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.1.mlp_ln.bias                      :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.2.attn.query.weight                :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.2.attn.query.bias                  :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.2.attn.key.weight                  :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.2.attn.value.weight                :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.2.attn.value.bias                  :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.2.attn.out.weight                  :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.2.attn.out.bias                    :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.2.attn_ln.weight                   :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.2.attn_ln.bias                     :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.2.mlp.0.weight                     :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.2.mlp.0.bias                       :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.2.mlp.2.weight                     :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.2.mlp.2.bias                       :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.2.mlp_ln.weight                    :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.2.mlp_ln.bias                      :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.3.attn.query.weight                :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.3.attn.query.bias                  :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.3.attn.key.weight                  :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.3.attn.value.weight                :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.3.attn.value.bias                  :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.3.attn.out.weight                  :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.03 GB, encoder.blocks.3.attn.out.bias                    :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.03 GB, encoder.blocks.3.attn_ln.weight                   :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.03 GB, encoder.blocks.3.attn_ln.bias                     :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.03 GB, encoder.blocks.3.mlp.0.weight                     :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.03 GB, encoder.blocks.3.mlp.0.bias                       :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.03 GB, encoder.blocks.3.mlp.2.weight                     :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.03 GB, encoder.blocks.3.mlp.2.bias                       :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.03 GB, encoder.blocks.3.mlp_ln.weight                    :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.03 GB, encoder.blocks.3.mlp_ln.bias                      :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.03 GB, encoder.ln_post.weight                            :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.03 GB, encoder.ln_post.bias                              :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.03 GB, encoder.positional_embedding                      :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.03 GB, decoder.token_embedding.weight                    :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.03 GB, decoder.token_embedding.weight                    :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.07 GB, decoder.positional_embedding                      :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.07 GB, decoder.blocks.0.attn.query.weight                :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.07 GB, decoder.blocks.0.attn.query.bias                  :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.07 GB, decoder.blocks.0.attn.key.weight                  :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.07 GB, decoder.blocks.0.attn.value.weight                :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.07 GB, decoder.blocks.0.attn.value.bias                  :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.07 GB, decoder.blocks.0.attn.out.weight                  :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.07 GB, decoder.blocks.0.attn.out.bias                    :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.07 GB, decoder.blocks.0.attn_ln.weight                   :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.07 GB, decoder.blocks.0.attn_ln.bias                     :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.07 GB, decoder.blocks.0.cross_attn.query.weight          :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.07 GB, decoder.blocks.0.cross_attn.query.bias            :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.07 GB, decoder.blocks.0.cross_attn.key.weight            :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.08 GB, decoder.blocks.0.cross_attn.value.weight          :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.08 GB, decoder.blocks.0.cross_attn.value.bias            :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.08 GB, decoder.blocks.0.cross_attn.out.weight            :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.08 GB, decoder.blocks.0.cross_attn.out.bias              :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.08 GB, decoder.blocks.0.cross_attn_ln.weight             :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.08 GB, decoder.blocks.0.cross_attn_ln.bias               :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.08 GB, decoder.blocks.0.mlp.0.weight                     :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.08 GB, decoder.blocks.0.mlp.0.bias                       :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.08 GB, decoder.blocks.0.mlp.2.weight                     :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.08 GB, decoder.blocks.0.mlp.2.bias                       :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.08 GB, decoder.blocks.0.mlp_ln.weight                    :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.08 GB, decoder.blocks.0.mlp_ln.bias                      :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.08 GB, decoder.blocks.1.attn.query.weight                :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.08 GB, decoder.blocks.1.attn.query.bias                  :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.08 GB, decoder.blocks.1.attn.key.weight                  :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.08 GB, decoder.blocks.1.attn.value.weight                :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.08 GB, decoder.blocks.1.attn.value.bias                  :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.08 GB, decoder.blocks.1.attn.out.weight                  :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.08 GB, decoder.blocks.1.attn.out.bias                    :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.08 GB, decoder.blocks.1.attn_ln.weight                   :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.08 GB, decoder.blocks.1.attn_ln.bias                     :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.08 GB, decoder.blocks.1.cross_attn.query.weight          :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.08 GB, decoder.blocks.1.cross_attn.query.bias            :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.08 GB, decoder.blocks.1.cross_attn.key.weight            :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.08 GB, decoder.blocks.1.cross_attn.value.weight          :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.09 GB, decoder.blocks.1.cross_attn.value.bias            :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.09 GB, decoder.blocks.1.cross_attn.out.weight            :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.09 GB, decoder.blocks.1.cross_attn.out.bias              :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.09 GB, decoder.blocks.1.cross_attn_ln.weight             :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.09 GB, decoder.blocks.1.cross_attn_ln.bias               :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.09 GB, decoder.blocks.1.mlp.0.weight                     :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.09 GB, decoder.blocks.1.mlp.0.bias                       :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.09 GB, decoder.blocks.1.mlp.2.weight                     :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.09 GB, decoder.blocks.1.mlp.2.bias                       :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.09 GB, decoder.blocks.1.mlp_ln.weight                    :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.09 GB, decoder.blocks.1.mlp_ln.bias                      :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.09 GB, decoder.blocks.2.attn.query.weight                :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.09 GB, decoder.blocks.2.attn.query.bias                  :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.09 GB, decoder.blocks.2.attn.key.weight                  :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.09 GB, decoder.blocks.2.attn.value.weight                :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.09 GB, decoder.blocks.2.attn.value.bias                  :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.09 GB, decoder.blocks.2.attn.out.weight                  :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.09 GB, decoder.blocks.2.attn.out.bias                    :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.09 GB, decoder.blocks.2.attn_ln.weight                   :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.09 GB, decoder.blocks.2.attn_ln.bias                     :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.09 GB, decoder.blocks.2.cross_attn.query.weight          :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.09 GB, decoder.blocks.2.cross_attn.query.bias            :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.09 GB, decoder.blocks.2.cross_attn.key.weight            :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.09 GB, decoder.blocks.2.cross_attn.value.weight          :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.09 GB, decoder.blocks.2.cross_attn.value.bias            :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.09 GB, decoder.blocks.2.cross_attn.out.weight            :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.2.cross_attn.out.bias              :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.2.cross_attn_ln.weight             :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.2.cross_attn_ln.bias               :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.2.mlp.0.weight                     :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.2.mlp.0.bias                       :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.2.mlp.2.weight                     :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.2.mlp.2.bias                       :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.2.mlp_ln.weight                    :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.2.mlp_ln.bias                      :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.3.attn.query.weight                :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.3.attn.query.bias                  :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.3.attn.key.weight                  :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.3.attn.value.weight                :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.3.attn.value.bias                  :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.3.attn.out.weight                  :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.3.attn.out.bias                    :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.3.attn_ln.weight                   :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.3.attn_ln.bias                     :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.3.cross_attn.query.weight          :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.3.cross_attn.query.bias            :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.3.cross_attn.key.weight            :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.3.cross_attn.value.weight          :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.3.cross_attn.value.bias            :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.3.cross_attn.out.weight            :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.3.cross_attn.out.bias              :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.3.cross_attn_ln.weight             :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.3.cross_attn_ln.bias               :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.3.mlp.0.weight                     :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.11 GB, decoder.blocks.3.mlp.0.bias                       :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.11 GB, decoder.blocks.3.mlp.2.weight                     :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.11 GB, decoder.blocks.3.mlp.2.bias                       :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.11 GB, decoder.blocks.3.mlp_ln.weight                    :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.11 GB, decoder.blocks.3.mlp_ln.bias                      :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.11 GB, decoder.ln.weight                                 :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.11 GB, decoder.ln.bias                                   :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.11 GB, decoder.mask                                      :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.11 GB, decoder.mask                                      : 100%|██████████| 168/168 [00:00<00:00, 976.08it/s]
loaded weights in 173.76 ms, 0.11 GB loaded at 0.62 GB/s
Traceback (most recent call last):
  File "/home/jebba/devel/tinygrad/tinygrad/examples/conversation.py", line 261, in <module>
    synth, emotion_embedding, text_mapper, hps, model_has_multiple_speakers = init_vits(args.vits_model_to_use, args.vits_emotion_path, args.vits_speaker_id, args.vits_seed)
                                                                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jebba/devel/tinygrad/tinygrad/examples/conversation.py", line 166, in init_vits
    net_g = load_model(text_mapper.symbols, hps, model_config)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jebba/devel/tinygrad/tinygrad/examples/vits.py", line 535, in load_model
    _ = load_checkpoint(fetch(model[1]), net_g, None)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jebba/devel/tinygrad/tinygrad/examples/vits.py", line 540, in load_checkpoint
    checkpoint_dict = torch_load(checkpoint_path)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jebba/devel/tinygrad/tinygrad/tinygrad/nn/state.py", line 145, in torch_load
    _, _, _, rwd, _, ids, base_offset = pkl.load(), pkl.load(), pkl.load(), f.tell(), pkl.load(), pkl.load(), f.tell()
                                        ^^^^^^^^^^
_pickle.UnpicklingError: invalid load key, '<'.
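
Note, the load key '<' in the error above means the first byte handed to the unpickler was '<'. One possible cause, not confirmed by this run, is that the fetched checkpoint file is an HTML or XML page (for example an error page from the download host) rather than model weights. Below is a minimal sketch for checking a downloaded file before loading it. The helper name looks_like_markup and the path in the usage line are hypothetical and are not part of tinygrad.

```python
def looks_like_markup(path, probe_bytes=64):
    """Return True if the file starts with '<' after optional leading whitespace.

    Hypothetical helper: a file that starts with '<' is probably an HTML/XML
    page rather than a pickled or zipped checkpoint.
    """
    # Read only the first few bytes so large checkpoints are not loaded fully.
    with open(path, "rb") as f:
        head = f.read(probe_bytes)
    return head.lstrip().startswith(b"<")
```

For example, looks_like_markup("/path/to/checkpoint.pth") returning True would suggest inspecting or re-downloading the file before retrying the example.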

efficientnet.py

281 8.961816 tabby, tabby cat
did inference in 5905.02 ms

f16_w_uint32.py

[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0.]

gpt2.py

using HIP backend
using gpt2-medium

  0%|          | 0/293 [00:00<?, ?it/s]
ram used:  0.00 GB, wte.weight                                        :   0%|          | 0/293 [00:00<?, ?it/s]
ram used:  0.00 GB, wte.weight                                        :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.21 GB, wpe.weight                                        :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.21 GB, h.0.attn.c_attn.weight                            :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.22 GB, h.0.attn.c_attn.bias                              :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.22 GB, h.0.attn.c_proj.weight                            :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.23 GB, h.0.attn.c_proj.bias                              :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.23 GB, h.0.mlp.c_fc.weight                               :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.24 GB, h.0.mlp.c_fc.bias                                 :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.24 GB, h.0.mlp.c_proj.weight                             :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.26 GB, h.0.mlp.c_proj.bias                               :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.26 GB, h.0.ln_1.weight                                   :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.26 GB, h.0.ln_1.bias                                     :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.26 GB, h.0.ln_2.weight                                   :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.26 GB, h.0.ln_2.bias                                     :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.26 GB, h.1.attn.c_attn.weight                            :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.27 GB, h.1.attn.c_attn.bias                              :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.27 GB, h.1.attn.c_proj.weight                            :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.28 GB, h.1.attn.c_proj.bias                              :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.28 GB, h.1.mlp.c_fc.weight                               :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.29 GB, h.1.mlp.c_fc.bias                                 :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.29 GB, h.1.mlp.c_proj.weight                             :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.31 GB, h.1.mlp.c_proj.bias                               :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.31 GB, h.1.ln_1.weight                                   :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.31 GB, h.1.ln_1.bias                                     :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.31 GB, h.1.ln_2.weight                                   :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.31 GB, h.1.ln_2.bias                                     :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.31 GB, h.2.attn.c_attn.weight                            :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.32 GB, h.2.attn.c_attn.bias                              :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.32 GB, h.2.attn.c_proj.weight                            :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.33 GB, h.2.attn.c_proj.bias                              :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.33 GB, h.2.mlp.c_fc.weight                               :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.34 GB, h.2.mlp.c_fc.bias                                 :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.34 GB, h.2.mlp.c_proj.weight                             :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.36 GB, h.2.mlp.c_proj.bias                               :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.36 GB, h.2.ln_1.weight                                   :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.36 GB, h.2.ln_1.bias                                     :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.36 GB, h.2.ln_2.weight                                   :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.36 GB, h.2.ln_2.bias                                     :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.36 GB, h.3.attn.c_attn.weight                            :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.37 GB, h.3.attn.c_attn.bias                              :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.37 GB, h.3.attn.c_proj.weight                            :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.38 GB, h.3.attn.c_proj.bias                              :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.38 GB, h.3.mlp.c_fc.weight                               :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.38 GB, h.3.mlp.c_fc.weight                               :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.39 GB, h.3.mlp.c_fc.bias                                 :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.39 GB, h.3.mlp.c_proj.weight                             :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.41 GB, h.3.mlp.c_proj.bias                               :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.41 GB, h.3.ln_1.weight                                   :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.41 GB, h.3.ln_1.bias                                     :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.41 GB, h.3.ln_2.weight                                   :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.41 GB, h.3.ln_2.bias                                     :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.41 GB, h.4.attn.c_attn.weight                            :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.42 GB, h.4.attn.c_attn.bias                              :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.42 GB, h.4.attn.c_proj.weight                            :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.43 GB, h.4.attn.c_proj.bias                              :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.43 GB, h.4.mlp.c_fc.weight                               :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.45 GB, h.4.mlp.c_fc.bias                                 :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.45 GB, h.4.mlp.c_proj.weight                             :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.46 GB, h.4.mlp.c_proj.bias                               :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.46 GB, h.4.ln_1.weight                                   :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.46 GB, h.4.ln_1.bias                                     :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.46 GB, h.4.ln_2.weight                                   :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.46 GB, h.4.ln_2.bias                                     :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.46 GB, h.5.attn.c_attn.weight                            :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.47 GB, h.5.attn.c_attn.bias                              :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.47 GB, h.5.attn.c_proj.weight                            :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.48 GB, h.5.attn.c_proj.bias                              :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.48 GB, h.5.mlp.c_fc.weight                               :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.50 GB, h.5.mlp.c_fc.bias                                 :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.50 GB, h.5.mlp.c_proj.weight                             :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.51 GB, h.5.mlp.c_proj.bias                               :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.51 GB, h.5.ln_1.weight                                   :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.51 GB, h.5.ln_1.bias                                     :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.51 GB, h.5.ln_2.weight                                   :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.51 GB, h.5.ln_2.bias                                     :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.51 GB, h.6.attn.c_attn.weight                            :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.52 GB, h.6.attn.c_attn.bias                              :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.52 GB, h.6.attn.c_proj.weight                            :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.53 GB, h.6.attn.c_proj.bias                              :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.53 GB, h.6.mlp.c_fc.weight                               :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.55 GB, h.6.mlp.c_fc.bias                                 :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.55 GB, h.6.mlp.c_proj.weight                             :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.56 GB, h.6.mlp.c_proj.bias                               :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.56 GB, h.6.ln_1.weight                                   :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.56 GB, h.6.ln_1.bias                                     :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.56 GB, h.6.ln_2.weight                                   :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.56 GB, h.6.ln_2.bias                                     :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.56 GB, h.7.attn.c_attn.weight                            :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.58 GB, h.7.attn.c_attn.bias                              :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.58 GB, h.7.attn.c_proj.weight                            :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.58 GB, h.7.attn.c_proj.bias                              :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.58 GB, h.7.mlp.c_fc.weight                               :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.58 GB, h.7.mlp.c_fc.weight                               :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.60 GB, h.7.mlp.c_fc.bias                                 :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.60 GB, h.7.mlp.c_proj.weight                             :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.61 GB, h.7.mlp.c_proj.bias                               :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.61 GB, h.7.ln_1.weight                                   :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.61 GB, h.7.ln_1.bias                                     :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.61 GB, h.7.ln_2.weight                                   :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.61 GB, h.7.ln_2.bias                                     :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.61 GB, h.8.attn.c_attn.weight                            :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.63 GB, h.8.attn.c_attn.bias                              :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.63 GB, h.8.attn.c_proj.weight                            :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.63 GB, h.8.attn.c_proj.bias                              :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.63 GB, h.8.mlp.c_fc.weight                               :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.65 GB, h.8.mlp.c_fc.bias                                 :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.65 GB, h.8.mlp.c_proj.weight                             :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.66 GB, h.8.mlp.c_proj.bias                               :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.66 GB, h.8.ln_1.weight                                   :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.66 GB, h.8.ln_1.bias                                     :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.66 GB, h.8.ln_2.weight                                   :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.66 GB, h.8.ln_2.bias                                     :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.66 GB, h.9.attn.c_attn.weight                            :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.68 GB, h.9.attn.c_attn.bias                              :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.68 GB, h.9.attn.c_proj.weight                            :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.68 GB, h.9.attn.c_proj.bias                              :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.68 GB, h.9.mlp.c_fc.weight                               :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.70 GB, h.9.mlp.c_fc.bias                                 :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.70 GB, h.9.mlp.c_proj.weight                             :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.71 GB, h.9.mlp.c_proj.bias                               :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.71 GB, h.9.ln_1.weight                                   :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.71 GB, h.9.ln_1.bias                                     :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.71 GB, h.9.ln_2.weight                                   :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.71 GB, h.9.ln_2.bias                                     :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.71 GB, h.10.attn.c_attn.weight                           :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.73 GB, h.10.attn.c_attn.bias                             :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.73 GB, h.10.attn.c_proj.weight                           :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.73 GB, h.10.attn.c_proj.bias                             :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.73 GB, h.10.mlp.c_fc.weight                              :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.75 GB, h.10.mlp.c_fc.bias                                :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.75 GB, h.10.mlp.c_proj.weight                            :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.76 GB, h.10.mlp.c_proj.bias                              :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.76 GB, h.10.ln_1.weight                                  :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.76 GB, h.10.ln_1.bias                                    :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.76 GB, h.10.ln_2.weight                                  :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.76 GB, h.10.ln_2.bias                                    :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.76 GB, h.11.attn.c_attn.weight                           :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.78 GB, h.11.attn.c_attn.bias                             :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.78 GB, h.11.attn.c_proj.weight                           :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.78 GB, h.11.attn.c_proj.bias                             :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.78 GB, h.11.mlp.c_fc.weight                              :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.78 GB, h.11.mlp.c_fc.weight                              :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.80 GB, h.11.mlp.c_fc.bias                                :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.80 GB, h.11.mlp.c_proj.weight                            :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.81 GB, h.11.mlp.c_proj.bias                              :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.81 GB, h.11.ln_1.weight                                  :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.81 GB, h.11.ln_1.bias                                    :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.81 GB, h.11.ln_2.weight                                  :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.81 GB, h.11.ln_2.bias                                    :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.81 GB, h.12.attn.c_attn.weight                           :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.83 GB, h.12.attn.c_attn.bias                             :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.83 GB, h.12.attn.c_proj.weight                           :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.83 GB, h.12.attn.c_proj.bias                             :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.83 GB, h.12.mlp.c_fc.weight                              :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.85 GB, h.12.mlp.c_fc.bias                                :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.85 GB, h.12.mlp.c_proj.weight                            :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.87 GB, h.12.mlp.c_proj.bias                              :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.87 GB, h.12.ln_1.weight                                  :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.87 GB, h.12.ln_1.bias                                    :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.87 GB, h.12.ln_2.weight                                  :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.87 GB, h.12.ln_2.bias                                    :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.87 GB, h.13.attn.c_attn.weight                           :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.88 GB, h.13.attn.c_attn.bias                             :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.88 GB, h.13.attn.c_proj.weight                           :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.88 GB, h.13.attn.c_proj.bias                             :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.88 GB, h.13.mlp.c_fc.weight                              :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.90 GB, h.13.mlp.c_fc.bias                                :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.90 GB, h.13.mlp.c_proj.weight                            :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.92 GB, h.13.mlp.c_proj.bias                              :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.92 GB, h.13.ln_1.weight                                  :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.92 GB, h.13.ln_1.bias                                    :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.92 GB, h.13.ln_2.weight                                  :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.92 GB, h.13.ln_2.bias                                    :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.92 GB, h.14.attn.c_attn.weight                           :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.93 GB, h.14.attn.c_attn.bias                             :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.93 GB, h.14.attn.c_proj.weight                           :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.93 GB, h.14.attn.c_proj.bias                             :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.93 GB, h.14.mlp.c_fc.weight                              :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.95 GB, h.14.mlp.c_fc.bias                                :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.95 GB, h.14.mlp.c_proj.weight                            :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.97 GB, h.14.mlp.c_proj.bias                              :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.97 GB, h.14.ln_1.weight                                  :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.97 GB, h.14.ln_1.bias                                    :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.97 GB, h.14.ln_2.weight                                  :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.97 GB, h.14.ln_2.bias                                    :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.97 GB, h.15.attn.c_attn.weight                           :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.98 GB, h.15.attn.c_attn.bias                             :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.98 GB, h.15.attn.c_proj.weight                           :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.98 GB, h.15.attn.c_proj.bias                             :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.98 GB, h.15.mlp.c_fc.weight                              :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.98 GB, h.15.mlp.c_fc.weight                              :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.00 GB, h.15.mlp.c_fc.bias                                :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.00 GB, h.15.mlp.c_proj.weight                            :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.02 GB, h.15.mlp.c_proj.bias                              :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.02 GB, h.15.ln_1.weight                                  :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.02 GB, h.15.ln_1.bias                                    :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.02 GB, h.15.ln_2.weight                                  :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.02 GB, h.15.ln_2.bias                                    :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.02 GB, h.16.attn.c_attn.weight                           :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.03 GB, h.16.attn.c_attn.bias                             :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.03 GB, h.16.attn.c_proj.weight                           :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.03 GB, h.16.attn.c_proj.bias                             :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.03 GB, h.16.mlp.c_fc.weight                              :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.05 GB, h.16.mlp.c_fc.bias                                :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.05 GB, h.16.mlp.c_proj.weight                            :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.07 GB, h.16.mlp.c_proj.bias                              :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.07 GB, h.16.ln_1.weight                                  :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.07 GB, h.16.ln_1.bias                                    :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.07 GB, h.16.ln_2.weight                                  :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.07 GB, h.16.ln_2.bias                                    :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.07 GB, h.17.attn.c_attn.weight                           :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.08 GB, h.17.attn.c_attn.bias                             :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.08 GB, h.17.attn.c_proj.weight                           :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.08 GB, h.17.attn.c_proj.bias                             :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.08 GB, h.17.mlp.c_fc.weight                              :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.10 GB, h.17.mlp.c_fc.bias                                :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.10 GB, h.17.mlp.c_proj.weight                            :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.12 GB, h.17.mlp.c_proj.bias                              :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.12 GB, h.17.ln_1.weight                                  :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.12 GB, h.17.ln_1.bias                                    :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.12 GB, h.17.ln_2.weight                                  :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.12 GB, h.17.ln_2.bias                                    :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.12 GB, h.18.attn.c_attn.weight                           :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.13 GB, h.18.attn.c_attn.bias                             :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.13 GB, h.18.attn.c_proj.weight                           :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.13 GB, h.18.attn.c_proj.bias                             :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.13 GB, h.18.mlp.c_fc.weight                              :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.15 GB, h.18.mlp.c_fc.bias                                :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.15 GB, h.18.mlp.c_proj.weight                            :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.17 GB, h.18.mlp.c_proj.bias                              :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.17 GB, h.18.ln_1.weight                                  :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.17 GB, h.18.ln_1.bias                                    :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.17 GB, h.18.ln_2.weight                                  :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.17 GB, h.18.ln_2.bias                                    :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.17 GB, h.19.attn.c_attn.weight                           :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.18 GB, h.19.attn.c_attn.bias                             :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.18 GB, h.19.attn.c_proj.weight                           :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.18 GB, h.19.attn.c_proj.bias                             :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.18 GB, h.19.mlp.c_fc.weight                              :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.18 GB, h.19.mlp.c_fc.weight                              :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.20 GB, h.19.mlp.c_fc.bias                                :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.20 GB, h.19.mlp.c_proj.weight                            :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.22 GB, h.19.mlp.c_proj.bias                              :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.22 GB, h.19.ln_1.weight                                  :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.22 GB, h.19.ln_1.bias                                    :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.22 GB, h.19.ln_2.weight                                  :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.22 GB, h.19.ln_2.bias                                    :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.22 GB, h.20.attn.c_attn.weight                           :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.23 GB, h.20.attn.c_attn.bias                             :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.23 GB, h.20.attn.c_proj.weight                           :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.23 GB, h.20.attn.c_proj.bias                             :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.23 GB, h.20.mlp.c_fc.weight                              :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.25 GB, h.20.mlp.c_fc.bias                                :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.25 GB, h.20.mlp.c_proj.weight                            :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.27 GB, h.20.mlp.c_proj.bias                              :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.27 GB, h.20.ln_1.weight                                  :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.27 GB, h.20.ln_1.bias                                    :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.27 GB, h.20.ln_2.weight                                  :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.27 GB, h.20.ln_2.bias                                    :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.27 GB, h.21.attn.c_attn.weight                           :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.28 GB, h.21.attn.c_attn.bias                             :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.28 GB, h.21.attn.c_proj.weight                           :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.28 GB, h.21.attn.c_proj.bias                             :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.28 GB, h.21.mlp.c_fc.weight                              :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.30 GB, h.21.mlp.c_fc.bias                                :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.30 GB, h.21.mlp.c_proj.weight                            :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.32 GB, h.21.mlp.c_proj.bias                              :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.32 GB, h.21.ln_1.weight                                  :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.32 GB, h.21.ln_1.bias                                    :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.32 GB, h.21.ln_2.weight                                  :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.32 GB, h.21.ln_2.bias                                    :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.32 GB, h.22.attn.c_attn.weight                           :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.33 GB, h.22.attn.c_attn.bias                             :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.33 GB, h.22.attn.c_proj.weight                           :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.34 GB, h.22.attn.c_proj.bias                             :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.34 GB, h.22.mlp.c_fc.weight                              :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.35 GB, h.22.mlp.c_fc.bias                                :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.35 GB, h.22.mlp.c_proj.weight                            :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.37 GB, h.22.mlp.c_proj.bias                              :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.37 GB, h.22.ln_1.weight                                  :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.37 GB, h.22.ln_1.bias                                    :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.37 GB, h.22.ln_2.weight                                  :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.37 GB, h.22.ln_2.bias                                    :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.37 GB, h.23.attn.c_attn.weight                           :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.38 GB, h.23.attn.c_attn.bias                             :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.38 GB, h.23.attn.c_proj.weight                           :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.39 GB, h.23.attn.c_proj.bias                             :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.39 GB, h.23.mlp.c_fc.weight                              :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.39 GB, h.23.mlp.c_fc.weight                              :  97%|█████████▋| 283/293 [00:00<00:00, 412.40it/s]
ram used:  1.40 GB, h.23.mlp.c_fc.bias                                :  97%|█████████▋| 283/293 [00:00<00:00, 412.40it/s]
ram used:  1.40 GB, h.23.mlp.c_proj.weight                            :  97%|█████████▋| 283/293 [00:00<00:00, 412.40it/s]
ram used:  1.42 GB, h.23.mlp.c_proj.bias                              :  97%|█████████▋| 283/293 [00:00<00:00, 412.40it/s]
ram used:  1.42 GB, h.23.ln_1.weight                                  :  97%|█████████▋| 283/293 [00:00<00:00, 412.40it/s]
ram used:  1.42 GB, h.23.ln_1.bias                                    :  97%|█████████▋| 283/293 [00:00<00:00, 412.40it/s]
ram used:  1.42 GB, h.23.ln_2.weight                                  :  97%|█████████▋| 283/293 [00:00<00:00, 412.40it/s]
ram used:  1.42 GB, h.23.ln_2.bias                                    :  97%|█████████▋| 283/293 [00:00<00:00, 412.40it/s]
ram used:  1.42 GB, ln_f.weight                                       :  97%|█████████▋| 283/293 [00:00<00:00, 412.40it/s]
ram used:  1.42 GB, ln_f.bias                                         :  97%|█████████▋| 283/293 [00:00<00:00, 412.40it/s]
ram used:  1.42 GB, lm_head.weight                                    :  97%|█████████▋| 283/293 [00:00<00:00, 412.40it/s]
ram used:  1.42 GB, lm_head.weight                                    : 100%|██████████| 293/293 [00:01<00:00, 284.19it/s]
loaded weights in 1034.68 ms, 1.63 GB loaded at 1.57 GB/s

  0%|          | 0/100 [00:00<?, ?it/s]
  1%|          | 1/100 [00:00<00:23,  4.14it/s]
  2%|▏         | 2/100 [00:00<00:23,  4.14it/s]
  3%|▎         | 3/100 [00:00<00:19,  5.00it/s]
  4%|▍         | 4/100 [00:00<00:17,  5.54it/s]
  5%|▌         | 5/100 [00:00<00:16,  5.90it/s]
  6%|▌         | 6/100 [00:01<00:15,  6.14it/s]
  7%|▋         | 7/100 [00:01<00:15,  6.06it/s]
  8%|▊         | 8/100 [00:01<00:14,  6.23it/s]
  9%|▉         | 9/100 [00:01<00:14,  6.35it/s]
 10%|█         | 10/100 [00:01<00:13,  6.44it/s]
 11%|█         | 11/100 [00:01<00:13,  6.51it/s]
 12%|█▏        | 12/100 [00:02<00:13,  6.55it/s]
 13%|█▎        | 13/100 [00:02<00:13,  6.58it/s]
 14%|█▍        | 14/100 [00:02<00:13,  6.60it/s]
 15%|█▌        | 15/100 [00:02<00:13,  6.37it/s]
 16%|█▌        | 16/100 [00:02<00:13,  6.45it/s]
 17%|█▋        | 17/100 [00:02<00:12,  6.51it/s]
 18%|█▊        | 18/100 [00:02<00:12,  6.56it/s]
 19%|█▉        | 19/100 [00:03<00:12,  6.59it/s]
 20%|██        | 20/100 [00:03<00:12,  6.60it/s]
 21%|██        | 21/100 [00:03<00:11,  6.62it/s]
 22%|██▏       | 22/100 [00:03<00:11,  6.64it/s]
 23%|██▎       | 23/100 [00:03<00:12,  6.38it/s]
 24%|██▍       | 24/100 [00:03<00:11,  6.46it/s]
 25%|██▌       | 25/100 [00:03<00:11,  6.51it/s]
 26%|██▌       | 26/100 [00:04<00:11,  6.56it/s]
 27%|██▋       | 27/100 [00:04<00:11,  6.59it/s]
 28%|██▊       | 28/100 [00:04<00:10,  6.61it/s]
 29%|██▉       | 29/100 [00:04<00:10,  6.63it/s]
 30%|███       | 30/100 [00:04<00:10,  6.64it/s]
 31%|███       | 31/100 [00:04<00:10,  6.36it/s]
 32%|███▏      | 32/100 [00:05<00:10,  6.43it/s]
 33%|███▎      | 33/100 [00:05<00:10,  6.49it/s]
 34%|███▍      | 34/100 [00:05<00:10,  6.54it/s]
 35%|███▌      | 35/100 [00:05<00:09,  6.58it/s]
 36%|███▌      | 36/100 [00:05<00:09,  6.59it/s]
 37%|███▋      | 37/100 [00:05<00:09,  6.61it/s]
 38%|███▊      | 38/100 [00:05<00:09,  6.62it/s]
 39%|███▉      | 39/100 [00:06<00:09,  6.33it/s]
 40%|████      | 40/100 [00:06<00:09,  6.41it/s]
 41%|████      | 41/100 [00:06<00:09,  6.47it/s]
 42%|████▏     | 42/100 [00:06<00:08,  6.52it/s]
 43%|████▎     | 43/100 [00:06<00:08,  6.55it/s]
 44%|████▍     | 44/100 [00:06<00:08,  6.58it/s]
 45%|████▌     | 45/100 [00:07<00:08,  6.60it/s]
 46%|████▌     | 46/100 [00:07<00:08,  6.61it/s]
 47%|████▋     | 47/100 [00:07<00:08,  6.61it/s]
 48%|████▊     | 48/100 [00:07<00:08,  6.32it/s]
 49%|████▉     | 49/100 [00:07<00:07,  6.39it/s]
 50%|█████     | 50/100 [00:07<00:07,  6.46it/s]
 51%|█████     | 51/100 [00:07<00:07,  6.52it/s]
 52%|█████▏    | 52/100 [00:08<00:07,  6.55it/s]
 53%|█████▎    | 53/100 [00:08<00:07,  6.58it/s]
 54%|█████▍    | 54/100 [00:08<00:06,  6.59it/s]
 55%|█████▌    | 55/100 [00:08<00:06,  6.61it/s]
 56%|█████▌    | 56/100 [00:08<00:06,  6.62it/s]
 57%|█████▋    | 57/100 [00:08<00:06,  6.63it/s]
 58%|█████▊    | 58/100 [00:09<00:06,  6.28it/s]
 59%|█████▉    | 59/100 [00:09<00:06,  6.38it/s]
 60%|██████    | 60/100 [00:09<00:06,  6.45it/s]
 61%|██████    | 61/100 [00:09<00:05,  6.50it/s]
 62%|██████▏   | 62/100 [00:09<00:05,  6.55it/s]
 63%|██████▎   | 63/100 [00:09<00:05,  6.58it/s]
 64%|██████▍   | 64/100 [00:09<00:05,  6.60it/s]
 65%|██████▌   | 65/100 [00:10<00:05,  6.62it/s]
 66%|██████▌   | 66/100 [00:10<00:05,  6.63it/s]
 67%|██████▋   | 67/100 [00:10<00:04,  6.64it/s]
 68%|██████▊   | 68/100 [00:10<00:05,  6.27it/s]
 69%|██████▉   | 69/100 [00:10<00:04,  6.37it/s]
 70%|███████   | 70/100 [00:10<00:04,  6.45it/s]
 71%|███████   | 71/100 [00:11<00:04,  6.50it/s]
 72%|███████▏  | 72/100 [00:11<00:04,  6.54it/s]
 73%|███████▎  | 73/100 [00:11<00:04,  6.58it/s]
 74%|███████▍  | 74/100 [00:11<00:03,  6.59it/s]
 75%|███████▌  | 75/100 [00:11<00:03,  6.60it/s]
 76%|███████▌  | 76/100 [00:11<00:03,  6.61it/s]
 77%|███████▋  | 77/100 [00:11<00:03,  6.62it/s]
 78%|███████▊  | 78/100 [00:12<00:03,  6.24it/s]
 79%|███████▉  | 79/100 [00:12<00:03,  6.34it/s]
 80%|████████  | 80/100 [00:12<00:03,  6.42it/s]
 81%|████████  | 81/100 [00:12<00:02,  6.48it/s]
 82%|████████▏ | 82/100 [00:12<00:02,  6.52it/s]
 83%|████████▎ | 83/100 [00:12<00:02,  6.55it/s]
 84%|████████▍ | 84/100 [00:13<00:02,  6.58it/s]
 85%|████████▌ | 85/100 [00:13<00:02,  6.60it/s]
 86%|████████▌ | 86/100 [00:13<00:02,  6.60it/s]
 87%|████████▋ | 87/100 [00:13<00:01,  6.61it/s]
 88%|████████▊ | 88/100 [00:13<00:01,  6.62it/s]
 89%|████████▉ | 89/100 [00:13<00:01,  6.20it/s]
 90%|█████████ | 90/100 [00:13<00:01,  6.31it/s]
 91%|█████████ | 91/100 [00:14<00:01,  6.40it/s]
 92%|█████████▏| 92/100 [00:14<00:01,  6.47it/s]
 93%|█████████▎| 93/100 [00:14<00:01,  6.51it/s]
 94%|█████████▍| 94/100 [00:14<00:00,  6.55it/s]
 95%|█████████▌| 95/100 [00:14<00:00,  6.58it/s]
 96%|█████████▌| 96/100 [00:14<00:00,  6.60it/s]
 97%|█████████▋| 97/100 [00:15<00:00,  6.61it/s]
 98%|█████████▊| 98/100 [00:15<00:00,  6.62it/s]
 99%|█████████▉| 99/100 [00:15<00:00,  6.63it/s]
100%|██████████| 100/100 [00:15<00:00,  6.19it/s]
100%|██████████| 100/100 [00:15<00:00,  6.44it/s]
Generating text...
What is the answer to life, the universe, and everything? You can't. If you can solve it, you'll find a way. But don't try to do it alone. See what I have done, and what you can do to solve it. The universe has no solution."

Note that Domino is not referring to the existence of God: He is referencing the existence of nine Upside-Down Order-Holes.

P.S. I just re-read the RIT book, pointed out this in the comments,

handcode_resnet50_opt.py

optimizing for HIP
***    2.25 ms : kernel  0 r_64_8_7_7_2_16_4_3_7_4_4_7           [49, 8, 64]        [4, 16, 2]   takes    2.25 ms,   6881 GFLOPS
***    2.58 ms : kernel  1 r_2048_7_7_2_8_8_3_3                  [7, 7, 2048]       [8, 8, 2]    takes    0.33 ms,    351 GFLOPS
***    2.77 ms : kernel  2 r_64_2_49_8_16_16_4_4_4               [49, 2, 64]        [16, 8]      takes    0.19 ms,   9245 GFLOPS
***    3.72 ms : kernel  3 r_64_2_7_7_8_8_2_64_4_4_3_3           [49, 2, 64]        [2, 8, 8]    takes    0.95 ms,  15765 GFLOPS
***    4.38 ms : kernel  4 r_64_8_49_8_16_16_4_4_4               [49, 8, 64]        [16, 8]      takes    0.66 ms,   9984 GFLOPS
***    5.09 ms : kernel  5 r_64_8_49_8_16_16_4_4_4n1             [49, 8, 64]        [16, 8]      takes    0.71 ms,  10228 GFLOPS
***    5.66 ms : kernel  6 r_64_2_49_8_16_64_4_4_4               [49, 2, 64]        [16, 8]      takes    0.57 ms,  11626 GFLOPS
***    6.60 ms : kernel  7 r_64_2_7_7_8_8_2_64_4_4_3_3n1         [49, 2, 64]        [2, 8, 8]    takes    0.95 ms,  15765 GFLOPS
***    7.32 ms : kernel  8 r_64_8_49_8_16_16_4_4_4n2             [49, 8, 64]        [16, 8]      takes    0.71 ms,   9875 GFLOPS
***    7.89 ms : kernel  9 r_64_2_49_8_16_64_4_4_4n1             [49, 2, 64]        [16, 8]      takes    0.57 ms,  11626 GFLOPS
***    8.84 ms : kernel 10 r_64_2_7_7_8_8_2_64_4_4_3_3n2         [49, 2, 64]        [2, 8, 8]    takes    0.95 ms,  15765 GFLOPS
***    9.55 ms : kernel 11 r_64_8_49_8_16_16_4_4_4n3             [49, 8, 64]        [16, 8]      takes    0.71 ms,   9875 GFLOPS
***   10.38 ms : kernel 12 r_64_4_49_8_16_64_4_4_4               [49, 4, 64]        [16, 8]      takes    0.83 ms,  16069 GFLOPS
***   11.95 ms : kernel 13 r_32_2_7_7_2_16_4_128_4_4_3_3         [49, 2, 32]        [4, 16, 2]   takes    1.57 ms,   9461 GFLOPS
***   13.60 ms : kernel 14 r_32_8_7_7_2_16_4_64_4_4_4            [49, 8, 32]        [4, 16, 2]   takes    1.65 ms,   7984 GFLOPS
***   14.37 ms : kernel 15 r_32_8_49_2_16_4_32_4_4_4             [49, 8, 32]        [4, 16, 2]   takes    0.77 ms,   8986 GFLOPS
***   15.19 ms : kernel 16 r_32_2_49_2_16_4_128_4_4_4            [49, 2, 32]        [4, 16, 2]   takes    0.82 ms,   8049 GFLOPS
***   16.47 ms : kernel 17 r_32_2_7_7_2_16_4_128_4_4_3_3n1       [49, 2, 32]        [4, 16, 2]   takes    1.28 ms,  11623 GFLOPS
***   17.25 ms : kernel 18 r_32_8_49_2_16_4_32_4_4_4n1           [49, 8, 32]        [4, 16, 2]   takes    0.78 ms,   8731 GFLOPS
***   18.07 ms : kernel 19 r_32_2_49_2_16_4_128_4_4_4n1          [49, 2, 32]        [4, 16, 2]   takes    0.82 ms,   8049 GFLOPS
***   19.35 ms : kernel 20 r_32_2_7_7_2_16_4_128_4_4_3_3n2       [49, 2, 32]        [4, 16, 2]   takes    1.28 ms,  11623 GFLOPS
***   20.13 ms : kernel 21 r_32_8_49_2_16_4_32_4_4_4n2           [49, 8, 32]        [4, 16, 2]   takes    0.78 ms,   8731 GFLOPS
***   20.95 ms : kernel 22 r_32_2_49_2_16_4_128_4_4_4n2          [49, 2, 32]        [4, 16, 2]   takes    0.82 ms,   8049 GFLOPS
***   22.23 ms : kernel 23 r_32_2_7_7_2_16_4_128_4_4_3_3n3       [49, 2, 32]        [4, 16, 2]   takes    1.28 ms,  11623 GFLOPS
***   23.01 ms : kernel 24 r_32_8_49_2_16_4_32_4_4_4n3           [49, 8, 32]        [4, 16, 2]   takes    0.78 ms,   8731 GFLOPS
***   24.26 ms : kernel 25 r_32_4_49_2_16_4_128_4_4_4            [49, 4, 32]        [4, 16, 2]   takes    1.25 ms,  10619 GFLOPS
***   26.39 ms : kernel 26 r_16_4_7_7_16_2_2_256_4_4_3_3         [49, 4, 16]        [2, 2, 16]   takes    2.13 ms,   6957 GFLOPS
***   29.24 ms : kernel 27 r_16_16_7_7_16_2_2_128_4_4_4          [49, 16, 16]       [2, 2, 16]   takes    2.85 ms,   4618 GFLOPS
***   30.42 ms : kernel 28 r_8_16_49_8_16_64_4_4_4               [49, 16, 8]        [16, 8]      takes    1.18 ms,   5702 GFLOPS
***   31.83 ms : kernel 29 r_8_4_49_8_16_256_4_4_4               [49, 4, 8]         [16, 8]      takes    1.41 ms,   4669 GFLOPS
***   33.35 ms : kernel 30 r_16_4_7_7_16_2_2_256_4_4_3_3n1       [49, 4, 16]        [2, 2, 16]   takes    1.52 ms,   9776 GFLOPS
***   34.55 ms : kernel 31 r_8_16_49_8_16_64_4_4_4n1             [49, 16, 8]        [16, 8]      takes    1.20 ms,   5569 GFLOPS
***   35.97 ms : kernel 32 r_8_4_49_8_16_256_4_4_4n1             [49, 4, 8]         [16, 8]      takes    1.41 ms,   4669 GFLOPS
***   37.48 ms : kernel 33 r_16_4_7_7_16_2_2_256_4_4_3_3n2       [49, 4, 16]        [2, 2, 16]   takes    1.52 ms,   9776 GFLOPS
***   38.68 ms : kernel 34 r_8_16_49_8_16_64_4_4_4n2             [49, 16, 8]        [16, 8]      takes    1.20 ms,   5569 GFLOPS
***   40.10 ms : kernel 35 r_8_4_49_8_16_256_4_4_4n2             [49, 4, 8]         [16, 8]      takes    1.41 ms,   4669 GFLOPS
***   41.61 ms : kernel 36 r_16_4_7_7_16_2_2_256_4_4_3_3n3       [49, 4, 16]        [2, 2, 16]   takes    1.52 ms,   9776 GFLOPS
***   42.82 ms : kernel 37 r_8_16_49_8_16_64_4_4_4n3             [49, 16, 8]        [16, 8]      takes    1.20 ms,   5569 GFLOPS
***   44.23 ms : kernel 38 r_8_4_49_8_16_256_4_4_4n3             [49, 4, 8]         [16, 8]      takes    1.41 ms,   4669 GFLOPS
***   45.75 ms : kernel 39 r_16_4_7_7_16_2_2_256_4_4_3_3n4       [49, 4, 16]        [2, 2, 16]   takes    1.52 ms,   9776 GFLOPS
***   46.95 ms : kernel 40 r_8_16_49_8_16_64_4_4_4n4             [49, 16, 8]        [16, 8]      takes    1.20 ms,   5569 GFLOPS
***   48.36 ms : kernel 41 r_8_4_49_8_16_256_4_4_4n4             [49, 4, 8]         [16, 8]      takes    1.41 ms,   4669 GFLOPS
***   49.88 ms : kernel 42 r_16_4_7_7_16_2_2_256_4_4_3_3n5       [49, 4, 16]        [2, 2, 16]   takes    1.52 ms,   9776 GFLOPS
***   51.08 ms : kernel 43 r_8_16_49_8_16_64_4_4_4n5             [49, 16, 8]        [16, 8]      takes    1.20 ms,   5569 GFLOPS
***   53.78 ms : kernel 44 r_8_8_49_8_16_256_4_4_4               [49, 8, 8]         [16, 8]      takes    2.70 ms,   4896 GFLOPS
***  100.26 ms : kernel 45 r_8_8_8_16_512_3_3_7_7_4              [8, 8]             [16, 8]      takes   46.48 ms,    319 GFLOPS
***  104.56 ms : kernel 46 r_2_32_7_7_8_16_256_4_4_4             [49, 32, 2]        [16, 8]      takes    4.31 ms,   3055 GFLOPS
***  106.54 ms : kernel 47 r_2_32_49_8_16_128_4_4_4              [49, 32, 2]        [16, 8]      takes    1.98 ms,   3367 GFLOPS
***  108.96 ms : kernel 48 r_2_8_49_8_16_512_4_4_4               [49, 8, 2]         [16, 8]      takes    2.41 ms,   2731 GFLOPS
***  126.40 ms : kernel 49 r_8_8_8_16_512_3_3_7_7_4n1            [8, 8]             [16, 8]      takes   17.44 ms,    849 GFLOPS
***  128.35 ms : kernel 50 r_2_32_49_8_16_128_4_4_4n1            [49, 32, 2]        [16, 8]      takes    1.95 ms,   3401 GFLOPS
***  130.76 ms : kernel 51 r_2_8_49_8_16_512_4_4_4n1             [49, 8, 2]         [16, 8]      takes    2.41 ms,   2731 GFLOPS
***  148.20 ms : kernel 52 r_8_8_8_16_512_3_3_7_7_4n2            [8, 8]             [16, 8]      takes   17.44 ms,    849 GFLOPS
***  150.15 ms : kernel 53 r_2_32_49_8_16_128_4_4_4n2            [49, 32, 2]        [16, 8]      takes    1.95 ms,   3401 GFLOPS
***  150.24 ms : kernel 54 r_1024_32_49_4                        [1024]             [32]         takes    0.08 ms,     79 GFLOPS
***  150.43 ms : kernel 55 r_125_16_2_512_4_4_4                  [125]              [2, 16]      takes    0.19 ms,   1382 GFLOPS
***  150.45 ms : kernel 56 r_2_32_250_4                          [2]                [32]         takes    0.03 ms,      2 GFLOPS
***  150.53 ms : kernel 57 r_2_32_250_4n1                        [2]                [32]         takes    0.07 ms,      3 GFLOPS
***  150.54 ms : kernel 58 E_2_125_32_2_4                        [125, 2]           [2, 32]      takes    0.01 ms,      9 GFLOPS
******* total 150.54 ms,   3515 GFLOPS
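
Note, in the table above the three `r_8_8_8_16_512_3_3_7_7_4` kernels (45, 49 and 52) take 46.48 + 17.44 + 17.44 = 81.36 ms, roughly 54% of the 150.54 ms total. A minimal sketch for pulling the slowest kernels out of output like this follows. It assumes the line layout shown above. `KERNEL_LINE`, `parse_kernels` and `slowest` are hypothetical helper names for illustration, not part of tinygrad.

```python
import re

# Assumed layout, copied from the output above:
# ***  <cumulative> ms : kernel <index> <name> [global] [local] takes <ms> ms, <gflops> GFLOPS
KERNEL_LINE = re.compile(
    r"\*\*\*\s+(?P<cumulative>[\d.]+) ms : kernel\s+(?P<index>\d+)\s+(?P<name>\S+)"
    r".*?takes\s+(?P<ms>[\d.]+) ms,\s+(?P<gflops>\d+) GFLOPS"
)


def parse_kernels(log_text):
    """Return one dict per kernel line found in log_text (hypothetical helper)."""
    rows = []
    for line in log_text.splitlines():
        match = KERNEL_LINE.search(line)
        if match:  # the "total" line and unrelated lines do not match and are skipped
            rows.append({
                "index": int(match["index"]),
                "name": match["name"],
                "ms": float(match["ms"]),
                "gflops": int(match["gflops"]),
                "cumulative_ms": float(match["cumulative"]),
            })
    return rows


def slowest(rows, n=3):
    """Return the n rows with the largest per-kernel time, slowest first."""
    return sorted(rows, key=lambda row: row["ms"], reverse=True)[:n]


# Three lines and the total line copied from the output above.
sample = """\
***   53.78 ms : kernel 44 r_8_8_49_8_16_256_4_4_4               [49, 8, 8]         [16, 8]      takes    2.70 ms,   4896 GFLOPS
***  100.26 ms : kernel 45 r_8_8_8_16_512_3_3_7_7_4              [8, 8]             [16, 8]      takes   46.48 ms,    319 GFLOPS
***  104.56 ms : kernel 46 r_2_32_7_7_8_16_256_4_4_4             [49, 32, 2]        [16, 8]      takes    4.31 ms,   3055 GFLOPS
******* total 150.54 ms,   3515 GFLOPS
"""

rows = parse_kernels(sample)
for row in slowest(rows, n=2):
    print(f"kernel {row['index']:2d} {row['name']}: {row['ms']:.2f} ms")
```

Given the note in the introduction that builds were running at the same time, any ranking produced this way is only a hint about where time went in this particular run.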

hlb_cifar10.py

shuffling training dataset in 1337.90 ms (epoch=0)
  0 15108.61 ms run, 15098.47 ms python,   10.13 ms HIP, 1198.26 loss, 0.000015 LR, 0.84 GB used,     44.80 GFLOPS,    676.91 GOPS
  1 5837.29 ms run, 5834.97 ms python,    2.32 ms HIP, 1197.49 loss, 0.000030 LR, 4.54 GB used,    115.64 GFLOPS,    675.05 GOPS
  2   74.37 ms run,    4.43 ms python,   69.93 ms HIP, 1188.08 loss, 0.000045 LR, 4.54 GB used,   9077.39 GFLOPS,    675.05 GOPS
  3   73.70 ms run,    2.85 ms python,   70.85 ms HIP, 1171.24 loss, 0.000060 LR, 4.54 GB used,   9159.11 GFLOPS,    675.05 GOPS
  4   71.96 ms run,    2.86 ms python,   69.10 ms HIP, 1160.91 loss, 0.000075 LR, 4.54 GB used,   9380.72 GFLOPS,    675.05 GOPS
  5   70.66 ms run,    2.79 ms python,   67.87 ms HIP, 1158.75 loss, 0.000090 LR, 4.54 GB used,   9553.50 GFLOPS,    675.05 GOPS
  6   70.78 ms run,    2.77 ms python,   68.02 ms HIP, 1149.44 loss, 0.000105 LR, 4.54 GB used,   9537.02 GFLOPS,    675.05 GOPS
  7   69.50 ms run,    2.79 ms python,   66.71 ms HIP, 1172.52 loss, 0.000120 LR, 4.54 GB used,   9713.08 GFLOPS,    675.05 GOPS
  8   69.50 ms run,    2.76 ms python,   66.73 ms HIP, 1143.33 loss, 0.000135 LR, 4.54 GB used,   9713.20 GFLOPS,    675.05 GOPS
  9   69.41 ms run,    2.74 ms python,   66.67 ms HIP, 1129.94 loss, 0.000149 LR, 4.54 GB used,   9725.38 GFLOPS,    675.05 GOPS
 10   69.42 ms run,    2.96 ms python,   66.46 ms HIP, 1114.84 loss, 0.000164 LR, 4.54 GB used,   9724.70 GFLOPS,    675.05 GOPS
 11   69.34 ms run,    2.75 ms python,   66.59 ms HIP, 1099.61 loss, 0.000179 LR, 4.54 GB used,   9735.30 GFLOPS,    675.05 GOPS
 12   68.99 ms run,    2.71 ms python,   66.28 ms HIP, 1092.18 loss, 0.000194 LR, 4.54 GB used,   9784.85 GFLOPS,    675.05 GOPS
 13   68.94 ms run,    2.72 ms python,   66.22 ms HIP, 1068.63 loss, 0.000209 LR, 4.54 GB used,   9791.70 GFLOPS,    675.05 GOPS
 14   68.94 ms run,    2.74 ms python,   66.19 ms HIP, 1066.79 loss, 0.000224 LR, 4.54 GB used,   9792.07 GFLOPS,    675.05 GOPS
 15   69.61 ms run,    2.74 ms python,   66.86 ms HIP, 1065.05 loss, 0.000239 LR, 4.54 GB used,   9698.12 GFLOPS,    675.05 GOPS
 16   68.99 ms run,    2.76 ms python,   66.23 ms HIP, 1030.24 loss, 0.000254 LR, 4.54 GB used,   9785.08 GFLOPS,    675.05 GOPS
 17   69.05 ms run,    2.74 ms python,   66.31 ms HIP, 1034.32 loss, 0.000269 LR, 4.54 GB used,   9776.24 GFLOPS,    675.05 GOPS
 18   69.09 ms run,    2.79 ms python,   66.30 ms HIP, 1012.50 loss, 0.000284 LR, 4.54 GB used,   9770.48 GFLOPS,    675.05 GOPS
 19   68.84 ms run,    2.75 ms python,   66.10 ms HIP,  995.76 loss, 0.000299 LR, 4.54 GB used,   9805.71 GFLOPS,    675.05 GOPS
 20   69.74 ms run,    2.72 ms python,   67.02 ms HIP,  985.93 loss, 0.000314 LR, 4.54 GB used,   9679.54 GFLOPS,    675.05 GOPS
 21   69.37 ms run,    2.67 ms python,   66.69 ms HIP,  970.91 loss, 0.000329 LR, 4.54 GB used,   9731.73 GFLOPS,    675.05 GOPS
 22   69.33 ms run,    2.76 ms python,   66.58 ms HIP,  978.42 loss, 0.000344 LR, 4.54 GB used,   9736.07 GFLOPS,    675.05 GOPS
 23   69.28 ms run,    2.69 ms python,   66.58 ms HIP,  993.64 loss, 0.000359 LR, 4.54 GB used,   9744.05 GFLOPS,    675.05 GOPS
 24   69.03 ms run,    2.72 ms python,   66.31 ms HIP,  950.65 loss, 0.000374 LR, 4.54 GB used,   9779.28 GFLOPS,    675.05 GOPS
 25   69.20 ms run,    2.72 ms python,   66.48 ms HIP,  928.26 loss, 0.000389 LR, 4.54 GB used,   9754.64 GFLOPS,    675.05 GOPS
 26   69.84 ms run,    2.67 ms python,   67.16 ms HIP,  941.32 loss, 0.000404 LR, 4.54 GB used,   9665.76 GFLOPS,    675.05 GOPS
 27   69.34 ms run,    2.70 ms python,   66.64 ms HIP,  932.31 loss, 0.000418 LR, 4.54 GB used,   9734.64 GFLOPS,    675.05 GOPS
 28   69.07 ms run,    2.75 ms python,   66.32 ms HIP,  928.35 loss, 0.000433 LR, 4.54 GB used,   9773.78 GFLOPS,    675.05 GOPS
 29   68.97 ms run,    2.76 ms python,   66.22 ms HIP,  896.15 loss, 0.000448 LR, 4.54 GB used,   9787.18 GFLOPS,    675.05 GOPS
 30   68.71 ms run,    2.74 ms python,   65.97 ms HIP,  928.76 loss, 0.000463 LR, 4.54 GB used,   9824.42 GFLOPS,    675.05 GOPS
 31   69.22 ms run,    2.78 ms python,   66.44 ms HIP,  907.65 loss, 0.000478 LR, 4.54 GB used,   9751.77 GFLOPS,    675.05 GOPS
 32   69.47 ms run,    2.75 ms python,   66.73 ms HIP,  892.33 loss, 0.000493 LR, 4.54 GB used,   9716.56 GFLOPS,    675.05 GOPS
 33   68.84 ms run,    2.69 ms python,   66.15 ms HIP,  873.42 loss, 0.000508 LR, 4.54 GB used,   9805.90 GFLOPS,    675.05 GOPS
 34   69.59 ms run,    2.72 ms python,   66.86 ms HIP,  873.31 loss, 0.000523 LR, 4.54 GB used,   9700.65 GFLOPS,    675.05 GOPS
 35   69.19 ms run,    2.70 ms python,   66.49 ms HIP,  880.48 loss, 0.000538 LR, 4.54 GB used,   9757.04 GFLOPS,    675.05 GOPS
 36   69.74 ms run,    2.74 ms python,   67.01 ms HIP,  889.98 loss, 0.000553 LR, 4.54 GB used,   9678.76 GFLOPS,    675.05 GOPS
 37   68.99 ms run,    2.72 ms python,   66.27 ms HIP,  944.16 loss, 0.000568 LR, 4.54 GB used,   9784.54 GFLOPS,    675.05 GOPS
 38   69.25 ms run,    2.69 ms python,   66.56 ms HIP,  906.17 loss, 0.000583 LR, 4.54 GB used,   9747.65 GFLOPS,    675.05 GOPS
 39   69.12 ms run,    2.73 ms python,   66.40 ms HIP,  912.06 loss, 0.000598 LR, 4.54 GB used,   9765.62 GFLOPS,    675.05 GOPS
 40   69.11 ms run,    2.68 ms python,   66.42 ms HIP,  872.09 loss, 0.000613 LR, 4.54 GB used,   9768.14 GFLOPS,    675.05 GOPS
 41   68.29 ms run,    2.70 ms python,   65.59 ms HIP,  847.76 loss, 0.000628 LR, 4.54 GB used,   9885.47 GFLOPS,    675.05 GOPS
 42   69.38 ms run,    2.74 ms python,   66.64 ms HIP,  847.61 loss, 0.000643 LR, 4.54 GB used,   9729.01 GFLOPS,    675.05 GOPS
 43   69.86 ms run,    2.73 ms python,   67.12 ms HIP,  852.14 loss, 0.000658 LR, 4.54 GB used,   9663.11 GFLOPS,    675.05 GOPS
 44   69.20 ms run,    2.74 ms python,   66.47 ms HIP,  858.28 loss, 0.000673 LR, 4.54 GB used,   9754.62 GFLOPS,    675.05 GOPS
 45   69.28 ms run,    2.71 ms python,   66.57 ms HIP,  883.35 loss, 0.000688 LR, 4.54 GB used,   9743.96 GFLOPS,    675.05 GOPS
 46   69.97 ms run,    2.66 ms python,   67.31 ms HIP,  854.11 loss, 0.000702 LR, 4.54 GB used,   9647.80 GFLOPS,    675.05 GOPS
 47   69.51 ms run,    2.72 ms python,   66.78 ms HIP,  807.23 loss, 0.000717 LR, 4.54 GB used,   9712.03 GFLOPS,    675.05 GOPS
 48   69.31 ms run,    2.71 ms python,   66.60 ms HIP,  809.48 loss, 0.000732 LR, 4.54 GB used,   9739.65 GFLOPS,    675.05 GOPS
 49   69.46 ms run,    2.70 ms python,   66.76 ms HIP,  838.32 loss, 0.000747 LR, 4.54 GB used,   9718.94 GFLOPS,    675.05 GOPS
 50   69.30 ms run,    2.72 ms python,   66.58 ms HIP,  825.54 loss, 0.000762 LR, 4.54 GB used,   9741.24 GFLOPS,    675.05 GOPS
 51   69.55 ms run,    2.67 ms python,   66.88 ms HIP,  781.49 loss, 0.000777 LR, 4.54 GB used,   9706.30 GFLOPS,    675.05 GOPS
 52   69.25 ms run,    2.71 ms python,   66.54 ms HIP,  841.42 loss, 0.000792 LR, 4.54 GB used,   9747.96 GFLOPS,    675.05 GOPS
 53   69.24 ms run,    2.66 ms python,   66.58 ms HIP,  798.52 loss, 0.000807 LR, 4.54 GB used,   9748.76 GFLOPS,    675.05 GOPS
 54   69.38 ms run,    2.68 ms python,   66.70 ms HIP,  809.88 loss, 0.000822 LR, 4.54 GB used,   9729.31 GFLOPS,    675.05 GOPS
 55   69.59 ms run,    2.67 ms python,   66.92 ms HIP,  823.67 loss, 0.000837 LR, 4.54 GB used,   9700.17 GFLOPS,    675.05 GOPS
 56   68.85 ms run,    2.74 ms python,   66.11 ms HIP,  815.30 loss, 0.000852 LR, 4.54 GB used,   9804.06 GFLOPS,    675.05 GOPS
 57   69.44 ms run,    2.70 ms python,   66.74 ms HIP,  800.62 loss, 0.000867 LR, 4.54 GB used,   9721.16 GFLOPS,    675.05 GOPS
 58   69.37 ms run,    2.74 ms python,   66.63 ms HIP,  782.18 loss, 0.000882 LR, 4.54 GB used,   9731.23 GFLOPS,    675.05 GOPS
 59   69.41 ms run,    2.72 ms python,   66.69 ms HIP,  811.72 loss, 0.000897 LR, 4.54 GB used,   9725.58 GFLOPS,    675.05 GOPS
 60   68.69 ms run,    2.70 ms python,   65.99 ms HIP,  834.58 loss, 0.000912 LR, 4.54 GB used,   9826.86 GFLOPS,    675.05 GOPS
 61   70.02 ms run,    2.67 ms python,   67.35 ms HIP,  817.35 loss, 0.000927 LR, 4.54 GB used,   9640.55 GFLOPS,    675.05 GOPS
 62   69.54 ms run,    2.67 ms python,   66.87 ms HIP,  840.39 loss, 0.000942 LR, 4.54 GB used,   9707.34 GFLOPS,    675.05 GOPS
 63   69.11 ms run,    2.89 ms python,   66.22 ms HIP,  789.49 loss, 0.000957 LR, 4.54 GB used,   9767.09 GFLOPS,    675.05 GOPS
 64   69.37 ms run,    2.79 ms python,   66.58 ms HIP,  767.33 loss, 0.000971 LR, 4.54 GB used,   9730.40 GFLOPS,    675.05 GOPS
 65   68.84 ms run,    2.74 ms python,   66.10 ms HIP,  735.83 loss, 0.000986 LR, 4.54 GB used,   9806.38 GFLOPS,    675.05 GOPS
 66   69.71 ms run,    2.81 ms python,   66.90 ms HIP,  767.32 loss, 0.001001 LR, 4.54 GB used,   9683.70 GFLOPS,    675.05 GOPS
 67   69.49 ms run,    2.73 ms python,   66.76 ms HIP,  740.48 loss, 0.001016 LR, 4.54 GB used,   9714.26 GFLOPS,    675.05 GOPS
 68   69.04 ms run,    2.74 ms python,   66.31 ms HIP,  754.44 loss, 0.001031 LR, 4.54 GB used,   9777.48 GFLOPS,    675.05 GOPS
 69   68.58 ms run,    2.78 ms python,   65.80 ms HIP,  751.04 loss, 0.001046 LR, 4.54 GB used,   9843.00 GFLOPS,    675.05 GOPS
 70   69.71 ms run,    2.75 ms python,   66.95 ms HIP,  758.91 loss, 0.001061 LR, 4.54 GB used,   9684.11 GFLOPS,    675.05 GOPS
 71   69.41 ms run,    2.76 ms python,   66.64 ms HIP,  753.18 loss, 0.001076 LR, 4.54 GB used,   9725.77 GFLOPS,    675.05 GOPS
 72   69.59 ms run,    2.72 ms python,   66.87 ms HIP,  770.21 loss, 0.001091 LR, 4.54 GB used,   9699.87 GFLOPS,    675.05 GOPS
 73   69.39 ms run,    2.72 ms python,   66.67 ms HIP,  758.43 loss, 0.001106 LR, 4.54 GB used,   9727.67 GFLOPS,    675.05 GOPS
 74   69.18 ms run,    2.72 ms python,   66.46 ms HIP,  734.02 loss, 0.001121 LR, 4.54 GB used,   9757.70 GFLOPS,    675.05 GOPS
 75   68.85 ms run,    2.75 ms python,   66.09 ms HIP,  737.91 loss, 0.001136 LR, 4.54 GB used,   9805.03 GFLOPS,    675.05 GOPS
 76   69.30 ms run,    2.71 ms python,   66.59 ms HIP,  727.93 loss, 0.001151 LR, 4.54 GB used,   9741.20 GFLOPS,    675.05 GOPS
 77   69.46 ms run,    2.74 ms python,   66.71 ms HIP,  746.44 loss, 0.001166 LR, 4.54 GB used,   9719.09 GFLOPS,    675.05 GOPS
 78   69.37 ms run,    2.76 ms python,   66.62 ms HIP,  729.42 loss, 0.001181 LR, 4.54 GB used,   9731.01 GFLOPS,    675.05 GOPS
 79   69.36 ms run,    2.86 ms python,   66.50 ms HIP,  763.18 loss, 0.001196 LR, 4.54 GB used,   9731.99 GFLOPS,    675.05 GOPS
 80   69.00 ms run,    2.73 ms python,   66.27 ms HIP,  728.07 loss, 0.001211 LR, 4.54 GB used,   9783.30 GFLOPS,    675.05 GOPS
 81   69.05 ms run,    2.75 ms python,   66.30 ms HIP,  732.20 loss, 0.001226 LR, 4.54 GB used,   9776.00 GFLOPS,    675.05 GOPS
 82   69.80 ms run,    2.73 ms python,   67.08 ms HIP,  731.84 loss, 0.001240 LR, 4.54 GB used,   9670.57 GFLOPS,    675.05 GOPS
 83   69.34 ms run,    2.72 ms python,   66.62 ms HIP,  723.89 loss, 0.001255 LR, 4.54 GB used,   9735.15 GFLOPS,    675.05 GOPS
 84   69.09 ms run,    2.75 ms python,   66.34 ms HIP,  716.49 loss, 0.001270 LR, 4.54 GB used,   9770.21 GFLOPS,    675.05 GOPS
 85   68.92 ms run,    2.73 ms python,   66.19 ms HIP,  721.01 loss, 0.001285 LR, 4.54 GB used,   9795.17 GFLOPS,    675.05 GOPS
 86   69.31 ms run,    2.73 ms python,   66.58 ms HIP,  726.47 loss, 0.001300 LR, 4.54 GB used,   9739.78 GFLOPS,    675.05 GOPS
 87   69.16 ms run,    2.83 ms python,   66.33 ms HIP,  743.07 loss, 0.001315 LR, 4.54 GB used,   9761.19 GFLOPS,    675.05 GOPS
 88   69.55 ms run,    2.77 ms python,   66.78 ms HIP,  751.18 loss, 0.001330 LR, 4.54 GB used,   9706.38 GFLOPS,    675.05 GOPS
 89   69.36 ms run,    2.74 ms python,   66.61 ms HIP,  720.70 loss, 0.001345 LR, 4.54 GB used,   9732.68 GFLOPS,    675.05 GOPS
 90   69.07 ms run,    2.72 ms python,   66.35 ms HIP,  715.80 loss, 0.001360 LR, 4.54 GB used,   9773.50 GFLOPS,    675.05 GOPS
 91   68.65 ms run,    2.73 ms python,   65.91 ms HIP,  716.98 loss, 0.001375 LR, 4.54 GB used,   9833.84 GFLOPS,    675.05 GOPS
 92   69.23 ms run,    2.79 ms python,   66.45 ms HIP,  715.26 loss, 0.001390 LR, 4.54 GB used,   9750.10 GFLOPS,    675.05 GOPS
 93   69.16 ms run,    2.74 ms python,   66.42 ms HIP,  692.31 loss, 0.001405 LR, 4.54 GB used,   9760.43 GFLOPS,    675.05 GOPS
 94   69.43 ms run,    2.71 ms python,   66.72 ms HIP,  678.61 loss, 0.001420 LR, 4.54 GB used,   9723.11 GFLOPS,    675.05 GOPS
 95   69.07 ms run,    2.76 ms python,   66.31 ms HIP,  702.36 loss, 0.001435 LR, 4.54 GB used,   9772.82 GFLOPS,    675.05 GOPS
 96   68.79 ms run,    2.72 ms python,   66.07 ms HIP,  657.52 loss, 0.001450 LR, 4.54 GB used,   9813.28 GFLOPS,    675.05 GOPS
 97   68.94 ms run,    2.75 ms python,   66.19 ms HIP,  665.23 loss, 0.001465 LR, 4.54 GB used,   9792.20 GFLOPS,    675.05 GOPS
shuffling training dataset in 755.98 ms (epoch=1)
 98  831.82 ms run,  759.18 ms python,   72.64 ms HIP,  665.56 loss, 0.001480 LR, 4.54 GB used,    811.88 GFLOPS,    675.34 GOPS
 99   73.72 ms run,    2.85 ms python,   70.87 ms HIP,  695.91 loss, 0.001495 LR, 4.54 GB used,   9157.12 GFLOPS,    675.05 GOPS
100   72.84 ms run,    2.77 ms python,   70.07 ms HIP,  715.10 loss, 0.001510 LR, 4.54 GB used,   9267.20 GFLOPS,    675.05 GOPS
101   71.20 ms run,    2.87 ms python,   68.33 ms HIP,  703.98 loss, 0.001524 LR, 4.54 GB used,   9480.52 GFLOPS,    675.05 GOPS
102   71.12 ms run,    2.77 ms python,   68.34 ms HIP,  691.39 loss, 0.001539 LR, 4.54 GB used,   9492.06 GFLOPS,    675.05 GOPS
103   69.92 ms run,    2.75 ms python,   67.17 ms HIP,  678.40 loss, 0.001554 LR, 4.54 GB used,   9655.06 GFLOPS,    675.05 GOPS
104   69.41 ms run,    2.71 ms python,   66.70 ms HIP,  679.19 loss, 0.001569 LR, 4.54 GB used,   9725.23 GFLOPS,    675.05 GOPS
105   69.77 ms run,    2.70 ms python,   67.07 ms HIP,  684.38 loss, 0.001584 LR, 4.54 GB used,   9675.54 GFLOPS,    675.05 GOPS
106   69.57 ms run,    2.72 ms python,   66.85 ms HIP,  679.57 loss, 0.001599 LR, 4.54 GB used,   9702.74 GFLOPS,    675.05 GOPS
107   69.07 ms run,    2.76 ms python,   66.31 ms HIP,  675.04 loss, 0.001614 LR, 4.54 GB used,   9773.89 GFLOPS,    675.05 GOPS
108   69.64 ms run,    2.76 ms python,   66.88 ms HIP,  663.53 loss, 0.001629 LR, 4.54 GB used,   9693.30 GFLOPS,    675.05 GOPS
109   70.40 ms run,    2.83 ms python,   67.57 ms HIP,  669.80 loss, 0.001644 LR, 4.54 GB used,   9589.22 GFLOPS,    675.05 GOPS
110   69.53 ms run,    2.72 ms python,   66.82 ms HIP,  675.51 loss, 0.001659 LR, 4.54 GB used,   9708.04 GFLOPS,    675.05 GOPS
111   68.61 ms run,    2.80 ms python,   65.81 ms HIP,  675.21 loss, 0.001674 LR, 4.54 GB used,   9838.29 GFLOPS,    675.05 GOPS
112   68.84 ms run,    2.69 ms python,   66.15 ms HIP,  697.70 loss, 0.001689 LR, 4.54 GB used,   9806.06 GFLOPS,    675.05 GOPS
113   68.99 ms run,    2.69 ms python,   66.30 ms HIP,  699.45 loss, 0.001704 LR, 4.54 GB used,   9785.10 GFLOPS,    675.05 GOPS
114   69.07 ms run,    2.74 ms python,   66.34 ms HIP,  666.35 loss, 0.001719 LR, 4.54 GB used,   9772.85 GFLOPS,    675.05 GOPS
115   69.38 ms run,    2.73 ms python,   66.65 ms HIP,  685.84 loss, 0.001734 LR, 4.54 GB used,   9729.75 GFLOPS,    675.05 GOPS
116   69.55 ms run,    2.82 ms python,   66.74 ms HIP,  675.04 loss, 0.001749 LR, 4.54 GB used,   9705.29 GFLOPS,    675.05 GOPS
117   69.10 ms run,    2.72 ms python,   66.37 ms HIP,  659.46 loss, 0.001764 LR, 4.54 GB used,   9769.32 GFLOPS,    675.05 GOPS
118   69.31 ms run,    2.72 ms python,   66.59 ms HIP,  664.42 loss, 0.001779 LR, 4.54 GB used,   9739.88 GFLOPS,    675.05 GOPS
119   69.29 ms run,    2.69 ms python,   66.60 ms HIP,  687.12 loss, 0.001793 LR, 4.54 GB used,   9742.33 GFLOPS,    675.05 GOPS
120   69.46 ms run,    2.71 ms python,   66.76 ms HIP,  702.90 loss, 0.001808 LR, 4.54 GB used,   9717.85 GFLOPS,    675.05 GOPS
121   69.46 ms run,    2.74 ms python,   66.72 ms HIP,  705.35 loss, 0.001823 LR, 4.54 GB used,   9718.82 GFLOPS,    675.05 GOPS
122   69.70 ms run,    2.76 ms python,   66.94 ms HIP,  664.24 loss, 0.001838 LR, 4.54 GB used,   9685.62 GFLOPS,    675.05 GOPS
123   69.16 ms run,    2.75 ms python,   66.41 ms HIP,  670.65 loss, 0.001853 LR, 4.54 GB used,   9760.12 GFLOPS,    675.05 GOPS
124   69.02 ms run,    2.68 ms python,   66.34 ms HIP,  659.85 loss, 0.001868 LR, 4.54 GB used,   9780.20 GFLOPS,    675.05 GOPS
125   69.18 ms run,    2.79 ms python,   66.39 ms HIP,  668.93 loss, 0.001883 LR, 4.54 GB used,   9757.52 GFLOPS,    675.05 GOPS
126   69.38 ms run,    2.70 ms python,   66.68 ms HIP,  661.00 loss, 0.001898 LR, 4.54 GB used,   9729.95 GFLOPS,    675.05 GOPS
127   69.16 ms run,    2.74 ms python,   66.42 ms HIP,  654.85 loss, 0.001913 LR, 4.54 GB used,   9760.06 GFLOPS,    675.05 GOPS
128   69.01 ms run,    2.74 ms python,   66.27 ms HIP,  676.15 loss, 0.001928 LR, 4.54 GB used,   9781.70 GFLOPS,    675.05 GOPS
129   68.86 ms run,    2.70 ms python,   66.16 ms HIP,  676.86 loss, 0.001943 LR, 4.54 GB used,   9802.63 GFLOPS,    675.05 GOPS
130   68.56 ms run,    2.70 ms python,   65.86 ms HIP,  663.13 loss, 0.001958 LR, 4.54 GB used,   9846.06 GFLOPS,    675.05 GOPS
131   69.10 ms run,    2.68 ms python,   66.41 ms HIP,  645.75 loss, 0.001973 LR, 4.54 GB used,   9769.36 GFLOPS,    675.05 GOPS
132   70.37 ms run,    2.70 ms python,   67.67 ms HIP,  670.99 loss, 0.001988 LR, 4.54 GB used,   9593.29 GFLOPS,    675.05 GOPS
133   69.58 ms run,    2.70 ms python,   66.88 ms HIP,  661.26 loss, 0.002003 LR, 4.54 GB used,   9701.80 GFLOPS,    675.05 GOPS
134   69.39 ms run,    2.69 ms python,   66.71 ms HIP,  669.69 loss, 0.002018 LR, 4.54 GB used,   9727.68 GFLOPS,    675.05 GOPS
135   69.94 ms run,    2.70 ms python,   67.24 ms HIP,  673.50 loss, 0.002033 LR, 4.54 GB used,   9651.33 GFLOPS,    675.05 GOPS
136   70.10 ms run,    2.76 ms python,   67.34 ms HIP,  657.75 loss, 0.002048 LR, 4.54 GB used,   9630.03 GFLOPS,    675.05 GOPS
137   70.11 ms run,    2.72 ms python,   67.38 ms HIP,  660.81 loss, 0.002063 LR, 4.54 GB used,   9628.57 GFLOPS,    675.05 GOPS
138   69.64 ms run,    2.73 ms python,   66.91 ms HIP,  671.17 loss, 0.002077 LR, 4.54 GB used,   9693.96 GFLOPS,    675.05 GOPS
139   69.33 ms run,    2.72 ms python,   66.61 ms HIP,  688.35 loss, 0.002092 LR, 4.54 GB used,   9736.29 GFLOPS,    675.05 GOPS
140   69.38 ms run,    2.71 ms python,   66.67 ms HIP,  648.27 loss, 0.002107 LR, 4.54 GB used,   9729.40 GFLOPS,    675.05 GOPS
141   70.56 ms run,    2.72 ms python,   67.84 ms HIP,  645.85 loss, 0.002122 LR, 4.54 GB used,   9567.45 GFLOPS,    675.05 GOPS
142   69.73 ms run,    2.69 ms python,   67.04 ms HIP,  665.99 loss, 0.002137 LR, 4.54 GB used,   9680.96 GFLOPS,    675.05 GOPS
143   69.40 ms run,    2.80 ms python,   66.60 ms HIP,  692.06 loss, 0.002152 LR, 4.54 GB used,   9726.19 GFLOPS,    675.05 GOPS
144   68.76 ms run,    2.68 ms python,   66.08 ms HIP,  675.49 loss, 0.002167 LR, 4.54 GB used,   9817.55 GFLOPS,    675.05 GOPS
145   68.66 ms run,    2.68 ms python,   65.99 ms HIP,  715.16 loss, 0.002182 LR, 4.54 GB used,   9831.15 GFLOPS,    675.05 GOPS
146   69.04 ms run,    2.71 ms python,   66.33 ms HIP,  681.11 loss, 0.002197 LR, 4.54 GB used,   9777.15 GFLOPS,    675.05 GOPS
147   69.47 ms run,    2.67 ms python,   66.79 ms HIP,  713.74 loss, 0.002212 LR, 4.54 GB used,   9717.48 GFLOPS,    675.05 GOPS
148   69.13 ms run,    2.67 ms python,   66.47 ms HIP,  696.30 loss, 0.002227 LR, 4.54 GB used,   9764.18 GFLOPS,    675.05 GOPS
149   69.17 ms run,    2.67 ms python,   66.50 ms HIP,  651.40 loss, 0.002242 LR, 4.54 GB used,   9759.92 GFLOPS,    675.05 GOPS
150   69.74 ms run,    2.74 ms python,   67.00 ms HIP,  656.05 loss, 0.002257 LR, 4.54 GB used,   9680.08 GFLOPS,    675.05 GOPS
151   69.64 ms run,    2.68 ms python,   66.96 ms HIP,  659.93 loss, 0.002272 LR, 4.54 GB used,   9692.70 GFLOPS,    675.05 GOPS
152   70.08 ms run,    2.73 ms python,   67.35 ms HIP,  655.41 loss, 0.002287 LR, 4.54 GB used,   9633.18 GFLOPS,    675.05 GOPS
153   69.50 ms run,    2.71 ms python,   66.79 ms HIP,  642.93 loss, 0.002302 LR, 4.54 GB used,   9713.06 GFLOPS,    675.05 GOPS
154   69.17 ms run,    2.66 ms python,   66.51 ms HIP,  661.86 loss, 0.002317 LR, 4.54 GB used,   9758.88 GFLOPS,    675.05 GOPS
155   69.83 ms run,    2.70 ms python,   67.13 ms HIP,  656.04 loss, 0.002332 LR, 4.54 GB used,   9667.62 GFLOPS,    675.05 GOPS
156   69.37 ms run,    2.71 ms python,   66.66 ms HIP,  671.41 loss, 0.002346 LR, 4.54 GB used,   9731.45 GFLOPS,    675.05 GOPS
157   69.66 ms run,    2.75 ms python,   66.90 ms HIP,  670.28 loss, 0.002361 LR, 4.54 GB used,   9691.02 GFLOPS,    675.05 GOPS
158   70.13 ms run,    2.73 ms python,   67.39 ms HIP,  653.53 loss, 0.002376 LR, 4.54 GB used,   9625.92 GFLOPS,    675.05 GOPS
159   69.86 ms run,    2.68 ms python,   67.18 ms HIP,  645.35 loss, 0.002391 LR, 4.54 GB used,   9662.92 GFLOPS,    675.05 GOPS
160   69.78 ms run,    2.70 ms python,   67.08 ms HIP,  667.87 loss, 0.002406 LR, 4.54 GB used,   9674.32 GFLOPS,    675.05 GOPS
161   68.98 ms run,    2.72 ms python,   66.26 ms HIP,  646.49 loss, 0.002421 LR, 4.54 GB used,   9786.22 GFLOPS,    675.05 GOPS
162   69.74 ms run,    2.67 ms python,   67.07 ms HIP,  649.51 loss, 0.002436 LR, 4.54 GB used,   9679.95 GFLOPS,    675.05 GOPS
163   69.49 ms run,    2.71 ms python,   66.79 ms HIP,  643.96 loss, 0.002451 LR, 4.54 GB used,   9714.14 GFLOPS,    675.05 GOPS
164   69.13 ms run,    2.68 ms python,   66.45 ms HIP,  656.23 loss, 0.002466 LR, 4.54 GB used,   9764.87 GFLOPS,    675.05 GOPS
165   69.57 ms run,    2.66 ms python,   66.90 ms HIP,  670.91 loss, 0.002481 LR, 4.54 GB used,   9703.63 GFLOPS,    675.05 GOPS
166   69.08 ms run,    2.66 ms python,   66.42 ms HIP,  653.54 loss, 0.002496 LR, 4.54 GB used,   9771.72 GFLOPS,    675.05 GOPS
167   69.15 ms run,    2.69 ms python,   66.46 ms HIP,  664.16 loss, 0.002511 LR, 4.54 GB used,   9762.65 GFLOPS,    675.05 GOPS
168   69.67 ms run,    2.68 ms python,   66.98 ms HIP,  649.77 loss, 0.002526 LR, 4.54 GB used,   9689.66 GFLOPS,    675.05 GOPS
169   69.39 ms run,    2.67 ms python,   66.72 ms HIP,  644.44 loss, 0.002541 LR, 4.54 GB used,   9728.23 GFLOPS,    675.05 GOPS
170   69.52 ms run,    2.71 ms python,   66.81 ms HIP,  629.33 loss, 0.002556 LR, 4.54 GB used,   9710.44 GFLOPS,    675.05 GOPS
171   69.34 ms run,    2.73 ms python,   66.61 ms HIP,  655.48 loss, 0.002571 LR, 4.54 GB used,   9735.60 GFLOPS,    675.05 GOPS
172   69.39 ms run,    2.70 ms python,   66.70 ms HIP,  669.01 loss, 0.002586 LR, 4.54 GB used,   9727.61 GFLOPS,    675.05 GOPS
173   68.89 ms run,    2.68 ms python,   66.21 ms HIP,  678.96 loss, 0.002601 LR, 4.54 GB used,   9798.66 GFLOPS,    675.05 GOPS
174   69.34 ms run,    2.71 ms python,   66.63 ms HIP,  695.76 loss, 0.002615 LR, 4.54 GB used,   9735.53 GFLOPS,    675.05 GOPS
175   68.74 ms run,    2.71 ms python,   66.04 ms HIP,  657.40 loss, 0.002630 LR, 4.54 GB used,   9819.56 GFLOPS,    675.05 GOPS
176   69.06 ms run,    2.68 ms python,   66.37 ms HIP,  649.10 loss, 0.002645 LR, 4.54 GB used,   9775.22 GFLOPS,    675.05 GOPS
177   69.97 ms run,    2.66 ms python,   67.30 ms HIP,  640.65 loss, 0.002660 LR, 4.54 GB used,   9648.05 GFLOPS,    675.05 GOPS
178   69.51 ms run,    2.69 ms python,   66.82 ms HIP,  627.96 loss, 0.002675 LR, 4.54 GB used,   9711.71 GFLOPS,    675.05 GOPS
179   68.84 ms run,    2.64 ms python,   66.20 ms HIP,  677.60 loss, 0.002690 LR, 4.54 GB used,   9805.53 GFLOPS,    675.05 GOPS
180   68.97 ms run,    2.68 ms python,   66.29 ms HIP,  646.24 loss, 0.002705 LR, 4.54 GB used,   9787.94 GFLOPS,    675.05 GOPS
181   69.45 ms run,    2.67 ms python,   66.78 ms HIP,  667.96 loss, 0.002720 LR, 4.54 GB used,   9720.16 GFLOPS,    675.05 GOPS
182   69.08 ms run,    2.66 ms python,   66.42 ms HIP,  629.36 loss, 0.002735 LR, 4.54 GB used,   9771.26 GFLOPS,    675.05 GOPS
183   69.24 ms run,    2.66 ms python,   66.58 ms HIP,  662.15 loss, 0.002750 LR, 4.54 GB used,   9749.65 GFLOPS,    675.05 GOPS
184   69.35 ms run,    2.67 ms python,   66.68 ms HIP,  655.82 loss, 0.002765 LR, 4.54 GB used,   9733.97 GFLOPS,    675.05 GOPS
185   69.36 ms run,    2.67 ms python,   66.68 ms HIP,  660.38 loss, 0.002780 LR, 4.54 GB used,   9733.08 GFLOPS,    675.05 GOPS
186   69.16 ms run,    2.68 ms python,   66.48 ms HIP,  653.15 loss, 0.002795 LR, 4.54 GB used,   9760.66 GFLOPS,    675.05 GOPS
187   69.70 ms run,    2.73 ms python,   66.97 ms HIP,  660.77 loss, 0.002810 LR, 4.54 GB used,   9685.19 GFLOPS,    675.05 GOPS
188   69.29 ms run,    2.68 ms python,   66.62 ms HIP,  639.66 loss, 0.002825 LR, 4.54 GB used,   9741.91 GFLOPS,    675.05 GOPS
189   69.36 ms run,    2.68 ms python,   66.68 ms HIP,  677.11 loss, 0.002840 LR, 4.54 GB used,   9732.29 GFLOPS,    675.05 GOPS
190   68.75 ms run,    2.77 ms python,   65.97 ms HIP,  657.45 loss, 0.002855 LR, 4.54 GB used,   9819.44 GFLOPS,    675.05 GOPS
191   69.18 ms run,    2.70 ms python,   66.48 ms HIP,  657.44 loss, 0.002870 LR, 4.54 GB used,   9757.84 GFLOPS,    675.05 GOPS
192   69.28 ms run,    2.70 ms python,   66.58 ms HIP,  644.28 loss, 0.002885 LR, 4.54 GB used,   9743.48 GFLOPS,    675.05 GOPS
193   69.36 ms run,    2.64 ms python,   66.72 ms HIP,  664.24 loss, 0.002899 LR, 4.54 GB used,   9732.33 GFLOPS,    675.05 GOPS
194   69.23 ms run,    2.66 ms python,   66.57 ms HIP,  658.53 loss, 0.002914 LR, 4.54 GB used,   9751.01 GFLOPS,    675.05 GOPS
195   69.44 ms run,    2.65 ms python,   66.79 ms HIP,  625.03 loss, 0.002929 LR, 4.54 GB used,   9721.33 GFLOPS,    675.05 GOPS
shuffling training dataset in 755.85 ms (epoch=2)
196  832.06 ms run,  759.15 ms python,   72.91 ms HIP,  659.99 loss, 0.002944 LR, 4.54 GB used,    811.65 GFLOPS,    675.34 GOPS
197   73.49 ms run,    2.80 ms python,   70.69 ms HIP,  634.10 loss, 0.002959 LR, 4.54 GB used,   9185.29 GFLOPS,    675.05 GOPS
198   72.67 ms run,    2.79 ms python,   69.89 ms HIP,  629.92 loss, 0.002974 LR, 4.54 GB used,   9288.55 GFLOPS,    675.05 GOPS
199   71.07 ms run,    2.72 ms python,   68.35 ms HIP,  619.03 loss, 0.002989 LR, 4.54 GB used,   9498.03 GFLOPS,    675.05 GOPS
200   70.13 ms run,    2.70 ms python,   67.43 ms HIP,  622.81 loss, 0.003004 LR, 4.54 GB used,   9625.49 GFLOPS,    675.05 GOPS
201   69.80 ms run,    2.69 ms python,   67.11 ms HIP,  653.30 loss, 0.003019 LR, 4.54 GB used,   9671.01 GFLOPS,    675.05 GOPS
202   70.01 ms run,    2.66 ms python,   67.35 ms HIP,  669.36 loss, 0.003034 LR, 4.54 GB used,   9642.53 GFLOPS,    675.05 GOPS
203   69.19 ms run,    2.68 ms python,   66.51 ms HIP,  636.66 loss, 0.003049 LR, 4.54 GB used,   9756.09 GFLOPS,    675.05 GOPS
204   69.73 ms run,    2.70 ms python,   67.03 ms HIP,  638.22 loss, 0.003064 LR, 4.54 GB used,   9681.18 GFLOPS,    675.05 GOPS
205   70.08 ms run,    2.67 ms python,   67.41 ms HIP,  637.03 loss, 0.003079 LR, 4.54 GB used,   9631.95 GFLOPS,    675.05 GOPS
206   69.47 ms run,    2.73 ms python,   66.74 ms HIP,  650.79 loss, 0.003094 LR, 4.54 GB used,   9716.82 GFLOPS,    675.05 GOPS
207   69.78 ms run,    2.73 ms python,   67.05 ms HIP,  626.88 loss, 0.003109 LR, 4.54 GB used,   9674.44 GFLOPS,    675.05 GOPS
208   69.31 ms run,    2.66 ms python,   66.65 ms HIP,  640.41 loss, 0.003124 LR, 4.54 GB used,   9739.37 GFLOPS,    675.05 GOPS
209   70.36 ms run,    2.71 ms python,   67.65 ms HIP,  655.87 loss, 0.003139 LR, 4.54 GB used,   9593.98 GFLOPS,    675.05 GOPS
210   69.85 ms run,    2.67 ms python,   67.18 ms HIP,  641.10 loss, 0.003154 LR, 4.54 GB used,   9663.86 GFLOPS,    675.05 GOPS
211   69.48 ms run,    2.67 ms python,   66.80 ms HIP,  612.88 loss, 0.003168 LR, 4.54 GB used,   9716.12 GFLOPS,    675.05 GOPS
212   69.22 ms run,    2.71 ms python,   66.51 ms HIP,  620.64 loss, 0.003183 LR, 4.54 GB used,   9751.83 GFLOPS,    675.05 GOPS
213   69.32 ms run,    2.73 ms python,   66.59 ms HIP,  632.55 loss, 0.003198 LR, 4.54 GB used,   9737.44 GFLOPS,    675.05 GOPS
214   69.71 ms run,    2.69 ms python,   67.02 ms HIP,  647.31 loss, 0.003213 LR, 4.54 GB used,   9683.01 GFLOPS,    675.05 GOPS
215   69.61 ms run,    2.72 ms python,   66.89 ms HIP,  656.43 loss, 0.003228 LR, 4.54 GB used,   9697.12 GFLOPS,    675.05 GOPS
216   69.15 ms run,    2.66 ms python,   66.49 ms HIP,  625.25 loss, 0.003243 LR, 4.54 GB used,   9761.35 GFLOPS,    675.05 GOPS
217   69.35 ms run,    2.67 ms python,   66.67 ms HIP,  657.98 loss, 0.003258 LR, 4.54 GB used,   9733.98 GFLOPS,    675.05 GOPS
218   69.91 ms run,    2.73 ms python,   67.19 ms HIP,  629.03 loss, 0.003273 LR, 4.54 GB used,   9655.39 GFLOPS,    675.05 GOPS
219   69.45 ms run,    2.76 ms python,   66.69 ms HIP,  627.14 loss, 0.003288 LR, 4.54 GB used,   9719.18 GFLOPS,    675.05 GOPS
220   69.20 ms run,    2.68 ms python,   66.52 ms HIP,  627.50 loss, 0.003303 LR, 4.54 GB used,   9754.88 GFLOPS,    675.05 GOPS
221   69.24 ms run,    2.70 ms python,   66.54 ms HIP,  640.78 loss, 0.003318 LR, 4.54 GB used,   9749.73 GFLOPS,    675.05 GOPS
222   69.41 ms run,    2.77 ms python,   66.64 ms HIP,  645.41 loss, 0.003333 LR, 4.54 GB used,   9724.79 GFLOPS,    675.05 GOPS
223   69.49 ms run,    2.70 ms python,   66.79 ms HIP,  653.10 loss, 0.003348 LR, 4.54 GB used,   9714.20 GFLOPS,    675.05 GOPS
224   69.62 ms run,    2.69 ms python,   66.93 ms HIP,  638.37 loss, 0.003363 LR, 4.54 GB used,   9695.58 GFLOPS,    675.05 GOPS
225   69.07 ms run,    2.66 ms python,   66.41 ms HIP,  638.95 loss, 0.003378 LR, 4.54 GB used,   9773.32 GFLOPS,    675.05 GOPS
226   68.82 ms run,    2.67 ms python,   66.15 ms HIP,  620.94 loss, 0.003393 LR, 4.54 GB used,   9808.29 GFLOPS,    675.05 GOPS
227   68.87 ms run,    2.68 ms python,   66.19 ms HIP,  627.10 loss, 0.003408 LR, 4.54 GB used,   9801.68 GFLOPS,    675.05 GOPS
228   69.23 ms run,    2.68 ms python,   66.54 ms HIP,  638.22 loss, 0.003423 LR, 4.54 GB used,   9751.06 GFLOPS,    675.05 GOPS
229   69.44 ms run,    2.65 ms python,   66.78 ms HIP,  630.57 loss, 0.003437 LR, 4.54 GB used,   9721.73 GFLOPS,    675.05 GOPS
230   68.87 ms run,    2.66 ms python,   66.21 ms HIP,  641.26 loss, 0.003433 LR, 4.54 GB used,   9802.19 GFLOPS,    675.05 GOPS
231   69.57 ms run,    2.75 ms python,   66.82 ms HIP,  643.91 loss, 0.003429 LR, 4.54 GB used,   9703.59 GFLOPS,    675.05 GOPS
232   69.32 ms run,    2.68 ms python,   66.64 ms HIP,  635.29 loss, 0.003424 LR, 4.54 GB used,   9737.75 GFLOPS,    675.05 GOPS
233   69.05 ms run,    2.71 ms python,   66.34 ms HIP,  662.56 loss, 0.003420 LR, 4.54 GB used,   9776.68 GFLOPS,    675.05 GOPS
234   69.36 ms run,    2.66 ms python,   66.69 ms HIP,  676.33 loss, 0.003416 LR, 4.54 GB used,   9732.88 GFLOPS,    675.05 GOPS
235   69.24 ms run,    2.71 ms python,   66.52 ms HIP,  685.30 loss, 0.003411 LR, 4.54 GB used,   9749.93 GFLOPS,    675.05 GOPS
236   69.62 ms run,    2.68 ms python,   66.94 ms HIP,  650.38 loss, 0.003407 LR, 4.54 GB used,   9696.14 GFLOPS,    675.05 GOPS
237   69.17 ms run,    2.66 ms python,   66.51 ms HIP,  663.55 loss, 0.003403 LR, 4.54 GB used,   9759.73 GFLOPS,    675.05 GOPS
238   69.29 ms run,    2.66 ms python,   66.63 ms HIP,  647.40 loss, 0.003398 LR, 4.54 GB used,   9742.70 GFLOPS,    675.05 GOPS
239   68.74 ms run,    2.65 ms python,   66.08 ms HIP,  619.72 loss, 0.003394 LR, 4.54 GB used,   9820.60 GFLOPS,    675.05 GOPS
240   69.47 ms run,    2.69 ms python,   66.78 ms HIP,  636.31 loss, 0.003390 LR, 4.54 GB used,   9717.35 GFLOPS,    675.05 GOPS
241   69.68 ms run,    2.69 ms python,   66.99 ms HIP,  645.92 loss, 0.003385 LR, 4.54 GB used,   9687.70 GFLOPS,    675.05 GOPS
242   69.31 ms run,    2.68 ms python,   66.63 ms HIP,  632.35 loss, 0.003381 LR, 4.54 GB used,   9738.96 GFLOPS,    675.05 GOPS
243   69.45 ms run,    2.68 ms python,   66.77 ms HIP,  633.76 loss, 0.003377 LR, 4.54 GB used,   9719.79 GFLOPS,    675.05 GOPS
244   69.14 ms run,    2.72 ms python,   66.42 ms HIP,  637.14 loss, 0.003372 LR, 4.54 GB used,   9763.85 GFLOPS,    675.05 GOPS
245   68.80 ms run,    2.66 ms python,   66.13 ms HIP,  642.28 loss, 0.003368 LR, 4.54 GB used,   9811.88 GFLOPS,    675.05 GOPS
246   69.97 ms run,    2.69 ms python,   67.29 ms HIP,  647.99 loss, 0.003364 LR, 4.54 GB used,   9647.21 GFLOPS,    675.05 GOPS
247   69.52 ms run,    2.67 ms python,   66.85 ms HIP,  619.52 loss, 0.003359 LR, 4.54 GB used,   9709.50 GFLOPS,    675.05 GOPS
248   69.24 ms run,    2.65 ms python,   66.59 ms HIP,  628.87 loss, 0.003355 LR, 4.54 GB used,   9749.61 GFLOPS,    675.05 GOPS
249   69.69 ms run,    2.68 ms python,   67.01 ms HIP,  613.62 loss, 0.003350 LR, 4.54 GB used,   9686.41 GFLOPS,    675.05 GOPS
250   68.42 ms run,    2.64 ms python,   65.78 ms HIP,  639.52 loss, 0.003346 LR, 4.54 GB used,   9866.10 GFLOPS,    675.05 GOPS
251   70.09 ms run,    2.63 ms python,   67.46 ms HIP,  633.70 loss, 0.003342 LR, 4.54 GB used,   9631.68 GFLOPS,    675.05 GOPS
252   68.83 ms run,    2.72 ms python,   66.11 ms HIP,  604.80 loss, 0.003337 LR, 4.54 GB used,   9807.18 GFLOPS,    675.05 GOPS
253   69.05 ms run,    2.72 ms python,   66.33 ms HIP,  621.34 loss, 0.003333 LR, 4.54 GB used,   9775.99 GFLOPS,    675.05 GOPS
254   69.15 ms run,    2.74 ms python,   66.41 ms HIP,  629.10 loss, 0.003329 LR, 4.54 GB used,   9762.59 GFLOPS,    675.05 GOPS
255   69.57 ms run,    2.69 ms python,   66.88 ms HIP,  639.65 loss, 0.003324 LR, 4.54 GB used,   9703.32 GFLOPS,    675.05 GOPS
256   70.20 ms run,    2.76 ms python,   67.45 ms HIP,  615.65 loss, 0.003320 LR, 4.54 GB used,   9615.88 GFLOPS,    675.05 GOPS
257   69.17 ms run,    2.69 ms python,   66.48 ms HIP,  633.85 loss, 0.003316 LR, 4.54 GB used,   9759.26 GFLOPS,    675.05 GOPS
258   69.09 ms run,    2.70 ms python,   66.38 ms HIP,  614.25 loss, 0.003311 LR, 4.54 GB used,   9770.97 GFLOPS,    675.05 GOPS
259   68.95 ms run,    2.68 ms python,   66.27 ms HIP,  616.02 loss, 0.003307 LR, 4.54 GB used,   9790.04 GFLOPS,    675.05 GOPS
260   69.49 ms run,    2.67 ms python,   66.82 ms HIP,  629.08 loss, 0.003303 LR, 4.54 GB used,   9714.19 GFLOPS,    675.05 GOPS
261   69.19 ms run,    2.66 ms python,   66.52 ms HIP,  618.21 loss, 0.003298 LR, 4.54 GB used,   9756.94 GFLOPS,    675.05 GOPS
262   69.17 ms run,    2.63 ms python,   66.54 ms HIP,  647.45 loss, 0.003294 LR, 4.54 GB used,   9758.55 GFLOPS,    675.05 GOPS
263   68.86 ms run,    2.67 ms python,   66.19 ms HIP,  623.23 loss, 0.003290 LR, 4.54 GB used,   9803.43 GFLOPS,    675.05 GOPS
264   69.24 ms run,    2.66 ms python,   66.58 ms HIP,  661.51 loss, 0.003285 LR, 4.54 GB used,   9748.80 GFLOPS,    675.05 GOPS
265   69.57 ms run,    2.66 ms python,   66.92 ms HIP,  644.22 loss, 0.003281 LR, 4.54 GB used,   9702.66 GFLOPS,    675.05 GOPS
266   68.90 ms run,    2.66 ms python,   66.24 ms HIP,  636.75 loss, 0.003276 LR, 4.54 GB used,   9797.61 GFLOPS,    675.05 GOPS
267   68.77 ms run,    2.64 ms python,   66.13 ms HIP,  642.13 loss, 0.003272 LR, 4.54 GB used,   9815.75 GFLOPS,    675.05 GOPS
268   69.19 ms run,    2.69 ms python,   66.50 ms HIP,  629.14 loss, 0.003268 LR, 4.54 GB used,   9756.40 GFLOPS,    675.05 GOPS
269   69.51 ms run,    2.67 ms python,   66.85 ms HIP,  616.75 loss, 0.003263 LR, 4.54 GB used,   9710.94 GFLOPS,    675.05 GOPS
270   69.58 ms run,    2.68 ms python,   66.90 ms HIP,  644.76 loss, 0.003259 LR, 4.54 GB used,   9701.83 GFLOPS,    675.05 GOPS
271   69.62 ms run,    2.65 ms python,   66.97 ms HIP,  644.48 loss, 0.003255 LR, 4.54 GB used,   9696.53 GFLOPS,    675.05 GOPS
272   69.69 ms run,    2.66 ms python,   67.03 ms HIP,  632.84 loss, 0.003250 LR, 4.54 GB used,   9686.25 GFLOPS,    675.05 GOPS
273   69.17 ms run,    2.78 ms python,   66.39 ms HIP,  616.68 loss, 0.003246 LR, 4.54 GB used,   9758.53 GFLOPS,    675.05 GOPS
274   69.09 ms run,    2.67 ms python,   66.42 ms HIP,  622.35 loss, 0.003242 LR, 4.54 GB used,   9770.47 GFLOPS,    675.05 GOPS
275   69.79 ms run,    2.74 ms python,   67.05 ms HIP,  645.60 loss, 0.003237 LR, 4.54 GB used,   9672.81 GFLOPS,    675.05 GOPS
276   69.70 ms run,    2.66 ms python,   67.03 ms HIP,  612.62 loss, 0.003233 LR, 4.54 GB used,   9685.45 GFLOPS,    675.05 GOPS
277   68.45 ms run,    2.70 ms python,   65.75 ms HIP,  602.66 loss, 0.003229 LR, 4.54 GB used,   9861.52 GFLOPS,    675.05 GOPS
278   69.37 ms run,    2.65 ms python,   66.72 ms HIP,  616.91 loss, 0.003224 LR, 4.54 GB used,   9730.95 GFLOPS,    675.05 GOPS
279   70.57 ms run,    2.74 ms python,   67.83 ms HIP,  630.11 loss, 0.003220 LR, 4.54 GB used,   9566.13 GFLOPS,    675.05 GOPS
280   69.70 ms run,    2.67 ms python,   67.03 ms HIP,  650.35 loss, 0.003216 LR, 4.54 GB used,   9685.41 GFLOPS,    675.05 GOPS
281   69.23 ms run,    2.67 ms python,   66.57 ms HIP,  612.47 loss, 0.003211 LR, 4.54 GB used,   9750.26 GFLOPS,    675.05 GOPS
282   69.25 ms run,    2.68 ms python,   66.57 ms HIP,  632.91 loss, 0.003207 LR, 4.54 GB used,   9748.08 GFLOPS,    675.05 GOPS
283   68.70 ms run,    2.66 ms python,   66.05 ms HIP,  602.64 loss, 0.003202 LR, 4.54 GB used,   9825.40 GFLOPS,    675.05 GOPS
284   69.28 ms run,    2.66 ms python,   66.62 ms HIP,  616.49 loss, 0.003198 LR, 4.54 GB used,   9744.36 GFLOPS,    675.05 GOPS
285   69.14 ms run,    2.68 ms python,   66.47 ms HIP,  645.61 loss, 0.003194 LR, 4.54 GB used,   9762.77 GFLOPS,    675.05 GOPS
286   69.64 ms run,    2.78 ms python,   66.86 ms HIP,  615.17 loss, 0.003189 LR, 4.54 GB used,   9693.46 GFLOPS,    675.05 GOPS
287   69.62 ms run,    2.71 ms python,   66.91 ms HIP,  642.57 loss, 0.003185 LR, 4.54 GB used,   9695.77 GFLOPS,    675.05 GOPS
288   69.24 ms run,    2.65 ms python,   66.59 ms HIP,  628.23 loss, 0.003181 LR, 4.54 GB used,   9749.11 GFLOPS,    675.05 GOPS
289   69.29 ms run,    2.63 ms python,   66.66 ms HIP,  652.79 loss, 0.003176 LR, 4.54 GB used,   9742.13 GFLOPS,    675.05 GOPS
290   69.26 ms run,    2.66 ms python,   66.60 ms HIP,  646.73 loss, 0.003172 LR, 4.54 GB used,   9746.35 GFLOPS,    675.05 GOPS
291   69.11 ms run,    2.68 ms python,   66.43 ms HIP,  608.35 loss, 0.003168 LR, 4.54 GB used,   9767.87 GFLOPS,    675.05 GOPS
292   69.11 ms run,    2.64 ms python,   66.48 ms HIP,  607.70 loss, 0.003163 LR, 4.54 GB used,   9767.05 GFLOPS,    675.05 GOPS
293   68.87 ms run,    2.66 ms python,   66.21 ms HIP,  592.41 loss, 0.003159 LR, 4.54 GB used,   9801.96 GFLOPS,    675.05 GOPS
shuffling training dataset in 756.48 ms (epoch=3)
294  832.10 ms run,  759.70 ms python,   72.40 ms HIP,  608.68 loss, 0.003155 LR, 4.54 GB used,    811.61 GFLOPS,    675.34 GOPS
295   74.06 ms run,    2.77 ms python,   71.30 ms HIP,  610.76 loss, 0.003150 LR, 4.54 GB used,   9114.37 GFLOPS,    675.05 GOPS
296   72.09 ms run,    2.76 ms python,   69.33 ms HIP,  602.13 loss, 0.003146 LR, 4.54 GB used,   9364.36 GFLOPS,    675.05 GOPS
297   71.02 ms run,    2.70 ms python,   68.32 ms HIP,  605.34 loss, 0.003142 LR, 4.54 GB used,   9505.38 GFLOPS,    675.05 GOPS
298   70.45 ms run,    2.65 ms python,   67.79 ms HIP,  596.22 loss, 0.003137 LR, 4.54 GB used,   9582.37 GFLOPS,    675.05 GOPS
299   70.28 ms run,    2.71 ms python,   67.57 ms HIP,  588.64 loss, 0.003133 LR, 4.54 GB used,   9604.62 GFLOPS,    675.05 GOPS
300   69.83 ms run,    2.73 ms python,   67.10 ms HIP,  621.56 loss, 0.003128 LR, 4.54 GB used,   9667.02 GFLOPS,    675.05 GOPS
301   69.90 ms run,    2.75 ms python,   67.15 ms HIP,  629.12 loss, 0.003124 LR, 4.54 GB used,   9657.46 GFLOPS,    675.05 GOPS
302   69.82 ms run,    2.69 ms python,   67.13 ms HIP,  612.17 loss, 0.003120 LR, 4.54 GB used,   9668.33 GFLOPS,    675.05 GOPS
303   69.37 ms run,    2.68 ms python,   66.69 ms HIP,  590.49 loss, 0.003115 LR, 4.54 GB used,   9730.96 GFLOPS,    675.05 GOPS
304   69.43 ms run,    2.69 ms python,   66.74 ms HIP,  604.18 loss, 0.003111 LR, 4.54 GB used,   9721.99 GFLOPS,    675.05 GOPS
305   69.31 ms run,    2.68 ms python,   66.62 ms HIP,  618.26 loss, 0.003107 LR, 4.54 GB used,   9740.19 GFLOPS,    675.05 GOPS
306   69.64 ms run,    2.65 ms python,   66.99 ms HIP,  601.56 loss, 0.003102 LR, 4.54 GB used,   9693.74 GFLOPS,    675.05 GOPS
307   69.08 ms run,    2.70 ms python,   66.38 ms HIP,  602.58 loss, 0.003098 LR, 4.54 GB used,   9772.04 GFLOPS,    675.05 GOPS
308   69.51 ms run,    2.66 ms python,   66.86 ms HIP,  592.20 loss, 0.003094 LR, 4.54 GB used,   9710.86 GFLOPS,    675.05 GOPS
309   68.93 ms run,    2.76 ms python,   66.18 ms HIP,  581.20 loss, 0.003089 LR, 4.54 GB used,   9793.17 GFLOPS,    675.05 GOPS
310   69.35 ms run,    2.67 ms python,   66.68 ms HIP,  595.50 loss, 0.003085 LR, 4.54 GB used,   9733.98 GFLOPS,    675.05 GOPS
311   69.28 ms run,    2.64 ms python,   66.64 ms HIP,  602.04 loss, 0.003081 LR, 4.54 GB used,   9743.81 GFLOPS,    675.05 GOPS
312   69.76 ms run,    2.67 ms python,   67.10 ms HIP,  613.41 loss, 0.003076 LR, 4.54 GB used,   9676.11 GFLOPS,    675.05 GOPS
313   69.49 ms run,    2.74 ms python,   66.75 ms HIP,  612.33 loss, 0.003072 LR, 4.54 GB used,   9714.14 GFLOPS,    675.05 GOPS
314   69.55 ms run,    2.67 ms python,   66.88 ms HIP,  601.82 loss, 0.003068 LR, 4.54 GB used,   9706.48 GFLOPS,    675.05 GOPS
315   69.46 ms run,    2.70 ms python,   66.76 ms HIP,  599.91 loss, 0.003063 LR, 4.54 GB used,   9718.26 GFLOPS,    675.05 GOPS
316   69.61 ms run,    2.73 ms python,   66.88 ms HIP,  603.31 loss, 0.003059 LR, 4.54 GB used,   9697.68 GFLOPS,    675.05 GOPS
317   69.45 ms run,    2.67 ms python,   66.78 ms HIP,  606.40 loss, 0.003054 LR, 4.54 GB used,   9720.29 GFLOPS,    675.05 GOPS
318   69.77 ms run,    2.67 ms python,   67.10 ms HIP,  599.24 loss, 0.003050 LR, 4.54 GB used,   9675.79 GFLOPS,    675.05 GOPS
319   69.22 ms run,    2.66 ms python,   66.56 ms HIP,  580.43 loss, 0.003046 LR, 4.54 GB used,   9752.39 GFLOPS,    675.05 GOPS
320   69.93 ms run,    2.67 ms python,   67.26 ms HIP,  605.67 loss, 0.003041 LR, 4.54 GB used,   9653.45 GFLOPS,    675.05 GOPS
321   69.24 ms run,    2.63 ms python,   66.61 ms HIP,  618.09 loss, 0.003037 LR, 4.54 GB used,   9748.92 GFLOPS,    675.05 GOPS
322   69.77 ms run,    2.69 ms python,   67.09 ms HIP,  624.88 loss, 0.003033 LR, 4.54 GB used,   9674.90 GFLOPS,    675.05 GOPS
323   69.21 ms run,    2.78 ms python,   66.43 ms HIP,  617.77 loss, 0.003028 LR, 4.54 GB used,   9754.27 GFLOPS,    675.05 GOPS
324   68.91 ms run,    2.65 ms python,   66.26 ms HIP,  599.90 loss, 0.003024 LR, 4.54 GB used,   9796.42 GFLOPS,    675.05 GOPS
325   69.33 ms run,    2.69 ms python,   66.64 ms HIP,  604.47 loss, 0.003020 LR, 4.54 GB used,   9736.84 GFLOPS,    675.05 GOPS
326   68.80 ms run,    2.65 ms python,   66.16 ms HIP,  616.35 loss, 0.003015 LR, 4.54 GB used,   9811.55 GFLOPS,    675.05 GOPS
327   69.53 ms run,    2.66 ms python,   66.87 ms HIP,  599.59 loss, 0.003011 LR, 4.54 GB used,   9708.45 GFLOPS,    675.05 GOPS
328   68.94 ms run,    2.67 ms python,   66.27 ms HIP,  615.25 loss, 0.003007 LR, 4.54 GB used,   9792.02 GFLOPS,    675.05 GOPS
329   69.34 ms run,    2.65 ms python,   66.69 ms HIP,  599.05 loss, 0.003002 LR, 4.54 GB used,   9735.76 GFLOPS,    675.05 GOPS
330   69.63 ms run,    2.63 ms python,   67.00 ms HIP,  623.61 loss, 0.002998 LR, 4.54 GB used,   9694.07 GFLOPS,    675.05 GOPS
331   69.41 ms run,    2.64 ms python,   66.77 ms HIP,  646.88 loss, 0.002994 LR, 4.54 GB used,   9725.43 GFLOPS,    675.05 GOPS
332   69.64 ms run,    2.74 ms python,   66.90 ms HIP,  604.86 loss, 0.002989 LR, 4.54 GB used,   9693.58 GFLOPS,    675.05 GOPS
333   69.29 ms run,    2.66 ms python,   66.63 ms HIP,  590.82 loss, 0.002985 LR, 4.54 GB used,   9742.17 GFLOPS,    675.05 GOPS
334   69.44 ms run,    2.68 ms python,   66.76 ms HIP,  600.96 loss, 0.002980 LR, 4.54 GB used,   9720.99 GFLOPS,    675.05 GOPS
335   69.36 ms run,    2.68 ms python,   66.68 ms HIP,  599.57 loss, 0.002976 LR, 4.54 GB used,   9732.55 GFLOPS,    675.05 GOPS
336   69.23 ms run,    2.66 ms python,   66.57 ms HIP,  600.89 loss, 0.002972 LR, 4.54 GB used,   9750.55 GFLOPS,    675.05 GOPS
337   69.57 ms run,    2.65 ms python,   66.92 ms HIP,  606.04 loss, 0.002967 LR, 4.54 GB used,   9702.82 GFLOPS,    675.05 GOPS
338   68.98 ms run,    2.71 ms python,   66.26 ms HIP,  600.90 loss, 0.002963 LR, 4.54 GB used,   9786.77 GFLOPS,    675.05 GOPS
339   68.97 ms run,    2.65 ms python,   66.32 ms HIP,  627.56 loss, 0.002959 LR, 4.54 GB used,   9787.82 GFLOPS,    675.05 GOPS
340   68.49 ms run,    2.68 ms python,   65.81 ms HIP,  616.45 loss, 0.002954 LR, 4.54 GB used,   9856.42 GFLOPS,    675.05 GOPS
341   68.23 ms run,    2.73 ms python,   65.50 ms HIP,  584.24 loss, 0.002950 LR, 4.54 GB used,   9893.86 GFLOPS,    675.05 GOPS
342   68.86 ms run,    2.66 ms python,   66.20 ms HIP,  603.59 loss, 0.002946 LR, 4.54 GB used,   9802.71 GFLOPS,    675.05 GOPS
343   69.03 ms run,    2.67 ms python,   66.36 ms HIP,  605.93 loss, 0.002941 LR, 4.54 GB used,   9778.69 GFLOPS,    675.05 GOPS
344   68.69 ms run,    2.68 ms python,   66.00 ms HIP,  593.60 loss, 0.002937 LR, 4.54 GB used,   9827.99 GFLOPS,    675.05 GOPS
345   69.16 ms run,    2.68 ms python,   66.49 ms HIP,  594.98 loss, 0.002933 LR, 4.54 GB used,   9760.08 GFLOPS,    675.05 GOPS
346   69.51 ms run,    2.71 ms python,   66.80 ms HIP,  576.23 loss, 0.002928 LR, 4.54 GB used,   9712.07 GFLOPS,    675.05 GOPS
347   69.89 ms run,    2.65 ms python,   67.24 ms HIP,  583.97 loss, 0.002924 LR, 4.54 GB used,   9658.74 GFLOPS,    675.05 GOPS
348   69.54 ms run,    2.66 ms python,   66.88 ms HIP,  606.75 loss, 0.002920 LR, 4.54 GB used,   9707.00 GFLOPS,    675.05 GOPS
349   69.49 ms run,    2.60 ms python,   66.88 ms HIP,  586.33 loss, 0.002915 LR, 4.54 GB used,   9714.95 GFLOPS,    675.05 GOPS
350   70.11 ms run,    2.65 ms python,   67.46 ms HIP,  623.88 loss, 0.002911 LR, 4.54 GB used,   9627.77 GFLOPS,    675.05 GOPS
351   69.72 ms run,    2.65 ms python,   67.07 ms HIP,  633.02 loss, 0.002906 LR, 4.54 GB used,   9681.56 GFLOPS,    675.05 GOPS
352   69.37 ms run,    2.68 ms python,   66.70 ms HIP,  594.73 loss, 0.002902 LR, 4.54 GB used,   9730.85 GFLOPS,    675.05 GOPS
353   69.62 ms run,    2.67 ms python,   66.95 ms HIP,  577.12 loss, 0.002898 LR, 4.54 GB used,   9695.89 GFLOPS,    675.05 GOPS
354   69.93 ms run,    2.66 ms python,   67.27 ms HIP,  617.79 loss, 0.002893 LR, 4.54 GB used,   9653.10 GFLOPS,    675.05 GOPS
355   69.53 ms run,    2.70 ms python,   66.82 ms HIP,  619.25 loss, 0.002889 LR, 4.54 GB used,   9709.34 GFLOPS,    675.05 GOPS
356   69.69 ms run,    2.68 ms python,   67.01 ms HIP,  604.83 loss, 0.002885 LR, 4.54 GB used,   9687.09 GFLOPS,    675.05 GOPS
357   69.60 ms run,    2.67 ms python,   66.93 ms HIP,  586.27 loss, 0.002880 LR, 4.54 GB used,   9699.40 GFLOPS,    675.05 GOPS
358   69.78 ms run,    2.69 ms python,   67.09 ms HIP,  608.11 loss, 0.002876 LR, 4.54 GB used,   9674.21 GFLOPS,    675.05 GOPS
359   69.36 ms run,    2.69 ms python,   66.67 ms HIP,  595.93 loss, 0.002872 LR, 4.54 GB used,   9732.25 GFLOPS,    675.05 GOPS
360   69.24 ms run,    2.70 ms python,   66.54 ms HIP,  618.43 loss, 0.002867 LR, 4.54 GB used,   9750.03 GFLOPS,    675.05 GOPS
361   68.86 ms run,    2.62 ms python,   66.23 ms HIP,  611.55 loss, 0.002863 LR, 4.54 GB used,   9803.30 GFLOPS,    675.05 GOPS
362   69.63 ms run,    2.66 ms python,   66.97 ms HIP,  605.26 loss, 0.002859 LR, 4.54 GB used,   9695.27 GFLOPS,    675.05 GOPS
363   69.32 ms run,    2.75 ms python,   66.56 ms HIP,  611.81 loss, 0.002854 LR, 4.54 GB used,   9738.57 GFLOPS,    675.05 GOPS
364   68.96 ms run,    2.68 ms python,   66.29 ms HIP,  607.24 loss, 0.002850 LR, 4.54 GB used,   9788.85 GFLOPS,    675.05 GOPS
365   69.30 ms run,    2.70 ms python,   66.60 ms HIP,  613.96 loss, 0.002846 LR, 4.54 GB used,   9740.94 GFLOPS,    675.05 GOPS
366   70.14 ms run,    2.64 ms python,   67.50 ms HIP,  616.48 loss, 0.002841 LR, 4.54 GB used,   9624.49 GFLOPS,    675.05 GOPS
367   69.24 ms run,    2.68 ms python,   66.56 ms HIP,  617.21 loss, 0.002837 LR, 4.54 GB used,   9749.79 GFLOPS,    675.05 GOPS
368   69.28 ms run,    2.68 ms python,   66.60 ms HIP,  601.58 loss, 0.002832 LR, 4.54 GB used,   9743.34 GFLOPS,    675.05 GOPS
369   69.19 ms run,    2.76 ms python,   66.43 ms HIP,  601.81 loss, 0.002828 LR, 4.54 GB used,   9756.90 GFLOPS,    675.05 GOPS
370   69.64 ms run,    2.72 ms python,   66.92 ms HIP,  608.82 loss, 0.002824 LR, 4.54 GB used,   9693.09 GFLOPS,    675.05 GOPS
371   69.56 ms run,    2.71 ms python,   66.84 ms HIP,  611.55 loss, 0.002819 LR, 4.54 GB used,   9705.04 GFLOPS,    675.05 GOPS
372   69.52 ms run,    2.69 ms python,   66.82 ms HIP,  589.56 loss, 0.002815 LR, 4.54 GB used,   9710.32 GFLOPS,    675.05 GOPS
373   69.17 ms run,    2.69 ms python,   66.48 ms HIP,  607.22 loss, 0.002811 LR, 4.54 GB used,   9759.81 GFLOPS,    675.05 GOPS
374   69.31 ms run,    2.72 ms python,   66.59 ms HIP,  605.37 loss, 0.002806 LR, 4.54 GB used,   9739.58 GFLOPS,    675.05 GOPS
375   69.51 ms run,    2.67 ms python,   66.83 ms HIP,  591.62 loss, 0.002802 LR, 4.54 GB used,   9711.56 GFLOPS,    675.05 GOPS
376   69.93 ms run,    2.65 ms python,   67.27 ms HIP,  599.20 loss, 0.002798 LR, 4.54 GB used,   9653.40 GFLOPS,    675.05 GOPS
377   69.33 ms run,    2.64 ms python,   66.68 ms HIP,  626.14 loss, 0.002793 LR, 4.54 GB used,   9737.21 GFLOPS,    675.05 GOPS
378   69.49 ms run,    2.65 ms python,   66.84 ms HIP,  600.17 loss, 0.002789 LR, 4.54 GB used,   9714.43 GFLOPS,    675.05 GOPS
379   70.17 ms run,    2.71 ms python,   67.46 ms HIP,  593.42 loss, 0.002785 LR, 4.54 GB used,   9620.51 GFLOPS,    675.05 GOPS
380   68.98 ms run,    2.64 ms python,   66.34 ms HIP,  606.59 loss, 0.002780 LR, 4.54 GB used,   9786.28 GFLOPS,    675.05 GOPS
381   68.92 ms run,    2.68 ms python,   66.24 ms HIP,  600.88 loss, 0.002776 LR, 4.54 GB used,   9795.10 GFLOPS,    675.05 GOPS
382   69.30 ms run,    2.65 ms python,   66.65 ms HIP,  581.13 loss, 0.002772 LR, 4.54 GB used,   9740.97 GFLOPS,    675.05 GOPS
383   69.86 ms run,    2.80 ms python,   67.06 ms HIP,  600.93 loss, 0.002767 LR, 4.54 GB used,   9663.49 GFLOPS,    675.05 GOPS
384   69.91 ms run,    2.66 ms python,   67.25 ms HIP,  644.06 loss, 0.002763 LR, 4.54 GB used,   9655.47 GFLOPS,    675.05 GOPS
385   71.23 ms run,    2.68 ms python,   68.55 ms HIP,  579.32 loss, 0.002758 LR, 4.54 GB used,   9476.84 GFLOPS,    675.05 GOPS
386   69.61 ms run,    2.57 ms python,   67.04 ms HIP,  606.70 loss, 0.002754 LR, 4.54 GB used,   9697.09 GFLOPS,    675.05 GOPS
387   69.46 ms run,    2.74 ms python,   66.72 ms HIP,  598.76 loss, 0.002750 LR, 4.54 GB used,   9718.86 GFLOPS,    675.05 GOPS
388   69.43 ms run,    2.63 ms python,   66.80 ms HIP,  593.29 loss, 0.002745 LR, 4.54 GB used,   9722.94 GFLOPS,    675.05 GOPS
389   69.34 ms run,    2.68 ms python,   66.65 ms HIP,  615.35 loss, 0.002741 LR, 4.54 GB used,   9735.92 GFLOPS,    675.05 GOPS
390   69.72 ms run,    2.62 ms python,   67.09 ms HIP,  584.42 loss, 0.002737 LR, 4.54 GB used,   9682.56 GFLOPS,    675.05 GOPS
391   69.64 ms run,    2.69 ms python,   66.94 ms HIP,  567.33 loss, 0.002732 LR, 4.54 GB used,   9693.97 GFLOPS,    675.05 GOPS
shuffling training dataset in 756.63 ms (epoch=4)
392  832.02 ms run,  759.77 ms python,   72.25 ms HIP,  582.36 loss, 0.002728 LR, 4.54 GB used,    811.69 GFLOPS,    675.34 GOPS
393   74.18 ms run,    2.79 ms python,   71.39 ms HIP,  568.34 loss, 0.002724 LR, 4.54 GB used,   9100.01 GFLOPS,    675.05 GOPS
394   72.54 ms run,    2.70 ms python,   69.84 ms HIP,  577.83 loss, 0.002719 LR, 4.54 GB used,   9306.23 GFLOPS,    675.05 GOPS
395   71.62 ms run,    2.70 ms python,   68.92 ms HIP,  585.91 loss, 0.002715 LR, 4.54 GB used,   9425.68 GFLOPS,    675.05 GOPS
396   70.43 ms run,    2.68 ms python,   67.76 ms HIP,  577.13 loss, 0.002711 LR, 4.54 GB used,   9584.01 GFLOPS,    675.05 GOPS
397   70.22 ms run,    2.66 ms python,   67.56 ms HIP,  569.58 loss, 0.002706 LR, 4.54 GB used,   9612.95 GFLOPS,    675.05 GOPS
398   69.91 ms run,    2.65 ms python,   67.27 ms HIP,  577.52 loss, 0.002702 LR, 4.54 GB used,   9655.27 GFLOPS,    675.05 GOPS
399   69.58 ms run,    2.76 ms python,   66.82 ms HIP,  612.04 loss, 0.002698 LR, 4.54 GB used,   9701.20 GFLOPS,    675.05 GOPS
400   69.40 ms run,    2.69 ms python,   66.72 ms HIP,  604.85 loss, 0.002693 LR, 4.54 GB used,   9726.28 GFLOPS,    675.05 GOPS
401   69.48 ms run,    2.75 ms python,   66.73 ms HIP,  596.98 loss, 0.002689 LR, 4.54 GB used,   9715.27 GFLOPS,    675.05 GOPS
402   69.22 ms run,    2.64 ms python,   66.58 ms HIP,  592.40 loss, 0.002684 LR, 4.54 GB used,   9752.47 GFLOPS,    675.05 GOPS
403   69.29 ms run,    2.69 ms python,   66.60 ms HIP,  606.83 loss, 0.002680 LR, 4.54 GB used,   9742.85 GFLOPS,    675.05 GOPS
404   68.71 ms run,    2.67 ms python,   66.05 ms HIP,  591.62 loss, 0.002676 LR, 4.54 GB used,   9824.14 GFLOPS,    675.05 GOPS
405   68.61 ms run,    2.69 ms python,   65.92 ms HIP,  583.69 loss, 0.002671 LR, 4.54 GB used,   9838.98 GFLOPS,    675.05 GOPS
406   69.00 ms run,    2.65 ms python,   66.36 ms HIP,  571.33 loss, 0.002667 LR, 4.54 GB used,   9782.59 GFLOPS,    675.05 GOPS
407   68.76 ms run,    2.65 ms python,   66.11 ms HIP,  588.53 loss, 0.002663 LR, 4.54 GB used,   9817.98 GFLOPS,    675.05 GOPS
408   69.14 ms run,    2.67 ms python,   66.46 ms HIP,  615.86 loss, 0.002658 LR, 4.54 GB used,   9764.08 GFLOPS,    675.05 GOPS
409   68.86 ms run,    2.69 ms python,   66.16 ms HIP,  592.83 loss, 0.002654 LR, 4.54 GB used,   9803.67 GFLOPS,    675.05 GOPS
410   68.96 ms run,    2.68 ms python,   66.28 ms HIP,  593.65 loss, 0.002650 LR, 4.54 GB used,   9788.92 GFLOPS,    675.05 GOPS
411   69.54 ms run,    2.68 ms python,   66.85 ms HIP,  581.44 loss, 0.002645 LR, 4.54 GB used,   9707.84 GFLOPS,    675.05 GOPS
412   69.68 ms run,    2.64 ms python,   67.04 ms HIP,  579.73 loss, 0.002641 LR, 4.54 GB used,   9688.36 GFLOPS,    675.05 GOPS
413   69.88 ms run,    2.64 ms python,   67.24 ms HIP,  581.51 loss, 0.002637 LR, 4.54 GB used,   9659.46 GFLOPS,    675.05 GOPS
414   69.52 ms run,    2.70 ms python,   66.82 ms HIP,  585.01 loss, 0.002632 LR, 4.54 GB used,   9710.34 GFLOPS,    675.05 GOPS
415   69.01 ms run,    2.76 ms python,   66.25 ms HIP,  589.84 loss, 0.002628 LR, 4.54 GB used,   9782.43 GFLOPS,    675.05 GOPS
416   68.36 ms run,    2.71 ms python,   65.65 ms HIP,  577.49 loss, 0.002624 LR, 4.54 GB used,   9875.12 GFLOPS,    675.05 GOPS
417   69.35 ms run,    2.67 ms python,   66.68 ms HIP,  576.82 loss, 0.002619 LR, 4.54 GB used,   9734.03 GFLOPS,    675.05 GOPS
418   69.35 ms run,    2.65 ms python,   66.70 ms HIP,  598.78 loss, 0.002615 LR, 4.54 GB used,   9734.31 GFLOPS,    675.05 GOPS
419   69.68 ms run,    2.66 ms python,   67.02 ms HIP,  582.48 loss, 0.002610 LR, 4.54 GB used,   9687.70 GFLOPS,    675.05 GOPS
420   70.64 ms run,    2.65 ms python,   67.99 ms HIP,  594.60 loss, 0.002606 LR, 4.54 GB used,   9555.55 GFLOPS,    675.05 GOPS
421   70.61 ms run,    2.72 ms python,   67.89 ms HIP,  592.59 loss, 0.002602 LR, 4.54 GB used,   9560.37 GFLOPS,    675.05 GOPS
422   70.40 ms run,    2.62 ms python,   67.77 ms HIP,  577.48 loss, 0.002597 LR, 4.54 GB used,   9589.23 GFLOPS,    675.05 GOPS
423   70.03 ms run,    2.72 ms python,   67.31 ms HIP,  617.56 loss, 0.002593 LR, 4.54 GB used,   9640.01 GFLOPS,    675.05 GOPS
424   70.42 ms run,    2.65 ms python,   67.78 ms HIP,  615.94 loss, 0.002589 LR, 4.54 GB used,   9585.44 GFLOPS,    675.05 GOPS
425   70.32 ms run,    2.64 ms python,   67.69 ms HIP,  603.24 loss, 0.002584 LR, 4.54 GB used,   9599.25 GFLOPS,    675.05 GOPS
426   69.76 ms run,    2.78 ms python,   66.98 ms HIP,  584.85 loss, 0.002580 LR, 4.54 GB used,   9676.22 GFLOPS,    675.05 GOPS
427   69.61 ms run,    2.65 ms python,   66.95 ms HIP,  592.30 loss, 0.002576 LR, 4.54 GB used,   9698.03 GFLOPS,    675.05 GOPS
428   69.93 ms run,    2.73 ms python,   67.19 ms HIP,  579.46 loss, 0.002571 LR, 4.54 GB used,   9653.71 GFLOPS,    675.05 GOPS
429   69.83 ms run,    2.66 ms python,   67.18 ms HIP,  595.40 loss, 0.002567 LR, 4.54 GB used,   9666.58 GFLOPS,    675.05 GOPS
430   69.72 ms run,    2.70 ms python,   67.01 ms HIP,  589.41 loss, 0.002563 LR, 4.54 GB used,   9682.79 GFLOPS,    675.05 GOPS
431   69.85 ms run,    2.63 ms python,   67.22 ms HIP,  589.12 loss, 0.002558 LR, 4.54 GB used,   9664.38 GFLOPS,    675.05 GOPS
432   69.36 ms run,    2.71 ms python,   66.65 ms HIP,  583.52 loss, 0.002554 LR, 4.54 GB used,   9732.81 GFLOPS,    675.05 GOPS
433   69.06 ms run,    2.63 ms python,   66.43 ms HIP,  601.16 loss, 0.002550 LR, 4.54 GB used,   9774.94 GFLOPS,    675.05 GOPS
434   69.68 ms run,    2.63 ms python,   67.05 ms HIP,  595.74 loss, 0.002545 LR, 4.54 GB used,   9688.07 GFLOPS,    675.05 GOPS
435   69.27 ms run,    2.65 ms python,   66.62 ms HIP,  578.42 loss, 0.002541 LR, 4.54 GB used,   9744.83 GFLOPS,    675.05 GOPS
436   69.72 ms run,    2.64 ms python,   67.08 ms HIP,  579.35 loss, 0.002536 LR, 4.54 GB used,   9681.54 GFLOPS,    675.05 GOPS
437   69.53 ms run,    2.69 ms python,   66.84 ms HIP,  575.27 loss, 0.002532 LR, 4.54 GB used,   9708.32 GFLOPS,    675.05 GOPS
438   69.31 ms run,    2.65 ms python,   66.66 ms HIP,  589.69 loss, 0.002528 LR, 4.54 GB used,   9738.86 GFLOPS,    675.05 GOPS
439   68.81 ms run,    2.70 ms python,   66.11 ms HIP,  577.37 loss, 0.002523 LR, 4.54 GB used,   9810.67 GFLOPS,    675.05 GOPS
440   69.26 ms run,    2.64 ms python,   66.62 ms HIP,  594.12 loss, 0.002519 LR, 4.54 GB used,   9746.05 GFLOPS,    675.05 GOPS
441   69.77 ms run,    2.69 ms python,   67.08 ms HIP,  601.86 loss, 0.002515 LR, 4.54 GB used,   9675.42 GFLOPS,    675.05 GOPS
442   69.41 ms run,    2.68 ms python,   66.73 ms HIP,  584.64 loss, 0.002510 LR, 4.54 GB used,   9725.30 GFLOPS,    675.05 GOPS
443   69.14 ms run,    2.74 ms python,   66.40 ms HIP,  593.23 loss, 0.002506 LR, 4.54 GB used,   9764.04 GFLOPS,    675.05 GOPS
444   68.74 ms run,    2.64 ms python,   66.11 ms HIP,  579.12 loss, 0.002502 LR, 4.54 GB used,   9820.21 GFLOPS,    675.05 GOPS
445   69.27 ms run,    2.63 ms python,   66.65 ms HIP,  575.39 loss, 0.002497 LR, 4.54 GB used,   9744.64 GFLOPS,    675.05 GOPS
446   69.20 ms run,    2.62 ms python,   66.58 ms HIP,  581.47 loss, 0.002493 LR, 4.54 GB used,   9755.36 GFLOPS,    675.05 GOPS
447   69.39 ms run,    2.66 ms python,   66.73 ms HIP,  580.66 loss, 0.002489 LR, 4.54 GB used,   9727.94 GFLOPS,    675.05 GOPS
448   69.30 ms run,    2.61 ms python,   66.69 ms HIP,  577.62 loss, 0.002484 LR, 4.54 GB used,   9740.59 GFLOPS,    675.05 GOPS
449   70.02 ms run,    2.73 ms python,   67.29 ms HIP,  586.47 loss, 0.002480 LR, 4.54 GB used,   9640.13 GFLOPS,    675.05 GOPS
450   69.28 ms run,    2.70 ms python,   66.58 ms HIP,  591.01 loss, 0.002476 LR, 4.54 GB used,   9743.76 GFLOPS,    675.05 GOPS
451   69.43 ms run,    2.75 ms python,   66.68 ms HIP,  603.39 loss, 0.002471 LR, 4.54 GB used,   9723.26 GFLOPS,    675.05 GOPS
452   68.91 ms run,    2.66 ms python,   66.25 ms HIP,  608.29 loss, 0.002467 LR, 4.54 GB used,   9796.44 GFLOPS,    675.05 GOPS
453   69.21 ms run,    2.68 ms python,   66.53 ms HIP,  589.10 loss, 0.002463 LR, 4.54 GB used,   9753.33 GFLOPS,    675.05 GOPS
454   68.97 ms run,    2.73 ms python,   66.25 ms HIP,  604.86 loss, 0.002458 LR, 4.54 GB used,   9786.83 GFLOPS,    675.05 GOPS
455   69.34 ms run,    2.64 ms python,   66.70 ms HIP,  586.66 loss, 0.002454 LR, 4.54 GB used,   9734.98 GFLOPS,    675.05 GOPS
456   69.24 ms run,    2.67 ms python,   66.56 ms HIP,  576.25 loss, 0.002449 LR, 4.54 GB used,   9749.74 GFLOPS,    675.05 GOPS
457   69.19 ms run,    2.67 ms python,   66.51 ms HIP,  589.74 loss, 0.002445 LR, 4.54 GB used,   9756.83 GFLOPS,    675.05 GOPS
458   68.60 ms run,    2.65 ms python,   65.96 ms HIP,  593.15 loss, 0.002441 LR, 4.54 GB used,   9839.60 GFLOPS,    675.05 GOPS
459   69.54 ms run,    2.71 ms python,   66.83 ms HIP,  582.82 loss, 0.002436 LR, 4.54 GB used,   9707.40 GFLOPS,    675.05 GOPS
460   69.08 ms run,    2.63 ms python,   66.45 ms HIP,  590.13 loss, 0.002432 LR, 4.54 GB used,   9771.37 GFLOPS,    675.05 GOPS
461   69.15 ms run,    2.74 ms python,   66.41 ms HIP,  586.29 loss, 0.002428 LR, 4.54 GB used,   9762.12 GFLOPS,    675.05 GOPS
462   69.12 ms run,    2.62 ms python,   66.50 ms HIP,  582.98 loss, 0.002423 LR, 4.54 GB used,   9766.12 GFLOPS,    675.05 GOPS
463   69.67 ms run,    2.64 ms python,   67.03 ms HIP,  565.42 loss, 0.002419 LR, 4.54 GB used,   9689.48 GFLOPS,    675.05 GOPS
464   69.61 ms run,    2.65 ms python,   66.96 ms HIP,  573.48 loss, 0.002415 LR, 4.54 GB used,   9697.12 GFLOPS,    675.05 GOPS
465   69.30 ms run,    2.72 ms python,   66.59 ms HIP,  593.76 loss, 0.002410 LR, 4.54 GB used,   9740.30 GFLOPS,    675.05 GOPS
466   69.44 ms run,    2.71 ms python,   66.74 ms HIP,  566.20 loss, 0.002406 LR, 4.54 GB used,   9720.68 GFLOPS,    675.05 GOPS
467   69.39 ms run,    2.69 ms python,   66.70 ms HIP,  581.88 loss, 0.002402 LR, 4.54 GB used,   9728.38 GFLOPS,    675.05 GOPS
468   69.24 ms run,    2.64 ms python,   66.60 ms HIP,  578.37 loss, 0.002397 LR, 4.54 GB used,   9749.12 GFLOPS,    675.05 GOPS
469   69.55 ms run,    2.65 ms python,   66.90 ms HIP,  585.02 loss, 0.002393 LR, 4.54 GB used,   9705.48 GFLOPS,    675.05 GOPS
470   69.91 ms run,    2.74 ms python,   67.17 ms HIP,  582.28 loss, 0.002389 LR, 4.54 GB used,   9656.55 GFLOPS,    675.05 GOPS
471   69.71 ms run,    2.64 ms python,   67.07 ms HIP,  585.25 loss, 0.002384 LR, 4.54 GB used,   9684.00 GFLOPS,    675.05 GOPS
472   69.15 ms run,    2.74 ms python,   66.41 ms HIP,  581.99 loss, 0.002380 LR, 4.54 GB used,   9761.66 GFLOPS,    675.05 GOPS
473   69.56 ms run,    2.66 ms python,   66.90 ms HIP,  577.38 loss, 0.002375 LR, 4.54 GB used,   9704.28 GFLOPS,    675.05 GOPS
474   70.51 ms run,    2.73 ms python,   67.78 ms HIP,  580.28 loss, 0.002371 LR, 4.54 GB used,   9574.16 GFLOPS,    675.05 GOPS
475   69.59 ms run,    2.70 ms python,   66.88 ms HIP,  586.50 loss, 0.002367 LR, 4.54 GB used,   9700.53 GFLOPS,    675.05 GOPS
476   69.72 ms run,    2.72 ms python,   67.00 ms HIP,  597.28 loss, 0.002362 LR, 4.54 GB used,   9682.55 GFLOPS,    675.05 GOPS
477   69.12 ms run,    2.66 ms python,   66.46 ms HIP,  589.34 loss, 0.002358 LR, 4.54 GB used,   9766.60 GFLOPS,    675.05 GOPS
478   69.86 ms run,    2.63 ms python,   67.23 ms HIP,  573.59 loss, 0.002354 LR, 4.54 GB used,   9662.94 GFLOPS,    675.05 GOPS
479   69.15 ms run,    2.71 ms python,   66.44 ms HIP,  562.75 loss, 0.002349 LR, 4.54 GB used,   9761.58 GFLOPS,    675.05 GOPS
480   69.32 ms run,    2.71 ms python,   66.61 ms HIP,  593.10 loss, 0.002345 LR, 4.54 GB used,   9737.73 GFLOPS,    675.05 GOPS
481   69.49 ms run,    2.72 ms python,   66.77 ms HIP,  579.93 loss, 0.002341 LR, 4.54 GB used,   9714.20 GFLOPS,    675.05 GOPS
482   70.00 ms run,    2.64 ms python,   67.36 ms HIP,  590.04 loss, 0.002336 LR, 4.54 GB used,   9643.66 GFLOPS,    675.05 GOPS
483   68.91 ms run,    2.71 ms python,   66.20 ms HIP,  584.13 loss, 0.002332 LR, 4.54 GB used,   9795.87 GFLOPS,    675.05 GOPS
484   69.08 ms run,    2.64 ms python,   66.45 ms HIP,  594.45 loss, 0.002328 LR, 4.54 GB used,   9771.60 GFLOPS,    675.05 GOPS
485   69.07 ms run,    2.65 ms python,   66.42 ms HIP,  598.35 loss, 0.002323 LR, 4.54 GB used,   9772.80 GFLOPS,    675.05 GOPS
486   68.98 ms run,    2.66 ms python,   66.31 ms HIP,  566.22 loss, 0.002319 LR, 4.54 GB used,   9786.77 GFLOPS,    675.05 GOPS
487   69.25 ms run,    2.67 ms python,   66.58 ms HIP,  571.19 loss, 0.002315 LR, 4.54 GB used,   9747.93 GFLOPS,    675.05 GOPS
488   70.21 ms run,    2.62 ms python,   67.59 ms HIP,  581.04 loss, 0.002310 LR, 4.54 GB used,   9614.92 GFLOPS,    675.05 GOPS
489   69.87 ms run,    2.67 ms python,   67.20 ms HIP,  553.44 loss, 0.002306 LR, 4.54 GB used,   9660.97 GFLOPS,    675.05 GOPS
shuffling training dataset in 753.43 ms (epoch=5)
490  828.96 ms run,  756.58 ms python,   72.39 ms HIP,  560.60 loss, 0.002301 LR, 4.54 GB used,    814.68 GFLOPS,    675.34 GOPS
491   73.86 ms run,    2.78 ms python,   71.08 ms HIP,  548.40 loss, 0.002297 LR, 4.54 GB used,   9139.33 GFLOPS,    675.05 GOPS
492   72.50 ms run,    2.68 ms python,   69.82 ms HIP,  559.04 loss, 0.002293 LR, 4.54 GB used,   9310.86 GFLOPS,    675.05 GOPS
493   71.02 ms run,    2.68 ms python,   68.34 ms HIP,  559.92 loss, 0.002288 LR, 4.54 GB used,   9504.53 GFLOPS,    675.05 GOPS
494   71.47 ms run,    2.72 ms python,   68.75 ms HIP,  566.71 loss, 0.002284 LR, 4.54 GB used,   9445.23 GFLOPS,    675.05 GOPS
495   70.28 ms run,    2.71 ms python,   67.57 ms HIP,  566.45 loss, 0.002280 LR, 4.54 GB used,   9605.55 GFLOPS,    675.05 GOPS
496   70.03 ms run,    2.65 ms python,   67.38 ms HIP,  573.89 loss, 0.002275 LR, 4.54 GB used,   9639.65 GFLOPS,    675.05 GOPS
497   70.01 ms run,    2.64 ms python,   67.37 ms HIP,  571.42 loss, 0.002271 LR, 4.54 GB used,   9641.88 GFLOPS,    675.05 GOPS
498   69.38 ms run,    2.66 ms python,   66.72 ms HIP,  562.14 loss, 0.002267 LR, 4.54 GB used,   9729.92 GFLOPS,    675.05 GOPS
499   69.61 ms run,    2.75 ms python,   66.86 ms HIP,  551.55 loss, 0.002262 LR, 4.54 GB used,   9697.79 GFLOPS,    675.05 GOPS
500   69.50 ms run,    2.65 ms python,   66.84 ms HIP,  557.15 loss, 0.002258 LR, 4.54 GB used,   9713.18 GFLOPS,    675.05 GOPS
501   69.49 ms run,    2.65 ms python,   66.83 ms HIP,  591.58 loss, 0.002254 LR, 4.54 GB used,   9714.97 GFLOPS,    675.05 GOPS
502   69.68 ms run,    2.66 ms python,   67.02 ms HIP,  603.83 loss, 0.002249 LR, 4.54 GB used,   9688.20 GFLOPS,    675.05 GOPS
503   69.32 ms run,    2.71 ms python,   66.61 ms HIP,  562.22 loss, 0.002245 LR, 4.54 GB used,   9737.52 GFLOPS,    675.05 GOPS
504   69.64 ms run,    2.64 ms python,   67.01 ms HIP,  580.17 loss, 0.002241 LR, 4.54 GB used,   9692.68 GFLOPS,    675.05 GOPS
505   69.34 ms run,    2.67 ms python,   66.67 ms HIP,  566.71 loss, 0.002236 LR, 4.54 GB used,   9735.38 GFLOPS,    675.05 GOPS
506   69.94 ms run,    2.63 ms python,   67.31 ms HIP,  591.87 loss, 0.002232 LR, 4.54 GB used,   9651.92 GFLOPS,    675.05 GOPS
507   69.58 ms run,    2.64 ms python,   66.94 ms HIP,  566.65 loss, 0.002227 LR, 4.54 GB used,   9701.37 GFLOPS,    675.05 GOPS
508   68.80 ms run,    2.64 ms python,   66.15 ms HIP,  569.30 loss, 0.002223 LR, 4.54 GB used,   9811.99 GFLOPS,    675.05 GOPS
509   69.45 ms run,    2.64 ms python,   66.82 ms HIP,  554.92 loss, 0.002219 LR, 4.54 GB used,   9719.59 GFLOPS,    675.05 GOPS
510   68.93 ms run,    2.62 ms python,   66.30 ms HIP,  593.63 loss, 0.002214 LR, 4.54 GB used,   9793.74 GFLOPS,    675.05 GOPS
511   69.09 ms run,    2.73 ms python,   66.36 ms HIP,  574.09 loss, 0.002210 LR, 4.54 GB used,   9770.91 GFLOPS,    675.05 GOPS
512   69.08 ms run,    2.71 ms python,   66.37 ms HIP,  558.48 loss, 0.002206 LR, 4.54 GB used,   9771.45 GFLOPS,    675.05 GOPS
513   69.20 ms run,    2.66 ms python,   66.54 ms HIP,  571.87 loss, 0.002201 LR, 4.54 GB used,   9754.31 GFLOPS,    675.05 GOPS
514   69.76 ms run,    2.68 ms python,   67.08 ms HIP,  566.25 loss, 0.002197 LR, 4.54 GB used,   9676.61 GFLOPS,    675.05 GOPS
515   69.53 ms run,    2.67 ms python,   66.86 ms HIP,  575.40 loss, 0.002193 LR, 4.54 GB used,   9708.89 GFLOPS,    675.05 GOPS
516   69.52 ms run,    2.64 ms python,   66.87 ms HIP,  585.70 loss, 0.002188 LR, 4.54 GB used,   9710.32 GFLOPS,    675.05 GOPS
517   69.81 ms run,    2.66 ms python,   67.16 ms HIP,  575.09 loss, 0.002184 LR, 4.54 GB used,   9669.11 GFLOPS,    675.05 GOPS
518   69.25 ms run,    2.65 ms python,   66.60 ms HIP,  577.65 loss, 0.002180 LR, 4.54 GB used,   9748.00 GFLOPS,    675.05 GOPS
519   69.38 ms run,    2.64 ms python,   66.74 ms HIP,  562.22 loss, 0.002175 LR, 4.54 GB used,   9729.67 GFLOPS,    675.05 GOPS
520   69.59 ms run,    2.66 ms python,   66.92 ms HIP,  588.31 loss, 0.002171 LR, 4.54 GB used,   9700.68 GFLOPS,    675.05 GOPS
521   69.61 ms run,    2.68 ms python,   66.92 ms HIP,  563.62 loss, 0.002167 LR, 4.54 GB used,   9697.95 GFLOPS,    675.05 GOPS
522   69.46 ms run,    2.64 ms python,   66.81 ms HIP,  575.65 loss, 0.002162 LR, 4.54 GB used,   9718.78 GFLOPS,    675.05 GOPS
523   69.85 ms run,    2.63 ms python,   67.22 ms HIP,  582.18 loss, 0.002158 LR, 4.54 GB used,   9664.21 GFLOPS,    675.05 GOPS
524   69.46 ms run,    2.60 ms python,   66.86 ms HIP,  574.66 loss, 0.002153 LR, 4.54 GB used,   9718.70 GFLOPS,    675.05 GOPS
525   69.56 ms run,    2.65 ms python,   66.91 ms HIP,  586.75 loss, 0.002149 LR, 4.54 GB used,   9704.57 GFLOPS,    675.05 GOPS
526   69.75 ms run,    2.71 ms python,   67.05 ms HIP,  586.70 loss, 0.002145 LR, 4.54 GB used,   9677.48 GFLOPS,    675.05 GOPS
527   69.51 ms run,    2.71 ms python,   66.80 ms HIP,  571.59 loss, 0.002140 LR, 4.54 GB used,   9711.94 GFLOPS,    675.05 GOPS
528   69.46 ms run,    2.62 ms python,   66.84 ms HIP,  564.29 loss, 0.002136 LR, 4.54 GB used,   9718.11 GFLOPS,    675.05 GOPS
529   69.58 ms run,    2.65 ms python,   66.93 ms HIP,  572.51 loss, 0.002132 LR, 4.54 GB used,   9701.15 GFLOPS,    675.05 GOPS
530   69.35 ms run,    2.66 ms python,   66.69 ms HIP,  570.33 loss, 0.002127 LR, 4.54 GB used,   9733.48 GFLOPS,    675.05 GOPS
531   69.59 ms run,    2.66 ms python,   66.93 ms HIP,  566.56 loss, 0.002123 LR, 4.54 GB used,   9699.89 GFLOPS,    675.05 GOPS
532   69.29 ms run,    2.65 ms python,   66.64 ms HIP,  581.82 loss, 0.002119 LR, 4.54 GB used,   9742.58 GFLOPS,    675.05 GOPS
533   69.48 ms run,    2.66 ms python,   66.82 ms HIP,  561.05 loss, 0.002114 LR, 4.54 GB used,   9715.61 GFLOPS,    675.05 GOPS
534   69.53 ms run,    2.65 ms python,   66.89 ms HIP,  594.77 loss, 0.002110 LR, 4.54 GB used,   9708.01 GFLOPS,    675.05 GOPS
535   69.76 ms run,    2.72 ms python,   67.04 ms HIP,  558.92 loss, 0.002106 LR, 4.54 GB used,   9676.33 GFLOPS,    675.05 GOPS
536   69.24 ms run,    2.67 ms python,   66.57 ms HIP,  586.19 loss, 0.002101 LR, 4.54 GB used,   9748.85 GFLOPS,    675.05 GOPS
537   69.22 ms run,    2.71 ms python,   66.51 ms HIP,  578.50 loss, 0.002097 LR, 4.54 GB used,   9752.34 GFLOPS,    675.05 GOPS
538   69.18 ms run,    2.65 ms python,   66.54 ms HIP,  572.07 loss, 0.002093 LR, 4.54 GB used,   9757.39 GFLOPS,    675.05 GOPS
539   69.19 ms run,    2.60 ms python,   66.60 ms HIP,  565.12 loss, 0.002088 LR, 4.54 GB used,   9756.18 GFLOPS,    675.05 GOPS
540   68.98 ms run,    2.59 ms python,   66.39 ms HIP,  562.70 loss, 0.002084 LR, 4.54 GB used,   9785.63 GFLOPS,    675.05 GOPS
541   69.21 ms run,    2.62 ms python,   66.59 ms HIP,  579.45 loss, 0.002079 LR, 4.54 GB used,   9752.89 GFLOPS,    675.05 GOPS
542   68.45 ms run,    2.61 ms python,   65.84 ms HIP,  572.89 loss, 0.002075 LR, 4.54 GB used,   9861.98 GFLOPS,    675.05 GOPS
543   70.01 ms run,    2.66 ms python,   67.35 ms HIP,  569.13 loss, 0.002071 LR, 4.54 GB used,   9641.54 GFLOPS,    675.05 GOPS
544   69.05 ms run,    2.64 ms python,   66.42 ms HIP,  569.87 loss, 0.002066 LR, 4.54 GB used,   9775.58 GFLOPS,    675.05 GOPS
545   69.32 ms run,    2.65 ms python,   66.67 ms HIP,  555.79 loss, 0.002062 LR, 4.54 GB used,   9738.47 GFLOPS,    675.05 GOPS
546   69.17 ms run,    2.61 ms python,   66.56 ms HIP,  558.11 loss, 0.002058 LR, 4.54 GB used,   9759.22 GFLOPS,    675.05 GOPS
547   69.30 ms run,    2.74 ms python,   66.56 ms HIP,  557.49 loss, 0.002053 LR, 4.54 GB used,   9740.97 GFLOPS,    675.05 GOPS
548   69.23 ms run,    2.66 ms python,   66.57 ms HIP,  569.55 loss, 0.002049 LR, 4.54 GB used,   9750.78 GFLOPS,    675.05 GOPS
549   69.23 ms run,    2.66 ms python,   66.57 ms HIP,  564.51 loss, 0.002045 LR, 4.54 GB used,   9750.81 GFLOPS,    675.05 GOPS
550   69.27 ms run,    2.67 ms python,   66.60 ms HIP,  564.71 loss, 0.002040 LR, 4.54 GB used,   9744.46 GFLOPS,    675.05 GOPS
551   68.79 ms run,    2.65 ms python,   66.15 ms HIP,  571.34 loss, 0.002036 LR, 4.54 GB used,   9812.75 GFLOPS,    675.05 GOPS
552   69.87 ms run,    2.65 ms python,   67.22 ms HIP,  560.34 loss, 0.002032 LR, 4.54 GB used,   9661.72 GFLOPS,    675.05 GOPS
553   69.18 ms run,    2.64 ms python,   66.54 ms HIP,  581.30 loss, 0.002027 LR, 4.54 GB used,   9757.62 GFLOPS,    675.05 GOPS
554   69.23 ms run,    2.63 ms python,   66.60 ms HIP,  567.74 loss, 0.002023 LR, 4.54 GB used,   9751.29 GFLOPS,    675.05 GOPS
555   69.11 ms run,    2.69 ms python,   66.42 ms HIP,  565.75 loss, 0.002019 LR, 4.54 GB used,   9767.84 GFLOPS,    675.05 GOPS
556   69.28 ms run,    2.61 ms python,   66.67 ms HIP,  560.14 loss, 0.002014 LR, 4.54 GB used,   9743.20 GFLOPS,    675.05 GOPS
557   69.16 ms run,    2.64 ms python,   66.52 ms HIP,  566.14 loss, 0.002010 LR, 4.54 GB used,   9761.05 GFLOPS,    675.05 GOPS
558   69.59 ms run,    2.67 ms python,   66.93 ms HIP,  576.05 loss, 0.002005 LR, 4.54 GB used,   9699.90 GFLOPS,    675.05 GOPS
559   69.34 ms run,    2.65 ms python,   66.70 ms HIP,  568.57 loss, 0.002001 LR, 4.54 GB used,   9734.75 GFLOPS,    675.05 GOPS
560   68.96 ms run,    2.62 ms python,   66.34 ms HIP,  547.41 loss, 0.001997 LR, 4.54 GB used,   9788.71 GFLOPS,    675.05 GOPS
561   69.55 ms run,    2.62 ms python,   66.93 ms HIP,  575.74 loss, 0.001992 LR, 4.54 GB used,   9705.91 GFLOPS,    675.05 GOPS
562   69.27 ms run,    2.67 ms python,   66.60 ms HIP,  568.30 loss, 0.001988 LR, 4.54 GB used,   9745.32 GFLOPS,    675.05 GOPS
563   70.53 ms run,    2.65 ms python,   67.88 ms HIP,  573.20 loss, 0.001984 LR, 4.54 GB used,   9570.72 GFLOPS,    675.05 GOPS
564   69.40 ms run,    2.70 ms python,   66.70 ms HIP,  569.90 loss, 0.001979 LR, 4.54 GB used,   9727.09 GFLOPS,    675.05 GOPS
565   68.97 ms run,    2.67 ms python,   66.30 ms HIP,  584.92 loss, 0.001975 LR, 4.54 GB used,   9786.93 GFLOPS,    675.05 GOPS
566   69.16 ms run,    2.63 ms python,   66.53 ms HIP,  574.28 loss, 0.001971 LR, 4.54 GB used,   9760.44 GFLOPS,    675.05 GOPS
567   69.56 ms run,    2.67 ms python,   66.89 ms HIP,  574.55 loss, 0.001966 LR, 4.54 GB used,   9704.27 GFLOPS,    675.05 GOPS
568   69.87 ms run,    2.70 ms python,   67.17 ms HIP,  576.63 loss, 0.001962 LR, 4.54 GB used,   9661.82 GFLOPS,    675.05 GOPS
569   69.29 ms run,    2.72 ms python,   66.57 ms HIP,  583.46 loss, 0.001958 LR, 4.54 GB used,   9742.41 GFLOPS,    675.05 GOPS
570   69.35 ms run,    2.66 ms python,   66.69 ms HIP,  576.70 loss, 0.001953 LR, 4.54 GB used,   9733.60 GFLOPS,    675.05 GOPS
571   69.36 ms run,    2.64 ms python,   66.72 ms HIP,  572.55 loss, 0.001949 LR, 4.54 GB used,   9732.30 GFLOPS,    675.05 GOPS
572   69.22 ms run,    2.64 ms python,   66.57 ms HIP,  561.28 loss, 0.001945 LR, 4.54 GB used,   9752.43 GFLOPS,    675.05 GOPS
573   69.22 ms run,    2.67 ms python,   66.55 ms HIP,  556.27 loss, 0.001940 LR, 4.54 GB used,   9752.26 GFLOPS,    675.05 GOPS
574   69.27 ms run,    2.64 ms python,   66.63 ms HIP,  564.08 loss, 0.001936 LR, 4.54 GB used,   9744.94 GFLOPS,    675.05 GOPS
575   70.44 ms run,    2.62 ms python,   67.82 ms HIP,  553.18 loss, 0.001931 LR, 4.54 GB used,   9582.95 GFLOPS,    675.05 GOPS
576   70.21 ms run,    2.73 ms python,   67.48 ms HIP,  559.14 loss, 0.001927 LR, 4.54 GB used,   9614.34 GFLOPS,    675.05 GOPS
577   69.58 ms run,    2.64 ms python,   66.94 ms HIP,  573.60 loss, 0.001923 LR, 4.54 GB used,   9701.77 GFLOPS,    675.05 GOPS
578   69.91 ms run,    2.61 ms python,   67.29 ms HIP,  575.48 loss, 0.001918 LR, 4.54 GB used,   9656.14 GFLOPS,    675.05 GOPS
579   69.53 ms run,    2.63 ms python,   66.91 ms HIP,  573.98 loss, 0.001914 LR, 4.54 GB used,   9708.07 GFLOPS,    675.05 GOPS
580   69.42 ms run,    2.76 ms python,   66.65 ms HIP,  574.23 loss, 0.001910 LR, 4.54 GB used,   9724.32 GFLOPS,    675.05 GOPS
581   69.01 ms run,    2.64 ms python,   66.37 ms HIP,  569.52 loss, 0.001905 LR, 4.54 GB used,   9781.98 GFLOPS,    675.05 GOPS
582   69.88 ms run,    2.66 ms python,   67.21 ms HIP,  562.67 loss, 0.001901 LR, 4.54 GB used,   9660.34 GFLOPS,    675.05 GOPS
583   69.47 ms run,    2.64 ms python,   66.82 ms HIP,  548.51 loss, 0.001897 LR, 4.54 GB used,   9717.55 GFLOPS,    675.05 GOPS
584   69.13 ms run,    2.64 ms python,   66.50 ms HIP,  568.97 loss, 0.001892 LR, 4.54 GB used,   9764.27 GFLOPS,    675.05 GOPS
585   69.10 ms run,    2.63 ms python,   66.48 ms HIP,  565.35 loss, 0.001888 LR, 4.54 GB used,   9768.65 GFLOPS,    675.05 GOPS
586   68.92 ms run,    2.62 ms python,   66.29 ms HIP,  578.32 loss, 0.001884 LR, 4.54 GB used,   9795.32 GFLOPS,    675.05 GOPS
587   69.53 ms run,    2.64 ms python,   66.89 ms HIP,  538.04 loss, 0.001879 LR, 4.54 GB used,   9708.32 GFLOPS,    675.05 GOPS
shuffling training dataset in 1532.30 ms (epoch=6)
588 1607.52 ms run, 1535.42 ms python,   72.11 ms HIP,  571.49 loss, 0.001875 LR, 4.54 GB used,    420.28 GFLOPS,    675.61 GOPS
589   73.93 ms run,    2.84 ms python,   71.09 ms HIP,  564.70 loss, 0.001871 LR, 4.54 GB used,   9130.80 GFLOPS,    675.05 GOPS
590   72.13 ms run,    2.73 ms python,   69.41 ms HIP,  570.04 loss, 0.001866 LR, 4.54 GB used,   9358.34 GFLOPS,    675.05 GOPS
591   71.02 ms run,    2.75 ms python,   68.28 ms HIP,  569.78 loss, 0.001862 LR, 4.54 GB used,   9504.48 GFLOPS,    675.05 GOPS
592   70.83 ms run,    3.07 ms python,   67.76 ms HIP,  578.29 loss, 0.001857 LR, 4.54 GB used,   9531.00 GFLOPS,    675.05 GOPS
593   69.67 ms run,    2.67 ms python,   67.00 ms HIP,  573.34 loss, 0.001853 LR, 4.54 GB used,   9689.20 GFLOPS,    675.05 GOPS
594   69.52 ms run,    2.66 ms python,   66.86 ms HIP,  578.10 loss, 0.001849 LR, 4.54 GB used,   9710.61 GFLOPS,    675.05 GOPS
595   69.88 ms run,    2.70 ms python,   67.19 ms HIP,  591.64 loss, 0.001844 LR, 4.54 GB used,   9659.64 GFLOPS,    675.05 GOPS
596   69.11 ms run,    2.66 ms python,   66.45 ms HIP,  603.30 loss, 0.001840 LR, 4.54 GB used,   9767.51 GFLOPS,    675.05 GOPS
597   68.57 ms run,    2.68 ms python,   65.89 ms HIP,  567.99 loss, 0.001836 LR, 4.54 GB used,   9844.15 GFLOPS,    675.05 GOPS
598   69.68 ms run,    2.68 ms python,   67.00 ms HIP,  575.65 loss, 0.001831 LR, 4.54 GB used,   9687.80 GFLOPS,    675.05 GOPS
599   68.71 ms run,    2.64 ms python,   66.07 ms HIP,  570.96 loss, 0.001827 LR, 4.54 GB used,   9825.15 GFLOPS,    675.05 GOPS
600   68.25 ms run,    2.65 ms python,   65.60 ms HIP,  557.41 loss, 0.001823 LR, 4.54 GB used,   9890.67 GFLOPS,    675.05 GOPS
601   69.30 ms run,    2.66 ms python,   66.64 ms HIP,  571.70 loss, 0.001818 LR, 4.54 GB used,   9740.51 GFLOPS,    675.05 GOPS
602   69.25 ms run,    2.63 ms python,   66.61 ms HIP,  569.83 loss, 0.001814 LR, 4.54 GB used,   9748.63 GFLOPS,    675.05 GOPS
603   69.96 ms run,    2.64 ms python,   67.32 ms HIP,  571.17 loss, 0.001810 LR, 4.54 GB used,   9648.41 GFLOPS,    675.05 GOPS
604   69.57 ms run,    2.68 ms python,   66.89 ms HIP,  575.71 loss, 0.001805 LR, 4.54 GB used,   9702.87 GFLOPS,    675.05 GOPS
605   68.99 ms run,    2.62 ms python,   66.37 ms HIP,  577.77 loss, 0.001801 LR, 4.54 GB used,   9784.19 GFLOPS,    675.05 GOPS
606   68.93 ms run,    2.64 ms python,   66.29 ms HIP,  569.05 loss, 0.001797 LR, 4.54 GB used,   9793.24 GFLOPS,    675.05 GOPS
607   69.57 ms run,    2.66 ms python,   66.91 ms HIP,  573.34 loss, 0.001792 LR, 4.54 GB used,   9703.31 GFLOPS,    675.05 GOPS
608   69.33 ms run,    2.66 ms python,   66.67 ms HIP,  570.30 loss, 0.001788 LR, 4.54 GB used,   9737.12 GFLOPS,    675.05 GOPS
609   69.04 ms run,    2.66 ms python,   66.38 ms HIP,  570.39 loss, 0.001783 LR, 4.54 GB used,   9777.89 GFLOPS,    675.05 GOPS
610   69.05 ms run,    2.68 ms python,   66.37 ms HIP,  566.33 loss, 0.001779 LR, 4.54 GB used,   9776.76 GFLOPS,    675.05 GOPS
611   69.30 ms run,    2.63 ms python,   66.66 ms HIP,  566.64 loss, 0.001775 LR, 4.54 GB used,   9741.48 GFLOPS,    675.05 GOPS
612   68.96 ms run,    2.61 ms python,   66.34 ms HIP,  569.52 loss, 0.001770 LR, 4.54 GB used,   9789.49 GFLOPS,    675.05 GOPS
613   68.96 ms run,    2.62 ms python,   66.34 ms HIP,  581.94 loss, 0.001766 LR, 4.54 GB used,   9788.75 GFLOPS,    675.05 GOPS
614   69.12 ms run,    2.64 ms python,   66.48 ms HIP,  578.21 loss, 0.001762 LR, 4.54 GB used,   9766.66 GFLOPS,    675.05 GOPS
615   69.22 ms run,    2.68 ms python,   66.54 ms HIP,  577.52 loss, 0.001757 LR, 4.54 GB used,   9751.63 GFLOPS,    675.05 GOPS
616   69.22 ms run,    2.65 ms python,   66.57 ms HIP,  573.82 loss, 0.001753 LR, 4.54 GB used,   9752.63 GFLOPS,    675.05 GOPS
617   68.98 ms run,    2.65 ms python,   66.33 ms HIP,  570.04 loss, 0.001749 LR, 4.54 GB used,   9785.64 GFLOPS,    675.05 GOPS
618   68.79 ms run,    2.63 ms python,   66.16 ms HIP,  569.29 loss, 0.001744 LR, 4.54 GB used,   9813.12 GFLOPS,    675.05 GOPS
619   69.47 ms run,    2.62 ms python,   66.86 ms HIP,  587.24 loss, 0.001740 LR, 4.54 GB used,   9716.59 GFLOPS,    675.05 GOPS
620   69.73 ms run,    2.65 ms python,   67.07 ms HIP,  568.84 loss, 0.001736 LR, 4.54 GB used,   9681.47 GFLOPS,    675.05 GOPS
621   69.34 ms run,    2.65 ms python,   66.70 ms HIP,  566.43 loss, 0.001731 LR, 4.54 GB used,   9734.93 GFLOPS,    675.05 GOPS
622   69.13 ms run,    2.63 ms python,   66.50 ms HIP,  569.45 loss, 0.001727 LR, 4.54 GB used,   9764.84 GFLOPS,    675.05 GOPS
623   68.65 ms run,    2.63 ms python,   66.02 ms HIP,  581.13 loss, 0.001723 LR, 4.54 GB used,   9832.96 GFLOPS,    675.05 GOPS
624   69.20 ms run,    2.67 ms python,   66.54 ms HIP,  564.13 loss, 0.001718 LR, 4.54 GB used,   9754.55 GFLOPS,    675.05 GOPS
625   68.98 ms run,    2.64 ms python,   66.34 ms HIP,  573.42 loss, 0.001714 LR, 4.54 GB used,   9786.71 GFLOPS,    675.05 GOPS
626   69.54 ms run,    2.72 ms python,   66.82 ms HIP,  563.98 loss, 0.001709 LR, 4.54 GB used,   9707.19 GFLOPS,    675.05 GOPS
627   69.17 ms run,    2.61 ms python,   66.56 ms HIP,  569.14 loss, 0.001705 LR, 4.54 GB used,   9758.61 GFLOPS,    675.05 GOPS
628   69.14 ms run,    2.64 ms python,   66.50 ms HIP,  567.16 loss, 0.001701 LR, 4.54 GB used,   9763.47 GFLOPS,    675.05 GOPS
629   69.67 ms run,    2.62 ms python,   67.04 ms HIP,  573.60 loss, 0.001696 LR, 4.54 GB used,   9689.26 GFLOPS,    675.05 GOPS
630   70.03 ms run,    2.68 ms python,   67.34 ms HIP,  562.19 loss, 0.001692 LR, 4.54 GB used,   9639.76 GFLOPS,    675.05 GOPS
631   70.27 ms run,    2.70 ms python,   67.58 ms HIP,  575.71 loss, 0.001688 LR, 4.54 GB used,   9605.96 GFLOPS,    675.05 GOPS
632   69.10 ms run,    2.68 ms python,   66.41 ms HIP,  570.74 loss, 0.001683 LR, 4.54 GB used,   9769.67 GFLOPS,    675.05 GOPS
633   69.40 ms run,    2.66 ms python,   66.74 ms HIP,  567.73 loss, 0.001679 LR, 4.54 GB used,   9727.11 GFLOPS,    675.05 GOPS
634   69.35 ms run,    2.60 ms python,   66.75 ms HIP,  568.36 loss, 0.001675 LR, 4.54 GB used,   9734.21 GFLOPS,    675.05 GOPS
635   69.23 ms run,    2.63 ms python,   66.60 ms HIP,  586.04 loss, 0.001670 LR, 4.54 GB used,   9751.31 GFLOPS,    675.05 GOPS
636   69.23 ms run,    2.66 ms python,   66.57 ms HIP,  574.17 loss, 0.001666 LR, 4.54 GB used,   9751.14 GFLOPS,    675.05 GOPS
637   69.20 ms run,    2.67 ms python,   66.54 ms HIP,  579.93 loss, 0.001662 LR, 4.54 GB used,   9754.30 GFLOPS,    675.05 GOPS
638   69.40 ms run,    2.72 ms python,   66.68 ms HIP,  560.20 loss, 0.001657 LR, 4.54 GB used,   9727.38 GFLOPS,    675.05 GOPS
639   69.54 ms run,    2.66 ms python,   66.88 ms HIP,  576.22 loss, 0.001653 LR, 4.54 GB used,   9707.03 GFLOPS,    675.05 GOPS
640   70.02 ms run,    2.63 ms python,   67.39 ms HIP,  583.38 loss, 0.001649 LR, 4.54 GB used,   9640.45 GFLOPS,    675.05 GOPS
641   69.67 ms run,    2.70 ms python,   66.96 ms HIP,  563.03 loss, 0.001644 LR, 4.54 GB used,   9689.69 GFLOPS,    675.05 GOPS
642   69.39 ms run,    2.65 ms python,   66.74 ms HIP,  577.68 loss, 0.001640 LR, 4.54 GB used,   9727.86 GFLOPS,    675.05 GOPS
643   69.51 ms run,    2.60 ms python,   66.91 ms HIP,  561.45 loss, 0.001635 LR, 4.54 GB used,   9710.91 GFLOPS,    675.05 GOPS
644   68.99 ms run,    2.65 ms python,   66.34 ms HIP,  580.14 loss, 0.001631 LR, 4.54 GB used,   9784.95 GFLOPS,    675.05 GOPS
645   69.26 ms run,    2.67 ms python,   66.59 ms HIP,  579.88 loss, 0.001627 LR, 4.54 GB used,   9747.02 GFLOPS,    675.05 GOPS
646   69.48 ms run,    2.61 ms python,   66.88 ms HIP,  575.16 loss, 0.001622 LR, 4.54 GB used,   9715.32 GFLOPS,    675.05 GOPS
647   69.02 ms run,    2.64 ms python,   66.38 ms HIP,  596.25 loss, 0.001618 LR, 4.54 GB used,   9781.13 GFLOPS,    675.05 GOPS
648   70.13 ms run,    2.62 ms python,   67.51 ms HIP,  564.59 loss, 0.001614 LR, 4.54 GB used,   9626.22 GFLOPS,    675.05 GOPS
649   69.09 ms run,    2.76 ms python,   66.33 ms HIP,  569.52 loss, 0.001609 LR, 4.54 GB used,   9770.15 GFLOPS,    675.05 GOPS
650   69.10 ms run,    2.65 ms python,   66.45 ms HIP,  586.70 loss, 0.001605 LR, 4.54 GB used,   9769.54 GFLOPS,    675.05 GOPS
651   69.49 ms run,    2.65 ms python,   66.83 ms HIP,  577.35 loss, 0.001601 LR, 4.54 GB used,   9714.46 GFLOPS,    675.05 GOPS
652   70.05 ms run,    2.66 ms python,   67.40 ms HIP,  575.93 loss, 0.001596 LR, 4.54 GB used,   9636.33 GFLOPS,    675.05 GOPS
653   69.37 ms run,    2.62 ms python,   66.75 ms HIP,  576.70 loss, 0.001592 LR, 4.54 GB used,   9730.73 GFLOPS,    675.05 GOPS
654   69.51 ms run,    2.67 ms python,   66.84 ms HIP,  566.77 loss, 0.001588 LR, 4.54 GB used,   9711.19 GFLOPS,    675.05 GOPS
655   69.64 ms run,    2.63 ms python,   67.01 ms HIP,  576.23 loss, 0.001583 LR, 4.54 GB used,   9693.35 GFLOPS,    675.05 GOPS
656   69.26 ms run,    2.63 ms python,   66.63 ms HIP,  562.23 loss, 0.001579 LR, 4.54 GB used,   9746.43 GFLOPS,    675.05 GOPS
657   69.32 ms run,    2.64 ms python,   66.68 ms HIP,  564.83 loss, 0.001575 LR, 4.54 GB used,   9738.11 GFLOPS,    675.05 GOPS
658   69.05 ms run,    2.63 ms python,   66.42 ms HIP,  577.87 loss, 0.001570 LR, 4.54 GB used,   9775.95 GFLOPS,    675.05 GOPS
659   69.75 ms run,    2.66 ms python,   67.09 ms HIP,  574.14 loss, 0.001566 LR, 4.54 GB used,   9677.60 GFLOPS,    675.05 GOPS
660   69.21 ms run,    2.74 ms python,   66.47 ms HIP,  582.64 loss, 0.001561 LR, 4.54 GB used,   9753.05 GFLOPS,    675.05 GOPS
661   69.58 ms run,    2.62 ms python,   66.96 ms HIP,  586.00 loss, 0.001557 LR, 4.54 GB used,   9701.75 GFLOPS,    675.05 GOPS
662   70.11 ms run,    2.63 ms python,   67.48 ms HIP,  552.82 loss, 0.001553 LR, 4.54 GB used,   9628.76 GFLOPS,    675.05 GOPS
663   69.24 ms run,    2.61 ms python,   66.63 ms HIP,  569.97 loss, 0.001548 LR, 4.54 GB used,   9749.22 GFLOPS,    675.05 GOPS
664   69.93 ms run,    2.62 ms python,   67.31 ms HIP,  576.78 loss, 0.001544 LR, 4.54 GB used,   9653.37 GFLOPS,    675.05 GOPS
665   69.92 ms run,    2.79 ms python,   67.13 ms HIP,  565.56 loss, 0.001540 LR, 4.54 GB used,   9654.51 GFLOPS,    675.05 GOPS
666   69.48 ms run,    2.70 ms python,   66.79 ms HIP,  579.94 loss, 0.001535 LR, 4.54 GB used,   9715.04 GFLOPS,    675.05 GOPS
667   68.98 ms run,    2.66 ms python,   66.32 ms HIP,  564.24 loss, 0.001531 LR, 4.54 GB used,   9785.96 GFLOPS,    675.05 GOPS
668   69.42 ms run,    2.62 ms python,   66.80 ms HIP,  573.23 loss, 0.001527 LR, 4.54 GB used,   9724.37 GFLOPS,    675.05 GOPS
669   69.36 ms run,    2.65 ms python,   66.71 ms HIP,  571.86 loss, 0.001522 LR, 4.54 GB used,   9732.52 GFLOPS,    675.05 GOPS
670   69.28 ms run,    2.63 ms python,   66.66 ms HIP,  596.87 loss, 0.001518 LR, 4.54 GB used,   9743.53 GFLOPS,    675.05 GOPS
671   69.96 ms run,    2.71 ms python,   67.25 ms HIP,  570.27 loss, 0.001514 LR, 4.54 GB used,   9648.68 GFLOPS,    675.05 GOPS
672   69.56 ms run,    2.63 ms python,   66.93 ms HIP,  575.17 loss, 0.001509 LR, 4.54 GB used,   9704.28 GFLOPS,    675.05 GOPS
673   69.26 ms run,    2.64 ms python,   66.62 ms HIP,  563.80 loss, 0.001505 LR, 4.54 GB used,   9746.84 GFLOPS,    675.05 GOPS
674   69.59 ms run,    2.65 ms python,   66.94 ms HIP,  575.78 loss, 0.001501 LR, 4.54 GB used,   9700.17 GFLOPS,    675.05 GOPS
675   69.84 ms run,    2.62 ms python,   67.22 ms HIP,  563.96 loss, 0.001496 LR, 4.54 GB used,   9665.74 GFLOPS,    675.05 GOPS
676   69.58 ms run,    2.64 ms python,   66.94 ms HIP,  575.17 loss, 0.001492 LR, 4.54 GB used,   9701.21 GFLOPS,    675.05 GOPS
677   69.35 ms run,    2.63 ms python,   66.72 ms HIP,  576.46 loss, 0.001488 LR, 4.54 GB used,   9733.49 GFLOPS,    675.05 GOPS
678   69.84 ms run,    2.66 ms python,   67.18 ms HIP,  563.43 loss, 0.001483 LR, 4.54 GB used,   9665.56 GFLOPS,    675.05 GOPS
679   69.77 ms run,    2.63 ms python,   67.14 ms HIP,  582.41 loss, 0.001479 LR, 4.54 GB used,   9675.46 GFLOPS,    675.05 GOPS
680   69.82 ms run,    2.62 ms python,   67.20 ms HIP,  565.88 loss, 0.001474 LR, 4.54 GB used,   9667.93 GFLOPS,    675.05 GOPS
681   69.72 ms run,    2.68 ms python,   67.04 ms HIP,  571.82 loss, 0.001470 LR, 4.54 GB used,   9681.60 GFLOPS,    675.05 GOPS
682   69.43 ms run,    2.60 ms python,   66.84 ms HIP,  566.03 loss, 0.001466 LR, 4.54 GB used,   9722.06 GFLOPS,    675.05 GOPS
683   69.22 ms run,    2.76 ms python,   66.46 ms HIP,  568.31 loss, 0.001461 LR, 4.54 GB used,   9751.85 GFLOPS,    675.05 GOPS
684   69.83 ms run,    2.65 ms python,   67.18 ms HIP,  573.44 loss, 0.001457 LR, 4.54 GB used,   9667.25 GFLOPS,    675.05 GOPS
685   69.66 ms run,    2.66 ms python,   66.99 ms HIP,  533.51 loss, 0.001453 LR, 4.54 GB used,   9690.88 GFLOPS,    675.05 GOPS
shuffling training dataset in 1159.49 ms (epoch=7)
686 1235.82 ms run, 1162.57 ms python,   73.25 ms HIP,  554.17 loss, 0.001448 LR, 4.54 GB used,    546.69 GFLOPS,    675.61 GOPS
687   74.20 ms run,    2.77 ms python,   71.43 ms HIP,  546.76 loss, 0.001444 LR, 4.54 GB used,   9097.61 GFLOPS,    675.05 GOPS
688   72.21 ms run,    2.71 ms python,   69.50 ms HIP,  554.29 loss, 0.001440 LR, 4.54 GB used,   9348.90 GFLOPS,    675.05 GOPS
689   71.74 ms run,    2.66 ms python,   69.08 ms HIP,  560.47 loss, 0.001435 LR, 4.54 GB used,   9409.93 GFLOPS,    675.05 GOPS
690   70.23 ms run,    2.69 ms python,   67.54 ms HIP,  552.45 loss, 0.001431 LR, 4.54 GB used,   9612.55 GFLOPS,    675.05 GOPS
691   69.90 ms run,    2.70 ms python,   67.20 ms HIP,  558.22 loss, 0.001427 LR, 4.54 GB used,   9656.94 GFLOPS,    675.05 GOPS
692   69.89 ms run,    2.65 ms python,   67.24 ms HIP,  548.10 loss, 0.001422 LR, 4.54 GB used,   9658.98 GFLOPS,    675.05 GOPS
693   69.28 ms run,    2.64 ms python,   66.64 ms HIP,  550.90 loss, 0.001418 LR, 4.54 GB used,   9743.44 GFLOPS,    675.05 GOPS
694   69.40 ms run,    2.71 ms python,   66.69 ms HIP,  545.27 loss, 0.001414 LR, 4.54 GB used,   9727.47 GFLOPS,    675.05 GOPS
695   69.11 ms run,    2.70 ms python,   66.40 ms HIP,  555.22 loss, 0.001409 LR, 4.54 GB used,   9768.15 GFLOPS,    675.05 GOPS
696   69.04 ms run,    2.64 ms python,   66.41 ms HIP,  553.44 loss, 0.001405 LR, 4.54 GB used,   9777.12 GFLOPS,    675.05 GOPS
697   69.41 ms run,    2.62 ms python,   66.79 ms HIP,  542.86 loss, 0.001400 LR, 4.54 GB used,   9726.06 GFLOPS,    675.05 GOPS
698   70.41 ms run,    2.65 ms python,   67.76 ms HIP,  538.98 loss, 0.001396 LR, 4.54 GB used,   9588.01 GFLOPS,    675.05 GOPS
699   69.45 ms run,    2.72 ms python,   66.73 ms HIP,  547.87 loss, 0.001392 LR, 4.54 GB used,   9719.51 GFLOPS,    675.05 GOPS
700   69.36 ms run,    2.65 ms python,   66.72 ms HIP,  550.62 loss, 0.001387 LR, 4.54 GB used,   9732.17 GFLOPS,    675.05 GOPS
701   69.73 ms run,    2.61 ms python,   67.12 ms HIP,  548.99 loss, 0.001383 LR, 4.54 GB used,   9680.67 GFLOPS,    675.05 GOPS
702   69.49 ms run,    2.66 ms python,   66.83 ms HIP,  562.56 loss, 0.001379 LR, 4.54 GB used,   9714.51 GFLOPS,    675.05 GOPS
703   69.42 ms run,    2.72 ms python,   66.70 ms HIP,  545.80 loss, 0.001374 LR, 4.54 GB used,   9724.47 GFLOPS,    675.05 GOPS
704   69.53 ms run,    2.65 ms python,   66.88 ms HIP,  543.95 loss, 0.001370 LR, 4.54 GB used,   9708.93 GFLOPS,    675.05 GOPS
705   69.95 ms run,    2.65 ms python,   67.31 ms HIP,  550.51 loss, 0.001366 LR, 4.54 GB used,   9650.16 GFLOPS,    675.05 GOPS
706   69.06 ms run,    2.64 ms python,   66.42 ms HIP,  563.62 loss, 0.001361 LR, 4.54 GB used,   9774.28 GFLOPS,    675.05 GOPS
707   69.70 ms run,    2.66 ms python,   67.04 ms HIP,  553.62 loss, 0.001357 LR, 4.54 GB used,   9684.64 GFLOPS,    675.05 GOPS
708   69.86 ms run,    2.63 ms python,   67.24 ms HIP,  548.91 loss, 0.001353 LR, 4.54 GB used,   9662.51 GFLOPS,    675.05 GOPS
709   69.76 ms run,    2.64 ms python,   67.12 ms HIP,  553.52 loss, 0.001348 LR, 4.54 GB used,   9676.87 GFLOPS,    675.05 GOPS
710   69.71 ms run,    2.63 ms python,   67.09 ms HIP,  552.99 loss, 0.001344 LR, 4.54 GB used,   9683.37 GFLOPS,    675.05 GOPS
711   69.57 ms run,    2.65 ms python,   66.91 ms HIP,  554.13 loss, 0.001340 LR, 4.54 GB used,   9703.72 GFLOPS,    675.05 GOPS
712   70.20 ms run,    2.64 ms python,   67.56 ms HIP,  559.88 loss, 0.001335 LR, 4.54 GB used,   9616.10 GFLOPS,    675.05 GOPS
713   69.14 ms run,    2.70 ms python,   66.44 ms HIP,  561.09 loss, 0.001331 LR, 4.54 GB used,   9762.93 GFLOPS,    675.05 GOPS
714   69.68 ms run,    2.67 ms python,   67.01 ms HIP,  570.73 loss, 0.001326 LR, 4.54 GB used,   9688.16 GFLOPS,    675.05 GOPS
715   69.28 ms run,    2.64 ms python,   66.64 ms HIP,  549.33 loss, 0.001322 LR, 4.54 GB used,   9744.16 GFLOPS,    675.05 GOPS
716   69.36 ms run,    2.65 ms python,   66.72 ms HIP,  560.05 loss, 0.001318 LR, 4.54 GB used,   9732.05 GFLOPS,    675.05 GOPS
717   68.89 ms run,    2.65 ms python,   66.24 ms HIP,  558.79 loss, 0.001313 LR, 4.54 GB used,   9799.27 GFLOPS,    675.05 GOPS
718   69.34 ms run,    2.64 ms python,   66.70 ms HIP,  567.84 loss, 0.001309 LR, 4.54 GB used,   9735.95 GFLOPS,    675.05 GOPS
719   69.56 ms run,    2.61 ms python,   66.95 ms HIP,  556.73 loss, 0.001305 LR, 4.54 GB used,   9705.03 GFLOPS,    675.05 GOPS
720   68.84 ms run,    2.63 ms python,   66.21 ms HIP,  552.36 loss, 0.001300 LR, 4.54 GB used,   9805.99 GFLOPS,    675.05 GOPS
721   69.28 ms run,    2.62 ms python,   66.67 ms HIP,  548.49 loss, 0.001296 LR, 4.54 GB used,   9743.15 GFLOPS,    675.05 GOPS
722   69.55 ms run,    2.66 ms python,   66.89 ms HIP,  545.67 loss, 0.001292 LR, 4.54 GB used,   9705.72 GFLOPS,    675.05 GOPS
723   69.17 ms run,    2.62 ms python,   66.54 ms HIP,  562.48 loss, 0.001287 LR, 4.54 GB used,   9759.35 GFLOPS,    675.05 GOPS
724   69.49 ms run,    2.65 ms python,   66.84 ms HIP,  557.03 loss, 0.001283 LR, 4.54 GB used,   9714.21 GFLOPS,    675.05 GOPS
725   69.28 ms run,    2.68 ms python,   66.60 ms HIP,  552.82 loss, 0.001279 LR, 4.54 GB used,   9743.30 GFLOPS,    675.05 GOPS
726   70.04 ms run,    2.65 ms python,   67.39 ms HIP,  554.57 loss, 0.001274 LR, 4.54 GB used,   9638.21 GFLOPS,    675.05 GOPS
727   69.25 ms run,    2.62 ms python,   66.63 ms HIP,  546.55 loss, 0.001270 LR, 4.54 GB used,   9748.51 GFLOPS,    675.05 GOPS
728   69.16 ms run,    2.64 ms python,   66.52 ms HIP,  559.60 loss, 0.001266 LR, 4.54 GB used,   9760.15 GFLOPS,    675.05 GOPS
729   69.81 ms run,    2.63 ms python,   67.18 ms HIP,  552.28 loss, 0.001261 LR, 4.54 GB used,   9670.00 GFLOPS,    675.05 GOPS
730   69.50 ms run,    2.64 ms python,   66.86 ms HIP,  562.51 loss, 0.001257 LR, 4.54 GB used,   9712.98 GFLOPS,    675.05 GOPS
731   69.31 ms run,    2.62 ms python,   66.70 ms HIP,  557.94 loss, 0.001252 LR, 4.54 GB used,   9739.10 GFLOPS,    675.05 GOPS
732   69.84 ms run,    2.63 ms python,   67.21 ms HIP,  557.27 loss, 0.001248 LR, 4.54 GB used,   9665.30 GFLOPS,    675.05 GOPS
733   69.26 ms run,    2.74 ms python,   66.52 ms HIP,  545.38 loss, 0.001244 LR, 4.54 GB used,   9746.24 GFLOPS,    675.05 GOPS
734   69.50 ms run,    2.68 ms python,   66.82 ms HIP,  557.56 loss, 0.001239 LR, 4.54 GB used,   9713.20 GFLOPS,    675.05 GOPS
735   69.29 ms run,    2.65 ms python,   66.65 ms HIP,  539.31 loss, 0.001235 LR, 4.54 GB used,   9741.67 GFLOPS,    675.05 GOPS
736   69.40 ms run,    2.67 ms python,   66.73 ms HIP,  550.31 loss, 0.001231 LR, 4.54 GB used,   9726.19 GFLOPS,    675.05 GOPS
737   69.08 ms run,    2.64 ms python,   66.44 ms HIP,  550.49 loss, 0.001226 LR, 4.54 GB used,   9771.43 GFLOPS,    675.05 GOPS
738   69.15 ms run,    2.61 ms python,   66.54 ms HIP,  569.39 loss, 0.001222 LR, 4.54 GB used,   9762.12 GFLOPS,    675.05 GOPS
739   68.99 ms run,    2.69 ms python,   66.30 ms HIP,  557.04 loss, 0.001218 LR, 4.54 GB used,   9784.27 GFLOPS,    675.05 GOPS
740   69.56 ms run,    2.62 ms python,   66.94 ms HIP,  559.56 loss, 0.001213 LR, 4.54 GB used,   9704.54 GFLOPS,    675.05 GOPS
741   69.13 ms run,    2.62 ms python,   66.52 ms HIP,  551.71 loss, 0.001209 LR, 4.54 GB used,   9764.18 GFLOPS,    675.05 GOPS
742   69.37 ms run,    2.71 ms python,   66.65 ms HIP,  547.54 loss, 0.001205 LR, 4.54 GB used,   9731.62 GFLOPS,    675.05 GOPS
743   68.66 ms run,    2.63 ms python,   66.04 ms HIP,  557.30 loss, 0.001200 LR, 4.54 GB used,   9831.25 GFLOPS,    675.05 GOPS
744   69.95 ms run,    2.65 ms python,   67.31 ms HIP,  558.61 loss, 0.001196 LR, 4.54 GB used,   9649.86 GFLOPS,    675.05 GOPS
745   69.91 ms run,    2.65 ms python,   67.26 ms HIP,  550.68 loss, 0.001192 LR, 4.54 GB used,   9656.39 GFLOPS,    675.05 GOPS
746   70.53 ms run,    2.62 ms python,   67.91 ms HIP,  555.99 loss, 0.001187 LR, 4.54 GB used,   9570.85 GFLOPS,    675.05 GOPS
747   70.22 ms run,    2.75 ms python,   67.47 ms HIP,  555.76 loss, 0.001183 LR, 4.54 GB used,   9612.83 GFLOPS,    675.05 GOPS
748   69.29 ms run,    2.68 ms python,   66.61 ms HIP,  557.85 loss, 0.001178 LR, 4.54 GB used,   9742.13 GFLOPS,    675.05 GOPS
749   69.46 ms run,    2.64 ms python,   66.82 ms HIP,  562.48 loss, 0.001174 LR, 4.54 GB used,   9717.84 GFLOPS,    675.05 GOPS
750   69.44 ms run,    2.65 ms python,   66.80 ms HIP,  539.62 loss, 0.001170 LR, 4.54 GB used,   9721.07 GFLOPS,    675.05 GOPS
751   69.20 ms run,    2.60 ms python,   66.60 ms HIP,  550.59 loss, 0.001165 LR, 4.54 GB used,   9754.40 GFLOPS,    675.05 GOPS
752   68.78 ms run,    2.64 ms python,   66.14 ms HIP,  545.35 loss, 0.001161 LR, 4.54 GB used,   9814.52 GFLOPS,    675.05 GOPS
753   69.49 ms run,    2.62 ms python,   66.87 ms HIP,  562.82 loss, 0.001157 LR, 4.54 GB used,   9714.61 GFLOPS,    675.05 GOPS
754   69.30 ms run,    2.71 ms python,   66.59 ms HIP,  553.77 loss, 0.001152 LR, 4.54 GB used,   9740.49 GFLOPS,    675.05 GOPS
755   69.49 ms run,    2.70 ms python,   66.79 ms HIP,  558.77 loss, 0.001148 LR, 4.54 GB used,   9714.21 GFLOPS,    675.05 GOPS
756   69.36 ms run,    2.67 ms python,   66.69 ms HIP,  553.38 loss, 0.001144 LR, 4.54 GB used,   9732.63 GFLOPS,    675.05 GOPS
757   69.12 ms run,    2.67 ms python,   66.45 ms HIP,  558.24 loss, 0.001139 LR, 4.54 GB used,   9766.93 GFLOPS,    675.05 GOPS
758   68.50 ms run,    2.62 ms python,   65.88 ms HIP,  542.75 loss, 0.001135 LR, 4.54 GB used,   9855.19 GFLOPS,    675.05 GOPS
759   69.96 ms run,    2.65 ms python,   67.31 ms HIP,  567.17 loss, 0.001131 LR, 4.54 GB used,   9648.36 GFLOPS,    675.05 GOPS
760   68.99 ms run,    2.63 ms python,   66.36 ms HIP,  565.41 loss, 0.001126 LR, 4.54 GB used,   9784.37 GFLOPS,    675.05 GOPS
761   69.63 ms run,    2.64 ms python,   66.99 ms HIP,  561.65 loss, 0.001122 LR, 4.54 GB used,   9694.19 GFLOPS,    675.05 GOPS
762   69.51 ms run,    2.63 ms python,   66.89 ms HIP,  551.63 loss, 0.001118 LR, 4.54 GB used,   9710.90 GFLOPS,    675.05 GOPS
763   69.10 ms run,    2.74 ms python,   66.36 ms HIP,  552.97 loss, 0.001113 LR, 4.54 GB used,   9769.21 GFLOPS,    675.05 GOPS
764   69.26 ms run,    2.65 ms python,   66.62 ms HIP,  544.57 loss, 0.001109 LR, 4.54 GB used,   9745.87 GFLOPS,    675.05 GOPS
765   69.40 ms run,    2.64 ms python,   66.76 ms HIP,  545.29 loss, 0.001104 LR, 4.54 GB used,   9726.92 GFLOPS,    675.05 GOPS
766   69.41 ms run,    2.62 ms python,   66.79 ms HIP,  545.26 loss, 0.001100 LR, 4.54 GB used,   9725.45 GFLOPS,    675.05 GOPS
767   69.17 ms run,    2.62 ms python,   66.55 ms HIP,  543.88 loss, 0.001096 LR, 4.54 GB used,   9759.39 GFLOPS,    675.05 GOPS
768   69.20 ms run,    2.70 ms python,   66.51 ms HIP,  550.40 loss, 0.001091 LR, 4.54 GB used,   9754.92 GFLOPS,    675.05 GOPS
769   69.48 ms run,    2.63 ms python,   66.85 ms HIP,  551.71 loss, 0.001087 LR, 4.54 GB used,   9715.69 GFLOPS,    675.05 GOPS
770   69.94 ms run,    2.73 ms python,   67.21 ms HIP,  559.03 loss, 0.001083 LR, 4.54 GB used,   9651.86 GFLOPS,    675.05 GOPS
771   70.06 ms run,    2.63 ms python,   67.43 ms HIP,  568.44 loss, 0.001078 LR, 4.54 GB used,   9635.35 GFLOPS,    675.05 GOPS
772   69.31 ms run,    2.77 ms python,   66.54 ms HIP,  541.96 loss, 0.001074 LR, 4.54 GB used,   9740.15 GFLOPS,    675.05 GOPS
773   69.65 ms run,    2.65 ms python,   67.01 ms HIP,  550.99 loss, 0.001070 LR, 4.54 GB used,   9691.46 GFLOPS,    675.05 GOPS
774   69.71 ms run,    2.68 ms python,   67.03 ms HIP,  564.27 loss, 0.001065 LR, 4.54 GB used,   9683.71 GFLOPS,    675.05 GOPS
775   70.05 ms run,    2.71 ms python,   67.34 ms HIP,  542.48 loss, 0.001061 LR, 4.54 GB used,   9636.43 GFLOPS,    675.05 GOPS
776   69.61 ms run,    2.63 ms python,   66.98 ms HIP,  550.75 loss, 0.001057 LR, 4.54 GB used,   9697.27 GFLOPS,    675.05 GOPS
777   69.24 ms run,    2.74 ms python,   66.50 ms HIP,  538.54 loss, 0.001052 LR, 4.54 GB used,   9749.74 GFLOPS,    675.05 GOPS
778   69.04 ms run,    2.63 ms python,   66.42 ms HIP,  545.96 loss, 0.001048 LR, 4.54 GB used,   9776.91 GFLOPS,    675.05 GOPS
779   68.63 ms run,    2.69 ms python,   65.94 ms HIP,  542.59 loss, 0.001044 LR, 4.54 GB used,   9836.05 GFLOPS,    675.05 GOPS
780   69.08 ms run,    2.69 ms python,   66.39 ms HIP,  571.61 loss, 0.001039 LR, 4.54 GB used,   9771.52 GFLOPS,    675.05 GOPS
781   69.83 ms run,    2.64 ms python,   67.19 ms HIP,  547.73 loss, 0.001035 LR, 4.54 GB used,   9667.32 GFLOPS,    675.05 GOPS
782   69.63 ms run,    2.65 ms python,   66.98 ms HIP,  550.16 loss, 0.001030 LR, 4.54 GB used,   9695.20 GFLOPS,    675.05 GOPS
783   69.15 ms run,    2.69 ms python,   66.46 ms HIP,  526.68 loss, 0.001026 LR, 4.54 GB used,   9762.53 GFLOPS,    675.05 GOPS
shuffling training dataset in 1248.18 ms (epoch=8)
784 1323.85 ms run, 1251.22 ms python,   72.63 ms HIP,  546.24 loss, 0.001022 LR, 4.54 GB used,    510.34 GFLOPS,    675.61 GOPS
785   74.02 ms run,    2.81 ms python,   71.21 ms HIP,  541.10 loss, 0.001017 LR, 4.54 GB used,   9119.92 GFLOPS,    675.05 GOPS
786   72.56 ms run,    2.70 ms python,   69.86 ms HIP,  551.43 loss, 0.001013 LR, 4.54 GB used,   9303.12 GFLOPS,    675.05 GOPS
787   71.45 ms run,    2.70 ms python,   68.74 ms HIP,  547.53 loss, 0.001009 LR, 4.54 GB used,   9448.06 GFLOPS,    675.05 GOPS
788   71.17 ms run,    2.69 ms python,   68.48 ms HIP,  532.55 loss, 0.001004 LR, 4.54 GB used,   9484.84 GFLOPS,    675.05 GOPS
789   70.74 ms run,    2.63 ms python,   68.11 ms HIP,  535.94 loss, 0.001000 LR, 4.54 GB used,   9542.38 GFLOPS,    675.05 GOPS
790   69.83 ms run,    2.72 ms python,   67.11 ms HIP,  536.94 loss, 0.000996 LR, 4.54 GB used,   9666.81 GFLOPS,    675.05 GOPS
791   70.25 ms run,    2.66 ms python,   67.60 ms HIP,  529.04 loss, 0.000991 LR, 4.54 GB used,   9608.96 GFLOPS,    675.05 GOPS
792   69.84 ms run,    2.64 ms python,   67.20 ms HIP,  529.08 loss, 0.000987 LR, 4.54 GB used,   9665.85 GFLOPS,    675.05 GOPS
793   69.01 ms run,    2.63 ms python,   66.38 ms HIP,  548.76 loss, 0.000983 LR, 4.54 GB used,   9782.01 GFLOPS,    675.05 GOPS
794   69.28 ms run,    2.65 ms python,   66.62 ms HIP,  543.41 loss, 0.000978 LR, 4.54 GB used,   9743.95 GFLOPS,    675.05 GOPS
795   68.82 ms run,    2.66 ms python,   66.16 ms HIP,  543.51 loss, 0.000974 LR, 4.54 GB used,   9809.33 GFLOPS,    675.05 GOPS
796   69.00 ms run,    2.65 ms python,   66.35 ms HIP,  544.11 loss, 0.000970 LR, 4.54 GB used,   9782.78 GFLOPS,    675.05 GOPS
797   68.95 ms run,    2.67 ms python,   66.28 ms HIP,  528.20 loss, 0.000965 LR, 4.54 GB used,   9790.26 GFLOPS,    675.05 GOPS
798   69.07 ms run,    2.65 ms python,   66.42 ms HIP,  539.38 loss, 0.000961 LR, 4.54 GB used,   9773.15 GFLOPS,    675.05 GOPS
799   69.50 ms run,    2.62 ms python,   66.88 ms HIP,  544.07 loss, 0.000956 LR, 4.54 GB used,   9713.17 GFLOPS,    675.05 GOPS
800   69.56 ms run,    2.61 ms python,   66.95 ms HIP,  531.35 loss, 0.000952 LR, 4.54 GB used,   9704.85 GFLOPS,    675.05 GOPS
801   69.45 ms run,    2.59 ms python,   66.86 ms HIP,  532.82 loss, 0.000948 LR, 4.54 GB used,   9719.67 GFLOPS,    675.05 GOPS
802   68.74 ms run,    2.67 ms python,   66.07 ms HIP,  541.33 loss, 0.000943 LR, 4.54 GB used,   9820.35 GFLOPS,    675.05 GOPS
803   69.12 ms run,    2.64 ms python,   66.48 ms HIP,  528.60 loss, 0.000939 LR, 4.54 GB used,   9766.26 GFLOPS,    675.05 GOPS
804   69.52 ms run,    2.75 ms python,   66.77 ms HIP,  525.63 loss, 0.000935 LR, 4.54 GB used,   9710.05 GFLOPS,    675.05 GOPS
805   69.72 ms run,    2.62 ms python,   67.10 ms HIP,  536.51 loss, 0.000930 LR, 4.54 GB used,   9681.63 GFLOPS,    675.05 GOPS
806   69.32 ms run,    2.64 ms python,   66.68 ms HIP,  532.97 loss, 0.000926 LR, 4.54 GB used,   9737.54 GFLOPS,    675.05 GOPS
807   68.84 ms run,    2.66 ms python,   66.19 ms HIP,  532.18 loss, 0.000922 LR, 4.54 GB used,   9805.46 GFLOPS,    675.05 GOPS
808   69.55 ms run,    2.65 ms python,   66.90 ms HIP,  533.36 loss, 0.000917 LR, 4.54 GB used,   9706.15 GFLOPS,    675.05 GOPS
809   69.58 ms run,    2.70 ms python,   66.88 ms HIP,  535.91 loss, 0.000913 LR, 4.54 GB used,   9701.92 GFLOPS,    675.05 GOPS
810   69.74 ms run,    2.65 ms python,   67.09 ms HIP,  543.84 loss, 0.000909 LR, 4.54 GB used,   9678.97 GFLOPS,    675.05 GOPS
811   69.24 ms run,    2.61 ms python,   66.63 ms HIP,  547.66 loss, 0.000904 LR, 4.54 GB used,   9749.23 GFLOPS,    675.05 GOPS
812   69.11 ms run,    2.72 ms python,   66.39 ms HIP,  536.60 loss, 0.000900 LR, 4.54 GB used,   9768.01 GFLOPS,    675.05 GOPS
813   68.71 ms run,    2.64 ms python,   66.07 ms HIP,  542.15 loss, 0.000896 LR, 4.54 GB used,   9824.97 GFLOPS,    675.05 GOPS
814   69.23 ms run,    2.66 ms python,   66.57 ms HIP,  534.80 loss, 0.000891 LR, 4.54 GB used,   9750.34 GFLOPS,    675.05 GOPS
815   69.53 ms run,    2.68 ms python,   66.85 ms HIP,  531.42 loss, 0.000887 LR, 4.54 GB used,   9709.17 GFLOPS,    675.05 GOPS
816   70.02 ms run,    2.60 ms python,   67.41 ms HIP,  528.62 loss, 0.000882 LR, 4.54 GB used,   9641.35 GFLOPS,    675.05 GOPS
817   69.68 ms run,    2.64 ms python,   67.04 ms HIP,  538.65 loss, 0.000878 LR, 4.54 GB used,   9688.05 GFLOPS,    675.05 GOPS
818   69.74 ms run,    2.63 ms python,   67.10 ms HIP,  534.32 loss, 0.000874 LR, 4.54 GB used,   9680.00 GFLOPS,    675.05 GOPS
819   69.87 ms run,    2.64 ms python,   67.23 ms HIP,  528.08 loss, 0.000869 LR, 4.54 GB used,   9661.57 GFLOPS,    675.05 GOPS
820   69.37 ms run,    2.70 ms python,   66.67 ms HIP,  541.12 loss, 0.000865 LR, 4.54 GB used,   9730.47 GFLOPS,    675.05 GOPS
821   69.86 ms run,    2.64 ms python,   67.21 ms HIP,  527.85 loss, 0.000861 LR, 4.54 GB used,   9663.30 GFLOPS,    675.05 GOPS
822   69.52 ms run,    2.68 ms python,   66.84 ms HIP,  531.59 loss, 0.000856 LR, 4.54 GB used,   9710.65 GFLOPS,    675.05 GOPS
823   69.42 ms run,    2.62 ms python,   66.80 ms HIP,  538.39 loss, 0.000852 LR, 4.54 GB used,   9724.43 GFLOPS,    675.05 GOPS
824   70.23 ms run,    2.62 ms python,   67.61 ms HIP,  541.76 loss, 0.000848 LR, 4.54 GB used,   9611.38 GFLOPS,    675.05 GOPS
825   69.53 ms run,    2.68 ms python,   66.85 ms HIP,  528.67 loss, 0.000843 LR, 4.54 GB used,   9708.25 GFLOPS,    675.05 GOPS
826   68.89 ms run,    2.61 ms python,   66.28 ms HIP,  535.09 loss, 0.000839 LR, 4.54 GB used,   9798.49 GFLOPS,    675.05 GOPS
827   69.24 ms run,    2.68 ms python,   66.56 ms HIP,  533.99 loss, 0.000835 LR, 4.54 GB used,   9749.74 GFLOPS,    675.05 GOPS
828   69.66 ms run,    2.64 ms python,   67.02 ms HIP,  530.62 loss, 0.000830 LR, 4.54 GB used,   9690.68 GFLOPS,    675.05 GOPS
829   69.72 ms run,    2.66 ms python,   67.06 ms HIP,  537.17 loss, 0.000826 LR, 4.54 GB used,   9682.39 GFLOPS,    675.05 GOPS
830   69.32 ms run,    2.66 ms python,   66.66 ms HIP,  538.48 loss, 0.000822 LR, 4.54 GB used,   9737.89 GFLOPS,    675.05 GOPS
831   69.51 ms run,    2.63 ms python,   66.88 ms HIP,  535.70 loss, 0.000817 LR, 4.54 GB used,   9710.83 GFLOPS,    675.05 GOPS
832   69.41 ms run,    2.62 ms python,   66.79 ms HIP,  525.67 loss, 0.000813 LR, 4.54 GB used,   9725.40 GFLOPS,    675.05 GOPS
833   69.50 ms run,    2.67 ms python,   66.83 ms HIP,  532.23 loss, 0.000808 LR, 4.54 GB used,   9712.42 GFLOPS,    675.05 GOPS
834   69.33 ms run,    2.63 ms python,   66.69 ms HIP,  528.42 loss, 0.000804 LR, 4.54 GB used,   9737.16 GFLOPS,    675.05 GOPS
835   69.31 ms run,    2.68 ms python,   66.63 ms HIP,  535.32 loss, 0.000800 LR, 4.54 GB used,   9739.56 GFLOPS,    675.05 GOPS
836   69.73 ms run,    2.67 ms python,   67.07 ms HIP,  536.20 loss, 0.000795 LR, 4.54 GB used,   9680.20 GFLOPS,    675.05 GOPS
837   69.43 ms run,    2.64 ms python,   66.78 ms HIP,  538.89 loss, 0.000791 LR, 4.54 GB used,   9723.36 GFLOPS,    675.05 GOPS
838   69.13 ms run,    2.61 ms python,   66.52 ms HIP,  535.20 loss, 0.000787 LR, 4.54 GB used,   9764.50 GFLOPS,    675.05 GOPS
839   69.10 ms run,    2.60 ms python,   66.50 ms HIP,  533.34 loss, 0.000782 LR, 4.54 GB used,   9768.46 GFLOPS,    675.05 GOPS
840   68.19 ms run,    2.62 ms python,   65.57 ms HIP,  539.02 loss, 0.000778 LR, 4.54 GB used,   9899.76 GFLOPS,    675.05 GOPS
841   69.23 ms run,    2.65 ms python,   66.57 ms HIP,  536.14 loss, 0.000774 LR, 4.54 GB used,   9751.34 GFLOPS,    675.05 GOPS
842   69.47 ms run,    2.65 ms python,   66.83 ms HIP,  534.74 loss, 0.000769 LR, 4.54 GB used,   9716.57 GFLOPS,    675.05 GOPS
843   69.61 ms run,    2.62 ms python,   67.00 ms HIP,  539.64 loss, 0.000765 LR, 4.54 GB used,   9697.05 GFLOPS,    675.05 GOPS
844   69.04 ms run,    2.64 ms python,   66.40 ms HIP,  530.05 loss, 0.000761 LR, 4.54 GB used,   9777.32 GFLOPS,    675.05 GOPS
845   69.49 ms run,    2.65 ms python,   66.83 ms HIP,  536.00 loss, 0.000756 LR, 4.54 GB used,   9714.84 GFLOPS,    675.05 GOPS
846   69.04 ms run,    2.70 ms python,   66.34 ms HIP,  534.40 loss, 0.000752 LR, 4.54 GB used,   9777.87 GFLOPS,    675.05 GOPS
847   69.78 ms run,    2.63 ms python,   67.15 ms HIP,  536.08 loss, 0.000748 LR, 4.54 GB used,   9673.98 GFLOPS,    675.05 GOPS
848   68.62 ms run,    2.61 ms python,   66.02 ms HIP,  541.98 loss, 0.000743 LR, 4.54 GB used,   9836.97 GFLOPS,    675.05 GOPS
849   69.78 ms run,    2.60 ms python,   67.19 ms HIP,  526.80 loss, 0.000739 LR, 4.54 GB used,   9673.56 GFLOPS,    675.05 GOPS
850   69.34 ms run,    2.64 ms python,   66.70 ms HIP,  531.86 loss, 0.000734 LR, 4.54 GB used,   9735.31 GFLOPS,    675.05 GOPS
851   69.14 ms run,    2.65 ms python,   66.49 ms HIP,  534.87 loss, 0.000730 LR, 4.54 GB used,   9764.06 GFLOPS,    675.05 GOPS
852   69.53 ms run,    2.63 ms python,   66.90 ms HIP,  523.91 loss, 0.000726 LR, 4.54 GB used,   9709.10 GFLOPS,    675.05 GOPS
853   69.87 ms run,    3.20 ms python,   66.67 ms HIP,  535.79 loss, 0.000721 LR, 4.54 GB used,   9660.99 GFLOPS,    675.05 GOPS
854   69.08 ms run,    2.67 ms python,   66.42 ms HIP,  529.02 loss, 0.000717 LR, 4.54 GB used,   9771.59 GFLOPS,    675.05 GOPS
855   68.56 ms run,    2.65 ms python,   65.90 ms HIP,  525.92 loss, 0.000713 LR, 4.54 GB used,   9846.09 GFLOPS,    675.05 GOPS
856   69.18 ms run,    2.64 ms python,   66.54 ms HIP,  536.56 loss, 0.000708 LR, 4.54 GB used,   9757.63 GFLOPS,    675.05 GOPS
857   69.45 ms run,    2.70 ms python,   66.75 ms HIP,  520.37 loss, 0.000704 LR, 4.54 GB used,   9719.67 GFLOPS,    675.05 GOPS
858   69.41 ms run,    2.64 ms python,   66.77 ms HIP,  531.96 loss, 0.000700 LR, 4.54 GB used,   9725.64 GFLOPS,    675.05 GOPS
859   70.08 ms run,    2.66 ms python,   67.41 ms HIP,  540.40 loss, 0.000695 LR, 4.54 GB used,   9632.94 GFLOPS,    675.05 GOPS
860   69.25 ms run,    2.65 ms python,   66.60 ms HIP,  535.21 loss, 0.000691 LR, 4.54 GB used,   9747.91 GFLOPS,    675.05 GOPS
861   69.53 ms run,    2.64 ms python,   66.90 ms HIP,  538.28 loss, 0.000687 LR, 4.54 GB used,   9708.03 GFLOPS,    675.05 GOPS
862   69.53 ms run,    2.63 ms python,   66.90 ms HIP,  523.75 loss, 0.000682 LR, 4.54 GB used,   9708.76 GFLOPS,    675.05 GOPS
863   69.10 ms run,    2.63 ms python,   66.47 ms HIP,  540.04 loss, 0.000678 LR, 4.54 GB used,   9768.97 GFLOPS,    675.05 GOPS
864   69.50 ms run,    2.65 ms python,   66.86 ms HIP,  532.45 loss, 0.000674 LR, 4.54 GB used,   9712.18 GFLOPS,    675.05 GOPS
865   69.51 ms run,    2.71 ms python,   66.80 ms HIP,  535.42 loss, 0.000669 LR, 4.54 GB used,   9710.89 GFLOPS,    675.05 GOPS
866   68.93 ms run,    2.71 ms python,   66.22 ms HIP,  530.29 loss, 0.000665 LR, 4.54 GB used,   9793.27 GFLOPS,    675.05 GOPS
867   69.27 ms run,    2.68 ms python,   66.59 ms HIP,  513.62 loss, 0.000660 LR, 4.54 GB used,   9744.63 GFLOPS,    675.05 GOPS
868   69.18 ms run,    2.59 ms python,   66.58 ms HIP,  533.72 loss, 0.000656 LR, 4.54 GB used,   9758.46 GFLOPS,    675.05 GOPS
869   69.22 ms run,    2.64 ms python,   66.58 ms HIP,  520.23 loss, 0.000652 LR, 4.54 GB used,   9752.79 GFLOPS,    675.05 GOPS
870   69.57 ms run,    2.63 ms python,   66.94 ms HIP,  531.70 loss, 0.000647 LR, 4.54 GB used,   9702.61 GFLOPS,    675.05 GOPS
871   69.55 ms run,    2.68 ms python,   66.87 ms HIP,  540.77 loss, 0.000643 LR, 4.54 GB used,   9705.38 GFLOPS,    675.05 GOPS
872   69.33 ms run,    2.63 ms python,   66.70 ms HIP,  533.11 loss, 0.000639 LR, 4.54 GB used,   9736.47 GFLOPS,    675.05 GOPS
873   69.83 ms run,    2.65 ms python,   67.19 ms HIP,  527.30 loss, 0.000634 LR, 4.54 GB used,   9666.60 GFLOPS,    675.05 GOPS
874   69.72 ms run,    2.68 ms python,   67.03 ms HIP,  537.70 loss, 0.000630 LR, 4.54 GB used,   9682.89 GFLOPS,    675.05 GOPS
875   69.59 ms run,    2.64 ms python,   66.95 ms HIP,  547.15 loss, 0.000626 LR, 4.54 GB used,   9700.12 GFLOPS,    675.05 GOPS
876   69.41 ms run,    2.65 ms python,   66.77 ms HIP,  532.26 loss, 0.000621 LR, 4.54 GB used,   9725.11 GFLOPS,    675.05 GOPS
877   69.81 ms run,    2.61 ms python,   67.20 ms HIP,  546.89 loss, 0.000617 LR, 4.54 GB used,   9669.47 GFLOPS,    675.05 GOPS
878   69.05 ms run,    2.61 ms python,   66.43 ms HIP,  526.37 loss, 0.000613 LR, 4.54 GB used,   9776.26 GFLOPS,    675.05 GOPS
879   69.45 ms run,    2.61 ms python,   66.84 ms HIP,  537.78 loss, 0.000608 LR, 4.54 GB used,   9720.34 GFLOPS,    675.05 GOPS
880   69.33 ms run,    2.68 ms python,   66.64 ms HIP,  532.12 loss, 0.000604 LR, 4.54 GB used,   9737.14 GFLOPS,    675.05 GOPS
881   69.40 ms run,    2.69 ms python,   66.71 ms HIP,  512.16 loss, 0.000600 LR, 4.54 GB used,   9727.35 GFLOPS,    675.05 GOPS
shuffling training dataset in 1159.44 ms (epoch=9)
882 1235.03 ms run, 1162.57 ms python,   72.46 ms HIP,  520.87 loss, 0.000595 LR, 4.54 GB used,    547.04 GFLOPS,    675.61 GOPS
883   74.16 ms run,    2.88 ms python,   71.29 ms HIP,  516.05 loss, 0.000591 LR, 4.54 GB used,   9102.24 GFLOPS,    675.05 GOPS
884   72.66 ms run,    2.72 ms python,   69.94 ms HIP,  525.34 loss, 0.000586 LR, 4.54 GB used,   9291.01 GFLOPS,    675.05 GOPS
885   71.26 ms run,    2.67 ms python,   68.59 ms HIP,  506.30 loss, 0.000582 LR, 4.54 GB used,   9472.61 GFLOPS,    675.05 GOPS
886   70.69 ms run,    2.71 ms python,   67.98 ms HIP,  511.95 loss, 0.000578 LR, 4.54 GB used,   9550.00 GFLOPS,    675.05 GOPS
887   70.02 ms run,    2.62 ms python,   67.40 ms HIP,  516.78 loss, 0.000573 LR, 4.54 GB used,   9640.87 GFLOPS,    675.05 GOPS
888   69.19 ms run,    2.66 ms python,   66.53 ms HIP,  515.21 loss, 0.000569 LR, 4.54 GB used,   9757.04 GFLOPS,    675.05 GOPS
889   69.68 ms run,    2.62 ms python,   67.06 ms HIP,  517.03 loss, 0.000565 LR, 4.54 GB used,   9687.99 GFLOPS,    675.05 GOPS
890   69.62 ms run,    2.63 ms python,   66.99 ms HIP,  507.96 loss, 0.000560 LR, 4.54 GB used,   9696.45 GFLOPS,    675.05 GOPS
891   69.55 ms run,    2.64 ms python,   66.91 ms HIP,  515.07 loss, 0.000556 LR, 4.54 GB used,   9705.92 GFLOPS,    675.05 GOPS
892   69.32 ms run,    2.63 ms python,   66.69 ms HIP,  521.80 loss, 0.000552 LR, 4.54 GB used,   9738.24 GFLOPS,    675.05 GOPS
893   69.32 ms run,    2.63 ms python,   66.69 ms HIP,  524.90 loss, 0.000547 LR, 4.54 GB used,   9737.80 GFLOPS,    675.05 GOPS
894   69.05 ms run,    2.63 ms python,   66.42 ms HIP,  518.19 loss, 0.000543 LR, 4.54 GB used,   9776.12 GFLOPS,    675.05 GOPS
895   68.94 ms run,    2.61 ms python,   66.34 ms HIP,  514.16 loss, 0.000539 LR, 4.54 GB used,   9791.23 GFLOPS,    675.05 GOPS
896   69.71 ms run,    2.61 ms python,   67.09 ms HIP,  514.89 loss, 0.000534 LR, 4.54 GB used,   9683.90 GFLOPS,    675.05 GOPS
897   69.34 ms run,    2.65 ms python,   66.69 ms HIP,  511.81 loss, 0.000530 LR, 4.54 GB used,   9734.89 GFLOPS,    675.05 GOPS
898   69.24 ms run,    2.64 ms python,   66.61 ms HIP,  512.98 loss, 0.000526 LR, 4.54 GB used,   9749.11 GFLOPS,    675.05 GOPS
899   69.35 ms run,    2.63 ms python,   66.72 ms HIP,  508.40 loss, 0.000521 LR, 4.54 GB used,   9734.27 GFLOPS,    675.05 GOPS
900   68.77 ms run,    2.63 ms python,   66.14 ms HIP,  511.02 loss, 0.000517 LR, 4.54 GB used,   9815.58 GFLOPS,    675.05 GOPS
901   69.17 ms run,    2.64 ms python,   66.53 ms HIP,  515.47 loss, 0.000513 LR, 4.54 GB used,   9759.06 GFLOPS,    675.05 GOPS
902   69.13 ms run,    2.64 ms python,   66.49 ms HIP,  514.25 loss, 0.000508 LR, 4.54 GB used,   9764.42 GFLOPS,    675.05 GOPS
903   69.23 ms run,    2.74 ms python,   66.49 ms HIP,  514.87 loss, 0.000504 LR, 4.54 GB used,   9751.23 GFLOPS,    675.05 GOPS
904   69.54 ms run,    2.67 ms python,   66.87 ms HIP,  510.42 loss, 0.000499 LR, 4.54 GB used,   9707.15 GFLOPS,    675.05 GOPS
905   68.88 ms run,    2.63 ms python,   66.25 ms HIP,  511.87 loss, 0.000495 LR, 4.54 GB used,   9800.72 GFLOPS,    675.05 GOPS
906   68.98 ms run,    2.66 ms python,   66.32 ms HIP,  508.94 loss, 0.000491 LR, 4.54 GB used,   9785.88 GFLOPS,    675.05 GOPS
907   68.82 ms run,    2.65 ms python,   66.18 ms HIP,  519.92 loss, 0.000486 LR, 4.54 GB used,   9808.29 GFLOPS,    675.05 GOPS
908   68.83 ms run,    2.69 ms python,   66.14 ms HIP,  507.73 loss, 0.000482 LR, 4.54 GB used,   9806.73 GFLOPS,    675.05 GOPS
909   69.73 ms run,    2.63 ms python,   67.10 ms HIP,  518.91 loss, 0.000478 LR, 4.54 GB used,   9680.49 GFLOPS,    675.05 GOPS
910   69.38 ms run,    2.64 ms python,   66.74 ms HIP,  520.09 loss, 0.000473 LR, 4.54 GB used,   9729.58 GFLOPS,    675.05 GOPS
911   69.32 ms run,    2.66 ms python,   66.66 ms HIP,  520.63 loss, 0.000469 LR, 4.54 GB used,   9738.13 GFLOPS,    675.05 GOPS
912   69.55 ms run,    2.62 ms python,   66.93 ms HIP,  501.50 loss, 0.000465 LR, 4.54 GB used,   9705.62 GFLOPS,    675.05 GOPS
913   68.85 ms run,    2.63 ms python,   66.22 ms HIP,  515.14 loss, 0.000460 LR, 4.54 GB used,   9804.22 GFLOPS,    675.05 GOPS
914   69.07 ms run,    2.64 ms python,   66.43 ms HIP,  515.34 loss, 0.000456 LR, 4.54 GB used,   9773.31 GFLOPS,    675.05 GOPS
915   69.08 ms run,    2.66 ms python,   66.42 ms HIP,  515.66 loss, 0.000452 LR, 4.54 GB used,   9771.55 GFLOPS,    675.05 GOPS
916   69.16 ms run,    2.69 ms python,   66.47 ms HIP,  522.36 loss, 0.000447 LR, 4.54 GB used,   9760.84 GFLOPS,    675.05 GOPS
917   69.71 ms run,    2.60 ms python,   67.11 ms HIP,  514.66 loss, 0.000443 LR, 4.54 GB used,   9683.72 GFLOPS,    675.05 GOPS
918   69.54 ms run,    2.64 ms python,   66.90 ms HIP,  515.18 loss, 0.000439 LR, 4.54 GB used,   9707.37 GFLOPS,    675.05 GOPS
919   69.04 ms run,    2.63 ms python,   66.41 ms HIP,  515.85 loss, 0.000434 LR, 4.54 GB used,   9777.36 GFLOPS,    675.05 GOPS
920   69.66 ms run,    2.64 ms python,   67.02 ms HIP,  509.09 loss, 0.000430 LR, 4.54 GB used,   9689.92 GFLOPS,    675.05 GOPS
921   69.06 ms run,    2.60 ms python,   66.46 ms HIP,  507.59 loss, 0.000425 LR, 4.54 GB used,   9775.43 GFLOPS,    675.05 GOPS
922   69.37 ms run,    2.62 ms python,   66.75 ms HIP,  513.96 loss, 0.000421 LR, 4.54 GB used,   9730.39 GFLOPS,    675.05 GOPS
923   69.24 ms run,    2.59 ms python,   66.65 ms HIP,  509.86 loss, 0.000417 LR, 4.54 GB used,   9749.35 GFLOPS,    675.05 GOPS
924   69.20 ms run,    2.63 ms python,   66.57 ms HIP,  516.51 loss, 0.000412 LR, 4.54 GB used,   9755.15 GFLOPS,    675.05 GOPS
925   69.61 ms run,    2.61 ms python,   67.00 ms HIP,  504.50 loss, 0.000408 LR, 4.54 GB used,   9698.17 GFLOPS,    675.05 GOPS
926   68.96 ms run,    2.66 ms python,   66.30 ms HIP,  512.93 loss, 0.000404 LR, 4.54 GB used,   9789.02 GFLOPS,    675.05 GOPS
927   69.25 ms run,    2.63 ms python,   66.61 ms HIP,  510.84 loss, 0.000399 LR, 4.54 GB used,   9748.62 GFLOPS,    675.05 GOPS
928   69.52 ms run,    2.63 ms python,   66.90 ms HIP,  522.02 loss, 0.000395 LR, 4.54 GB used,   9709.53 GFLOPS,    675.05 GOPS
929   69.28 ms run,    2.62 ms python,   66.66 ms HIP,  512.55 loss, 0.000391 LR, 4.54 GB used,   9743.84 GFLOPS,    675.05 GOPS
930   69.60 ms run,    2.64 ms python,   66.96 ms HIP,  503.71 loss, 0.000386 LR, 4.54 GB used,   9699.44 GFLOPS,    675.05 GOPS
931   69.42 ms run,    2.67 ms python,   66.75 ms HIP,  509.71 loss, 0.000382 LR, 4.54 GB used,   9723.59 GFLOPS,    675.05 GOPS
932   69.20 ms run,    2.63 ms python,   66.57 ms HIP,  506.57 loss, 0.000378 LR, 4.54 GB used,   9754.58 GFLOPS,    675.05 GOPS
933   69.03 ms run,    2.68 ms python,   66.35 ms HIP,  509.72 loss, 0.000373 LR, 4.54 GB used,   9779.67 GFLOPS,    675.05 GOPS
934   68.65 ms run,    2.66 ms python,   65.99 ms HIP,  519.52 loss, 0.000369 LR, 4.54 GB used,   9832.97 GFLOPS,    675.05 GOPS
935   69.10 ms run,    2.63 ms python,   66.47 ms HIP,  523.69 loss, 0.000365 LR, 4.54 GB used,   9769.73 GFLOPS,    675.05 GOPS
936   68.79 ms run,    2.62 ms python,   66.16 ms HIP,  515.27 loss, 0.000360 LR, 4.54 GB used,   9813.41 GFLOPS,    675.05 GOPS
937   70.71 ms run,    2.65 ms python,   68.06 ms HIP,  518.81 loss, 0.000356 LR, 4.54 GB used,   9546.80 GFLOPS,    675.05 GOPS
938   69.36 ms run,    2.64 ms python,   66.71 ms HIP,  504.38 loss, 0.000351 LR, 4.54 GB used,   9733.15 GFLOPS,    675.05 GOPS
939   69.48 ms run,    2.64 ms python,   66.84 ms HIP,  507.79 loss, 0.000347 LR, 4.54 GB used,   9715.61 GFLOPS,    675.05 GOPS
940   68.71 ms run,    2.69 ms python,   66.02 ms HIP,  514.01 loss, 0.000343 LR, 4.54 GB used,   9824.71 GFLOPS,    675.05 GOPS
941   69.19 ms run,    2.64 ms python,   66.55 ms HIP,  507.03 loss, 0.000338 LR, 4.54 GB used,   9756.46 GFLOPS,    675.05 GOPS
942   69.27 ms run,    2.67 ms python,   66.60 ms HIP,  506.73 loss, 0.000334 LR, 4.54 GB used,   9744.75 GFLOPS,    675.05 GOPS
943   69.16 ms run,    2.65 ms python,   66.52 ms HIP,  524.35 loss, 0.000330 LR, 4.54 GB used,   9760.49 GFLOPS,    675.05 GOPS
944   68.93 ms run,    2.62 ms python,   66.31 ms HIP,  501.72 loss, 0.000325 LR, 4.54 GB used,   9793.02 GFLOPS,    675.05 GOPS
945   69.79 ms run,    2.62 ms python,   67.17 ms HIP,  522.89 loss, 0.000321 LR, 4.54 GB used,   9672.81 GFLOPS,    675.05 GOPS
946   69.44 ms run,    2.69 ms python,   66.75 ms HIP,  500.60 loss, 0.000317 LR, 4.54 GB used,   9720.65 GFLOPS,    675.05 GOPS
947   70.38 ms run,    2.59 ms python,   67.79 ms HIP,  526.98 loss, 0.000312 LR, 4.54 GB used,   9591.57 GFLOPS,    675.05 GOPS
948   69.53 ms run,    2.66 ms python,   66.87 ms HIP,  509.99 loss, 0.000308 LR, 4.54 GB used,   9708.92 GFLOPS,    675.05 GOPS
949   69.39 ms run,    2.60 ms python,   66.78 ms HIP,  507.01 loss, 0.000304 LR, 4.54 GB used,   9728.96 GFLOPS,    675.05 GOPS
950   69.55 ms run,    2.61 ms python,   66.94 ms HIP,  510.29 loss, 0.000299 LR, 4.54 GB used,   9705.98 GFLOPS,    675.05 GOPS
951   69.38 ms run,    2.69 ms python,   66.69 ms HIP,  508.08 loss, 0.000295 LR, 4.54 GB used,   9729.52 GFLOPS,    675.05 GOPS
952   69.67 ms run,    2.60 ms python,   67.07 ms HIP,  511.04 loss, 0.000291 LR, 4.54 GB used,   9688.63 GFLOPS,    675.05 GOPS
953   68.88 ms run,    2.66 ms python,   66.22 ms HIP,  506.13 loss, 0.000286 LR, 4.54 GB used,   9800.60 GFLOPS,    675.05 GOPS
954   69.86 ms run,    2.64 ms python,   67.22 ms HIP,  503.12 loss, 0.000282 LR, 4.54 GB used,   9662.69 GFLOPS,    675.05 GOPS
955   70.15 ms run,    2.67 ms python,   67.48 ms HIP,  509.37 loss, 0.000277 LR, 4.54 GB used,   9622.85 GFLOPS,    675.05 GOPS
956   68.98 ms run,    2.67 ms python,   66.31 ms HIP,  514.90 loss, 0.000273 LR, 4.54 GB used,   9786.24 GFLOPS,    675.05 GOPS
957   69.87 ms run,    2.61 ms python,   67.26 ms HIP,  508.61 loss, 0.000269 LR, 4.54 GB used,   9661.85 GFLOPS,    675.05 GOPS
958   69.52 ms run,    2.64 ms python,   66.88 ms HIP,  506.84 loss, 0.000264 LR, 4.54 GB used,   9710.24 GFLOPS,    675.05 GOPS
959   69.13 ms run,    2.62 ms python,   66.50 ms HIP,  513.77 loss, 0.000260 LR, 4.54 GB used,   9765.39 GFLOPS,    675.05 GOPS
960   69.53 ms run,    2.62 ms python,   66.91 ms HIP,  512.25 loss, 0.000256 LR, 4.54 GB used,   9709.08 GFLOPS,    675.05 GOPS
961   69.65 ms run,    2.64 ms python,   67.01 ms HIP,  516.46 loss, 0.000251 LR, 4.54 GB used,   9692.63 GFLOPS,    675.05 GOPS
962   69.41 ms run,    2.64 ms python,   66.77 ms HIP,  513.07 loss, 0.000247 LR, 4.54 GB used,   9724.91 GFLOPS,    675.05 GOPS
963   69.36 ms run,    2.62 ms python,   66.74 ms HIP,  499.24 loss, 0.000243 LR, 4.54 GB used,   9732.38 GFLOPS,    675.05 GOPS
964   69.86 ms run,    2.68 ms python,   67.18 ms HIP,  508.07 loss, 0.000238 LR, 4.54 GB used,   9663.18 GFLOPS,    675.05 GOPS
965   69.71 ms run,    2.64 ms python,   67.07 ms HIP,  503.10 loss, 0.000234 LR, 4.54 GB used,   9683.32 GFLOPS,    675.05 GOPS
966   69.43 ms run,    2.64 ms python,   66.79 ms HIP,  503.75 loss, 0.000230 LR, 4.54 GB used,   9722.44 GFLOPS,    675.05 GOPS
967   69.82 ms run,    2.65 ms python,   67.17 ms HIP,  507.46 loss, 0.000225 LR, 4.54 GB used,   9668.10 GFLOPS,    675.05 GOPS
968   69.49 ms run,    2.62 ms python,   66.87 ms HIP,  501.20 loss, 0.000221 LR, 4.54 GB used,   9713.78 GFLOPS,    675.05 GOPS
969   69.42 ms run,    2.63 ms python,   66.79 ms HIP,  518.10 loss, 0.000217 LR, 4.54 GB used,   9723.82 GFLOPS,    675.05 GOPS
970   69.72 ms run,    2.63 ms python,   67.08 ms HIP,  504.68 loss, 0.000212 LR, 4.54 GB used,   9682.81 GFLOPS,    675.05 GOPS
971   69.56 ms run,    2.69 ms python,   66.87 ms HIP,  518.01 loss, 0.000208 LR, 4.54 GB used,   9704.35 GFLOPS,    675.05 GOPS
972   69.55 ms run,    2.61 ms python,   66.94 ms HIP,  508.69 loss, 0.000203 LR, 4.54 GB used,   9706.43 GFLOPS,    675.05 GOPS
973   69.74 ms run,    2.67 ms python,   67.07 ms HIP,  503.42 loss, 0.000199 LR, 4.54 GB used,   9680.06 GFLOPS,    675.05 GOPS
974   69.33 ms run,    2.68 ms python,   66.65 ms HIP,  507.29 loss, 0.000195 LR, 4.54 GB used,   9736.17 GFLOPS,    675.05 GOPS
975   69.80 ms run,    2.61 ms python,   67.18 ms HIP,  498.26 loss, 0.000190 LR, 4.54 GB used,   9671.72 GFLOPS,    675.05 GOPS
976   69.22 ms run,    2.63 ms python,   66.59 ms HIP,  512.58 loss, 0.000186 LR, 4.54 GB used,   9752.85 GFLOPS,    675.05 GOPS
977   69.06 ms run,    2.63 ms python,   66.44 ms HIP,  513.81 loss, 0.000182 LR, 4.54 GB used,   9774.40 GFLOPS,    675.05 GOPS
978   68.90 ms run,    2.62 ms python,   66.28 ms HIP,  501.11 loss, 0.000177 LR, 4.54 GB used,   9797.71 GFLOPS,    675.05 GOPS
979   69.44 ms run,    2.63 ms python,   66.81 ms HIP,  508.49 loss, 0.000173 LR, 4.54 GB used,   9720.73 GFLOPS,    675.05 GOPS
shuffling training dataset in 1154.76 ms (epoch=10)
980 1230.07 ms run, 1157.80 ms python,   72.27 ms HIP,  496.13 loss, 0.000169 LR, 4.54 GB used,    549.25 GFLOPS,    675.61 GOPS
981   73.90 ms run,    2.78 ms python,   71.13 ms HIP,  495.23 loss, 0.000164 LR, 4.54 GB used,   9134.13 GFLOPS,    675.05 GOPS
982   72.39 ms run,    2.67 ms python,   69.72 ms HIP,  496.88 loss, 0.000160 LR, 4.54 GB used,   9325.09 GFLOPS,    675.05 GOPS
983   71.53 ms run,    2.73 ms python,   68.80 ms HIP,  490.28 loss, 0.000156 LR, 4.54 GB used,   9436.97 GFLOPS,    675.05 GOPS
984   70.56 ms run,    2.68 ms python,   67.88 ms HIP,  489.54 loss, 0.000151 LR, 4.54 GB used,   9567.28 GFLOPS,    675.05 GOPS
985   69.59 ms run,    2.66 ms python,   66.92 ms HIP,  495.99 loss, 0.000147 LR, 4.54 GB used,   9700.61 GFLOPS,    675.05 GOPS
986   70.18 ms run,    2.67 ms python,   67.51 ms HIP,  503.37 loss, 0.000143 LR, 4.54 GB used,   9619.28 GFLOPS,    675.05 GOPS
987   69.76 ms run,    2.73 ms python,   67.02 ms HIP,  495.15 loss, 0.000138 LR, 4.54 GB used,   9677.25 GFLOPS,    675.05 GOPS
988   69.16 ms run,    2.78 ms python,   66.38 ms HIP,  501.15 loss, 0.000134 LR, 4.54 GB used,   9760.82 GFLOPS,    675.05 GOPS
989   70.33 ms run,    2.66 ms python,   67.67 ms HIP,  500.14 loss, 0.000129 LR, 4.54 GB used,   9598.59 GFLOPS,    675.05 GOPS
990   69.20 ms run,    2.63 ms python,   66.56 ms HIP,  496.23 loss, 0.000125 LR, 4.54 GB used,   9755.47 GFLOPS,    675.05 GOPS
991   68.96 ms run,    2.66 ms python,   66.29 ms HIP,  500.80 loss, 0.000121 LR, 4.54 GB used,   9789.60 GFLOPS,    675.05 GOPS
992   69.12 ms run,    2.67 ms python,   66.45 ms HIP,  497.86 loss, 0.000116 LR, 4.54 GB used,   9766.96 GFLOPS,    675.05 GOPS
993   69.31 ms run,    2.69 ms python,   66.62 ms HIP,  498.34 loss, 0.000112 LR, 4.54 GB used,   9739.42 GFLOPS,    675.05 GOPS
994   70.40 ms run,    2.62 ms python,   67.78 ms HIP,  495.24 loss, 0.000108 LR, 4.54 GB used,   9588.55 GFLOPS,    675.05 GOPS
995   70.55 ms run,    2.61 ms python,   67.93 ms HIP,  494.52 loss, 0.000103 LR, 4.54 GB used,   9568.90 GFLOPS,    675.05 GOPS
996   69.80 ms run,    2.63 ms python,   67.18 ms HIP,  494.78 loss, 0.000099 LR, 4.54 GB used,   9670.67 GFLOPS,    675.05 GOPS
997   69.70 ms run,    2.69 ms python,   67.01 ms HIP,  494.93 loss, 0.000095 LR, 4.54 GB used,   9684.90 GFLOPS,    675.05 GOPS
998   69.11 ms run,    2.65 ms python,   66.46 ms HIP,  497.52 loss, 0.000090 LR, 4.54 GB used,   9768.35 GFLOPS,    675.05 GOPS
999   69.46 ms run,    2.65 ms python,   66.81 ms HIP,  491.65 loss, 0.000086 LR, 4.54 GB used,   9718.04 GFLOPS,    675.05 GOPS
shuffling test dataset in 185.55 ms (epoch=0)
eval     9616/10240 93.91%,    0.40 val_loss STEP=1000 (in 1416.17 ms)
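
The per-step lines above appear to follow two simple relationships: the "run" time is the sum of the "python" and "HIP" times, and GFLOPS is GOPS divided by the run time in seconds. The final eval percentage is likewise just correct/total. The following is a minimal sketch that parses such lines and recomputes those values. The regular expression and the helper names (`parse_step`, `derived`) are assumptions made for illustration and are not part of tinygrad.

```python
import re

# Hypothetical pattern for one per-step line of the training log above.
STEP_RE = re.compile(
    r"^\s*(?P<step>\d+)\s+(?P<run>[\d.]+) ms run,\s+(?P<python>[\d.]+) ms python,\s+"
    r"(?P<hip>[\d.]+) ms HIP,\s+(?P<loss>[\d.]+) loss,\s+(?P<lr>[\d.]+) LR,\s+"
    r"(?P<mem>[\d.]+) GB used,\s+(?P<gflops>[\d.]+) GFLOPS,\s+(?P<gops>[\d.]+) GOPS"
)

def parse_step(line):
    """Parse one step line into a dict of numbers, or return None if it does not match."""
    m = STEP_RE.match(line)
    if m is None:
        return None
    rec = {key: float(value) for key, value in m.groupdict().items()}
    rec["step"] = int(rec["step"])
    return rec

def derived(rec):
    """Recompute the values that seem to be derived from the other columns."""
    return {
        "run_ms": rec["python"] + rec["hip"],           # run = python + HIP
        "gflops": rec["gops"] / (rec["run"] / 1000.0),  # GOPS per second of run time
    }

# Two lines copied from the log above: a normal step and a step that
# includes the dataset shuffle.
SAMPLE = [
    "823   69.42 ms run,    2.62 ms python,   66.80 ms HIP,  538.39 loss, 0.000852 LR, 4.54 GB used,   9724.43 GFLOPS,    675.05 GOPS",
    "882 1235.03 ms run, 1162.57 ms python,   72.46 ms HIP,  520.87 loss, 0.000595 LR, 4.54 GB used,    547.04 GFLOPS,    675.61 GOPS",
]

for line in SAMPLE:
    rec = parse_step(line)
    print(rec["step"], rec["run"], rec["gflops"], derived(rec))

# Accuracy from the eval line: correct / total, as a percentage.
print(round(100.0 * 9616 / 10240, 2))
```

Read this way, the drop to roughly 547 GFLOPS at steps 882 and 980 reflects the time spent shuffling the dataset in Python being counted in the run time, not a slowdown on the GPU: the HIP time on those steps is only a few milliseconds higher than usual.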

llama.py

using HIP backend
using LLaMA-7B model
Traceback (most recent call last):
  File "/home/jebba/devel/tinygrad/tinygrad/examples/llama.py", line 386, in <module>
    llama = LLaMa.build(MODEL_PATH, TOKENIZER_PATH, model_gen=args.gen, model_size=args.size, quantize=args.quantize, device=device)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jebba/devel/tinygrad/tinygrad/examples/llama.py", line 155, in build
    sp_model = SentencePieceProcessor(model_file=str(tokenizer_path))
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jebba/devel/tinygrad/tinygrad/venv/lib/python3.11/site-packages/sentencepiece/__init__.py", line 447, in Init
    self.Load(model_file=model_file, model_proto=model_proto)
  File "/home/jebba/devel/tinygrad/tinygrad/venv/lib/python3.11/site-packages/sentencepiece/__init__.py", line 905, in Load
    return self.LoadFromFile(model_file)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jebba/devel/tinygrad/tinygrad/venv/lib/python3.11/site-packages/sentencepiece/__init__.py", line 310, in LoadFromFile
    return _sentencepiece.SentencePieceProcessor_LoadFromFile(self, arg)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
OSError: Not found: "/home/jebba/devel/tinygrad/tinygrad/weights/LLaMA/tokenizer.model": No such file or directory Error #2

mask_rcnn.py

Traceback (most recent call last):
  File "/home/jebba/devel/tinygrad/tinygrad/venv/lib/python3.11/site-packages/PIL/Image.py", line 3135, in open
    fp.seek(0)
    ^^^^^^^
AttributeError: 'NoneType' object has no attribute 'seek'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/jebba/devel/tinygrad/tinygrad/examples/mask_rcnn.py", line 290, in <module>
    img = Image.open(args.image)
          ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jebba/devel/tinygrad/tinygrad/venv/lib/python3.11/site-packages/PIL/Image.py", line 3137, in open
    fp = io.BytesIO(fp.read())
                    ^^^^^^^
AttributeError: 'NoneType' object has no attribute 'read'
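
The traceback above shows `Image.open(args.image)` being handed `None`, which suggests the script was run without an image path. The following is a minimal sketch of an argument check that fails early with a readable message. The `--image` flag name and the `parse_args` helper are assumptions for illustration and may not match the actual options of mask_rcnn.py.

```python
import argparse

def parse_args(argv=None):
    """Parse a hypothetical --image option and refuse to continue without it."""
    parser = argparse.ArgumentParser(description="illustrative wrapper, not the real script")
    # Assumed flag name; the real script may define its arguments differently.
    parser.add_argument("--image", type=str, default=None, help="path of the image to open")
    args = parser.parse_args(argv)
    if args.image is None:
        # parser.error prints the message to stderr and exits with status 2.
        parser.error("no image path given; pass --image PATH")
    return args

# Example call with an explicit argument list (the file name is a placeholder).
print(parse_args(["--image", "example.jpg"]).image)
```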

mixtral.py

Traceback (most recent call last):
  File "/home/jebba/devel/tinygrad/tinygrad/examples/mixtral.py", line 33, in <module>
    state = torch_load(args.weights + "/consolidated.00.pth.b")
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jebba/devel/tinygrad/tinygrad/tinygrad/nn/state.py", line 77, in torch_load
    t = Tensor.empty(os.stat(fn).st_size, dtype=dtypes.uint8, device=f"disk:{fn}")
                     ^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/home/jebba/devel/tinygrad/tinygrad/weights/mixtral-8x7b-32kseqlen/consolidated.00.pth.b'
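
Both the llama.py and mixtral.py runs above stopped because a weights or tokenizer file was not present on disk. The following is a minimal sketch of a pre-flight check that lists missing files before a script is launched. The `missing_files` helper and the `REQUIRED` mapping are hypothetical; the paths are copied from the tracebacks above.

```python
from pathlib import Path

def missing_files(paths):
    """Return the given paths (as strings) that do not exist as regular files."""
    return [str(p) for p in map(Path, paths) if not p.is_file()]

# Hypothetical mapping from example script to the file it failed to find.
REQUIRED = {
    "llama.py": [
        "/home/jebba/devel/tinygrad/tinygrad/weights/LLaMA/tokenizer.model",
    ],
    "mixtral.py": [
        "/home/jebba/devel/tinygrad/tinygrad/weights/mixtral-8x7b-32kseqlen/consolidated.00.pth.b",
    ],
}

for script, paths in REQUIRED.items():
    for path in missing_files(paths):
        print(f"{script}: missing {path}")
```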

mnist_gan.py

  0%|          | 0/300 [00:00<?, ?it/s]
Generator loss: 1.392429107471424, Discriminator loss: 1.2234591077876222:   0%|          | 0/300 [00:52<?, ?it/s]
Generator loss: 1.392429107471424, Discriminator loss: 1.2234591077876222:   0%|          | 1/300 [00:52<4:23:15, 52.83s/it]
Generator loss: 0.742053180713864, Discriminator loss: 1.3610961077844395:   0%|          | 1/300 [01:10<4:23:15, 52.83s/it]
Generator loss: 0.742053180713864, Discriminator loss: 1.3610961077844395:   1%|          | 2/300 [01:10<2:39:21, 32.09s/it]
Generator loss: 0.7467478732852375, Discriminator loss: 1.3732893361764795:   1%|          | 2/300 [01:28<2:39:21, 32.09s/it]
Generator loss: 0.7467478732852375, Discriminator loss: 1.3732893361764795:   1%|          | 3/300 [01:28<2:06:44, 25.61s/it]
Generator loss: 0.7906162344357547, Discriminator loss: 1.280748259057017:   1%|          | 3/300 [01:45<2:06:44, 25.61s/it] 
Generator loss: 0.7906162344357547, Discriminator loss: 1.280748259057017:   1%|▏         | 4/300 [01:45<1:49:47, 22.25s/it]
Generator loss: 1.3312474861741066, Discriminator loss: 1.0106008916216738:   1%|▏         | 4/300 [01:56<1:49:47, 22.25s/it]
Generator loss: 1.3312474861741066, Discriminator loss: 1.0106008916216738:   2%|▏         | 5/300 [01:56<1:28:50, 18.07s/it]
Generator loss: 1.783903177608462, Discriminator loss: 0.7816843103398295:   2%|▏         | 5/300 [02:05<1:28:50, 18.07s/it] 
Generator loss: 1.783903177608462, Discriminator loss: 0.7816843103398295:   2%|▏         | 6/300 [02:05<1:13:42, 15.04s/it]
Generator loss: 2.0209721011274002, Discriminator loss: 0.7753454947515446:   2%|▏         | 6/300 [02:21<1:13:42, 15.04s/it]
Generator loss: 2.0209721011274002, Discriminator loss: 0.7753454947515446:   2%|▏         | 7/300 [02:21<1:15:22, 15.43s/it]
Generator loss: 1.7905255674439318, Discriminator loss: 0.8429553416721961:   2%|▏         | 7/300 [02:36<1:15:22, 15.43s/it]
Generator loss: 1.7905255674439318, Discriminator loss: 0.8429553416721961:   3%|▎         | 8/300 [02:36<1:13:46, 15.16s/it]
Generator loss: 1.8539055073085953, Discriminator loss: 0.8918271034079439:   3%|▎         | 8/300 [02:50<1:13:46, 15.16s/it]
Generator loss: 1.8539055073085953, Discriminator loss: 0.8918271034079439:   3%|▎         | 9/300 [02:50<1:12:13, 14.89s/it]
Generator loss: 1.3095603074659319, Discriminator loss: 1.036813088637941:   3%|▎         | 9/300 [03:05<1:12:13, 14.89s/it] 
Generator loss: 1.3095603074659319, Discriminator loss: 1.036813088637941:   3%|▎         | 10/300 [03:05<1:13:03, 15.12s/it]
Generator loss: 1.7761961335645002, Discriminator loss: 0.8414131459944388:   3%|▎         | 10/300 [03:18<1:13:03, 15.12s/it]
Generator loss: 1.7761961335645002, Discriminator loss: 0.8414131459944388:   4%|▎         | 11/300 [03:18<1:08:35, 14.24s/it]
Generator loss: 2.118274845621165, Discriminator loss: 0.6414801717242774:   4%|▎         | 11/300 [03:27<1:08:35, 14.24s/it] 
Generator loss: 2.118274845621165, Discriminator loss: 0.6414801717242774:   4%|▍         | 12/300 [03:27<1:01:02, 12.72s/it]
Generator loss: 2.3443840575568817, Discriminator loss: 0.6470773925676065:   4%|▍         | 12/300 [03:36<1:01:02, 12.72s/it]
Generator loss: 2.3443840575568817, Discriminator loss: 0.6470773925676065:   4%|▍         | 13/300 [03:36<55:28, 11.60s/it]  
Generator loss: 2.4566313190495266, Discriminator loss: 0.5863546194399104:   4%|▍         | 13/300 [03:45<55:28, 11.60s/it]
Generator loss: 2.4566313190495266, Discriminator loss: 0.5863546194399104:   5%|▍         | 14/300 [03:45<51:37, 10.83s/it]
Generator loss: 2.6201670660692105, Discriminator loss: 0.6070238498642164:   5%|▍         | 14/300 [03:54<51:37, 10.83s/it]
Generator loss: 2.6201670660692105, Discriminator loss: 0.6070238498642164:   5%|▌         | 15/300 [03:54<48:49, 10.28s/it]
Generator loss: 2.6223588322891906, Discriminator loss: 0.5109086793792599:   5%|▌         | 15/300 [04:02<48:49, 10.28s/it]
Generator loss: 2.6223588322891906, Discriminator loss: 0.5109086793792599:   5%|▌         | 16/300 [04:02<46:03,  9.73s/it]
Generator loss: 2.5351900554755153, Discriminator loss: 0.5703090098412598:   5%|▌         | 16/300 [04:10<46:03,  9.73s/it]
Generator loss: 2.5351900554755153, Discriminator loss: 0.5703090098412598:   6%|▌         | 17/300 [04:10<42:44,  9.06s/it]
Generator loss: 2.224751223536099, Discriminator loss: 0.6933160715681665:   6%|▌         | 17/300 [04:17<42:44,  9.06s/it] 
Generator loss: 2.224751223536099, Discriminator loss: 0.6933160715681665:   6%|▌         | 18/300 [04:17<39:40,  8.44s/it]
Generator loss: 2.2140680621652042, Discriminator loss: 0.6735449100241941:   6%|▌         | 18/300 [04:24<39:40,  8.44s/it]
Generator loss: 2.2140680621652042, Discriminator loss: 0.6735449100241941:   6%|▋         | 19/300 [04:24<37:23,  7.98s/it]
Generator loss: 1.9777411288198303, Discriminator loss: 0.692805947845473:   6%|▋         | 19/300 [04:31<37:23,  7.98s/it] 
Generator loss: 1.9777411288198303, Discriminator loss: 0.692805947845473:   7%|▋         | 20/300 [04:31<35:43,  7.66s/it]
Generator loss: 1.8741068846600897, Discriminator loss: 0.7449631901348338:   7%|▋         | 20/300 [04:38<35:43,  7.66s/it]
Generator loss: 1.8741068846600897, Discriminator loss: 0.7449631901348338:   7%|▋         | 21/300 [04:38<34:32,  7.43s/it]
Generator loss: 1.9462997685460484, Discriminator loss: 0.61833321062081:   7%|▋         | 21/300 [04:45<34:32,  7.43s/it]  
Generator loss: 1.9462997685460484, Discriminator loss: 0.61833321062081:   7%|▋         | 22/300 [04:45<33:38,  7.26s/it]
Generator loss: 2.117940893944572, Discriminator loss: 0.6480395973605269:   7%|▋         | 22/300 [04:51<33:38,  7.26s/it]
Generator loss: 2.117940893944572, Discriminator loss: 0.6480395973605269:   8%|▊         | 23/300 [04:51<33:01,  7.15s/it]
Generator loss: 2.0316804112756954, Discriminator loss: 0.5923001387101763:   8%|▊         | 23/300 [04:58<33:01,  7.15s/it]
Generator loss: 2.0316804112756954, Discriminator loss: 0.5923001387101763:   8%|▊         | 24/300 [04:58<32:34,  7.08s/it]
Generator loss: 2.110353120109614, Discriminator loss: 0.5987094748107826:   8%|▊         | 24/300 [05:05<32:34,  7.08s/it] 
Generator loss: 2.110353120109614, Discriminator loss: 0.5987094748107826:   8%|▊         | 25/300 [05:05<32:13,  7.03s/it]
Generator loss: 2.12758731053156, Discriminator loss: 0.6420223644989378:   8%|▊         | 25/300 [05:12<32:13,  7.03s/it] 
Generator loss: 2.12758731053156, Discriminator loss: 0.6420223644989378:   9%|▊         | 26/300 [05:12<31:53,  6.99s/it]
Generator loss: 1.9290453379645067, Discriminator loss: 0.7173247856690603:   9%|▊         | 26/300 [05:19<31:53,  6.99s/it]
Generator loss: 1.9290453379645067, Discriminator loss: 0.7173247856690603:   9%|▉         | 27/300 [05:19<31:42,  6.97s/it]
Generator loss: 1.9236001933322233, Discriminator loss: 0.7348716399248909:   9%|▉         | 27/300 [05:26<31:42,  6.97s/it]
Generator loss: 1.9236001933322233, Discriminator loss: 0.7348716399248909:   9%|▉         | 28/300 [05:26<31:32,  6.96s/it]
Generator loss: 1.9243448284618996, Discriminator loss: 0.6838795056237894:   9%|▉         | 28/300 [05:33<31:32,  6.96s/it]
Generator loss: 1.9243448284618996, Discriminator loss: 0.6838795056237894:  10%|▉         | 29/300 [05:33<31:26,  6.96s/it]
Generator loss: 1.9530906686011482, Discriminator loss: 0.6574842614286086:  10%|▉         | 29/300 [05:40<31:26,  6.96s/it]
Generator loss: 1.9530906686011482, Discriminator loss: 0.6574842614286086:  10%|█         | 30/300 [05:40<31:42,  7.05s/it]
Generator loss: 1.987439405392198, Discriminator loss: 0.6398052040706662:  10%|█         | 30/300 [05:47<31:42,  7.05s/it] 
Generator loss: 1.987439405392198, Discriminator loss: 0.6398052040706662:  10%|█         | 31/300 [05:47<30:59,  6.91s/it]
Generator loss: 2.044340126654681, Discriminator loss: 0.6637535426108276:  10%|█         | 31/300 [05:53<30:59,  6.91s/it]
Generator loss: 2.044340126654681, Discriminator loss: 0.6637535426108276:  11%|█         | 32/300 [05:53<30:30,  6.83s/it]
Generator loss: 1.987223917070557, Discriminator loss: 0.6655241660773754:  11%|█         | 32/300 [06:00<30:30,  6.83s/it]
Generator loss: 1.987223917070557, Discriminator loss: 0.6655241660773754:  11%|█         | 33/300 [06:00<30:05,  6.76s/it]
Generator loss: 1.9940989157732796, Discriminator loss: 0.6812811452237999:  11%|█         | 33/300 [06:07<30:05,  6.76s/it]
Generator loss: 1.9940989157732796, Discriminator loss: 0.6812811452237999:  11%|█▏        | 34/300 [06:07<29:52,  6.74s/it]
Generator loss: 1.9257937928333002, Discriminator loss: 0.6992776880369467:  11%|█▏        | 34/300 [06:13<29:52,  6.74s/it]
Generator loss: 1.9257937928333002, Discriminator loss: 0.6992776880369467:  12%|█▏        | 35/300 [06:13<29:33,  6.69s/it]
Generator loss: 1.9709613064632696, Discriminator loss: 0.6311690412900027:  12%|█▏        | 35/300 [06:20<29:33,  6.69s/it]
Generator loss: 1.9709613064632696, Discriminator loss: 0.6311690412900027:  12%|█▏        | 36/300 [06:20<29:22,  6.68s/it]
Generator loss: 1.860101638471379, Discriminator loss: 0.7256270524333505:  12%|█▏        | 36/300 [06:27<29:22,  6.68s/it] 
Generator loss: 1.860101638471379, Discriminator loss: 0.7256270524333505:  12%|█▏        | 37/300 [06:27<29:12,  6.66s/it]
Generator loss: 1.7695811439086409, Discriminator loss: 0.7627294414183673:  12%|█▏        | 37/300 [06:33<29:12,  6.66s/it]
Generator loss: 1.7695811439086409, Discriminator loss: 0.7627294414183673:  13%|█▎        | 38/300 [06:33<29:06,  6.67s/it]
Generator loss: 1.710711133392418, Discriminator loss: 0.7783913958598586:  13%|█▎        | 38/300 [06:40<29:06,  6.67s/it] 
Generator loss: 1.710711133392418, Discriminator loss: 0.7783913958598586:  13%|█▎        | 39/300 [06:40<28:51,  6.63s/it]
Generator loss: 1.6298308666138088, Discriminator loss: 0.8163589966647765:  13%|█▎        | 39/300 [06:46<28:51,  6.63s/it]
Generator loss: 1.6298308666138088, Discriminator loss: 0.8163589966647765:  13%|█▎        | 40/300 [06:46<28:42,  6.62s/it]
Generator loss: 1.6522921586737913, Discriminator loss: 0.7979075299466357:  13%|█▎        | 40/300 [06:53<28:42,  6.62s/it]
Generator loss: 1.6522921586737913, Discriminator loss: 0.7979075299466357:  14%|█▎        | 41/300 [06:53<28:35,  6.62s/it]
Generator loss: 1.672510720351163, Discriminator loss: 0.7909478799385183:  14%|█▎        | 41/300 [07:00<28:35,  6.62s/it] 
Generator loss: 1.672510720351163, Discriminator loss: 0.7909478799385183:  14%|█▍        | 42/300 [07:00<28:29,  6.63s/it]
Generator loss: 1.6667017340660095, Discriminator loss: 0.7982207159785664:  14%|█▍        | 42/300 [07:06<28:29,  6.63s/it]
Generator loss: 1.6667017340660095, Discriminator loss: 0.7982207159785664:  14%|█▍        | 43/300 [07:06<28:17,  6.61s/it]
Generator loss: 1.65825504020733, Discriminator loss: 0.7997007856474203:  14%|█▍        | 43/300 [07:13<28:17,  6.61s/it]  
Generator loss: 1.65825504020733, Discriminator loss: 0.7997007856474203:  15%|█▍        | 44/300 [07:13<28:10,  6.61s/it]
Generator loss: 1.6209210477331106, Discriminator loss: 0.8447715745252722:  15%|█▍        | 44/300 [07:20<28:10,  6.61s/it]
Generator loss: 1.6209210477331106, Discriminator loss: 0.8447715745252722:  15%|█▌        | 45/300 [07:20<28:08,  6.62s/it]
Generator loss: 1.5814028741682278, Discriminator loss: 0.8335039265015546:  15%|█▌        | 45/300 [07:26<28:08,  6.62s/it]
Generator loss: 1.5814028741682278, Discriminator loss: 0.8335039265015546:  15%|█▌        | 46/300 [07:26<28:02,  6.63s/it]
Generator loss: 1.584620900452137, Discriminator loss: 0.8714165779597619:  15%|█▌        | 46/300 [07:33<28:02,  6.63s/it] 
Generator loss: 1.584620900452137, Discriminator loss: 0.8714165779597619:  16%|█▌        | 47/300 [07:33<28:01,  6.65s/it]
Generator loss: 1.5452900244032635, Discriminator loss: 0.841843509060495:  16%|█▌        | 47/300 [07:39<28:01,  6.65s/it]
Generator loss: 1.5452900244032635, Discriminator loss: 0.841843509060495:  16%|█▌        | 48/300 [07:39<27:47,  6.62s/it]
Generator loss: 1.542084024671246, Discriminator loss: 0.8812410480835858:  16%|█▌        | 48/300 [07:46<27:47,  6.62s/it]
Generator loss: 1.542084024671246, Discriminator loss: 0.8812410480835858:  16%|█▋        | 49/300 [07:46<27:36,  6.60s/it]
Generator loss: 1.486083539969781, Discriminator loss: 0.8791569059385973:  16%|█▋        | 49/300 [07:53<27:36,  6.60s/it]
Generator loss: 1.486083539969781, Discriminator loss: 0.8791569059385973:  17%|█▋        | 50/300 [07:53<27:27,  6.59s/it]
Generator loss: 1.5112405147622614, Discriminator loss: 0.8912068890298114:  17%|█▋        | 50/300 [07:59<27:27,  6.59s/it]
Generator loss: 1.5112405147622614, Discriminator loss: 0.8912068890298114:  17%|█▋        | 51/300 [07:59<27:29,  6.62s/it]
Generator loss: 1.4893103610066807, Discriminator loss: 0.8984880587633919:  17%|█▋        | 51/300 [08:06<27:29,  6.62s/it]
Generator loss: 1.4893103610066807, Discriminator loss: 0.8984880587633919:  17%|█▋        | 52/300 [08:06<27:22,  6.62s/it]
Generator loss: 1.5350522933637394, Discriminator loss: 0.8891902101390502:  17%|█▋        | 52/300 [08:13<27:22,  6.62s/it]
Generator loss: 1.5350522933637394, Discriminator loss: 0.8891902101390502:  18%|█▊        | 53/300 [08:13<27:16,  6.63s/it]
Generator loss: 1.5181330429280506, Discriminator loss: 0.899337364031988:  18%|█▊        | 53/300 [08:19<27:16,  6.63s/it] 
Generator loss: 1.5181330429280506, Discriminator loss: 0.899337364031988:  18%|█▊        | 54/300 [08:19<27:10,  6.63s/it]
Generator loss: 1.5080074895830715, Discriminator loss: 0.8935804108486456:  18%|█▊        | 54/300 [08:26<27:10,  6.63s/it]
Generator loss: 1.5080074895830715, Discriminator loss: 0.8935804108486456:  18%|█▊        | 55/300 [08:26<27:11,  6.66s/it]
Generator loss: 1.4986476161900688, Discriminator loss: 0.9006087337346638:  18%|█▊        | 55/300 [08:33<27:11,  6.66s/it]
Generator loss: 1.4986476161900688, Discriminator loss: 0.9006087337346638:  19%|█▊        | 56/300 [08:33<27:02,  6.65s/it]
Generator loss: 1.478811984114787, Discriminator loss: 0.9054143661085297:  19%|█▊        | 56/300 [08:39<27:02,  6.65s/it] 
Generator loss: 1.478811984114787, Discriminator loss: 0.9054143661085297:  19%|█▉        | 57/300 [08:39<26:52,  6.63s/it]
Generator loss: 1.5209060469094444, Discriminator loss: 0.9018364202450303:  19%|█▉        | 57/300 [08:46<26:52,  6.63s/it]
Generator loss: 1.5209060469094444, Discriminator loss: 0.9018364202450303:  19%|█▉        | 58/300 [08:46<26:39,  6.61s/it]
Generator loss: 1.4833326471202515, Discriminator loss: 0.8973440760198761:  19%|█▉        | 58/300 [08:52<26:39,  6.61s/it]
Generator loss: 1.4833326471202515, Discriminator loss: 0.8973440760198761:  20%|█▉        | 59/300 [08:52<26:29,  6.59s/it]
Generator loss: 1.5183102063396399, Discriminator loss: 0.895128549898372:  20%|█▉        | 59/300 [08:59<26:29,  6.59s/it] 
Generator loss: 1.5183102063396399, Discriminator loss: 0.895128549898372:  20%|██        | 60/300 [08:59<26:25,  6.61s/it]
Generator loss: 1.5287320368430193, Discriminator loss: 0.8949343741816633:  20%|██        | 60/300 [09:05<26:25,  6.61s/it]
Generator loss: 1.5287320368430193, Discriminator loss: 0.8949343741816633:  20%|██        | 61/300 [09:05<26:15,  6.59s/it]
Generator loss: 1.5220893643358175, Discriminator loss: 0.9018049928195336:  20%|██        | 61/300 [09:12<26:15,  6.59s/it]
Generator loss: 1.5220893643358175, Discriminator loss: 0.9018049928195336:  21%|██        | 62/300 [09:12<26:06,  6.58s/it]
Generator loss: 1.5172377960646855, Discriminator loss: 0.8894347741323358:  21%|██        | 62/300 [09:19<26:06,  6.58s/it]
Generator loss: 1.5172377960646855, Discriminator loss: 0.8894347741323358:  21%|██        | 63/300 [09:19<26:04,  6.60s/it]
Generator loss: 1.5071446154924, Discriminator loss: 0.896211767459617:  21%|██        | 63/300 [09:25<26:04,  6.60s/it]    
Generator loss: 1.5071446154924, Discriminator loss: 0.896211767459617:  21%|██▏       | 64/300 [09:25<26:07,  6.64s/it]
Generator loss: 1.521011765827151, Discriminator loss: 0.9010778538444463:  21%|██▏       | 64/300 [09:32<26:07,  6.64s/it]
Generator loss: 1.521011765827151, Discriminator loss: 0.9010778538444463:  22%|██▏       | 65/300 [09:32<26:00,  6.64s/it]
Generator loss: 1.5088214804144466, Discriminator loss: 0.8929120166336789:  22%|██▏       | 65/300 [09:39<26:00,  6.64s/it]
Generator loss: 1.5088214804144466, Discriminator loss: 0.8929120166336789:  22%|██▏       | 66/300 [09:39<25:53,  6.64s/it]
Generator loss: 1.5387992946540607, Discriminator loss: 0.8957542154718848:  22%|██▏       | 66/300 [09:45<25:53,  6.64s/it]
Generator loss: 1.5387992946540607, Discriminator loss: 0.8957542154718848:  22%|██▏       | 67/300 [09:45<25:45,  6.63s/it]
Generator loss: 1.5478724832920467, Discriminator loss: 0.8870287239551544:  22%|██▏       | 67/300 [09:52<25:45,  6.63s/it]
Generator loss: 1.5478724832920467, Discriminator loss: 0.8870287239551544:  23%|██▎       | 68/300 [09:52<25:44,  6.66s/it]
Generator loss: 1.5466448664665222, Discriminator loss: 0.8750020063975278:  23%|██▎       | 68/300 [09:59<25:44,  6.66s/it]
Generator loss: 1.5466448664665222, Discriminator loss: 0.8750020063975278:  23%|██▎       | 69/300 [09:59<25:35,  6.65s/it]
Generator loss: 1.5378692430608414, Discriminator loss: 0.8932287136421484:  23%|██▎       | 69/300 [10:05<25:35,  6.65s/it]
Generator loss: 1.5378692430608414, Discriminator loss: 0.8932287136421484:  23%|██▎       | 70/300 [10:05<25:25,  6.63s/it]
Generator loss: 1.5668704426463913, Discriminator loss: 0.866356801460771:  23%|██▎       | 70/300 [10:12<25:25,  6.63s/it] 
Generator loss: 1.5668704426463913, Discriminator loss: 0.866356801460771:  24%|██▎       | 71/300 [10:12<25:18,  6.63s/it]
Generator loss: 1.5750630353303516, Discriminator loss: 0.8744459419566042:  24%|██▎       | 71/300 [10:18<25:18,  6.63s/it]
Generator loss: 1.5750630353303516, Discriminator loss: 0.8744459419566042:  24%|██▍       | 72/300 [10:18<25:12,  6.63s/it]
Generator loss: 1.5859676148085033, Discriminator loss: 0.8690760284662247:  24%|██▍       | 72/300 [10:25<25:12,  6.63s/it]
Generator loss: 1.5859676148085033, Discriminator loss: 0.8690760284662247:  24%|██▍       | 73/300 [10:25<25:11,  6.66s/it]
Generator loss: 1.5942121735390495, Discriminator loss: 0.8656245947760695:  24%|██▍       | 73/300 [10:32<25:11,  6.66s/it]
Generator loss: 1.5942121735390495, Discriminator loss: 0.8656245947760695:  25%|██▍       | 74/300 [10:32<25:03,  6.65s/it]
Generator loss: 1.5940522244747948, Discriminator loss: 0.8590863057795692:  25%|██▍       | 74/300 [10:38<25:03,  6.65s/it]
Generator loss: 1.5940522244747948, Discriminator loss: 0.8590863057795692:  25%|██▌       | 75/300 [10:38<24:55,  6.65s/it]
Generator loss: 1.625133322442279, Discriminator loss: 0.8461951021762455:  25%|██▌       | 75/300 [10:45<24:55,  6.65s/it] 
Generator loss: 1.625133322442279, Discriminator loss: 0.8461951021762455:  25%|██▌       | 76/300 [10:45<24:47,  6.64s/it]
Generator loss: 1.6150044386877733, Discriminator loss: 0.8720687991556:  25%|██▌       | 76/300 [10:52<24:47,  6.64s/it]  
Generator loss: 1.6150044386877733, Discriminator loss: 0.8720687991556:  26%|██▌       | 77/300 [10:52<24:45,  6.66s/it]
Generator loss: 1.5708426285315962, Discriminator loss: 0.8573583491584834:  26%|██▌       | 77/300 [10:58<24:45,  6.66s/it]
Generator loss: 1.5708426285315962, Discriminator loss: 0.8573583491584834:  26%|██▌       | 78/300 [10:58<24:34,  6.64s/it]
Generator loss: 1.5813802248414826, Discriminator loss: 0.8782505826915011:  26%|██▌       | 78/300 [11:05<24:34,  6.64s/it]
Generator loss: 1.5813802248414826, Discriminator loss: 0.8782505826915011:  26%|██▋       | 79/300 [11:05<24:24,  6.63s/it]
Generator loss: 1.567047947908149, Discriminator loss: 0.8690432971891235:  26%|██▋       | 79/300 [11:12<24:24,  6.63s/it] 
Generator loss: 1.567047947908149, Discriminator loss: 0.8690432971891235:  27%|██▋       | 80/300 [11:12<24:13,  6.61s/it]
Generator loss: 1.583584943676696, Discriminator loss: 0.8649900998262798:  27%|██▋       | 80/300 [11:18<24:13,  6.61s/it]
Generator loss: 1.583584943676696, Discriminator loss: 0.8649900998262798:  27%|██▋       | 81/300 [11:18<24:13,  6.64s/it]
Generator loss: 1.5720649168771856, Discriminator loss: 0.8745317069046638:  27%|██▋       | 81/300 [11:25<24:13,  6.64s/it]
Generator loss: 1.5720649168771856, Discriminator loss: 0.8745317069046638:  27%|██▋       | 82/300 [11:25<24:05,  6.63s/it]
Generator loss: 1.5929746877621203, Discriminator loss: 0.884861863711301:  27%|██▋       | 82/300 [11:31<24:05,  6.63s/it] 
Generator loss: 1.5929746877621203, Discriminator loss: 0.884861863711301:  28%|██▊       | 83/300 [11:31<23:56,  6.62s/it]
Generator loss: 1.5993040927192743, Discriminator loss: 0.8676384678658318:  28%|██▊       | 83/300 [11:38<23:56,  6.62s/it]
Generator loss: 1.5993040927192743, Discriminator loss: 0.8676384678658318:  28%|██▊       | 84/300 [11:38<23:48,  6.61s/it]
Generator loss: 1.6024239904740278, Discriminator loss: 0.8599830475800178:  28%|██▊       | 84/300 [11:45<23:48,  6.61s/it]
Generator loss: 1.6024239904740278, Discriminator loss: 0.8599830475800178:  28%|██▊       | 85/300 [11:45<23:43,  6.62s/it]
Generator loss: 1.5926220316220732, Discriminator loss: 0.8531831862295375:  28%|██▊       | 85/300 [11:51<23:43,  6.62s/it]
Generator loss: 1.5926220316220732, Discriminator loss: 0.8531831862295375:  29%|██▊       | 86/300 [11:51<23:42,  6.65s/it]
Generator loss: 1.6410702589680166, Discriminator loss: 0.8368448203100878:  29%|██▊       | 86/300 [11:58<23:42,  6.65s/it]
Generator loss: 1.6410702589680166, Discriminator loss: 0.8368448203100878:  29%|██▉       | 87/300 [11:58<23:34,  6.64s/it]
Generator loss: 1.612115778028965, Discriminator loss: 0.866695671397097:  29%|██▉       | 87/300 [12:05<23:34,  6.64s/it]  
Generator loss: 1.612115778028965, Discriminator loss: 0.866695671397097:  29%|██▉       | 88/300 [12:05<23:26,  6.63s/it]
Generator loss: 1.6191413227249594, Discriminator loss: 0.8370693299700233:  29%|██▉       | 88/300 [12:11<23:26,  6.63s/it]
Generator loss: 1.6191413227249594, Discriminator loss: 0.8370693299700233:  30%|██▉       | 89/300 [12:11<23:19,  6.63s/it]
Generator loss: 1.6128742370535345, Discriminator loss: 0.8590594262761229:  30%|██▉       | 89/300 [12:18<23:19,  6.63s/it]
Generator loss: 1.6128742370535345, Discriminator loss: 0.8590594262761229:  30%|███       | 90/300 [12:18<23:18,  6.66s/it]
Generator loss: 1.6292471666546429, Discriminator loss: 0.8545262178077417:  30%|███       | 90/300 [12:25<23:18,  6.66s/it]
Generator loss: 1.6292471666546429, Discriminator loss: 0.8545262178077417:  30%|███       | 91/300 [12:25<23:07,  6.64s/it]
Generator loss: 1.645205161150764, Discriminator loss: 0.8198951066416853:  30%|███       | 91/300 [12:31<23:07,  6.64s/it] 
Generator loss: 1.645205161150764, Discriminator loss: 0.8198951066416853:  31%|███       | 92/300 [12:31<23:00,  6.64s/it]
Generator loss: 1.625604101401918, Discriminator loss: 0.8479677473797518:  31%|███       | 92/300 [12:38<23:00,  6.64s/it]
Generator loss: 1.625604101401918, Discriminator loss: 0.8479677473797518:  31%|███       | 93/300 [12:38<22:51,  6.63s/it]
Generator loss: 1.6520345750100471, Discriminator loss: 0.8335142885060871:  31%|███       | 93/300 [12:44<22:51,  6.63s/it]
Generator loss: 1.6520345750100471, Discriminator loss: 0.8335142885060871:  31%|███▏      | 94/300 [12:44<22:49,  6.65s/it]
Generator loss: 1.6731279856141876, Discriminator loss: 0.8364165591842988:  31%|███▏      | 94/300 [12:51<22:49,  6.65s/it]
Generator loss: 1.6731279856141876, Discriminator loss: 0.8364165591842988:  32%|███▏      | 95/300 [12:51<22:40,  6.64s/it]
Generator loss: 1.6628490156110596, Discriminator loss: 0.8328475991592688:  32%|███▏      | 95/300 [12:58<22:40,  6.64s/it]
Generator loss: 1.6628490156110596, Discriminator loss: 0.8328475991592688:  32%|███▏      | 96/300 [12:58<22:33,  6.63s/it]
Generator loss: 1.6615130590165363, Discriminator loss: 0.8281888374510933:  32%|███▏      | 96/300 [13:04<22:33,  6.63s/it]
Generator loss: 1.6615130590165363, Discriminator loss: 0.8281888374510933:  32%|███▏      | 97/300 [13:04<22:25,  6.63s/it]
Generator loss: 1.6148593184702538, Discriminator loss: 0.853957395781489:  32%|███▏      | 97/300 [13:11<22:25,  6.63s/it] 
Generator loss: 1.6148593184702538, Discriminator loss: 0.853957395781489:  33%|███▎      | 98/300 [13:11<22:17,  6.62s/it]
Generator loss: 1.6571329468313385, Discriminator loss: 0.8237307260141653:  33%|███▎      | 98/300 [13:18<22:17,  6.62s/it]
Generator loss: 1.6571329468313385, Discriminator loss: 0.8237307260141653:  33%|███▎      | 99/300 [13:18<22:15,  6.65s/it]
Generator loss: 1.6786996196298039, Discriminator loss: 0.8207834148231674:  33%|███▎      | 99/300 [13:24<22:15,  6.65s/it]
Generator loss: 1.6786996196298039, Discriminator loss: 0.8207834148231674:  33%|███▎      | 100/300 [13:24<22:07,  6.64s/it]
Generator loss: 1.6578268540256165, Discriminator loss: 0.8502317338305361:  33%|███▎      | 100/300 [13:31<22:07,  6.64s/it]
Generator loss: 1.6578268540256165, Discriminator loss: 0.8502317338305361:  34%|███▎      | 101/300 [13:31<22:00,  6.63s/it]
Generator loss: 1.6500777036828154, Discriminator loss: 0.8360809832811356:  34%|███▎      | 101/300 [13:37<22:00,  6.63s/it]
Generator loss: 1.6500777036828154, Discriminator loss: 0.8360809832811356:  34%|███▍      | 102/300 [13:37<21:49,  6.61s/it]
Generator loss: 1.667227153392399, Discriminator loss: 0.8129227284122916:  34%|███▍      | 102/300 [13:44<21:49,  6.61s/it] 
Generator loss: 1.667227153392399, Discriminator loss: 0.8129227284122916:  34%|███▍      | 103/300 [13:44<21:47,  6.64s/it]
Generator loss: 1.6800949520924513, Discriminator loss: 0.8065380576778861:  34%|███▍      | 103/300 [13:51<21:47,  6.64s/it]
Generator loss: 1.6800949520924513, Discriminator loss: 0.8065380576778861:  35%|███▍      | 104/300 [13:51<21:41,  6.64s/it]
Generator loss: 1.7048907402683706, Discriminator loss: 0.8073845773058779:  35%|███▍      | 104/300 [13:57<21:41,  6.64s/it]
Generator loss: 1.7048907402683706, Discriminator loss: 0.8073845773058779:  35%|███▌      | 105/300 [13:57<21:33,  6.63s/it]
Generator loss: 1.678999839898418, Discriminator loss: 0.8190580729176017:  35%|███▌      | 105/300 [14:04<21:33,  6.63s/it] 
Generator loss: 1.678999839898418, Discriminator loss: 0.8190580729176017:  35%|███▌      | 106/300 [14:04<21:22,  6.61s/it]
Generator loss: 1.6873094570987366, Discriminator loss: 0.8126953007543788:  35%|███▌      | 106/300 [14:11<21:22,  6.61s/it]
Generator loss: 1.6873094570987366, Discriminator loss: 0.8126953007543788:  36%|███▌      | 107/300 [14:11<21:22,  6.65s/it]
Generator loss: 1.678116421927424, Discriminator loss: 0.8270426212864763:  36%|███▌      | 107/300 [14:17<21:22,  6.65s/it] 
Generator loss: 1.678116421927424, Discriminator loss: 0.8270426212864763:  36%|███▌      | 108/300 [14:17<21:15,  6.64s/it]
Generator loss: 1.6950164301430477, Discriminator loss: 0.8026039232225979:  36%|███▌      | 108/300 [14:24<21:15,  6.64s/it]
Generator loss: 1.6950164301430477, Discriminator loss: 0.8026039232225979:  36%|███▋      | 109/300 [14:24<21:04,  6.62s/it]
Generator loss: 1.7014941169935114, Discriminator loss: 0.7969357147812843:  36%|███▋      | 109/300 [14:30<21:04,  6.62s/it]
Generator loss: 1.7014941169935114, Discriminator loss: 0.7969357147812843:  37%|███▋      | 110/300 [14:30<20:53,  6.60s/it]
Generator loss: 1.719758610953303, Discriminator loss: 0.7976097617955769:  37%|███▋      | 110/300 [14:37<20:53,  6.60s/it] 
Generator loss: 1.719758610953303, Discriminator loss: 0.7976097617955769:  37%|███▋      | 111/300 [14:37<20:44,  6.59s/it]
Generator loss: 1.7388707302949007, Discriminator loss: 0.7936992846867618:  37%|███▋      | 111/300 [14:44<20:44,  6.59s/it]
Generator loss: 1.7388707302949007, Discriminator loss: 0.7936992846867618:  37%|███▋      | 112/300 [14:44<20:40,  6.60s/it]
Generator loss: 1.7232475412242554, Discriminator loss: 0.7827038243412971:  37%|███▋      | 112/300 [14:50<20:40,  6.60s/it]
Generator loss: 1.7232475412242554, Discriminator loss: 0.7827038243412971:  38%|███▊      | 113/300 [14:50<20:31,  6.59s/it]
Generator loss: 1.7635971491827684, Discriminator loss: 0.7883220731335527:  38%|███▊      | 113/300 [14:57<20:31,  6.59s/it]
Generator loss: 1.7635971491827684, Discriminator loss: 0.7883220731335527:  38%|███▊      | 114/300 [14:57<20:22,  6.58s/it]
Generator loss: 1.7574936063850628, Discriminator loss: 0.7908676126424004:  38%|███▊      | 114/300 [15:03<20:22,  6.58s/it]
Generator loss: 1.7574936063850628, Discriminator loss: 0.7908676126424004:  38%|███▊      | 115/300 [15:03<20:15,  6.57s/it]
Generator loss: 1.7410221165593933, Discriminator loss: 0.796536782208611:  38%|███▊      | 115/300 [15:10<20:15,  6.57s/it] 
Generator loss: 1.7410221165593933, Discriminator loss: 0.796536782208611:  39%|███▊      | 116/300 [15:10<20:12,  6.59s/it]
Generator loss: 1.7575040148461567, Discriminator loss: 0.7798431997790056:  39%|███▊      | 116/300 [15:16<20:12,  6.59s/it]
Generator loss: 1.7575040148461567, Discriminator loss: 0.7798431997790056:  39%|███▉      | 117/300 [15:16<20:03,  6.58s/it]
Generator loss: 1.7489790890146704, Discriminator loss: 0.7853775230400702:  39%|███▉      | 117/300 [15:23<20:03,  6.58s/it]
Generator loss: 1.7489790890146704, Discriminator loss: 0.7853775230400702:  39%|███▉      | 118/300 [15:23<19:55,  6.57s/it]
Generator loss: 1.7569141054854673, Discriminator loss: 0.7929856251267826:  39%|███▉      | 118/300 [15:30<19:55,  6.57s/it]
Generator loss: 1.7569141054854673, Discriminator loss: 0.7929856251267826:  40%|███▉      | 119/300 [15:30<19:48,  6.56s/it]
Generator loss: 1.7644893553327112, Discriminator loss: 0.7732302866437856:  40%|███▉      | 119/300 [15:36<19:48,  6.56s/it]
Generator loss: 1.7644893553327112, Discriminator loss: 0.7732302866437856:  40%|████      | 120/300 [15:36<19:45,  6.59s/it]
Generator loss: 1.7821472046129845, Discriminator loss: 0.7615856252172414:  40%|████      | 120/300 [15:43<19:45,  6.59s/it]
Generator loss: 1.7821472046129845, Discriminator loss: 0.7615856252172414:  40%|████      | 121/300 [15:43<19:27,  6.52s/it]
Generator loss: 1.7836939162191223, Discriminator loss: 0.7784580820623566:  40%|████      | 121/300 [15:49<19:27,  6.52s/it]
Generator loss: 1.7836939162191223, Discriminator loss: 0.7784580820623566:  41%|████      | 122/300 [15:49<19:13,  6.48s/it]
Generator loss: 1.7743018295835047, Discriminator loss: 0.7797000899034388:  41%|████      | 122/300 [15:55<19:13,  6.48s/it]
Generator loss: 1.7743018295835047, Discriminator loss: 0.7797000899034388:  41%|████      | 123/300 [15:55<19:01,  6.45s/it]
Generator loss: 1.7860442395595943, Discriminator loss: 0.779537913553855:  41%|████      | 123/300 [16:02<19:01,  6.45s/it] 
Generator loss: 1.7860442395595943, Discriminator loss: 0.779537913553855:  41%|████▏     | 124/300 [16:02<18:52,  6.43s/it]
Generator loss: 1.7961063314886654, Discriminator loss: 0.7632119296228185:  41%|████▏     | 124/300 [16:08<18:52,  6.43s/it]
Generator loss: 1.7961063314886654, Discriminator loss: 0.7632119296228185:  42%|████▏     | 125/300 [16:08<18:46,  6.44s/it]
Generator loss: 1.7912442175781025, Discriminator loss: 0.7627700041322147:  42%|████▏     | 125/300 [16:15<18:46,  6.44s/it]
Generator loss: 1.7912442175781025, Discriminator loss: 0.7627700041322147:  42%|████▏     | 126/300 [16:15<18:37,  6.42s/it]
Generator loss: 1.8034317142823164, Discriminator loss: 0.7996259073124212:  42%|████▏     | 126/300 [16:21<18:37,  6.42s/it]
Generator loss: 1.8034317142823164, Discriminator loss: 0.7996259073124212:  42%|████▏     | 127/300 [16:21<18:31,  6.42s/it]
Generator loss: 1.7972839588628096, Discriminator loss: 0.7540580669746679:  42%|████▏     | 127/300 [16:27<18:31,  6.42s/it]
Generator loss: 1.7972839588628096, Discriminator loss: 0.7540580669746679:  43%|████▎     | 128/300 [16:27<18:22,  6.41s/it]
Generator loss: 1.819234922528267, Discriminator loss: 0.751131749328445:  43%|████▎     | 128/300 [16:34<18:22,  6.41s/it]  
Generator loss: 1.819234922528267, Discriminator loss: 0.751131749328445:  43%|████▎     | 129/300 [16:34<18:14,  6.40s/it]
Generator loss: 1.8124022290987127, Discriminator loss: 0.757165546364644:  43%|████▎     | 129/300 [16:40<18:14,  6.40s/it]
Generator loss: 1.8124022290987127, Discriminator loss: 0.757165546364644:  43%|████▎     | 130/300 [16:40<18:06,  6.39s/it]
Generator loss: 1.8343695323256886, Discriminator loss: 0.7554672592703033:  43%|████▎     | 130/300 [16:46<18:06,  6.39s/it]
Generator loss: 1.8343695323256886, Discriminator loss: 0.7554672592703033:  44%|████▎     | 131/300 [16:46<17:57,  6.38s/it]
Generator loss: 1.8135301514583475, Discriminator loss: 0.7552496725145508:  44%|████▎     | 131/300 [16:53<17:57,  6.38s/it]
Generator loss: 1.8135301514583475, Discriminator loss: 0.7552496725145508:  44%|████▍     | 132/300 [16:53<17:53,  6.39s/it]
Generator loss: 1.7663639468305252, Discriminator loss: 0.843885213136673:  44%|████▍     | 132/300 [16:59<17:53,  6.39s/it] 
Generator loss: 1.7663639468305252, Discriminator loss: 0.843885213136673:  44%|████▍     | 133/300 [16:59<17:53,  6.43s/it]
Generator loss: 1.7719000402618856, Discriminator loss: 0.7477432011681444:  44%|████▍     | 133/300 [17:06<17:53,  6.43s/it]
Generator loss: 1.7719000402618856, Discriminator loss: 0.7477432011681444:  45%|████▍     | 134/300 [17:06<17:46,  6.43s/it]
Generator loss: 1.8142524659633636, Discriminator loss: 0.7370717661345706:  45%|████▍     | 134/300 [17:12<17:46,  6.43s/it]
Generator loss: 1.8142524659633636, Discriminator loss: 0.7370717661345706:  45%|████▌     | 135/300 [17:12<17:38,  6.41s/it]
Generator loss: 1.8233654722571373, Discriminator loss: 0.7425804826266625:  45%|████▌     | 135/300 [17:19<17:38,  6.41s/it]
Generator loss: 1.8233654722571373, Discriminator loss: 0.7425804826266625:  45%|████▌     | 136/300 [17:19<17:32,  6.42s/it]
Generator loss: 1.8471009555984945, Discriminator loss: 0.7435824047116673:  45%|████▌     | 136/300 [17:25<17:32,  6.42s/it]
Generator loss: 1.8471009555984945, Discriminator loss: 0.7435824047116673:  46%|████▌     | 137/300 [17:25<17:30,  6.44s/it]
Generator loss: 1.8517840381930857, Discriminator loss: 0.7490521963028347:  46%|████▌     | 137/300 [17:32<17:30,  6.44s/it]
Generator loss: 1.8517840381930857, Discriminator loss: 0.7490521963028347:  46%|████▌     | 138/300 [17:32<17:23,  6.44s/it]
Generator loss: 1.8829423031386208, Discriminator loss: 0.7372819693649516:  46%|████▌     | 138/300 [17:38<17:23,  6.44s/it]
Generator loss: 1.8829423031386208, Discriminator loss: 0.7372819693649516:  46%|████▋     | 139/300 [17:38<17:24,  6.49s/it]
Generator loss: 1.8680496364831924, Discriminator loss: 0.7670731268384877:  46%|████▋     | 139/300 [17:45<17:24,  6.49s/it]
Generator loss: 1.8680496364831924, Discriminator loss: 0.7670731268384877:  47%|████▋     | 140/300 [17:45<17:16,  6.48s/it]
Generator loss: 1.857833515633555, Discriminator loss: 0.7407474833376267:  47%|████▋     | 140/300 [17:51<17:16,  6.48s/it] 
Generator loss: 1.857833515633555, Discriminator loss: 0.7407474833376267:  47%|████▋     | 141/300 [17:51<17:08,  6.47s/it]
Generator loss: 1.8678517113713657, Discriminator loss: 0.7261389905915541:  47%|████▋     | 141/300 [17:58<17:08,  6.47s/it]
Generator loss: 1.8678517113713657, Discriminator loss: 0.7261389905915541:  47%|████▋     | 142/300 [17:58<17:00,  6.46s/it]
Generator loss: 1.8683480313595604, Discriminator loss: 0.7343018848229858:  47%|████▋     | 142/300 [18:04<17:00,  6.46s/it]
Generator loss: 1.8683480313595604, Discriminator loss: 0.7343018848229858:  48%|████▊     | 143/300 [18:04<16:56,  6.47s/it]
Generator loss: 1.8793934167307966, Discriminator loss: 0.735251775559257:  48%|████▊     | 143/300 [18:11<16:56,  6.47s/it] 
Generator loss: 1.8793934167307966, Discriminator loss: 0.735251775559257:  48%|████▊     | 144/300 [18:11<16:49,  6.47s/it]
Generator loss: 1.8663447096067316, Discriminator loss: 0.7562430614934248:  48%|████▊     | 144/300 [18:17<16:49,  6.47s/it]
Generator loss: 1.8663447096067316, Discriminator loss: 0.7562430614934248:  48%|████▊     | 145/300 [18:17<16:46,  6.49s/it]
Generator loss: 1.853991342818036, Discriminator loss: 0.7314641164506183:  48%|████▊     | 145/300 [18:23<16:46,  6.49s/it] 
Generator loss: 1.853991342818036, Discriminator loss: 0.7314641164506183:  49%|████▊     | 146/300 [18:23<16:36,  6.47s/it]
Generator loss: 1.8588006592848723, Discriminator loss: 0.7245916391120237:  49%|████▊     | 146/300 [18:30<16:36,  6.47s/it]
Generator loss: 1.8588006592848723, Discriminator loss: 0.7245916391120237:  49%|████▉     | 147/300 [18:30<16:31,  6.48s/it]
Generator loss: 1.887835018336773, Discriminator loss: 0.7272283965173889:  49%|████▉     | 147/300 [18:36<16:31,  6.48s/it] 
Generator loss: 1.887835018336773, Discriminator loss: 0.7272283965173889:  49%|████▉     | 148/300 [18:36<16:21,  6.45s/it]
Generator loss: 1.870099083027419, Discriminator loss: 0.7440555796903723:  49%|████▉     | 148/300 [18:43<16:21,  6.45s/it]
Generator loss: 1.870099083027419, Discriminator loss: 0.7440555796903723:  50%|████▉     | 149/300 [18:43<16:14,  6.45s/it]
Generator loss: 1.8805445914759356, Discriminator loss: 0.7166378572583199:  50%|████▉     | 149/300 [18:49<16:14,  6.45s/it]
Generator loss: 1.8805445914759356, Discriminator loss: 0.7166378572583199:  50%|█████     | 150/300 [18:49<16:06,  6.44s/it]
Generator loss: 1.9056233597152374, Discriminator loss: 0.730847950805636:  50%|█████     | 150/300 [18:56<16:06,  6.44s/it] 
Generator loss: 1.9056233597152374, Discriminator loss: 0.730847950805636:  50%|█████     | 151/300 [18:56<16:03,  6.47s/it]
Generator loss: 1.9140165527077282, Discriminator loss: 0.7123787074404604:  50%|█████     | 151/300 [19:02<16:03,  6.47s/it]
Generator loss: 1.9140165527077282, Discriminator loss: 0.7123787074404604:  51%|█████     | 152/300 [19:02<15:56,  6.46s/it]
Generator loss: 1.9222180150887545, Discriminator loss: 0.7199763120973811:  51%|█████     | 152/300 [19:09<15:56,  6.46s/it]
Generator loss: 1.9222180150887545, Discriminator loss: 0.7199763120973811:  51%|█████     | 153/300 [19:09<15:50,  6.46s/it]
Generator loss: 1.8877925149658148, Discriminator loss: 0.7476223606397124:  51%|█████     | 153/300 [19:15<15:50,  6.46s/it]
Generator loss: 1.8877925149658148, Discriminator loss: 0.7476223606397124:  51%|█████▏    | 154/300 [19:15<15:40,  6.44s/it]
Generator loss: 1.883141139850897, Discriminator loss: 0.7233283677521873:  51%|█████▏    | 154/300 [19:22<15:40,  6.44s/it] 
Generator loss: 1.883141139850897, Discriminator loss: 0.7233283677521873:  52%|█████▏    | 155/300 [19:22<15:34,  6.45s/it]
Generator loss: 1.8941808974041658, Discriminator loss: 0.7128842459882007:  52%|█████▏    | 155/300 [19:28<15:34,  6.45s/it]
Generator loss: 1.8941808974041658, Discriminator loss: 0.7128842459882007:  52%|█████▏    | 156/300 [19:28<15:31,  6.47s/it]
Generator loss: 1.9113889801151611, Discriminator loss: 0.7131333828848951:  52%|█████▏    | 156/300 [19:35<15:31,  6.47s/it]
Generator loss: 1.9113889801151611, Discriminator loss: 0.7131333828848951:  52%|█████▏    | 157/300 [19:35<15:30,  6.51s/it]
Generator loss: 1.907938465476036, Discriminator loss: 0.7078887916663114:  52%|█████▏    | 157/300 [19:41<15:30,  6.51s/it] 
Generator loss: 1.907938465476036, Discriminator loss: 0.7078887916663114:  53%|█████▎    | 158/300 [19:41<15:24,  6.51s/it]
Generator loss: 1.911291787729544, Discriminator loss: 0.7252739961532986:  53%|█████▎    | 158/300 [19:48<15:24,  6.51s/it]
Generator loss: 1.911291787729544, Discriminator loss: 0.7252739961532986:  53%|█████▎    | 159/300 [19:48<15:16,  6.50s/it]
Generator loss: 1.9184895096456303, Discriminator loss: 0.7103787857820006:  53%|█████▎    | 159/300 [19:54<15:16,  6.50s/it]
Generator loss: 1.9184895096456303, Discriminator loss: 0.7103787857820006:  53%|█████▎    | 160/300 [19:54<15:07,  6.48s/it]
Generator loss: 1.6518787658076903, Discriminator loss: 1.6146739887840607:  53%|█████▎    | 160/300 [20:01<15:07,  6.48s/it]
Generator loss: 1.6518787658076903, Discriminator loss: 1.6146739887840607:  54%|█████▎    | 161/300 [20:01<14:59,  6.47s/it]
Generator loss: 1.1205224596402223, Discriminator loss: 1.0990365059936749:  54%|█████▎    | 161/300 [20:07<14:59,  6.47s/it]
Generator loss: 1.1205224596402223, Discriminator loss: 1.0990365059936749:  54%|█████▍    | 162/300 [20:07<14:54,  6.48s/it]
Generator loss: 1.3062065813471289, Discriminator loss: 0.9539765860227978:  54%|█████▍    | 162/300 [20:13<14:54,  6.48s/it]
Generator loss: 1.3062065813471289, Discriminator loss: 0.9539765860227978:  54%|█████▍    | 163/300 [20:13<14:47,  6.48s/it]
Generator loss: 1.5725498107426308, Discriminator loss: 0.8217732069246909:  54%|█████▍    | 163/300 [20:20<14:47,  6.48s/it]
Generator loss: 1.5725498107426308, Discriminator loss: 0.8217732069246909:  55%|█████▍    | 164/300 [20:20<14:43,  6.50s/it]
Generator loss: 1.6757422453340363, Discriminator loss: 0.7526570115895832:  55%|█████▍    | 164/300 [20:26<14:43,  6.50s/it]
Generator loss: 1.6757422453340363, Discriminator loss: 0.7526570115895832:  55%|█████▌    | 165/300 [20:26<14:35,  6.49s/it]
Generator loss: 1.7489459041286917, Discriminator loss: 0.7313809710390428:  55%|█████▌    | 165/300 [20:33<14:35,  6.49s/it]
Generator loss: 1.7489459041286917, Discriminator loss: 0.7313809710390428:  55%|█████▌    | 166/300 [20:33<14:26,  6.46s/it]
Generator loss: 1.793169997194234, Discriminator loss: 0.7243372022229082:  55%|█████▌    | 166/300 [20:39<14:26,  6.46s/it] 
Generator loss: 1.793169997194234, Discriminator loss: 0.7243372022229082:  56%|█████▌    | 167/300 [20:39<14:17,  6.45s/it]
Generator loss: 1.8074285098735023, Discriminator loss: 0.7162313873276991:  56%|█████▌    | 167/300 [20:46<14:17,  6.45s/it]
Generator loss: 1.8074285098735023, Discriminator loss: 0.7162313873276991:  56%|█████▌    | 168/300 [20:46<14:11,  6.45s/it]
Generator loss: 1.8447622586699093, Discriminator loss: 0.7173904425957623:  56%|█████▌    | 168/300 [20:52<14:11,  6.45s/it]
Generator loss: 1.8447622586699093, Discriminator loss: 0.7173904425957623:  56%|█████▋    | 169/300 [20:52<14:06,  6.46s/it]
Generator loss: 1.8570084054680431, Discriminator loss: 0.7179789012845825:  56%|█████▋    | 169/300 [20:59<14:06,  6.46s/it]
Generator loss: 1.8570084054680431, Discriminator loss: 0.7179789012845825:  57%|█████▋    | 170/300 [20:59<14:03,  6.49s/it]
Generator loss: 1.8499646011520834, Discriminator loss: 0.7092674168593743:  57%|█████▋    | 170/300 [21:05<14:03,  6.49s/it]
Generator loss: 1.8499646011520834, Discriminator loss: 0.7092674168593743:  57%|█████▋    | 171/300 [21:05<13:55,  6.48s/it]
Generator loss: 1.8719150879803825, Discriminator loss: 0.7033579261863933:  57%|█████▋    | 171/300 [21:12<13:55,  6.48s/it]
Generator loss: 1.8719150879803825, Discriminator loss: 0.7033579261863933:  57%|█████▋    | 172/300 [21:12<13:47,  6.47s/it]
Generator loss: 1.8675539554918514, Discriminator loss: 0.7109484506003997:  57%|█████▋    | 172/300 [21:18<13:47,  6.47s/it]
Generator loss: 1.8675539554918514, Discriminator loss: 0.7109484506003997:  58%|█████▊    | 173/300 [21:18<13:39,  6.46s/it]
Generator loss: 1.886877525378676, Discriminator loss: 0.715422235429287:  58%|█████▊    | 173/300 [21:25<13:39,  6.46s/it]  
Generator loss: 1.886877525378676, Discriminator loss: 0.715422235429287:  58%|█████▊    | 174/300 [21:25<13:30,  6.43s/it]
Generator loss: 1.8639318084015566, Discriminator loss: 0.7092845672193695:  58%|█████▊    | 174/300 [21:31<13:30,  6.43s/it]
Generator loss: 1.8639318084015566, Discriminator loss: 0.7092845672193695:  58%|█████▊    | 175/300 [21:31<13:23,  6.43s/it]
Generator loss: 1.8762365180779905, Discriminator loss: 0.7277653979904511:  58%|█████▊    | 175/300 [21:37<13:23,  6.43s/it]
Generator loss: 1.8762365180779905, Discriminator loss: 0.7277653979904511:  59%|█████▊    | 176/300 [21:37<13:20,  6.46s/it]
Generator loss: 1.8840555084102295, Discriminator loss: 0.7117495133596308:  59%|█████▊    | 176/300 [21:44<13:20,  6.46s/it]
Generator loss: 1.8840555084102295, Discriminator loss: 0.7117495133596308:  59%|█████▉    | 177/300 [21:44<13:09,  6.42s/it]
Generator loss: 1.8722961474867428, Discriminator loss: 0.7381857567850281:  59%|█████▉    | 177/300 [21:50<13:09,  6.42s/it]
Generator loss: 1.8722961474867428, Discriminator loss: 0.7381857567850281:  59%|█████▉    | 178/300 [21:50<12:58,  6.38s/it]
Generator loss: 1.879811488530215, Discriminator loss: 0.7050962079973782:  59%|█████▉    | 178/300 [21:57<12:58,  6.38s/it] 
Generator loss: 1.879811488530215, Discriminator loss: 0.7050962079973782:  60%|█████▉    | 179/300 [21:57<12:54,  6.40s/it]
Generator loss: 1.8758899578276802, Discriminator loss: 0.7048510142108974:  60%|█████▉    | 179/300 [22:03<12:54,  6.40s/it]
Generator loss: 1.8758899578276802, Discriminator loss: 0.7048510142108974:  60%|██████    | 180/300 [22:03<12:50,  6.42s/it]
Generator loss: 1.8873987224172144, Discriminator loss: 0.7038660505238701:  60%|██████    | 180/300 [22:09<12:50,  6.42s/it]
Generator loss: 1.8873987224172144, Discriminator loss: 0.7038660505238701:  60%|██████    | 181/300 [22:09<12:44,  6.42s/it]
Generator loss: 1.8888916346956701, Discriminator loss: 0.7137206582462087:  60%|██████    | 181/300 [22:16<12:44,  6.42s/it]
Generator loss: 1.8888916346956701, Discriminator loss: 0.7137206582462087:  61%|██████    | 182/300 [22:16<12:37,  6.42s/it]
Generator loss: 1.9036011213765425, Discriminator loss: 0.7122918327941614:  61%|██████    | 182/300 [22:22<12:37,  6.42s/it]
Generator loss: 1.9036011213765425, Discriminator loss: 0.7122918327941614:  61%|██████    | 183/300 [22:22<12:27,  6.39s/it]
Generator loss: 1.8918571752660416, Discriminator loss: 0.7075192012331065:  61%|██████    | 183/300 [22:28<12:27,  6.39s/it]
Generator loss: 1.8918571752660416, Discriminator loss: 0.7075192012331065:  61%|██████▏   | 184/300 [22:28<12:18,  6.37s/it]
Generator loss: 1.9028138803208576, Discriminator loss: 0.7169827043133623:  61%|██████▏   | 184/300 [22:35<12:18,  6.37s/it]
Generator loss: 1.9028138803208576, Discriminator loss: 0.7169827043133623:  62%|██████▏   | 185/300 [22:35<12:13,  6.38s/it]
Generator loss: 1.8884707106386913, Discriminator loss: 0.7273798655061161:  62%|██████▏   | 185/300 [22:41<12:13,  6.38s/it]
Generator loss: 1.8884707106386913, Discriminator loss: 0.7273798655061161:  62%|██████▏   | 186/300 [22:41<12:10,  6.41s/it]
Generator loss: 1.89948268848307, Discriminator loss: 0.6963861961575115:  62%|██████▏   | 186/300 [22:48<12:10,  6.41s/it]  
Generator loss: 1.89948268848307, Discriminator loss: 0.6963861961575115:  62%|██████▏   | 187/300 [22:48<12:03,  6.41s/it]
Generator loss: 1.8734046777381617, Discriminator loss: 0.71642957058023:  62%|██████▏   | 187/300 [22:54<12:03,  6.41s/it]
Generator loss: 1.8734046777381617, Discriminator loss: 0.71642957058023:  63%|██████▎   | 188/300 [22:54<11:57,  6.41s/it]
Generator loss: 1.897829573820619, Discriminator loss: 0.7021876989918596:  63%|██████▎   | 188/300 [23:00<11:57,  6.41s/it]
Generator loss: 1.897829573820619, Discriminator loss: 0.7021876989918596:  63%|██████▎   | 189/300 [23:00<11:47,  6.38s/it]
Generator loss: 1.9079211196478676, Discriminator loss: 0.7051354537115377:  63%|██████▎   | 189/300 [23:07<11:47,  6.38s/it]
Generator loss: 1.9079211196478676, Discriminator loss: 0.7051354537115377:  63%|██████▎   | 190/300 [23:07<11:38,  6.35s/it]
Generator loss: 1.8897704271709217, Discriminator loss: 0.7202173291760332:  63%|██████▎   | 190/300 [23:13<11:38,  6.35s/it]
Generator loss: 1.8897704271709217, Discriminator loss: 0.7202173291760332:  64%|██████▎   | 191/300 [23:13<11:29,  6.32s/it]
Generator loss: 1.896269340725506, Discriminator loss: 0.6983693526948199:  64%|██████▎   | 191/300 [23:19<11:29,  6.32s/it] 
Generator loss: 1.896269340725506, Discriminator loss: 0.6983693526948199:  64%|██████▍   | 192/300 [23:19<11:22,  6.32s/it]
Generator loss: 1.9090866350075777, Discriminator loss: 0.7042679291437653:  64%|██████▍   | 192/300 [23:26<11:22,  6.32s/it]
Generator loss: 1.9090866350075777, Discriminator loss: 0.7042679291437653:  64%|██████▍   | 193/300 [23:26<11:15,  6.32s/it]
Generator loss: 1.9113440697684008, Discriminator loss: 0.6876289436922354:  64%|██████▍   | 193/300 [23:32<11:15,  6.32s/it]
Generator loss: 1.9113440697684008, Discriminator loss: 0.6876289436922354:  65%|██████▍   | 194/300 [23:32<11:13,  6.35s/it]
Generator loss: 1.903952716904528, Discriminator loss: 0.7130992004976553:  65%|██████▍   | 194/300 [23:38<11:13,  6.35s/it] 
Generator loss: 1.903952716904528, Discriminator loss: 0.7130992004976553:  65%|██████▌   | 195/300 [23:38<11:06,  6.35s/it]
Generator loss: 1.9276543259620667, Discriminator loss: 0.6819565396975068:  65%|██████▌   | 195/300 [23:45<11:06,  6.35s/it]
Generator loss: 1.9276543259620667, Discriminator loss: 0.6819565396975068:  65%|██████▌   | 196/300 [23:45<11:00,  6.35s/it]
Generator loss: 1.9072776890414602, Discriminator loss: 0.7404027844176573:  65%|██████▌   | 196/300 [23:51<11:00,  6.35s/it]
Generator loss: 1.9072776890414602, Discriminator loss: 0.7404027844176573:  66%|██████▌   | 197/300 [23:51<10:53,  6.35s/it]
Generator loss: 1.9140455117997002, Discriminator loss: 0.695412377224249:  66%|██████▌   | 197/300 [23:57<10:53,  6.35s/it] 
Generator loss: 1.9140455117997002, Discriminator loss: 0.695412377224249:  66%|██████▌   | 198/300 [23:57<10:47,  6.35s/it]
Generator loss: 1.939264622681281, Discriminator loss: 0.6866086449693231:  66%|██████▌   | 198/300 [24:04<10:47,  6.35s/it]
Generator loss: 1.939264622681281, Discriminator loss: 0.6866086449693231:  66%|██████▋   | 199/300 [24:04<10:40,  6.34s/it]
Generator loss: 1.8983901568195398, Discriminator loss: 0.7000106559956775:  66%|██████▋   | 199/300 [24:10<10:40,  6.34s/it]
Generator loss: 1.8983901568195398, Discriminator loss: 0.7000106559956775:  67%|██████▋   | 200/300 [24:10<10:37,  6.37s/it]
Generator loss: 1.9391444851370419, Discriminator loss: 0.6873982890563852:  67%|██████▋   | 200/300 [24:17<10:37,  6.37s/it]
Generator loss: 1.9391444851370419, Discriminator loss: 0.6873982890563852:  67%|██████▋   | 201/300 [24:17<10:29,  6.36s/it]
Generator loss: 1.937527555753203, Discriminator loss: 0.6812970143030671:  67%|██████▋   | 201/300 [24:23<10:29,  6.36s/it] 
Generator loss: 1.937527555753203, Discriminator loss: 0.6812970143030671:  67%|██████▋   | 202/300 [24:23<10:22,  6.36s/it]
Generator loss: 1.9439553898923538, Discriminator loss: 0.6840969950837248:  67%|██████▋   | 202/300 [24:29<10:22,  6.36s/it]
Generator loss: 1.9439553898923538, Discriminator loss: 0.6840969950837248:  68%|██████▊   | 203/300 [24:29<10:15,  6.35s/it]
Generator loss: 1.9516027254216812, Discriminator loss: 0.6815296915524146:  68%|██████▊   | 203/300 [24:36<10:15,  6.35s/it]
Generator loss: 1.9516027254216812, Discriminator loss: 0.6815296915524146:  68%|██████▊   | 204/300 [24:36<10:09,  6.35s/it]
Generator loss: 1.947387943373007, Discriminator loss: 0.7077010602635496:  68%|██████▊   | 204/300 [24:42<10:09,  6.35s/it] 
Generator loss: 1.947387943373007, Discriminator loss: 0.7077010602635496:  68%|██████▊   | 205/300 [24:42<10:02,  6.34s/it]
Generator loss: 1.943607038434814, Discriminator loss: 0.6858728102901402:  68%|██████▊   | 205/300 [24:48<10:02,  6.34s/it]
Generator loss: 1.943607038434814, Discriminator loss: 0.6858728102901402:  69%|██████▊   | 206/300 [24:48<09:56,  6.34s/it]
Generator loss: 1.955567304702366, Discriminator loss: 0.6870937211548581:  69%|██████▊   | 206/300 [24:55<09:56,  6.34s/it]
Generator loss: 1.955567304702366, Discriminator loss: 0.6870937211548581:  69%|██████▉   | 207/300 [24:55<09:52,  6.37s/it]
Generator loss: 1.939436419045224, Discriminator loss: 0.684797972878989:  69%|██████▉   | 207/300 [25:01<09:52,  6.37s/it] 
Generator loss: 1.939436419045224, Discriminator loss: 0.684797972878989:  69%|██████▉   | 208/300 [25:01<09:45,  6.36s/it]
Generator loss: 1.951779685476247, Discriminator loss: 0.6865587295854793:  69%|██████▉   | 208/300 [25:07<09:45,  6.36s/it]
Generator loss: 1.951779685476247, Discriminator loss: 0.6865587295854793:  70%|██████▉   | 209/300 [25:07<09:38,  6.36s/it]
Generator loss: 1.9524799935957964, Discriminator loss: 0.6790948058752453:  70%|██████▉   | 209/300 [25:14<09:38,  6.36s/it]
Generator loss: 1.9524799935957964, Discriminator loss: 0.6790948058752453:  70%|███████   | 210/300 [25:14<09:31,  6.35s/it]
Generator loss: 1.9670421332120895, Discriminator loss: 0.6861023486537092:  70%|███████   | 210/300 [25:20<09:31,  6.35s/it]
Generator loss: 1.9670421332120895, Discriminator loss: 0.6861023486537092:  70%|███████   | 211/300 [25:20<09:24,  6.35s/it]
Generator loss: 1.9545674709712757, Discriminator loss: 0.6863196763922187:  70%|███████   | 211/300 [25:26<09:24,  6.35s/it]
Generator loss: 1.9545674709712757, Discriminator loss: 0.6863196763922187:  71%|███████   | 212/300 [25:26<09:18,  6.34s/it]
Generator loss: 1.958948059117093, Discriminator loss: 0.6826620995998383:  71%|███████   | 212/300 [25:33<09:18,  6.34s/it] 
Generator loss: 1.958948059117093, Discriminator loss: 0.6826620995998383:  71%|███████   | 213/300 [25:33<09:14,  6.37s/it]
Generator loss: 1.9341519448687048, Discriminator loss: 0.6950189532602534:  71%|███████   | 213/300 [25:39<09:14,  6.37s/it]
Generator loss: 1.9341519448687048, Discriminator loss: 0.6950189532602534:  71%|███████▏  | 214/300 [25:39<09:06,  6.36s/it]
Generator loss: 1.9501662600566358, Discriminator loss: 0.6898260752067846:  71%|███████▏  | 214/300 [25:46<09:06,  6.36s/it]
Generator loss: 1.9501662600566358, Discriminator loss: 0.6898260752067846:  72%|███████▏  | 215/300 [25:46<08:59,  6.35s/it]
Generator loss: 1.9576767919694675, Discriminator loss: 0.6672668312402332:  72%|███████▏  | 215/300 [25:52<08:59,  6.35s/it]
Generator loss: 1.9576767919694675, Discriminator loss: 0.6672668312402332:  72%|███████▏  | 216/300 [25:52<08:53,  6.35s/it]
Generator loss: 1.9782413156593548, Discriminator loss: 0.6659881433143335:  72%|███████▏  | 216/300 [25:58<08:53,  6.35s/it]
Generator loss: 1.9782413156593548, Discriminator loss: 0.6659881433143335:  72%|███████▏  | 217/300 [25:58<08:46,  6.35s/it]
Generator loss: 1.97869812039768, Discriminator loss: 0.6723256720339551:  72%|███████▏  | 217/300 [26:05<08:46,  6.35s/it]  
Generator loss: 1.97869812039768, Discriminator loss: 0.6723256720339551:  73%|███████▎  | 218/300 [26:05<08:40,  6.34s/it]
Generator loss: 1.9745400394586956, Discriminator loss: 0.6871117230723885:  73%|███████▎  | 218/300 [26:11<08:40,  6.34s/it]
Generator loss: 1.9745400394586956, Discriminator loss: 0.6871117230723885:  73%|███████▎  | 219/300 [26:11<08:35,  6.37s/it]
Generator loss: 1.9665792154915192, Discriminator loss: 0.6812433035058134:  73%|███████▎  | 219/300 [26:17<08:35,  6.37s/it]
Generator loss: 1.9665792154915192, Discriminator loss: 0.6812433035058134:  73%|███████▎  | 220/300 [26:17<08:28,  6.36s/it]
Generator loss: 1.9785992275266087, Discriminator loss: 0.67591892182827:  73%|███████▎  | 220/300 [26:24<08:28,  6.36s/it]  
Generator loss: 1.9785992275266087, Discriminator loss: 0.67591892182827:  74%|███████▎  | 221/300 [26:24<08:21,  6.35s/it]
Generator loss: 1.973692320725497, Discriminator loss: 0.676657390945098:  74%|███████▎  | 221/300 [26:30<08:21,  6.35s/it]
Generator loss: 1.973692320725497, Discriminator loss: 0.676657390945098:  74%|███████▍  | 222/300 [26:30<08:15,  6.35s/it]
Generator loss: 1.9833708035157007, Discriminator loss: 0.6975493062944973:  74%|███████▍  | 222/300 [26:36<08:15,  6.35s/it]
Generator loss: 1.9833708035157007, Discriminator loss: 0.6975493062944973:  74%|███████▍  | 223/300 [26:36<08:08,  6.34s/it]
Generator loss: 1.9668212234973907, Discriminator loss: 0.6715009891811539:  74%|███████▍  | 223/300 [26:43<08:08,  6.34s/it]
Generator loss: 1.9668212234973907, Discriminator loss: 0.6715009891811539:  75%|███████▍  | 224/300 [26:43<08:02,  6.34s/it]
Generator loss: 1.9748070257551529, Discriminator loss: 0.6686904412858626:  75%|███████▍  | 224/300 [26:49<08:02,  6.34s/it]
Generator loss: 1.9748070257551529, Discriminator loss: 0.6686904412858626:  75%|███████▌  | 225/300 [26:49<07:57,  6.37s/it]
Generator loss: 2.0004925999571297, Discriminator loss: 0.6750858426094055:  75%|███████▌  | 225/300 [26:55<07:57,  6.37s/it]
Generator loss: 2.0004925999571297, Discriminator loss: 0.6750858426094055:  75%|███████▌  | 226/300 [26:55<07:52,  6.38s/it]
Generator loss: 1.9894653199350132, Discriminator loss: 0.6713695771553937:  75%|███████▌  | 226/300 [27:02<07:52,  6.38s/it]
Generator loss: 1.9894653199350132, Discriminator loss: 0.6713695771553937:  76%|███████▌  | 227/300 [27:02<07:47,  6.40s/it]
Generator loss: 1.9966776563840754, Discriminator loss: 0.6589639384080382:  76%|███████▌  | 227/300 [27:08<07:47,  6.40s/it]
Generator loss: 1.9966776563840754, Discriminator loss: 0.6589639384080382:  76%|███████▌  | 228/300 [27:08<07:40,  6.39s/it]
Generator loss: 1.994829827810035, Discriminator loss: 0.6754506212823531:  76%|███████▌  | 228/300 [27:15<07:40,  6.39s/it] 
Generator loss: 1.994829827810035, Discriminator loss: 0.6754506212823531:  76%|███████▋  | 229/300 [27:15<07:33,  6.39s/it]
Generator loss: 2.001828783575226, Discriminator loss: 0.650514531223213:  76%|███████▋  | 229/300 [27:21<07:33,  6.39s/it] 
Generator loss: 2.001828783575226, Discriminator loss: 0.650514531223213:  77%|███████▋  | 230/300 [27:21<07:28,  6.40s/it]
Generator loss: 2.0208084548220917, Discriminator loss: 0.6529218304683181:  77%|███████▋  | 230/300 [27:28<07:28,  6.40s/it]
Generator loss: 2.0208084548220917, Discriminator loss: 0.6529218304683181:  77%|███████▋  | 231/300 [27:28<07:23,  6.43s/it]
Generator loss: 2.000203351764118, Discriminator loss: 0.665237603818669:  77%|███████▋  | 231/300 [27:34<07:23,  6.43s/it]  
Generator loss: 2.000203351764118, Discriminator loss: 0.665237603818669:  77%|███████▋  | 232/300 [27:34<07:15,  6.40s/it]
Generator loss: 2.0168945166994545, Discriminator loss: 0.6575310440624461:  77%|███████▋  | 232/300 [27:40<07:15,  6.40s/it]
Generator loss: 2.0168945166994545, Discriminator loss: 0.6575310440624461:  78%|███████▊  | 233/300 [27:40<07:08,  6.40s/it]
Generator loss: 2.0304839768830467, Discriminator loss: 0.6567836140885073:  78%|███████▊  | 233/300 [27:47<07:08,  6.40s/it]
Generator loss: 2.0304839768830467, Discriminator loss: 0.6567836140885073:  78%|███████▊  | 234/300 [27:47<07:02,  6.40s/it]
Generator loss: 1.983285310513833, Discriminator loss: 0.692187463535982:  78%|███████▊  | 234/300 [27:53<07:02,  6.40s/it]  
Generator loss: 1.983285310513833, Discriminator loss: 0.692187463535982:  78%|███████▊  | 235/300 [27:53<06:55,  6.39s/it]
Generator loss: 1.9890784682596432, Discriminator loss: 0.6605642341515597:  78%|███████▊  | 235/300 [28:00<06:55,  6.39s/it]
Generator loss: 1.9890784682596432, Discriminator loss: 0.6605642341515597:  79%|███████▊  | 236/300 [28:00<06:49,  6.39s/it]
Generator loss: 2.0142978377201977, Discriminator loss: 0.653194293818053:  79%|███████▊  | 236/300 [28:06<06:49,  6.39s/it] 
Generator loss: 2.0142978377201977, Discriminator loss: 0.653194293818053:  79%|███████▉  | 237/300 [28:06<06:44,  6.42s/it]
Generator loss: 2.049765216953614, Discriminator loss: 0.6443047887262177:  79%|███████▉  | 237/300 [28:12<06:44,  6.42s/it]
Generator loss: 2.049765216953614, Discriminator loss: 0.6443047887262177:  79%|███████▉  | 238/300 [28:12<06:38,  6.43s/it]
Generator loss: 2.0278301063705895, Discriminator loss: 0.64878829963067:  79%|███████▉  | 238/300 [28:19<06:38,  6.43s/it] 
Generator loss: 2.0278301063705895, Discriminator loss: 0.64878829963067:  80%|███████▉  | 239/300 [28:19<06:31,  6.42s/it]
Generator loss: 2.043862792498925, Discriminator loss: 0.658869181485737:  80%|███████▉  | 239/300 [28:25<06:31,  6.42s/it]
Generator loss: 2.043862792498925, Discriminator loss: 0.658869181485737:  80%|████████  | 240/300 [28:25<06:24,  6.41s/it]
Generator loss: 2.0490779552389595, Discriminator loss: 0.6539010339800049:  80%|████████  | 240/300 [28:32<06:24,  6.41s/it]
Generator loss: 2.0490779552389595, Discriminator loss: 0.6539010339800049:  80%|████████  | 241/300 [28:32<06:18,  6.41s/it]
Generator loss: 2.0562286771395626, Discriminator loss: 0.6415835838107502:  80%|████████  | 241/300 [28:38<06:18,  6.41s/it]
Generator loss: 2.0562286771395626, Discriminator loss: 0.6415835838107502:  81%|████████  | 242/300 [28:38<06:12,  6.42s/it]
Generator loss: 2.0574584980221355, Discriminator loss: 0.633175873581101:  81%|████████  | 242/300 [28:45<06:12,  6.42s/it] 
Generator loss: 2.0574584980221355, Discriminator loss: 0.633175873581101:  81%|████████  | 243/300 [28:45<06:07,  6.44s/it]
Generator loss: 2.0515074326711544, Discriminator loss: 0.6391502916812897:  81%|████████  | 243/300 [28:51<06:07,  6.44s/it]
Generator loss: 2.0515074326711544, Discriminator loss: 0.6391502916812897:  81%|████████▏ | 244/300 [28:51<05:59,  6.43s/it]
Generator loss: 2.0736149461830364, Discriminator loss: 0.6443934633451349:  81%|████████▏ | 244/300 [28:57<05:59,  6.43s/it]
Generator loss: 2.0736149461830364, Discriminator loss: 0.6443934633451349:  82%|████████▏ | 245/300 [28:57<05:53,  6.43s/it]
Generator loss: 2.0439358862007366, Discriminator loss: 0.675808913567487:  82%|████████▏ | 245/300 [29:04<05:53,  6.43s/it] 
Generator loss: 2.0439358862007366, Discriminator loss: 0.675808913567487:  82%|████████▏ | 246/300 [29:04<05:46,  6.42s/it]
Generator loss: 2.0692701777991127, Discriminator loss: 0.6446802813340636:  82%|████████▏ | 246/300 [29:10<05:46,  6.42s/it]
Generator loss: 2.0692701777991127, Discriminator loss: 0.6446802813340636:  82%|████████▏ | 247/300 [29:10<05:40,  6.42s/it]
Generator loss: 2.049621323452276, Discriminator loss: 0.6281508106519195:  82%|████████▏ | 247/300 [29:17<05:40,  6.42s/it] 
Generator loss: 2.049621323452276, Discriminator loss: 0.6281508106519195:  83%|████████▎ | 248/300 [29:17<05:33,  6.41s/it]
Generator loss: 2.068838319357704, Discriminator loss: 0.6554880567333278:  83%|████████▎ | 248/300 [29:23<05:33,  6.41s/it]
Generator loss: 2.068838319357704, Discriminator loss: 0.6554880567333278:  83%|████████▎ | 249/300 [29:23<05:27,  6.42s/it]
Generator loss: 2.0447138083331726, Discriminator loss: 0.6430171342457042:  83%|████████▎ | 249/300 [29:30<05:27,  6.42s/it]
Generator loss: 2.0447138083331726, Discriminator loss: 0.6430171342457042:  83%|████████▎ | 250/300 [29:30<05:22,  6.44s/it]
Generator loss: 2.0798104010960636, Discriminator loss: 0.6381222447928261:  83%|████████▎ | 250/300 [29:36<05:22,  6.44s/it]
Generator loss: 2.0798104010960636, Discriminator loss: 0.6381222447928261:  84%|████████▎ | 251/300 [29:36<05:15,  6.43s/it]
Generator loss: 2.071589014985982, Discriminator loss: 0.6498981937766075:  84%|████████▎ | 251/300 [29:42<05:15,  6.43s/it] 
Generator loss: 2.071589014985982, Discriminator loss: 0.6498981937766075:  84%|████████▍ | 252/300 [29:42<05:08,  6.43s/it]
Generator loss: 2.0728830090340447, Discriminator loss: 0.6422935833825785:  84%|████████▍ | 252/300 [29:49<05:08,  6.43s/it]
Generator loss: 2.0728830090340447, Discriminator loss: 0.6422935833825785:  84%|████████▍ | 253/300 [29:49<05:01,  6.42s/it]
Generator loss: 2.059836439350072, Discriminator loss: 0.6349203437566757:  84%|████████▍ | 253/300 [29:55<05:01,  6.42s/it] 
Generator loss: 2.059836439350072, Discriminator loss: 0.6349203437566757:  85%|████████▍ | 254/300 [29:55<04:55,  6.41s/it]
Generator loss: 2.091802353368086, Discriminator loss: 0.6277351261061781:  85%|████████▍ | 254/300 [30:02<04:55,  6.41s/it]
Generator loss: 2.091802353368086, Discriminator loss: 0.6277351261061781:  85%|████████▌ | 255/300 [30:02<04:48,  6.42s/it]
Generator loss: 2.0656274004894146, Discriminator loss: 0.6457043259459383:  85%|████████▌ | 255/300 [30:08<04:48,  6.42s/it]
Generator loss: 2.0656274004894146, Discriminator loss: 0.6457043259459383:  85%|████████▌ | 256/300 [30:08<04:43,  6.45s/it]
Generator loss: 2.1072967964060165, Discriminator loss: 0.6247626034652486:  85%|████████▌ | 256/300 [30:15<04:43,  6.45s/it]
Generator loss: 2.1072967964060165, Discriminator loss: 0.6247626034652486:  86%|████████▌ | 257/300 [30:15<04:36,  6.44s/it]
Generator loss: 2.0962683190317715, Discriminator loss: 0.631443692919086:  86%|████████▌ | 257/300 [30:21<04:36,  6.44s/it] 
Generator loss: 2.0962683190317715, Discriminator loss: 0.631443692919086:  86%|████████▌ | 258/300 [30:21<04:30,  6.43s/it]
Generator loss: 2.1010903006090835, Discriminator loss: 0.6387709094321027:  86%|████████▌ | 258/300 [30:27<04:30,  6.43s/it]
Generator loss: 2.1010903006090835, Discriminator loss: 0.6387709094321027:  86%|████████▋ | 259/300 [30:27<04:23,  6.43s/it]
Generator loss: 2.108405414749594, Discriminator loss: 0.6291532516479492:  86%|████████▋ | 259/300 [30:34<04:23,  6.43s/it] 
Generator loss: 2.108405414749594, Discriminator loss: 0.6291532516479492:  87%|████████▋ | 260/300 [30:34<04:17,  6.43s/it]
Generator loss: 2.1114456373102524, Discriminator loss: 0.6167570541010183:  87%|████████▋ | 260/300 [30:40<04:17,  6.43s/it]
Generator loss: 2.1114456373102524, Discriminator loss: 0.6167570541010183:  87%|████████▋ | 261/300 [30:40<04:10,  6.43s/it]
Generator loss: 2.1329608182696735, Discriminator loss: 0.6212209826883148:  87%|████████▋ | 261/300 [30:47<04:10,  6.43s/it]
Generator loss: 2.1329608182696735, Discriminator loss: 0.6212209826883148:  87%|████████▋ | 262/300 [30:47<04:05,  6.45s/it]
Generator loss: 2.120411331162733, Discriminator loss: 0.6373535583124441:  87%|████████▋ | 262/300 [30:53<04:05,  6.45s/it] 
Generator loss: 2.120411331162733, Discriminator loss: 0.6373535583124441:  88%|████████▊ | 263/300 [30:53<03:59,  6.48s/it]
Generator loss: 2.099952261237537, Discriminator loss: 0.6335013679721776:  88%|████████▊ | 263/300 [31:00<03:59,  6.48s/it]
Generator loss: 2.099952261237537, Discriminator loss: 0.6335013679721776:  88%|████████▊ | 264/300 [31:00<03:54,  6.52s/it]
Generator loss: 2.1003993810976254, Discriminator loss: 0.6331959783154375:  88%|████████▊ | 264/300 [31:07<03:54,  6.52s/it]
Generator loss: 2.1003993810976254, Discriminator loss: 0.6331959783154375:  88%|████████▊ | 265/300 [31:07<03:49,  6.55s/it]
Generator loss: 2.088849120280322, Discriminator loss: 0.6298180586274933:  88%|████████▊ | 265/300 [31:13<03:49,  6.55s/it] 
Generator loss: 2.088849120280322, Discriminator loss: 0.6298180586274933:  89%|████████▊ | 266/300 [31:13<03:43,  6.57s/it]
Generator loss: 2.1161189938292786, Discriminator loss: 0.6166971799205331:  89%|████████▊ | 266/300 [31:20<03:43,  6.57s/it]
Generator loss: 2.1161189938292786, Discriminator loss: 0.6166971799205331:  89%|████████▉ | 267/300 [31:20<03:37,  6.59s/it]
Generator loss: 2.142376761226093, Discriminator loss: 0.6130306541043169:  89%|████████▉ | 267/300 [31:27<03:37,  6.59s/it] 
Generator loss: 2.142376761226093, Discriminator loss: 0.6130306541043169:  89%|████████▉ | 268/300 [31:27<03:32,  6.65s/it]
Generator loss: 2.1195085732375873, Discriminator loss: 0.6244634438086959:  89%|████████▉ | 268/300 [31:33<03:32,  6.65s/it]
Generator loss: 2.1195085732375873, Discriminator loss: 0.6244634438086959:  90%|████████▉ | 269/300 [31:33<03:26,  6.67s/it]
Generator loss: 2.136966422200203, Discriminator loss: 0.6256932173581684:  90%|████████▉ | 269/300 [31:40<03:26,  6.67s/it] 
Generator loss: 2.136966422200203, Discriminator loss: 0.6256932173581684:  90%|█████████ | 270/300 [31:40<03:20,  6.69s/it]
Generator loss: 2.1253008071114037, Discriminator loss: 0.6211840646231875:  90%|█████████ | 270/300 [31:47<03:20,  6.69s/it]
Generator loss: 2.1253008071114037, Discriminator loss: 0.6211840646231875:  90%|█████████ | 271/300 [31:47<03:14,  6.70s/it]
Generator loss: 2.1245981340899185, Discriminator loss: 0.6270312956150841:  90%|█████████ | 271/300 [31:53<03:14,  6.70s/it]
Generator loss: 2.1245981340899185, Discriminator loss: 0.6270312956150841:  91%|█████████ | 272/300 [31:53<03:07,  6.71s/it]
Generator loss: 2.151730140342432, Discriminator loss: 0.6084378054913353:  91%|█████████ | 272/300 [32:00<03:07,  6.71s/it] 
Generator loss: 2.151730140342432, Discriminator loss: 0.6084378054913353:  91%|█████████ | 273/300 [32:00<03:01,  6.72s/it]
Generator loss: 2.1285239063641606, Discriminator loss: 0.6252047546646174:  91%|█████████ | 273/300 [32:07<03:01,  6.72s/it]
Generator loss: 2.1285239063641606, Discriminator loss: 0.6252047546646174:  91%|█████████▏| 274/300 [32:07<02:54,  6.73s/it]
Generator loss: 2.1367137300617554, Discriminator loss: 0.6136608995935496:  91%|█████████▏| 274/300 [32:14<02:54,  6.73s/it]
Generator loss: 2.1367137300617554, Discriminator loss: 0.6136608995935496:  92%|█████████▏| 275/300 [32:14<02:47,  6.71s/it]
Generator loss: 2.147204104153549, Discriminator loss: 0.6176723807173616:  92%|█████████▏| 275/300 [32:20<02:47,  6.71s/it] 
Generator loss: 2.147204104153549, Discriminator loss: 0.6176723807173616:  92%|█████████▏| 276/300 [32:20<02:41,  6.71s/it]
Generator loss: 2.1437266688136494, Discriminator loss: 0.6095429987591856:  92%|█████████▏| 276/300 [32:27<02:41,  6.71s/it]
Generator loss: 2.1437266688136494, Discriminator loss: 0.6095429987591856:  92%|█████████▏| 277/300 [32:27<02:33,  6.68s/it]
Generator loss: 2.152301244875964, Discriminator loss: 0.618432903991026:  92%|█████████▏| 277/300 [32:34<02:33,  6.68s/it]  
Generator loss: 2.152301244875964, Discriminator loss: 0.618432903991026:  93%|█████████▎| 278/300 [32:34<02:27,  6.69s/it]
Generator loss: 2.1435503083116867, Discriminator loss: 0.6124763900742811:  93%|█████████▎| 278/300 [32:40<02:27,  6.69s/it]
Generator loss: 2.1435503083116867, Discriminator loss: 0.6124763900742811:  93%|█████████▎| 279/300 [32:40<02:20,  6.69s/it]
Generator loss: 2.1760846148518955, Discriminator loss: 0.6109043350991081:  93%|█████████▎| 279/300 [32:47<02:20,  6.69s/it]
Generator loss: 2.1760846148518955, Discriminator loss: 0.6109043350991081:  93%|█████████▎| 280/300 [32:47<02:14,  6.70s/it]
Generator loss: 2.1473377986865887, Discriminator loss: 0.6142729858265203:  93%|█████████▎| 280/300 [32:54<02:14,  6.70s/it]
Generator loss: 2.1473377986865887, Discriminator loss: 0.6142729858265203:  94%|█████████▎| 281/300 [32:54<02:07,  6.70s/it]
Generator loss: 2.147419778739705, Discriminator loss: 0.6075115444905618:  94%|█████████▎| 281/300 [33:00<02:07,  6.70s/it] 
Generator loss: 2.147419778739705, Discriminator loss: 0.6075115444905618:  94%|█████████▍| 282/300 [33:00<02:00,  6.68s/it]
Generator loss: 2.1586636971024906, Discriminator loss: 0.6037897603476748:  94%|█████████▍| 282/300 [33:07<02:00,  6.68s/it]
Generator loss: 2.1586636971024906, Discriminator loss: 0.6037897603476748:  94%|█████████▍| 283/300 [33:07<01:53,  6.67s/it]
Generator loss: 2.1525688136325165, Discriminator loss: 0.611905101029312:  94%|█████████▍| 283/300 [33:14<01:53,  6.67s/it] 
Generator loss: 2.1525688136325165, Discriminator loss: 0.611905101029312:  95%|█████████▍| 284/300 [33:14<01:46,  6.66s/it]
Generator loss: 2.1883829025661243, Discriminator loss: 0.6092779662679223:  95%|█████████▍| 284/300 [33:20<01:46,  6.66s/it]
Generator loss: 2.1883829025661243, Discriminator loss: 0.6092779662679223:  95%|█████████▌| 285/300 [33:20<01:39,  6.66s/it]
Generator loss: 2.175755638410063, Discriminator loss: 0.5928393164101768:  95%|█████████▌| 285/300 [33:27<01:39,  6.66s/it] 
Generator loss: 2.175755638410063, Discriminator loss: 0.5928393164101768:  95%|█████████▌| 286/300 [33:27<01:33,  6.67s/it]
Generator loss: 2.1687005433966133, Discriminator loss: 0.6157251898856724:  95%|█████████▌| 286/300 [33:34<01:33,  6.67s/it]
Generator loss: 2.1687005433966133, Discriminator loss: 0.6157251898856724:  96%|█████████▌| 287/300 [33:34<01:26,  6.65s/it]
Generator loss: 2.16642308585784, Discriminator loss: 0.6245952195980969:  96%|█████████▌| 287/300 [33:40<01:26,  6.65s/it]  
Generator loss: 2.16642308585784, Discriminator loss: 0.6245952195980969:  96%|█████████▌| 288/300 [33:40<01:19,  6.63s/it]
Generator loss: 2.178683908546672, Discriminator loss: 0.593368058476378:  96%|█████████▌| 288/300 [33:47<01:19,  6.63s/it]
Generator loss: 2.178683908546672, Discriminator loss: 0.593368058476378:  96%|█████████▋| 289/300 [33:47<01:12,  6.62s/it]
Generator loss: 2.181570036446347, Discriminator loss: 0.5935841163291651:  96%|█████████▋| 289/300 [33:53<01:12,  6.62s/it]
Generator loss: 2.181570036446347, Discriminator loss: 0.5935841163291651:  97%|█████████▋| 290/300 [33:53<01:06,  6.62s/it]
Generator loss: 2.180004572167116, Discriminator loss: 0.6197993030004642:  97%|█████████▋| 290/300 [34:00<01:06,  6.62s/it]
Generator loss: 2.180004572167116, Discriminator loss: 0.6197993030004642:  97%|█████████▋| 291/300 [34:00<00:59,  6.61s/it]
Generator loss: 2.157050187096876, Discriminator loss: 0.5961371349061236:  97%|█████████▋| 291/300 [34:07<00:59,  6.61s/it]
Generator loss: 2.157050187096876, Discriminator loss: 0.5961371349061236:  97%|█████████▋| 292/300 [34:07<00:52,  6.62s/it]
Generator loss: 2.181761022876291, Discriminator loss: 0.5929972853730706:  97%|█████████▋| 292/300 [34:13<00:52,  6.62s/it]
Generator loss: 2.181761022876291, Discriminator loss: 0.5929972853730706:  98%|█████████▊| 293/300 [34:13<00:46,  6.65s/it]
Generator loss: 2.198557503959712, Discriminator loss: 0.6052266035009833:  98%|█████████▊| 293/300 [34:20<00:46,  6.65s/it]
Generator loss: 2.198557503959712, Discriminator loss: 0.6052266035009833:  98%|█████████▊| 294/300 [34:20<00:39,  6.64s/it]
Generator loss: 2.1911894468700184, Discriminator loss: 0.5976759361870149:  98%|█████████▊| 294/300 [34:27<00:39,  6.64s/it]
Generator loss: 2.1911894468700184, Discriminator loss: 0.5976759361870149:  98%|█████████▊| 295/300 [34:27<00:33,  6.62s/it]
Generator loss: 2.207013448371607, Discriminator loss: 0.5941765610786045:  98%|█████████▊| 295/300 [34:33<00:33,  6.62s/it]
Generator loss: 2.207013448371607, Discriminator loss: 0.5941765610786045:  99%|█████████▊| 296/300 [34:33<00:26,  6.60s/it]
Generator loss: 2.2090228690820584, Discriminator loss: 0.5952651058049763:  99%|█████████▊| 296/300 [34:40<00:26,  6.60s/it]
Generator loss: 2.2090228690820584, Discriminator loss: 0.5952651058049763:  99%|█████████▉| 297/300 [34:40<00:19,  6.61s/it]
Generator loss: 2.207116046372582, Discriminator loss: 0.5886701748651617:  99%|█████████▉| 297/300 [34:46<00:19,  6.61s/it]
Generator loss: 2.207116046372582, Discriminator loss: 0.5886701748651617:  99%|█████████▉| 298/300 [34:46<00:13,  6.60s/it]
Generator loss: 2.222158696721582, Discriminator loss: 0.5882657041006228:  99%|█████████▉| 298/300 [34:53<00:13,  6.60s/it]
Generator loss: 2.222158696721582, Discriminator loss: 0.5882657041006228: 100%|█████████▉| 299/300 [34:53<00:06,  6.64s/it]
Generator loss: 2.2106246729107464, Discriminator loss: 0.5984119886861128: 100%|█████████▉| 299/300 [35:00<00:06,  6.64s/it]
Generator loss: 2.2106246729107464, Discriminator loss: 0.5984119886861128: 100%|██████████| 300/300 [35:00<00:00,  6.60s/it]
Generator loss: 2.2106246729107464, Discriminator loss: 0.5984119886861128: 100%|██████████| 300/300 [35:00<00:00,  7.00s/it]
Training Completed!

serious_mnist.py

  0%|          | 0/1875 [00:00<?, ?it/s]
loss 2.32 accuracy 0.09:   0%|          | 0/1875 [00:07<?, ?it/s]
loss 2.32 accuracy 0.09:   0%|          | 1/1875 [00:07<3:53:47,  7.49s/it]
loss 2.32 accuracy 0.06:   0%|          | 1/1875 [00:10<3:53:47,  7.49s/it]
loss 2.32 accuracy 0.06:   0%|          | 2/1875 [00:10<2:23:39,  4.60s/it]
loss 2.34 accuracy 0.12:   0%|          | 2/1875 [00:10<2:23:39,  4.60s/it]
loss 2.29 accuracy 0.25:   0%|          | 2/1875 [00:10<2:23:39,  4.60s/it]
loss 2.29 accuracy 0.22:   0%|          | 2/1875 [00:10<2:23:39,  4.60s/it]
loss 2.24 accuracy 0.22:   0%|          | 2/1875 [00:10<2:23:39,  4.60s/it]
loss 2.26 accuracy 0.06:   0%|          | 2/1875 [00:10<2:23:39,  4.60s/it]
loss 2.26 accuracy 0.06:   0%|          | 7/1875 [00:10<28:06,  1.11it/s]
loss 2.35 accuracy 0.19:   0%|          | 7/1875 [00:10<28:06,  1.11it/s]
loss 2.37 accuracy 0.19:   0%|          | 7/1875 [00:10<28:06,  1.11it/s]
loss 2.25 accuracy 0.12:   0%|          | 7/1875 [00:10<28:06,  1.11it/s]
loss 2.16 accuracy 0.31:   0%|          | 7/1875 [00:10<28:06,  1.11it/s]
loss 2.24 accuracy 0.12:   0%|          | 7/1875 [00:10<28:06,  1.11it/s]
loss 2.24 accuracy 0.12:   1%|          | 12/1875 [00:10<13:23,  2.32it/s]
loss 2.16 accuracy 0.16:   1%|          | 12/1875 [00:10<13:23,  2.32it/s]
loss 2.14 accuracy 0.09:   1%|          | 12/1875 [00:10<13:23,  2.32it/s]
loss 2.18 accuracy 0.22:   1%|          | 12/1875 [00:10<13:23,  2.32it/s]
loss 2.18 accuracy 0.25:   1%|          | 12/1875 [00:10<13:23,  2.32it/s]
loss 2.03 accuracy 0.41:   1%|          | 12/1875 [00:10<13:23,  2.32it/s]
loss 2.03 accuracy 0.41:   1%|          | 17/1875 [00:10<07:51,  3.94it/s]
loss 2.00 accuracy 0.25:   1%|          | 17/1875 [00:10<07:51,  3.94it/s]
loss 1.91 accuracy 0.47:   1%|          | 17/1875 [00:10<07:51,  3.94it/s]
loss 2.09 accuracy 0.22:   1%|          | 17/1875 [00:10<07:51,  3.94it/s]
loss 1.93 accuracy 0.34:   1%|          | 17/1875 [00:10<07:51,  3.94it/s]
loss 1.78 accuracy 0.38:   1%|          | 17/1875 [00:10<07:51,  3.94it/s]
loss 1.78 accuracy 0.38:   1%|          | 22/1875 [00:10<05:05,  6.06it/s]
loss 1.78 accuracy 0.38:   1%|          | 22/1875 [00:10<05:05,  6.06it/s]
loss 1.94 accuracy 0.38:   1%|          | 22/1875 [00:10<05:05,  6.06it/s]
loss 1.88 accuracy 0.47:   1%|          | 22/1875 [00:10<05:05,  6.06it/s]
loss 1.90 accuracy 0.38:   1%|          | 22/1875 [00:10<05:05,  6.06it/s]
loss 1.89 accuracy 0.41:   1%|          | 22/1875 [00:10<05:05,  6.06it/s]
loss 1.89 accuracy 0.41:   1%|▏         | 27/1875 [00:10<03:31,  8.74it/s]
loss 1.78 accuracy 0.31:   1%|▏         | 27/1875 [00:10<03:31,  8.74it/s]
loss 1.76 accuracy 0.41:   1%|▏         | 27/1875 [00:10<03:31,  8.74it/s]
loss 1.96 accuracy 0.19:   1%|▏         | 27/1875 [00:10<03:31,  8.74it/s]
loss 1.74 accuracy 0.31:   1%|▏         | 27/1875 [00:10<03:31,  8.74it/s]
loss 1.62 accuracy 0.44:   1%|▏         | 27/1875 [00:10<03:31,  8.74it/s]
loss 1.62 accuracy 0.44:   2%|▏         | 32/1875 [00:10<02:33, 12.00it/s]
loss 1.69 accuracy 0.44:   2%|▏         | 32/1875 [00:10<02:33, 12.00it/s]
loss 1.74 accuracy 0.41:   2%|▏         | 32/1875 [00:10<02:33, 12.00it/s]
loss 1.77 accuracy 0.38:   2%|▏         | 32/1875 [00:10<02:33, 12.00it/s]
loss 1.67 accuracy 0.44:   2%|▏         | 32/1875 [00:10<02:33, 12.00it/s]
loss 1.58 accuracy 0.47:   2%|▏         | 32/1875 [00:10<02:33, 12.00it/s]
loss 1.58 accuracy 0.47:   2%|▏         | 37/1875 [00:10<01:56, 15.75it/s]
loss 1.72 accuracy 0.34:   2%|▏         | 37/1875 [00:10<01:56, 15.75it/s]
loss 1.58 accuracy 0.41:   2%|▏         | 37/1875 [00:10<01:56, 15.75it/s]
loss 1.63 accuracy 0.50:   2%|▏         | 37/1875 [00:10<01:56, 15.75it/s]
loss 1.66 accuracy 0.44:   2%|▏         | 37/1875 [00:10<01:56, 15.75it/s]
loss 1.78 accuracy 0.38:   2%|▏         | 37/1875 [00:10<01:56, 15.75it/s]
loss 1.78 accuracy 0.38:   2%|▏         | 42/1875 [00:10<01:32, 19.90it/s]
loss 1.42 accuracy 0.47:   2%|▏         | 42/1875 [00:10<01:32, 19.90it/s]
loss 1.38 accuracy 0.53:   2%|▏         | 42/1875 [00:10<01:32, 19.90it/s]
loss 1.71 accuracy 0.34:   2%|▏         | 42/1875 [00:11<01:32, 19.90it/s]
loss 1.47 accuracy 0.41:   2%|▏         | 42/1875 [00:11<01:32, 19.90it/s]
loss 1.51 accuracy 0.47:   2%|▏         | 42/1875 [00:11<01:32, 19.90it/s]
loss 1.51 accuracy 0.47:   3%|▎         | 47/1875 [00:11<01:15, 24.18it/s]
loss 1.61 accuracy 0.38:   3%|▎         | 47/1875 [00:11<01:15, 24.18it/s]
loss 1.33 accuracy 0.53:   3%|▎         | 47/1875 [00:11<01:15, 24.18it/s]
loss 1.51 accuracy 0.56:   3%|▎         | 47/1875 [00:11<01:15, 24.18it/s]
loss 1.26 accuracy 0.69:   3%|▎         | 47/1875 [00:11<01:15, 24.18it/s]
loss 1.26 accuracy 0.56:   3%|▎         | 47/1875 [00:11<01:15, 24.18it/s]
loss 1.26 accuracy 0.56:   3%|▎         | 52/1875 [00:11<01:04, 28.33it/s]
loss 1.31 accuracy 0.56:   3%|▎         | 52/1875 [00:11<01:04, 28.33it/s]
loss 1.53 accuracy 0.44:   3%|▎         | 52/1875 [00:11<01:04, 28.33it/s]
loss 1.29 accuracy 0.62:   3%|▎         | 52/1875 [00:11<01:04, 28.33it/s]
loss 1.32 accuracy 0.53:   3%|▎         | 52/1875 [00:11<01:04, 28.33it/s]
loss 1.39 accuracy 0.53:   3%|▎         | 52/1875 [00:11<01:04, 28.33it/s]
loss 1.39 accuracy 0.53:   3%|▎         | 57/1875 [00:11<00:56, 32.13it/s]
loss 1.43 accuracy 0.47:   3%|▎         | 57/1875 [00:11<00:56, 32.13it/s]
loss 1.23 accuracy 0.59:   3%|▎         | 57/1875 [00:11<00:56, 32.13it/s]
loss 1.26 accuracy 0.72:   3%|▎         | 57/1875 [00:11<00:56, 32.13it/s]
loss 1.60 accuracy 0.50:   3%|▎         | 57/1875 [00:11<00:56, 32.13it/s]
loss 1.19 accuracy 0.72:   3%|▎         | 57/1875 [00:11<00:56, 32.13it/s]
loss 1.19 accuracy 0.72:   3%|▎         | 62/1875 [00:11<00:51, 35.39it/s]
loss 1.44 accuracy 0.59:   3%|▎         | 62/1875 [00:11<00:51, 35.39it/s]
loss 1.54 accuracy 0.44:   3%|▎         | 62/1875 [00:11<00:51, 35.39it/s]
loss 1.45 accuracy 0.47:   3%|▎         | 62/1875 [00:11<00:51, 35.39it/s]
loss 1.38 accuracy 0.53:   3%|▎         | 62/1875 [00:11<00:51, 35.39it/s]
loss 1.12 accuracy 0.66:   3%|▎         | 62/1875 [00:11<00:51, 35.39it/s]
loss 1.12 accuracy 0.66:   4%|▎         | 67/1875 [00:11<00:47, 38.05it/s]
loss 1.32 accuracy 0.53:   4%|▎         | 67/1875 [00:11<00:47, 38.05it/s]
loss 1.37 accuracy 0.47:   4%|▎         | 67/1875 [00:11<00:47, 38.05it/s]
loss 1.29 accuracy 0.44:   4%|▎         | 67/1875 [00:11<00:47, 38.05it/s]
loss 1.37 accuracy 0.41:   4%|▎         | 67/1875 [00:11<00:47, 38.05it/s]
loss 1.19 accuracy 0.59:   4%|▎         | 67/1875 [00:11<00:47, 38.05it/s]
loss 1.19 accuracy 0.59:   4%|▍         | 72/1875 [00:11<00:44, 40.18it/s]
loss 1.45 accuracy 0.47:   4%|▍         | 72/1875 [00:11<00:44, 40.18it/s]
loss 1.52 accuracy 0.53:   4%|▍         | 72/1875 [00:11<00:44, 40.18it/s]
loss 1.20 accuracy 0.56:   4%|▍         | 72/1875 [00:11<00:44, 40.18it/s]
loss 1.37 accuracy 0.50:   4%|▍         | 72/1875 [00:11<00:44, 40.18it/s]
loss 1.36 accuracy 0.59:   4%|▍         | 72/1875 [00:11<00:44, 40.18it/s]
loss 1.36 accuracy 0.59:   4%|▍         | 77/1875 [00:11<00:42, 41.83it/s]
loss 1.07 accuracy 0.69:   4%|▍         | 77/1875 [00:11<00:42, 41.83it/s]
loss 1.15 accuracy 0.62:   4%|▍         | 77/1875 [00:11<00:42, 41.83it/s]
loss 1.37 accuracy 0.53:   4%|▍         | 77/1875 [00:11<00:42, 41.83it/s]
loss 1.27 accuracy 0.66:   4%|▍         | 77/1875 [00:11<00:42, 41.83it/s]
loss 1.17 accuracy 0.66:   4%|▍         | 77/1875 [00:11<00:42, 41.83it/s]
loss 1.17 accuracy 0.66:   4%|▍         | 82/1875 [00:11<00:41, 43.02it/s]
loss 1.07 accuracy 0.62:   4%|▍         | 82/1875 [00:11<00:41, 43.02it/s]
loss 1.24 accuracy 0.69:   4%|▍         | 82/1875 [00:11<00:41, 43.02it/s]
loss 1.27 accuracy 0.47:   4%|▍         | 82/1875 [00:11<00:41, 43.02it/s]
loss 1.33 accuracy 0.56:   4%|▍         | 82/1875 [00:11<00:41, 43.02it/s]
loss 1.24 accuracy 0.59:   4%|▍         | 82/1875 [00:11<00:41, 43.02it/s]
loss 1.24 accuracy 0.59:   5%|▍         | 87/1875 [00:11<00:40, 43.93it/s]
loss 1.12 accuracy 0.62:   5%|▍         | 87/1875 [00:11<00:40, 43.93it/s]
loss 1.20 accuracy 0.59:   5%|▍         | 87/1875 [00:11<00:40, 43.93it/s]
loss 0.96 accuracy 0.75:   5%|▍         | 87/1875 [00:11<00:40, 43.93it/s]
loss 1.24 accuracy 0.53:   5%|▍         | 87/1875 [00:12<00:40, 43.93it/s]
loss 1.31 accuracy 0.56:   5%|▍         | 87/1875 [00:12<00:40, 43.93it/s]
loss 1.31 accuracy 0.56:   5%|▍         | 92/1875 [00:12<00:40, 44.55it/s]
loss 1.30 accuracy 0.41:   5%|▍         | 92/1875 [00:12<00:40, 44.55it/s]
loss 1.26 accuracy 0.56:   5%|▍         | 92/1875 [00:12<00:40, 44.55it/s]
loss 1.03 accuracy 0.69:   5%|▍         | 92/1875 [00:12<00:40, 44.55it/s]
loss 0.84 accuracy 0.88:   5%|▍         | 92/1875 [00:12<00:40, 44.55it/s]
loss 1.10 accuracy 0.66:   5%|▍         | 92/1875 [00:12<00:40, 44.55it/s]
loss 1.10 accuracy 0.66:   5%|▌         | 97/1875 [00:12<00:39, 45.01it/s]
loss 1.14 accuracy 0.56:   5%|▌         | 97/1875 [00:12<00:39, 45.01it/s]
loss 0.83 accuracy 0.88:   5%|▌         | 97/1875 [00:12<00:39, 45.01it/s]
loss 1.00 accuracy 0.69:   5%|▌         | 97/1875 [00:12<00:39, 45.01it/s]
loss 1.18 accuracy 0.47:   5%|▌         | 97/1875 [00:12<00:39, 45.01it/s]
loss 1.16 accuracy 0.50:   5%|▌         | 97/1875 [00:12<00:39, 45.01it/s]
loss 1.16 accuracy 0.50:   5%|▌         | 102/1875 [00:12<00:39, 45.34it/s]
loss 1.13 accuracy 0.66:   5%|▌         | 102/1875 [00:12<00:39, 45.34it/s]
loss 1.21 accuracy 0.66:   5%|▌         | 102/1875 [00:12<00:39, 45.34it/s]
loss 1.38 accuracy 0.44:   5%|▌         | 102/1875 [00:12<00:39, 45.34it/s]
loss 1.11 accuracy 0.66:   5%|▌         | 102/1875 [00:12<00:39, 45.34it/s]
loss 0.75 accuracy 0.81:   5%|▌         | 102/1875 [00:12<00:39, 45.34it/s]
loss 0.75 accuracy 0.81:   6%|▌         | 107/1875 [00:12<00:38, 45.57it/s]
loss 0.91 accuracy 0.75:   6%|▌         | 107/1875 [00:12<00:38, 45.57it/s]
loss 0.61 accuracy 0.88:   6%|▌         | 107/1875 [00:12<00:38, 45.57it/s]
loss 1.01 accuracy 0.69:   6%|▌         | 107/1875 [00:12<00:38, 45.57it/s]
loss 0.78 accuracy 0.81:   6%|▌         | 107/1875 [00:12<00:38, 45.57it/s]
loss 1.55 accuracy 0.47:   6%|▌         | 107/1875 [00:12<00:38, 45.57it/s]
loss 1.55 accuracy 0.47:   6%|▌         | 112/1875 [00:12<00:38, 45.75it/s]
loss 0.99 accuracy 0.78:   6%|▌         | 112/1875 [00:12<00:38, 45.75it/s]
loss 1.33 accuracy 0.56:   6%|▌         | 112/1875 [00:12<00:38, 45.75it/s]
loss 1.10 accuracy 0.66:   6%|▌         | 112/1875 [00:12<00:38, 45.75it/s]
loss 1.13 accuracy 0.69:   6%|▌         | 112/1875 [00:12<00:38, 45.75it/s]
loss 1.01 accuracy 0.78:   6%|▌         | 112/1875 [00:12<00:38, 45.75it/s]
loss 1.01 accuracy 0.78:   6%|▌         | 117/1875 [00:12<00:38, 45.83it/s]
loss 1.38 accuracy 0.56:   6%|▌         | 117/1875 [00:12<00:38, 45.83it/s]
loss 0.97 accuracy 0.59:   6%|▌         | 117/1875 [00:12<00:38, 45.83it/s]
loss 0.92 accuracy 0.75:   6%|▌         | 117/1875 [00:12<00:38, 45.83it/s]
loss 0.86 accuracy 0.75:   6%|▌         | 117/1875 [00:12<00:38, 45.83it/s]
loss 1.07 accuracy 0.69:   6%|▌         | 117/1875 [00:12<00:38, 45.83it/s]
loss 1.07 accuracy 0.69:   7%|▋         | 122/1875 [00:12<00:38, 45.89it/s]
loss 0.92 accuracy 0.81:   7%|▋         | 122/1875 [00:12<00:38, 45.89it/s]
loss 0.96 accuracy 0.62:   7%|▋         | 122/1875 [00:12<00:38, 45.89it/s]
loss 0.74 accuracy 0.81:   7%|▋         | 122/1875 [00:12<00:38, 45.89it/s]
loss 1.03 accuracy 0.50:   7%|▋         | 122/1875 [00:12<00:38, 45.89it/s]
loss 0.92 accuracy 0.62:   7%|▋         | 122/1875 [00:12<00:38, 45.89it/s]
loss 0.92 accuracy 0.62:   7%|▋         | 127/1875 [00:12<00:38, 45.86it/s]
loss 1.03 accuracy 0.69:   7%|▋         | 127/1875 [00:12<00:38, 45.86it/s]
loss 1.15 accuracy 0.62:   7%|▋         | 127/1875 [00:12<00:38, 45.86it/s]
loss 1.06 accuracy 0.69:   7%|▋         | 127/1875 [00:12<00:38, 45.86it/s]
loss 0.88 accuracy 0.59:   7%|▋         | 127/1875 [00:12<00:38, 45.86it/s]
loss 0.78 accuracy 0.75:   7%|▋         | 127/1875 [00:12<00:38, 45.86it/s]
loss 0.78 accuracy 0.75:   7%|▋         | 132/1875 [00:12<00:38, 45.76it/s]
loss 1.20 accuracy 0.56:   7%|▋         | 132/1875 [00:12<00:38, 45.76it/s]
loss 0.99 accuracy 0.69:   7%|▋         | 132/1875 [00:12<00:38, 45.76it/s]
loss 0.79 accuracy 0.81:   7%|▋         | 132/1875 [00:12<00:38, 45.76it/s]
loss 0.94 accuracy 0.62:   7%|▋         | 132/1875 [00:12<00:38, 45.76it/s]
loss 0.80 accuracy 0.78:   7%|▋         | 132/1875 [00:13<00:38, 45.76it/s]
loss 0.80 accuracy 0.78:   7%|▋         | 137/1875 [00:13<00:37, 45.79it/s]
loss 0.91 accuracy 0.66:   7%|▋         | 137/1875 [00:13<00:37, 45.79it/s]
loss 0.40 accuracy 0.94:   7%|▋         | 137/1875 [00:13<00:37, 45.79it/s]
loss 0.68 accuracy 0.78:   7%|▋         | 137/1875 [00:13<00:37, 45.79it/s]
loss 0.75 accuracy 0.75:   7%|▋         | 137/1875 [00:13<00:37, 45.79it/s]
loss 0.91 accuracy 0.72:   7%|▋         | 137/1875 [00:13<00:37, 45.79it/s]
loss 0.91 accuracy 0.72:   8%|▊         | 142/1875 [00:13<00:37, 45.80it/s]
loss 0.71 accuracy 0.75:   8%|▊         | 142/1875 [00:13<00:37, 45.80it/s]
loss 0.96 accuracy 0.66:   8%|▊         | 142/1875 [00:13<00:37, 45.80it/s]
loss 0.80 accuracy 0.75:   8%|▊         | 142/1875 [00:13<00:37, 45.80it/s]
loss 0.76 accuracy 0.84:   8%|▊         | 142/1875 [00:13<00:37, 45.80it/s]
loss 1.19 accuracy 0.75:   8%|▊         | 142/1875 [00:13<00:37, 45.80it/s]
loss 1.19 accuracy 0.75:   8%|▊         | 147/1875 [00:13<00:37, 45.70it/s]
loss 0.89 accuracy 0.75:   8%|▊         | 147/1875 [00:13<00:37, 45.70it/s]
loss 0.60 accuracy 0.88:   8%|▊         | 147/1875 [00:13<00:37, 45.70it/s]
loss 0.60 accuracy 0.84:   8%|▊         | 147/1875 [00:13<00:37, 45.70it/s]
loss 0.74 accuracy 0.84:   8%|▊         | 147/1875 [00:13<00:37, 45.70it/s]
loss 0.71 accuracy 0.78:   8%|▊         | 147/1875 [00:13<00:37, 45.70it/s]
loss 0.71 accuracy 0.78:   8%|▊         | 152/1875 [00:13<00:37, 45.75it/s]
loss 0.61 accuracy 0.94:   8%|▊         | 152/1875 [00:13<00:37, 45.75it/s]
loss 0.62 accuracy 0.84:   8%|▊         | 152/1875 [00:13<00:37, 45.75it/s]
loss 0.64 accuracy 0.84:   8%|▊         | 152/1875 [00:13<00:37, 45.75it/s]
loss 0.61 accuracy 0.84:   8%|▊         | 152/1875 [00:13<00:37, 45.75it/s]
loss 0.46 accuracy 0.91:   8%|▊         | 152/1875 [00:13<00:37, 45.75it/s]
loss 0.46 accuracy 0.91:   8%|▊         | 157/1875 [00:13<00:37, 45.72it/s]
loss 0.66 accuracy 0.81:   8%|▊         | 157/1875 [00:13<00:37, 45.72it/s]
loss 0.79 accuracy 0.78:   8%|▊         | 157/1875 [00:13<00:37, 45.72it/s]
loss 0.76 accuracy 0.72:   8%|▊         | 157/1875 [00:13<00:37, 45.72it/s]
loss 0.85 accuracy 0.69:   8%|▊         | 157/1875 [00:13<00:37, 45.72it/s]
loss 0.81 accuracy 0.75:   8%|▊         | 157/1875 [00:13<00:37, 45.72it/s]
loss 0.81 accuracy 0.75:   9%|▊         | 162/1875 [00:13<00:37, 45.74it/s]
loss 0.92 accuracy 0.69:   9%|▊         | 162/1875 [00:13<00:37, 45.74it/s]
loss 1.02 accuracy 0.66:   9%|▊         | 162/1875 [00:13<00:37, 45.74it/s]
loss 0.46 accuracy 0.97:   9%|▊         | 162/1875 [00:13<00:37, 45.74it/s]
loss 0.71 accuracy 0.78:   9%|▊         | 162/1875 [00:13<00:37, 45.74it/s]
loss 0.57 accuracy 0.78:   9%|▊         | 162/1875 [00:13<00:37, 45.74it/s]
loss 0.57 accuracy 0.78:   9%|▉         | 167/1875 [00:13<00:37, 45.82it/s]
loss 0.58 accuracy 0.78:   9%|▉         | 167/1875 [00:13<00:37, 45.82it/s]
loss 0.51 accuracy 0.91:   9%|▉         | 167/1875 [00:13<00:37, 45.82it/s]
loss 0.62 accuracy 0.84:   9%|▉         | 167/1875 [00:13<00:37, 45.82it/s]
loss 0.69 accuracy 0.84:   9%|▉         | 167/1875 [00:13<00:37, 45.82it/s]
loss 0.89 accuracy 0.72:   9%|▉         | 167/1875 [00:13<00:37, 45.82it/s]
loss 0.89 accuracy 0.72:   9%|▉         | 172/1875 [00:13<00:37, 45.90it/s]
loss 0.53 accuracy 0.94:   9%|▉         | 172/1875 [00:13<00:37, 45.90it/s]
loss 0.51 accuracy 0.84:   9%|▉         | 172/1875 [00:13<00:37, 45.90it/s]
loss 0.40 accuracy 0.94:   9%|▉         | 172/1875 [00:13<00:37, 45.90it/s]
loss 0.57 accuracy 0.78:   9%|▉         | 172/1875 [00:13<00:37, 45.90it/s]
loss 0.54 accuracy 0.91:   9%|▉         | 172/1875 [00:13<00:37, 45.90it/s]
loss 0.54 accuracy 0.91:   9%|▉         | 177/1875 [00:13<00:36, 45.94it/s]
loss 0.54 accuracy 0.84:   9%|▉         | 177/1875 [00:13<00:36, 45.94it/s]
loss 0.84 accuracy 0.72:   9%|▉         | 177/1875 [00:13<00:36, 45.94it/s]
loss 0.29 accuracy 0.94:   9%|▉         | 177/1875 [00:13<00:36, 45.94it/s]
loss 0.38 accuracy 0.84:   9%|▉         | 177/1875 [00:13<00:36, 45.94it/s]
loss 0.62 accuracy 0.75:   9%|▉         | 177/1875 [00:13<00:36, 45.94it/s]
loss 0.62 accuracy 0.75:  10%|▉         | 182/1875 [00:13<00:36, 45.99it/s]
loss 0.78 accuracy 0.81:  10%|▉         | 182/1875 [00:14<00:36, 45.99it/s]
loss 0.55 accuracy 0.81:  10%|▉         | 182/1875 [00:14<00:36, 45.99it/s]
loss 0.64 accuracy 0.88:  10%|▉         | 182/1875 [00:14<00:36, 45.99it/s]
loss 1.16 accuracy 0.66:  10%|▉         | 182/1875 [00:14<00:36, 45.99it/s]
loss 0.69 accuracy 0.78:  10%|▉         | 182/1875 [00:14<00:36, 45.99it/s]
loss 0.69 accuracy 0.78:  10%|▉         | 187/1875 [00:14<00:36, 46.00it/s]
loss 0.89 accuracy 0.75:  10%|▉         | 187/1875 [00:14<00:36, 46.00it/s]
loss 0.45 accuracy 0.88:  10%|▉         | 187/1875 [00:14<00:36, 46.00it/s]
loss 0.66 accuracy 0.81:  10%|▉         | 187/1875 [00:14<00:36, 46.00it/s]
loss 0.53 accuracy 0.84:  10%|▉         | 187/1875 [00:14<00:36, 46.00it/s]
loss 0.53 accuracy 0.88:  10%|▉         | 187/1875 [00:14<00:36, 46.00it/s]
loss 0.53 accuracy 0.88:  10%|█         | 192/1875 [00:14<00:36, 46.05it/s]
loss 0.62 accuracy 0.84:  10%|█         | 192/1875 [00:14<00:36, 46.05it/s]
loss 0.54 accuracy 0.78:  10%|█         | 192/1875 [00:14<00:36, 46.05it/s]
loss 0.60 accuracy 0.84:  10%|█         | 192/1875 [00:14<00:36, 46.05it/s]
loss 0.77 accuracy 0.72:  10%|█         | 192/1875 [00:14<00:36, 46.05it/s]
loss 0.65 accuracy 0.84:  10%|█         | 192/1875 [00:14<00:36, 46.05it/s]
loss 0.65 accuracy 0.84:  11%|█         | 197/1875 [00:14<00:36, 46.06it/s]
loss 0.47 accuracy 0.81:  11%|█         | 197/1875 [00:14<00:36, 46.06it/s]
loss 0.67 accuracy 0.78:  11%|█         | 197/1875 [00:14<00:36, 46.06it/s]
loss 0.80 accuracy 0.69:  11%|█         | 197/1875 [00:14<00:36, 46.06it/s]
loss 0.61 accuracy 0.81:  11%|█         | 197/1875 [00:14<00:36, 46.06it/s]
loss 0.50 accuracy 0.88:  11%|█         | 197/1875 [00:14<00:36, 46.06it/s]
loss 0.50 accuracy 0.88:  11%|█         | 202/1875 [00:14<00:36, 46.09it/s]
loss 0.54 accuracy 0.78:  11%|█         | 202/1875 [00:14<00:36, 46.09it/s]
loss 0.33 accuracy 0.94:  11%|█         | 202/1875 [00:14<00:36, 46.09it/s]
loss 0.45 accuracy 0.91:  11%|█         | 202/1875 [00:14<00:36, 46.09it/s]
loss 0.36 accuracy 0.94:  11%|█         | 202/1875 [00:14<00:36, 46.09it/s]
loss 0.54 accuracy 0.81:  11%|█         | 202/1875 [00:14<00:36, 46.09it/s]
loss 0.54 accuracy 0.81:  11%|█         | 207/1875 [00:14<00:36, 46.12it/s]
loss 0.44 accuracy 0.84:  11%|█         | 207/1875 [00:14<00:36, 46.12it/s]
loss 0.61 accuracy 0.81:  11%|█         | 207/1875 [00:14<00:36, 46.12it/s]
loss 0.41 accuracy 0.94:  11%|█         | 207/1875 [00:14<00:36, 46.12it/s]
loss 0.40 accuracy 0.88:  11%|█         | 207/1875 [00:14<00:36, 46.12it/s]
loss 0.54 accuracy 0.88:  11%|█         | 207/1875 [00:14<00:36, 46.12it/s]
loss 0.54 accuracy 0.88:  11%|█▏        | 212/1875 [00:14<00:36, 46.13it/s]
loss 0.56 accuracy 0.81:  11%|█▏        | 212/1875 [00:14<00:36, 46.13it/s]
loss 0.42 accuracy 0.88:  11%|█▏        | 212/1875 [00:14<00:36, 46.13it/s]
loss 0.35 accuracy 0.88:  11%|█▏        | 212/1875 [00:14<00:36, 46.13it/s]
loss 0.59 accuracy 0.88:  11%|█▏        | 212/1875 [00:14<00:36, 46.13it/s]
loss 0.60 accuracy 0.84:  11%|█▏        | 212/1875 [00:14<00:36, 46.13it/s]
loss 0.60 accuracy 0.84:  12%|█▏        | 217/1875 [00:14<00:35, 46.14it/s]
loss 0.32 accuracy 0.91:  12%|█▏        | 217/1875 [00:14<00:35, 46.14it/s]
loss 0.42 accuracy 0.88:  12%|█▏        | 217/1875 [00:14<00:35, 46.14it/s]
loss 0.60 accuracy 0.81:  12%|█▏        | 217/1875 [00:14<00:35, 46.14it/s]
loss 0.42 accuracy 0.88:  12%|█▏        | 217/1875 [00:14<00:35, 46.14it/s]
loss 0.41 accuracy 0.91:  12%|█▏        | 217/1875 [00:14<00:35, 46.14it/s]
loss 0.41 accuracy 0.91:  12%|█▏        | 222/1875 [00:14<00:35, 46.14it/s]
loss 0.35 accuracy 0.84:  12%|█▏        | 222/1875 [00:14<00:35, 46.14it/s]
loss 0.40 accuracy 0.88:  12%|█▏        | 222/1875 [00:14<00:35, 46.14it/s]
loss 0.64 accuracy 0.84:  12%|█▏        | 222/1875 [00:14<00:35, 46.14it/s]
loss 0.52 accuracy 0.91:  12%|█▏        | 222/1875 [00:14<00:35, 46.14it/s]
loss 0.27 accuracy 0.94:  12%|█▏        | 222/1875 [00:14<00:35, 46.14it/s]
loss 0.27 accuracy 0.94:  12%|█▏        | 227/1875 [00:14<00:35, 46.15it/s]
loss 0.39 accuracy 0.94:  12%|█▏        | 227/1875 [00:14<00:35, 46.15it/s]
loss 0.30 accuracy 0.88:  12%|█▏        | 227/1875 [00:15<00:35, 46.15it/s]
loss 0.56 accuracy 0.78:  12%|█▏        | 227/1875 [00:15<00:35, 46.15it/s]
loss 0.32 accuracy 0.91:  12%|█▏        | 227/1875 [00:15<00:35, 46.15it/s]
loss 0.32 accuracy 0.88:  12%|█▏        | 227/1875 [00:15<00:35, 46.15it/s]
loss 0.32 accuracy 0.88:  12%|█▏        | 232/1875 [00:15<00:35, 46.11it/s]
loss 0.39 accuracy 0.91:  12%|█▏        | 232/1875 [00:15<00:35, 46.11it/s]
loss 0.54 accuracy 0.91:  12%|█▏        | 232/1875 [00:15<00:35, 46.11it/s]
loss 0.33 accuracy 0.88:  12%|█▏        | 232/1875 [00:15<00:35, 46.11it/s]
loss 0.60 accuracy 0.84:  12%|█▏        | 232/1875 [00:15<00:35, 46.11it/s]
loss 0.36 accuracy 0.94:  12%|█▏        | 232/1875 [00:15<00:35, 46.11it/s]
loss 0.36 accuracy 0.94:  13%|█▎        | 237/1875 [00:15<00:35, 46.11it/s]
loss 0.79 accuracy 0.81:  13%|█▎        | 237/1875 [00:15<00:35, 46.11it/s]
loss 0.51 accuracy 0.84:  13%|█▎        | 237/1875 [00:15<00:35, 46.11it/s]
loss 0.29 accuracy 0.97:  13%|█▎        | 237/1875 [00:15<00:35, 46.11it/s]
loss 0.33 accuracy 0.91:  13%|█▎        | 237/1875 [00:15<00:35, 46.11it/s]
loss 0.43 accuracy 0.84:  13%|█▎        | 237/1875 [00:15<00:35, 46.11it/s]
loss 0.43 accuracy 0.84:  13%|█▎        | 242/1875 [00:15<00:35, 46.03it/s]
loss 0.37 accuracy 0.88:  13%|█▎        | 242/1875 [00:15<00:35, 46.03it/s]
loss 0.47 accuracy 0.84:  13%|█▎        | 242/1875 [00:15<00:35, 46.03it/s]
loss 0.60 accuracy 0.78:  13%|█▎        | 242/1875 [00:15<00:35, 46.03it/s]
loss 0.57 accuracy 0.81:  13%|█▎        | 242/1875 [00:15<00:35, 46.03it/s]
loss 0.36 accuracy 0.94:  13%|█▎        | 242/1875 [00:15<00:35, 46.03it/s]
loss 0.36 accuracy 0.94:  13%|█▎        | 247/1875 [00:15<00:35, 45.94it/s]
loss 0.50 accuracy 0.81:  13%|█▎        | 247/1875 [00:15<00:35, 45.94it/s]
loss 0.35 accuracy 0.84:  13%|█▎        | 247/1875 [00:15<00:35, 45.94it/s]
loss 0.40 accuracy 0.91:  13%|█▎        | 247/1875 [00:15<00:35, 45.94it/s]
loss 0.46 accuracy 0.88:  13%|█▎        | 247/1875 [00:15<00:35, 45.94it/s]
loss 0.31 accuracy 0.91:  13%|█▎        | 247/1875 [00:15<00:35, 45.94it/s]
loss 0.31 accuracy 0.91:  13%|█▎        | 252/1875 [00:15<00:35, 45.85it/s]
loss 0.43 accuracy 0.84:  13%|█▎        | 252/1875 [00:15<00:35, 45.85it/s]
loss 0.53 accuracy 0.81:  13%|█▎        | 252/1875 [00:15<00:35, 45.85it/s]
loss 0.28 accuracy 0.94:  13%|█▎        | 252/1875 [00:15<00:35, 45.85it/s]
loss 0.40 accuracy 0.91:  13%|█▎        | 252/1875 [00:15<00:35, 45.85it/s]
loss 0.26 accuracy 0.91:  13%|█▎        | 252/1875 [00:15<00:35, 45.85it/s]
loss 0.26 accuracy 0.91:  14%|█▎        | 257/1875 [00:15<00:35, 45.90it/s]
loss 0.45 accuracy 0.91:  14%|█▎        | 257/1875 [00:15<00:35, 45.90it/s]
loss 0.28 accuracy 0.94:  14%|█▎        | 257/1875 [00:15<00:35, 45.90it/s]
loss 0.52 accuracy 0.84:  14%|█▎        | 257/1875 [00:15<00:35, 45.90it/s]
loss 0.24 accuracy 0.88:  14%|█▎        | 257/1875 [00:15<00:35, 45.90it/s]
loss 0.40 accuracy 0.91:  14%|█▎        | 257/1875 [00:15<00:35, 45.90it/s]
loss 0.40 accuracy 0.91:  14%|█▍        | 262/1875 [00:15<00:35, 45.81it/s]
loss 0.35 accuracy 0.88:  14%|█▍        | 262/1875 [00:15<00:35, 45.81it/s]
loss 0.38 accuracy 0.94:  14%|█▍        | 262/1875 [00:15<00:35, 45.81it/s]
loss 0.30 accuracy 0.91:  14%|█▍        | 262/1875 [00:15<00:35, 45.81it/s]
loss 0.29 accuracy 0.91:  14%|█▍        | 262/1875 [00:15<00:35, 45.81it/s]
loss 0.41 accuracy 0.84:  14%|█▍        | 262/1875 [00:15<00:35, 45.81it/s]
loss 0.41 accuracy 0.84:  14%|█▍        | 267/1875 [00:15<00:35, 45.74it/s]
loss 0.40 accuracy 0.84:  14%|█▍        | 267/1875 [00:15<00:35, 45.74it/s]
loss 0.45 accuracy 0.81:  14%|█▍        | 267/1875 [00:15<00:35, 45.74it/s]
loss 0.19 accuracy 1.00:  14%|█▍        | 267/1875 [00:15<00:35, 45.74it/s]
loss 0.27 accuracy 1.00:  14%|█▍        | 267/1875 [00:15<00:35, 45.74it/s]
loss 0.37 accuracy 0.88:  14%|█▍        | 267/1875 [00:15<00:35, 45.74it/s]
loss 0.37 accuracy 0.88:  15%|█▍        | 272/1875 [00:15<00:35, 45.71it/s]
loss 0.46 accuracy 0.78:  15%|█▍        | 272/1875 [00:15<00:35, 45.71it/s]
loss 0.43 accuracy 0.88:  15%|█▍        | 272/1875 [00:15<00:35, 45.71it/s]
loss 0.47 accuracy 0.84:  15%|█▍        | 272/1875 [00:16<00:35, 45.71it/s]
loss 0.51 accuracy 0.88:  15%|█▍        | 272/1875 [00:16<00:35, 45.71it/s]
loss 0.20 accuracy 0.91:  15%|█▍        | 272/1875 [00:16<00:35, 45.71it/s]
loss 0.20 accuracy 0.91:  15%|█▍        | 277/1875 [00:16<00:34, 45.69it/s]
loss 0.46 accuracy 0.88:  15%|█▍        | 277/1875 [00:16<00:34, 45.69it/s]
loss 0.31 accuracy 0.91:  15%|█▍        | 277/1875 [00:16<00:34, 45.69it/s]
loss 0.23 accuracy 0.94:  15%|█▍        | 277/1875 [00:16<00:34, 45.69it/s]
loss 0.25 accuracy 0.91:  15%|█▍        | 277/1875 [00:16<00:34, 45.69it/s]
loss 0.39 accuracy 0.88:  15%|█▍        | 277/1875 [00:16<00:34, 45.69it/s]
loss 0.39 accuracy 0.88:  15%|█▌        | 282/1875 [00:16<00:34, 45.81it/s]
loss 0.21 accuracy 0.94:  15%|█▌        | 282/1875 [00:16<00:34, 45.81it/s]
loss 0.29 accuracy 0.91:  15%|█▌        | 282/1875 [00:16<00:34, 45.81it/s]
loss 0.51 accuracy 0.88:  15%|█▌        | 282/1875 [00:16<00:34, 45.81it/s]
loss 0.46 accuracy 0.78:  15%|█▌        | 282/1875 [00:16<00:34, 45.81it/s]
loss 0.27 accuracy 0.88:  15%|█▌        | 282/1875 [00:16<00:34, 45.81it/s]
loss 0.27 accuracy 0.88:  15%|█▌        | 287/1875 [00:16<00:34, 45.87it/s]
loss 0.11 accuracy 1.00:  15%|█▌        | 287/1875 [00:16<00:34, 45.87it/s]
loss 0.13 accuracy 1.00:  15%|█▌        | 287/1875 [00:16<00:34, 45.87it/s]
loss 0.30 accuracy 0.91:  15%|█▌        | 287/1875 [00:16<00:34, 45.87it/s]
loss 0.23 accuracy 0.97:  15%|█▌        | 287/1875 [00:16<00:34, 45.87it/s]
loss 0.48 accuracy 0.91:  15%|█▌        | 287/1875 [00:16<00:34, 45.87it/s]
loss 0.48 accuracy 0.91:  16%|█▌        | 292/1875 [00:16<00:34, 45.94it/s]
loss 0.50 accuracy 0.88:  16%|█▌        | 292/1875 [00:16<00:34, 45.94it/s]
loss 0.37 accuracy 0.88:  16%|█▌        | 292/1875 [00:16<00:34, 45.94it/s]
loss 0.37 accuracy 0.84:  16%|█▌        | 292/1875 [00:16<00:34, 45.94it/s]
loss 0.27 accuracy 0.94:  16%|█▌        | 292/1875 [00:16<00:34, 45.94it/s]
loss 0.25 accuracy 0.91:  16%|█▌        | 292/1875 [00:16<00:34, 45.94it/s]
loss 0.25 accuracy 0.91:  16%|█▌        | 297/1875 [00:16<00:34, 46.00it/s]
loss 0.10 accuracy 1.00:  16%|█▌        | 297/1875 [00:16<00:34, 46.00it/s]
loss 0.38 accuracy 0.88:  16%|█▌        | 297/1875 [00:16<00:34, 46.00it/s]
loss 0.47 accuracy 0.88:  16%|█▌        | 297/1875 [00:16<00:34, 46.00it/s]
loss 0.17 accuracy 0.94:  16%|█▌        | 297/1875 [00:16<00:34, 46.00it/s]
loss 0.22 accuracy 0.97:  16%|█▌        | 297/1875 [00:16<00:34, 46.00it/s]
loss 0.22 accuracy 0.97:  16%|█▌        | 302/1875 [00:16<00:34, 46.07it/s]
loss 0.20 accuracy 0.94:  16%|█▌        | 302/1875 [00:16<00:34, 46.07it/s]
loss 0.22 accuracy 0.97:  16%|█▌        | 302/1875 [00:16<00:34, 46.07it/s]
loss 0.15 accuracy 0.97:  16%|█▌        | 302/1875 [00:16<00:34, 46.07it/s]
loss 0.34 accuracy 0.88:  16%|█▌        | 302/1875 [00:16<00:34, 46.07it/s]
loss 0.22 accuracy 0.91:  16%|█▌        | 302/1875 [00:16<00:34, 46.07it/s]
loss 0.22 accuracy 0.91:  16%|█▋        | 307/1875 [00:16<00:34, 46.11it/s]
loss 0.21 accuracy 0.94:  16%|█▋        | 307/1875 [00:16<00:34, 46.11it/s]
loss 0.28 accuracy 0.94:  16%|█▋        | 307/1875 [00:16<00:34, 46.11it/s]
loss 0.58 accuracy 0.78:  16%|█▋        | 307/1875 [00:16<00:34, 46.11it/s]
loss 0.40 accuracy 0.88:  16%|█▋        | 307/1875 [00:16<00:34, 46.11it/s]
loss 0.32 accuracy 0.91:  16%|█▋        | 307/1875 [00:16<00:34, 46.11it/s]
loss 0.32 accuracy 0.91:  17%|█▋        | 312/1875 [00:16<00:33, 46.11it/s]
loss 0.43 accuracy 0.88:  17%|█▋        | 312/1875 [00:16<00:33, 46.11it/s]
loss 0.29 accuracy 0.94:  17%|█▋        | 312/1875 [00:16<00:33, 46.11it/s]
loss 0.42 accuracy 0.91:  17%|█▋        | 312/1875 [00:16<00:33, 46.11it/s]
loss 0.51 accuracy 0.81:  17%|█▋        | 312/1875 [00:16<00:33, 46.11it/s]
loss 0.33 accuracy 0.91:  17%|█▋        | 312/1875 [00:16<00:33, 46.11it/s]
loss 0.33 accuracy 0.91:  17%|█▋        | 317/1875 [00:16<00:33, 46.10it/s]
loss 0.30 accuracy 0.94:  17%|█▋        | 317/1875 [00:16<00:33, 46.10it/s]
loss 0.24 accuracy 0.94:  17%|█▋        | 317/1875 [00:16<00:33, 46.10it/s]
loss 0.23 accuracy 0.97:  17%|█▋        | 317/1875 [00:16<00:33, 46.10it/s]
loss 0.56 accuracy 0.88:  17%|█▋        | 317/1875 [00:17<00:33, 46.10it/s]
loss 0.23 accuracy 0.97:  17%|█▋        | 317/1875 [00:17<00:33, 46.10it/s]
loss 0.23 accuracy 0.97:  17%|█▋        | 322/1875 [00:17<00:33, 46.13it/s]
loss 0.23 accuracy 0.94:  17%|█▋        | 322/1875 [00:17<00:33, 46.13it/s]
loss 0.22 accuracy 0.94:  17%|█▋        | 322/1875 [00:17<00:33, 46.13it/s]
loss 0.35 accuracy 0.91:  17%|█▋        | 322/1875 [00:17<00:33, 46.13it/s]
loss 0.31 accuracy 0.91:  17%|█▋        | 322/1875 [00:17<00:33, 46.13it/s]
loss 0.22 accuracy 0.94:  17%|█▋        | 322/1875 [00:17<00:33, 46.13it/s]
loss 0.22 accuracy 0.94:  17%|█▋        | 327/1875 [00:17<00:33, 46.14it/s]
loss 0.40 accuracy 0.88:  17%|█▋        | 327/1875 [00:17<00:33, 46.14it/s]
loss 0.30 accuracy 0.88:  17%|█▋        | 327/1875 [00:17<00:33, 46.14it/s]
loss 0.34 accuracy 0.94:  17%|█▋        | 327/1875 [00:17<00:33, 46.14it/s]
loss 0.81 accuracy 0.81:  17%|█▋        | 327/1875 [00:17<00:33, 46.14it/s]
loss 0.30 accuracy 0.91:  17%|█▋        | 327/1875 [00:17<00:33, 46.14it/s]
loss 0.30 accuracy 0.91:  18%|█▊        | 332/1875 [00:17<00:33, 46.14it/s]
loss 0.37 accuracy 0.91:  18%|█▊        | 332/1875 [00:17<00:33, 46.14it/s]
loss 0.10 accuracy 0.97:  18%|█▊        | 332/1875 [00:17<00:33, 46.14it/s]
loss 0.32 accuracy 0.94:  18%|█▊        | 332/1875 [00:17<00:33, 46.14it/s]
loss 0.26 accuracy 0.88:  18%|█▊        | 332/1875 [00:17<00:33, 46.14it/s]
loss 1.01 accuracy 0.78:  18%|█▊        | 332/1875 [00:17<00:33, 46.14it/s]
loss 1.01 accuracy 0.78:  18%|█▊        | 337/1875 [00:17<00:33, 46.14it/s]
loss 0.45 accuracy 0.91:  18%|█▊        | 337/1875 [00:17<00:33, 46.14it/s]
loss 0.38 accuracy 0.88:  18%|█▊        | 337/1875 [00:17<00:33, 46.14it/s]
loss 0.31 accuracy 0.91:  18%|█▊        | 337/1875 [00:17<00:33, 46.14it/s]
loss 0.27 accuracy 0.91:  18%|█▊        | 337/1875 [00:17<00:33, 46.14it/s]
loss 0.57 accuracy 0.72:  18%|█▊        | 337/1875 [00:17<00:33, 46.14it/s]
loss 0.57 accuracy 0.72:  18%|█▊        | 342/1875 [00:17<00:33, 46.14it/s]
loss 0.51 accuracy 0.84:  18%|█▊        | 342/1875 [00:17<00:33, 46.14it/s]
loss 0.37 accuracy 0.84:  18%|█▊        | 342/1875 [00:17<00:33, 46.14it/s]
loss 0.51 accuracy 0.88:  18%|█▊        | 342/1875 [00:17<00:33, 46.14it/s]
loss 0.50 accuracy 0.81:  18%|█▊        | 342/1875 [00:17<00:33, 46.14it/s]
loss 0.30 accuracy 0.94:  18%|█▊        | 342/1875 [00:17<00:33, 46.14it/s]
loss 0.30 accuracy 0.94:  19%|█▊        | 347/1875 [00:17<00:33, 46.11it/s]
loss 0.42 accuracy 0.84:  19%|█▊        | 347/1875 [00:17<00:33, 46.11it/s]
loss 0.62 accuracy 0.81:  19%|█▊        | 347/1875 [00:17<00:33, 46.11it/s]
loss 0.21 accuracy 0.97:  19%|█▊        | 347/1875 [00:17<00:33, 46.11it/s]
loss 0.17 accuracy 0.94:  19%|█▊        | 347/1875 [00:17<00:33, 46.11it/s]
loss 0.20 accuracy 0.94:  19%|█▊        | 347/1875 [00:17<00:33, 46.11it/s]
loss 0.20 accuracy 0.94:  19%|█▉        | 352/1875 [00:17<00:33, 46.08it/s]
loss 0.34 accuracy 0.88:  19%|█▉        | 352/1875 [00:17<00:33, 46.08it/s]
loss 0.40 accuracy 0.91:  19%|█▉        | 352/1875 [00:17<00:33, 46.08it/s]
loss 0.51 accuracy 0.88:  19%|█▉        | 352/1875 [00:17<00:33, 46.08it/s]
loss 0.16 accuracy 0.97:  19%|█▉        | 352/1875 [00:17<00:33, 46.08it/s]
loss 0.45 accuracy 0.84:  19%|█▉        | 352/1875 [00:17<00:33, 46.08it/s]
loss 0.45 accuracy 0.84:  19%|█▉        | 357/1875 [00:17<00:32, 46.08it/s]
loss 0.44 accuracy 0.81:  19%|█▉        | 357/1875 [00:17<00:32, 46.08it/s]
loss 0.21 accuracy 0.94:  19%|█▉        | 357/1875 [00:17<00:32, 46.08it/s]
loss 0.23 accuracy 0.94:  19%|█▉        | 357/1875 [00:17<00:32, 46.08it/s]
loss 0.34 accuracy 0.88:  19%|█▉        | 357/1875 [00:17<00:32, 46.08it/s]
loss 0.10 accuracy 0.94:  19%|█▉        | 357/1875 [00:17<00:32, 46.08it/s]
loss 0.10 accuracy 0.94:  19%|█▉        | 362/1875 [00:17<00:32, 46.04it/s]
loss 0.66 accuracy 0.78:  19%|█▉        | 362/1875 [00:17<00:32, 46.04it/s]
loss 0.25 accuracy 0.91:  19%|█▉        | 362/1875 [00:17<00:32, 46.04it/s]
loss 0.30 accuracy 0.91:  19%|█▉        | 362/1875 [00:17<00:32, 46.04it/s]
loss 0.24 accuracy 0.97:  19%|█▉        | 362/1875 [00:17<00:32, 46.04it/s]
loss 0.11 accuracy 0.97:  19%|█▉        | 362/1875 [00:18<00:32, 46.04it/s]
loss 0.11 accuracy 0.97:  20%|█▉        | 367/1875 [00:18<00:32, 46.02it/s]
loss 0.17 accuracy 0.97:  20%|█▉        | 367/1875 [00:18<00:32, 46.02it/s]
loss 0.39 accuracy 0.91:  20%|█▉        | 367/1875 [00:18<00:32, 46.02it/s]
loss 0.28 accuracy 0.91:  20%|█▉        | 367/1875 [00:18<00:32, 46.02it/s]
loss 0.17 accuracy 0.97:  20%|█▉        | 367/1875 [00:18<00:32, 46.02it/s]
loss 0.24 accuracy 0.94:  20%|█▉        | 367/1875 [00:18<00:32, 46.02it/s]
loss 0.24 accuracy 0.94:  20%|█▉        | 372/1875 [00:18<00:32, 45.89it/s]
loss 0.34 accuracy 0.88:  20%|█▉        | 372/1875 [00:18<00:32, 45.89it/s]
loss 0.29 accuracy 0.94:  20%|█▉        | 372/1875 [00:18<00:32, 45.89it/s]
loss 0.21 accuracy 0.91:  20%|█▉        | 372/1875 [00:18<00:32, 45.89it/s]
loss 0.28 accuracy 0.91:  20%|█▉        | 372/1875 [00:18<00:32, 45.89it/s]
loss 0.23 accuracy 0.88:  20%|█▉        | 372/1875 [00:18<00:32, 45.89it/s]
loss 0.23 accuracy 0.88:  20%|██        | 377/1875 [00:18<00:32, 45.87it/s]
loss 0.19 accuracy 0.94:  20%|██        | 377/1875 [00:18<00:32, 45.87it/s]
loss 0.33 accuracy 0.91:  20%|██        | 377/1875 [00:18<00:32, 45.87it/s]
loss 0.19 accuracy 0.94:  20%|██        | 377/1875 [00:18<00:32, 45.87it/s]
loss 0.11 accuracy 1.00:  20%|██        | 377/1875 [00:18<00:32, 45.87it/s]
loss 0.31 accuracy 0.91:  20%|██        | 377/1875 [00:18<00:32, 45.87it/s]
loss 0.31 accuracy 0.91:  20%|██        | 382/1875 [00:18<00:32, 45.87it/s]
loss 0.16 accuracy 0.97:  20%|██        | 382/1875 [00:18<00:32, 45.87it/s]
loss 0.20 accuracy 0.91:  20%|██        | 382/1875 [00:18<00:32, 45.87it/s]
loss 0.39 accuracy 0.84:  20%|██        | 382/1875 [00:18<00:32, 45.87it/s]
loss 0.16 accuracy 0.97:  20%|██        | 382/1875 [00:18<00:32, 45.87it/s]
loss 0.58 accuracy 0.84:  20%|██        | 382/1875 [00:18<00:32, 45.87it/s]
loss 0.58 accuracy 0.84:  21%|██        | 387/1875 [00:18<00:32, 45.89it/s]
loss 0.35 accuracy 0.91:  21%|██        | 387/1875 [00:18<00:32, 45.89it/s]
loss 0.44 accuracy 0.78:  21%|██        | 387/1875 [00:18<00:32, 45.89it/s]
loss 0.17 accuracy 0.94:  21%|██        | 387/1875 [00:18<00:32, 45.89it/s]
loss 0.45 accuracy 0.91:  21%|██        | 387/1875 [00:18<00:32, 45.89it/s]
loss 0.41 accuracy 0.94:  21%|██        | 387/1875 [00:18<00:32, 45.89it/s]
loss 0.41 accuracy 0.94:  21%|██        | 392/1875 [00:18<00:32, 45.78it/s]
loss 0.67 accuracy 0.91:  21%|██        | 392/1875 [00:18<00:32, 45.78it/s]
loss 0.35 accuracy 0.88:  21%|██        | 392/1875 [00:18<00:32, 45.78it/s]
loss 0.54 accuracy 0.84:  21%|██        | 392/1875 [00:18<00:32, 45.78it/s]
loss 0.27 accuracy 0.91:  21%|██        | 392/1875 [00:18<00:32, 45.78it/s]
loss 0.34 accuracy 0.94:  21%|██        | 392/1875 [00:18<00:32, 45.78it/s]
loss 0.34 accuracy 0.94:  21%|██        | 397/1875 [00:18<00:32, 45.81it/s]
loss 0.51 accuracy 0.81:  21%|██        | 397/1875 [00:18<00:32, 45.81it/s]
loss 0.78 accuracy 0.72:  21%|██        | 397/1875 [00:18<00:32, 45.81it/s]
loss 0.51 accuracy 0.75:  21%|██        | 397/1875 [00:18<00:32, 45.81it/s]
loss 0.42 accuracy 0.88:  21%|██        | 397/1875 [00:18<00:32, 45.81it/s]
loss 0.19 accuracy 0.97:  21%|██        | 397/1875 [00:18<00:32, 45.81it/s]
loss 0.19 accuracy 0.97:  21%|██▏       | 402/1875 [00:18<00:32, 45.81it/s]
loss 0.72 accuracy 0.81:  21%|██▏       | 402/1875 [00:18<00:32, 45.81it/s]
loss 0.33 accuracy 0.91:  21%|██▏       | 402/1875 [00:18<00:32, 45.81it/s]
loss 0.54 accuracy 0.81:  21%|██▏       | 402/1875 [00:18<00:32, 45.81it/s]
loss 0.61 accuracy 0.72:  21%|██▏       | 402/1875 [00:18<00:32, 45.81it/s]
loss 0.81 accuracy 0.72:  21%|██▏       | 402/1875 [00:18<00:32, 45.81it/s]
loss 0.81 accuracy 0.72:  22%|██▏       | 407/1875 [00:18<00:32, 45.80it/s]
loss 0.69 accuracy 0.72:  22%|██▏       | 407/1875 [00:18<00:32, 45.80it/s]
loss 0.21 accuracy 0.97:  22%|██▏       | 407/1875 [00:18<00:32, 45.80it/s]
loss 0.22 accuracy 0.94:  22%|██▏       | 407/1875 [00:18<00:32, 45.80it/s]
loss 0.33 accuracy 0.88:  22%|██▏       | 407/1875 [00:18<00:32, 45.80it/s]
loss 0.41 accuracy 0.88:  22%|██▏       | 407/1875 [00:18<00:32, 45.80it/s]
loss 0.41 accuracy 0.88:  22%|██▏       | 412/1875 [00:18<00:31, 45.85it/s]
loss 0.63 accuracy 0.81:  22%|██▏       | 412/1875 [00:19<00:31, 45.85it/s]
loss 0.19 accuracy 0.94:  22%|██▏       | 412/1875 [00:19<00:31, 45.85it/s]
loss 0.22 accuracy 0.91:  22%|██▏       | 412/1875 [00:19<00:31, 45.85it/s]
loss 0.38 accuracy 0.88:  22%|██▏       | 412/1875 [00:19<00:31, 45.85it/s]
loss 0.26 accuracy 0.94:  22%|██▏       | 412/1875 [00:19<00:31, 45.85it/s]
loss 0.26 accuracy 0.94:  22%|██▏       | 417/1875 [00:19<00:31, 45.89it/s]
loss 0.34 accuracy 0.94:  22%|██▏       | 417/1875 [00:19<00:31, 45.89it/s]
loss 0.37 accuracy 0.91:  22%|██▏       | 417/1875 [00:19<00:31, 45.89it/s]
loss 0.20 accuracy 0.94:  22%|██▏       | 417/1875 [00:19<00:31, 45.89it/s]
loss 0.19 accuracy 0.94:  22%|██▏       | 417/1875 [00:19<00:31, 45.89it/s]
loss 0.29 accuracy 0.91:  22%|██▏       | 417/1875 [00:19<00:31, 45.89it/s]
loss 0.29 accuracy 0.91:  23%|██▎       | 422/1875 [00:19<00:31, 45.98it/s]
loss 0.53 accuracy 0.88:  23%|██▎       | 422/1875 [00:19<00:31, 45.98it/s]
loss 0.38 accuracy 0.88:  23%|██▎       | 422/1875 [00:19<00:31, 45.98it/s]
loss 0.38 accuracy 0.91:  23%|██▎       | 422/1875 [00:19<00:31, 45.98it/s]
loss 0.45 accuracy 0.91:  23%|██▎       | 422/1875 [00:19<00:31, 45.98it/s]
loss 0.42 accuracy 0.84:  23%|██▎       | 422/1875 [00:19<00:31, 45.98it/s]
loss 0.42 accuracy 0.84:  23%|██▎       | 427/1875 [00:19<00:31, 46.01it/s]
loss 0.18 accuracy 0.97:  23%|██▎       | 427/1875 [00:19<00:31, 46.01it/s]
loss 0.34 accuracy 0.88:  23%|██▎       | 427/1875 [00:19<00:31, 46.01it/s]
loss 0.38 accuracy 0.88:  23%|██▎       | 427/1875 [00:19<00:31, 46.01it/s]
loss 0.50 accuracy 0.81:  23%|██▎       | 427/1875 [00:19<00:31, 46.01it/s]
loss 0.11 accuracy 0.97:  23%|██▎       | 427/1875 [00:19<00:31, 46.01it/s]
loss 0.11 accuracy 0.97:  23%|██▎       | 432/1875 [00:19<00:31, 46.04it/s]
loss 0.22 accuracy 0.91:  23%|██▎       | 432/1875 [00:19<00:31, 46.04it/s]
loss 0.25 accuracy 0.97:  23%|██▎       | 432/1875 [00:19<00:31, 46.04it/s]
loss 0.20 accuracy 0.97:  23%|██▎       | 432/1875 [00:19<00:31, 46.04it/s]
loss 0.27 accuracy 0.91:  23%|██▎       | 432/1875 [00:19<00:31, 46.04it/s]
loss 0.30 accuracy 0.91:  23%|██▎       | 432/1875 [00:19<00:31, 46.04it/s]
loss 0.30 accuracy 0.91:  23%|██▎       | 437/1875 [00:19<00:31, 46.08it/s]
loss 0.40 accuracy 0.88:  23%|██▎       | 437/1875 [00:19<00:31, 46.08it/s]
loss 0.64 accuracy 0.81:  23%|██▎       | 437/1875 [00:19<00:31, 46.08it/s]
loss 0.29 accuracy 0.91:  23%|██▎       | 437/1875 [00:19<00:31, 46.08it/s]
loss 0.46 accuracy 0.91:  23%|██▎       | 437/1875 [00:19<00:31, 46.08it/s]
loss 0.35 accuracy 0.91:  23%|██▎       | 437/1875 [00:19<00:31, 46.08it/s]
loss 0.35 accuracy 0.91:  24%|██▎       | 442/1875 [00:19<00:31, 46.08it/s]
loss 0.24 accuracy 0.94:  24%|██▎       | 442/1875 [00:19<00:31, 46.08it/s]
loss 0.30 accuracy 0.88:  24%|██▎       | 442/1875 [00:19<00:31, 46.08it/s]
loss 0.32 accuracy 0.88:  24%|██▎       | 442/1875 [00:19<00:31, 46.08it/s]
loss 0.20 accuracy 0.94:  24%|██▎       | 442/1875 [00:19<00:31, 46.08it/s]
loss 0.23 accuracy 0.97:  24%|██▎       | 442/1875 [00:19<00:31, 46.08it/s]
loss 0.23 accuracy 0.97:  24%|██▍       | 447/1875 [00:19<00:31, 46.05it/s]
loss 0.17 accuracy 0.94:  24%|██▍       | 447/1875 [00:19<00:31, 46.05it/s]
loss 0.14 accuracy 1.00:  24%|██▍       | 447/1875 [00:19<00:31, 46.05it/s]
loss 0.37 accuracy 0.91:  24%|██▍       | 447/1875 [00:19<00:31, 46.05it/s]
loss 0.43 accuracy 0.84:  24%|██▍       | 447/1875 [00:19<00:31, 46.05it/s]
loss 0.20 accuracy 0.94:  24%|██▍       | 447/1875 [00:19<00:31, 46.05it/s]
loss 0.20 accuracy 0.94:  24%|██▍       | 452/1875 [00:19<00:30, 46.05it/s]
loss 0.42 accuracy 0.91:  24%|██▍       | 452/1875 [00:19<00:30, 46.05it/s]
loss 0.45 accuracy 0.91:  24%|██▍       | 452/1875 [00:19<00:30, 46.05it/s]
loss 0.41 accuracy 0.84:  24%|██▍       | 452/1875 [00:19<00:30, 46.05it/s]
loss 0.25 accuracy 0.88:  24%|██▍       | 452/1875 [00:19<00:30, 46.05it/s]
loss 0.10 accuracy 1.00:  24%|██▍       | 452/1875 [00:19<00:30, 46.05it/s]
loss 0.10 accuracy 1.00:  24%|██▍       | 457/1875 [00:19<00:30, 46.09it/s]
loss 0.25 accuracy 0.94:  24%|██▍       | 457/1875 [00:19<00:30, 46.09it/s]
loss 0.36 accuracy 0.84:  24%|██▍       | 457/1875 [00:20<00:30, 46.09it/s]
loss 0.20 accuracy 0.94:  24%|██▍       | 457/1875 [00:20<00:30, 46.09it/s]
loss 0.16 accuracy 0.97:  24%|██▍       | 457/1875 [00:20<00:30, 46.09it/s]
loss 0.20 accuracy 0.97:  24%|██▍       | 457/1875 [00:20<00:30, 46.09it/s]
loss 0.20 accuracy 0.97:  25%|██▍       | 462/1875 [00:20<00:30, 46.05it/s]
loss 0.14 accuracy 0.97:  25%|██▍       | 462/1875 [00:20<00:30, 46.05it/s]
loss 0.44 accuracy 0.91:  25%|██▍       | 462/1875 [00:20<00:30, 46.05it/s]
loss 0.11 accuracy 1.00:  25%|██▍       | 462/1875 [00:20<00:30, 46.05it/s]
loss 0.32 accuracy 0.97:  25%|██▍       | 462/1875 [00:20<00:30, 46.05it/s]
loss 0.30 accuracy 0.91:  25%|██▍       | 462/1875 [00:20<00:30, 46.05it/s]
loss 0.30 accuracy 0.91:  25%|██▍       | 467/1875 [00:20<00:30, 46.04it/s]
loss 0.15 accuracy 0.97:  25%|██▍       | 467/1875 [00:20<00:30, 46.04it/s]
loss 0.10 accuracy 1.00:  25%|██▍       | 467/1875 [00:20<00:30, 46.04it/s]
loss 0.29 accuracy 0.88:  25%|██▍       | 467/1875 [00:20<00:30, 46.04it/s]
loss 0.18 accuracy 0.97:  25%|██▍       | 467/1875 [00:20<00:30, 46.04it/s]
loss 0.28 accuracy 0.94:  25%|██▍       | 467/1875 [00:20<00:30, 46.04it/s]
loss 0.28 accuracy 0.94:  25%|██▌       | 472/1875 [00:20<00:30, 45.99it/s]
loss 0.18 accuracy 0.94:  25%|██▌       | 472/1875 [00:20<00:30, 45.99it/s]
loss 0.35 accuracy 0.94:  25%|██▌       | 472/1875 [00:20<00:30, 45.99it/s]
loss 0.13 accuracy 0.97:  25%|██▌       | 472/1875 [00:20<00:30, 45.99it/s]
loss 0.20 accuracy 0.97:  25%|██▌       | 472/1875 [00:20<00:30, 45.99it/s]
loss 0.29 accuracy 0.91:  25%|██▌       | 472/1875 [00:20<00:30, 45.99it/s]
loss 0.29 accuracy 0.91:  25%|██▌       | 477/1875 [00:20<00:30, 45.86it/s]
loss 0.51 accuracy 0.88:  25%|██▌       | 477/1875 [00:20<00:30, 45.86it/s]
loss 0.19 accuracy 0.94:  25%|██▌       | 477/1875 [00:20<00:30, 45.86it/s]
loss 0.30 accuracy 0.88:  25%|██▌       | 477/1875 [00:20<00:30, 45.86it/s]
loss 0.10 accuracy 0.97:  25%|██▌       | 477/1875 [00:20<00:30, 45.86it/s]
loss 0.44 accuracy 0.81:  25%|██▌       | 477/1875 [00:20<00:30, 45.86it/s]
loss 0.44 accuracy 0.81:  26%|██▌       | 482/1875 [00:20<00:30, 45.87it/s]
loss 0.35 accuracy 0.91:  26%|██▌       | 482/1875 [00:20<00:30, 45.87it/s]
loss 0.57 accuracy 0.91:  26%|██▌       | 482/1875 [00:20<00:30, 45.87it/s]
loss 0.12 accuracy 0.97:  26%|██▌       | 482/1875 [00:20<00:30, 45.87it/s]
loss 0.18 accuracy 0.94:  26%|██▌       | 482/1875 [00:20<00:30, 45.87it/s]
loss 0.09 accuracy 0.97:  26%|██▌       | 482/1875 [00:20<00:30, 45.87it/s]
loss 0.09 accuracy 0.97:  26%|██▌       | 487/1875 [00:20<00:30, 45.88it/s]
loss 0.28 accuracy 0.94:  26%|██▌       | 487/1875 [00:20<00:30, 45.88it/s]
loss 0.28 accuracy 0.88:  26%|██▌       | 487/1875 [00:20<00:30, 45.88it/s]
loss 0.41 accuracy 0.88:  26%|██▌       | 487/1875 [00:20<00:30, 45.88it/s]
loss 0.29 accuracy 0.91:  26%|██▌       | 487/1875 [00:20<00:30, 45.88it/s]
loss 0.29 accuracy 0.88:  26%|██▌       | 487/1875 [00:20<00:30, 45.88it/s]
loss 0.29 accuracy 0.88:  26%|██▌       | 492/1875 [00:20<00:30, 45.70it/s]
loss 0.05 accuracy 1.00:  26%|██▌       | 492/1875 [00:20<00:30, 45.70it/s]
loss 0.10 accuracy 0.94:  26%|██▌       | 492/1875 [00:20<00:30, 45.70it/s]
loss 0.11 accuracy 0.97:  26%|██▌       | 492/1875 [00:20<00:30, 45.70it/s]
loss 0.15 accuracy 0.94:  26%|██▌       | 492/1875 [00:20<00:30, 45.70it/s]
loss 0.20 accuracy 0.91:  26%|██▌       | 492/1875 [00:20<00:30, 45.70it/s]
loss 0.20 accuracy 0.91:  27%|██▋       | 497/1875 [00:20<00:30, 45.74it/s]
loss 0.13 accuracy 0.97:  27%|██▋       | 497/1875 [00:20<00:30, 45.74it/s]
loss 0.41 accuracy 0.94:  27%|██▋       | 497/1875 [00:20<00:30, 45.74it/s]
loss 0.30 accuracy 0.91:  27%|██▋       | 497/1875 [00:20<00:30, 45.74it/s]
loss 0.57 accuracy 0.84:  27%|██▋       | 497/1875 [00:20<00:30, 45.74it/s]
loss 0.44 accuracy 0.88:  27%|██▋       | 497/1875 [00:20<00:30, 45.74it/s]
loss 0.44 accuracy 0.88:  27%|██▋       | 502/1875 [00:20<00:30, 45.68it/s]
loss 0.26 accuracy 0.91:  27%|██▋       | 502/1875 [00:20<00:30, 45.68it/s]
loss 0.17 accuracy 0.94:  27%|██▋       | 502/1875 [00:20<00:30, 45.68it/s]
loss 0.52 accuracy 0.88:  27%|██▋       | 502/1875 [00:21<00:30, 45.68it/s]
loss 0.06 accuracy 1.00:  27%|██▋       | 502/1875 [00:21<00:30, 45.68it/s]
loss 0.34 accuracy 0.88:  27%|██▋       | 502/1875 [00:21<00:30, 45.68it/s]
loss 0.34 accuracy 0.88:  27%|██▋       | 507/1875 [00:21<00:29, 45.75it/s]
loss 0.42 accuracy 0.94:  27%|██▋       | 507/1875 [00:21<00:29, 45.75it/s]
loss 0.32 accuracy 0.94:  27%|██▋       | 507/1875 [00:21<00:29, 45.75it/s]
loss 0.28 accuracy 0.84:  27%|██▋       | 507/1875 [00:21<00:29, 45.75it/s]
loss 0.26 accuracy 0.97:  27%|██▋       | 507/1875 [00:21<00:29, 45.75it/s]
loss 0.23 accuracy 0.91:  27%|██▋       | 507/1875 [00:21<00:29, 45.75it/s]
loss 0.23 accuracy 0.91:  27%|██▋       | 512/1875 [00:21<00:29, 45.84it/s]
loss 0.40 accuracy 0.81:  27%|██▋       | 512/1875 [00:21<00:29, 45.84it/s]
loss 0.20 accuracy 0.94:  27%|██▋       | 512/1875 [00:21<00:29, 45.84it/s]
loss 0.11 accuracy 1.00:  27%|██▋       | 512/1875 [00:21<00:29, 45.84it/s]
loss 0.28 accuracy 0.91:  27%|██▋       | 512/1875 [00:21<00:29, 45.84it/s]
loss 0.16 accuracy 0.97:  27%|██▋       | 512/1875 [00:21<00:29, 45.84it/s]
loss 0.16 accuracy 0.97:  28%|██▊       | 517/1875 [00:21<00:29, 45.92it/s]
loss 0.37 accuracy 0.91:  28%|██▊       | 517/1875 [00:21<00:29, 45.92it/s]
loss 0.16 accuracy 0.97:  28%|██▊       | 517/1875 [00:21<00:29, 45.92it/s]
loss 0.27 accuracy 0.84:  28%|██▊       | 517/1875 [00:21<00:29, 45.92it/s]
loss 0.58 accuracy 0.84:  28%|██▊       | 517/1875 [00:21<00:29, 45.92it/s]
loss 0.32 accuracy 0.97:  28%|██▊       | 517/1875 [00:21<00:29, 45.92it/s]
loss 0.32 accuracy 0.97:  28%|██▊       | 522/1875 [00:21<00:29, 45.97it/s]
loss 0.14 accuracy 0.97:  28%|██▊       | 522/1875 [00:21<00:29, 45.97it/s]
loss 0.14 accuracy 1.00:  28%|██▊       | 522/1875 [00:21<00:29, 45.97it/s]
loss 0.14 accuracy 0.97:  28%|██▊       | 522/1875 [00:21<00:29, 45.97it/s]
loss 0.08 accuracy 1.00:  28%|██▊       | 522/1875 [00:21<00:29, 45.97it/s]
loss 0.18 accuracy 0.94:  28%|██▊       | 522/1875 [00:21<00:29, 45.97it/s]
loss 0.18 accuracy 0.94:  28%|██▊       | 527/1875 [00:21<00:29, 45.98it/s]
loss 0.31 accuracy 0.91:  28%|██▊       | 527/1875 [00:21<00:29, 45.98it/s]
loss 0.07 accuracy 1.00:  28%|██▊       | 527/1875 [00:21<00:29, 45.98it/s]
loss 0.07 accuracy 1.00:  28%|██▊       | 527/1875 [00:21<00:29, 45.98it/s]
loss 0.32 accuracy 0.88:  28%|██▊       | 527/1875 [00:21<00:29, 45.98it/s]
loss 0.24 accuracy 0.91:  28%|██▊       | 527/1875 [00:21<00:29, 45.98it/s]
loss 0.24 accuracy 0.91:  28%|██▊       | 532/1875 [00:21<00:29, 46.00it/s]
loss 0.15 accuracy 0.97:  28%|██▊       | 532/1875 [00:21<00:29, 46.00it/s]
loss 0.11 accuracy 0.97:  28%|██▊       | 532/1875 [00:21<00:29, 46.00it/s]
loss 0.37 accuracy 0.84:  28%|██▊       | 532/1875 [00:21<00:29, 46.00it/s]
loss 0.15 accuracy 0.94:  28%|██▊       | 532/1875 [00:21<00:29, 46.00it/s]
loss 0.20 accuracy 0.94:  28%|██▊       | 532/1875 [00:21<00:29, 46.00it/s]
loss 0.20 accuracy 0.94:  29%|██▊       | 537/1875 [00:21<00:29, 46.04it/s]
loss 0.09 accuracy 1.00:  29%|██▊       | 537/1875 [00:21<00:29, 46.04it/s]
loss 0.22 accuracy 0.97:  29%|██▊       | 537/1875 [00:21<00:29, 46.04it/s]
loss 0.08 accuracy 1.00:  29%|██▊       | 537/1875 [00:21<00:29, 46.04it/s]
loss 0.27 accuracy 0.91:  29%|██▊       | 537/1875 [00:21<00:29, 46.04it/s]
loss 0.31 accuracy 0.91:  29%|██▊       | 537/1875 [00:21<00:29, 46.04it/s]
loss 0.31 accuracy 0.91:  29%|██▉       | 542/1875 [00:21<00:28, 46.03it/s]
loss 0.27 accuracy 0.91:  29%|██▉       | 542/1875 [00:21<00:28, 46.03it/s]
loss 0.44 accuracy 0.91:  29%|██▉       | 542/1875 [00:21<00:28, 46.03it/s]
loss 0.29 accuracy 0.91:  29%|██▉       | 542/1875 [00:21<00:28, 46.03it/s]
loss 0.26 accuracy 0.91:  29%|██▉       | 542/1875 [00:21<00:28, 46.03it/s]
loss 0.14 accuracy 0.97:  29%|██▉       | 542/1875 [00:21<00:28, 46.03it/s]
loss 0.14 accuracy 0.97:  29%|██▉       | 547/1875 [00:21<00:28, 45.99it/s]
loss 0.15 accuracy 0.97:  29%|██▉       | 547/1875 [00:21<00:28, 45.99it/s]
loss 0.13 accuracy 0.97:  29%|██▉       | 547/1875 [00:21<00:28, 45.99it/s]
loss 0.54 accuracy 0.88:  29%|██▉       | 547/1875 [00:21<00:28, 45.99it/s]
loss 0.31 accuracy 0.91:  29%|██▉       | 547/1875 [00:22<00:28, 45.99it/s]
loss 0.20 accuracy 0.94:  29%|██▉       | 547/1875 [00:22<00:28, 45.99it/s]
loss 0.20 accuracy 0.94:  29%|██▉       | 552/1875 [00:22<00:28, 45.96it/s]
loss 0.07 accuracy 1.00:  29%|██▉       | 552/1875 [00:22<00:28, 45.96it/s]
loss 0.09 accuracy 0.97:  29%|██▉       | 552/1875 [00:22<00:28, 45.96it/s]
loss 0.10 accuracy 0.97:  29%|██▉       | 552/1875 [00:22<00:28, 45.96it/s]
loss 0.34 accuracy 0.91:  29%|██▉       | 552/1875 [00:22<00:28, 45.96it/s]
loss 0.11 accuracy 0.97:  29%|██▉       | 552/1875 [00:22<00:28, 45.96it/s]
loss 0.11 accuracy 0.97:  30%|██▉       | 557/1875 [00:22<00:28, 45.83it/s]
loss 0.40 accuracy 0.88:  30%|██▉       | 557/1875 [00:22<00:28, 45.83it/s]
loss 0.25 accuracy 0.94:  30%|██▉       | 557/1875 [00:22<00:28, 45.83it/s]
loss 0.07 accuracy 1.00:  30%|██▉       | 557/1875 [00:22<00:28, 45.83it/s]
loss 0.11 accuracy 0.97:  30%|██▉       | 557/1875 [00:22<00:28, 45.83it/s]
loss 0.20 accuracy 0.97:  30%|██▉       | 557/1875 [00:22<00:28, 45.83it/s]
loss 0.20 accuracy 0.97:  30%|██▉       | 562/1875 [00:22<00:28, 45.84it/s]
loss 0.05 accuracy 1.00:  30%|██▉       | 562/1875 [00:22<00:28, 45.84it/s]
loss 0.15 accuracy 0.91:  30%|██▉       | 562/1875 [00:22<00:28, 45.84it/s]
loss 0.11 accuracy 1.00:  30%|██▉       | 562/1875 [00:22<00:28, 45.84it/s]
loss 0.11 accuracy 0.97:  30%|██▉       | 562/1875 [00:22<00:28, 45.84it/s]
loss 0.04 accuracy 1.00:  30%|██▉       | 562/1875 [00:22<00:28, 45.84it/s]
loss 0.04 accuracy 1.00:  30%|███       | 567/1875 [00:22<00:28, 45.80it/s]
loss 0.03 accuracy 1.00:  30%|███       | 567/1875 [00:22<00:28, 45.80it/s]
loss 0.19 accuracy 0.91:  30%|███       | 567/1875 [00:22<00:28, 45.80it/s]
loss 0.11 accuracy 0.97:  30%|███       | 567/1875 [00:22<00:28, 45.80it/s]
loss 0.18 accuracy 0.91:  30%|███       | 567/1875 [00:22<00:28, 45.80it/s]
loss 0.04 accuracy 1.00:  30%|███       | 567/1875 [00:22<00:28, 45.80it/s]
loss 0.04 accuracy 1.00:  31%|███       | 572/1875 [00:22<00:28, 45.69it/s]
loss 0.25 accuracy 0.94:  31%|███       | 572/1875 [00:22<00:28, 45.69it/s]
loss 0.33 accuracy 0.91:  31%|███       | 572/1875 [00:22<00:28, 45.69it/s]
loss 0.12 accuracy 0.97:  31%|███       | 572/1875 [00:22<00:28, 45.69it/s]
loss 0.11 accuracy 0.97:  31%|███       | 572/1875 [00:22<00:28, 45.69it/s]
loss 0.52 accuracy 0.91:  31%|███       | 572/1875 [00:22<00:28, 45.69it/s]
loss 0.52 accuracy 0.91:  31%|███       | 577/1875 [00:22<00:28, 45.70it/s]
loss 0.24 accuracy 0.94:  31%|███       | 577/1875 [00:22<00:28, 45.70it/s]
loss 0.16 accuracy 0.94:  31%|███       | 577/1875 [00:22<00:28, 45.70it/s]
loss 0.17 accuracy 0.94:  31%|███       | 577/1875 [00:22<00:28, 45.70it/s]
loss 0.17 accuracy 0.94:  31%|███       | 577/1875 [00:22<00:28, 45.70it/s]
loss 0.17 accuracy 0.91:  31%|███       | 577/1875 [00:22<00:28, 45.70it/s]
loss 0.17 accuracy 0.91:  31%|███       | 582/1875 [00:22<00:28, 45.71it/s]
loss 0.15 accuracy 0.97:  31%|███       | 582/1875 [00:22<00:28, 45.71it/s]
loss 0.13 accuracy 0.97:  31%|███       | 582/1875 [00:22<00:28, 45.71it/s]
loss 0.14 accuracy 0.94:  31%|███       | 582/1875 [00:22<00:28, 45.71it/s]
loss 0.18 accuracy 0.97:  31%|███       | 582/1875 [00:22<00:28, 45.71it/s]
loss 0.05 accuracy 1.00:  31%|███       | 582/1875 [00:22<00:28, 45.71it/s]
loss 0.05 accuracy 1.00:  31%|███▏      | 587/1875 [00:22<00:28, 45.81it/s]
loss 0.53 accuracy 0.91:  31%|███▏      | 587/1875 [00:22<00:28, 45.81it/s]
loss 0.10 accuracy 0.97:  31%|███▏      | 587/1875 [00:22<00:28, 45.81it/s]
loss 0.29 accuracy 0.91:  31%|███▏      | 587/1875 [00:22<00:28, 45.81it/s]
loss 0.12 accuracy 0.94:  31%|███▏      | 587/1875 [00:22<00:28, 45.81it/s]
loss 0.16 accuracy 0.94:  31%|███▏      | 587/1875 [00:22<00:28, 45.81it/s]
loss 0.16 accuracy 0.94:  32%|███▏      | 592/1875 [00:22<00:27, 45.88it/s]
loss 0.21 accuracy 0.94:  32%|███▏      | 592/1875 [00:22<00:27, 45.88it/s]
loss 0.15 accuracy 0.94:  32%|███▏      | 592/1875 [00:22<00:27, 45.88it/s]
loss 0.05 accuracy 1.00:  32%|███▏      | 592/1875 [00:22<00:27, 45.88it/s]
loss 0.15 accuracy 0.94:  32%|███▏      | 592/1875 [00:23<00:27, 45.88it/s]
loss 0.07 accuracy 0.97:  32%|███▏      | 592/1875 [00:23<00:27, 45.88it/s]
loss 0.07 accuracy 0.97:  32%|███▏      | 597/1875 [00:23<00:27, 45.94it/s]
loss 0.23 accuracy 0.91:  32%|███▏      | 597/1875 [00:23<00:27, 45.94it/s]
loss 0.16 accuracy 0.97:  32%|███▏      | 597/1875 [00:23<00:27, 45.94it/s]
loss 0.46 accuracy 0.88:  32%|███▏      | 597/1875 [00:23<00:27, 45.94it/s]
loss 0.34 accuracy 0.94:  32%|███▏      | 597/1875 [00:23<00:27, 45.94it/s]
loss 0.14 accuracy 0.97:  32%|███▏      | 597/1875 [00:23<00:27, 45.94it/s]
loss 0.14 accuracy 0.97:  32%|███▏      | 602/1875 [00:23<00:27, 45.99it/s]
loss 0.09 accuracy 1.00:  32%|███▏      | 602/1875 [00:23<00:27, 45.99it/s]
loss 0.23 accuracy 0.94:  32%|███▏      | 602/1875 [00:23<00:27, 45.99it/s]
loss 0.35 accuracy 0.88:  32%|███▏      | 602/1875 [00:23<00:27, 45.99it/s]
loss 0.34 accuracy 0.94:  32%|███▏      | 602/1875 [00:23<00:27, 45.99it/s]
loss 0.46 accuracy 0.88:  32%|███▏      | 602/1875 [00:23<00:27, 45.99it/s]
loss 0.46 accuracy 0.88:  32%|███▏      | 607/1875 [00:23<00:27, 46.05it/s]
loss 0.41 accuracy 0.84:  32%|███▏      | 607/1875 [00:23<00:27, 46.05it/s]
loss 0.15 accuracy 0.94:  32%|███▏      | 607/1875 [00:23<00:27, 46.05it/s]
loss 0.12 accuracy 0.97:  32%|███▏      | 607/1875 [00:23<00:27, 46.05it/s]
loss 0.39 accuracy 0.84:  32%|███▏      | 607/1875 [00:23<00:27, 46.05it/s]
loss 0.07 accuracy 0.97:  32%|███▏      | 607/1875 [00:23<00:27, 46.05it/s]
loss 0.07 accuracy 0.97:  33%|███▎      | 612/1875 [00:23<00:27, 46.08it/s]
loss 0.06 accuracy 0.97:  33%|███▎      | 612/1875 [00:23<00:27, 46.08it/s]
loss 0.11 accuracy 0.97:  33%|███▎      | 612/1875 [00:23<00:27, 46.08it/s]
loss 0.23 accuracy 0.91:  33%|███▎      | 612/1875 [00:23<00:27, 46.08it/s]
loss 0.11 accuracy 1.00:  33%|███▎      | 612/1875 [00:23<00:27, 46.08it/s]
loss 0.19 accuracy 0.94:  33%|███▎      | 612/1875 [00:23<00:27, 46.08it/s]
loss 0.19 accuracy 0.94:  33%|███▎      | 617/1875 [00:23<00:27, 46.07it/s]
loss 0.17 accuracy 0.94:  33%|███▎      | 617/1875 [00:23<00:27, 46.07it/s]
loss 0.06 accuracy 1.00:  33%|███▎      | 617/1875 [00:23<00:27, 46.07it/s]
loss 0.15 accuracy 0.97:  33%|███▎      | 617/1875 [00:23<00:27, 46.07it/s]
loss 0.07 accuracy 1.00:  33%|███▎      | 617/1875 [00:23<00:27, 46.07it/s]
loss 0.34 accuracy 0.94:  33%|███▎      | 617/1875 [00:23<00:27, 46.07it/s]
loss 0.34 accuracy 0.94:  33%|███▎      | 622/1875 [00:23<00:27, 46.09it/s]
loss 0.07 accuracy 0.97:  33%|███▎      | 622/1875 [00:23<00:27, 46.09it/s]
loss 0.16 accuracy 0.94:  33%|███▎      | 622/1875 [00:23<00:27, 46.09it/s]
loss 0.28 accuracy 0.91:  33%|███▎      | 622/1875 [00:23<00:27, 46.09it/s]
loss 0.04 accuracy 1.00:  33%|███▎      | 622/1875 [00:23<00:27, 46.09it/s]
loss 0.24 accuracy 0.97:  33%|███▎      | 622/1875 [00:23<00:27, 46.09it/s]
loss 0.24 accuracy 0.97:  33%|███▎      | 627/1875 [00:23<00:27, 46.12it/s]
loss 0.11 accuracy 0.97:  33%|███▎      | 627/1875 [00:23<00:27, 46.12it/s]
loss 0.15 accuracy 0.97:  33%|███▎      | 627/1875 [00:23<00:27, 46.12it/s]
loss 0.09 accuracy 1.00:  33%|███▎      | 627/1875 [00:23<00:27, 46.12it/s]
loss 0.07 accuracy 1.00:  33%|███▎      | 627/1875 [00:23<00:27, 46.12it/s]
loss 0.04 accuracy 1.00:  33%|███▎      | 627/1875 [00:23<00:27, 46.12it/s]
loss 0.04 accuracy 1.00:  34%|███▎      | 632/1875 [00:23<00:26, 46.14it/s]
loss 0.07 accuracy 1.00:  34%|███▎      | 632/1875 [00:23<00:26, 46.14it/s]
loss 0.52 accuracy 0.88:  34%|███▎      | 632/1875 [00:23<00:26, 46.14it/s]
loss 0.23 accuracy 0.94:  34%|███▎      | 632/1875 [00:23<00:26, 46.14it/s]
loss 0.16 accuracy 0.91:  34%|███▎      | 632/1875 [00:23<00:26, 46.14it/s]
loss 0.14 accuracy 0.97:  34%|███▎      | 632/1875 [00:23<00:26, 46.14it/s]
loss 0.14 accuracy 0.97:  34%|███▍      | 637/1875 [00:23<00:26, 46.16it/s]
loss 0.50 accuracy 0.91:  34%|███▍      | 637/1875 [00:23<00:26, 46.16it/s]
loss 0.18 accuracy 0.97:  34%|███▍      | 637/1875 [00:23<00:26, 46.16it/s]
loss 0.12 accuracy 1.00:  34%|███▍      | 637/1875 [00:23<00:26, 46.16it/s]
loss 0.28 accuracy 0.88:  34%|███▍      | 637/1875 [00:23<00:26, 46.16it/s]
loss 0.09 accuracy 1.00:  34%|███▍      | 637/1875 [00:23<00:26, 46.16it/s]
loss 0.09 accuracy 1.00:  34%|███▍      | 642/1875 [00:23<00:26, 46.18it/s]
loss 0.22 accuracy 0.94:  34%|███▍      | 642/1875 [00:24<00:26, 46.18it/s]
loss 0.05 accuracy 1.00:  34%|███▍      | 642/1875 [00:24<00:26, 46.18it/s]
loss 0.26 accuracy 0.88:  34%|███▍      | 642/1875 [00:24<00:26, 46.18it/s]
loss 0.14 accuracy 0.97:  34%|███▍      | 642/1875 [00:24<00:26, 46.18it/s]
loss 0.23 accuracy 0.94:  34%|███▍      | 642/1875 [00:24<00:26, 46.18it/s]
loss 0.23 accuracy 0.94:  35%|███▍      | 647/1875 [00:24<00:26, 46.14it/s]
loss 0.17 accuracy 0.97:  35%|███▍      | 647/1875 [00:24<00:26, 46.14it/s]
loss 0.04 accuracy 1.00:  35%|███▍      | 647/1875 [00:24<00:26, 46.14it/s]
loss 0.25 accuracy 0.94:  35%|███▍      | 647/1875 [00:24<00:26, 46.14it/s]
loss 0.21 accuracy 0.97:  35%|███▍      | 647/1875 [00:24<00:26, 46.14it/s]
loss 0.15 accuracy 0.94:  35%|███▍      | 647/1875 [00:24<00:26, 46.14it/s]
loss 0.15 accuracy 0.94:  35%|███▍      | 652/1875 [00:24<00:26, 46.15it/s]
loss 0.13 accuracy 0.97:  35%|███▍      | 652/1875 [00:24<00:26, 46.15it/s]
loss 0.47 accuracy 0.88:  35%|███▍      | 652/1875 [00:24<00:26, 46.15it/s]
loss 0.05 accuracy 1.00:  35%|███▍      | 652/1875 [00:24<00:26, 46.15it/s]
loss 0.21 accuracy 0.91:  35%|███▍      | 652/1875 [00:24<00:26, 46.15it/s]
loss 0.12 accuracy 0.97:  35%|███▍      | 652/1875 [00:24<00:26, 46.15it/s]
loss 0.12 accuracy 0.97:  35%|███▌      | 657/1875 [00:24<00:26, 46.17it/s]
loss 0.10 accuracy 0.97:  35%|███▌      | 657/1875 [00:24<00:26, 46.17it/s]
loss 0.22 accuracy 0.97:  35%|███▌      | 657/1875 [00:24<00:26, 46.17it/s]
loss 0.22 accuracy 0.91:  35%|███▌      | 657/1875 [00:24<00:26, 46.17it/s]
loss 0.20 accuracy 0.94:  35%|███▌      | 657/1875 [00:24<00:26, 46.17it/s]
loss 0.14 accuracy 0.94:  35%|███▌      | 657/1875 [00:24<00:26, 46.17it/s]
loss 0.14 accuracy 0.94:  35%|███▌      | 662/1875 [00:24<00:26, 46.18it/s]
loss 0.46 accuracy 0.91:  35%|███▌      | 662/1875 [00:24<00:26, 46.18it/s]
loss 0.18 accuracy 0.94:  35%|███▌      | 662/1875 [00:24<00:26, 46.18it/s]
loss 0.06 accuracy 1.00:  35%|███▌      | 662/1875 [00:24<00:26, 46.18it/s]
loss 0.05 accuracy 1.00:  35%|███▌      | 662/1875 [00:24<00:26, 46.18it/s]
loss 0.17 accuracy 0.94:  35%|███▌      | 662/1875 [00:24<00:26, 46.18it/s]
loss 0.17 accuracy 0.94:  36%|███▌      | 667/1875 [00:24<00:26, 46.18it/s]
loss 0.11 accuracy 0.97:  36%|███▌      | 667/1875 [00:24<00:26, 46.18it/s]
loss 0.40 accuracy 0.84:  36%|███▌      | 667/1875 [00:24<00:26, 46.18it/s]
loss 0.27 accuracy 0.94:  36%|███▌      | 667/1875 [00:24<00:26, 46.18it/s]
loss 0.13 accuracy 0.97:  36%|███▌      | 667/1875 [00:24<00:26, 46.18it/s]
loss 0.15 accuracy 0.94:  36%|███▌      | 667/1875 [00:24<00:26, 46.18it/s]
loss 0.15 accuracy 0.94:  36%|███▌      | 672/1875 [00:24<00:26, 46.17it/s]
loss 0.10 accuracy 0.97:  36%|███▌      | 672/1875 [00:24<00:26, 46.17it/s]
loss 0.22 accuracy 0.97:  36%|███▌      | 672/1875 [00:24<00:26, 46.17it/s]
loss 0.10 accuracy 0.97:  36%|███▌      | 672/1875 [00:24<00:26, 46.17it/s]
loss 0.18 accuracy 0.97:  36%|███▌      | 672/1875 [00:24<00:26, 46.17it/s]
loss 0.38 accuracy 0.91:  36%|███▌      | 672/1875 [00:24<00:26, 46.17it/s]
loss 0.38 accuracy 0.91:  36%|███▌      | 677/1875 [00:24<00:25, 46.17it/s]
loss 0.03 accuracy 1.00:  36%|███▌      | 677/1875 [00:24<00:25, 46.17it/s]
loss 0.23 accuracy 0.91:  36%|███▌      | 677/1875 [00:24<00:25, 46.17it/s]
loss 0.10 accuracy 0.97:  36%|███▌      | 677/1875 [00:24<00:25, 46.17it/s]
loss 0.07 accuracy 1.00:  36%|███▌      | 677/1875 [00:24<00:25, 46.17it/s]
loss 0.09 accuracy 0.97:  36%|███▌      | 677/1875 [00:24<00:25, 46.17it/s]
loss 0.09 accuracy 0.97:  36%|███▋      | 682/1875 [00:24<00:25, 46.18it/s]
loss 0.13 accuracy 0.94:  36%|███▋      | 682/1875 [00:24<00:25, 46.18it/s]
loss 0.08 accuracy 0.97:  36%|███▋      | 682/1875 [00:24<00:25, 46.18it/s]
loss 0.08 accuracy 0.97:  36%|███▋      | 682/1875 [00:24<00:25, 46.18it/s]
loss 0.11 accuracy 0.97:  36%|███▋      | 682/1875 [00:24<00:25, 46.18it/s]
loss 0.10 accuracy 0.94:  36%|███▋      | 682/1875 [00:24<00:25, 46.18it/s]
loss 0.10 accuracy 0.94:  37%|███▋      | 687/1875 [00:24<00:25, 46.15it/s]
loss 0.15 accuracy 0.94:  37%|███▋      | 687/1875 [00:24<00:25, 46.15it/s]
loss 0.10 accuracy 0.97:  37%|███▋      | 687/1875 [00:25<00:25, 46.15it/s]
loss 0.28 accuracy 0.97:  37%|███▋      | 687/1875 [00:25<00:25, 46.15it/s]
loss 0.32 accuracy 0.91:  37%|███▋      | 687/1875 [00:25<00:25, 46.15it/s]
loss 0.11 accuracy 0.97:  37%|███▋      | 687/1875 [00:25<00:25, 46.15it/s]
loss 0.11 accuracy 0.97:  37%|███▋      | 692/1875 [00:25<00:25, 46.09it/s]
loss 0.08 accuracy 0.97:  37%|███▋      | 692/1875 [00:25<00:25, 46.09it/s]
loss 0.06 accuracy 1.00:  37%|███▋      | 692/1875 [00:25<00:25, 46.09it/s]
loss 0.09 accuracy 1.00:  37%|███▋      | 692/1875 [00:25<00:25, 46.09it/s]
loss 0.19 accuracy 0.97:  37%|███▋      | 692/1875 [00:25<00:25, 46.09it/s]
loss 0.38 accuracy 0.91:  37%|███▋      | 692/1875 [00:25<00:25, 46.09it/s]
loss 0.38 accuracy 0.91:  37%|███▋      | 697/1875 [00:25<00:25, 46.05it/s]
loss 0.11 accuracy 0.97:  37%|███▋      | 697/1875 [00:25<00:25, 46.05it/s]
loss 0.16 accuracy 0.97:  37%|███▋      | 697/1875 [00:25<00:25, 46.05it/s]
loss 0.30 accuracy 0.91:  37%|███▋      | 697/1875 [00:25<00:25, 46.05it/s]
loss 0.05 accuracy 1.00:  37%|███▋      | 697/1875 [00:25<00:25, 46.05it/s]
loss 0.19 accuracy 0.91:  37%|███▋      | 697/1875 [00:25<00:25, 46.05it/s]
loss 0.19 accuracy 0.91:  37%|███▋      | 702/1875 [00:25<00:25, 46.05it/s]
loss 0.24 accuracy 0.94:  37%|███▋      | 702/1875 [00:25<00:25, 46.05it/s]
loss 0.15 accuracy 0.94:  37%|███▋      | 702/1875 [00:25<00:25, 46.05it/s]
loss 0.10 accuracy 1.00:  37%|███▋      | 702/1875 [00:25<00:25, 46.05it/s]
loss 0.09 accuracy 1.00:  37%|███▋      | 702/1875 [00:25<00:25, 46.05it/s]
loss 0.16 accuracy 0.94:  37%|███▋      | 702/1875 [00:25<00:25, 46.05it/s]
loss 0.16 accuracy 0.94:  38%|███▊      | 707/1875 [00:25<00:25, 45.98it/s]
loss 0.07 accuracy 1.00:  38%|███▊      | 707/1875 [00:25<00:25, 45.98it/s]
loss 0.42 accuracy 0.88:  38%|███▊      | 707/1875 [00:25<00:25, 45.98it/s]
loss 0.15 accuracy 0.97:  38%|███▊      | 707/1875 [00:25<00:25, 45.98it/s]
loss 0.16 accuracy 0.97:  38%|███▊      | 707/1875 [00:25<00:25, 45.98it/s]
loss 0.13 accuracy 0.97:  38%|███▊      | 707/1875 [00:25<00:25, 45.98it/s]
loss 0.13 accuracy 0.97:  38%|███▊      | 712/1875 [00:25<00:25, 45.84it/s]
loss 0.08 accuracy 0.97:  38%|███▊      | 712/1875 [00:25<00:25, 45.84it/s]
loss 0.11 accuracy 0.97:  38%|███▊      | 712/1875 [00:25<00:25, 45.84it/s]
loss 0.37 accuracy 0.88:  38%|███▊      | 712/1875 [00:25<00:25, 45.84it/s]
loss 0.20 accuracy 0.97:  38%|███▊      | 712/1875 [00:25<00:25, 45.84it/s]
loss 0.13 accuracy 0.97:  38%|███▊      | 712/1875 [00:25<00:25, 45.84it/s]
loss 0.13 accuracy 0.97:  38%|███▊      | 717/1875 [00:25<00:25, 45.84it/s]
loss 0.22 accuracy 0.91:  38%|███▊      | 717/1875 [00:25<00:25, 45.84it/s]
loss 0.20 accuracy 0.91:  38%|███▊      | 717/1875 [00:25<00:25, 45.84it/s]
loss 0.08 accuracy 0.97:  38%|███▊      | 717/1875 [00:25<00:25, 45.84it/s]
loss 0.20 accuracy 0.94:  38%|███▊      | 717/1875 [00:25<00:25, 45.84it/s]
loss 0.12 accuracy 0.97:  38%|███▊      | 717/1875 [00:25<00:25, 45.84it/s]
loss 0.12 accuracy 0.97:  39%|███▊      | 722/1875 [00:25<00:25, 45.83it/s]
loss 0.06 accuracy 1.00:  39%|███▊      | 722/1875 [00:25<00:25, 45.83it/s]
loss 0.12 accuracy 0.97:  39%|███▊      | 722/1875 [00:25<00:25, 45.83it/s]
loss 0.15 accuracy 0.97:  39%|███▊      | 722/1875 [00:25<00:25, 45.83it/s]
loss 0.18 accuracy 0.94:  39%|███▊      | 722/1875 [00:25<00:25, 45.83it/s]
loss 0.11 accuracy 0.97:  39%|███▊      | 722/1875 [00:25<00:25, 45.83it/s]
loss 0.11 accuracy 0.97:  39%|███▉      | 727/1875 [00:25<00:25, 45.71it/s]
loss 0.09 accuracy 1.00:  39%|███▉      | 727/1875 [00:25<00:25, 45.71it/s]
loss 0.08 accuracy 0.97:  39%|███▉      | 727/1875 [00:25<00:25, 45.71it/s]
loss 0.27 accuracy 0.97:  39%|███▉      | 727/1875 [00:25<00:25, 45.71it/s]
loss 0.12 accuracy 0.97:  39%|███▉      | 727/1875 [00:25<00:25, 45.71it/s]
loss 0.04 accuracy 1.00:  39%|███▉      | 727/1875 [00:25<00:25, 45.71it/s]
loss 0.04 accuracy 1.00:  39%|███▉      | 732/1875 [00:25<00:25, 45.71it/s]
loss 0.17 accuracy 0.94:  39%|███▉      | 732/1875 [00:25<00:25, 45.71it/s]
loss 0.11 accuracy 0.97:  39%|███▉      | 732/1875 [00:25<00:25, 45.71it/s]
loss 0.13 accuracy 0.97:  39%|███▉      | 732/1875 [00:26<00:25, 45.71it/s]
loss 0.08 accuracy 0.97:  39%|███▉      | 732/1875 [00:26<00:25, 45.71it/s]
loss 0.17 accuracy 0.97:  39%|███▉      | 732/1875 [00:26<00:25, 45.71it/s]
loss 0.17 accuracy 0.97:  39%|███▉      | 737/1875 [00:26<00:24, 45.69it/s]
loss 0.07 accuracy 0.97:  39%|███▉      | 737/1875 [00:26<00:24, 45.69it/s]
loss 0.45 accuracy 0.91:  39%|███▉      | 737/1875 [00:26<00:24, 45.69it/s]
loss 0.20 accuracy 0.91:  39%|███▉      | 737/1875 [00:26<00:24, 45.69it/s]
loss 0.32 accuracy 0.94:  39%|███▉      | 737/1875 [00:26<00:24, 45.69it/s]
loss 0.12 accuracy 0.97:  39%|███▉      | 737/1875 [00:26<00:24, 45.69it/s]
loss 0.12 accuracy 0.97:  40%|███▉      | 742/1875 [00:26<00:24, 45.74it/s]
loss 0.02 accuracy 1.00:  40%|███▉      | 742/1875 [00:26<00:24, 45.74it/s]
loss 0.18 accuracy 0.97:  40%|███▉      | 742/1875 [00:26<00:24, 45.74it/s]
loss 0.03 accuracy 1.00:  40%|███▉      | 742/1875 [00:26<00:24, 45.74it/s]
loss 0.27 accuracy 0.94:  40%|███▉      | 742/1875 [00:26<00:24, 45.74it/s]
loss 0.14 accuracy 0.97:  40%|███▉      | 742/1875 [00:26<00:24, 45.74it/s]
loss 0.14 accuracy 0.97:  40%|███▉      | 747/1875 [00:26<00:24, 45.86it/s]
loss 0.05 accuracy 1.00:  40%|███▉      | 747/1875 [00:26<00:24, 45.86it/s]
loss 0.07 accuracy 1.00:  40%|███▉      | 747/1875 [00:26<00:24, 45.86it/s]
loss 0.19 accuracy 0.91:  40%|███▉      | 747/1875 [00:26<00:24, 45.86it/s]
loss 0.23 accuracy 0.91:  40%|███▉      | 747/1875 [00:26<00:24, 45.86it/s]
loss 0.29 accuracy 0.91:  40%|███▉      | 747/1875 [00:26<00:24, 45.86it/s]
loss 0.29 accuracy 0.91:  40%|████      | 752/1875 [00:26<00:24, 45.93it/s]
loss 0.19 accuracy 0.97:  40%|████      | 752/1875 [00:26<00:24, 45.93it/s]
loss 0.07 accuracy 1.00:  40%|████      | 752/1875 [00:26<00:24, 45.93it/s]
loss 0.10 accuracy 0.97:  40%|████      | 752/1875 [00:26<00:24, 45.93it/s]
loss 0.37 accuracy 0.94:  40%|████      | 752/1875 [00:26<00:24, 45.93it/s]
loss 0.06 accuracy 1.00:  40%|████      | 752/1875 [00:26<00:24, 45.93it/s]
loss 0.06 accuracy 1.00:  40%|████      | 757/1875 [00:26<00:24, 45.98it/s]
loss 0.27 accuracy 0.94:  40%|████      | 757/1875 [00:26<00:24, 45.98it/s]
loss 0.23 accuracy 0.91:  40%|████      | 757/1875 [00:26<00:24, 45.98it/s]
loss 0.09 accuracy 0.97:  40%|████      | 757/1875 [00:26<00:24, 45.98it/s]
loss 0.05 accuracy 1.00:  40%|████      | 757/1875 [00:26<00:24, 45.98it/s]
loss 0.24 accuracy 0.88:  40%|████      | 757/1875 [00:26<00:24, 45.98it/s]
loss 0.24 accuracy 0.88:  41%|████      | 762/1875 [00:26<00:24, 46.04it/s]
loss 0.11 accuracy 0.97:  41%|████      | 762/1875 [00:26<00:24, 46.04it/s]
loss 0.28 accuracy 0.91:  41%|████      | 762/1875 [00:26<00:24, 46.04it/s]
loss 0.18 accuracy 0.91:  41%|████      | 762/1875 [00:26<00:24, 46.04it/s]
loss 0.08 accuracy 0.97:  41%|████      | 762/1875 [00:26<00:24, 46.04it/s]
loss 0.08 accuracy 1.00:  41%|████      | 762/1875 [00:26<00:24, 46.04it/s]
loss 0.08 accuracy 1.00:  41%|████      | 767/1875 [00:26<00:24, 46.09it/s]
loss 0.16 accuracy 0.94:  41%|████      | 767/1875 [00:26<00:24, 46.09it/s]
loss 0.14 accuracy 0.97:  41%|████      | 767/1875 [00:26<00:24, 46.09it/s]
loss 0.09 accuracy 1.00:  41%|████      | 767/1875 [00:26<00:24, 46.09it/s]
loss 0.09 accuracy 0.97:  41%|████      | 767/1875 [00:26<00:24, 46.09it/s]
loss 0.11 accuracy 0.97:  41%|████      | 767/1875 [00:26<00:24, 46.09it/s]
loss 0.11 accuracy 0.97:  41%|████      | 772/1875 [00:26<00:23, 46.13it/s]
loss 0.34 accuracy 0.94:  41%|████      | 772/1875 [00:26<00:23, 46.13it/s]
loss 0.17 accuracy 0.94:  41%|████      | 772/1875 [00:26<00:23, 46.13it/s]
loss 0.29 accuracy 0.94:  41%|████      | 772/1875 [00:26<00:23, 46.13it/s]
loss 0.06 accuracy 1.00:  41%|████      | 772/1875 [00:26<00:23, 46.13it/s]
loss 0.11 accuracy 0.97:  41%|████      | 772/1875 [00:26<00:23, 46.13it/s]
loss 0.11 accuracy 0.97:  41%|████▏     | 777/1875 [00:26<00:23, 46.11it/s]
loss 0.10 accuracy 0.94:  41%|████▏     | 777/1875 [00:26<00:23, 46.11it/s]
loss 0.03 accuracy 1.00:  41%|████▏     | 777/1875 [00:26<00:23, 46.11it/s]
loss 0.06 accuracy 0.97:  41%|████▏     | 777/1875 [00:26<00:23, 46.11it/s]
loss 0.24 accuracy 0.97:  41%|████▏     | 777/1875 [00:27<00:23, 46.11it/s]
loss 0.08 accuracy 1.00:  41%|████▏     | 777/1875 [00:27<00:23, 46.11it/s]
loss 0.08 accuracy 1.00:  42%|████▏     | 782/1875 [00:27<00:23, 46.10it/s]
loss 0.07 accuracy 0.97:  42%|████▏     | 782/1875 [00:27<00:23, 46.10it/s]
loss 0.03 accuracy 1.00:  42%|████▏     | 782/1875 [00:27<00:23, 46.10it/s]
loss 0.16 accuracy 0.97:  42%|████▏     | 782/1875 [00:27<00:23, 46.10it/s]
loss 0.24 accuracy 0.91:  42%|████▏     | 782/1875 [00:27<00:23, 46.10it/s]
loss 0.25 accuracy 0.91:  42%|████▏     | 782/1875 [00:27<00:23, 46.10it/s]
loss 0.25 accuracy 0.91:  42%|████▏     | 787/1875 [00:27<00:23, 46.11it/s]
loss 0.39 accuracy 0.81:  42%|████▏     | 787/1875 [00:27<00:23, 46.11it/s]
loss 0.10 accuracy 0.97:  42%|████▏     | 787/1875 [00:27<00:23, 46.11it/s]
loss 0.10 accuracy 0.97:  42%|████▏     | 787/1875 [00:27<00:23, 46.11it/s]
loss 0.17 accuracy 0.91:  42%|████▏     | 787/1875 [00:27<00:23, 46.11it/s]
loss 0.10 accuracy 0.94:  42%|████▏     | 787/1875 [00:27<00:23, 46.11it/s]
loss 0.10 accuracy 0.94:  42%|████▏     | 792/1875 [00:27<00:23, 46.09it/s]
loss 0.15 accuracy 0.94:  42%|████▏     | 792/1875 [00:27<00:23, 46.09it/s]
loss 0.09 accuracy 0.97:  42%|████▏     | 792/1875 [00:27<00:23, 46.09it/s]
loss 0.28 accuracy 0.88:  42%|████▏     | 792/1875 [00:27<00:23, 46.09it/s]
loss 0.06 accuracy 1.00:  42%|████▏     | 792/1875 [00:27<00:23, 46.09it/s]
loss 0.13 accuracy 0.97:  42%|████▏     | 792/1875 [00:27<00:23, 46.09it/s]
loss 0.13 accuracy 0.97:  43%|████▎     | 797/1875 [00:27<00:23, 46.08it/s]
loss 0.17 accuracy 0.94:  43%|████▎     | 797/1875 [00:27<00:23, 46.08it/s]
loss 0.28 accuracy 0.94:  43%|████▎     | 797/1875 [00:27<00:23, 46.08it/s]
loss 0.14 accuracy 0.97:  43%|████▎     | 797/1875 [00:27<00:23, 46.08it/s]
loss 0.13 accuracy 0.97:  43%|████▎     | 797/1875 [00:27<00:23, 46.08it/s]
loss 0.36 accuracy 0.91:  43%|████▎     | 797/1875 [00:27<00:23, 46.08it/s]
loss 0.36 accuracy 0.91:  43%|████▎     | 802/1875 [00:27<00:23, 46.04it/s]
loss 0.13 accuracy 0.97:  43%|████▎     | 802/1875 [00:27<00:23, 46.04it/s]
loss 0.22 accuracy 0.94:  43%|████▎     | 802/1875 [00:27<00:23, 46.04it/s]
loss 0.19 accuracy 0.94:  43%|████▎     | 802/1875 [00:27<00:23, 46.04it/s]
loss 0.24 accuracy 0.91:  43%|████▎     | 802/1875 [00:27<00:23, 46.04it/s]
loss 0.14 accuracy 0.97:  43%|████▎     | 802/1875 [00:27<00:23, 46.04it/s]
loss 0.14 accuracy 0.97:  43%|████▎     | 807/1875 [00:27<00:23, 45.99it/s]
loss 0.13 accuracy 0.97:  43%|████▎     | 807/1875 [00:27<00:23, 45.99it/s]
loss 0.14 accuracy 0.94:  43%|████▎     | 807/1875 [00:27<00:23, 45.99it/s]
loss 0.09 accuracy 0.97:  43%|████▎     | 807/1875 [00:27<00:23, 45.99it/s]
loss 0.16 accuracy 0.97:  43%|████▎     | 807/1875 [00:27<00:23, 45.99it/s]
loss 0.21 accuracy 0.97:  43%|████▎     | 807/1875 [00:27<00:23, 45.99it/s]
loss 0.21 accuracy 0.97:  43%|████▎     | 812/1875 [00:27<00:23, 45.83it/s]
loss 0.07 accuracy 0.97:  43%|████▎     | 812/1875 [00:27<00:23, 45.83it/s]
loss 0.16 accuracy 0.94:  43%|████▎     | 812/1875 [00:27<00:23, 45.83it/s]
loss 0.17 accuracy 0.94:  43%|████▎     | 812/1875 [00:27<00:23, 45.83it/s]
loss 0.30 accuracy 0.91:  43%|████▎     | 812/1875 [00:27<00:23, 45.83it/s]
loss 0.07 accuracy 1.00:  43%|████▎     | 812/1875 [00:27<00:23, 45.83it/s]
loss 0.07 accuracy 1.00:  44%|████▎     | 817/1875 [00:27<00:23, 45.80it/s]
loss 0.15 accuracy 0.94:  44%|████▎     | 817/1875 [00:27<00:23, 45.80it/s]
loss 0.10 accuracy 1.00:  44%|████▎     | 817/1875 [00:27<00:23, 45.80it/s]
loss 0.23 accuracy 0.94:  44%|████▎     | 817/1875 [00:27<00:23, 45.80it/s]
loss 0.11 accuracy 0.94:  44%|████▎     | 817/1875 [00:27<00:23, 45.80it/s]
loss 0.08 accuracy 1.00:  44%|████▎     | 817/1875 [00:27<00:23, 45.80it/s]
loss 0.08 accuracy 1.00:  44%|████▍     | 822/1875 [00:27<00:23, 45.73it/s]
loss 0.21 accuracy 0.88:  44%|████▍     | 822/1875 [00:27<00:23, 45.73it/s]
loss 0.11 accuracy 0.97:  44%|████▍     | 822/1875 [00:27<00:23, 45.73it/s]
loss 0.07 accuracy 0.97:  44%|████▍     | 822/1875 [00:27<00:23, 45.73it/s]
loss 0.10 accuracy 0.97:  44%|████▍     | 822/1875 [00:28<00:23, 45.73it/s]
loss 0.24 accuracy 0.91:  44%|████▍     | 822/1875 [00:28<00:23, 45.73it/s]
loss 0.24 accuracy 0.91:  44%|████▍     | 827/1875 [00:28<00:22, 45.70it/s]
loss 0.27 accuracy 0.88:  44%|████▍     | 827/1875 [00:28<00:22, 45.70it/s]
loss 0.07 accuracy 0.97:  44%|████▍     | 827/1875 [00:28<00:22, 45.70it/s]
loss 0.24 accuracy 0.91:  44%|████▍     | 827/1875 [00:28<00:22, 45.70it/s]
loss 0.09 accuracy 1.00:  44%|████▍     | 827/1875 [00:28<00:22, 45.70it/s]
loss 0.25 accuracy 0.94:  44%|████▍     | 827/1875 [00:28<00:22, 45.70it/s]
loss 0.25 accuracy 0.94:  44%|████▍     | 832/1875 [00:28<00:22, 45.70it/s]
loss 0.12 accuracy 0.97:  44%|████▍     | 832/1875 [00:28<00:22, 45.70it/s]
loss 0.42 accuracy 0.84:  44%|████▍     | 832/1875 [00:28<00:22, 45.70it/s]
loss 0.05 accuracy 1.00:  44%|████▍     | 832/1875 [00:28<00:22, 45.70it/s]
loss 0.05 accuracy 1.00:  44%|████▍     | 832/1875 [00:28<00:22, 45.70it/s]
loss 0.13 accuracy 0.97:  44%|████▍     | 832/1875 [00:28<00:22, 45.70it/s]
loss 0.13 accuracy 0.97:  45%|████▍     | 837/1875 [00:28<00:22, 45.72it/s]
loss 0.18 accuracy 0.94:  45%|████▍     | 837/1875 [00:28<00:22, 45.72it/s]
loss 0.24 accuracy 0.94:  45%|████▍     | 837/1875 [00:28<00:22, 45.72it/s]
loss 0.12 accuracy 0.94:  45%|████▍     | 837/1875 [00:28<00:22, 45.72it/s]
loss 0.09 accuracy 0.97:  45%|████▍     | 837/1875 [00:28<00:22, 45.72it/s]
loss 0.17 accuracy 0.97:  45%|████▍     | 837/1875 [00:28<00:22, 45.72it/s]
loss 0.17 accuracy 0.97:  45%|████▍     | 842/1875 [00:28<00:22, 45.75it/s]
loss 0.25 accuracy 0.91:  45%|████▍     | 842/1875 [00:28<00:22, 45.75it/s]
loss 0.06 accuracy 1.00:  45%|████▍     | 842/1875 [00:28<00:22, 45.75it/s]
loss 0.20 accuracy 0.97:  45%|████▍     | 842/1875 [00:28<00:22, 45.75it/s]
loss 0.04 accuracy 1.00:  45%|████▍     | 842/1875 [00:28<00:22, 45.75it/s]
loss 0.18 accuracy 0.94:  45%|████▍     | 842/1875 [00:28<00:22, 45.75it/s]
loss 0.18 accuracy 0.94:  45%|████▌     | 847/1875 [00:28<00:22, 45.88it/s]
loss 0.03 accuracy 1.00:  45%|████▌     | 847/1875 [00:28<00:22, 45.88it/s]
loss 0.21 accuracy 0.94:  45%|████▌     | 847/1875 [00:28<00:22, 45.88it/s]
loss 0.14 accuracy 0.97:  45%|████▌     | 847/1875 [00:28<00:22, 45.88it/s]
loss 0.04 accuracy 1.00:  45%|████▌     | 847/1875 [00:28<00:22, 45.88it/s]
loss 0.18 accuracy 0.97:  45%|████▌     | 847/1875 [00:28<00:22, 45.88it/s]
loss 0.18 accuracy 0.97:  45%|████▌     | 852/1875 [00:28<00:22, 45.95it/s]
loss 0.06 accuracy 1.00:  45%|████▌     | 852/1875 [00:28<00:22, 45.95it/s]
loss 0.06 accuracy 1.00:  45%|████▌     | 852/1875 [00:28<00:22, 45.95it/s]
loss 0.13 accuracy 0.97:  45%|████▌     | 852/1875 [00:28<00:22, 45.95it/s]
loss 0.03 accuracy 1.00:  45%|████▌     | 852/1875 [00:28<00:22, 45.95it/s]
loss 0.03 accuracy 1.00:  45%|████▌     | 852/1875 [00:28<00:22, 45.95it/s]
loss 0.03 accuracy 1.00:  46%|████▌     | 857/1875 [00:28<00:22, 46.01it/s]
loss 0.12 accuracy 0.94:  46%|████▌     | 857/1875 [00:28<00:22, 46.01it/s]
loss 0.16 accuracy 0.94:  46%|████▌     | 857/1875 [00:28<00:22, 46.01it/s]
loss 0.09 accuracy 0.97:  46%|████▌     | 857/1875 [00:28<00:22, 46.01it/s]
loss 0.05 accuracy 1.00:  46%|████▌     | 857/1875 [00:28<00:22, 46.01it/s]
loss 0.17 accuracy 0.94:  46%|████▌     | 857/1875 [00:28<00:22, 46.01it/s]
loss 0.17 accuracy 0.94:  46%|████▌     | 862/1875 [00:28<00:21, 46.07it/s]
loss 0.04 accuracy 1.00:  46%|████▌     | 862/1875 [00:28<00:21, 46.07it/s]
loss 0.20 accuracy 0.94:  46%|████▌     | 862/1875 [00:28<00:21, 46.07it/s]
loss 0.06 accuracy 1.00:  46%|████▌     | 862/1875 [00:28<00:21, 46.07it/s]
loss 0.05 accuracy 1.00:  46%|████▌     | 862/1875 [00:28<00:21, 46.07it/s]
loss 0.06 accuracy 0.97:  46%|████▌     | 862/1875 [00:28<00:21, 46.07it/s]
loss 0.06 accuracy 0.97:  46%|████▌     | 867/1875 [00:28<00:21, 46.07it/s]
loss 0.11 accuracy 0.97:  46%|████▌     | 867/1875 [00:28<00:21, 46.07it/s]
loss 0.19 accuracy 0.94:  46%|████▌     | 867/1875 [00:28<00:21, 46.07it/s]
loss 0.12 accuracy 1.00:  46%|████▌     | 867/1875 [00:28<00:21, 46.07it/s]
loss 0.21 accuracy 0.97:  46%|████▌     | 867/1875 [00:28<00:21, 46.07it/s]
loss 0.10 accuracy 0.97:  46%|████▌     | 867/1875 [00:29<00:21, 46.07it/s]
loss 0.10 accuracy 0.97:  47%|████▋     | 872/1875 [00:29<00:21, 46.12it/s]
loss 0.05 accuracy 1.00:  47%|████▋     | 872/1875 [00:29<00:21, 46.12it/s]
loss 0.14 accuracy 0.97:  47%|████▋     | 872/1875 [00:29<00:21, 46.12it/s]
loss 0.04 accuracy 1.00:  47%|████▋     | 872/1875 [00:29<00:21, 46.12it/s]
loss 0.03 accuracy 1.00:  47%|████▋     | 872/1875 [00:29<00:21, 46.12it/s]
loss 0.16 accuracy 0.97:  47%|████▋     | 872/1875 [00:29<00:21, 46.12it/s]
loss 0.16 accuracy 0.97:  47%|████▋     | 877/1875 [00:29<00:21, 46.12it/s]
loss 0.06 accuracy 0.97:  47%|████▋     | 877/1875 [00:29<00:21, 46.12it/s]
loss 0.07 accuracy 0.97:  47%|████▋     | 877/1875 [00:29<00:21, 46.12it/s]
loss 0.08 accuracy 0.97:  47%|████▋     | 877/1875 [00:29<00:21, 46.12it/s]
loss 0.12 accuracy 0.94:  47%|████▋     | 877/1875 [00:29<00:21, 46.12it/s]
loss 0.18 accuracy 0.94:  47%|████▋     | 877/1875 [00:29<00:21, 46.12it/s]
loss 0.18 accuracy 0.94:  47%|████▋     | 882/1875 [00:29<00:21, 46.12it/s]
loss 0.15 accuracy 0.94:  47%|████▋     | 882/1875 [00:29<00:21, 46.12it/s]
loss 0.21 accuracy 0.97:  47%|████▋     | 882/1875 [00:29<00:21, 46.12it/s]
loss 0.06 accuracy 1.00:  47%|████▋     | 882/1875 [00:29<00:21, 46.12it/s]
loss 0.12 accuracy 0.94:  47%|████▋     | 882/1875 [00:29<00:21, 46.12it/s]
loss 0.05 accuracy 1.00:  47%|████▋     | 882/1875 [00:29<00:21, 46.12it/s]
loss 0.05 accuracy 1.00:  47%|████▋     | 887/1875 [00:29<00:21, 46.10it/s]
loss 0.08 accuracy 0.94:  47%|████▋     | 887/1875 [00:29<00:21, 46.10it/s]
loss 0.33 accuracy 0.94:  47%|████▋     | 887/1875 [00:29<00:21, 46.10it/s]
loss 0.47 accuracy 0.84:  47%|████▋     | 887/1875 [00:29<00:21, 46.10it/s]
loss 0.10 accuracy 0.97:  47%|████▋     | 887/1875 [00:29<00:21, 46.10it/s]
loss 0.21 accuracy 0.91:  47%|████▋     | 887/1875 [00:29<00:21, 46.10it/s]
loss 0.21 accuracy 0.91:  48%|████▊     | 892/1875 [00:29<00:21, 46.13it/s]
loss 0.16 accuracy 0.97:  48%|████▊     | 892/1875 [00:29<00:21, 46.13it/s]
loss 0.09 accuracy 0.97:  48%|████▊     | 892/1875 [00:29<00:21, 46.13it/s]
loss 0.16 accuracy 0.97:  48%|████▊     | 892/1875 [00:29<00:21, 46.13it/s]
loss 0.17 accuracy 0.94:  48%|████▊     | 892/1875 [00:29<00:21, 46.13it/s]
loss 0.04 accuracy 1.00:  48%|████▊     | 892/1875 [00:29<00:21, 46.13it/s]
loss 0.04 accuracy 1.00:  48%|████▊     | 897/1875 [00:29<00:21, 46.13it/s]
loss 0.04 accuracy 1.00:  48%|████▊     | 897/1875 [00:29<00:21, 46.13it/s]
loss 0.05 accuracy 1.00:  48%|████▊     | 897/1875 [00:29<00:21, 46.13it/s]
loss 0.17 accuracy 0.94:  48%|████▊     | 897/1875 [00:29<00:21, 46.13it/s]
loss 0.09 accuracy 0.94:  48%|████▊     | 897/1875 [00:29<00:21, 46.13it/s]
loss 0.06 accuracy 1.00:  48%|████▊     | 897/1875 [00:29<00:21, 46.13it/s]
loss 0.06 accuracy 1.00:  48%|████▊     | 902/1875 [00:29<00:21, 46.16it/s]
loss 0.26 accuracy 0.94:  48%|████▊     | 902/1875 [00:29<00:21, 46.16it/s]
loss 0.07 accuracy 1.00:  48%|████▊     | 902/1875 [00:29<00:21, 46.16it/s]
loss 0.02 accuracy 1.00:  48%|████▊     | 902/1875 [00:29<00:21, 46.16it/s]
loss 0.08 accuracy 0.97:  48%|████▊     | 902/1875 [00:29<00:21, 46.16it/s]
loss 0.21 accuracy 0.94:  48%|████▊     | 902/1875 [00:29<00:21, 46.16it/s]
loss 0.21 accuracy 0.94:  48%|████▊     | 907/1875 [00:29<00:20, 46.16it/s]
loss 0.06 accuracy 0.97:  48%|████▊     | 907/1875 [00:29<00:20, 46.16it/s]
loss 0.05 accuracy 1.00:  48%|████▊     | 907/1875 [00:29<00:20, 46.16it/s]
loss 0.10 accuracy 1.00:  48%|████▊     | 907/1875 [00:29<00:20, 46.16it/s]
loss 0.12 accuracy 0.97:  48%|████▊     | 907/1875 [00:29<00:20, 46.16it/s]
loss 0.08 accuracy 0.97:  48%|████▊     | 907/1875 [00:29<00:20, 46.16it/s]
loss 0.08 accuracy 0.97:  49%|████▊     | 912/1875 [00:29<00:20, 46.15it/s]
loss 0.13 accuracy 0.97:  49%|████▊     | 912/1875 [00:29<00:20, 46.15it/s]
loss 0.08 accuracy 1.00:  49%|████▊     | 912/1875 [00:29<00:20, 46.15it/s]
loss 0.05 accuracy 1.00:  49%|████▊     | 912/1875 [00:29<00:20, 46.15it/s]
loss 0.33 accuracy 0.91:  49%|████▊     | 912/1875 [00:29<00:20, 46.15it/s]
loss 0.03 accuracy 1.00:  49%|████▊     | 912/1875 [00:29<00:20, 46.15it/s]
loss 0.03 accuracy 1.00:  49%|████▉     | 917/1875 [00:29<00:20, 46.14it/s]
loss 0.11 accuracy 1.00:  49%|████▉     | 917/1875 [00:29<00:20, 46.14it/s]
loss 0.17 accuracy 0.97:  49%|████▉     | 917/1875 [00:30<00:20, 46.14it/s]
loss 0.38 accuracy 0.91:  49%|████▉     | 917/1875 [00:30<00:20, 46.14it/s]
loss 0.12 accuracy 0.97:  49%|████▉     | 917/1875 [00:30<00:20, 46.14it/s]
loss 0.15 accuracy 0.94:  49%|████▉     | 917/1875 [00:30<00:20, 46.14it/s]
loss 0.15 accuracy 0.94:  49%|████▉     | 922/1875 [00:30<00:20, 46.16it/s]
loss 0.18 accuracy 0.94:  49%|████▉     | 922/1875 [00:30<00:20, 46.16it/s]
loss 0.31 accuracy 0.88:  49%|████▉     | 922/1875 [00:30<00:20, 46.16it/s]
loss 0.19 accuracy 0.91:  49%|████▉     | 922/1875 [00:30<00:20, 46.16it/s]
loss 0.16 accuracy 0.94:  49%|████▉     | 922/1875 [00:30<00:20, 46.16it/s]
loss 0.14 accuracy 0.94:  49%|████▉     | 922/1875 [00:30<00:20, 46.16it/s]
loss 0.14 accuracy 0.94:  49%|████▉     | 927/1875 [00:30<00:20, 46.15it/s]
loss 0.11 accuracy 0.97:  49%|████▉     | 927/1875 [00:30<00:20, 46.15it/s]
loss 0.18 accuracy 0.94:  49%|████▉     | 927/1875 [00:30<00:20, 46.15it/s]
loss 0.08 accuracy 1.00:  49%|████▉     | 927/1875 [00:30<00:20, 46.15it/s]
loss 0.10 accuracy 0.97:  49%|████▉     | 927/1875 [00:30<00:20, 46.15it/s]
loss 0.18 accuracy 0.97:  49%|████▉     | 927/1875 [00:30<00:20, 46.15it/s]
loss 0.18 accuracy 0.97:  50%|████▉     | 932/1875 [00:30<00:20, 46.14it/s]
loss 0.21 accuracy 0.94:  50%|████▉     | 932/1875 [00:30<00:20, 46.14it/s]
loss 0.11 accuracy 0.97:  50%|████▉     | 932/1875 [00:30<00:20, 46.14it/s]
loss 0.08 accuracy 1.00:  50%|████▉     | 932/1875 [00:30<00:20, 46.14it/s]
loss 0.26 accuracy 0.97:  50%|████▉     | 932/1875 [00:30<00:20, 46.14it/s]
loss 0.14 accuracy 0.94:  50%|████▉     | 932/1875 [00:30<00:20, 46.14it/s]
loss 0.14 accuracy 0.94:  50%|████▉     | 937/1875 [00:30<00:20, 46.08it/s]
loss 0.07 accuracy 0.97:  50%|████▉     | 937/1875 [00:30<00:20, 46.08it/s]
loss 0.51 accuracy 0.91:  50%|████▉     | 937/1875 [00:30<00:20, 46.08it/s]
loss 0.18 accuracy 0.94:  50%|████▉     | 937/1875 [00:30<00:20, 46.08it/s]
loss 0.14 accuracy 0.97:  50%|████▉     | 937/1875 [00:30<00:20, 46.08it/s]
loss 0.38 accuracy 0.94:  50%|████▉     | 937/1875 [00:30<00:20, 46.08it/s]
loss 0.38 accuracy 0.94:  50%|█████     | 942/1875 [00:30<00:20, 46.06it/s]
loss 0.07 accuracy 1.00:  50%|█████     | 942/1875 [00:30<00:20, 46.06it/s]
loss 0.24 accuracy 0.97:  50%|█████     | 942/1875 [00:30<00:20, 46.06it/s]
loss 0.13 accuracy 0.97:  50%|█████     | 942/1875 [00:30<00:20, 46.06it/s]
loss 0.07 accuracy 0.97:  50%|█████     | 942/1875 [00:30<00:20, 46.06it/s]
loss 0.15 accuracy 0.97:  50%|█████     | 942/1875 [00:30<00:20, 46.06it/s]
loss 0.15 accuracy 0.97:  51%|█████     | 947/1875 [00:30<00:20, 45.93it/s]
loss 0.15 accuracy 0.97:  51%|█████     | 947/1875 [00:30<00:20, 45.93it/s]
loss 0.40 accuracy 0.91:  51%|█████     | 947/1875 [00:30<00:20, 45.93it/s]
loss 0.07 accuracy 0.97:  51%|█████     | 947/1875 [00:30<00:20, 45.93it/s]
loss 0.11 accuracy 0.97:  51%|█████     | 947/1875 [00:30<00:20, 45.93it/s]
loss 0.09 accuracy 0.97:  51%|█████     | 947/1875 [00:30<00:20, 45.93it/s]
loss 0.09 accuracy 0.97:  51%|█████     | 952/1875 [00:30<00:20, 45.86it/s]
loss 0.03 accuracy 1.00:  51%|█████     | 952/1875 [00:30<00:20, 45.86it/s]
loss 0.06 accuracy 1.00:  51%|█████     | 952/1875 [00:30<00:20, 45.86it/s]
loss 0.03 accuracy 1.00:  51%|█████     | 952/1875 [00:30<00:20, 45.86it/s]
loss 0.28 accuracy 0.91:  51%|█████     | 952/1875 [00:30<00:20, 45.86it/s]
loss 0.10 accuracy 0.97:  51%|█████     | 952/1875 [00:30<00:20, 45.86it/s]
loss 0.10 accuracy 0.97:  51%|█████     | 957/1875 [00:30<00:20, 45.83it/s]
loss 0.20 accuracy 0.94:  51%|█████     | 957/1875 [00:30<00:20, 45.83it/s]
loss 0.09 accuracy 0.97:  51%|█████     | 957/1875 [00:30<00:20, 45.83it/s]
loss 0.19 accuracy 0.94:  51%|█████     | 957/1875 [00:30<00:20, 45.83it/s]
loss 0.13 accuracy 0.94:  51%|█████     | 957/1875 [00:30<00:20, 45.83it/s]
loss 0.03 accuracy 1.00:  51%|█████     | 957/1875 [00:30<00:20, 45.83it/s]
loss 0.03 accuracy 1.00:  51%|█████▏    | 962/1875 [00:30<00:19, 45.80it/s]
loss 0.19 accuracy 0.94:  51%|█████▏    | 962/1875 [00:30<00:19, 45.80it/s]
loss 0.42 accuracy 0.91:  51%|█████▏    | 962/1875 [00:31<00:19, 45.80it/s]
loss 0.22 accuracy 0.91:  51%|█████▏    | 962/1875 [00:31<00:19, 45.80it/s]
loss 0.09 accuracy 0.94:  51%|█████▏    | 962/1875 [00:31<00:19, 45.80it/s]
loss 0.22 accuracy 0.94:  51%|█████▏    | 962/1875 [00:31<00:19, 45.80it/s]
loss 0.22 accuracy 0.94:  52%|█████▏    | 967/1875 [00:31<00:19, 45.74it/s]
loss 0.08 accuracy 0.97:  52%|█████▏    | 967/1875 [00:31<00:19, 45.74it/s]
loss 0.19 accuracy 0.97:  52%|█████▏    | 967/1875 [00:31<00:19, 45.74it/s]
loss 0.18 accuracy 0.97:  52%|█████▏    | 967/1875 [00:31<00:19, 45.74it/s]
loss 0.04 accuracy 1.00:  52%|█████▏    | 967/1875 [00:31<00:19, 45.74it/s]
loss 0.11 accuracy 0.97:  52%|█████▏    | 967/1875 [00:31<00:19, 45.74it/s]
loss 0.11 accuracy 0.97:  52%|█████▏    | 972/1875 [00:31<00:19, 45.74it/s]
loss 0.03 accuracy 1.00:  52%|█████▏    | 972/1875 [00:31<00:19, 45.74it/s]
loss 0.16 accuracy 0.97:  52%|█████▏    | 972/1875 [00:31<00:19, 45.74it/s]
loss 0.26 accuracy 0.97:  52%|█████▏    | 972/1875 [00:31<00:19, 45.74it/s]
loss 0.06 accuracy 1.00:  52%|█████▏    | 972/1875 [00:31<00:19, 45.74it/s]
loss 0.15 accuracy 0.97:  52%|█████▏    | 972/1875 [00:31<00:19, 45.74it/s]
loss 0.15 accuracy 0.97:  52%|█████▏    | 977/1875 [00:31<00:19, 45.70it/s]
loss 0.27 accuracy 0.94:  52%|█████▏    | 977/1875 [00:31<00:19, 45.70it/s]
loss 0.07 accuracy 1.00:  52%|█████▏    | 977/1875 [00:31<00:19, 45.70it/s]
loss 0.11 accuracy 0.94:  52%|█████▏    | 977/1875 [00:31<00:19, 45.70it/s]
loss 0.08 accuracy 0.97:  52%|█████▏    | 977/1875 [00:31<00:19, 45.70it/s]
loss 0.13 accuracy 0.94:  52%|█████▏    | 977/1875 [00:31<00:19, 45.70it/s]
loss 0.13 accuracy 0.94:  52%|█████▏    | 982/1875 [00:31<00:19, 45.76it/s]
loss 0.41 accuracy 0.88:  52%|█████▏    | 982/1875 [00:31<00:19, 45.76it/s]
loss 0.16 accuracy 0.97:  52%|█████▏    | 982/1875 [00:31<00:19, 45.76it/s]
loss 0.18 accuracy 0.94:  52%|█████▏    | 982/1875 [00:31<00:19, 45.76it/s]
loss 0.04 accuracy 1.00:  52%|█████▏    | 982/1875 [00:31<00:19, 45.76it/s]
loss 0.08 accuracy 1.00:  52%|█████▏    | 982/1875 [00:31<00:19, 45.76it/s]
loss 0.08 accuracy 1.00:  53%|█████▎    | 987/1875 [00:31<00:19, 45.83it/s]
loss 0.08 accuracy 1.00:  53%|█████▎    | 987/1875 [00:31<00:19, 45.83it/s]
loss 0.14 accuracy 0.94:  53%|█████▎    | 987/1875 [00:31<00:19, 45.83it/s]
loss 0.40 accuracy 0.91:  53%|█████▎    | 987/1875 [00:31<00:19, 45.83it/s]
loss 0.12 accuracy 0.97:  53%|█████▎    | 987/1875 [00:31<00:19, 45.83it/s]
loss 0.05 accuracy 1.00:  53%|█████▎    | 987/1875 [00:31<00:19, 45.83it/s]
loss 0.05 accuracy 1.00:  53%|█████▎    | 992/1875 [00:31<00:19, 45.92it/s]
loss 0.03 accuracy 1.00:  53%|█████▎    | 992/1875 [00:31<00:19, 45.92it/s]
loss 0.12 accuracy 0.97:  53%|█████▎    | 992/1875 [00:31<00:19, 45.92it/s]
loss 0.09 accuracy 0.97:  53%|█████▎    | 992/1875 [00:31<00:19, 45.92it/s]
loss 0.11 accuracy 0.97:  53%|█████▎    | 992/1875 [00:31<00:19, 45.92it/s]
loss 0.04 accuracy 1.00:  53%|█████▎    | 992/1875 [00:31<00:19, 45.92it/s]
loss 0.04 accuracy 1.00:  53%|█████▎    | 997/1875 [00:31<00:19, 45.95it/s]
loss 0.06 accuracy 0.97:  53%|█████▎    | 997/1875 [00:31<00:19, 45.95it/s]
loss 0.06 accuracy 1.00:  53%|█████▎    | 997/1875 [00:31<00:19, 45.95it/s]
loss 0.15 accuracy 0.94:  53%|█████▎    | 997/1875 [00:31<00:19, 45.95it/s]
loss 0.06 accuracy 0.97:  53%|█████▎    | 997/1875 [00:31<00:19, 45.95it/s]
loss 0.06 accuracy 0.97:  53%|█████▎    | 997/1875 [00:31<00:19, 45.95it/s]
loss 0.06 accuracy 0.97:  53%|█████▎    | 1002/1875 [00:31<00:18, 45.99it/s]
loss 0.09 accuracy 0.97:  53%|█████▎    | 1002/1875 [00:31<00:18, 45.99it/s]
loss 0.03 accuracy 1.00:  53%|█████▎    | 1002/1875 [00:31<00:18, 45.99it/s]
loss 0.08 accuracy 0.97:  53%|█████▎    | 1002/1875 [00:31<00:18, 45.99it/s]
loss 0.14 accuracy 0.97:  53%|█████▎    | 1002/1875 [00:31<00:18, 45.99it/s]
loss 0.17 accuracy 0.94:  53%|█████▎    | 1002/1875 [00:31<00:18, 45.99it/s]
loss 0.17 accuracy 0.94:  54%|█████▎    | 1007/1875 [00:31<00:18, 45.98it/s]
loss 0.18 accuracy 0.97:  54%|█████▎    | 1007/1875 [00:31<00:18, 45.98it/s]
loss 0.09 accuracy 0.94:  54%|█████▎    | 1007/1875 [00:31<00:18, 45.98it/s]
loss 0.03 accuracy 1.00:  54%|█████▎    | 1007/1875 [00:32<00:18, 45.98it/s]
loss 0.05 accuracy 1.00:  54%|█████▎    | 1007/1875 [00:32<00:18, 45.98it/s]
loss 0.11 accuracy 0.97:  54%|█████▎    | 1007/1875 [00:32<00:18, 45.98it/s]
loss 0.11 accuracy 0.97:  54%|█████▍    | 1012/1875 [00:32<00:18, 46.03it/s]
loss 0.26 accuracy 0.88:  54%|█████▍    | 1012/1875 [00:32<00:18, 46.03it/s]
loss 0.33 accuracy 0.88:  54%|█████▍    | 1012/1875 [00:32<00:18, 46.03it/s]
loss 0.11 accuracy 0.97:  54%|█████▍    | 1012/1875 [00:32<00:18, 46.03it/s]
loss 0.13 accuracy 0.97:  54%|█████▍    | 1012/1875 [00:32<00:18, 46.03it/s]
loss 0.20 accuracy 0.94:  54%|█████▍    | 1012/1875 [00:32<00:18, 46.03it/s]
loss 0.20 accuracy 0.94:  54%|█████▍    | 1017/1875 [00:32<00:18, 46.03it/s]
loss 0.09 accuracy 0.97:  54%|█████▍    | 1017/1875 [00:32<00:18, 46.03it/s]
loss 0.11 accuracy 0.97:  54%|█████▍    | 1017/1875 [00:32<00:18, 46.03it/s]
loss 0.13 accuracy 0.97:  54%|█████▍    | 1017/1875 [00:32<00:18, 46.03it/s]
loss 0.09 accuracy 0.97:  54%|█████▍    | 1017/1875 [00:32<00:18, 46.03it/s]
loss 0.06 accuracy 1.00:  54%|█████▍    | 1017/1875 [00:32<00:18, 46.03it/s]
loss 0.06 accuracy 1.00:  55%|█████▍    | 1022/1875 [00:32<00:18, 46.02it/s]
loss 0.13 accuracy 0.94:  55%|█████▍    | 1022/1875 [00:32<00:18, 46.02it/s]
loss 0.05 accuracy 0.97:  55%|█████▍    | 1022/1875 [00:32<00:18, 46.02it/s]
loss 0.21 accuracy 0.91:  55%|█████▍    | 1022/1875 [00:32<00:18, 46.02it/s]
loss 0.03 accuracy 1.00:  55%|█████▍    | 1022/1875 [00:32<00:18, 46.02it/s]
loss 0.46 accuracy 0.88:  55%|█████▍    | 1022/1875 [00:32<00:18, 46.02it/s]
loss 0.46 accuracy 0.88:  55%|█████▍    | 1027/1875 [00:32<00:18, 45.91it/s]
loss 0.07 accuracy 0.97:  55%|█████▍    | 1027/1875 [00:32<00:18, 45.91it/s]
loss 0.16 accuracy 0.94:  55%|█████▍    | 1027/1875 [00:32<00:18, 45.91it/s]
loss 0.16 accuracy 0.97:  55%|█████▍    | 1027/1875 [00:32<00:18, 45.91it/s]
loss 0.15 accuracy 0.94:  55%|█████▍    | 1027/1875 [00:32<00:18, 45.91it/s]
loss 0.08 accuracy 1.00:  55%|█████▍    | 1027/1875 [00:32<00:18, 45.91it/s]
loss 0.08 accuracy 1.00:  55%|█████▌    | 1032/1875 [00:32<00:18, 45.83it/s]
loss 0.17 accuracy 0.94:  55%|█████▌    | 1032/1875 [00:32<00:18, 45.83it/s]
loss 0.12 accuracy 0.94:  55%|█████▌    | 1032/1875 [00:32<00:18, 45.83it/s]
loss 0.22 accuracy 0.91:  55%|█████▌    | 1032/1875 [00:32<00:18, 45.83it/s]
loss 0.45 accuracy 0.84:  55%|█████▌    | 1032/1875 [00:32<00:18, 45.83it/s]
loss 0.03 accuracy 1.00:  55%|█████▌    | 1032/1875 [00:32<00:18, 45.83it/s]
loss 0.03 accuracy 1.00:  55%|█████▌    | 1037/1875 [00:32<00:18, 45.85it/s]
loss 0.04 accuracy 1.00:  55%|█████▌    | 1037/1875 [00:32<00:18, 45.85it/s]
loss 0.06 accuracy 1.00:  55%|█████▌    | 1037/1875 [00:32<00:18, 45.85it/s]
loss 0.28 accuracy 0.84:  55%|█████▌    | 1037/1875 [00:32<00:18, 45.85it/s]
loss 0.09 accuracy 0.97:  55%|█████▌    | 1037/1875 [00:32<00:18, 45.85it/s]
loss 0.15 accuracy 0.91:  55%|█████▌    | 1037/1875 [00:32<00:18, 45.85it/s]
loss 0.15 accuracy 0.91:  56%|█████▌    | 1042/1875 [00:32<00:18, 45.75it/s]
loss 0.29 accuracy 0.91:  56%|█████▌    | 1042/1875 [00:32<00:18, 45.75it/s]
loss 0.12 accuracy 0.97:  56%|█████▌    | 1042/1875 [00:32<00:18, 45.75it/s]
loss 0.16 accuracy 0.91:  56%|█████▌    | 1042/1875 [00:32<00:18, 45.75it/s]
loss 0.16 accuracy 0.94:  56%|█████▌    | 1042/1875 [00:32<00:18, 45.75it/s]
loss 0.11 accuracy 0.97:  56%|█████▌    | 1042/1875 [00:32<00:18, 45.75it/s]
loss 0.11 accuracy 0.97:  56%|█████▌    | 1047/1875 [00:32<00:18, 45.73it/s]
loss 0.07 accuracy 1.00:  56%|█████▌    | 1047/1875 [00:32<00:18, 45.73it/s]
loss 0.25 accuracy 0.97:  56%|█████▌    | 1047/1875 [00:32<00:18, 45.73it/s]
loss 0.04 accuracy 1.00:  56%|█████▌    | 1047/1875 [00:32<00:18, 45.73it/s]
loss 0.20 accuracy 0.94:  56%|█████▌    | 1047/1875 [00:32<00:18, 45.73it/s]
loss 0.18 accuracy 0.94:  56%|█████▌    | 1047/1875 [00:32<00:18, 45.73it/s]
loss 0.18 accuracy 0.94:  56%|█████▌    | 1052/1875 [00:32<00:17, 45.74it/s]
loss 0.15 accuracy 0.97:  56%|█████▌    | 1052/1875 [00:32<00:17, 45.74it/s]
loss 0.10 accuracy 0.94:  56%|█████▌    | 1052/1875 [00:32<00:17, 45.74it/s]
loss 0.02 accuracy 1.00:  56%|█████▌    | 1052/1875 [00:32<00:17, 45.74it/s]
loss 0.05 accuracy 1.00:  56%|█████▌    | 1052/1875 [00:33<00:17, 45.74it/s]
loss 0.07 accuracy 0.97:  56%|█████▌    | 1052/1875 [00:33<00:17, 45.74it/s]
loss 0.07 accuracy 0.97:  56%|█████▋    | 1057/1875 [00:33<00:17, 45.73it/s]
loss 0.06 accuracy 0.97:  56%|█████▋    | 1057/1875 [00:33<00:17, 45.73it/s]
loss 0.17 accuracy 0.94:  56%|█████▋    | 1057/1875 [00:33<00:17, 45.73it/s]
loss 0.06 accuracy 1.00:  56%|█████▋    | 1057/1875 [00:33<00:17, 45.73it/s]
loss 0.22 accuracy 0.94:  56%|█████▋    | 1057/1875 [00:33<00:17, 45.73it/s]
loss 0.15 accuracy 0.97:  56%|█████▋    | 1057/1875 [00:33<00:17, 45.73it/s]
loss 0.15 accuracy 0.97:  57%|█████▋    | 1062/1875 [00:33<00:17, 45.78it/s]
loss 0.11 accuracy 0.97:  57%|█████▋    | 1062/1875 [00:33<00:17, 45.78it/s]
loss 0.02 accuracy 1.00:  57%|█████▋    | 1062/1875 [00:33<00:17, 45.78it/s]
loss 0.02 accuracy 1.00:  57%|█████▋    | 1062/1875 [00:33<00:17, 45.78it/s]
loss 0.09 accuracy 0.97:  57%|█████▋    | 1062/1875 [00:33<00:17, 45.78it/s]
loss 0.10 accuracy 0.97:  57%|█████▋    | 1062/1875 [00:33<00:17, 45.78it/s]
loss 0.10 accuracy 0.97:  57%|█████▋    | 1067/1875 [00:33<00:17, 45.87it/s]
loss 0.07 accuracy 0.97:  57%|█████▋    | 1067/1875 [00:33<00:17, 45.87it/s]
loss 0.21 accuracy 0.91:  57%|█████▋    | 1067/1875 [00:33<00:17, 45.87it/s]
loss 0.10 accuracy 1.00:  57%|█████▋    | 1067/1875 [00:33<00:17, 45.87it/s]
loss 0.15 accuracy 0.97:  57%|█████▋    | 1067/1875 [00:33<00:17, 45.87it/s]
loss 0.32 accuracy 0.97:  57%|█████▋    | 1067/1875 [00:33<00:17, 45.87it/s]
loss 0.32 accuracy 0.97:  57%|█████▋    | 1072/1875 [00:33<00:17, 45.93it/s]
loss 0.09 accuracy 0.97:  57%|█████▋    | 1072/1875 [00:33<00:17, 45.93it/s]
loss 0.09 accuracy 0.97:  57%|█████▋    | 1072/1875 [00:33<00:17, 45.93it/s]
loss 0.23 accuracy 0.97:  57%|█████▋    | 1072/1875 [00:33<00:17, 45.93it/s]
loss 0.14 accuracy 0.94:  57%|█████▋    | 1072/1875 [00:33<00:17, 45.93it/s]
loss 0.30 accuracy 0.97:  57%|█████▋    | 1072/1875 [00:33<00:17, 45.93it/s]
loss 0.30 accuracy 0.97:  57%|█████▋    | 1077/1875 [00:33<00:17, 45.98it/s]
loss 0.11 accuracy 1.00:  57%|█████▋    | 1077/1875 [00:33<00:17, 45.98it/s]
loss 0.09 accuracy 0.97:  57%|█████▋    | 1077/1875 [00:33<00:17, 45.98it/s]
loss 0.06 accuracy 1.00:  57%|█████▋    | 1077/1875 [00:33<00:17, 45.98it/s]
loss 0.11 accuracy 0.94:  57%|█████▋    | 1077/1875 [00:33<00:17, 45.98it/s]
loss 0.21 accuracy 0.94:  57%|█████▋    | 1077/1875 [00:33<00:17, 45.98it/s]
loss 0.21 accuracy 0.94:  58%|█████▊    | 1082/1875 [00:33<00:17, 46.04it/s]
loss 0.05 accuracy 1.00:  58%|█████▊    | 1082/1875 [00:33<00:17, 46.04it/s]
loss 0.08 accuracy 0.97:  58%|█████▊    | 1082/1875 [00:33<00:17, 46.04it/s]
loss 0.14 accuracy 0.97:  58%|█████▊    | 1082/1875 [00:33<00:17, 46.04it/s]
loss 0.22 accuracy 0.94:  58%|█████▊    | 1082/1875 [00:33<00:17, 46.04it/s]
loss 0.11 accuracy 1.00:  58%|█████▊    | 1082/1875 [00:33<00:17, 46.04it/s]
loss 0.11 accuracy 1.00:  58%|█████▊    | 1087/1875 [00:33<00:17, 46.08it/s]
loss 0.13 accuracy 0.97:  58%|█████▊    | 1087/1875 [00:33<00:17, 46.08it/s]
loss 0.05 accuracy 0.97:  58%|█████▊    | 1087/1875 [00:33<00:17, 46.08it/s]
loss 0.16 accuracy 0.91:  58%|█████▊    | 1087/1875 [00:33<00:17, 46.08it/s]
loss 0.05 accuracy 1.00:  58%|█████▊    | 1087/1875 [00:33<00:17, 46.08it/s]
loss 0.20 accuracy 0.91:  58%|█████▊    | 1087/1875 [00:33<00:17, 46.08it/s]
loss 0.20 accuracy 0.91:  58%|█████▊    | 1092/1875 [00:33<00:17, 46.05it/s]
loss 0.08 accuracy 0.97:  58%|█████▊    | 1092/1875 [00:33<00:17, 46.05it/s]
loss 0.04 accuracy 1.00:  58%|█████▊    | 1092/1875 [00:33<00:17, 46.05it/s]
loss 0.14 accuracy 0.94:  58%|█████▊    | 1092/1875 [00:33<00:17, 46.05it/s]
loss 0.06 accuracy 1.00:  58%|█████▊    | 1092/1875 [00:33<00:17, 46.05it/s]
loss 0.04 accuracy 1.00:  58%|█████▊    | 1092/1875 [00:33<00:17, 46.05it/s]
loss 0.04 accuracy 1.00:  59%|█████▊    | 1097/1875 [00:33<00:16, 46.08it/s]
loss 0.02 accuracy 1.00:  59%|█████▊    | 1097/1875 [00:33<00:16, 46.08it/s]
loss 0.29 accuracy 0.91:  59%|█████▊    | 1097/1875 [00:33<00:16, 46.08it/s]
loss 0.03 accuracy 1.00:  59%|█████▊    | 1097/1875 [00:33<00:16, 46.08it/s]
loss 0.13 accuracy 0.97:  59%|█████▊    | 1097/1875 [00:33<00:16, 46.08it/s]
loss 0.06 accuracy 1.00:  59%|█████▊    | 1097/1875 [00:34<00:16, 46.08it/s]
loss 0.06 accuracy 1.00:  59%|█████▉    | 1102/1875 [00:34<00:16, 46.08it/s]
loss 0.14 accuracy 0.97:  59%|█████▉    | 1102/1875 [00:34<00:16, 46.08it/s]
loss 0.31 accuracy 0.91:  59%|█████▉    | 1102/1875 [00:34<00:16, 46.08it/s]
loss 0.22 accuracy 0.94:  59%|█████▉    | 1102/1875 [00:34<00:16, 46.08it/s]
loss 0.14 accuracy 0.97:  59%|█████▉    | 1102/1875 [00:34<00:16, 46.08it/s]
loss 0.10 accuracy 0.97:  59%|█████▉    | 1102/1875 [00:34<00:16, 46.08it/s]
loss 0.10 accuracy 0.97:  59%|█████▉    | 1107/1875 [00:34<00:16, 46.07it/s]
loss 0.07 accuracy 1.00:  59%|█████▉    | 1107/1875 [00:34<00:16, 46.07it/s]
loss 0.04 accuracy 1.00:  59%|█████▉    | 1107/1875 [00:34<00:16, 46.07it/s]
loss 0.20 accuracy 0.91:  59%|█████▉    | 1107/1875 [00:34<00:16, 46.07it/s]
loss 0.27 accuracy 0.94:  59%|█████▉    | 1107/1875 [00:34<00:16, 46.07it/s]
loss 0.24 accuracy 0.97:  59%|█████▉    | 1107/1875 [00:34<00:16, 46.07it/s]
loss 0.24 accuracy 0.97:  59%|█████▉    | 1112/1875 [00:34<00:16, 46.07it/s]
loss 0.18 accuracy 0.94:  59%|█████▉    | 1112/1875 [00:34<00:16, 46.07it/s]
loss 0.04 accuracy 1.00:  59%|█████▉    | 1112/1875 [00:34<00:16, 46.07it/s]
loss 0.11 accuracy 0.94:  59%|█████▉    | 1112/1875 [00:34<00:16, 46.07it/s]
loss 0.04 accuracy 1.00:  59%|█████▉    | 1112/1875 [00:34<00:16, 46.07it/s]
loss 0.04 accuracy 1.00:  59%|█████▉    | 1112/1875 [00:34<00:16, 46.07it/s]
loss 0.04 accuracy 1.00:  60%|█████▉    | 1117/1875 [00:34<00:16, 46.07it/s]
loss 0.11 accuracy 0.97:  60%|█████▉    | 1117/1875 [00:34<00:16, 46.07it/s]
loss 0.06 accuracy 1.00:  60%|█████▉    | 1117/1875 [00:34<00:16, 46.07it/s]
loss 0.44 accuracy 0.84:  60%|█████▉    | 1117/1875 [00:34<00:16, 46.07it/s]
loss 0.17 accuracy 0.94:  60%|█████▉    | 1117/1875 [00:34<00:16, 46.07it/s]
loss 0.17 accuracy 0.94:  60%|█████▉    | 1117/1875 [00:34<00:16, 46.07it/s]
loss 0.17 accuracy 0.94:  60%|█████▉    | 1122/1875 [00:34<00:16, 46.01it/s]
loss 0.11 accuracy 0.97:  60%|█████▉    | 1122/1875 [00:34<00:16, 46.01it/s]
loss 0.20 accuracy 0.91:  60%|█████▉    | 1122/1875 [00:34<00:16, 46.01it/s]
loss 0.06 accuracy 1.00:  60%|█████▉    | 1122/1875 [00:34<00:16, 46.01it/s]
loss 0.27 accuracy 0.88:  60%|█████▉    | 1122/1875 [00:34<00:16, 46.01it/s]
loss 0.31 accuracy 0.91:  60%|█████▉    | 1122/1875 [00:34<00:16, 46.01it/s]
loss 0.31 accuracy 0.91:  60%|██████    | 1127/1875 [00:34<00:16, 45.87it/s]
loss 0.12 accuracy 0.97:  60%|██████    | 1127/1875 [00:34<00:16, 45.87it/s]
loss 0.11 accuracy 0.97:  60%|██████    | 1127/1875 [00:34<00:16, 45.87it/s]
loss 0.09 accuracy 0.97:  60%|██████    | 1127/1875 [00:34<00:16, 45.87it/s]
loss 0.20 accuracy 0.91:  60%|██████    | 1127/1875 [00:34<00:16, 45.87it/s]
loss 0.08 accuracy 1.00:  60%|██████    | 1127/1875 [00:34<00:16, 45.87it/s]
loss 0.08 accuracy 1.00:  60%|██████    | 1132/1875 [00:34<00:16, 45.88it/s]
loss 0.11 accuracy 0.97:  60%|██████    | 1132/1875 [00:34<00:16, 45.88it/s]
loss 0.24 accuracy 0.94:  60%|██████    | 1132/1875 [00:34<00:16, 45.88it/s]
loss 0.27 accuracy 0.88:  60%|██████    | 1132/1875 [00:34<00:16, 45.88it/s]
loss 0.04 accuracy 1.00:  60%|██████    | 1132/1875 [00:34<00:16, 45.88it/s]
loss 0.09 accuracy 0.97:  60%|██████    | 1132/1875 [00:34<00:16, 45.88it/s]
loss 0.09 accuracy 0.97:  61%|██████    | 1137/1875 [00:34<00:16, 45.89it/s]
loss 0.12 accuracy 0.97:  61%|██████    | 1137/1875 [00:34<00:16, 45.89it/s]
loss 0.13 accuracy 0.97:  61%|██████    | 1137/1875 [00:34<00:16, 45.89it/s]
loss 0.34 accuracy 0.97:  61%|██████    | 1137/1875 [00:34<00:16, 45.89it/s]
loss 0.07 accuracy 0.97:  61%|██████    | 1137/1875 [00:34<00:16, 45.89it/s]
loss 0.39 accuracy 0.84:  61%|██████    | 1137/1875 [00:34<00:16, 45.89it/s]
loss 0.39 accuracy 0.84:  61%|██████    | 1142/1875 [00:34<00:16, 45.77it/s]
loss 0.14 accuracy 0.97:  61%|██████    | 1142/1875 [00:34<00:16, 45.77it/s]
loss 0.11 accuracy 0.94:  61%|██████    | 1142/1875 [00:34<00:16, 45.77it/s]
loss 0.12 accuracy 0.97:  61%|██████    | 1142/1875 [00:34<00:16, 45.77it/s]
loss 0.17 accuracy 0.94:  61%|██████    | 1142/1875 [00:34<00:16, 45.77it/s]
loss 0.22 accuracy 0.91:  61%|██████    | 1142/1875 [00:34<00:16, 45.77it/s]
loss 0.22 accuracy 0.91:  61%|██████    | 1147/1875 [00:34<00:15, 45.76it/s]
loss 0.14 accuracy 0.94:  61%|██████    | 1147/1875 [00:35<00:15, 45.76it/s]
loss 0.13 accuracy 0.94:  61%|██████    | 1147/1875 [00:35<00:15, 45.76it/s]
loss 0.03 accuracy 1.00:  61%|██████    | 1147/1875 [00:35<00:15, 45.76it/s]
loss 0.10 accuracy 0.94:  61%|██████    | 1147/1875 [00:35<00:15, 45.76it/s]
loss 0.05 accuracy 1.00:  61%|██████    | 1147/1875 [00:35<00:15, 45.76it/s]
loss 0.05 accuracy 1.00:  61%|██████▏   | 1152/1875 [00:35<00:15, 45.73it/s]
loss 0.08 accuracy 0.97:  61%|██████▏   | 1152/1875 [00:35<00:15, 45.73it/s]
loss 0.08 accuracy 1.00:  61%|██████▏   | 1152/1875 [00:35<00:15, 45.73it/s]
loss 0.03 accuracy 1.00:  61%|██████▏   | 1152/1875 [00:35<00:15, 45.73it/s]
loss 0.30 accuracy 0.94:  61%|██████▏   | 1152/1875 [00:35<00:15, 45.73it/s]
loss 0.13 accuracy 0.94:  61%|██████▏   | 1152/1875 [00:35<00:15, 45.73it/s]
loss 0.13 accuracy 0.94:  62%|██████▏   | 1157/1875 [00:35<00:15, 45.73it/s]
loss 0.06 accuracy 1.00:  62%|██████▏   | 1157/1875 [00:35<00:15, 45.73it/s]
loss 0.17 accuracy 0.97:  62%|██████▏   | 1157/1875 [00:35<00:15, 45.73it/s]
loss 0.23 accuracy 0.91:  62%|██████▏   | 1157/1875 [00:35<00:15, 45.73it/s]
loss 0.12 accuracy 0.94:  62%|██████▏   | 1157/1875 [00:35<00:15, 45.73it/s]
loss 0.06 accuracy 1.00:  62%|██████▏   | 1157/1875 [00:35<00:15, 45.73it/s]
loss 0.06 accuracy 1.00:  62%|██████▏   | 1162/1875 [00:35<00:15, 45.81it/s]
loss 0.19 accuracy 0.94:  62%|██████▏   | 1162/1875 [00:35<00:15, 45.81it/s]
loss 0.07 accuracy 0.97:  62%|██████▏   | 1162/1875 [00:35<00:15, 45.81it/s]
loss 0.09 accuracy 1.00:  62%|██████▏   | 1162/1875 [00:35<00:15, 45.81it/s]
loss 0.21 accuracy 0.94:  62%|██████▏   | 1162/1875 [00:35<00:15, 45.81it/s]
loss 0.19 accuracy 0.91:  62%|██████▏   | 1162/1875 [00:35<00:15, 45.81it/s]
loss 0.19 accuracy 0.91:  62%|██████▏   | 1167/1875 [00:35<00:15, 45.85it/s]
loss 0.11 accuracy 0.97:  62%|██████▏   | 1167/1875 [00:35<00:15, 45.85it/s]
loss 0.23 accuracy 0.94:  62%|██████▏   | 1167/1875 [00:35<00:15, 45.85it/s]
loss 0.13 accuracy 0.97:  62%|██████▏   | 1167/1875 [00:35<00:15, 45.85it/s]
loss 0.06 accuracy 0.97:  62%|██████▏   | 1167/1875 [00:35<00:15, 45.85it/s]
loss 0.09 accuracy 0.97:  62%|██████▏   | 1167/1875 [00:35<00:15, 45.85it/s]
loss 0.09 accuracy 0.97:  63%|██████▎   | 1172/1875 [00:35<00:15, 45.96it/s]
loss 0.11 accuracy 0.97:  63%|██████▎   | 1172/1875 [00:35<00:15, 45.96it/s]
loss 0.16 accuracy 0.97:  63%|██████▎   | 1172/1875 [00:35<00:15, 45.96it/s]
loss 0.06 accuracy 0.97:  63%|██████▎   | 1172/1875 [00:35<00:15, 45.96it/s]
loss 0.06 accuracy 1.00:  63%|██████▎   | 1172/1875 [00:35<00:15, 45.96it/s]
loss 0.13 accuracy 0.97:  63%|██████▎   | 1172/1875 [00:35<00:15, 45.96it/s]
loss 0.13 accuracy 0.97:  63%|██████▎   | 1177/1875 [00:35<00:15, 45.99it/s]
loss 0.26 accuracy 0.91:  63%|██████▎   | 1177/1875 [00:35<00:15, 45.99it/s]
loss 0.28 accuracy 0.94:  63%|██████▎   | 1177/1875 [00:35<00:15, 45.99it/s]
loss 0.06 accuracy 1.00:  63%|██████▎   | 1177/1875 [00:35<00:15, 45.99it/s]
loss 0.08 accuracy 0.97:  63%|██████▎   | 1177/1875 [00:35<00:15, 45.99it/s]
loss 0.10 accuracy 0.97:  63%|██████▎   | 1177/1875 [00:35<00:15, 45.99it/s]
loss 0.10 accuracy 0.97:  63%|██████▎   | 1182/1875 [00:35<00:15, 46.04it/s]
loss 0.35 accuracy 0.88:  63%|██████▎   | 1182/1875 [00:35<00:15, 46.04it/s]
loss 0.09 accuracy 0.97:  63%|██████▎   | 1182/1875 [00:35<00:15, 46.04it/s]
loss 0.34 accuracy 0.91:  63%|██████▎   | 1182/1875 [00:35<00:15, 46.04it/s]
loss 0.08 accuracy 0.97:  63%|██████▎   | 1182/1875 [00:35<00:15, 46.04it/s]
loss 0.18 accuracy 0.94:  63%|██████▎   | 1182/1875 [00:35<00:15, 46.04it/s]
loss 0.18 accuracy 0.94:  63%|██████▎   | 1187/1875 [00:35<00:14, 46.06it/s]
loss 0.28 accuracy 0.91:  63%|██████▎   | 1187/1875 [00:35<00:14, 46.06it/s]
loss 0.14 accuracy 0.97:  63%|██████▎   | 1187/1875 [00:35<00:14, 46.06it/s]
loss 0.04 accuracy 1.00:  63%|██████▎   | 1187/1875 [00:35<00:14, 46.06it/s]
loss 0.08 accuracy 1.00:  63%|██████▎   | 1187/1875 [00:35<00:14, 46.06it/s]
loss 0.18 accuracy 0.94:  63%|██████▎   | 1187/1875 [00:35<00:14, 46.06it/s]
loss 0.18 accuracy 0.94:  64%|██████▎   | 1192/1875 [00:35<00:14, 46.07it/s]
loss 0.21 accuracy 0.94:  64%|██████▎   | 1192/1875 [00:35<00:14, 46.07it/s]
loss 0.08 accuracy 0.97:  64%|██████▎   | 1192/1875 [00:36<00:14, 46.07it/s]
loss 0.26 accuracy 0.94:  64%|██████▎   | 1192/1875 [00:36<00:14, 46.07it/s]
loss 0.04 accuracy 1.00:  64%|██████▎   | 1192/1875 [00:36<00:14, 46.07it/s]
loss 0.04 accuracy 1.00:  64%|██████▎   | 1192/1875 [00:36<00:14, 46.07it/s]
loss 0.04 accuracy 1.00:  64%|██████▍   | 1197/1875 [00:36<00:14, 46.14it/s]
loss 0.05 accuracy 1.00:  64%|██████▍   | 1197/1875 [00:36<00:14, 46.14it/s]
loss 0.19 accuracy 0.91:  64%|██████▍   | 1197/1875 [00:36<00:14, 46.14it/s]
loss 0.11 accuracy 1.00:  64%|██████▍   | 1197/1875 [00:36<00:14, 46.14it/s]
loss 0.20 accuracy 0.91:  64%|██████▍   | 1197/1875 [00:36<00:14, 46.14it/s]
loss 0.13 accuracy 0.94:  64%|██████▍   | 1197/1875 [00:36<00:14, 46.14it/s]
loss 0.13 accuracy 0.94:  64%|██████▍   | 1202/1875 [00:36<00:14, 46.16it/s]
loss 0.05 accuracy 0.97:  64%|██████▍   | 1202/1875 [00:36<00:14, 46.16it/s]
loss 0.11 accuracy 0.97:  64%|██████▍   | 1202/1875 [00:36<00:14, 46.16it/s]
loss 0.09 accuracy 0.97:  64%|██████▍   | 1202/1875 [00:36<00:14, 46.16it/s]
loss 0.08 accuracy 0.97:  64%|██████▍   | 1202/1875 [00:36<00:14, 46.16it/s]
loss 0.13 accuracy 0.94:  64%|██████▍   | 1202/1875 [00:36<00:14, 46.16it/s]
loss 0.13 accuracy 0.94:  64%|██████▍   | 1207/1875 [00:36<00:14, 46.15it/s]
loss 0.11 accuracy 0.97:  64%|██████▍   | 1207/1875 [00:36<00:14, 46.15it/s]
loss 0.09 accuracy 0.97:  64%|██████▍   | 1207/1875 [00:36<00:14, 46.15it/s]
loss 0.11 accuracy 0.94:  64%|██████▍   | 1207/1875 [00:36<00:14, 46.15it/s]
loss 0.06 accuracy 1.00:  64%|██████▍   | 1207/1875 [00:36<00:14, 46.15it/s]
loss 0.43 accuracy 0.94:  64%|██████▍   | 1207/1875 [00:36<00:14, 46.15it/s]
loss 0.43 accuracy 0.94:  65%|██████▍   | 1212/1875 [00:36<00:14, 46.19it/s]
loss 0.11 accuracy 0.94:  65%|██████▍   | 1212/1875 [00:36<00:14, 46.19it/s]
loss 0.09 accuracy 1.00:  65%|██████▍   | 1212/1875 [00:36<00:14, 46.19it/s]
loss 0.10 accuracy 0.97:  65%|██████▍   | 1212/1875 [00:36<00:14, 46.19it/s]
loss 0.06 accuracy 0.97:  65%|██████▍   | 1212/1875 [00:36<00:14, 46.19it/s]
loss 0.22 accuracy 0.94:  65%|██████▍   | 1212/1875 [00:36<00:14, 46.19it/s]
loss 0.22 accuracy 0.94:  65%|██████▍   | 1217/1875 [00:36<00:14, 46.18it/s]
loss 0.11 accuracy 0.97:  65%|██████▍   | 1217/1875 [00:36<00:14, 46.18it/s]
loss 0.17 accuracy 0.94:  65%|██████▍   | 1217/1875 [00:36<00:14, 46.18it/s]
loss 0.14 accuracy 0.91:  65%|██████▍   | 1217/1875 [00:36<00:14, 46.18it/s]
loss 0.03 accuracy 1.00:  65%|██████▍   | 1217/1875 [00:36<00:14, 46.18it/s]
loss 0.04 accuracy 1.00:  65%|██████▍   | 1217/1875 [00:36<00:14, 46.18it/s]
loss 0.04 accuracy 1.00:  65%|██████▌   | 1222/1875 [00:36<00:14, 46.17it/s]
loss 0.09 accuracy 0.97:  65%|██████▌   | 1222/1875 [00:36<00:14, 46.17it/s]
loss 0.18 accuracy 0.94:  65%|██████▌   | 1222/1875 [00:36<00:14, 46.17it/s]
loss 0.03 accuracy 1.00:  65%|██████▌   | 1222/1875 [00:36<00:14, 46.17it/s]
loss 0.24 accuracy 0.97:  65%|██████▌   | 1222/1875 [00:36<00:14, 46.17it/s]
loss 0.04 accuracy 1.00:  65%|██████▌   | 1222/1875 [00:36<00:14, 46.17it/s]
loss 0.04 accuracy 1.00:  65%|██████▌   | 1227/1875 [00:36<00:14, 46.15it/s]
loss 0.10 accuracy 1.00:  65%|██████▌   | 1227/1875 [00:36<00:14, 46.15it/s]
loss 0.20 accuracy 0.91:  65%|██████▌   | 1227/1875 [00:36<00:14, 46.15it/s]
loss 0.07 accuracy 0.97:  65%|██████▌   | 1227/1875 [00:36<00:14, 46.15it/s]
loss 0.07 accuracy 0.97:  65%|██████▌   | 1227/1875 [00:36<00:14, 46.15it/s]
loss 0.04 accuracy 1.00:  65%|██████▌   | 1227/1875 [00:36<00:14, 46.15it/s]
loss 0.04 accuracy 1.00:  66%|██████▌   | 1232/1875 [00:36<00:13, 46.15it/s]
loss 0.05 accuracy 1.00:  66%|██████▌   | 1232/1875 [00:36<00:13, 46.15it/s]
loss 0.06 accuracy 0.97:  66%|██████▌   | 1232/1875 [00:36<00:13, 46.15it/s]
loss 0.30 accuracy 0.94:  66%|██████▌   | 1232/1875 [00:36<00:13, 46.15it/s]
loss 0.10 accuracy 0.97:  66%|██████▌   | 1232/1875 [00:36<00:13, 46.15it/s]
loss 0.20 accuracy 0.94:  66%|██████▌   | 1232/1875 [00:36<00:13, 46.15it/s]
loss 0.20 accuracy 0.94:  66%|██████▌   | 1237/1875 [00:36<00:13, 46.15it/s]
loss 0.06 accuracy 0.97:  66%|██████▌   | 1237/1875 [00:36<00:13, 46.15it/s]
loss 0.22 accuracy 0.91:  66%|██████▌   | 1237/1875 [00:36<00:13, 46.15it/s]
loss 0.05 accuracy 1.00:  66%|██████▌   | 1237/1875 [00:37<00:13, 46.15it/s]
loss 0.22 accuracy 0.97:  66%|██████▌   | 1237/1875 [00:37<00:13, 46.15it/s]
loss 0.24 accuracy 0.91:  66%|██████▌   | 1237/1875 [00:37<00:13, 46.15it/s]
loss 0.24 accuracy 0.91:  66%|██████▌   | 1242/1875 [00:37<00:13, 46.13it/s]
loss 0.06 accuracy 0.97:  66%|██████▌   | 1242/1875 [00:37<00:13, 46.13it/s]
loss 0.07 accuracy 1.00:  66%|██████▌   | 1242/1875 [00:37<00:13, 46.13it/s]
loss 0.06 accuracy 1.00:  66%|██████▌   | 1242/1875 [00:37<00:13, 46.13it/s]
loss 0.09 accuracy 0.97:  66%|██████▌   | 1242/1875 [00:37<00:13, 46.13it/s]
loss 0.11 accuracy 0.97:  66%|██████▌   | 1242/1875 [00:37<00:13, 46.13it/s]
loss 0.11 accuracy 0.97:  67%|██████▋   | 1247/1875 [00:37<00:13, 46.09it/s]
loss 0.13 accuracy 0.94:  67%|██████▋   | 1247/1875 [00:37<00:13, 46.09it/s]
loss 0.02 accuracy 1.00:  67%|██████▋   | 1247/1875 [00:37<00:13, 46.09it/s]
loss 0.04 accuracy 1.00:  67%|██████▋   | 1247/1875 [00:37<00:13, 46.09it/s]
loss 0.14 accuracy 0.97:  67%|██████▋   | 1247/1875 [00:37<00:13, 46.09it/s]
loss 0.05 accuracy 1.00:  67%|██████▋   | 1247/1875 [00:37<00:13, 46.09it/s]
loss 0.05 accuracy 1.00:  67%|██████▋   | 1252/1875 [00:37<00:13, 46.06it/s]
loss 0.08 accuracy 0.97:  67%|██████▋   | 1252/1875 [00:37<00:13, 46.06it/s]
loss 0.05 accuracy 1.00:  67%|██████▋   | 1252/1875 [00:37<00:13, 46.06it/s]
loss 0.17 accuracy 0.91:  67%|██████▋   | 1252/1875 [00:37<00:13, 46.06it/s]
loss 0.06 accuracy 1.00:  67%|██████▋   | 1252/1875 [00:37<00:13, 46.06it/s]
loss 0.02 accuracy 1.00:  67%|██████▋   | 1252/1875 [00:37<00:13, 46.06it/s]
loss 0.02 accuracy 1.00:  67%|██████▋   | 1257/1875 [00:37<00:13, 46.05it/s]
loss 0.07 accuracy 1.00:  67%|██████▋   | 1257/1875 [00:37<00:13, 46.05it/s]
loss 0.04 accuracy 1.00:  67%|██████▋   | 1257/1875 [00:37<00:13, 46.05it/s]
loss 0.08 accuracy 0.97:  67%|██████▋   | 1257/1875 [00:37<00:13, 46.05it/s]
loss 0.06 accuracy 1.00:  67%|██████▋   | 1257/1875 [00:37<00:13, 46.05it/s]
loss 0.16 accuracy 0.97:  67%|██████▋   | 1257/1875 [00:37<00:13, 46.05it/s]
loss 0.16 accuracy 0.97:  67%|██████▋   | 1262/1875 [00:37<00:13, 46.01it/s]
loss 0.16 accuracy 0.94:  67%|██████▋   | 1262/1875 [00:37<00:13, 46.01it/s]
loss 0.02 accuracy 1.00:  67%|██████▋   | 1262/1875 [00:37<00:13, 46.01it/s]
loss 0.16 accuracy 0.94:  67%|██████▋   | 1262/1875 [00:37<00:13, 46.01it/s]
loss 0.05 accuracy 1.00:  67%|██████▋   | 1262/1875 [00:37<00:13, 46.01it/s]
loss 0.07 accuracy 0.97:  67%|██████▋   | 1262/1875 [00:37<00:13, 46.01it/s]
loss 0.07 accuracy 0.97:  68%|██████▊   | 1267/1875 [00:37<00:13, 45.86it/s]
loss 0.04 accuracy 1.00:  68%|██████▊   | 1267/1875 [00:37<00:13, 45.86it/s]
loss 0.24 accuracy 0.91:  68%|██████▊   | 1267/1875 [00:37<00:13, 45.86it/s]
loss 0.04 accuracy 1.00:  68%|██████▊   | 1267/1875 [00:37<00:13, 45.86it/s]
loss 0.04 accuracy 1.00:  68%|██████▊   | 1267/1875 [00:37<00:13, 45.86it/s]
loss 0.03 accuracy 1.00:  68%|██████▊   | 1267/1875 [00:37<00:13, 45.86it/s]
loss 0.03 accuracy 1.00:  68%|██████▊   | 1272/1875 [00:37<00:13, 45.84it/s]
loss 0.05 accuracy 1.00:  68%|██████▊   | 1272/1875 [00:37<00:13, 45.84it/s]
loss 0.12 accuracy 0.97:  68%|██████▊   | 1272/1875 [00:37<00:13, 45.84it/s]
loss 0.26 accuracy 0.91:  68%|██████▊   | 1272/1875 [00:37<00:13, 45.84it/s]
loss 0.26 accuracy 0.94:  68%|██████▊   | 1272/1875 [00:37<00:13, 45.84it/s]
loss 0.07 accuracy 1.00:  68%|██████▊   | 1272/1875 [00:37<00:13, 45.84it/s]
loss 0.07 accuracy 1.00:  68%|██████▊   | 1277/1875 [00:37<00:13, 45.79it/s]
loss 0.24 accuracy 0.94:  68%|██████▊   | 1277/1875 [00:37<00:13, 45.79it/s]
loss 0.04 accuracy 1.00:  68%|██████▊   | 1277/1875 [00:37<00:13, 45.79it/s]
loss 0.17 accuracy 0.97:  68%|██████▊   | 1277/1875 [00:37<00:13, 45.79it/s]
loss 0.12 accuracy 0.94:  68%|██████▊   | 1277/1875 [00:37<00:13, 45.79it/s]
loss 0.22 accuracy 0.97:  68%|██████▊   | 1277/1875 [00:37<00:13, 45.79it/s]
loss 0.22 accuracy 0.97:  68%|██████▊   | 1282/1875 [00:37<00:12, 45.72it/s]
loss 0.04 accuracy 1.00:  68%|██████▊   | 1282/1875 [00:37<00:12, 45.72it/s]
loss 0.18 accuracy 0.94:  68%|██████▊   | 1282/1875 [00:37<00:12, 45.72it/s]
loss 0.16 accuracy 0.94:  68%|██████▊   | 1282/1875 [00:37<00:12, 45.72it/s]
loss 0.13 accuracy 0.97:  68%|██████▊   | 1282/1875 [00:38<00:12, 45.72it/s]
loss 0.14 accuracy 0.94:  68%|██████▊   | 1282/1875 [00:38<00:12, 45.72it/s]
loss 0.14 accuracy 0.94:  69%|██████▊   | 1287/1875 [00:38<00:12, 45.77it/s]
loss 0.06 accuracy 1.00:  69%|██████▊   | 1287/1875 [00:38<00:12, 45.77it/s]
loss 0.15 accuracy 0.94:  69%|██████▊   | 1287/1875 [00:38<00:12, 45.77it/s]
loss 0.23 accuracy 0.94:  69%|██████▊   | 1287/1875 [00:38<00:12, 45.77it/s]
loss 0.04 accuracy 1.00:  69%|██████▊   | 1287/1875 [00:38<00:12, 45.77it/s]
loss 0.10 accuracy 0.97:  69%|██████▊   | 1287/1875 [00:38<00:12, 45.77it/s]
loss 0.10 accuracy 0.97:  69%|██████▉   | 1292/1875 [00:38<00:12, 45.69it/s]
loss 0.14 accuracy 0.94:  69%|██████▉   | 1292/1875 [00:38<00:12, 45.69it/s]
loss 0.32 accuracy 0.94:  69%|██████▉   | 1292/1875 [00:38<00:12, 45.69it/s]
loss 0.07 accuracy 0.97:  69%|██████▉   | 1292/1875 [00:38<00:12, 45.69it/s]
loss 0.22 accuracy 0.91:  69%|██████▉   | 1292/1875 [00:38<00:12, 45.69it/s]
loss 0.24 accuracy 0.94:  69%|██████▉   | 1292/1875 [00:38<00:12, 45.69it/s]
loss 0.24 accuracy 0.94:  69%|██████▉   | 1297/1875 [00:38<00:12, 45.74it/s]
loss 0.06 accuracy 1.00:  69%|██████▉   | 1297/1875 [00:38<00:12, 45.74it/s]
loss 0.13 accuracy 0.97:  69%|██████▉   | 1297/1875 [00:38<00:12, 45.74it/s]
loss 0.26 accuracy 0.94:  69%|██████▉   | 1297/1875 [00:38<00:12, 45.74it/s]
loss 0.20 accuracy 0.97:  69%|██████▉   | 1297/1875 [00:38<00:12, 45.74it/s]
loss 0.20 accuracy 0.97:  69%|██████▉   | 1297/1875 [00:38<00:12, 45.74it/s]
loss 0.20 accuracy 0.97:  69%|██████▉   | 1302/1875 [00:38<00:12, 45.82it/s]
loss 0.09 accuracy 0.97:  69%|██████▉   | 1302/1875 [00:38<00:12, 45.82it/s]
loss 0.09 accuracy 0.97:  69%|██████▉   | 1302/1875 [00:38<00:12, 45.82it/s]
loss 0.06 accuracy 0.97:  69%|██████▉   | 1302/1875 [00:38<00:12, 45.82it/s]
loss 0.18 accuracy 0.97:  69%|██████▉   | 1302/1875 [00:38<00:12, 45.82it/s]
loss 0.07 accuracy 1.00:  69%|██████▉   | 1302/1875 [00:38<00:12, 45.82it/s]
loss 0.07 accuracy 1.00:  70%|██████▉   | 1307/1875 [00:38<00:12, 45.90it/s]
loss 0.03 accuracy 1.00:  70%|██████▉   | 1307/1875 [00:38<00:12, 45.90it/s]
loss 0.22 accuracy 0.91:  70%|██████▉   | 1307/1875 [00:38<00:12, 45.90it/s]
loss 0.04 accuracy 1.00:  70%|██████▉   | 1307/1875 [00:38<00:12, 45.90it/s]
loss 0.19 accuracy 0.94:  70%|██████▉   | 1307/1875 [00:38<00:12, 45.90it/s]
loss 0.05 accuracy 1.00:  70%|██████▉   | 1307/1875 [00:38<00:12, 45.90it/s]
loss 0.05 accuracy 1.00:  70%|██████▉   | 1312/1875 [00:38<00:12, 45.98it/s]
loss 0.16 accuracy 0.97:  70%|██████▉   | 1312/1875 [00:38<00:12, 45.98it/s]
loss 0.30 accuracy 0.91:  70%|██████▉   | 1312/1875 [00:38<00:12, 45.98it/s]
loss 0.12 accuracy 0.97:  70%|██████▉   | 1312/1875 [00:38<00:12, 45.98it/s]
loss 0.21 accuracy 0.97:  70%|██████▉   | 1312/1875 [00:38<00:12, 45.98it/s]
loss 0.30 accuracy 0.94:  70%|██████▉   | 1312/1875 [00:38<00:12, 45.98it/s]
loss 0.30 accuracy 0.94:  70%|███████   | 1317/1875 [00:38<00:12, 46.01it/s]
loss 0.15 accuracy 0.94:  70%|███████   | 1317/1875 [00:38<00:12, 46.01it/s]
loss 0.06 accuracy 1.00:  70%|███████   | 1317/1875 [00:38<00:12, 46.01it/s]
loss 0.16 accuracy 0.97:  70%|███████   | 1317/1875 [00:38<00:12, 46.01it/s]
loss 0.02 accuracy 1.00:  70%|███████   | 1317/1875 [00:38<00:12, 46.01it/s]
loss 0.34 accuracy 0.88:  70%|███████   | 1317/1875 [00:38<00:12, 46.01it/s]
loss 0.34 accuracy 0.88:  71%|███████   | 1322/1875 [00:38<00:12, 46.06it/s]
loss 0.04 accuracy 1.00:  71%|███████   | 1322/1875 [00:38<00:12, 46.06it/s]
loss 0.10 accuracy 0.97:  71%|███████   | 1322/1875 [00:38<00:12, 46.06it/s]
loss 0.06 accuracy 1.00:  71%|███████   | 1322/1875 [00:38<00:12, 46.06it/s]
loss 0.04 accuracy 1.00:  71%|███████   | 1322/1875 [00:38<00:12, 46.06it/s]
loss 0.37 accuracy 0.94:  71%|███████   | 1322/1875 [00:38<00:12, 46.06it/s]
loss 0.37 accuracy 0.94:  71%|███████   | 1327/1875 [00:38<00:11, 46.07it/s]
loss 0.06 accuracy 1.00:  71%|███████   | 1327/1875 [00:38<00:11, 46.07it/s]
loss 0.05 accuracy 1.00:  71%|███████   | 1327/1875 [00:38<00:11, 46.07it/s]
loss 0.04 accuracy 1.00:  71%|███████   | 1327/1875 [00:38<00:11, 46.07it/s]
loss 0.07 accuracy 0.97:  71%|███████   | 1327/1875 [00:38<00:11, 46.07it/s]
loss 0.03 accuracy 1.00:  71%|███████   | 1327/1875 [00:39<00:11, 46.07it/s]
loss 0.03 accuracy 1.00:  71%|███████   | 1332/1875 [00:39<00:11, 46.07it/s]
loss 0.14 accuracy 0.94:  71%|███████   | 1332/1875 [00:39<00:11, 46.07it/s]
loss 0.13 accuracy 0.97:  71%|███████   | 1332/1875 [00:39<00:11, 46.07it/s]
loss 0.11 accuracy 0.97:  71%|███████   | 1332/1875 [00:39<00:11, 46.07it/s]
loss 0.16 accuracy 0.97:  71%|███████   | 1332/1875 [00:39<00:11, 46.07it/s]
loss 0.02 accuracy 1.00:  71%|███████   | 1332/1875 [00:39<00:11, 46.07it/s]
loss 0.02 accuracy 1.00:  71%|███████▏  | 1337/1875 [00:39<00:11, 46.09it/s]
loss 0.09 accuracy 0.97:  71%|███████▏  | 1337/1875 [00:39<00:11, 46.09it/s]
loss 0.10 accuracy 1.00:  71%|███████▏  | 1337/1875 [00:39<00:11, 46.09it/s]
loss 0.05 accuracy 1.00:  71%|███████▏  | 1337/1875 [00:39<00:11, 46.09it/s]
loss 0.18 accuracy 0.97:  71%|███████▏  | 1337/1875 [00:39<00:11, 46.09it/s]
loss 0.06 accuracy 1.00:  71%|███████▏  | 1337/1875 [00:39<00:11, 46.09it/s]
loss 0.06 accuracy 1.00:  72%|███████▏  | 1342/1875 [00:39<00:11, 46.05it/s]
loss 0.18 accuracy 0.94:  72%|███████▏  | 1342/1875 [00:39<00:11, 46.05it/s]
loss 0.17 accuracy 0.94:  72%|███████▏  | 1342/1875 [00:39<00:11, 46.05it/s]
loss 0.04 accuracy 1.00:  72%|███████▏  | 1342/1875 [00:39<00:11, 46.05it/s]
loss 0.10 accuracy 0.97:  72%|███████▏  | 1342/1875 [00:39<00:11, 46.05it/s]
loss 0.07 accuracy 0.97:  72%|███████▏  | 1342/1875 [00:39<00:11, 46.05it/s]
loss 0.07 accuracy 0.97:  72%|███████▏  | 1347/1875 [00:39<00:11, 46.00it/s]
loss 0.22 accuracy 0.91:  72%|███████▏  | 1347/1875 [00:39<00:11, 46.00it/s]
loss 0.02 accuracy 1.00:  72%|███████▏  | 1347/1875 [00:39<00:11, 46.00it/s]
loss 0.03 accuracy 1.00:  72%|███████▏  | 1347/1875 [00:39<00:11, 46.00it/s]
loss 0.03 accuracy 1.00:  72%|███████▏  | 1347/1875 [00:39<00:11, 46.00it/s]
loss 0.07 accuracy 1.00:  72%|███████▏  | 1347/1875 [00:39<00:11, 46.00it/s]
loss 0.07 accuracy 1.00:  72%|███████▏  | 1352/1875 [00:39<00:11, 45.86it/s]
loss 0.13 accuracy 0.94:  72%|███████▏  | 1352/1875 [00:39<00:11, 45.86it/s]
loss 0.24 accuracy 0.94:  72%|███████▏  | 1352/1875 [00:39<00:11, 45.86it/s]
loss 0.05 accuracy 1.00:  72%|███████▏  | 1352/1875 [00:39<00:11, 45.86it/s]
loss 0.12 accuracy 0.97:  72%|███████▏  | 1352/1875 [00:39<00:11, 45.86it/s]
loss 0.07 accuracy 1.00:  72%|███████▏  | 1352/1875 [00:39<00:11, 45.86it/s]
loss 0.07 accuracy 1.00:  72%|███████▏  | 1357/1875 [00:39<00:11, 45.82it/s]
loss 0.04 accuracy 1.00:  72%|███████▏  | 1357/1875 [00:39<00:11, 45.82it/s]
loss 0.12 accuracy 0.97:  72%|███████▏  | 1357/1875 [00:39<00:11, 45.82it/s]
loss 0.20 accuracy 0.94:  72%|███████▏  | 1357/1875 [00:39<00:11, 45.82it/s]
loss 0.11 accuracy 0.94:  72%|███████▏  | 1357/1875 [00:39<00:11, 45.82it/s]
loss 0.03 accuracy 1.00:  72%|███████▏  | 1357/1875 [00:39<00:11, 45.82it/s]
loss 0.03 accuracy 1.00:  73%|███████▎  | 1362/1875 [00:39<00:11, 45.84it/s]
loss 0.10 accuracy 0.97:  73%|███████▎  | 1362/1875 [00:39<00:11, 45.84it/s]
loss 0.07 accuracy 0.97:  73%|███████▎  | 1362/1875 [00:39<00:11, 45.84it/s]
loss 0.11 accuracy 0.97:  73%|███████▎  | 1362/1875 [00:39<00:11, 45.84it/s]
loss 0.38 accuracy 0.91:  73%|███████▎  | 1362/1875 [00:39<00:11, 45.84it/s]
loss 0.01 accuracy 1.00:  73%|███████▎  | 1362/1875 [00:39<00:11, 45.84it/s]
loss 0.01 accuracy 1.00:  73%|███████▎  | 1367/1875 [00:39<00:11, 45.70it/s]
loss 0.10 accuracy 0.94:  73%|███████▎  | 1367/1875 [00:39<00:11, 45.70it/s]
loss 0.16 accuracy 0.94:  73%|███████▎  | 1367/1875 [00:39<00:11, 45.70it/s]
loss 0.07 accuracy 0.97:  73%|███████▎  | 1367/1875 [00:39<00:11, 45.70it/s]
loss 0.23 accuracy 0.94:  73%|███████▎  | 1367/1875 [00:39<00:11, 45.70it/s]
loss 0.04 accuracy 1.00:  73%|███████▎  | 1367/1875 [00:39<00:11, 45.70it/s]
loss 0.04 accuracy 1.00:  73%|███████▎  | 1372/1875 [00:39<00:11, 45.72it/s]
loss 0.08 accuracy 0.97:  73%|███████▎  | 1372/1875 [00:39<00:11, 45.72it/s]
loss 0.35 accuracy 0.91:  73%|███████▎  | 1372/1875 [00:39<00:11, 45.72it/s]
loss 0.03 accuracy 1.00:  73%|███████▎  | 1372/1875 [00:39<00:11, 45.72it/s]
loss 0.18 accuracy 0.94:  73%|███████▎  | 1372/1875 [00:39<00:11, 45.72it/s]
loss 0.09 accuracy 0.97:  73%|███████▎  | 1372/1875 [00:39<00:11, 45.72it/s]
loss 0.09 accuracy 0.97:  73%|███████▎  | 1377/1875 [00:39<00:10, 45.70it/s]
loss 0.14 accuracy 0.97:  73%|███████▎  | 1377/1875 [00:40<00:10, 45.70it/s]
loss 0.15 accuracy 0.94:  73%|███████▎  | 1377/1875 [00:40<00:10, 45.70it/s]
loss 0.05 accuracy 1.00:  73%|███████▎  | 1377/1875 [00:40<00:10, 45.70it/s]
loss 0.29 accuracy 0.94:  73%|███████▎  | 1377/1875 [00:40<00:10, 45.70it/s]
loss 0.06 accuracy 0.97:  73%|███████▎  | 1377/1875 [00:40<00:10, 45.70it/s]
loss 0.06 accuracy 0.97:  74%|███████▎  | 1382/1875 [00:40<00:10, 45.74it/s]
loss 0.17 accuracy 0.94:  74%|███████▎  | 1382/1875 [00:40<00:10, 45.74it/s]
loss 0.06 accuracy 0.97:  74%|███████▎  | 1382/1875 [00:40<00:10, 45.74it/s]
loss 0.08 accuracy 0.97:  74%|███████▎  | 1382/1875 [00:40<00:10, 45.74it/s]
loss 0.06 accuracy 0.97:  74%|███████▎  | 1382/1875 [00:40<00:10, 45.74it/s]
loss 0.05 accuracy 1.00:  74%|███████▎  | 1382/1875 [00:40<00:10, 45.74it/s]
loss 0.05 accuracy 1.00:  74%|███████▍  | 1387/1875 [00:40<00:10, 45.82it/s]
loss 0.06 accuracy 1.00:  74%|███████▍  | 1387/1875 [00:40<00:10, 45.82it/s]
loss 0.20 accuracy 0.97:  74%|███████▍  | 1387/1875 [00:40<00:10, 45.82it/s]
loss 0.10 accuracy 0.97:  74%|███████▍  | 1387/1875 [00:40<00:10, 45.82it/s]
loss 0.20 accuracy 0.97:  74%|███████▍  | 1387/1875 [00:40<00:10, 45.82it/s]
loss 0.21 accuracy 0.91:  74%|███████▍  | 1387/1875 [00:40<00:10, 45.82it/s]
loss 0.21 accuracy 0.91:  74%|███████▍  | 1392/1875 [00:40<00:10, 45.90it/s]
loss 0.07 accuracy 0.97:  74%|███████▍  | 1392/1875 [00:40<00:10, 45.90it/s]
loss 0.28 accuracy 0.88:  74%|███████▍  | 1392/1875 [00:40<00:10, 45.90it/s]
loss 0.08 accuracy 0.97:  74%|███████▍  | 1392/1875 [00:40<00:10, 45.90it/s]
loss 0.08 accuracy 0.97:  74%|███████▍  | 1392/1875 [00:40<00:10, 45.90it/s]
loss 0.03 accuracy 1.00:  74%|███████▍  | 1392/1875 [00:40<00:10, 45.90it/s]
loss 0.03 accuracy 1.00:  75%|███████▍  | 1397/1875 [00:40<00:10, 45.94it/s]
loss 0.11 accuracy 0.94:  75%|███████▍  | 1397/1875 [00:40<00:10, 45.94it/s]
loss 0.07 accuracy 0.97:  75%|███████▍  | 1397/1875 [00:40<00:10, 45.94it/s]
loss 0.11 accuracy 0.97:  75%|███████▍  | 1397/1875 [00:40<00:10, 45.94it/s]
loss 0.02 accuracy 1.00:  75%|███████▍  | 1397/1875 [00:40<00:10, 45.94it/s]
loss 0.07 accuracy 0.97:  75%|███████▍  | 1397/1875 [00:40<00:10, 45.94it/s]
loss 0.07 accuracy 0.97:  75%|███████▍  | 1402/1875 [00:40<00:10, 46.03it/s]
loss 0.21 accuracy 0.91:  75%|███████▍  | 1402/1875 [00:40<00:10, 46.03it/s]
loss 0.03 accuracy 1.00:  75%|███████▍  | 1402/1875 [00:40<00:10, 46.03it/s]
loss 0.31 accuracy 0.94:  75%|███████▍  | 1402/1875 [00:40<00:10, 46.03it/s]
loss 0.12 accuracy 0.97:  75%|███████▍  | 1402/1875 [00:40<00:10, 46.03it/s]
loss 0.09 accuracy 0.97:  75%|███████▍  | 1402/1875 [00:40<00:10, 46.03it/s]
loss 0.09 accuracy 0.97:  75%|███████▌  | 1407/1875 [00:40<00:10, 46.07it/s]
loss 0.08 accuracy 1.00:  75%|███████▌  | 1407/1875 [00:40<00:10, 46.07it/s]
loss 0.10 accuracy 1.00:  75%|███████▌  | 1407/1875 [00:40<00:10, 46.07it/s]
loss 0.35 accuracy 0.91:  75%|███████▌  | 1407/1875 [00:40<00:10, 46.07it/s]
loss 0.06 accuracy 1.00:  75%|███████▌  | 1407/1875 [00:40<00:10, 46.07it/s]
loss 0.09 accuracy 0.97:  75%|███████▌  | 1407/1875 [00:40<00:10, 46.07it/s]
loss 0.09 accuracy 0.97:  75%|███████▌  | 1412/1875 [00:40<00:10, 46.08it/s]
loss 0.08 accuracy 0.97:  75%|███████▌  | 1412/1875 [00:40<00:10, 46.08it/s]
loss 0.21 accuracy 0.94:  75%|███████▌  | 1412/1875 [00:40<00:10, 46.08it/s]
loss 0.08 accuracy 1.00:  75%|███████▌  | 1412/1875 [00:40<00:10, 46.08it/s]
loss 0.06 accuracy 1.00:  75%|███████▌  | 1412/1875 [00:40<00:10, 46.08it/s]
loss 0.10 accuracy 0.97:  75%|███████▌  | 1412/1875 [00:40<00:10, 46.08it/s]
loss 0.10 accuracy 0.97:  76%|███████▌  | 1417/1875 [00:40<00:09, 46.10it/s]
loss 0.16 accuracy 0.97:  76%|███████▌  | 1417/1875 [00:40<00:09, 46.10it/s]
loss 0.15 accuracy 0.94:  76%|███████▌  | 1417/1875 [00:40<00:09, 46.10it/s]
loss 0.24 accuracy 0.94:  76%|███████▌  | 1417/1875 [00:40<00:09, 46.10it/s]
loss 0.06 accuracy 1.00:  76%|███████▌  | 1417/1875 [00:40<00:09, 46.10it/s]
loss 0.34 accuracy 0.94:  76%|███████▌  | 1417/1875 [00:40<00:09, 46.10it/s]
loss 0.34 accuracy 0.94:  76%|███████▌  | 1422/1875 [00:40<00:09, 46.08it/s]
loss 0.16 accuracy 0.97:  76%|███████▌  | 1422/1875 [00:40<00:09, 46.08it/s]
loss 0.04 accuracy 1.00:  76%|███████▌  | 1422/1875 [00:41<00:09, 46.08it/s]
loss 0.14 accuracy 0.94:  76%|███████▌  | 1422/1875 [00:41<00:09, 46.08it/s]
loss 0.03 accuracy 1.00:  76%|███████▌  | 1422/1875 [00:41<00:09, 46.08it/s]
loss 0.02 accuracy 1.00:  76%|███████▌  | 1422/1875 [00:41<00:09, 46.08it/s]
loss 0.02 accuracy 1.00:  76%|███████▌  | 1427/1875 [00:41<00:09, 46.06it/s]
loss 0.08 accuracy 1.00:  76%|███████▌  | 1427/1875 [00:41<00:09, 46.06it/s]
loss 0.18 accuracy 0.94:  76%|███████▌  | 1427/1875 [00:41<00:09, 46.06it/s]
loss 0.05 accuracy 1.00:  76%|███████▌  | 1427/1875 [00:41<00:09, 46.06it/s]
loss 0.19 accuracy 0.94:  76%|███████▌  | 1427/1875 [00:41<00:09, 46.06it/s]
loss 0.08 accuracy 0.97:  76%|███████▌  | 1427/1875 [00:41<00:09, 46.06it/s]
loss 0.08 accuracy 0.97:  76%|███████▋  | 1432/1875 [00:41<00:09, 46.03it/s]
loss 0.19 accuracy 0.97:  76%|███████▋  | 1432/1875 [00:41<00:09, 46.03it/s]
loss 0.07 accuracy 0.97:  76%|███████▋  | 1432/1875 [00:41<00:09, 46.03it/s]
loss 0.26 accuracy 0.94:  76%|███████▋  | 1432/1875 [00:41<00:09, 46.03it/s]
loss 0.36 accuracy 0.84:  76%|███████▋  | 1432/1875 [00:41<00:09, 46.03it/s]
loss 0.02 accuracy 1.00:  76%|███████▋  | 1432/1875 [00:41<00:09, 46.03it/s]
loss 0.02 accuracy 1.00:  77%|███████▋  | 1437/1875 [00:41<00:09, 45.95it/s]
loss 0.16 accuracy 0.97:  77%|███████▋  | 1437/1875 [00:41<00:09, 45.95it/s]
loss 0.04 accuracy 1.00:  77%|███████▋  | 1437/1875 [00:41<00:09, 45.95it/s]
loss 0.19 accuracy 0.94:  77%|███████▋  | 1437/1875 [00:41<00:09, 45.95it/s]
loss 0.05 accuracy 1.00:  77%|███████▋  | 1437/1875 [00:41<00:09, 45.95it/s]
loss 0.05 accuracy 1.00:  77%|███████▋  | 1437/1875 [00:41<00:09, 45.95it/s]
loss 0.05 accuracy 1.00:  77%|███████▋  | 1442/1875 [00:41<00:09, 45.83it/s]
loss 0.06 accuracy 1.00:  77%|███████▋  | 1442/1875 [00:41<00:09, 45.83it/s]
loss 0.11 accuracy 0.94:  77%|███████▋  | 1442/1875 [00:41<00:09, 45.83it/s]
loss 0.06 accuracy 0.97:  77%|███████▋  | 1442/1875 [00:41<00:09, 45.83it/s]
loss 0.26 accuracy 0.94:  77%|███████▋  | 1442/1875 [00:41<00:09, 45.83it/s]
loss 0.16 accuracy 0.97:  77%|███████▋  | 1442/1875 [00:41<00:09, 45.83it/s]
loss 0.16 accuracy 0.97:  77%|███████▋  | 1447/1875 [00:41<00:09, 45.82it/s]
loss 0.11 accuracy 0.97:  77%|███████▋  | 1447/1875 [00:41<00:09, 45.82it/s]
loss 0.02 accuracy 1.00:  77%|███████▋  | 1447/1875 [00:41<00:09, 45.82it/s]
loss 0.03 accuracy 1.00:  77%|███████▋  | 1447/1875 [00:41<00:09, 45.82it/s]
loss 0.05 accuracy 1.00:  77%|███████▋  | 1447/1875 [00:41<00:09, 45.82it/s]
loss 0.28 accuracy 0.94:  77%|███████▋  | 1447/1875 [00:41<00:09, 45.82it/s]
loss 0.28 accuracy 0.94:  77%|███████▋  | 1452/1875 [00:41<00:09, 45.67it/s]
loss 0.03 accuracy 1.00:  77%|███████▋  | 1452/1875 [00:41<00:09, 45.67it/s]
loss 0.11 accuracy 0.97:  77%|███████▋  | 1452/1875 [00:41<00:09, 45.67it/s]
loss 0.20 accuracy 0.94:  77%|███████▋  | 1452/1875 [00:41<00:09, 45.67it/s]
loss 0.07 accuracy 0.97:  77%|███████▋  | 1452/1875 [00:41<00:09, 45.67it/s]
loss 0.22 accuracy 0.94:  77%|███████▋  | 1452/1875 [00:41<00:09, 45.67it/s]
loss 0.22 accuracy 0.94:  78%|███████▊  | 1457/1875 [00:41<00:09, 45.71it/s]
loss 0.19 accuracy 0.97:  78%|███████▊  | 1457/1875 [00:41<00:09, 45.71it/s]
loss 0.06 accuracy 1.00:  78%|███████▊  | 1457/1875 [00:41<00:09, 45.71it/s]
loss 0.17 accuracy 0.97:  78%|███████▊  | 1457/1875 [00:41<00:09, 45.71it/s]
loss 0.10 accuracy 0.97:  78%|███████▊  | 1457/1875 [00:41<00:09, 45.71it/s]
loss 0.04 accuracy 1.00:  78%|███████▊  | 1457/1875 [00:41<00:09, 45.71it/s]
loss 0.04 accuracy 1.00:  78%|███████▊  | 1462/1875 [00:41<00:09, 45.68it/s]
loss 0.15 accuracy 0.94:  78%|███████▊  | 1462/1875 [00:41<00:09, 45.68it/s]
loss 0.03 accuracy 1.00:  78%|███████▊  | 1462/1875 [00:41<00:09, 45.68it/s]
loss 0.17 accuracy 0.94:  78%|███████▊  | 1462/1875 [00:41<00:09, 45.68it/s]
loss 0.10 accuracy 0.97:  78%|███████▊  | 1462/1875 [00:41<00:09, 45.68it/s]
loss 0.19 accuracy 0.91:  78%|███████▊  | 1462/1875 [00:41<00:09, 45.68it/s]
loss 0.19 accuracy 0.91:  78%|███████▊  | 1467/1875 [00:41<00:08, 45.70it/s]
loss 0.03 accuracy 1.00:  78%|███████▊  | 1467/1875 [00:41<00:08, 45.70it/s]
loss 0.15 accuracy 0.94:  78%|███████▊  | 1467/1875 [00:41<00:08, 45.70it/s]
loss 0.02 accuracy 1.00:  78%|███████▊  | 1467/1875 [00:42<00:08, 45.70it/s]
loss 0.20 accuracy 0.94:  78%|███████▊  | 1467/1875 [00:42<00:08, 45.70it/s]
loss 0.10 accuracy 0.97:  78%|███████▊  | 1467/1875 [00:42<00:08, 45.70it/s]
loss 0.10 accuracy 0.97:  79%|███████▊  | 1472/1875 [00:42<00:08, 45.77it/s]
loss 0.16 accuracy 0.94:  79%|███████▊  | 1472/1875 [00:42<00:08, 45.77it/s]
loss 0.20 accuracy 0.97:  79%|███████▊  | 1472/1875 [00:42<00:08, 45.77it/s]
loss 0.08 accuracy 0.97:  79%|███████▊  | 1472/1875 [00:42<00:08, 45.77it/s]
loss 0.14 accuracy 0.97:  79%|███████▊  | 1472/1875 [00:42<00:08, 45.77it/s]
loss 0.27 accuracy 0.97:  79%|███████▊  | 1472/1875 [00:42<00:08, 45.77it/s]
loss 0.27 accuracy 0.97:  79%|███████▉  | 1477/1875 [00:42<00:08, 45.87it/s]
loss 0.32 accuracy 0.94:  79%|███████▉  | 1477/1875 [00:42<00:08, 45.87it/s]
loss 0.19 accuracy 0.94:  79%|███████▉  | 1477/1875 [00:42<00:08, 45.87it/s]
loss 0.13 accuracy 0.94:  79%|███████▉  | 1477/1875 [00:42<00:08, 45.87it/s]
loss 0.19 accuracy 0.94:  79%|███████▉  | 1477/1875 [00:42<00:08, 45.87it/s]
loss 0.26 accuracy 0.91:  79%|███████▉  | 1477/1875 [00:42<00:08, 45.87it/s]
loss 0.26 accuracy 0.91:  79%|███████▉  | 1482/1875 [00:42<00:08, 45.91it/s]
loss 0.05 accuracy 0.97:  79%|███████▉  | 1482/1875 [00:42<00:08, 45.91it/s]
loss 0.10 accuracy 0.97:  79%|███████▉  | 1482/1875 [00:42<00:08, 45.91it/s]
loss 0.10 accuracy 0.97:  79%|███████▉  | 1482/1875 [00:42<00:08, 45.91it/s]
loss 0.07 accuracy 1.00:  79%|███████▉  | 1482/1875 [00:42<00:08, 45.91it/s]
loss 0.24 accuracy 0.94:  79%|███████▉  | 1482/1875 [00:42<00:08, 45.91it/s]
loss 0.24 accuracy 0.94:  79%|███████▉  | 1487/1875 [00:42<00:08, 45.92it/s]
loss 0.08 accuracy 0.97:  79%|███████▉  | 1487/1875 [00:42<00:08, 45.92it/s]
loss 0.09 accuracy 0.97:  79%|███████▉  | 1487/1875 [00:42<00:08, 45.92it/s]
loss 0.13 accuracy 0.94:  79%|███████▉  | 1487/1875 [00:42<00:08, 45.92it/s]
loss 0.03 accuracy 1.00:  79%|███████▉  | 1487/1875 [00:42<00:08, 45.92it/s]
loss 0.05 accuracy 1.00:  79%|███████▉  | 1487/1875 [00:42<00:08, 45.92it/s]
loss 0.05 accuracy 1.00:  80%|███████▉  | 1492/1875 [00:42<00:08, 45.95it/s]
loss 0.11 accuracy 0.97:  80%|███████▉  | 1492/1875 [00:42<00:08, 45.95it/s]
loss 0.09 accuracy 0.97:  80%|███████▉  | 1492/1875 [00:42<00:08, 45.95it/s]
loss 0.20 accuracy 0.94:  80%|███████▉  | 1492/1875 [00:42<00:08, 45.95it/s]
loss 0.13 accuracy 0.91:  80%|███████▉  | 1492/1875 [00:42<00:08, 45.95it/s]
loss 0.17 accuracy 0.97:  80%|███████▉  | 1492/1875 [00:42<00:08, 45.95it/s]
loss 0.17 accuracy 0.97:  80%|███████▉  | 1497/1875 [00:42<00:08, 45.98it/s]
loss 0.03 accuracy 1.00:  80%|███████▉  | 1497/1875 [00:42<00:08, 45.98it/s]
loss 0.26 accuracy 0.91:  80%|███████▉  | 1497/1875 [00:42<00:08, 45.98it/s]
loss 0.05 accuracy 0.97:  80%|███████▉  | 1497/1875 [00:42<00:08, 45.98it/s]
loss 0.03 accuracy 1.00:  80%|███████▉  | 1497/1875 [00:42<00:08, 45.98it/s]
loss 0.11 accuracy 0.97:  80%|███████▉  | 1497/1875 [00:42<00:08, 45.98it/s]
loss 0.11 accuracy 0.97:  80%|████████  | 1502/1875 [00:42<00:08, 45.93it/s]
loss 0.06 accuracy 0.97:  80%|████████  | 1502/1875 [00:42<00:08, 45.93it/s]
loss 0.07 accuracy 0.97:  80%|████████  | 1502/1875 [00:42<00:08, 45.93it/s]
loss 0.19 accuracy 0.97:  80%|████████  | 1502/1875 [00:42<00:08, 45.93it/s]
loss 0.19 accuracy 0.94:  80%|████████  | 1502/1875 [00:42<00:08, 45.93it/s]
loss 0.04 accuracy 1.00:  80%|████████  | 1502/1875 [00:42<00:08, 45.93it/s]
loss 0.04 accuracy 1.00:  80%|████████  | 1507/1875 [00:42<00:08, 45.87it/s]
loss 0.19 accuracy 0.94:  80%|████████  | 1507/1875 [00:42<00:08, 45.87it/s]
loss 0.05 accuracy 0.97:  80%|████████  | 1507/1875 [00:42<00:08, 45.87it/s]
loss 0.09 accuracy 0.97:  80%|████████  | 1507/1875 [00:42<00:08, 45.87it/s]
loss 0.20 accuracy 0.94:  80%|████████  | 1507/1875 [00:42<00:08, 45.87it/s]
loss 0.07 accuracy 0.97:  80%|████████  | 1507/1875 [00:42<00:08, 45.87it/s]
loss 0.07 accuracy 0.97:  81%|████████  | 1512/1875 [00:42<00:07, 45.84it/s]
loss 0.05 accuracy 1.00:  81%|████████  | 1512/1875 [00:42<00:07, 45.84it/s]
loss 0.04 accuracy 1.00:  81%|████████  | 1512/1875 [00:42<00:07, 45.84it/s]
loss 0.08 accuracy 0.97:  81%|████████  | 1512/1875 [00:42<00:07, 45.84it/s]
loss 0.06 accuracy 0.97:  81%|████████  | 1512/1875 [00:43<00:07, 45.84it/s]
loss 0.09 accuracy 0.94:  81%|████████  | 1512/1875 [00:43<00:07, 45.84it/s]
loss 0.09 accuracy 0.94:  81%|████████  | 1517/1875 [00:43<00:07, 45.87it/s]
loss 0.15 accuracy 0.97:  81%|████████  | 1517/1875 [00:43<00:07, 45.87it/s]
loss 0.13 accuracy 0.94:  81%|████████  | 1517/1875 [00:43<00:07, 45.87it/s]
loss 0.02 accuracy 1.00:  81%|████████  | 1517/1875 [00:43<00:07, 45.87it/s]
loss 0.13 accuracy 0.94:  81%|████████  | 1517/1875 [00:43<00:07, 45.87it/s]
loss 0.25 accuracy 0.97:  81%|████████  | 1517/1875 [00:43<00:07, 45.87it/s]
loss 0.25 accuracy 0.97:  81%|████████  | 1522/1875 [00:43<00:07, 45.84it/s]
loss 0.21 accuracy 0.94:  81%|████████  | 1522/1875 [00:43<00:07, 45.84it/s]
loss 0.05 accuracy 1.00:  81%|████████  | 1522/1875 [00:43<00:07, 45.84it/s]
loss 0.03 accuracy 1.00:  81%|████████  | 1522/1875 [00:43<00:07, 45.84it/s]
loss 0.02 accuracy 1.00:  81%|████████  | 1522/1875 [00:43<00:07, 45.84it/s]
loss 0.29 accuracy 0.91:  81%|████████  | 1522/1875 [00:43<00:07, 45.84it/s]
loss 0.29 accuracy 0.91:  81%|████████▏ | 1527/1875 [00:43<00:07, 45.75it/s]
loss 0.14 accuracy 0.91:  81%|████████▏ | 1527/1875 [00:43<00:07, 45.75it/s]
loss 0.05 accuracy 1.00:  81%|████████▏ | 1527/1875 [00:43<00:07, 45.75it/s]
loss 0.08 accuracy 0.97:  81%|████████▏ | 1527/1875 [00:43<00:07, 45.75it/s]
loss 0.08 accuracy 0.97:  81%|████████▏ | 1527/1875 [00:43<00:07, 45.75it/s]
loss 0.08 accuracy 1.00:  81%|████████▏ | 1527/1875 [00:43<00:07, 45.75it/s]
loss 0.08 accuracy 1.00:  82%|████████▏ | 1532/1875 [00:43<00:07, 45.75it/s]
loss 0.15 accuracy 1.00:  82%|████████▏ | 1532/1875 [00:43<00:07, 45.75it/s]
loss 0.16 accuracy 0.97:  82%|████████▏ | 1532/1875 [00:43<00:07, 45.75it/s]
loss 0.22 accuracy 0.97:  82%|████████▏ | 1532/1875 [00:43<00:07, 45.75it/s]
loss 0.13 accuracy 0.94:  82%|████████▏ | 1532/1875 [00:43<00:07, 45.75it/s]
loss 0.05 accuracy 0.97:  82%|████████▏ | 1532/1875 [00:43<00:07, 45.75it/s]
loss 0.05 accuracy 0.97:  82%|████████▏ | 1537/1875 [00:43<00:07, 45.73it/s]
loss 0.03 accuracy 1.00:  82%|████████▏ | 1537/1875 [00:43<00:07, 45.73it/s]
loss 0.11 accuracy 0.94:  82%|████████▏ | 1537/1875 [00:43<00:07, 45.73it/s]
loss 0.04 accuracy 1.00:  82%|████████▏ | 1537/1875 [00:43<00:07, 45.73it/s]
loss 0.07 accuracy 0.97:  82%|████████▏ | 1537/1875 [00:43<00:07, 45.73it/s]
loss 0.08 accuracy 0.97:  82%|████████▏ | 1537/1875 [00:43<00:07, 45.73it/s]
loss 0.08 accuracy 0.97:  82%|████████▏ | 1542/1875 [00:43<00:07, 45.76it/s]
loss 0.21 accuracy 0.94:  82%|████████▏ | 1542/1875 [00:43<00:07, 45.76it/s]
loss 0.10 accuracy 0.97:  82%|████████▏ | 1542/1875 [00:43<00:07, 45.76it/s]
loss 0.03 accuracy 1.00:  82%|████████▏ | 1542/1875 [00:43<00:07, 45.76it/s]
loss 0.03 accuracy 1.00:  82%|████████▏ | 1542/1875 [00:43<00:07, 45.76it/s]
loss 0.20 accuracy 0.91:  82%|████████▏ | 1542/1875 [00:43<00:07, 45.76it/s]
loss 0.20 accuracy 0.91:  83%|████████▎ | 1547/1875 [00:43<00:07, 45.84it/s]
loss 0.11 accuracy 0.94:  83%|████████▎ | 1547/1875 [00:43<00:07, 45.84it/s]
loss 0.32 accuracy 0.94:  83%|████████▎ | 1547/1875 [00:43<00:07, 45.84it/s]
loss 0.04 accuracy 1.00:  83%|████████▎ | 1547/1875 [00:43<00:07, 45.84it/s]
loss 0.06 accuracy 0.97:  83%|████████▎ | 1547/1875 [00:43<00:07, 45.84it/s]
loss 0.02 accuracy 1.00:  83%|████████▎ | 1547/1875 [00:43<00:07, 45.84it/s]
loss 0.02 accuracy 1.00:  83%|████████▎ | 1552/1875 [00:43<00:07, 45.91it/s]
loss 0.02 accuracy 1.00:  83%|████████▎ | 1552/1875 [00:43<00:07, 45.91it/s]
loss 0.10 accuracy 0.97:  83%|████████▎ | 1552/1875 [00:43<00:07, 45.91it/s]
loss 0.21 accuracy 0.97:  83%|████████▎ | 1552/1875 [00:43<00:07, 45.91it/s]
loss 0.14 accuracy 0.97:  83%|████████▎ | 1552/1875 [00:43<00:07, 45.91it/s]
loss 0.03 accuracy 1.00:  83%|████████▎ | 1552/1875 [00:43<00:07, 45.91it/s]
loss 0.03 accuracy 1.00:  83%|████████▎ | 1557/1875 [00:43<00:06, 45.96it/s]
loss 0.07 accuracy 1.00:  83%|████████▎ | 1557/1875 [00:43<00:06, 45.96it/s]
loss 0.04 accuracy 1.00:  83%|████████▎ | 1557/1875 [00:43<00:06, 45.96it/s]
loss 0.02 accuracy 1.00:  83%|████████▎ | 1557/1875 [00:43<00:06, 45.96it/s]
loss 0.27 accuracy 0.94:  83%|████████▎ | 1557/1875 [00:44<00:06, 45.96it/s]
loss 0.24 accuracy 0.94:  83%|████████▎ | 1557/1875 [00:44<00:06, 45.96it/s]
loss 0.24 accuracy 0.94:  83%|████████▎ | 1562/1875 [00:44<00:06, 46.02it/s]
loss 0.08 accuracy 0.97:  83%|████████▎ | 1562/1875 [00:44<00:06, 46.02it/s]
loss 0.13 accuracy 0.97:  83%|████████▎ | 1562/1875 [00:44<00:06, 46.02it/s]
loss 0.08 accuracy 0.97:  83%|████████▎ | 1562/1875 [00:44<00:06, 46.02it/s]
loss 0.07 accuracy 1.00:  83%|████████▎ | 1562/1875 [00:44<00:06, 46.02it/s]
loss 0.03 accuracy 1.00:  83%|████████▎ | 1562/1875 [00:44<00:06, 46.02it/s]
loss 0.03 accuracy 1.00:  84%|████████▎ | 1567/1875 [00:44<00:06, 46.05it/s]
loss 0.14 accuracy 0.94:  84%|████████▎ | 1567/1875 [00:44<00:06, 46.05it/s]
loss 0.08 accuracy 0.97:  84%|████████▎ | 1567/1875 [00:44<00:06, 46.05it/s]
loss 0.04 accuracy 1.00:  84%|████████▎ | 1567/1875 [00:44<00:06, 46.05it/s]
loss 0.04 accuracy 1.00:  84%|████████▎ | 1567/1875 [00:44<00:06, 46.05it/s]
loss 0.09 accuracy 0.97:  84%|████████▎ | 1567/1875 [00:44<00:06, 46.05it/s]
loss 0.09 accuracy 0.97:  84%|████████▍ | 1572/1875 [00:44<00:06, 46.07it/s]
loss 0.11 accuracy 0.97:  84%|████████▍ | 1572/1875 [00:44<00:06, 46.07it/s]
loss 0.10 accuracy 0.97:  84%|████████▍ | 1572/1875 [00:44<00:06, 46.07it/s]
loss 0.04 accuracy 1.00:  84%|████████▍ | 1572/1875 [00:44<00:06, 46.07it/s]
loss 0.02 accuracy 1.00:  84%|████████▍ | 1572/1875 [00:44<00:06, 46.07it/s]
loss 0.19 accuracy 0.94:  84%|████████▍ | 1572/1875 [00:44<00:06, 46.07it/s]
loss 0.19 accuracy 0.94:  84%|████████▍ | 1577/1875 [00:44<00:06, 46.10it/s]
loss 0.06 accuracy 1.00:  84%|████████▍ | 1577/1875 [00:44<00:06, 46.10it/s]
loss 0.03 accuracy 1.00:  84%|████████▍ | 1577/1875 [00:44<00:06, 46.10it/s]
loss 0.17 accuracy 0.97:  84%|████████▍ | 1577/1875 [00:44<00:06, 46.10it/s]
loss 0.08 accuracy 0.97:  84%|████████▍ | 1577/1875 [00:44<00:06, 46.10it/s]
loss 0.07 accuracy 0.97:  84%|████████▍ | 1577/1875 [00:44<00:06, 46.10it/s]
loss 0.07 accuracy 0.97:  84%|████████▍ | 1582/1875 [00:44<00:06, 46.06it/s]
loss 0.04 accuracy 1.00:  84%|████████▍ | 1582/1875 [00:44<00:06, 46.06it/s]
loss 0.05 accuracy 0.97:  84%|████████▍ | 1582/1875 [00:44<00:06, 46.06it/s]
loss 0.14 accuracy 0.97:  84%|████████▍ | 1582/1875 [00:44<00:06, 46.06it/s]
loss 0.11 accuracy 0.97:  84%|████████▍ | 1582/1875 [00:44<00:06, 46.06it/s]
loss 0.12 accuracy 0.97:  84%|████████▍ | 1582/1875 [00:44<00:06, 46.06it/s]
loss 0.12 accuracy 0.97:  85%|████████▍ | 1587/1875 [00:44<00:06, 46.05it/s]
loss 0.14 accuracy 0.97:  85%|████████▍ | 1587/1875 [00:44<00:06, 46.05it/s]
loss 0.02 accuracy 1.00:  85%|████████▍ | 1587/1875 [00:44<00:06, 46.05it/s]
loss 0.06 accuracy 0.97:  85%|████████▍ | 1587/1875 [00:44<00:06, 46.05it/s]
loss 0.10 accuracy 0.97:  85%|████████▍ | 1587/1875 [00:44<00:06, 46.05it/s]
loss 0.02 accuracy 1.00:  85%|████████▍ | 1587/1875 [00:44<00:06, 46.05it/s]
loss 0.02 accuracy 1.00:  85%|████████▍ | 1592/1875 [00:44<00:06, 46.02it/s]
loss 0.13 accuracy 0.97:  85%|████████▍ | 1592/1875 [00:44<00:06, 46.02it/s]
loss 0.46 accuracy 0.88:  85%|████████▍ | 1592/1875 [00:44<00:06, 46.02it/s]
loss 0.04 accuracy 1.00:  85%|████████▍ | 1592/1875 [00:44<00:06, 46.02it/s]
loss 0.03 accuracy 1.00:  85%|████████▍ | 1592/1875 [00:44<00:06, 46.02it/s]
loss 0.05 accuracy 1.00:  85%|████████▍ | 1592/1875 [00:44<00:06, 46.02it/s]
loss 0.05 accuracy 1.00:  85%|████████▌ | 1597/1875 [00:44<00:06, 45.93it/s]
loss 0.26 accuracy 0.94:  85%|████████▌ | 1597/1875 [00:44<00:06, 45.93it/s]
loss 0.07 accuracy 0.97:  85%|████████▌ | 1597/1875 [00:44<00:06, 45.93it/s]
loss 0.05 accuracy 1.00:  85%|████████▌ | 1597/1875 [00:44<00:06, 45.93it/s]
loss 0.12 accuracy 0.97:  85%|████████▌ | 1597/1875 [00:44<00:06, 45.93it/s]
loss 0.09 accuracy 0.97:  85%|████████▌ | 1597/1875 [00:44<00:06, 45.93it/s]
loss 0.09 accuracy 0.97:  85%|████████▌ | 1602/1875 [00:44<00:05, 45.81it/s]
loss 0.09 accuracy 1.00:  85%|████████▌ | 1602/1875 [00:44<00:05, 45.81it/s]
loss 0.05 accuracy 1.00:  85%|████████▌ | 1602/1875 [00:44<00:05, 45.81it/s]
loss 0.09 accuracy 0.97:  85%|████████▌ | 1602/1875 [00:44<00:05, 45.81it/s]
loss 0.10 accuracy 0.94:  85%|████████▌ | 1602/1875 [00:44<00:05, 45.81it/s]
loss 0.02 accuracy 1.00:  85%|████████▌ | 1602/1875 [00:45<00:05, 45.81it/s]
loss 0.02 accuracy 1.00:  86%|████████▌ | 1607/1875 [00:45<00:05, 45.82it/s]
loss 0.13 accuracy 0.97:  86%|████████▌ | 1607/1875 [00:45<00:05, 45.82it/s]
loss 0.07 accuracy 0.97:  86%|████████▌ | 1607/1875 [00:45<00:05, 45.82it/s]
loss 0.10 accuracy 0.97:  86%|████████▌ | 1607/1875 [00:45<00:05, 45.82it/s]
loss 0.21 accuracy 0.97:  86%|████████▌ | 1607/1875 [00:45<00:05, 45.82it/s]
loss 0.09 accuracy 0.97:  86%|████████▌ | 1607/1875 [00:45<00:05, 45.82it/s]
loss 0.09 accuracy 0.97:  86%|████████▌ | 1612/1875 [00:45<00:05, 45.66it/s]
loss 0.17 accuracy 0.97:  86%|████████▌ | 1612/1875 [00:45<00:05, 45.66it/s]
loss 0.46 accuracy 0.94:  86%|████████▌ | 1612/1875 [00:45<00:05, 45.66it/s]
loss 0.11 accuracy 0.97:  86%|████████▌ | 1612/1875 [00:45<00:05, 45.66it/s]
loss 0.03 accuracy 1.00:  86%|████████▌ | 1612/1875 [00:45<00:05, 45.66it/s]
loss 0.06 accuracy 1.00:  86%|████████▌ | 1612/1875 [00:45<00:05, 45.66it/s]
loss 0.06 accuracy 1.00:  86%|████████▌ | 1617/1875 [00:45<00:05, 45.70it/s]
loss 0.19 accuracy 0.97:  86%|████████▌ | 1617/1875 [00:45<00:05, 45.70it/s]
loss 0.07 accuracy 1.00:  86%|████████▌ | 1617/1875 [00:45<00:05, 45.70it/s]
loss 0.09 accuracy 0.97:  86%|████████▌ | 1617/1875 [00:45<00:05, 45.70it/s]
loss 0.08 accuracy 0.97:  86%|████████▌ | 1617/1875 [00:45<00:05, 45.70it/s]
loss 0.14 accuracy 0.97:  86%|████████▌ | 1617/1875 [00:45<00:05, 45.70it/s]
loss 0.14 accuracy 0.97:  87%|████████▋ | 1622/1875 [00:45<00:05, 45.65it/s]
loss 0.04 accuracy 1.00:  87%|████████▋ | 1622/1875 [00:45<00:05, 45.65it/s]
loss 0.09 accuracy 0.97:  87%|████████▋ | 1622/1875 [00:45<00:05, 45.65it/s]
loss 0.03 accuracy 1.00:  87%|████████▋ | 1622/1875 [00:45<00:05, 45.65it/s]
loss 0.06 accuracy 0.97:  87%|████████▋ | 1622/1875 [00:45<00:05, 45.65it/s]
loss 0.02 accuracy 1.00:  87%|████████▋ | 1622/1875 [00:45<00:05, 45.65it/s]
loss 0.02 accuracy 1.00:  87%|████████▋ | 1627/1875 [00:45<00:05, 45.72it/s]
loss 0.09 accuracy 1.00:  87%|████████▋ | 1627/1875 [00:45<00:05, 45.72it/s]
loss 0.04 accuracy 1.00:  87%|████████▋ | 1627/1875 [00:45<00:05, 45.72it/s]
loss 0.15 accuracy 0.97:  87%|████████▋ | 1627/1875 [00:45<00:05, 45.72it/s]
loss 0.03 accuracy 1.00:  87%|████████▋ | 1627/1875 [00:45<00:05, 45.72it/s]
loss 0.06 accuracy 1.00:  87%|████████▋ | 1627/1875 [00:45<00:05, 45.72it/s]
loss 0.06 accuracy 1.00:  87%|████████▋ | 1632/1875 [00:45<00:05, 45.82it/s]
loss 0.41 accuracy 0.97:  87%|████████▋ | 1632/1875 [00:45<00:05, 45.82it/s]
loss 0.10 accuracy 0.97:  87%|████████▋ | 1632/1875 [00:45<00:05, 45.82it/s]
loss 0.08 accuracy 0.97:  87%|████████▋ | 1632/1875 [00:45<00:05, 45.82it/s]
loss 0.04 accuracy 1.00:  87%|████████▋ | 1632/1875 [00:45<00:05, 45.82it/s]
loss 0.02 accuracy 1.00:  87%|████████▋ | 1632/1875 [00:45<00:05, 45.82it/s]
loss 0.02 accuracy 1.00:  87%|████████▋ | 1637/1875 [00:45<00:05, 45.90it/s]
loss 0.22 accuracy 0.97:  87%|████████▋ | 1637/1875 [00:45<00:05, 45.90it/s]
loss 0.23 accuracy 0.94:  87%|████████▋ | 1637/1875 [00:45<00:05, 45.90it/s]
loss 0.10 accuracy 0.97:  87%|████████▋ | 1637/1875 [00:45<00:05, 45.90it/s]
loss 0.19 accuracy 0.97:  87%|████████▋ | 1637/1875 [00:45<00:05, 45.90it/s]
loss 0.13 accuracy 0.97:  87%|████████▋ | 1637/1875 [00:45<00:05, 45.90it/s]
loss 0.13 accuracy 0.97:  88%|████████▊ | 1642/1875 [00:45<00:05, 45.96it/s]
loss 0.07 accuracy 0.97:  88%|████████▊ | 1642/1875 [00:45<00:05, 45.96it/s]
loss 0.03 accuracy 1.00:  88%|████████▊ | 1642/1875 [00:45<00:05, 45.96it/s]
loss 0.27 accuracy 0.94:  88%|████████▊ | 1642/1875 [00:45<00:05, 45.96it/s]
loss 0.11 accuracy 0.97:  88%|████████▊ | 1642/1875 [00:45<00:05, 45.96it/s]
loss 0.04 accuracy 1.00:  88%|████████▊ | 1642/1875 [00:45<00:05, 45.96it/s]
loss 0.04 accuracy 1.00:  88%|████████▊ | 1647/1875 [00:45<00:04, 46.02it/s]
loss 0.08 accuracy 0.97:  88%|████████▊ | 1647/1875 [00:45<00:04, 46.02it/s]
loss 0.04 accuracy 1.00:  88%|████████▊ | 1647/1875 [00:45<00:04, 46.02it/s]
loss 0.07 accuracy 0.97:  88%|████████▊ | 1647/1875 [00:45<00:04, 46.02it/s]
loss 0.06 accuracy 1.00:  88%|████████▊ | 1647/1875 [00:45<00:04, 46.02it/s]
loss 0.14 accuracy 0.94:  88%|████████▊ | 1647/1875 [00:45<00:04, 46.02it/s]
loss 0.14 accuracy 0.94:  88%|████████▊ | 1652/1875 [00:45<00:04, 46.10it/s]
loss 0.02 accuracy 1.00:  88%|████████▊ | 1652/1875 [00:46<00:04, 46.10it/s]
loss 0.02 accuracy 1.00:  88%|████████▊ | 1652/1875 [00:46<00:04, 46.10it/s]
loss 0.10 accuracy 0.94:  88%|████████▊ | 1652/1875 [00:46<00:04, 46.10it/s]
loss 0.10 accuracy 0.97:  88%|████████▊ | 1652/1875 [00:46<00:04, 46.10it/s]
loss 0.04 accuracy 1.00:  88%|████████▊ | 1652/1875 [00:46<00:04, 46.10it/s]
loss 0.04 accuracy 1.00:  88%|████████▊ | 1657/1875 [00:46<00:04, 46.10it/s]
loss 0.13 accuracy 0.97:  88%|████████▊ | 1657/1875 [00:46<00:04, 46.10it/s]
loss 0.10 accuracy 0.97:  88%|████████▊ | 1657/1875 [00:46<00:04, 46.10it/s]
loss 0.06 accuracy 1.00:  88%|████████▊ | 1657/1875 [00:46<00:04, 46.10it/s]
loss 0.18 accuracy 0.91:  88%|████████▊ | 1657/1875 [00:46<00:04, 46.10it/s]
loss 0.13 accuracy 0.97:  88%|████████▊ | 1657/1875 [00:46<00:04, 46.10it/s]
loss 0.13 accuracy 0.97:  89%|████████▊ | 1662/1875 [00:46<00:04, 46.10it/s]
loss 0.04 accuracy 1.00:  89%|████████▊ | 1662/1875 [00:46<00:04, 46.10it/s]
loss 0.06 accuracy 0.97:  89%|████████▊ | 1662/1875 [00:46<00:04, 46.10it/s]
loss 0.13 accuracy 0.97:  89%|████████▊ | 1662/1875 [00:46<00:04, 46.10it/s]
loss 0.03 accuracy 1.00:  89%|████████▊ | 1662/1875 [00:46<00:04, 46.10it/s]
loss 0.05 accuracy 0.97:  89%|████████▊ | 1662/1875 [00:46<00:04, 46.10it/s]
loss 0.05 accuracy 0.97:  89%|████████▉ | 1667/1875 [00:46<00:04, 46.08it/s]
loss 0.09 accuracy 0.97:  89%|████████▉ | 1667/1875 [00:46<00:04, 46.08it/s]
loss 0.12 accuracy 0.97:  89%|████████▉ | 1667/1875 [00:46<00:04, 46.08it/s]
loss 0.17 accuracy 0.97:  89%|████████▉ | 1667/1875 [00:46<00:04, 46.08it/s]
loss 0.13 accuracy 0.97:  89%|████████▉ | 1667/1875 [00:46<00:04, 46.08it/s]
loss 0.14 accuracy 0.97:  89%|████████▉ | 1667/1875 [00:46<00:04, 46.08it/s]
loss 0.14 accuracy 0.97:  89%|████████▉ | 1672/1875 [00:46<00:04, 46.04it/s]
loss 0.08 accuracy 0.97:  89%|████████▉ | 1672/1875 [00:46<00:04, 46.04it/s]
loss 0.06 accuracy 1.00:  89%|████████▉ | 1672/1875 [00:46<00:04, 46.04it/s]
loss 0.16 accuracy 0.97:  89%|████████▉ | 1672/1875 [00:46<00:04, 46.04it/s]
loss 0.13 accuracy 0.91:  89%|████████▉ | 1672/1875 [00:46<00:04, 46.04it/s]
loss 0.63 accuracy 0.91:  89%|████████▉ | 1672/1875 [00:46<00:04, 46.04it/s]
loss 0.63 accuracy 0.91:  89%|████████▉ | 1677/1875 [00:46<00:04, 46.02it/s]
loss 0.05 accuracy 1.00:  89%|████████▉ | 1677/1875 [00:46<00:04, 46.02it/s]
loss 0.07 accuracy 0.97:  89%|████████▉ | 1677/1875 [00:46<00:04, 46.02it/s]
loss 0.02 accuracy 1.00:  89%|████████▉ | 1677/1875 [00:46<00:04, 46.02it/s]
loss 0.03 accuracy 1.00:  89%|████████▉ | 1677/1875 [00:46<00:04, 46.02it/s]
loss 0.09 accuracy 0.97:  89%|████████▉ | 1677/1875 [00:46<00:04, 46.02it/s]
loss 0.09 accuracy 0.97:  90%|████████▉ | 1682/1875 [00:46<00:04, 45.98it/s]
loss 0.21 accuracy 0.97:  90%|████████▉ | 1682/1875 [00:46<00:04, 45.98it/s]
loss 0.09 accuracy 0.97:  90%|████████▉ | 1682/1875 [00:46<00:04, 45.98it/s]
loss 0.12 accuracy 0.97:  90%|████████▉ | 1682/1875 [00:46<00:04, 45.98it/s]
loss 0.04 accuracy 1.00:  90%|████████▉ | 1682/1875 [00:46<00:04, 45.98it/s]
loss 0.04 accuracy 1.00:  90%|████████▉ | 1682/1875 [00:46<00:04, 45.98it/s]
loss 0.04 accuracy 1.00:  90%|████████▉ | 1687/1875 [00:46<00:04, 45.85it/s]
loss 0.13 accuracy 0.97:  90%|████████▉ | 1687/1875 [00:46<00:04, 45.85it/s]
loss 0.04 accuracy 1.00:  90%|████████▉ | 1687/1875 [00:46<00:04, 45.85it/s]
loss 0.06 accuracy 1.00:  90%|████████▉ | 1687/1875 [00:46<00:04, 45.85it/s]
loss 0.05 accuracy 1.00:  90%|████████▉ | 1687/1875 [00:46<00:04, 45.85it/s]
loss 0.18 accuracy 0.94:  90%|████████▉ | 1687/1875 [00:46<00:04, 45.85it/s]
loss 0.18 accuracy 0.94:  90%|█████████ | 1692/1875 [00:46<00:03, 45.80it/s]
loss 0.14 accuracy 0.97:  90%|█████████ | 1692/1875 [00:46<00:03, 45.80it/s]
loss 0.15 accuracy 0.91:  90%|█████████ | 1692/1875 [00:46<00:03, 45.80it/s]
loss 0.08 accuracy 0.97:  90%|█████████ | 1692/1875 [00:46<00:03, 45.80it/s]
loss 0.04 accuracy 1.00:  90%|█████████ | 1692/1875 [00:46<00:03, 45.80it/s]
loss 0.04 accuracy 1.00:  90%|█████████ | 1692/1875 [00:46<00:03, 45.80it/s]
loss 0.04 accuracy 1.00:  91%|█████████ | 1697/1875 [00:46<00:03, 45.71it/s]
loss 0.14 accuracy 0.97:  91%|█████████ | 1697/1875 [00:46<00:03, 45.71it/s]
loss 0.23 accuracy 0.97:  91%|█████████ | 1697/1875 [00:47<00:03, 45.71it/s]
loss 0.40 accuracy 0.94:  91%|█████████ | 1697/1875 [00:47<00:03, 45.71it/s]
loss 0.03 accuracy 1.00:  91%|█████████ | 1697/1875 [00:47<00:03, 45.71it/s]
loss 0.10 accuracy 0.97:  91%|█████████ | 1697/1875 [00:47<00:03, 45.71it/s]
loss 0.10 accuracy 0.97:  91%|█████████ | 1702/1875 [00:47<00:03, 45.71it/s]
loss 0.09 accuracy 0.97:  91%|█████████ | 1702/1875 [00:47<00:03, 45.71it/s]
loss 0.04 accuracy 1.00:  91%|█████████ | 1702/1875 [00:47<00:03, 45.71it/s]
loss 0.03 accuracy 1.00:  91%|█████████ | 1702/1875 [00:47<00:03, 45.71it/s]
loss 0.10 accuracy 0.97:  91%|█████████ | 1702/1875 [00:47<00:03, 45.71it/s]
loss 0.02 accuracy 1.00:  91%|█████████ | 1702/1875 [00:47<00:03, 45.71it/s]
loss 0.02 accuracy 1.00:  91%|█████████ | 1707/1875 [00:47<00:03, 45.72it/s]
loss 0.07 accuracy 0.97:  91%|█████████ | 1707/1875 [00:47<00:03, 45.72it/s]
loss 0.23 accuracy 0.94:  91%|█████████ | 1707/1875 [00:47<00:03, 45.72it/s]
loss 0.15 accuracy 0.97:  91%|█████████ | 1707/1875 [00:47<00:03, 45.72it/s]
loss 0.11 accuracy 0.94:  91%|█████████ | 1707/1875 [00:47<00:03, 45.72it/s]
loss 0.02 accuracy 1.00:  91%|█████████ | 1707/1875 [00:47<00:03, 45.72it/s]
loss 0.02 accuracy 1.00:  91%|█████████▏| 1712/1875 [00:47<00:03, 45.70it/s]
loss 0.04 accuracy 1.00:  91%|█████████▏| 1712/1875 [00:47<00:03, 45.70it/s]
loss 0.12 accuracy 0.94:  91%|█████████▏| 1712/1875 [00:47<00:03, 45.70it/s]
loss 0.05 accuracy 1.00:  91%|█████████▏| 1712/1875 [00:47<00:03, 45.70it/s]
loss 0.02 accuracy 1.00:  91%|█████████▏| 1712/1875 [00:47<00:03, 45.70it/s]
loss 0.13 accuracy 0.94:  91%|█████████▏| 1712/1875 [00:47<00:03, 45.70it/s]
loss 0.13 accuracy 0.94:  92%|█████████▏| 1717/1875 [00:47<00:03, 45.77it/s]
loss 0.06 accuracy 0.97:  92%|█████████▏| 1717/1875 [00:47<00:03, 45.77it/s]
loss 0.11 accuracy 0.97:  92%|█████████▏| 1717/1875 [00:47<00:03, 45.77it/s]
loss 0.05 accuracy 0.97:  92%|█████████▏| 1717/1875 [00:47<00:03, 45.77it/s]
loss 0.05 accuracy 1.00:  92%|█████████▏| 1717/1875 [00:47<00:03, 45.77it/s]
loss 0.08 accuracy 0.97:  92%|█████████▏| 1717/1875 [00:47<00:03, 45.77it/s]
loss 0.08 accuracy 0.97:  92%|█████████▏| 1722/1875 [00:47<00:03, 45.84it/s]
loss 0.03 accuracy 1.00:  92%|█████████▏| 1722/1875 [00:47<00:03, 45.84it/s]
loss 0.02 accuracy 1.00:  92%|█████████▏| 1722/1875 [00:47<00:03, 45.84it/s]
loss 0.15 accuracy 0.94:  92%|█████████▏| 1722/1875 [00:47<00:03, 45.84it/s]
loss 0.09 accuracy 0.97:  92%|█████████▏| 1722/1875 [00:47<00:03, 45.84it/s]
loss 0.03 accuracy 1.00:  92%|█████████▏| 1722/1875 [00:47<00:03, 45.84it/s]
loss 0.03 accuracy 1.00:  92%|█████████▏| 1727/1875 [00:47<00:03, 45.93it/s]
loss 0.02 accuracy 1.00:  92%|█████████▏| 1727/1875 [00:47<00:03, 45.93it/s]
loss 0.07 accuracy 0.97:  92%|█████████▏| 1727/1875 [00:47<00:03, 45.93it/s]
loss 0.29 accuracy 0.97:  92%|█████████▏| 1727/1875 [00:47<00:03, 45.93it/s]
loss 0.07 accuracy 0.97:  92%|█████████▏| 1727/1875 [00:47<00:03, 45.93it/s]
loss 0.05 accuracy 1.00:  92%|█████████▏| 1727/1875 [00:47<00:03, 45.93it/s]
loss 0.05 accuracy 1.00:  92%|█████████▏| 1732/1875 [00:47<00:03, 46.00it/s]
loss 0.04 accuracy 0.97:  92%|█████████▏| 1732/1875 [00:47<00:03, 46.00it/s]
loss 0.02 accuracy 1.00:  92%|█████████▏| 1732/1875 [00:47<00:03, 46.00it/s]
loss 0.10 accuracy 0.94:  92%|█████████▏| 1732/1875 [00:47<00:03, 46.00it/s]
loss 0.03 accuracy 1.00:  92%|█████████▏| 1732/1875 [00:47<00:03, 46.00it/s]
loss 0.10 accuracy 0.97:  92%|█████████▏| 1732/1875 [00:47<00:03, 46.00it/s]
loss 0.10 accuracy 0.97:  93%|█████████▎| 1737/1875 [00:47<00:02, 46.05it/s]
loss 0.06 accuracy 0.97:  93%|█████████▎| 1737/1875 [00:47<00:02, 46.05it/s]
loss 0.25 accuracy 0.91:  93%|█████████▎| 1737/1875 [00:47<00:02, 46.05it/s]
loss 0.12 accuracy 0.97:  93%|█████████▎| 1737/1875 [00:47<00:02, 46.05it/s]
loss 0.06 accuracy 1.00:  93%|█████████▎| 1737/1875 [00:47<00:02, 46.05it/s]
loss 0.04 accuracy 1.00:  93%|█████████▎| 1737/1875 [00:47<00:02, 46.05it/s]
loss 0.04 accuracy 1.00:  93%|█████████▎| 1742/1875 [00:47<00:02, 46.04it/s]
loss 0.21 accuracy 0.97:  93%|█████████▎| 1742/1875 [00:47<00:02, 46.04it/s]
loss 0.03 accuracy 1.00:  93%|█████████▎| 1742/1875 [00:47<00:02, 46.04it/s]
loss 0.17 accuracy 0.94:  93%|█████████▎| 1742/1875 [00:48<00:02, 46.04it/s]
loss 0.03 accuracy 1.00:  93%|█████████▎| 1742/1875 [00:48<00:02, 46.04it/s]
loss 0.07 accuracy 1.00:  93%|█████████▎| 1742/1875 [00:48<00:02, 46.04it/s]
loss 0.07 accuracy 1.00:  93%|█████████▎| 1747/1875 [00:48<00:02, 46.04it/s]
loss 0.27 accuracy 0.91:  93%|█████████▎| 1747/1875 [00:48<00:02, 46.04it/s]
loss 0.02 accuracy 1.00:  93%|█████████▎| 1747/1875 [00:48<00:02, 46.04it/s]
loss 0.25 accuracy 0.97:  93%|█████████▎| 1747/1875 [00:48<00:02, 46.04it/s]
loss 0.06 accuracy 1.00:  93%|█████████▎| 1747/1875 [00:48<00:02, 46.04it/s]
loss 0.04 accuracy 1.00:  93%|█████████▎| 1747/1875 [00:48<00:02, 46.04it/s]
loss 0.04 accuracy 1.00:  93%|█████████▎| 1752/1875 [00:48<00:02, 46.02it/s]
loss 0.33 accuracy 0.91:  93%|█████████▎| 1752/1875 [00:48<00:02, 46.02it/s]
loss 0.10 accuracy 0.94:  93%|█████████▎| 1752/1875 [00:48<00:02, 46.02it/s]
loss 0.03 accuracy 1.00:  93%|█████████▎| 1752/1875 [00:48<00:02, 46.02it/s]
loss 0.02 accuracy 1.00:  93%|█████████▎| 1752/1875 [00:48<00:02, 46.02it/s]
loss 0.12 accuracy 0.94:  93%|█████████▎| 1752/1875 [00:48<00:02, 46.02it/s]
loss 0.12 accuracy 0.94:  94%|█████████▎| 1757/1875 [00:48<00:02, 45.99it/s]
loss 0.03 accuracy 1.00:  94%|█████████▎| 1757/1875 [00:48<00:02, 45.99it/s]
loss 0.12 accuracy 0.97:  94%|█████████▎| 1757/1875 [00:48<00:02, 45.99it/s]
loss 0.03 accuracy 1.00:  94%|█████████▎| 1757/1875 [00:48<00:02, 45.99it/s]
loss 0.14 accuracy 0.94:  94%|█████████▎| 1757/1875 [00:48<00:02, 45.99it/s]
loss 0.12 accuracy 0.94:  94%|█████████▎| 1757/1875 [00:48<00:02, 45.99it/s]
loss 0.12 accuracy 0.94:  94%|█████████▍| 1762/1875 [00:48<00:02, 45.94it/s]
loss 0.11 accuracy 0.94:  94%|█████████▍| 1762/1875 [00:48<00:02, 45.94it/s]
loss 0.02 accuracy 1.00:  94%|█████████▍| 1762/1875 [00:48<00:02, 45.94it/s]
loss 0.04 accuracy 1.00:  94%|█████████▍| 1762/1875 [00:48<00:02, 45.94it/s]
loss 0.10 accuracy 0.94:  94%|█████████▍| 1762/1875 [00:48<00:02, 45.94it/s]
loss 0.25 accuracy 0.94:  94%|█████████▍| 1762/1875 [00:48<00:02, 45.94it/s]
loss 0.25 accuracy 0.94:  94%|█████████▍| 1767/1875 [00:48<00:02, 45.80it/s]
loss 0.17 accuracy 0.97:  94%|█████████▍| 1767/1875 [00:48<00:02, 45.80it/s]
loss 0.11 accuracy 0.97:  94%|█████████▍| 1767/1875 [00:48<00:02, 45.80it/s]
loss 0.13 accuracy 0.94:  94%|█████████▍| 1767/1875 [00:48<00:02, 45.80it/s]
loss 0.24 accuracy 0.91:  94%|█████████▍| 1767/1875 [00:48<00:02, 45.80it/s]
loss 0.04 accuracy 1.00:  94%|█████████▍| 1767/1875 [00:48<00:02, 45.80it/s]
loss 0.04 accuracy 1.00:  95%|█████████▍| 1772/1875 [00:48<00:02, 45.81it/s]
loss 0.24 accuracy 0.97:  95%|█████████▍| 1772/1875 [00:48<00:02, 45.81it/s]
loss 0.06 accuracy 1.00:  95%|█████████▍| 1772/1875 [00:48<00:02, 45.81it/s]
loss 0.02 accuracy 1.00:  95%|█████████▍| 1772/1875 [00:48<00:02, 45.81it/s]
loss 0.05 accuracy 1.00:  95%|█████████▍| 1772/1875 [00:48<00:02, 45.81it/s]
loss 0.02 accuracy 1.00:  95%|█████████▍| 1772/1875 [00:48<00:02, 45.81it/s]
loss 0.02 accuracy 1.00:  95%|█████████▍| 1777/1875 [00:48<00:02, 45.67it/s]
loss 0.04 accuracy 1.00:  95%|█████████▍| 1777/1875 [00:48<00:02, 45.67it/s]
loss 0.15 accuracy 0.97:  95%|█████████▍| 1777/1875 [00:48<00:02, 45.67it/s]
loss 0.06 accuracy 0.97:  95%|█████████▍| 1777/1875 [00:48<00:02, 45.67it/s]
loss 0.15 accuracy 0.94:  95%|█████████▍| 1777/1875 [00:48<00:02, 45.67it/s]
loss 0.08 accuracy 0.97:  95%|█████████▍| 1777/1875 [00:48<00:02, 45.67it/s]
loss 0.08 accuracy 0.97:  95%|█████████▌| 1782/1875 [00:48<00:02, 45.69it/s]
loss 0.08 accuracy 0.97:  95%|█████████▌| 1782/1875 [00:48<00:02, 45.69it/s]
loss 0.04 accuracy 1.00:  95%|█████████▌| 1782/1875 [00:48<00:02, 45.69it/s]
loss 0.09 accuracy 0.97:  95%|█████████▌| 1782/1875 [00:48<00:02, 45.69it/s]
loss 0.03 accuracy 1.00:  95%|█████████▌| 1782/1875 [00:48<00:02, 45.69it/s]
loss 0.03 accuracy 1.00:  95%|█████████▌| 1782/1875 [00:48<00:02, 45.69it/s]
loss 0.03 accuracy 1.00:  95%|█████████▌| 1787/1875 [00:48<00:01, 45.67it/s]
loss 0.04 accuracy 1.00:  95%|█████████▌| 1787/1875 [00:48<00:01, 45.67it/s]
loss 0.01 accuracy 1.00:  95%|█████████▌| 1787/1875 [00:48<00:01, 45.67it/s]
loss 0.03 accuracy 1.00:  95%|█████████▌| 1787/1875 [00:48<00:01, 45.67it/s]
loss 0.21 accuracy 0.97:  95%|█████████▌| 1787/1875 [00:49<00:01, 45.67it/s]
loss 0.09 accuracy 0.94:  95%|█████████▌| 1787/1875 [00:49<00:01, 45.67it/s]
loss 0.09 accuracy 0.94:  96%|█████████▌| 1792/1875 [00:49<00:01, 45.72it/s]
loss 0.20 accuracy 0.97:  96%|█████████▌| 1792/1875 [00:49<00:01, 45.72it/s]
loss 0.04 accuracy 1.00:  96%|█████████▌| 1792/1875 [00:49<00:01, 45.72it/s]
loss 0.02 accuracy 1.00:  96%|█████████▌| 1792/1875 [00:49<00:01, 45.72it/s]
loss 0.03 accuracy 1.00:  96%|█████████▌| 1792/1875 [00:49<00:01, 45.72it/s]
loss 0.02 accuracy 1.00:  96%|█████████▌| 1792/1875 [00:49<00:01, 45.72it/s]
loss 0.02 accuracy 1.00:  96%|█████████▌| 1797/1875 [00:49<00:01, 45.77it/s]
loss 0.07 accuracy 0.97:  96%|█████████▌| 1797/1875 [00:49<00:01, 45.77it/s]
loss 0.08 accuracy 0.97:  96%|█████████▌| 1797/1875 [00:49<00:01, 45.77it/s]
loss 0.09 accuracy 0.97:  96%|█████████▌| 1797/1875 [00:49<00:01, 45.77it/s]
loss 0.04 accuracy 1.00:  96%|█████████▌| 1797/1875 [00:49<00:01, 45.77it/s]
loss 0.06 accuracy 0.97:  96%|█████████▌| 1797/1875 [00:49<00:01, 45.77it/s]
loss 0.06 accuracy 0.97:  96%|█████████▌| 1802/1875 [00:49<00:01, 45.86it/s]
loss 0.12 accuracy 0.97:  96%|█████████▌| 1802/1875 [00:49<00:01, 45.86it/s]
loss 0.05 accuracy 0.97:  96%|█████████▌| 1802/1875 [00:49<00:01, 45.86it/s]
loss 0.31 accuracy 0.94:  96%|█████████▌| 1802/1875 [00:49<00:01, 45.86it/s]
loss 0.16 accuracy 0.94:  96%|█████████▌| 1802/1875 [00:49<00:01, 45.86it/s]
loss 0.06 accuracy 0.97:  96%|█████████▌| 1802/1875 [00:49<00:01, 45.86it/s]
loss 0.06 accuracy 0.97:  96%|█████████▋| 1807/1875 [00:49<00:01, 45.96it/s]
loss 0.09 accuracy 0.94:  96%|█████████▋| 1807/1875 [00:49<00:01, 45.96it/s]
loss 0.17 accuracy 0.97:  96%|█████████▋| 1807/1875 [00:49<00:01, 45.96it/s]
loss 0.08 accuracy 0.97:  96%|█████████▋| 1807/1875 [00:49<00:01, 45.96it/s]
loss 0.04 accuracy 1.00:  96%|█████████▋| 1807/1875 [00:49<00:01, 45.96it/s]
loss 0.02 accuracy 1.00:  96%|█████████▋| 1807/1875 [00:49<00:01, 45.96it/s]
loss 0.02 accuracy 1.00:  97%|█████████▋| 1812/1875 [00:49<00:01, 45.99it/s]
loss 0.02 accuracy 1.00:  97%|█████████▋| 1812/1875 [00:49<00:01, 45.99it/s]
loss 0.02 accuracy 1.00:  97%|█████████▋| 1812/1875 [00:49<00:01, 45.99it/s]
loss 0.08 accuracy 1.00:  97%|█████████▋| 1812/1875 [00:49<00:01, 45.99it/s]
loss 0.16 accuracy 0.94:  97%|█████████▋| 1812/1875 [00:49<00:01, 45.99it/s]
loss 0.11 accuracy 0.94:  97%|█████████▋| 1812/1875 [00:49<00:01, 45.99it/s]
loss 0.11 accuracy 0.94:  97%|█████████▋| 1817/1875 [00:49<00:01, 46.06it/s]
loss 0.03 accuracy 1.00:  97%|█████████▋| 1817/1875 [00:49<00:01, 46.06it/s]
loss 0.01 accuracy 1.00:  97%|█████████▋| 1817/1875 [00:49<00:01, 46.06it/s]
loss 0.31 accuracy 0.91:  97%|█████████▋| 1817/1875 [00:49<00:01, 46.06it/s]
loss 0.02 accuracy 1.00:  97%|█████████▋| 1817/1875 [00:49<00:01, 46.06it/s]
loss 0.16 accuracy 0.97:  97%|█████████▋| 1817/1875 [00:49<00:01, 46.06it/s]
loss 0.16 accuracy 0.97:  97%|█████████▋| 1822/1875 [00:49<00:01, 46.10it/s]
loss 0.07 accuracy 0.97:  97%|█████████▋| 1822/1875 [00:49<00:01, 46.10it/s]
loss 0.03 accuracy 1.00:  97%|█████████▋| 1822/1875 [00:49<00:01, 46.10it/s]
loss 0.02 accuracy 1.00:  97%|█████████▋| 1822/1875 [00:49<00:01, 46.10it/s]
loss 0.06 accuracy 0.97:  97%|█████████▋| 1822/1875 [00:49<00:01, 46.10it/s]
loss 0.02 accuracy 1.00:  97%|█████████▋| 1822/1875 [00:49<00:01, 46.10it/s]
loss 0.02 accuracy 1.00:  97%|█████████▋| 1827/1875 [00:49<00:01, 46.13it/s]
loss 0.04 accuracy 1.00:  97%|█████████▋| 1827/1875 [00:49<00:01, 46.13it/s]
loss 0.07 accuracy 0.97:  97%|█████████▋| 1827/1875 [00:49<00:01, 46.13it/s]
loss 0.05 accuracy 0.97:  97%|█████████▋| 1827/1875 [00:49<00:01, 46.13it/s]
loss 0.28 accuracy 0.94:  97%|█████████▋| 1827/1875 [00:49<00:01, 46.13it/s]
loss 0.09 accuracy 0.97:  97%|█████████▋| 1827/1875 [00:49<00:01, 46.13it/s]
loss 0.09 accuracy 0.97:  98%|█████████▊| 1832/1875 [00:49<00:00, 46.11it/s]
loss 0.04 accuracy 1.00:  98%|█████████▊| 1832/1875 [00:49<00:00, 46.11it/s]
loss 0.05 accuracy 1.00:  98%|█████████▊| 1832/1875 [00:49<00:00, 46.11it/s]
loss 0.02 accuracy 1.00:  98%|█████████▊| 1832/1875 [00:49<00:00, 46.11it/s]
loss 0.03 accuracy 1.00:  98%|█████████▊| 1832/1875 [00:49<00:00, 46.11it/s]
loss 0.20 accuracy 0.94:  98%|█████████▊| 1832/1875 [00:50<00:00, 46.11it/s]
loss 0.20 accuracy 0.94:  98%|█████████▊| 1837/1875 [00:50<00:00, 46.08it/s]
loss 0.23 accuracy 0.97:  98%|█████████▊| 1837/1875 [00:50<00:00, 46.08it/s]
loss 0.04 accuracy 1.00:  98%|█████████▊| 1837/1875 [00:50<00:00, 46.08it/s]
loss 0.07 accuracy 0.97:  98%|█████████▊| 1837/1875 [00:50<00:00, 46.08it/s]
loss 0.02 accuracy 1.00:  98%|█████████▊| 1837/1875 [00:50<00:00, 46.08it/s]
loss 0.15 accuracy 0.97:  98%|█████████▊| 1837/1875 [00:50<00:00, 46.08it/s]
loss 0.15 accuracy 0.97:  98%|█████████▊| 1842/1875 [00:50<00:00, 46.04it/s]
loss 0.08 accuracy 0.97:  98%|█████████▊| 1842/1875 [00:50<00:00, 46.04it/s]
loss 0.05 accuracy 0.97:  98%|█████████▊| 1842/1875 [00:50<00:00, 46.04it/s]
loss 0.07 accuracy 0.97:  98%|█████████▊| 1842/1875 [00:50<00:00, 46.04it/s]
loss 0.16 accuracy 0.94:  98%|█████████▊| 1842/1875 [00:50<00:00, 46.04it/s]
loss 0.09 accuracy 0.97:  98%|█████████▊| 1842/1875 [00:50<00:00, 46.04it/s]
loss 0.09 accuracy 0.97:  99%|█████████▊| 1847/1875 [00:50<00:00, 45.98it/s]
loss 0.06 accuracy 0.97:  99%|█████████▊| 1847/1875 [00:50<00:00, 45.98it/s]
loss 0.17 accuracy 0.94:  99%|█████████▊| 1847/1875 [00:50<00:00, 45.98it/s]
loss 0.15 accuracy 0.94:  99%|█████████▊| 1847/1875 [00:50<00:00, 45.98it/s]
loss 0.09 accuracy 0.94:  99%|█████████▊| 1847/1875 [00:50<00:00, 45.98it/s]
loss 0.03 accuracy 1.00:  99%|█████████▊| 1847/1875 [00:50<00:00, 45.98it/s]
loss 0.03 accuracy 1.00:  99%|█████████▉| 1852/1875 [00:50<00:00, 45.84it/s]
loss 0.06 accuracy 0.97:  99%|█████████▉| 1852/1875 [00:50<00:00, 45.84it/s]
loss 0.04 accuracy 1.00:  99%|█████████▉| 1852/1875 [00:50<00:00, 45.84it/s]
loss 0.02 accuracy 1.00:  99%|█████████▉| 1852/1875 [00:50<00:00, 45.84it/s]
loss 0.17 accuracy 0.94:  99%|█████████▉| 1852/1875 [00:50<00:00, 45.84it/s]
loss 0.15 accuracy 0.97:  99%|█████████▉| 1852/1875 [00:50<00:00, 45.84it/s]
loss 0.15 accuracy 0.97:  99%|█████████▉| 1857/1875 [00:50<00:00, 45.81it/s]
loss 0.12 accuracy 0.97:  99%|█████████▉| 1857/1875 [00:50<00:00, 45.81it/s]
loss 0.13 accuracy 0.94:  99%|█████████▉| 1857/1875 [00:50<00:00, 45.81it/s]
loss 0.09 accuracy 0.97:  99%|█████████▉| 1857/1875 [00:50<00:00, 45.81it/s]
loss 0.10 accuracy 0.94:  99%|█████████▉| 1857/1875 [00:50<00:00, 45.81it/s]
loss 0.02 accuracy 1.00:  99%|█████████▉| 1857/1875 [00:50<00:00, 45.81it/s]
loss 0.02 accuracy 1.00:  99%|█████████▉| 1862/1875 [00:50<00:00, 45.76it/s]
loss 0.05 accuracy 1.00:  99%|█████████▉| 1862/1875 [00:50<00:00, 45.76it/s]
loss 0.03 accuracy 1.00:  99%|█████████▉| 1862/1875 [00:50<00:00, 45.76it/s]
loss 0.06 accuracy 1.00:  99%|█████████▉| 1862/1875 [00:50<00:00, 45.76it/s]
loss 0.08 accuracy 0.97:  99%|█████████▉| 1862/1875 [00:50<00:00, 45.76it/s]
loss 0.02 accuracy 1.00:  99%|█████████▉| 1862/1875 [00:50<00:00, 45.76it/s]
loss 0.02 accuracy 1.00: 100%|█████████▉| 1867/1875 [00:50<00:00, 45.69it/s]
loss 0.06 accuracy 1.00: 100%|█████████▉| 1867/1875 [00:50<00:00, 45.69it/s]
loss 0.13 accuracy 0.94: 100%|█████████▉| 1867/1875 [00:50<00:00, 45.69it/s]
loss 0.02 accuracy 1.00: 100%|█████████▉| 1867/1875 [00:50<00:00, 45.69it/s]
loss 0.07 accuracy 0.97: 100%|█████████▉| 1867/1875 [00:50<00:00, 45.69it/s]
loss 0.03 accuracy 1.00: 100%|█████████▉| 1867/1875 [00:50<00:00, 45.69it/s]
loss 0.03 accuracy 1.00: 100%|█████████▉| 1872/1875 [00:50<00:00, 45.74it/s]
loss 0.24 accuracy 0.94: 100%|█████████▉| 1872/1875 [00:50<00:00, 45.74it/s]
loss 0.03 accuracy 1.00: 100%|█████████▉| 1872/1875 [00:50<00:00, 45.74it/s]
loss 0.05 accuracy 1.00: 100%|█████████▉| 1872/1875 [00:50<00:00, 45.74it/s]
loss 0.05 accuracy 1.00: 100%|██████████| 1875/1875 [00:50<00:00, 36.88it/s]

  0%|          | 0/313 [00:00<?, ?it/s]
  0%|          | 1/313 [00:00<03:12,  1.62it/s]
  3%|▎         | 9/313 [00:00<00:18, 16.01it/s]
  5%|▌         | 17/313 [00:00<00:10, 28.79it/s]
  8%|▊         | 25/313 [00:00<00:07, 39.58it/s]
 10%|█         | 32/313 [00:01<00:06, 43.70it/s]
 13%|█▎        | 40/313 [00:01<00:05, 51.35it/s]
 15%|█▌        | 48/313 [00:01<00:04, 57.25it/s]
 18%|█▊        | 56/313 [00:01<00:04, 61.63it/s]
 20%|██        | 64/313 [00:01<00:03, 64.95it/s]
 23%|██▎       | 72/313 [00:01<00:03, 62.33it/s]
 26%|██▌       | 80/313 [00:01<00:03, 65.47it/s]
 28%|██▊       | 88/313 [00:01<00:03, 67.75it/s]
 31%|███       | 96/313 [00:01<00:03, 69.27it/s]
 33%|███▎      | 104/313 [00:02<00:03, 65.15it/s]
 36%|███▌      | 112/313 [00:02<00:02, 67.43it/s]
 38%|███▊      | 120/313 [00:02<00:02, 69.06it/s]
 41%|████      | 128/313 [00:02<00:02, 70.22it/s]
 43%|████▎     | 136/313 [00:02<00:02, 65.75it/s]
 46%|████▌     | 144/313 [00:02<00:02, 67.79it/s]
 49%|████▊     | 152/313 [00:02<00:02, 69.14it/s]
 51%|█████     | 160/313 [00:02<00:02, 70.12it/s]
 54%|█████▎    | 168/313 [00:03<00:02, 65.53it/s]
 56%|█████▌    | 176/313 [00:03<00:02, 67.52it/s]
 59%|█████▉    | 184/313 [00:03<00:01, 69.06it/s]
 61%|██████▏   | 192/313 [00:03<00:01, 70.09it/s]
 64%|██████▍   | 200/313 [00:03<00:01, 65.59it/s]
 66%|██████▋   | 208/313 [00:03<00:01, 67.65it/s]
 69%|██████▉   | 216/313 [00:03<00:01, 69.11it/s]
 72%|███████▏  | 224/313 [00:03<00:01, 70.37it/s]
 74%|███████▍  | 232/313 [00:03<00:01, 65.93it/s]
 77%|███████▋  | 240/313 [00:04<00:01, 67.85it/s]
 79%|███████▉  | 248/313 [00:04<00:00, 69.22it/s]
 82%|████████▏ | 256/313 [00:04<00:00, 70.28it/s]
 84%|████████▍ | 264/313 [00:04<00:00, 65.71it/s]
 87%|████████▋ | 272/313 [00:04<00:00, 67.59it/s]
 89%|████████▉ | 280/313 [00:04<00:00, 69.10it/s]
 92%|█████████▏| 288/313 [00:04<00:00, 70.17it/s]
 95%|█████████▍| 296/313 [00:04<00:00, 65.61it/s]
 97%|█████████▋| 304/313 [00:05<00:00, 67.60it/s]
100%|█████████▉| 312/313 [00:05<00:00, 69.04it/s]
100%|██████████| 313/313 [00:06<00:00, 49.51it/s]
test set accuracy is 0.973000
Traceback (most recent call last):
  File "/home/jebba/devel/tinygrad/tinygrad/examples/serious_mnist.py", line 136, in <module>
    model.save(f'examples/checkpoint{accuracy * 1e6:.0f}')
  File "/home/jebba/devel/tinygrad/tinygrad/examples/serious_mnist.py", line 72, in save
    with open(filename+'.npy', 'wb') as f:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: 'examples/checkpoint973000.npy'
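
Note on the traceback above: `open(..., 'wb')` creates the file but not any missing parent directory. The relative path `examples/checkpoint973000.npy` therefore fails when there is no `examples/` directory under the current working directory. That this is what happened here is an inference from the traceback, not something the log confirms. A minimal sketch of one way to guard against it follows; the helper name `open_for_write` is hypothetical and is not part of tinygrad or `serious_mnist.py`.

```python
import os
import tempfile

def open_for_write(filename):
    """Hypothetical helper: create the parent directory if needed, then open for binary write."""
    parent = os.path.dirname(filename)
    if parent:  # empty string means the file lives in the current directory
        os.makedirs(parent, exist_ok=True)
    return open(filename, "wb")

# Demonstrate against a throwaway directory so nothing is written to the repo.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "examples", "checkpoint973000.npy")
    with open_for_write(path) as f:
        f.write(b"placeholder")
    saved_size = os.path.getsize(path)
```

Running the script from the repository root, where `examples/` already exists, would presumably avoid the error as well.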

simple_conv_bn.py

running network

so_vits_svc.py

Traceback (most recent call last):
  File "/home/jebba/devel/tinygrad/tinygrad/examples/so_vits_svc.py", line 10, in <module>
    from examples.vits import ResidualCouplingBlock, PosteriorEncoder, Encoder, ResBlock1, ResBlock2, LRELU_SLOPE, sequence_mask, split, download_if_not_present, get_hparams_from_file, load_checkpoint, weight_norm, HParams
ImportError: cannot import name 'download_if_not_present' from 'examples.vits' (/home/jebba/devel/tinygrad/tinygrad/examples/vits.py)

stable_diffusion.py

  0%|          | 0/1131 [00:00<?, ?it/s]
ram used:  0.00 GB, alphas_cumprod                                    :   0%|          | 0/1131 [00:00<?, ?it/s]
ram used:  0.00 GB, alphas_cumprod                                    :   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.00 GB, model.diffusion_model.time_embed.0.weight         :   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.00 GB, model.diffusion_model.time_embed.0.bias           :   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.00 GB, model.diffusion_model.time_embed.2.weight         :   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.01 GB, model.diffusion_model.time_embed.2.bias           :   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.01 GB, model.diffusion_model.input_blocks.0.0.weight     :   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.01 GB, model.diffusion_model.input_blocks.0.0.bias       :   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.01 GB, model.diffusion_model.input_blocks.1.0.in_layers.0.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.01 GB, model.diffusion_model.input_blocks.1.0.in_layers.0.bias:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]  
ram used:  0.01 GB, model.diffusion_model.input_blocks.1.0.in_layers.2.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.01 GB, model.diffusion_model.input_blocks.1.0.in_layers.2.bias:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]  
ram used:  0.01 GB, model.diffusion_model.input_blocks.1.0.emb_layers.1.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.01 GB, model.diffusion_model.input_blocks.1.0.emb_layers.1.bias:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]  
ram used:  0.01 GB, model.diffusion_model.input_blocks.1.0.out_layers.0.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.01 GB, model.diffusion_model.input_blocks.1.0.out_layers.0.bias:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]  
ram used:  0.01 GB, model.diffusion_model.input_blocks.1.0.out_layers.3.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.02 GB, model.diffusion_model.input_blocks.1.0.out_layers.3.bias:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]  
ram used:  0.02 GB, model.diffusion_model.input_blocks.1.1.norm.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]      
ram used:  0.02 GB, model.diffusion_model.input_blocks.1.1.norm.bias  :   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.02 GB, model.diffusion_model.input_blocks.1.1.proj_in.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.02 GB, model.diffusion_model.input_blocks.1.1.proj_in.bias:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]  
ram used:  0.02 GB, model.diffusion_model.input_blocks.1.1.transformer_blocks.0.attn1.to_q.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.02 GB, model.diffusion_model.input_blocks.1.1.transformer_blocks.0.attn1.to_k.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.02 GB, model.diffusion_model.input_blocks.1.1.transformer_blocks.0.attn1.to_v.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.02 GB, model.diffusion_model.input_blocks.1.1.transformer_blocks.0.attn1.to_out.0.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.02 GB, model.diffusion_model.input_blocks.1.1.transformer_blocks.0.attn1.to_out.0.bias:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]  
ram used:  0.02 GB, model.diffusion_model.input_blocks.1.1.transformer_blocks.0.ff.net.0.proj.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.02 GB, model.diffusion_model.input_blocks.1.1.transformer_blocks.0.ff.net.0.proj.bias:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]  
ram used:  0.02 GB, model.diffusion_model.input_blocks.1.1.transformer_blocks.0.ff.net.2.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]   
ram used:  0.02 GB, model.diffusion_model.input_blocks.1.1.transformer_blocks.0.ff.net.2.bias:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]  
ram used:  0.02 GB, model.diffusion_model.input_blocks.1.1.transformer_blocks.0.attn2.to_q.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.02 GB, model.diffusion_model.input_blocks.1.1.transformer_blocks.0.attn2.to_k.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.03 GB, model.diffusion_model.input_blocks.1.1.transformer_blocks.0.attn2.to_v.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.03 GB, model.diffusion_model.input_blocks.1.1.transformer_blocks.0.attn2.to_out.0.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.03 GB, model.diffusion_model.input_blocks.1.1.transformer_blocks.0.attn2.to_out.0.bias:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]  
ram used:  0.03 GB, model.diffusion_model.input_blocks.1.1.transformer_blocks.0.norm1.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]       
ram used:  0.03 GB, model.diffusion_model.input_blocks.1.1.transformer_blocks.0.norm1.bias:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]  
ram used:  0.03 GB, model.diffusion_model.input_blocks.1.1.transformer_blocks.0.norm2.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.03 GB, model.diffusion_model.input_blocks.1.1.transformer_blocks.0.norm2.bias:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]  
ram used:  0.03 GB, model.diffusion_model.input_blocks.1.1.transformer_blocks.0.norm3.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.03 GB, model.diffusion_model.input_blocks.1.1.transformer_blocks.0.norm3.bias:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]  
ram used:  0.03 GB, model.diffusion_model.input_blocks.1.1.proj_out.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]                
ram used:  0.03 GB, model.diffusion_model.input_blocks.1.1.proj_out.bias:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]  
ram used:  0.03 GB, model.diffusion_model.input_blocks.2.0.in_layers.0.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.03 GB, model.diffusion_model.input_blocks.2.0.in_layers.0.bias:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]  
ram used:  0.03 GB, model.diffusion_model.input_blocks.2.0.in_layers.2.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.03 GB, model.diffusion_model.input_blocks.2.0.in_layers.2.bias:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]  
ram used:  0.03 GB, model.diffusion_model.input_blocks.2.0.emb_layers.1.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.03 GB, model.diffusion_model.input_blocks.2.0.emb_layers.1.bias:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]  
ram used:  0.03 GB, model.diffusion_model.input_blocks.2.0.out_layers.0.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.03 GB, model.diffusion_model.input_blocks.2.0.out_layers.0.bias:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]  
ram used:  0.03 GB, model.diffusion_model.input_blocks.2.0.out_layers.3.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.04 GB, model.diffusion_model.input_blocks.2.0.out_layers.3.bias:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]  
ram used:  0.04 GB, model.diffusion_model.input_blocks.2.1.norm.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]      
ram used:  0.04 GB, model.diffusion_model.input_blocks.2.1.norm.bias  :   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.04 GB, model.diffusion_model.input_blocks.2.1.proj_in.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.04 GB, model.diffusion_model.input_blocks.2.1.proj_in.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.04 GB, model.diffusion_model.input_blocks.2.1.proj_in.bias:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]  
ram used:  0.04 GB, model.diffusion_model.input_blocks.2.1.transformer_blocks.0.attn1.to_q.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.04 GB, model.diffusion_model.input_blocks.2.1.transformer_blocks.0.attn1.to_k.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.04 GB, model.diffusion_model.input_blocks.2.1.transformer_blocks.0.attn1.to_v.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.04 GB, model.diffusion_model.input_blocks.2.1.transformer_blocks.0.attn1.to_out.0.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.04 GB, model.diffusion_model.input_blocks.2.1.transformer_blocks.0.attn1.to_out.0.bias:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]  
ram used:  0.04 GB, model.diffusion_model.input_blocks.2.1.transformer_blocks.0.ff.net.0.proj.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.04 GB, model.diffusion_model.input_blocks.2.1.transformer_blocks.0.ff.net.0.proj.bias:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]  
ram used:  0.04 GB, model.diffusion_model.input_blocks.2.1.transformer_blocks.0.ff.net.2.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]   
ram used:  0.04 GB, model.diffusion_model.input_blocks.2.1.transformer_blocks.0.ff.net.2.bias:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]  
ram used:  0.04 GB, model.diffusion_model.input_blocks.2.1.transformer_blocks.0.attn2.to_q.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.04 GB, model.diffusion_model.input_blocks.2.1.transformer_blocks.0.attn2.to_k.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.04 GB, model.diffusion_model.input_blocks.2.1.transformer_blocks.0.attn2.to_v.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.05 GB, model.diffusion_model.input_blocks.2.1.transformer_blocks.0.attn2.to_out.0.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.05 GB, model.diffusion_model.input_blocks.2.1.transformer_blocks.0.attn2.to_out.0.bias:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]  
ram used:  0.05 GB, model.diffusion_model.input_blocks.2.1.transformer_blocks.0.norm1.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]       
ram used:  0.05 GB, model.diffusion_model.input_blocks.2.1.transformer_blocks.0.norm1.bias:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]  
ram used:  0.05 GB, model.diffusion_model.input_blocks.2.1.transformer_blocks.0.norm2.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.05 GB, model.diffusion_model.input_blocks.2.1.transformer_blocks.0.norm2.bias:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]  
ram used:  0.05 GB, model.diffusion_model.input_blocks.2.1.transformer_blocks.0.norm3.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.05 GB, model.diffusion_model.input_blocks.2.1.transformer_blocks.0.norm3.bias:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]  
ram used:  0.05 GB, model.diffusion_model.input_blocks.2.1.proj_out.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]                
ram used:  0.05 GB, model.diffusion_model.input_blocks.2.1.proj_out.bias:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]  
ram used:  0.05 GB, model.diffusion_model.input_blocks.3.0.op.weight  :   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]  
ram used:  0.05 GB, model.diffusion_model.input_blocks.3.0.op.bias    :   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.05 GB, model.diffusion_model.input_blocks.4.0.in_layers.0.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.05 GB, model.diffusion_model.input_blocks.4.0.in_layers.0.bias:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]  
ram used:  0.05 GB, model.diffusion_model.input_blocks.4.0.in_layers.2.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.06 GB, model.diffusion_model.input_blocks.4.0.in_layers.2.bias:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]  
ram used:  0.06 GB, model.diffusion_model.input_blocks.4.0.emb_layers.1.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.06 GB, model.diffusion_model.input_blocks.4.0.emb_layers.1.bias:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]  
ram used:  0.06 GB, model.diffusion_model.input_blocks.4.0.out_layers.0.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.06 GB, model.diffusion_model.input_blocks.4.0.out_layers.0.bias:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]  
ram used:  0.06 GB, model.diffusion_model.input_blocks.4.0.out_layers.3.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.08 GB, model.diffusion_model.input_blocks.4.0.out_layers.3.bias:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]  
ram used:  0.08 GB, model.diffusion_model.input_blocks.4.0.skip_connection.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.08 GB, model.diffusion_model.input_blocks.4.0.skip_connection.bias:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]  
ram used:  0.08 GB, model.diffusion_model.input_blocks.4.1.norm.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]         
ram used:  0.08 GB, model.diffusion_model.input_blocks.4.1.norm.bias  :   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.08 GB, model.diffusion_model.input_blocks.4.1.proj_in.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.08 GB, model.diffusion_model.input_blocks.4.1.proj_in.bias:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]  
ram used:  0.08 GB, model.diffusion_model.input_blocks.4.1.transformer_blocks.0.attn1.to_q.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.08 GB, model.diffusion_model.input_blocks.4.1.transformer_blocks.0.attn1.to_k.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.08 GB, model.diffusion_model.input_blocks.4.1.transformer_blocks.0.attn1.to_v.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.08 GB, model.diffusion_model.input_blocks.4.1.transformer_blocks.0.attn1.to_out.0.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.08 GB, model.diffusion_model.input_blocks.4.1.transformer_blocks.0.attn1.to_out.0.bias:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]  
ram used:  0.08 GB, model.diffusion_model.input_blocks.4.1.transformer_blocks.0.ff.net.0.proj.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.10 GB, model.diffusion_model.input_blocks.4.1.transformer_blocks.0.ff.net.0.proj.bias:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]  
ram used:  0.10 GB, model.diffusion_model.input_blocks.4.1.transformer_blocks.0.ff.net.2.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]   
ram used:  0.10 GB, model.diffusion_model.input_blocks.4.1.transformer_blocks.0.ff.net.2.bias:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]  
ram used:  0.10 GB, model.diffusion_model.input_blocks.4.1.transformer_blocks.0.attn2.to_q.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.11 GB, model.diffusion_model.input_blocks.4.1.transformer_blocks.0.attn2.to_k.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.11 GB, model.diffusion_model.input_blocks.4.1.transformer_blocks.0.attn2.to_v.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.11 GB, model.diffusion_model.input_blocks.4.1.transformer_blocks.0.attn2.to_out.0.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.11 GB, model.diffusion_model.input_blocks.4.1.transformer_blocks.0.attn2.to_out.0.bias:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]  
ram used:  0.11 GB, model.diffusion_model.input_blocks.4.1.transformer_blocks.0.norm1.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]       
ram used:  0.11 GB, model.diffusion_model.input_blocks.4.1.transformer_blocks.0.norm1.bias:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]  
ram used:  0.11 GB, model.diffusion_model.input_blocks.4.1.transformer_blocks.0.norm2.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.11 GB, model.diffusion_model.input_blocks.4.1.transformer_blocks.0.norm2.bias:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]  
ram used:  0.11 GB, model.diffusion_model.input_blocks.4.1.transformer_blocks.0.norm3.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.11 GB, model.diffusion_model.input_blocks.4.1.transformer_blocks.0.norm3.bias:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]  
ram used:  0.11 GB, model.diffusion_model.input_blocks.4.1.proj_out.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]                
ram used:  0.11 GB, model.diffusion_model.input_blocks.4.1.proj_out.bias:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]  
ram used:  0.11 GB, model.diffusion_model.input_blocks.5.0.in_layers.0.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.11 GB, model.diffusion_model.input_blocks.5.0.in_layers.0.bias:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]  
ram used:  0.11 GB, model.diffusion_model.input_blocks.5.0.in_layers.2.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.13 GB, model.diffusion_model.input_blocks.5.0.in_layers.2.bias:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]  
ram used:  0.13 GB, model.diffusion_model.input_blocks.5.0.emb_layers.1.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.13 GB, model.diffusion_model.input_blocks.5.0.emb_layers.1.bias:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]  
ram used:  0.13 GB, model.diffusion_model.input_blocks.5.0.out_layers.0.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.13 GB, model.diffusion_model.input_blocks.5.0.out_layers.0.bias:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]  
ram used:  0.13 GB, model.diffusion_model.input_blocks.5.0.out_layers.3.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.15 GB, model.diffusion_model.input_blocks.5.0.out_layers.3.bias:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]  
ram used:  0.15 GB, model.diffusion_model.input_blocks.5.1.norm.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]      
ram used:  0.15 GB, model.diffusion_model.input_blocks.5.1.norm.bias  :   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.15 GB, model.diffusion_model.input_blocks.5.1.proj_in.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.15 GB, model.diffusion_model.input_blocks.5.1.proj_in.bias:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]  
ram used:  0.15 GB, model.diffusion_model.input_blocks.5.1.transformer_blocks.0.attn1.to_q.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.15 GB, model.diffusion_model.input_blocks.5.1.transformer_blocks.0.attn1.to_k.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.15 GB, model.diffusion_model.input_blocks.5.1.transformer_blocks.0.attn1.to_v.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.15 GB, model.diffusion_model.input_blocks.5.1.transformer_blocks.0.attn1.to_out.0.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.15 GB, model.diffusion_model.input_blocks.5.1.transformer_blocks.0.attn1.to_out.0.bias:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]  
ram used:  0.15 GB, model.diffusion_model.input_blocks.5.1.transformer_blocks.0.ff.net.0.proj.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.15 GB, model.diffusion_model.input_blocks.5.1.transformer_blocks.0.ff.net.0.proj.weight:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]
ram used:  0.17 GB, model.diffusion_model.input_blocks.5.1.transformer_blocks.0.ff.net.0.proj.bias:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]  
ram used:  0.17 GB, model.diffusion_model.input_blocks.5.1.transformer_blocks.0.ff.net.2.weight:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]   
ram used:  0.17 GB, model.diffusion_model.input_blocks.5.1.transformer_blocks.0.ff.net.2.bias:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]  
ram used:  0.17 GB, model.diffusion_model.input_blocks.5.1.transformer_blocks.0.attn2.to_q.weight:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]
ram used:  0.18 GB, model.diffusion_model.input_blocks.5.1.transformer_blocks.0.attn2.to_k.weight:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]
ram used:  0.18 GB, model.diffusion_model.input_blocks.5.1.transformer_blocks.0.attn2.to_v.weight:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]
ram used:  0.18 GB, model.diffusion_model.input_blocks.5.1.transformer_blocks.0.attn2.to_out.0.weight:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]
ram used:  0.18 GB, model.diffusion_model.input_blocks.5.1.transformer_blocks.0.attn2.to_out.0.bias:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]  
ram used:  0.18 GB, model.diffusion_model.input_blocks.5.1.transformer_blocks.0.norm1.weight:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]       
ram used:  0.18 GB, model.diffusion_model.input_blocks.5.1.transformer_blocks.0.norm1.bias:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]  
ram used:  0.18 GB, model.diffusion_model.input_blocks.5.1.transformer_blocks.0.norm2.weight:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]
ram used:  0.18 GB, model.diffusion_model.input_blocks.5.1.transformer_blocks.0.norm2.bias:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]  
ram used:  0.18 GB, model.diffusion_model.input_blocks.5.1.transformer_blocks.0.norm3.weight:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]
ram used:  0.18 GB, model.diffusion_model.input_blocks.5.1.transformer_blocks.0.norm3.bias:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]  
ram used:  0.18 GB, model.diffusion_model.input_blocks.5.1.proj_out.weight:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]                
ram used:  0.18 GB, model.diffusion_model.input_blocks.5.1.proj_out.bias:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]  
ram used:  0.18 GB, model.diffusion_model.input_blocks.6.0.op.weight  :  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]  
ram used:  0.20 GB, model.diffusion_model.input_blocks.6.0.op.bias    :  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]
ram used:  0.20 GB, model.diffusion_model.input_blocks.7.0.in_layers.0.weight:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]
ram used:  0.20 GB, model.diffusion_model.input_blocks.7.0.in_layers.0.bias:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]  
ram used:  0.20 GB, model.diffusion_model.input_blocks.7.0.in_layers.2.weight:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]
ram used:  0.23 GB, model.diffusion_model.input_blocks.7.0.in_layers.2.bias:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]  
ram used:  0.23 GB, model.diffusion_model.input_blocks.7.0.emb_layers.1.weight:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]
ram used:  0.23 GB, model.diffusion_model.input_blocks.7.0.emb_layers.1.bias:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]  
ram used:  0.23 GB, model.diffusion_model.input_blocks.7.0.out_layers.0.weight:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]
ram used:  0.23 GB, model.diffusion_model.input_blocks.7.0.out_layers.0.bias:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]  
ram used:  0.23 GB, model.diffusion_model.input_blocks.7.0.out_layers.3.weight:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]
ram used:  0.29 GB, model.diffusion_model.input_blocks.7.0.out_layers.3.bias:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]  
ram used:  0.29 GB, model.diffusion_model.input_blocks.7.0.skip_connection.weight:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]
ram used:  0.30 GB, model.diffusion_model.input_blocks.7.0.skip_connection.bias:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]  
ram used:  0.30 GB, model.diffusion_model.input_blocks.7.1.norm.weight:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]         
ram used:  0.30 GB, model.diffusion_model.input_blocks.7.1.norm.bias  :  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]
ram used:  0.30 GB, model.diffusion_model.input_blocks.7.1.proj_in.weight:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]
ram used:  0.30 GB, model.diffusion_model.input_blocks.7.1.proj_in.bias:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]  
ram used:  0.30 GB, model.diffusion_model.input_blocks.7.1.transformer_blocks.0.attn1.to_q.weight:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]
ram used:  0.31 GB, model.diffusion_model.input_blocks.7.1.transformer_blocks.0.attn1.to_k.weight:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]
ram used:  0.32 GB, model.diffusion_model.input_blocks.7.1.transformer_blocks.0.attn1.to_v.weight:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]
ram used:  0.32 GB, model.diffusion_model.input_blocks.7.1.transformer_blocks.0.attn1.to_out.0.weight:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]
ram used:  0.33 GB, model.diffusion_model.input_blocks.7.1.transformer_blocks.0.attn1.to_out.0.bias:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]  
ram used:  0.33 GB, model.diffusion_model.input_blocks.7.1.transformer_blocks.0.ff.net.0.proj.weight:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]
ram used:  0.38 GB, model.diffusion_model.input_blocks.7.1.transformer_blocks.0.ff.net.0.proj.bias:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]  
ram used:  0.38 GB, model.diffusion_model.input_blocks.7.1.transformer_blocks.0.ff.net.2.weight:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]   
ram used:  0.41 GB, model.diffusion_model.input_blocks.7.1.transformer_blocks.0.ff.net.2.bias:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]  
ram used:  0.41 GB, model.diffusion_model.input_blocks.7.1.transformer_blocks.0.attn2.to_q.weight:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]
ram used:  0.41 GB, model.diffusion_model.input_blocks.7.1.transformer_blocks.0.attn2.to_k.weight:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]
ram used:  0.42 GB, model.diffusion_model.input_blocks.7.1.transformer_blocks.0.attn2.to_v.weight:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]
ram used:  0.42 GB, model.diffusion_model.input_blocks.7.1.transformer_blocks.0.attn2.to_out.0.weight:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]
ram used:  0.43 GB, model.diffusion_model.input_blocks.7.1.transformer_blocks.0.attn2.to_out.0.bias:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]  
ram used:  0.43 GB, model.diffusion_model.input_blocks.7.1.transformer_blocks.0.norm1.weight:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]       
ram used:  0.43 GB, model.diffusion_model.input_blocks.7.1.transformer_blocks.0.norm1.bias:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]  
ram used:  0.43 GB, model.diffusion_model.input_blocks.7.1.transformer_blocks.0.norm2.weight:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]
ram used:  0.43 GB, model.diffusion_model.input_blocks.7.1.transformer_blocks.0.norm2.bias:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]  
ram used:  0.43 GB, model.diffusion_model.input_blocks.7.1.transformer_blocks.0.norm3.weight:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]
ram used:  0.43 GB, model.diffusion_model.input_blocks.7.1.transformer_blocks.0.norm3.bias:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]  
ram used:  0.43 GB, model.diffusion_model.input_blocks.7.1.proj_out.weight:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]                
ram used:  0.44 GB, model.diffusion_model.input_blocks.7.1.proj_out.bias:  12%|█▏        | 139/1131 [00:00<00:02, 444.41it/s]  
ram used:  0.44 GB, model.diffusion_model.input_blocks.7.1.proj_out.bias:  17%|█▋        | 195/1131 [00:00<00:02, 431.89it/s]
ram used:  0.44 GB, model.diffusion_model.input_blocks.8.0.in_layers.0.weight:  17%|█▋        | 195/1131 [00:00<00:02, 431.89it/s]
ram used:  0.44 GB, model.diffusion_model.input_blocks.8.0.in_layers.0.bias:  17%|█▋        | 195/1131 [00:00<00:02, 431.89it/s]  
ram used:  0.44 GB, model.diffusion_model.input_blocks.8.0.in_layers.2.weight:  17%|█▋        | 195/1131 [00:00<00:02, 431.89it/s]
ram used:  0.49 GB, model.diffusion_model.input_blocks.8.0.in_layers.2.bias:  17%|█▋        | 195/1131 [00:00<00:02, 431.89it/s]  
ram used:  0.49 GB, model.diffusion_model.input_blocks.8.0.emb_layers.1.weight:  17%|█▋        | 195/1131 [00:00<00:02, 431.89it/s]
ram used:  0.50 GB, model.diffusion_model.input_blocks.8.0.emb_layers.1.bias:  17%|█▋        | 195/1131 [00:00<00:02, 431.89it/s]  
ram used:  0.50 GB, model.diffusion_model.input_blocks.8.0.out_layers.0.weight:  17%|█▋        | 195/1131 [00:00<00:02, 431.89it/s]
ram used:  0.50 GB, model.diffusion_model.input_blocks.8.0.out_layers.0.bias:  17%|█▋        | 195/1131 [00:00<00:02, 431.89it/s]  
ram used:  0.50 GB, model.diffusion_model.input_blocks.8.0.out_layers.3.weight:  17%|█▋        | 195/1131 [00:00<00:02, 431.89it/s]
ram used:  0.56 GB, model.diffusion_model.input_blocks.8.0.out_layers.3.bias:  17%|█▋        | 195/1131 [00:00<00:02, 431.89it/s]  
ram used:  0.56 GB, model.diffusion_model.input_blocks.8.1.norm.weight:  17%|█▋        | 195/1131 [00:00<00:02, 431.89it/s]      
ram used:  0.56 GB, model.diffusion_model.input_blocks.8.1.norm.bias  :  17%|█▋        | 195/1131 [00:00<00:02, 431.89it/s]
ram used:  0.56 GB, model.diffusion_model.input_blocks.8.1.proj_in.weight:  17%|█▋        | 195/1131 [00:00<00:02, 431.89it/s]
ram used:  0.57 GB, model.diffusion_model.input_blocks.8.1.proj_in.bias:  17%|█▋        | 195/1131 [00:00<00:02, 431.89it/s]  
ram used:  0.57 GB, model.diffusion_model.input_blocks.8.1.transformer_blocks.0.attn1.to_q.weight:  17%|█▋        | 195/1131 [00:00<00:02, 431.89it/s]
ram used:  0.57 GB, model.diffusion_model.input_blocks.8.1.transformer_blocks.0.attn1.to_k.weight:  17%|█▋        | 195/1131 [00:00<00:02, 431.89it/s]
ram used:  0.58 GB, model.diffusion_model.input_blocks.8.1.transformer_blocks.0.attn1.to_v.weight:  17%|█▋        | 195/1131 [00:00<00:02, 431.89it/s]
ram used:  0.59 GB, model.diffusion_model.input_blocks.8.1.transformer_blocks.0.attn1.to_out.0.weight:  17%|█▋        | 195/1131 [00:00<00:02, 431.89it/s]
ram used:  0.59 GB, model.diffusion_model.input_blocks.8.1.transformer_blocks.0.attn1.to_out.0.bias:  17%|█▋        | 195/1131 [00:00<00:02, 431.89it/s]  
ram used:  0.59 GB, model.diffusion_model.input_blocks.8.1.transformer_blocks.0.ff.net.0.proj.weight:  17%|█▋        | 195/1131 [00:00<00:02, 431.89it/s]
ram used:  0.64 GB, model.diffusion_model.input_blocks.8.1.transformer_blocks.0.ff.net.0.proj.bias:  17%|█▋        | 195/1131 [00:00<00:02, 431.89it/s]  
ram used:  0.64 GB, model.diffusion_model.input_blocks.8.1.transformer_blocks.0.ff.net.2.weight:  17%|█▋        | 195/1131 [00:00<00:02, 431.89it/s]   
ram used:  0.67 GB, model.diffusion_model.input_blocks.8.1.transformer_blocks.0.ff.net.2.bias:  17%|█▋        | 195/1131 [00:00<00:02, 431.89it/s]  
ram used:  0.67 GB, model.diffusion_model.input_blocks.8.1.transformer_blocks.0.attn2.to_q.weight:  17%|█▋        | 195/1131 [00:00<00:02, 431.89it/s]
ram used:  0.68 GB, model.diffusion_model.input_blocks.8.1.transformer_blocks.0.attn2.to_k.weight:  17%|█▋        | 195/1131 [00:00<00:02, 431.89it/s]
ram used:  0.68 GB, model.diffusion_model.input_blocks.8.1.transformer_blocks.0.attn2.to_v.weight:  17%|█▋        | 195/1131 [00:00<00:02, 431.89it/s]
ram used:  0.69 GB, model.diffusion_model.input_blocks.8.1.transformer_blocks.0.attn2.to_out.0.weight:  17%|█▋        | 195/1131 [00:00<00:02, 431.89it/s]
ram used:  0.69 GB, model.diffusion_model.input_blocks.8.1.transformer_blocks.0.attn2.to_out.0.bias:  17%|█▋        | 195/1131 [00:00<00:02, 431.89it/s]  
ram used:  0.69 GB, model.diffusion_model.input_blocks.8.1.transformer_blocks.0.norm1.weight:  17%|█▋        | 195/1131 [00:00<00:02, 431.89it/s]       
ram used:  0.69 GB, model.diffusion_model.input_blocks.8.1.transformer_blocks.0.norm1.bias:  17%|█▋        | 195/1131 [00:00<00:02, 431.89it/s]  
ram used:  0.69 GB, model.diffusion_model.input_blocks.8.1.transformer_blocks.0.norm2.weight:  17%|█▋        | 195/1131 [00:00<00:02, 431.89it/s]
ram used:  0.69 GB, model.diffusion_model.input_blocks.8.1.transformer_blocks.0.norm2.bias:  17%|█▋        | 195/1131 [00:00<00:02, 431.89it/s]  
ram used:  0.69 GB, model.diffusion_model.input_blocks.8.1.transformer_blocks.0.norm3.weight:  17%|█▋        | 195/1131 [00:00<00:02, 431.89it/s]
ram used:  0.69 GB, model.diffusion_model.input_blocks.8.1.transformer_blocks.0.norm3.bias:  17%|█▋        | 195/1131 [00:00<00:02, 431.89it/s]  
ram used:  0.69 GB, model.diffusion_model.input_blocks.8.1.proj_out.weight:  17%|█▋        | 195/1131 [00:00<00:02, 431.89it/s]                
ram used:  0.70 GB, model.diffusion_model.input_blocks.8.1.proj_out.bias:  17%|█▋        | 195/1131 [00:00<00:02, 431.89it/s]  
ram used:  0.70 GB, model.diffusion_model.input_blocks.9.0.op.weight  :  17%|█▋        | 195/1131 [00:00<00:02, 431.89it/s]  
ram used:  0.76 GB, model.diffusion_model.input_blocks.9.0.op.bias    :  17%|█▋        | 195/1131 [00:00<00:02, 431.89it/s]
ram used:  0.76 GB, model.diffusion_model.input_blocks.10.0.in_layers.0.weight:  17%|█▋        | 195/1131 [00:00<00:02, 431.89it/s]
ram used:  0.76 GB, model.diffusion_model.input_blocks.10.0.in_layers.0.bias:  17%|█▋        | 195/1131 [00:00<00:02, 431.89it/s]  
ram used:  0.76 GB, model.diffusion_model.input_blocks.10.0.in_layers.2.weight:  17%|█▋        | 195/1131 [00:00<00:02, 431.89it/s]
ram used:  0.82 GB, model.diffusion_model.input_blocks.10.0.in_layers.2.bias:  17%|█▋        | 195/1131 [00:00<00:02, 431.89it/s]  
ram used:  0.82 GB, model.diffusion_model.input_blocks.10.0.emb_layers.1.weight:  17%|█▋        | 195/1131 [00:00<00:02, 431.89it/s]
ram used:  0.82 GB, model.diffusion_model.input_blocks.10.0.emb_layers.1.bias:  17%|█▋        | 195/1131 [00:00<00:02, 431.89it/s]  
ram used:  0.82 GB, model.diffusion_model.input_blocks.10.0.out_layers.0.weight:  17%|█▋        | 195/1131 [00:00<00:02, 431.89it/s]
ram used:  0.82 GB, model.diffusion_model.input_blocks.10.0.out_layers.0.bias:  17%|█▋        | 195/1131 [00:00<00:02, 431.89it/s]  
ram used:  0.82 GB, model.diffusion_model.input_blocks.10.0.out_layers.3.weight:  17%|█▋        | 195/1131 [00:00<00:02, 431.89it/s]
ram used:  0.88 GB, model.diffusion_model.input_blocks.10.0.out_layers.3.bias:  17%|█▋        | 195/1131 [00:00<00:02, 431.89it/s]  
ram used:  0.88 GB, model.diffusion_model.input_blocks.11.0.in_layers.0.weight:  17%|█▋        | 195/1131 [00:00<00:02, 431.89it/s]
ram used:  0.88 GB, model.diffusion_model.input_blocks.11.0.in_layers.0.bias:  17%|█▋        | 195/1131 [00:00<00:02, 431.89it/s]  
ram used:  0.88 GB, model.diffusion_model.input_blocks.11.0.in_layers.0.bias:  22%|██▏       | 245/1131 [00:00<00:02, 345.57it/s]
ram used:  0.88 GB, model.diffusion_model.input_blocks.11.0.in_layers.2.weight:  22%|██▏       | 245/1131 [00:00<00:02, 345.57it/s]
ram used:  0.94 GB, model.diffusion_model.input_blocks.11.0.in_layers.2.bias:  22%|██▏       | 245/1131 [00:00<00:02, 345.57it/s]  
ram used:  0.94 GB, model.diffusion_model.input_blocks.11.0.emb_layers.1.weight:  22%|██▏       | 245/1131 [00:00<00:02, 345.57it/s]
ram used:  0.95 GB, model.diffusion_model.input_blocks.11.0.emb_layers.1.bias:  22%|██▏       | 245/1131 [00:00<00:02, 345.57it/s]  
ram used:  0.95 GB, model.diffusion_model.input_blocks.11.0.out_layers.0.weight:  22%|██▏       | 245/1131 [00:00<00:02, 345.57it/s]
ram used:  0.95 GB, model.diffusion_model.input_blocks.11.0.out_layers.0.bias:  22%|██▏       | 245/1131 [00:00<00:02, 345.57it/s]  
ram used:  0.95 GB, model.diffusion_model.input_blocks.11.0.out_layers.3.weight:  22%|██▏       | 245/1131 [00:00<00:02, 345.57it/s]
ram used:  1.01 GB, model.diffusion_model.input_blocks.11.0.out_layers.3.bias:  22%|██▏       | 245/1131 [00:00<00:02, 345.57it/s]  
ram used:  1.01 GB, model.diffusion_model.middle_block.0.in_layers.0.weight:  22%|██▏       | 245/1131 [00:00<00:02, 345.57it/s]  
ram used:  1.01 GB, model.diffusion_model.middle_block.0.in_layers.0.bias:  22%|██▏       | 245/1131 [00:00<00:02, 345.57it/s]  
ram used:  1.01 GB, model.diffusion_model.middle_block.0.in_layers.2.weight:  22%|██▏       | 245/1131 [00:00<00:02, 345.57it/s]
ram used:  1.07 GB, model.diffusion_model.middle_block.0.in_layers.2.bias:  22%|██▏       | 245/1131 [00:00<00:02, 345.57it/s]  
ram used:  1.07 GB, model.diffusion_model.middle_block.0.emb_layers.1.weight:  22%|██▏       | 245/1131 [00:00<00:02, 345.57it/s]
ram used:  1.07 GB, model.diffusion_model.middle_block.0.emb_layers.1.bias:  22%|██▏       | 245/1131 [00:00<00:02, 345.57it/s]  
ram used:  1.07 GB, model.diffusion_model.middle_block.0.out_layers.0.weight:  22%|██▏       | 245/1131 [00:00<00:02, 345.57it/s]
ram used:  1.07 GB, model.diffusion_model.middle_block.0.out_layers.0.bias:  22%|██▏       | 245/1131 [00:00<00:02, 345.57it/s]  
ram used:  1.07 GB, model.diffusion_model.middle_block.0.out_layers.3.weight:  22%|██▏       | 245/1131 [00:00<00:02, 345.57it/s]
ram used:  1.13 GB, model.diffusion_model.middle_block.0.out_layers.3.bias:  22%|██▏       | 245/1131 [00:00<00:02, 345.57it/s]  
ram used:  1.13 GB, model.diffusion_model.middle_block.1.norm.weight  :  22%|██▏       | 245/1131 [00:00<00:02, 345.57it/s]    
ram used:  1.13 GB, model.diffusion_model.middle_block.1.norm.bias    :  22%|██▏       | 245/1131 [00:00<00:02, 345.57it/s]
ram used:  1.13 GB, model.diffusion_model.middle_block.1.proj_in.weight:  22%|██▏       | 245/1131 [00:00<00:02, 345.57it/s]
ram used:  1.14 GB, model.diffusion_model.middle_block.1.proj_in.bias :  22%|██▏       | 245/1131 [00:00<00:02, 345.57it/s] 
ram used:  1.14 GB, model.diffusion_model.middle_block.1.transformer_blocks.0.attn1.to_q.weight:  22%|██▏       | 245/1131 [00:00<00:02, 345.57it/s]
ram used:  1.14 GB, model.diffusion_model.middle_block.1.transformer_blocks.0.attn1.to_k.weight:  22%|██▏       | 245/1131 [00:00<00:02, 345.57it/s]
ram used:  1.15 GB, model.diffusion_model.middle_block.1.transformer_blocks.0.attn1.to_v.weight:  22%|██▏       | 245/1131 [00:00<00:02, 345.57it/s]
ram used:  1.16 GB, model.diffusion_model.middle_block.1.transformer_blocks.0.attn1.to_out.0.weight:  22%|██▏       | 245/1131 [00:00<00:02, 345.57it/s]
ram used:  1.16 GB, model.diffusion_model.middle_block.1.transformer_blocks.0.attn1.to_out.0.bias:  22%|██▏       | 245/1131 [00:00<00:02, 345.57it/s]  
ram used:  1.16 GB, model.diffusion_model.middle_block.1.transformer_blocks.0.ff.net.0.proj.weight:  22%|██▏       | 245/1131 [00:00<00:02, 345.57it/s]
ram used:  1.22 GB, model.diffusion_model.middle_block.1.transformer_blocks.0.ff.net.0.proj.bias:  22%|██▏       | 245/1131 [00:00<00:02, 345.57it/s]  
ram used:  1.22 GB, model.diffusion_model.middle_block.1.transformer_blocks.0.ff.net.2.weight:  22%|██▏       | 245/1131 [00:00<00:02, 345.57it/s]   
ram used:  1.24 GB, model.diffusion_model.middle_block.1.transformer_blocks.0.ff.net.2.bias:  22%|██▏       | 245/1131 [00:00<00:02, 345.57it/s]  
ram used:  1.24 GB, model.diffusion_model.middle_block.1.transformer_blocks.0.attn2.to_q.weight:  22%|██▏       | 245/1131 [00:00<00:02, 345.57it/s]
ram used:  1.25 GB, model.diffusion_model.middle_block.1.transformer_blocks.0.attn2.to_k.weight:  22%|██▏       | 245/1131 [00:00<00:02, 345.57it/s]
ram used:  1.25 GB, model.diffusion_model.middle_block.1.transformer_blocks.0.attn2.to_v.weight:  22%|██▏       | 245/1131 [00:00<00:02, 345.57it/s]
ram used:  1.26 GB, model.diffusion_model.middle_block.1.transformer_blocks.0.attn2.to_out.0.weight:  22%|██▏       | 245/1131 [00:00<00:02, 345.57it/s]
ram used:  1.26 GB, model.diffusion_model.middle_block.1.transformer_blocks.0.attn2.to_out.0.bias:  22%|██▏       | 245/1131 [00:00<00:02, 345.57it/s]  
ram used:  1.26 GB, model.diffusion_model.middle_block.1.transformer_blocks.0.norm1.weight:  22%|██▏       | 245/1131 [00:00<00:02, 345.57it/s]       
ram used:  1.26 GB, model.diffusion_model.middle_block.1.transformer_blocks.0.norm1.bias:  22%|██▏       | 245/1131 [00:00<00:02, 345.57it/s]  
ram used:  1.26 GB, model.diffusion_model.middle_block.1.transformer_blocks.0.norm2.weight:  22%|██▏       | 245/1131 [00:00<00:02, 345.57it/s]
ram used:  1.26 GB, model.diffusion_model.middle_block.1.transformer_blocks.0.norm2.bias:  22%|██▏       | 245/1131 [00:00<00:02, 345.57it/s]  
ram used:  1.26 GB, model.diffusion_model.middle_block.1.transformer_blocks.0.norm3.weight:  22%|██▏       | 245/1131 [00:00<00:02, 345.57it/s]
ram used:  1.26 GB, model.diffusion_model.middle_block.1.transformer_blocks.0.norm3.weight:  25%|██▌       | 286/1131 [00:00<00:02, 306.95it/s]
ram used:  1.26 GB, model.diffusion_model.middle_block.1.transformer_blocks.0.norm3.bias:  25%|██▌       | 286/1131 [00:00<00:02, 306.95it/s]  
ram used:  1.26 GB, model.diffusion_model.middle_block.1.proj_out.weight:  25%|██▌       | 286/1131 [00:00<00:02, 306.95it/s]                
ram used:  1.27 GB, model.diffusion_model.middle_block.1.proj_out.bias:  25%|██▌       | 286/1131 [00:00<00:02, 306.95it/s]  
ram used:  1.27 GB, model.diffusion_model.middle_block.2.in_layers.0.weight:  25%|██▌       | 286/1131 [00:00<00:02, 306.95it/s]
ram used:  1.27 GB, model.diffusion_model.middle_block.2.in_layers.0.bias:  25%|██▌       | 286/1131 [00:00<00:02, 306.95it/s]  
ram used:  1.27 GB, model.diffusion_model.middle_block.2.in_layers.2.weight:  25%|██▌       | 286/1131 [00:00<00:02, 306.95it/s]
ram used:  1.33 GB, model.diffusion_model.middle_block.2.in_layers.2.bias:  25%|██▌       | 286/1131 [00:00<00:02, 306.95it/s]  
ram used:  1.33 GB, model.diffusion_model.middle_block.2.emb_layers.1.weight:  25%|██▌       | 286/1131 [00:00<00:02, 306.95it/s]
ram used:  1.34 GB, model.diffusion_model.middle_block.2.emb_layers.1.bias:  25%|██▌       | 286/1131 [00:00<00:02, 306.95it/s]  
ram used:  1.34 GB, model.diffusion_model.middle_block.2.out_layers.0.weight:  25%|██▌       | 286/1131 [00:00<00:02, 306.95it/s]
ram used:  1.34 GB, model.diffusion_model.middle_block.2.out_layers.0.bias:  25%|██▌       | 286/1131 [00:00<00:02, 306.95it/s]  
ram used:  1.34 GB, model.diffusion_model.middle_block.2.out_layers.3.weight:  25%|██▌       | 286/1131 [00:00<00:02, 306.95it/s]
ram used:  1.39 GB, model.diffusion_model.middle_block.2.out_layers.3.bias:  25%|██▌       | 286/1131 [00:00<00:02, 306.95it/s]  
ram used:  1.39 GB, model.diffusion_model.output_blocks.0.0.in_layers.0.weight:  25%|██▌       | 286/1131 [00:00<00:02, 306.95it/s]
ram used:  1.39 GB, model.diffusion_model.output_blocks.0.0.in_layers.0.bias:  25%|██▌       | 286/1131 [00:00<00:02, 306.95it/s]  
ram used:  1.39 GB, model.diffusion_model.output_blocks.0.0.in_layers.2.weight:  25%|██▌       | 286/1131 [00:00<00:02, 306.95it/s]
ram used:  1.51 GB, model.diffusion_model.output_blocks.0.0.in_layers.2.bias:  25%|██▌       | 286/1131 [00:01<00:02, 306.95it/s]  
ram used:  1.51 GB, model.diffusion_model.output_blocks.0.0.emb_layers.1.weight:  25%|██▌       | 286/1131 [00:01<00:02, 306.95it/s]
ram used:  1.52 GB, model.diffusion_model.output_blocks.0.0.emb_layers.1.bias:  25%|██▌       | 286/1131 [00:01<00:02, 306.95it/s]  
ram used:  1.52 GB, model.diffusion_model.output_blocks.0.0.out_layers.0.weight:  25%|██▌       | 286/1131 [00:01<00:02, 306.95it/s]
ram used:  1.52 GB, model.diffusion_model.output_blocks.0.0.out_layers.0.bias:  25%|██▌       | 286/1131 [00:01<00:02, 306.95it/s]  
ram used:  1.52 GB, model.diffusion_model.output_blocks.0.0.out_layers.3.weight:  25%|██▌       | 286/1131 [00:01<00:02, 306.95it/s]
ram used:  1.58 GB, model.diffusion_model.output_blocks.0.0.out_layers.3.bias:  25%|██▌       | 286/1131 [00:01<00:02, 306.95it/s]  
ram used:  1.58 GB, model.diffusion_model.output_blocks.0.0.skip_connection.weight:  25%|██▌       | 286/1131 [00:01<00:02, 306.95it/s]
ram used:  1.59 GB, model.diffusion_model.output_blocks.0.0.skip_connection.bias:  25%|██▌       | 286/1131 [00:01<00:02, 306.95it/s]  
ram used:  1.59 GB, model.diffusion_model.output_blocks.1.0.in_layers.0.weight:  25%|██▌       | 286/1131 [00:01<00:02, 306.95it/s]  
ram used:  1.59 GB, model.diffusion_model.output_blocks.1.0.in_layers.0.bias:  25%|██▌       | 286/1131 [00:01<00:02, 306.95it/s]  
ram used:  1.59 GB, model.diffusion_model.output_blocks.1.0.in_layers.2.weight:  25%|██▌       | 286/1131 [00:01<00:02, 306.95it/s]
ram used:  1.71 GB, model.diffusion_model.output_blocks.1.0.in_layers.2.bias:  25%|██▌       | 286/1131 [00:01<00:02, 306.95it/s]  
ram used:  1.71 GB, model.diffusion_model.output_blocks.1.0.emb_layers.1.weight:  25%|██▌       | 286/1131 [00:01<00:02, 306.95it/s]
ram used:  1.72 GB, model.diffusion_model.output_blocks.1.0.emb_layers.1.bias:  25%|██▌       | 286/1131 [00:01<00:02, 306.95it/s]  
ram used:  1.72 GB, model.diffusion_model.output_blocks.1.0.out_layers.0.weight:  25%|██▌       | 286/1131 [00:01<00:02, 306.95it/s]
ram used:  1.72 GB, model.diffusion_model.output_blocks.1.0.out_layers.0.bias:  25%|██▌       | 286/1131 [00:01<00:02, 306.95it/s]  
ram used:  1.72 GB, model.diffusion_model.output_blocks.1.0.out_layers.3.weight:  25%|██▌       | 286/1131 [00:01<00:02, 306.95it/s]
ram used:  1.78 GB, model.diffusion_model.output_blocks.1.0.out_layers.3.bias:  25%|██▌       | 286/1131 [00:01<00:02, 306.95it/s]  
ram used:  1.78 GB, model.diffusion_model.output_blocks.1.0.skip_connection.weight:  25%|██▌       | 286/1131 [00:01<00:02, 306.95it/s]
ram used:  1.78 GB, model.diffusion_model.output_blocks.1.0.skip_connection.weight:  28%|██▊       | 322/1131 [00:01<00:03, 248.71it/s]
ram used:  1.79 GB, model.diffusion_model.output_blocks.1.0.skip_connection.bias:  28%|██▊       | 322/1131 [00:01<00:03, 248.71it/s]  
ram used:  1.79 GB, model.diffusion_model.output_blocks.2.0.in_layers.0.weight:  28%|██▊       | 322/1131 [00:01<00:03, 248.71it/s]  
ram used:  1.79 GB, model.diffusion_model.output_blocks.2.0.in_layers.0.bias:  28%|██▊       | 322/1131 [00:01<00:03, 248.71it/s]  
ram used:  1.79 GB, model.diffusion_model.output_blocks.2.0.in_layers.2.weight:  28%|██▊       | 322/1131 [00:01<00:03, 248.71it/s]
ram used:  1.91 GB, model.diffusion_model.output_blocks.2.0.in_layers.2.bias:  28%|██▊       | 322/1131 [00:01<00:03, 248.71it/s]  
ram used:  1.91 GB, model.diffusion_model.output_blocks.2.0.emb_layers.1.weight:  28%|██▊       | 322/1131 [00:01<00:03, 248.71it/s]
ram used:  1.91 GB, model.diffusion_model.output_blocks.2.0.emb_layers.1.bias:  28%|██▊       | 322/1131 [00:01<00:03, 248.71it/s]  
ram used:  1.91 GB, model.diffusion_model.output_blocks.2.0.out_layers.0.weight:  28%|██▊       | 322/1131 [00:01<00:03, 248.71it/s]
ram used:  1.91 GB, model.diffusion_model.output_blocks.2.0.out_layers.0.bias:  28%|██▊       | 322/1131 [00:01<00:03, 248.71it/s]  
ram used:  1.91 GB, model.diffusion_model.output_blocks.2.0.out_layers.3.weight:  28%|██▊       | 322/1131 [00:01<00:03, 248.71it/s]
ram used:  1.97 GB, model.diffusion_model.output_blocks.2.0.out_layers.3.bias:  28%|██▊       | 322/1131 [00:01<00:03, 248.71it/s]  
ram used:  1.97 GB, model.diffusion_model.output_blocks.2.0.skip_connection.weight:  28%|██▊       | 322/1131 [00:01<00:03, 248.71it/s]
ram used:  1.98 GB, model.diffusion_model.output_blocks.2.0.skip_connection.bias:  28%|██▊       | 322/1131 [00:01<00:03, 248.71it/s]  
ram used:  1.98 GB, model.diffusion_model.output_blocks.2.1.conv.weight:  28%|██▊       | 322/1131 [00:01<00:03, 248.71it/s]         
ram used:  2.04 GB, model.diffusion_model.output_blocks.2.1.conv.bias :  28%|██▊       | 322/1131 [00:01<00:03, 248.71it/s] 
ram used:  2.04 GB, model.diffusion_model.output_blocks.3.0.in_layers.0.weight:  28%|██▊       | 322/1131 [00:01<00:03, 248.71it/s]
ram used:  2.04 GB, model.diffusion_model.output_blocks.3.0.in_layers.0.bias:  28%|██▊       | 322/1131 [00:01<00:03, 248.71it/s]  
ram used:  2.04 GB, model.diffusion_model.output_blocks.3.0.in_layers.2.weight:  28%|██▊       | 322/1131 [00:01<00:03, 248.71it/s]
ram used:  2.16 GB, model.diffusion_model.output_blocks.3.0.in_layers.2.bias:  28%|██▊       | 322/1131 [00:01<00:03, 248.71it/s]  
ram used:  2.16 GB, model.diffusion_model.output_blocks.3.0.emb_layers.1.weight:  28%|██▊       | 322/1131 [00:01<00:03, 248.71it/s]
ram used:  2.17 GB, model.diffusion_model.output_blocks.3.0.emb_layers.1.bias:  28%|██▊       | 322/1131 [00:01<00:03, 248.71it/s]  
ram used:  2.17 GB, model.diffusion_model.output_blocks.3.0.out_layers.0.weight:  28%|██▊       | 322/1131 [00:01<00:03, 248.71it/s]
ram used:  2.17 GB, model.diffusion_model.output_blocks.3.0.out_layers.0.bias:  28%|██▊       | 322/1131 [00:01<00:03, 248.71it/s]  
ram used:  2.17 GB, model.diffusion_model.output_blocks.3.0.out_layers.3.weight:  28%|██▊       | 322/1131 [00:01<00:03, 248.71it/s]
ram used:  2.23 GB, model.diffusion_model.output_blocks.3.0.out_layers.3.bias:  28%|██▊       | 322/1131 [00:01<00:03, 248.71it/s]  
ram used:  2.23 GB, model.diffusion_model.output_blocks.3.0.skip_connection.weight:  28%|██▊       | 322/1131 [00:01<00:03, 248.71it/s]
ram used:  2.24 GB, model.diffusion_model.output_blocks.3.0.skip_connection.bias:  28%|██▊       | 322/1131 [00:01<00:03, 248.71it/s]  
ram used:  2.24 GB, model.diffusion_model.output_blocks.3.1.norm.weight:  28%|██▊       | 322/1131 [00:01<00:03, 248.71it/s]         
ram used:  2.24 GB, model.diffusion_model.output_blocks.3.1.norm.bias :  28%|██▊       | 322/1131 [00:01<00:03, 248.71it/s] 
ram used:  2.24 GB, model.diffusion_model.output_blocks.3.1.norm.bias :  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]
ram used:  2.24 GB, model.diffusion_model.output_blocks.3.1.proj_in.weight:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]
ram used:  2.25 GB, model.diffusion_model.output_blocks.3.1.proj_in.bias:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]  
ram used:  2.25 GB, model.diffusion_model.output_blocks.3.1.transformer_blocks.0.attn1.to_q.weight:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]
ram used:  2.25 GB, model.diffusion_model.output_blocks.3.1.transformer_blocks.0.attn1.to_k.weight:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]
ram used:  2.26 GB, model.diffusion_model.output_blocks.3.1.transformer_blocks.0.attn1.to_v.weight:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]
ram used:  2.27 GB, model.diffusion_model.output_blocks.3.1.transformer_blocks.0.attn1.to_out.0.weight:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]
ram used:  2.27 GB, model.diffusion_model.output_blocks.3.1.transformer_blocks.0.attn1.to_out.0.bias:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]  
ram used:  2.27 GB, model.diffusion_model.output_blocks.3.1.transformer_blocks.0.ff.net.0.proj.weight:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]
ram used:  2.33 GB, model.diffusion_model.output_blocks.3.1.transformer_blocks.0.ff.net.0.proj.bias:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]  
ram used:  2.33 GB, model.diffusion_model.output_blocks.3.1.transformer_blocks.0.ff.net.2.weight:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]   
ram used:  2.35 GB, model.diffusion_model.output_blocks.3.1.transformer_blocks.0.ff.net.2.bias:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]  
ram used:  2.35 GB, model.diffusion_model.output_blocks.3.1.transformer_blocks.0.attn2.to_q.weight:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]
ram used:  2.36 GB, model.diffusion_model.output_blocks.3.1.transformer_blocks.0.attn2.to_k.weight:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]
ram used:  2.36 GB, model.diffusion_model.output_blocks.3.1.transformer_blocks.0.attn2.to_v.weight:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]
ram used:  2.37 GB, model.diffusion_model.output_blocks.3.1.transformer_blocks.0.attn2.to_out.0.weight:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]
ram used:  2.37 GB, model.diffusion_model.output_blocks.3.1.transformer_blocks.0.attn2.to_out.0.bias:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]  
ram used:  2.37 GB, model.diffusion_model.output_blocks.3.1.transformer_blocks.0.norm1.weight:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]       
ram used:  2.37 GB, model.diffusion_model.output_blocks.3.1.transformer_blocks.0.norm1.bias:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]  
ram used:  2.37 GB, model.diffusion_model.output_blocks.3.1.transformer_blocks.0.norm2.weight:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]
ram used:  2.37 GB, model.diffusion_model.output_blocks.3.1.transformer_blocks.0.norm2.bias:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]  
ram used:  2.37 GB, model.diffusion_model.output_blocks.3.1.transformer_blocks.0.norm3.weight:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]
ram used:  2.37 GB, model.diffusion_model.output_blocks.3.1.transformer_blocks.0.norm3.bias:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]  
ram used:  2.37 GB, model.diffusion_model.output_blocks.3.1.proj_out.weight:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]                
ram used:  2.38 GB, model.diffusion_model.output_blocks.3.1.proj_out.bias:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]  
ram used:  2.38 GB, model.diffusion_model.output_blocks.4.0.in_layers.0.weight:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]
ram used:  2.38 GB, model.diffusion_model.output_blocks.4.0.in_layers.0.bias:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]  
ram used:  2.38 GB, model.diffusion_model.output_blocks.4.0.in_layers.2.weight:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]
ram used:  2.38 GB, model.diffusion_model.output_blocks.4.0.in_layers.2.weight:  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s]
ram used:  2.50 GB, model.diffusion_model.output_blocks.4.0.in_layers.2.bias:  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s]  
ram used:  2.50 GB, model.diffusion_model.output_blocks.4.0.emb_layers.1.weight:  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s]
ram used:  2.50 GB, model.diffusion_model.output_blocks.4.0.emb_layers.1.bias:  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s]  
ram used:  2.50 GB, model.diffusion_model.output_blocks.4.0.out_layers.0.weight:  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s]
ram used:  2.50 GB, model.diffusion_model.output_blocks.4.0.out_layers.0.bias:  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s]  
ram used:  2.50 GB, model.diffusion_model.output_blocks.4.0.out_layers.3.weight:  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s]
ram used:  2.56 GB, model.diffusion_model.output_blocks.4.0.out_layers.3.bias:  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s]  
ram used:  2.56 GB, model.diffusion_model.output_blocks.4.0.skip_connection.weight:  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s]
ram used:  2.58 GB, model.diffusion_model.output_blocks.4.0.skip_connection.bias:  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s]  
ram used:  2.58 GB, model.diffusion_model.output_blocks.4.1.norm.weight:  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s]         
ram used:  2.58 GB, model.diffusion_model.output_blocks.4.1.norm.bias :  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s] 
ram used:  2.58 GB, model.diffusion_model.output_blocks.4.1.proj_in.weight:  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s]
ram used:  2.58 GB, model.diffusion_model.output_blocks.4.1.proj_in.bias:  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s]  
ram used:  2.58 GB, model.diffusion_model.output_blocks.4.1.transformer_blocks.0.attn1.to_q.weight:  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s]
ram used:  2.59 GB, model.diffusion_model.output_blocks.4.1.transformer_blocks.0.attn1.to_k.weight:  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s]
ram used:  2.60 GB, model.diffusion_model.output_blocks.4.1.transformer_blocks.0.attn1.to_v.weight:  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s]
ram used:  2.60 GB, model.diffusion_model.output_blocks.4.1.transformer_blocks.0.attn1.to_out.0.weight:  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s]
ram used:  2.61 GB, model.diffusion_model.output_blocks.4.1.transformer_blocks.0.attn1.to_out.0.bias:  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s]  
ram used:  2.61 GB, model.diffusion_model.output_blocks.4.1.transformer_blocks.0.ff.net.0.proj.weight:  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s]
ram used:  2.66 GB, model.diffusion_model.output_blocks.4.1.transformer_blocks.0.ff.net.0.proj.bias:  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s]  
ram used:  2.66 GB, model.diffusion_model.output_blocks.4.1.transformer_blocks.0.ff.net.2.weight:  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s]   
ram used:  2.69 GB, model.diffusion_model.output_blocks.4.1.transformer_blocks.0.ff.net.2.bias:  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s]  
ram used:  2.69 GB, model.diffusion_model.output_blocks.4.1.transformer_blocks.0.attn2.to_q.weight:  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s]
ram used:  2.69 GB, model.diffusion_model.output_blocks.4.1.transformer_blocks.0.attn2.to_k.weight:  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s]
ram used:  2.70 GB, model.diffusion_model.output_blocks.4.1.transformer_blocks.0.attn2.to_v.weight:  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s]
ram used:  2.70 GB, model.diffusion_model.output_blocks.4.1.transformer_blocks.0.attn2.to_out.0.weight:  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s]
ram used:  2.70 GB, model.diffusion_model.output_blocks.4.1.transformer_blocks.0.attn2.to_out.0.weight:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]
ram used:  2.71 GB, model.diffusion_model.output_blocks.4.1.transformer_blocks.0.attn2.to_out.0.bias:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]  
ram used:  2.71 GB, model.diffusion_model.output_blocks.4.1.transformer_blocks.0.norm1.weight:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]       
ram used:  2.71 GB, model.diffusion_model.output_blocks.4.1.transformer_blocks.0.norm1.bias:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]  
ram used:  2.71 GB, model.diffusion_model.output_blocks.4.1.transformer_blocks.0.norm2.weight:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]
ram used:  2.71 GB, model.diffusion_model.output_blocks.4.1.transformer_blocks.0.norm2.bias:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]  
ram used:  2.71 GB, model.diffusion_model.output_blocks.4.1.transformer_blocks.0.norm3.weight:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]
ram used:  2.71 GB, model.diffusion_model.output_blocks.4.1.transformer_blocks.0.norm3.bias:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]  
ram used:  2.71 GB, model.diffusion_model.output_blocks.4.1.proj_out.weight:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]                
ram used:  2.72 GB, model.diffusion_model.output_blocks.4.1.proj_out.bias:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]  
ram used:  2.72 GB, model.diffusion_model.output_blocks.5.0.in_layers.0.weight:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]
ram used:  2.72 GB, model.diffusion_model.output_blocks.5.0.in_layers.0.bias:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]  
ram used:  2.72 GB, model.diffusion_model.output_blocks.5.0.in_layers.2.weight:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]
ram used:  2.80 GB, model.diffusion_model.output_blocks.5.0.in_layers.2.bias:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]  
ram used:  2.80 GB, model.diffusion_model.output_blocks.5.0.emb_layers.1.weight:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]
ram used:  2.81 GB, model.diffusion_model.output_blocks.5.0.emb_layers.1.bias:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]  
ram used:  2.81 GB, model.diffusion_model.output_blocks.5.0.out_layers.0.weight:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]
ram used:  2.81 GB, model.diffusion_model.output_blocks.5.0.out_layers.0.bias:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]  
ram used:  2.81 GB, model.diffusion_model.output_blocks.5.0.out_layers.3.weight:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]
ram used:  2.87 GB, model.diffusion_model.output_blocks.5.0.out_layers.3.bias:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]  
ram used:  2.87 GB, model.diffusion_model.output_blocks.5.0.skip_connection.weight:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]
ram used:  2.88 GB, model.diffusion_model.output_blocks.5.0.skip_connection.bias:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]  
ram used:  2.88 GB, model.diffusion_model.output_blocks.5.1.norm.weight:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]         
ram used:  2.88 GB, model.diffusion_model.output_blocks.5.1.norm.bias :  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s] 
ram used:  2.88 GB, model.diffusion_model.output_blocks.5.1.proj_in.weight:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]
ram used:  2.89 GB, model.diffusion_model.output_blocks.5.1.proj_in.bias:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]  
ram used:  2.89 GB, model.diffusion_model.output_blocks.5.1.transformer_blocks.0.attn1.to_q.weight:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]
ram used:  2.89 GB, model.diffusion_model.output_blocks.5.1.transformer_blocks.0.attn1.to_k.weight:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]
ram used:  2.90 GB, model.diffusion_model.output_blocks.5.1.transformer_blocks.0.attn1.to_v.weight:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]
ram used:  2.91 GB, model.diffusion_model.output_blocks.5.1.transformer_blocks.0.attn1.to_out.0.weight:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]
ram used:  2.91 GB, model.diffusion_model.output_blocks.5.1.transformer_blocks.0.attn1.to_out.0.bias:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]  
ram used:  2.91 GB, model.diffusion_model.output_blocks.5.1.transformer_blocks.0.ff.net.0.proj.weight:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]
ram used:  2.91 GB, model.diffusion_model.output_blocks.5.1.transformer_blocks.0.ff.net.0.proj.weight:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]
ram used:  2.96 GB, model.diffusion_model.output_blocks.5.1.transformer_blocks.0.ff.net.0.proj.bias:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]  
ram used:  2.96 GB, model.diffusion_model.output_blocks.5.1.transformer_blocks.0.ff.net.2.weight:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]   
ram used:  2.99 GB, model.diffusion_model.output_blocks.5.1.transformer_blocks.0.ff.net.2.bias:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]  
ram used:  2.99 GB, model.diffusion_model.output_blocks.5.1.transformer_blocks.0.attn2.to_q.weight:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]
ram used:  3.00 GB, model.diffusion_model.output_blocks.5.1.transformer_blocks.0.attn2.to_k.weight:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]
ram used:  3.00 GB, model.diffusion_model.output_blocks.5.1.transformer_blocks.0.attn2.to_v.weight:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]
ram used:  3.01 GB, model.diffusion_model.output_blocks.5.1.transformer_blocks.0.attn2.to_out.0.weight:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]
ram used:  3.01 GB, model.diffusion_model.output_blocks.5.1.transformer_blocks.0.attn2.to_out.0.bias:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]  
ram used:  3.01 GB, model.diffusion_model.output_blocks.5.1.transformer_blocks.0.norm1.weight:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]       
ram used:  3.01 GB, model.diffusion_model.output_blocks.5.1.transformer_blocks.0.norm1.bias:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]  
ram used:  3.01 GB, model.diffusion_model.output_blocks.5.1.transformer_blocks.0.norm2.weight:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]
ram used:  3.01 GB, model.diffusion_model.output_blocks.5.1.transformer_blocks.0.norm2.bias:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]  
ram used:  3.01 GB, model.diffusion_model.output_blocks.5.1.transformer_blocks.0.norm3.weight:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]
ram used:  3.01 GB, model.diffusion_model.output_blocks.5.1.transformer_blocks.0.norm3.bias:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]  
ram used:  3.01 GB, model.diffusion_model.output_blocks.5.1.proj_out.weight:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]                
ram used:  3.02 GB, model.diffusion_model.output_blocks.5.1.proj_out.bias:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]  
ram used:  3.02 GB, model.diffusion_model.output_blocks.5.2.conv.weight:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]  
ram used:  3.08 GB, model.diffusion_model.output_blocks.5.2.conv.bias :  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s] 
ram used:  3.08 GB, model.diffusion_model.output_blocks.6.0.in_layers.0.weight:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]
ram used:  3.08 GB, model.diffusion_model.output_blocks.6.0.in_layers.0.bias:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]  
ram used:  3.08 GB, model.diffusion_model.output_blocks.6.0.in_layers.2.weight:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]
ram used:  3.12 GB, model.diffusion_model.output_blocks.6.0.in_layers.2.bias:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]  
ram used:  3.12 GB, model.diffusion_model.output_blocks.6.0.emb_layers.1.weight:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]
ram used:  3.12 GB, model.diffusion_model.output_blocks.6.0.emb_layers.1.bias:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]  
ram used:  3.12 GB, model.diffusion_model.output_blocks.6.0.out_layers.0.weight:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]
ram used:  3.12 GB, model.diffusion_model.output_blocks.6.0.out_layers.0.bias:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]  
ram used:  3.12 GB, model.diffusion_model.output_blocks.6.0.out_layers.3.weight:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]
ram used:  3.12 GB, model.diffusion_model.output_blocks.6.0.out_layers.3.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.14 GB, model.diffusion_model.output_blocks.6.0.out_layers.3.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.14 GB, model.diffusion_model.output_blocks.6.0.skip_connection.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.14 GB, model.diffusion_model.output_blocks.6.0.skip_connection.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.14 GB, model.diffusion_model.output_blocks.6.1.norm.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]         
ram used:  3.14 GB, model.diffusion_model.output_blocks.6.1.norm.bias :  41%|████      | 462/1131 [00:01<00:02, 245.25it/s] 
ram used:  3.14 GB, model.diffusion_model.output_blocks.6.1.proj_in.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.15 GB, model.diffusion_model.output_blocks.6.1.proj_in.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.15 GB, model.diffusion_model.output_blocks.6.1.transformer_blocks.0.attn1.to_q.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.15 GB, model.diffusion_model.output_blocks.6.1.transformer_blocks.0.attn1.to_k.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.15 GB, model.diffusion_model.output_blocks.6.1.transformer_blocks.0.attn1.to_v.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.15 GB, model.diffusion_model.output_blocks.6.1.transformer_blocks.0.attn1.to_out.0.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.15 GB, model.diffusion_model.output_blocks.6.1.transformer_blocks.0.attn1.to_out.0.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.15 GB, model.diffusion_model.output_blocks.6.1.transformer_blocks.0.ff.net.0.proj.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.17 GB, model.diffusion_model.output_blocks.6.1.transformer_blocks.0.ff.net.0.proj.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.17 GB, model.diffusion_model.output_blocks.6.1.transformer_blocks.0.ff.net.2.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]   
ram used:  3.17 GB, model.diffusion_model.output_blocks.6.1.transformer_blocks.0.ff.net.2.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.17 GB, model.diffusion_model.output_blocks.6.1.transformer_blocks.0.attn2.to_q.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.17 GB, model.diffusion_model.output_blocks.6.1.transformer_blocks.0.attn2.to_k.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.18 GB, model.diffusion_model.output_blocks.6.1.transformer_blocks.0.attn2.to_v.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.18 GB, model.diffusion_model.output_blocks.6.1.transformer_blocks.0.attn2.to_out.0.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.18 GB, model.diffusion_model.output_blocks.6.1.transformer_blocks.0.attn2.to_out.0.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.18 GB, model.diffusion_model.output_blocks.6.1.transformer_blocks.0.norm1.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]       
ram used:  3.18 GB, model.diffusion_model.output_blocks.6.1.transformer_blocks.0.norm1.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.18 GB, model.diffusion_model.output_blocks.6.1.transformer_blocks.0.norm2.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.18 GB, model.diffusion_model.output_blocks.6.1.transformer_blocks.0.norm2.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.18 GB, model.diffusion_model.output_blocks.6.1.transformer_blocks.0.norm3.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.18 GB, model.diffusion_model.output_blocks.6.1.transformer_blocks.0.norm3.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.18 GB, model.diffusion_model.output_blocks.6.1.proj_out.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]                
ram used:  3.18 GB, model.diffusion_model.output_blocks.6.1.proj_out.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.18 GB, model.diffusion_model.output_blocks.7.0.in_layers.0.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.18 GB, model.diffusion_model.output_blocks.7.0.in_layers.0.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.18 GB, model.diffusion_model.output_blocks.7.0.in_layers.2.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.21 GB, model.diffusion_model.output_blocks.7.0.in_layers.2.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.21 GB, model.diffusion_model.output_blocks.7.0.emb_layers.1.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.21 GB, model.diffusion_model.output_blocks.7.0.emb_layers.1.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.21 GB, model.diffusion_model.output_blocks.7.0.out_layers.0.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.21 GB, model.diffusion_model.output_blocks.7.0.out_layers.0.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.21 GB, model.diffusion_model.output_blocks.7.0.out_layers.3.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.23 GB, model.diffusion_model.output_blocks.7.0.out_layers.3.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.23 GB, model.diffusion_model.output_blocks.7.0.skip_connection.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.23 GB, model.diffusion_model.output_blocks.7.0.skip_connection.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.23 GB, model.diffusion_model.output_blocks.7.1.norm.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]         
ram used:  3.23 GB, model.diffusion_model.output_blocks.7.1.norm.bias :  41%|████      | 462/1131 [00:01<00:02, 245.25it/s] 
ram used:  3.23 GB, model.diffusion_model.output_blocks.7.1.proj_in.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.23 GB, model.diffusion_model.output_blocks.7.1.proj_in.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.23 GB, model.diffusion_model.output_blocks.7.1.transformer_blocks.0.attn1.to_q.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.24 GB, model.diffusion_model.output_blocks.7.1.transformer_blocks.0.attn1.to_k.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.24 GB, model.diffusion_model.output_blocks.7.1.transformer_blocks.0.attn1.to_v.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.24 GB, model.diffusion_model.output_blocks.7.1.transformer_blocks.0.attn1.to_out.0.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.24 GB, model.diffusion_model.output_blocks.7.1.transformer_blocks.0.attn1.to_out.0.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.24 GB, model.diffusion_model.output_blocks.7.1.transformer_blocks.0.ff.net.0.proj.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.25 GB, model.diffusion_model.output_blocks.7.1.transformer_blocks.0.ff.net.0.proj.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.25 GB, model.diffusion_model.output_blocks.7.1.transformer_blocks.0.ff.net.2.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]   
ram used:  3.26 GB, model.diffusion_model.output_blocks.7.1.transformer_blocks.0.ff.net.2.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.26 GB, model.diffusion_model.output_blocks.7.1.transformer_blocks.0.attn2.to_q.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.26 GB, model.diffusion_model.output_blocks.7.1.transformer_blocks.0.attn2.to_k.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.26 GB, model.diffusion_model.output_blocks.7.1.transformer_blocks.0.attn2.to_v.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.27 GB, model.diffusion_model.output_blocks.7.1.transformer_blocks.0.attn2.to_out.0.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.27 GB, model.diffusion_model.output_blocks.7.1.transformer_blocks.0.attn2.to_out.0.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.27 GB, model.diffusion_model.output_blocks.7.1.transformer_blocks.0.norm1.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]       
ram used:  3.27 GB, model.diffusion_model.output_blocks.7.1.transformer_blocks.0.norm1.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.27 GB, model.diffusion_model.output_blocks.7.1.transformer_blocks.0.norm2.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.27 GB, model.diffusion_model.output_blocks.7.1.transformer_blocks.0.norm2.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.27 GB, model.diffusion_model.output_blocks.7.1.transformer_blocks.0.norm3.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.27 GB, model.diffusion_model.output_blocks.7.1.transformer_blocks.0.norm3.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.27 GB, model.diffusion_model.output_blocks.7.1.proj_out.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]                
ram used:  3.27 GB, model.diffusion_model.output_blocks.7.1.proj_out.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.27 GB, model.diffusion_model.output_blocks.8.0.in_layers.0.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.27 GB, model.diffusion_model.output_blocks.8.0.in_layers.0.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.27 GB, model.diffusion_model.output_blocks.8.0.in_layers.2.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.29 GB, model.diffusion_model.output_blocks.8.0.in_layers.2.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.29 GB, model.diffusion_model.output_blocks.8.0.emb_layers.1.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.29 GB, model.diffusion_model.output_blocks.8.0.emb_layers.1.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.29 GB, model.diffusion_model.output_blocks.8.0.emb_layers.1.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.29 GB, model.diffusion_model.output_blocks.8.0.out_layers.0.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.29 GB, model.diffusion_model.output_blocks.8.0.out_layers.0.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.29 GB, model.diffusion_model.output_blocks.8.0.out_layers.3.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.31 GB, model.diffusion_model.output_blocks.8.0.out_layers.3.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.31 GB, model.diffusion_model.output_blocks.8.0.skip_connection.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.31 GB, model.diffusion_model.output_blocks.8.0.skip_connection.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.31 GB, model.diffusion_model.output_blocks.8.1.norm.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]         
ram used:  3.31 GB, model.diffusion_model.output_blocks.8.1.norm.bias :  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s] 
ram used:  3.31 GB, model.diffusion_model.output_blocks.8.1.proj_in.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.31 GB, model.diffusion_model.output_blocks.8.1.proj_in.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.31 GB, model.diffusion_model.output_blocks.8.1.transformer_blocks.0.attn1.to_q.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.31 GB, model.diffusion_model.output_blocks.8.1.transformer_blocks.0.attn1.to_k.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.32 GB, model.diffusion_model.output_blocks.8.1.transformer_blocks.0.attn1.to_v.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.32 GB, model.diffusion_model.output_blocks.8.1.transformer_blocks.0.attn1.to_out.0.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.32 GB, model.diffusion_model.output_blocks.8.1.transformer_blocks.0.attn1.to_out.0.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.32 GB, model.diffusion_model.output_blocks.8.1.transformer_blocks.0.ff.net.0.proj.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.33 GB, model.diffusion_model.output_blocks.8.1.transformer_blocks.0.ff.net.0.proj.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.33 GB, model.diffusion_model.output_blocks.8.1.transformer_blocks.0.ff.net.2.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]   
ram used:  3.34 GB, model.diffusion_model.output_blocks.8.1.transformer_blocks.0.ff.net.2.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.34 GB, model.diffusion_model.output_blocks.8.1.transformer_blocks.0.attn2.to_q.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.34 GB, model.diffusion_model.output_blocks.8.1.transformer_blocks.0.attn2.to_k.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.34 GB, model.diffusion_model.output_blocks.8.1.transformer_blocks.0.attn2.to_v.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.34 GB, model.diffusion_model.output_blocks.8.1.transformer_blocks.0.attn2.to_out.0.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.35 GB, model.diffusion_model.output_blocks.8.1.transformer_blocks.0.attn2.to_out.0.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.35 GB, model.diffusion_model.output_blocks.8.1.transformer_blocks.0.norm1.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]       
ram used:  3.35 GB, model.diffusion_model.output_blocks.8.1.transformer_blocks.0.norm1.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.35 GB, model.diffusion_model.output_blocks.8.1.transformer_blocks.0.norm2.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.35 GB, model.diffusion_model.output_blocks.8.1.transformer_blocks.0.norm2.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.35 GB, model.diffusion_model.output_blocks.8.1.transformer_blocks.0.norm3.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.35 GB, model.diffusion_model.output_blocks.8.1.transformer_blocks.0.norm3.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.35 GB, model.diffusion_model.output_blocks.8.1.proj_out.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]                
ram used:  3.35 GB, model.diffusion_model.output_blocks.8.1.proj_out.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.35 GB, model.diffusion_model.output_blocks.8.2.conv.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.36 GB, model.diffusion_model.output_blocks.8.2.conv.bias :  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s] 
ram used:  3.36 GB, model.diffusion_model.output_blocks.9.0.in_layers.0.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.36 GB, model.diffusion_model.output_blocks.9.0.in_layers.0.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.36 GB, model.diffusion_model.output_blocks.9.0.in_layers.2.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.37 GB, model.diffusion_model.output_blocks.9.0.in_layers.2.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.37 GB, model.diffusion_model.output_blocks.9.0.emb_layers.1.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.38 GB, model.diffusion_model.output_blocks.9.0.emb_layers.1.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.38 GB, model.diffusion_model.output_blocks.9.0.out_layers.0.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.38 GB, model.diffusion_model.output_blocks.9.0.out_layers.0.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.38 GB, model.diffusion_model.output_blocks.9.0.out_layers.3.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.38 GB, model.diffusion_model.output_blocks.9.0.out_layers.3.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.38 GB, model.diffusion_model.output_blocks.9.0.skip_connection.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.38 GB, model.diffusion_model.output_blocks.9.0.skip_connection.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.38 GB, model.diffusion_model.output_blocks.9.1.norm.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]         
ram used:  3.38 GB, model.diffusion_model.output_blocks.9.1.norm.bias :  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s] 
ram used:  3.38 GB, model.diffusion_model.output_blocks.9.1.proj_in.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.38 GB, model.diffusion_model.output_blocks.9.1.proj_in.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.38 GB, model.diffusion_model.output_blocks.9.1.transformer_blocks.0.attn1.to_q.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.38 GB, model.diffusion_model.output_blocks.9.1.transformer_blocks.0.attn1.to_k.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.38 GB, model.diffusion_model.output_blocks.9.1.transformer_blocks.0.attn1.to_v.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.38 GB, model.diffusion_model.output_blocks.9.1.transformer_blocks.0.attn1.to_out.0.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.38 GB, model.diffusion_model.output_blocks.9.1.transformer_blocks.0.attn1.to_out.0.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.38 GB, model.diffusion_model.output_blocks.9.1.transformer_blocks.0.ff.net.0.proj.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.39 GB, model.diffusion_model.output_blocks.9.1.transformer_blocks.0.ff.net.0.proj.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.39 GB, model.diffusion_model.output_blocks.9.1.transformer_blocks.0.ff.net.2.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]   
ram used:  3.39 GB, model.diffusion_model.output_blocks.9.1.transformer_blocks.0.ff.net.2.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.39 GB, model.diffusion_model.output_blocks.9.1.transformer_blocks.0.attn2.to_q.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.39 GB, model.diffusion_model.output_blocks.9.1.transformer_blocks.0.attn2.to_k.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.39 GB, model.diffusion_model.output_blocks.9.1.transformer_blocks.0.attn2.to_v.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.39 GB, model.diffusion_model.output_blocks.9.1.transformer_blocks.0.attn2.to_out.0.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.39 GB, model.diffusion_model.output_blocks.9.1.transformer_blocks.0.attn2.to_out.0.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.39 GB, model.diffusion_model.output_blocks.9.1.transformer_blocks.0.norm1.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]       
ram used:  3.39 GB, model.diffusion_model.output_blocks.9.1.transformer_blocks.0.norm1.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.39 GB, model.diffusion_model.output_blocks.9.1.transformer_blocks.0.norm2.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.39 GB, model.diffusion_model.output_blocks.9.1.transformer_blocks.0.norm2.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.39 GB, model.diffusion_model.output_blocks.9.1.transformer_blocks.0.norm3.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.39 GB, model.diffusion_model.output_blocks.9.1.transformer_blocks.0.norm3.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.39 GB, model.diffusion_model.output_blocks.9.1.proj_out.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]                
ram used:  3.39 GB, model.diffusion_model.output_blocks.9.1.proj_out.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.39 GB, model.diffusion_model.output_blocks.10.0.in_layers.0.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.39 GB, model.diffusion_model.output_blocks.10.0.in_layers.0.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.39 GB, model.diffusion_model.output_blocks.10.0.in_layers.2.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.40 GB, model.diffusion_model.output_blocks.10.0.in_layers.2.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.40 GB, model.diffusion_model.output_blocks.10.0.emb_layers.1.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.40 GB, model.diffusion_model.output_blocks.10.0.emb_layers.1.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.40 GB, model.diffusion_model.output_blocks.10.0.out_layers.0.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.40 GB, model.diffusion_model.output_blocks.10.0.out_layers.0.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.40 GB, model.diffusion_model.output_blocks.10.0.out_layers.3.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.40 GB, model.diffusion_model.output_blocks.10.0.out_layers.3.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.40 GB, model.diffusion_model.output_blocks.10.0.skip_connection.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.40 GB, model.diffusion_model.output_blocks.10.0.skip_connection.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.40 GB, model.diffusion_model.output_blocks.10.1.norm.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]         
ram used:  3.40 GB, model.diffusion_model.output_blocks.10.1.norm.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.40 GB, model.diffusion_model.output_blocks.10.1.proj_in.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.40 GB, model.diffusion_model.output_blocks.10.1.proj_in.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.40 GB, model.diffusion_model.output_blocks.10.1.transformer_blocks.0.attn1.to_q.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.40 GB, model.diffusion_model.output_blocks.10.1.transformer_blocks.0.attn1.to_k.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.41 GB, model.diffusion_model.output_blocks.10.1.transformer_blocks.0.attn1.to_v.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.41 GB, model.diffusion_model.output_blocks.10.1.transformer_blocks.0.attn1.to_out.0.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.41 GB, model.diffusion_model.output_blocks.10.1.transformer_blocks.0.attn1.to_out.0.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.41 GB, model.diffusion_model.output_blocks.10.1.transformer_blocks.0.ff.net.0.proj.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.41 GB, model.diffusion_model.output_blocks.10.1.transformer_blocks.0.ff.net.0.proj.weight:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]
ram used:  3.41 GB, model.diffusion_model.output_blocks.10.1.transformer_blocks.0.ff.net.0.proj.bias:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]  
ram used:  3.41 GB, model.diffusion_model.output_blocks.10.1.transformer_blocks.0.ff.net.2.weight:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]   
ram used:  3.41 GB, model.diffusion_model.output_blocks.10.1.transformer_blocks.0.ff.net.2.bias:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]  
ram used:  3.41 GB, model.diffusion_model.output_blocks.10.1.transformer_blocks.0.attn2.to_q.weight:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]
ram used:  3.41 GB, model.diffusion_model.output_blocks.10.1.transformer_blocks.0.attn2.to_k.weight:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]
ram used:  3.41 GB, model.diffusion_model.output_blocks.10.1.transformer_blocks.0.attn2.to_v.weight:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]
ram used:  3.41 GB, model.diffusion_model.output_blocks.10.1.transformer_blocks.0.attn2.to_out.0.weight:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]
ram used:  3.41 GB, model.diffusion_model.output_blocks.10.1.transformer_blocks.0.attn2.to_out.0.bias:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]  
ram used:  3.41 GB, model.diffusion_model.output_blocks.10.1.transformer_blocks.0.norm1.weight:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]       
ram used:  3.41 GB, model.diffusion_model.output_blocks.10.1.transformer_blocks.0.norm1.bias:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]  
ram used:  3.41 GB, model.diffusion_model.output_blocks.10.1.transformer_blocks.0.norm2.weight:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]
ram used:  3.41 GB, model.diffusion_model.output_blocks.10.1.transformer_blocks.0.norm2.bias:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]  
ram used:  3.41 GB, model.diffusion_model.output_blocks.10.1.transformer_blocks.0.norm3.weight:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]
ram used:  3.41 GB, model.diffusion_model.output_blocks.10.1.transformer_blocks.0.norm3.bias:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]  
ram used:  3.41 GB, model.diffusion_model.output_blocks.10.1.proj_out.weight:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]                
ram used:  3.41 GB, model.diffusion_model.output_blocks.10.1.proj_out.bias:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]  
ram used:  3.41 GB, model.diffusion_model.output_blocks.11.0.in_layers.0.weight:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]
ram used:  3.41 GB, model.diffusion_model.output_blocks.11.0.in_layers.0.bias:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]  
ram used:  3.41 GB, model.diffusion_model.output_blocks.11.0.in_layers.2.weight:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]
ram used:  3.42 GB, model.diffusion_model.output_blocks.11.0.in_layers.2.bias:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]  
ram used:  3.42 GB, model.diffusion_model.output_blocks.11.0.emb_layers.1.weight:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]
ram used:  3.42 GB, model.diffusion_model.output_blocks.11.0.emb_layers.1.bias:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]  
ram used:  3.42 GB, model.diffusion_model.output_blocks.11.0.out_layers.0.weight:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]
ram used:  3.42 GB, model.diffusion_model.output_blocks.11.0.out_layers.0.bias:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]  
ram used:  3.42 GB, model.diffusion_model.output_blocks.11.0.out_layers.3.weight:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]
ram used:  3.43 GB, model.diffusion_model.output_blocks.11.0.out_layers.3.bias:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]  
ram used:  3.43 GB, model.diffusion_model.output_blocks.11.0.skip_connection.weight:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]
ram used:  3.43 GB, model.diffusion_model.output_blocks.11.0.skip_connection.bias:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]  
ram used:  3.43 GB, model.diffusion_model.output_blocks.11.1.norm.weight:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]         
ram used:  3.43 GB, model.diffusion_model.output_blocks.11.1.norm.bias:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]  
ram used:  3.43 GB, model.diffusion_model.output_blocks.11.1.proj_in.weight:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]
ram used:  3.43 GB, model.diffusion_model.output_blocks.11.1.proj_in.bias:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]  
ram used:  3.43 GB, model.diffusion_model.output_blocks.11.1.transformer_blocks.0.attn1.to_q.weight:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]
ram used:  3.43 GB, model.diffusion_model.output_blocks.11.1.transformer_blocks.0.attn1.to_k.weight:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]
ram used:  3.43 GB, model.diffusion_model.output_blocks.11.1.transformer_blocks.0.attn1.to_v.weight:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]
ram used:  3.43 GB, model.diffusion_model.output_blocks.11.1.transformer_blocks.0.attn1.to_out.0.weight:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]
ram used:  3.43 GB, model.diffusion_model.output_blocks.11.1.transformer_blocks.0.attn1.to_out.0.bias:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]  
ram used:  3.43 GB, model.diffusion_model.output_blocks.11.1.transformer_blocks.0.ff.net.0.proj.weight:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]
ram used:  3.43 GB, model.diffusion_model.output_blocks.11.1.transformer_blocks.0.ff.net.0.proj.bias:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]  
ram used:  3.43 GB, model.diffusion_model.output_blocks.11.1.transformer_blocks.0.ff.net.2.weight:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]   
ram used:  3.43 GB, model.diffusion_model.output_blocks.11.1.transformer_blocks.0.ff.net.2.bias:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]  
ram used:  3.43 GB, model.diffusion_model.output_blocks.11.1.transformer_blocks.0.attn2.to_q.weight:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]
ram used:  3.44 GB, model.diffusion_model.output_blocks.11.1.transformer_blocks.0.attn2.to_k.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.44 GB, model.diffusion_model.output_blocks.11.1.transformer_blocks.0.attn2.to_v.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.44 GB, model.diffusion_model.output_blocks.11.1.transformer_blocks.0.attn2.to_out.0.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.44 GB, model.diffusion_model.output_blocks.11.1.transformer_blocks.0.attn2.to_out.0.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.44 GB, model.diffusion_model.output_blocks.11.1.transformer_blocks.0.norm1.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]       
ram used:  3.44 GB, model.diffusion_model.output_blocks.11.1.transformer_blocks.0.norm1.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.44 GB, model.diffusion_model.output_blocks.11.1.transformer_blocks.0.norm2.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.44 GB, model.diffusion_model.output_blocks.11.1.transformer_blocks.0.norm2.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.44 GB, model.diffusion_model.output_blocks.11.1.transformer_blocks.0.norm3.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.44 GB, model.diffusion_model.output_blocks.11.1.transformer_blocks.0.norm3.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.44 GB, model.diffusion_model.output_blocks.11.1.proj_out.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]                
ram used:  3.44 GB, model.diffusion_model.output_blocks.11.1.proj_out.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.44 GB, model.diffusion_model.out.0.weight                :  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]    
ram used:  3.44 GB, model.diffusion_model.out.0.bias                  :  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.44 GB, model.diffusion_model.out.2.weight                :  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.44 GB, model.diffusion_model.out.2.bias                  :  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.44 GB, first_stage_model.encoder.conv_in.weight          :  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.44 GB, first_stage_model.encoder.conv_in.bias            :  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.44 GB, first_stage_model.encoder.down.0.block.0.norm1.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.44 GB, first_stage_model.encoder.down.0.block.0.norm1.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.44 GB, first_stage_model.encoder.down.0.block.0.conv1.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.44 GB, first_stage_model.encoder.down.0.block.0.conv1.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.44 GB, first_stage_model.encoder.down.0.block.0.norm2.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.44 GB, first_stage_model.encoder.down.0.block.0.norm2.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.44 GB, first_stage_model.encoder.down.0.block.0.conv2.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.44 GB, first_stage_model.encoder.down.0.block.0.conv2.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.44 GB, first_stage_model.encoder.down.0.block.1.norm1.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.44 GB, first_stage_model.encoder.down.0.block.1.norm1.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.44 GB, first_stage_model.encoder.down.0.block.1.conv1.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.44 GB, first_stage_model.encoder.down.0.block.1.conv1.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.44 GB, first_stage_model.encoder.down.0.block.1.norm2.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.44 GB, first_stage_model.encoder.down.0.block.1.norm2.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.44 GB, first_stage_model.encoder.down.0.block.1.conv2.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.44 GB, first_stage_model.encoder.down.0.block.1.conv2.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.44 GB, first_stage_model.encoder.down.0.downsample.conv.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.44 GB, first_stage_model.encoder.down.0.downsample.conv.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.44 GB, first_stage_model.encoder.down.1.block.0.norm1.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.44 GB, first_stage_model.encoder.down.1.block.0.norm1.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.44 GB, first_stage_model.encoder.down.1.block.0.conv1.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.44 GB, first_stage_model.encoder.down.1.block.0.conv1.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.44 GB, first_stage_model.encoder.down.1.block.0.norm2.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.44 GB, first_stage_model.encoder.down.1.block.0.norm2.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.44 GB, first_stage_model.encoder.down.1.block.0.conv2.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.44 GB, first_stage_model.encoder.down.1.block.0.conv2.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.44 GB, first_stage_model.encoder.down.1.block.0.nin_shortcut.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.44 GB, first_stage_model.encoder.down.1.block.0.nin_shortcut.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.44 GB, first_stage_model.encoder.down.1.block.1.norm1.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]     
ram used:  3.44 GB, first_stage_model.encoder.down.1.block.1.norm1.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.44 GB, first_stage_model.encoder.down.1.block.1.conv1.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.45 GB, first_stage_model.encoder.down.1.block.1.conv1.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.45 GB, first_stage_model.encoder.down.1.block.1.norm2.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.45 GB, first_stage_model.encoder.down.1.block.1.norm2.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.45 GB, first_stage_model.encoder.down.1.block.1.conv2.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.45 GB, first_stage_model.encoder.down.1.block.1.conv2.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.45 GB, first_stage_model.encoder.down.1.downsample.conv.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.45 GB, first_stage_model.encoder.down.1.downsample.conv.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.45 GB, first_stage_model.encoder.down.2.block.0.norm1.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.45 GB, first_stage_model.encoder.down.2.block.0.norm1.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.45 GB, first_stage_model.encoder.down.2.block.0.conv1.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.46 GB, first_stage_model.encoder.down.2.block.0.conv1.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.46 GB, first_stage_model.encoder.down.2.block.0.norm2.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.46 GB, first_stage_model.encoder.down.2.block.0.norm2.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.46 GB, first_stage_model.encoder.down.2.block.0.conv2.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.47 GB, first_stage_model.encoder.down.2.block.0.conv2.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.47 GB, first_stage_model.encoder.down.2.block.0.nin_shortcut.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.47 GB, first_stage_model.encoder.down.2.block.0.nin_shortcut.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.47 GB, first_stage_model.encoder.down.2.block.1.norm1.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]     
ram used:  3.47 GB, first_stage_model.encoder.down.2.block.1.norm1.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.47 GB, first_stage_model.encoder.down.2.block.1.conv1.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.48 GB, first_stage_model.encoder.down.2.block.1.conv1.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.48 GB, first_stage_model.encoder.down.2.block.1.norm2.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.48 GB, first_stage_model.encoder.down.2.block.1.norm2.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.48 GB, first_stage_model.encoder.down.2.block.1.conv2.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.49 GB, first_stage_model.encoder.down.2.block.1.conv2.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.49 GB, first_stage_model.encoder.down.2.downsample.conv.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.49 GB, first_stage_model.encoder.down.2.downsample.conv.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.49 GB, first_stage_model.encoder.down.3.block.0.norm1.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.49 GB, first_stage_model.encoder.down.3.block.0.norm1.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.49 GB, first_stage_model.encoder.down.3.block.0.conv1.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.49 GB, first_stage_model.encoder.down.3.block.0.conv1.weight:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.50 GB, first_stage_model.encoder.down.3.block.0.conv1.bias:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]  
ram used:  3.50 GB, first_stage_model.encoder.down.3.block.0.norm2.weight:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.50 GB, first_stage_model.encoder.down.3.block.0.norm2.bias:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]  
ram used:  3.50 GB, first_stage_model.encoder.down.3.block.0.conv2.weight:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.51 GB, first_stage_model.encoder.down.3.block.0.conv2.bias:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]  
ram used:  3.51 GB, first_stage_model.encoder.down.3.block.1.norm1.weight:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.51 GB, first_stage_model.encoder.down.3.block.1.norm1.bias:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]  
ram used:  3.51 GB, first_stage_model.encoder.down.3.block.1.conv1.weight:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.52 GB, first_stage_model.encoder.down.3.block.1.conv1.bias:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]  
ram used:  3.52 GB, first_stage_model.encoder.down.3.block.1.norm2.weight:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.52 GB, first_stage_model.encoder.down.3.block.1.norm2.bias:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]  
ram used:  3.52 GB, first_stage_model.encoder.down.3.block.1.conv2.weight:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.53 GB, first_stage_model.encoder.down.3.block.1.conv2.bias:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]  
ram used:  3.53 GB, first_stage_model.encoder.mid.block_1.norm1.weight:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s] 
ram used:  3.53 GB, first_stage_model.encoder.mid.block_1.norm1.bias  :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.53 GB, first_stage_model.encoder.mid.block_1.conv1.weight:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.54 GB, first_stage_model.encoder.mid.block_1.conv1.bias  :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.54 GB, first_stage_model.encoder.mid.block_1.norm2.weight:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.54 GB, first_stage_model.encoder.mid.block_1.norm2.bias  :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.54 GB, first_stage_model.encoder.mid.block_1.conv2.weight:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.55 GB, first_stage_model.encoder.mid.block_1.conv2.bias  :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.55 GB, first_stage_model.encoder.mid.attn_1.norm.weight  :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.55 GB, first_stage_model.encoder.mid.attn_1.norm.bias    :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.55 GB, first_stage_model.encoder.mid.attn_1.q.weight     :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.55 GB, first_stage_model.encoder.mid.attn_1.q.bias       :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.55 GB, first_stage_model.encoder.mid.attn_1.k.weight     :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.55 GB, first_stage_model.encoder.mid.attn_1.k.bias       :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.55 GB, first_stage_model.encoder.mid.attn_1.v.weight     :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.55 GB, first_stage_model.encoder.mid.attn_1.v.bias       :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.55 GB, first_stage_model.encoder.mid.attn_1.proj_out.weight:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.56 GB, first_stage_model.encoder.mid.attn_1.proj_out.bias:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]  
ram used:  3.56 GB, first_stage_model.encoder.mid.block_2.norm1.weight:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.56 GB, first_stage_model.encoder.mid.block_2.norm1.bias  :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.56 GB, first_stage_model.encoder.mid.block_2.conv1.weight:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.57 GB, first_stage_model.encoder.mid.block_2.conv1.bias  :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.57 GB, first_stage_model.encoder.mid.block_2.norm2.weight:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.57 GB, first_stage_model.encoder.mid.block_2.norm2.bias  :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.57 GB, first_stage_model.encoder.mid.block_2.conv2.weight:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.57 GB, first_stage_model.encoder.mid.block_2.conv2.bias  :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.57 GB, first_stage_model.encoder.norm_out.weight         :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.57 GB, first_stage_model.encoder.norm_out.bias           :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.57 GB, first_stage_model.encoder.conv_out.weight         :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.57 GB, first_stage_model.encoder.conv_out.bias           :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.57 GB, first_stage_model.decoder.conv_in.weight          :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.57 GB, first_stage_model.decoder.conv_in.bias            :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.57 GB, first_stage_model.decoder.mid.block_1.norm1.weight:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.57 GB, first_stage_model.decoder.mid.block_1.norm1.bias  :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.57 GB, first_stage_model.decoder.mid.block_1.conv1.weight:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.58 GB, first_stage_model.decoder.mid.block_1.conv1.bias  :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.58 GB, first_stage_model.decoder.mid.block_1.norm2.weight:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.58 GB, first_stage_model.decoder.mid.block_1.norm2.bias  :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.58 GB, first_stage_model.decoder.mid.block_1.conv2.weight:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.59 GB, first_stage_model.decoder.mid.block_1.conv2.bias  :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.59 GB, first_stage_model.decoder.mid.attn_1.norm.weight  :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.59 GB, first_stage_model.decoder.mid.attn_1.norm.bias    :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.59 GB, first_stage_model.decoder.mid.attn_1.q.weight     :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.59 GB, first_stage_model.decoder.mid.attn_1.q.bias       :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.59 GB, first_stage_model.decoder.mid.attn_1.k.weight     :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.60 GB, first_stage_model.decoder.mid.attn_1.k.bias       :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.60 GB, first_stage_model.decoder.mid.attn_1.v.weight     :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.60 GB, first_stage_model.decoder.mid.attn_1.v.bias       :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.60 GB, first_stage_model.decoder.mid.attn_1.proj_out.weight:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.60 GB, first_stage_model.decoder.mid.attn_1.proj_out.bias:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]  
ram used:  3.60 GB, first_stage_model.decoder.mid.block_2.norm1.weight:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.60 GB, first_stage_model.decoder.mid.block_2.norm1.bias  :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.60 GB, first_stage_model.decoder.mid.block_2.conv1.weight:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.61 GB, first_stage_model.decoder.mid.block_2.conv1.bias  :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.61 GB, first_stage_model.decoder.mid.block_2.norm2.weight:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.61 GB, first_stage_model.decoder.mid.block_2.norm2.bias  :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.61 GB, first_stage_model.decoder.mid.block_2.conv2.weight:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.62 GB, first_stage_model.decoder.mid.block_2.conv2.bias  :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.62 GB, first_stage_model.decoder.up.0.block.0.norm1.weight:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.62 GB, first_stage_model.decoder.up.0.block.0.norm1.bias :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s] 
ram used:  3.62 GB, first_stage_model.decoder.up.0.block.0.conv1.weight:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.62 GB, first_stage_model.decoder.up.0.block.0.conv1.bias :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s] 
ram used:  3.62 GB, first_stage_model.decoder.up.0.block.0.norm2.weight:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.62 GB, first_stage_model.decoder.up.0.block.0.norm2.bias :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s] 
ram used:  3.62 GB, first_stage_model.decoder.up.0.block.0.conv2.weight:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.62 GB, first_stage_model.decoder.up.0.block.0.conv2.bias :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s] 
ram used:  3.62 GB, first_stage_model.decoder.up.0.block.0.nin_shortcut.weight:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.62 GB, first_stage_model.decoder.up.0.block.0.nin_shortcut.bias:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]  
ram used:  3.62 GB, first_stage_model.decoder.up.0.block.1.norm1.weight:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]     
ram used:  3.62 GB, first_stage_model.decoder.up.0.block.1.norm1.bias :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s] 
ram used:  3.62 GB, first_stage_model.decoder.up.0.block.1.conv1.weight:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.62 GB, first_stage_model.decoder.up.0.block.1.conv1.bias :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s] 
ram used:  3.62 GB, first_stage_model.decoder.up.0.block.1.norm2.weight:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.62 GB, first_stage_model.decoder.up.0.block.1.norm2.bias :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s] 
ram used:  3.62 GB, first_stage_model.decoder.up.0.block.1.conv2.weight:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.62 GB, first_stage_model.decoder.up.0.block.1.conv2.bias :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s] 
ram used:  3.62 GB, first_stage_model.decoder.up.0.block.2.norm1.weight:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.62 GB, first_stage_model.decoder.up.0.block.2.norm1.bias :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s] 
ram used:  3.62 GB, first_stage_model.decoder.up.0.block.2.conv1.weight:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.62 GB, first_stage_model.decoder.up.0.block.2.conv1.bias :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s] 
ram used:  3.62 GB, first_stage_model.decoder.up.0.block.2.norm2.weight:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.62 GB, first_stage_model.decoder.up.0.block.2.norm2.bias :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s] 
ram used:  3.62 GB, first_stage_model.decoder.up.0.block.2.conv2.weight:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.62 GB, first_stage_model.decoder.up.0.block.2.conv2.bias :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s] 
ram used:  3.62 GB, first_stage_model.decoder.up.1.block.0.norm1.weight:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.62 GB, first_stage_model.decoder.up.1.block.0.norm1.bias :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s] 
ram used:  3.62 GB, first_stage_model.decoder.up.1.block.0.conv1.weight:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.63 GB, first_stage_model.decoder.up.1.block.0.conv1.bias :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s] 
ram used:  3.63 GB, first_stage_model.decoder.up.1.block.0.norm2.weight:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.63 GB, first_stage_model.decoder.up.1.block.0.norm2.bias :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s] 
ram used:  3.63 GB, first_stage_model.decoder.up.1.block.0.norm2.bias :  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s]
ram used:  3.63 GB, first_stage_model.decoder.up.1.block.0.conv2.weight:  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s]
ram used:  3.63 GB, first_stage_model.decoder.up.1.block.0.conv2.bias :  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s] 
ram used:  3.63 GB, first_stage_model.decoder.up.1.block.0.nin_shortcut.weight:  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s]
ram used:  3.63 GB, first_stage_model.decoder.up.1.block.0.nin_shortcut.bias:  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s]  
ram used:  3.63 GB, first_stage_model.decoder.up.1.block.1.norm1.weight:  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s]     
ram used:  3.63 GB, first_stage_model.decoder.up.1.block.1.norm1.bias :  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s] 
ram used:  3.63 GB, first_stage_model.decoder.up.1.block.1.conv1.weight:  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s]
ram used:  3.63 GB, first_stage_model.decoder.up.1.block.1.conv1.bias :  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s] 
ram used:  3.63 GB, first_stage_model.decoder.up.1.block.1.norm2.weight:  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s]
ram used:  3.63 GB, first_stage_model.decoder.up.1.block.1.norm2.bias :  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s] 
ram used:  3.63 GB, first_stage_model.decoder.up.1.block.1.conv2.weight:  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s]
ram used:  3.63 GB, first_stage_model.decoder.up.1.block.1.conv2.bias :  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s] 
ram used:  3.63 GB, first_stage_model.decoder.up.1.block.2.norm1.weight:  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s]
ram used:  3.63 GB, first_stage_model.decoder.up.1.block.2.norm1.bias :  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s] 
ram used:  3.63 GB, first_stage_model.decoder.up.1.block.2.conv1.weight:  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s]
ram used:  3.64 GB, first_stage_model.decoder.up.1.block.2.conv1.bias :  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s] 
ram used:  3.64 GB, first_stage_model.decoder.up.1.block.2.norm2.weight:  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s]
ram used:  3.64 GB, first_stage_model.decoder.up.1.block.2.norm2.bias :  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s] 
ram used:  3.64 GB, first_stage_model.decoder.up.1.block.2.conv2.weight:  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s]
ram used:  3.64 GB, first_stage_model.decoder.up.1.block.2.conv2.bias :  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s] 
ram used:  3.64 GB, first_stage_model.decoder.up.1.upsample.conv.weight:  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s]
ram used:  3.64 GB, first_stage_model.decoder.up.1.upsample.conv.bias :  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s] 
ram used:  3.64 GB, first_stage_model.decoder.up.2.block.0.norm1.weight:  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s]
ram used:  3.64 GB, first_stage_model.decoder.up.2.block.0.norm1.bias :  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s] 
ram used:  3.64 GB, first_stage_model.decoder.up.2.block.0.conv1.weight:  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s]
ram used:  3.65 GB, first_stage_model.decoder.up.2.block.0.conv1.bias :  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s] 
ram used:  3.65 GB, first_stage_model.decoder.up.2.block.0.norm2.weight:  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s]
ram used:  3.65 GB, first_stage_model.decoder.up.2.block.0.norm2.bias :  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s] 
ram used:  3.65 GB, first_stage_model.decoder.up.2.block.0.conv2.weight:  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s]
ram used:  3.66 GB, first_stage_model.decoder.up.2.block.0.conv2.bias :  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s] 
ram used:  3.66 GB, first_stage_model.decoder.up.2.block.1.norm1.weight:  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s]
ram used:  3.66 GB, first_stage_model.decoder.up.2.block.1.norm1.bias :  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s] 
ram used:  3.66 GB, first_stage_model.decoder.up.2.block.1.conv1.weight:  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s]
ram used:  3.67 GB, first_stage_model.decoder.up.2.block.1.conv1.bias :  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s] 
ram used:  3.67 GB, first_stage_model.decoder.up.2.block.1.norm2.weight:  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s]
ram used:  3.67 GB, first_stage_model.decoder.up.2.block.1.norm2.bias :  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s] 
ram used:  3.67 GB, first_stage_model.decoder.up.2.block.1.conv2.weight:  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s]
ram used:  3.68 GB, first_stage_model.decoder.up.2.block.1.conv2.bias :  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s] 
ram used:  3.68 GB, first_stage_model.decoder.up.2.block.2.norm1.weight:  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s]
ram used:  3.68 GB, first_stage_model.decoder.up.2.block.2.norm1.bias :  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s] 
ram used:  3.68 GB, first_stage_model.decoder.up.2.block.2.conv1.weight:  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s]
ram used:  3.69 GB, first_stage_model.decoder.up.2.block.2.conv1.bias :  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s] 
ram used:  3.69 GB, first_stage_model.decoder.up.2.block.2.norm2.weight:  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s]
ram used:  3.69 GB, first_stage_model.decoder.up.2.block.2.norm2.bias :  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s] 
ram used:  3.69 GB, first_stage_model.decoder.up.2.block.2.conv2.weight:  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s]
ram used:  3.70 GB, first_stage_model.decoder.up.2.block.2.conv2.bias :  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s] 
ram used:  3.70 GB, first_stage_model.decoder.up.2.upsample.conv.weight:  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s]
ram used:  3.71 GB, first_stage_model.decoder.up.2.upsample.conv.bias :  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s] 
ram used:  3.71 GB, first_stage_model.decoder.up.3.block.0.norm1.weight:  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s]
ram used:  3.71 GB, first_stage_model.decoder.up.3.block.0.norm1.bias :  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s] 
ram used:  3.71 GB, first_stage_model.decoder.up.3.block.0.conv1.weight:  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s]
ram used:  3.72 GB, first_stage_model.decoder.up.3.block.0.conv1.bias :  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s] 
ram used:  3.72 GB, first_stage_model.decoder.up.3.block.0.norm2.weight:  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s]
ram used:  3.72 GB, first_stage_model.decoder.up.3.block.0.norm2.bias :  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s] 
ram used:  3.72 GB, first_stage_model.decoder.up.3.block.0.conv2.weight:  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s]
ram used:  3.73 GB, first_stage_model.decoder.up.3.block.0.conv2.bias :  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s] 
ram used:  3.73 GB, first_stage_model.decoder.up.3.block.1.norm1.weight:  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s]
ram used:  3.73 GB, first_stage_model.decoder.up.3.block.1.norm1.bias :  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s] 
ram used:  3.73 GB, first_stage_model.decoder.up.3.block.1.conv1.weight:  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s]
ram used:  3.73 GB, first_stage_model.decoder.up.3.block.1.conv1.bias :  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s] 
ram used:  3.73 GB, first_stage_model.decoder.up.3.block.1.norm2.weight:  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s]
ram used:  3.73 GB, first_stage_model.decoder.up.3.block.1.norm2.bias :  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s] 
ram used:  3.73 GB, first_stage_model.decoder.up.3.block.1.conv2.weight:  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s]
ram used:  3.74 GB, first_stage_model.decoder.up.3.block.1.conv2.bias :  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s] 
ram used:  3.74 GB, first_stage_model.decoder.up.3.block.2.norm1.weight:  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s]
ram used:  3.74 GB, first_stage_model.decoder.up.3.block.2.norm1.bias :  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s] 
ram used:  3.74 GB, first_stage_model.decoder.up.3.block.2.conv1.weight:  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s]
ram used:  3.75 GB, first_stage_model.decoder.up.3.block.2.conv1.bias :  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s] 
ram used:  3.75 GB, first_stage_model.decoder.up.3.block.2.norm2.weight:  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s]
ram used:  3.75 GB, first_stage_model.decoder.up.3.block.2.norm2.bias :  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s] 
ram used:  3.75 GB, first_stage_model.decoder.up.3.block.2.conv2.weight:  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s]
ram used:  3.76 GB, first_stage_model.decoder.up.3.block.2.conv2.bias :  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s] 
ram used:  3.76 GB, first_stage_model.decoder.up.3.upsample.conv.weight:  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s]
ram used:  3.77 GB, first_stage_model.decoder.up.3.upsample.conv.bias :  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s] 
ram used:  3.77 GB, first_stage_model.decoder.norm_out.weight         :  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s]
ram used:  3.77 GB, first_stage_model.decoder.norm_out.bias           :  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s]
ram used:  3.77 GB, first_stage_model.decoder.conv_out.weight         :  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s]
ram used:  3.77 GB, first_stage_model.decoder.conv_out.bias           :  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s]
ram used:  3.77 GB, first_stage_model.quant_conv.weight               :  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s]
ram used:  3.77 GB, first_stage_model.quant_conv.bias                 :  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s]
ram used:  3.77 GB, first_stage_model.post_quant_conv.weight          :  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s]
ram used:  3.77 GB, first_stage_model.post_quant_conv.bias            :  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s]
ram used:  3.77 GB, cond_stage_model.transformer.text_model.embeddings.token_embedding.weight:  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s]
ram used:  3.92 GB, cond_stage_model.transformer.text_model.embeddings.position_embedding.weight:  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s]
ram used:  3.92 GB, cond_stage_model.transformer.text_model.embeddings.position_embedding.weight:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]
ram used:  3.92 GB, cond_stage_model.transformer.text_model.encoder.layers.0.self_attn.k_proj.weight:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]
ram used:  3.93 GB, cond_stage_model.transformer.text_model.encoder.layers.0.self_attn.k_proj.bias:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]  
ram used:  3.93 GB, cond_stage_model.transformer.text_model.encoder.layers.0.self_attn.v_proj.weight:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]
ram used:  3.93 GB, cond_stage_model.transformer.text_model.encoder.layers.0.self_attn.v_proj.bias:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]  
ram used:  3.93 GB, cond_stage_model.transformer.text_model.encoder.layers.0.self_attn.q_proj.weight:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]
ram used:  3.93 GB, cond_stage_model.transformer.text_model.encoder.layers.0.self_attn.q_proj.bias:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]  
ram used:  3.93 GB, cond_stage_model.transformer.text_model.encoder.layers.0.self_attn.out_proj.weight:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]
ram used:  3.93 GB, cond_stage_model.transformer.text_model.encoder.layers.0.self_attn.out_proj.bias:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]  
ram used:  3.93 GB, cond_stage_model.transformer.text_model.encoder.layers.0.layer_norm1.weight:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]     
ram used:  3.93 GB, cond_stage_model.transformer.text_model.encoder.layers.0.layer_norm1.bias:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]  
ram used:  3.93 GB, cond_stage_model.transformer.text_model.encoder.layers.0.mlp.fc1.weight:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]  
ram used:  3.94 GB, cond_stage_model.transformer.text_model.encoder.layers.0.mlp.fc1.bias:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]  
ram used:  3.94 GB, cond_stage_model.transformer.text_model.encoder.layers.0.mlp.fc2.weight:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]
ram used:  3.95 GB, cond_stage_model.transformer.text_model.encoder.layers.0.mlp.fc2.bias:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]  
ram used:  3.95 GB, cond_stage_model.transformer.text_model.encoder.layers.0.layer_norm2.weight:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]
ram used:  3.95 GB, cond_stage_model.transformer.text_model.encoder.layers.0.layer_norm2.bias:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]  
ram used:  3.95 GB, cond_stage_model.transformer.text_model.encoder.layers.1.self_attn.k_proj.weight:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]
ram used:  3.96 GB, cond_stage_model.transformer.text_model.encoder.layers.1.self_attn.k_proj.bias:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]  
ram used:  3.96 GB, cond_stage_model.transformer.text_model.encoder.layers.1.self_attn.v_proj.weight:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]
ram used:  3.96 GB, cond_stage_model.transformer.text_model.encoder.layers.1.self_attn.v_proj.bias:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]  
ram used:  3.96 GB, cond_stage_model.transformer.text_model.encoder.layers.1.self_attn.q_proj.weight:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]
ram used:  3.96 GB, cond_stage_model.transformer.text_model.encoder.layers.1.self_attn.q_proj.bias:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]  
ram used:  3.96 GB, cond_stage_model.transformer.text_model.encoder.layers.1.self_attn.out_proj.weight:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]
ram used:  3.96 GB, cond_stage_model.transformer.text_model.encoder.layers.1.self_attn.out_proj.bias:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]  
ram used:  3.96 GB, cond_stage_model.transformer.text_model.encoder.layers.1.layer_norm1.weight:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]     
ram used:  3.96 GB, cond_stage_model.transformer.text_model.encoder.layers.1.layer_norm1.bias:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]  
ram used:  3.96 GB, cond_stage_model.transformer.text_model.encoder.layers.1.mlp.fc1.weight:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]  
ram used:  3.97 GB, cond_stage_model.transformer.text_model.encoder.layers.1.mlp.fc1.bias:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]  
ram used:  3.97 GB, cond_stage_model.transformer.text_model.encoder.layers.1.mlp.fc2.weight:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]
ram used:  3.98 GB, cond_stage_model.transformer.text_model.encoder.layers.1.mlp.fc2.bias:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]  
ram used:  3.98 GB, cond_stage_model.transformer.text_model.encoder.layers.1.layer_norm2.weight:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]
ram used:  3.98 GB, cond_stage_model.transformer.text_model.encoder.layers.1.layer_norm2.bias:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]  
ram used:  3.98 GB, cond_stage_model.transformer.text_model.encoder.layers.2.self_attn.k_proj.weight:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]
ram used:  3.98 GB, cond_stage_model.transformer.text_model.encoder.layers.2.self_attn.k_proj.bias:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]  
ram used:  3.98 GB, cond_stage_model.transformer.text_model.encoder.layers.2.self_attn.v_proj.weight:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]
ram used:  3.99 GB, cond_stage_model.transformer.text_model.encoder.layers.2.self_attn.v_proj.bias:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]  
ram used:  3.99 GB, cond_stage_model.transformer.text_model.encoder.layers.2.self_attn.q_proj.weight:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]
ram used:  3.99 GB, cond_stage_model.transformer.text_model.encoder.layers.2.self_attn.q_proj.bias:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]  
ram used:  3.99 GB, cond_stage_model.transformer.text_model.encoder.layers.2.self_attn.out_proj.weight:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]
ram used:  3.99 GB, cond_stage_model.transformer.text_model.encoder.layers.2.self_attn.out_proj.bias:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]  
ram used:  3.99 GB, cond_stage_model.transformer.text_model.encoder.layers.2.layer_norm1.weight:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]     
ram used:  3.99 GB, cond_stage_model.transformer.text_model.encoder.layers.2.layer_norm1.bias:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]  
ram used:  3.99 GB, cond_stage_model.transformer.text_model.encoder.layers.2.mlp.fc1.weight:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]  
ram used:  4.00 GB, cond_stage_model.transformer.text_model.encoder.layers.2.mlp.fc1.bias:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]  
ram used:  4.00 GB, cond_stage_model.transformer.text_model.encoder.layers.2.mlp.fc2.weight:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]
ram used:  4.01 GB, cond_stage_model.transformer.text_model.encoder.layers.2.mlp.fc2.bias:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]  
ram used:  4.01 GB, cond_stage_model.transformer.text_model.encoder.layers.2.layer_norm2.weight:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]
ram used:  4.01 GB, cond_stage_model.transformer.text_model.encoder.layers.2.layer_norm2.bias:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]  
ram used:  4.01 GB, cond_stage_model.transformer.text_model.encoder.layers.3.self_attn.k_proj.weight:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]
ram used:  4.01 GB, cond_stage_model.transformer.text_model.encoder.layers.3.self_attn.k_proj.bias:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]  
ram used:  4.01 GB, cond_stage_model.transformer.text_model.encoder.layers.3.self_attn.v_proj.weight:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]
ram used:  4.01 GB, cond_stage_model.transformer.text_model.encoder.layers.3.self_attn.v_proj.bias:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]  
ram used:  4.01 GB, cond_stage_model.transformer.text_model.encoder.layers.3.self_attn.q_proj.weight:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]
ram used:  4.02 GB, cond_stage_model.transformer.text_model.encoder.layers.3.self_attn.q_proj.bias:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]  
ram used:  4.02 GB, cond_stage_model.transformer.text_model.encoder.layers.3.self_attn.out_proj.weight:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]
ram used:  4.02 GB, cond_stage_model.transformer.text_model.encoder.layers.3.self_attn.out_proj.bias:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]  
ram used:  4.02 GB, cond_stage_model.transformer.text_model.encoder.layers.3.layer_norm1.weight:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]     
ram used:  4.02 GB, cond_stage_model.transformer.text_model.encoder.layers.3.layer_norm1.bias:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]  
ram used:  4.02 GB, cond_stage_model.transformer.text_model.encoder.layers.3.mlp.fc1.weight:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]  
ram used:  4.03 GB, cond_stage_model.transformer.text_model.encoder.layers.3.mlp.fc1.bias:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]  
ram used:  4.03 GB, cond_stage_model.transformer.text_model.encoder.layers.3.mlp.fc2.weight:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]
ram used:  4.04 GB, cond_stage_model.transformer.text_model.encoder.layers.3.mlp.fc2.bias:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]  
ram used:  4.04 GB, cond_stage_model.transformer.text_model.encoder.layers.3.layer_norm2.weight:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]
ram used:  4.04 GB, cond_stage_model.transformer.text_model.encoder.layers.3.layer_norm2.bias:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]  
ram used:  4.04 GB, cond_stage_model.transformer.text_model.encoder.layers.4.self_attn.k_proj.weight:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]
ram used:  4.04 GB, cond_stage_model.transformer.text_model.encoder.layers.4.self_attn.k_proj.bias:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]  
ram used:  4.04 GB, cond_stage_model.transformer.text_model.encoder.layers.4.self_attn.v_proj.weight:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]
ram used:  4.04 GB, cond_stage_model.transformer.text_model.encoder.layers.4.self_attn.v_proj.bias:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]  
ram used:  4.04 GB, cond_stage_model.transformer.text_model.encoder.layers.4.self_attn.q_proj.weight:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]
ram used:  4.05 GB, cond_stage_model.transformer.text_model.encoder.layers.4.self_attn.q_proj.bias:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]  
ram used:  4.05 GB, cond_stage_model.transformer.text_model.encoder.layers.4.self_attn.out_proj.weight:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]
ram used:  4.05 GB, cond_stage_model.transformer.text_model.encoder.layers.4.self_attn.out_proj.bias:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]  
ram used:  4.05 GB, cond_stage_model.transformer.text_model.encoder.layers.4.layer_norm1.weight:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]     
ram used:  4.05 GB, cond_stage_model.transformer.text_model.encoder.layers.4.layer_norm1.bias:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]  
ram used:  4.05 GB, cond_stage_model.transformer.text_model.encoder.layers.4.mlp.fc1.weight:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]  
ram used:  4.06 GB, cond_stage_model.transformer.text_model.encoder.layers.4.mlp.fc1.bias:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]  
ram used:  4.06 GB, cond_stage_model.transformer.text_model.encoder.layers.4.mlp.fc2.weight:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]
ram used:  4.06 GB, cond_stage_model.transformer.text_model.encoder.layers.4.mlp.fc2.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.07 GB, cond_stage_model.transformer.text_model.encoder.layers.4.mlp.fc2.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.07 GB, cond_stage_model.transformer.text_model.encoder.layers.4.layer_norm2.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.07 GB, cond_stage_model.transformer.text_model.encoder.layers.4.layer_norm2.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.07 GB, cond_stage_model.transformer.text_model.encoder.layers.5.self_attn.k_proj.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.07 GB, cond_stage_model.transformer.text_model.encoder.layers.5.self_attn.k_proj.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.07 GB, cond_stage_model.transformer.text_model.encoder.layers.5.self_attn.v_proj.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.07 GB, cond_stage_model.transformer.text_model.encoder.layers.5.self_attn.v_proj.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.07 GB, cond_stage_model.transformer.text_model.encoder.layers.5.self_attn.q_proj.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.07 GB, cond_stage_model.transformer.text_model.encoder.layers.5.self_attn.q_proj.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.07 GB, cond_stage_model.transformer.text_model.encoder.layers.5.self_attn.out_proj.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.08 GB, cond_stage_model.transformer.text_model.encoder.layers.5.self_attn.out_proj.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.08 GB, cond_stage_model.transformer.text_model.encoder.layers.5.layer_norm1.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]     
ram used:  4.08 GB, cond_stage_model.transformer.text_model.encoder.layers.5.layer_norm1.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.08 GB, cond_stage_model.transformer.text_model.encoder.layers.5.mlp.fc1.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.09 GB, cond_stage_model.transformer.text_model.encoder.layers.5.mlp.fc1.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.09 GB, cond_stage_model.transformer.text_model.encoder.layers.5.mlp.fc2.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.09 GB, cond_stage_model.transformer.text_model.encoder.layers.5.mlp.fc2.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.09 GB, cond_stage_model.transformer.text_model.encoder.layers.5.layer_norm2.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.09 GB, cond_stage_model.transformer.text_model.encoder.layers.5.layer_norm2.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.09 GB, cond_stage_model.transformer.text_model.encoder.layers.6.self_attn.k_proj.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.10 GB, cond_stage_model.transformer.text_model.encoder.layers.6.self_attn.k_proj.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.10 GB, cond_stage_model.transformer.text_model.encoder.layers.6.self_attn.v_proj.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.10 GB, cond_stage_model.transformer.text_model.encoder.layers.6.self_attn.v_proj.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.10 GB, cond_stage_model.transformer.text_model.encoder.layers.6.self_attn.q_proj.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.10 GB, cond_stage_model.transformer.text_model.encoder.layers.6.self_attn.q_proj.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.10 GB, cond_stage_model.transformer.text_model.encoder.layers.6.self_attn.out_proj.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.10 GB, cond_stage_model.transformer.text_model.encoder.layers.6.self_attn.out_proj.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.10 GB, cond_stage_model.transformer.text_model.encoder.layers.6.layer_norm1.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]     
ram used:  4.10 GB, cond_stage_model.transformer.text_model.encoder.layers.6.layer_norm1.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.10 GB, cond_stage_model.transformer.text_model.encoder.layers.6.mlp.fc1.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.11 GB, cond_stage_model.transformer.text_model.encoder.layers.6.mlp.fc1.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.11 GB, cond_stage_model.transformer.text_model.encoder.layers.6.mlp.fc2.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.12 GB, cond_stage_model.transformer.text_model.encoder.layers.6.mlp.fc2.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.12 GB, cond_stage_model.transformer.text_model.encoder.layers.6.layer_norm2.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.12 GB, cond_stage_model.transformer.text_model.encoder.layers.6.layer_norm2.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.12 GB, cond_stage_model.transformer.text_model.encoder.layers.7.self_attn.k_proj.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.13 GB, cond_stage_model.transformer.text_model.encoder.layers.7.self_attn.k_proj.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.13 GB, cond_stage_model.transformer.text_model.encoder.layers.7.self_attn.v_proj.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.13 GB, cond_stage_model.transformer.text_model.encoder.layers.7.self_attn.v_proj.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.13 GB, cond_stage_model.transformer.text_model.encoder.layers.7.self_attn.q_proj.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.13 GB, cond_stage_model.transformer.text_model.encoder.layers.7.self_attn.q_proj.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.13 GB, cond_stage_model.transformer.text_model.encoder.layers.7.self_attn.out_proj.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.13 GB, cond_stage_model.transformer.text_model.encoder.layers.7.self_attn.out_proj.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.13 GB, cond_stage_model.transformer.text_model.encoder.layers.7.layer_norm1.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]     
ram used:  4.13 GB, cond_stage_model.transformer.text_model.encoder.layers.7.layer_norm1.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.13 GB, cond_stage_model.transformer.text_model.encoder.layers.7.mlp.fc1.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.14 GB, cond_stage_model.transformer.text_model.encoder.layers.7.mlp.fc1.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.14 GB, cond_stage_model.transformer.text_model.encoder.layers.7.mlp.fc2.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.15 GB, cond_stage_model.transformer.text_model.encoder.layers.7.mlp.fc2.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.15 GB, cond_stage_model.transformer.text_model.encoder.layers.7.layer_norm2.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.15 GB, cond_stage_model.transformer.text_model.encoder.layers.7.layer_norm2.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.15 GB, cond_stage_model.transformer.text_model.encoder.layers.8.self_attn.k_proj.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.15 GB, cond_stage_model.transformer.text_model.encoder.layers.8.self_attn.k_proj.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.15 GB, cond_stage_model.transformer.text_model.encoder.layers.8.self_attn.v_proj.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.16 GB, cond_stage_model.transformer.text_model.encoder.layers.8.self_attn.v_proj.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.16 GB, cond_stage_model.transformer.text_model.encoder.layers.8.self_attn.q_proj.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.16 GB, cond_stage_model.transformer.text_model.encoder.layers.8.self_attn.q_proj.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.16 GB, cond_stage_model.transformer.text_model.encoder.layers.8.self_attn.out_proj.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.16 GB, cond_stage_model.transformer.text_model.encoder.layers.8.self_attn.out_proj.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.16 GB, cond_stage_model.transformer.text_model.encoder.layers.8.layer_norm1.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]     
ram used:  4.16 GB, cond_stage_model.transformer.text_model.encoder.layers.8.layer_norm1.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.16 GB, cond_stage_model.transformer.text_model.encoder.layers.8.mlp.fc1.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.17 GB, cond_stage_model.transformer.text_model.encoder.layers.8.mlp.fc1.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.17 GB, cond_stage_model.transformer.text_model.encoder.layers.8.mlp.fc2.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.18 GB, cond_stage_model.transformer.text_model.encoder.layers.8.mlp.fc2.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.18 GB, cond_stage_model.transformer.text_model.encoder.layers.8.layer_norm2.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.18 GB, cond_stage_model.transformer.text_model.encoder.layers.8.layer_norm2.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.18 GB, cond_stage_model.transformer.text_model.encoder.layers.9.self_attn.k_proj.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.18 GB, cond_stage_model.transformer.text_model.encoder.layers.9.self_attn.k_proj.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.18 GB, cond_stage_model.transformer.text_model.encoder.layers.9.self_attn.v_proj.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.18 GB, cond_stage_model.transformer.text_model.encoder.layers.9.self_attn.v_proj.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.18 GB, cond_stage_model.transformer.text_model.encoder.layers.9.self_attn.q_proj.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.19 GB, cond_stage_model.transformer.text_model.encoder.layers.9.self_attn.q_proj.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.19 GB, cond_stage_model.transformer.text_model.encoder.layers.9.self_attn.out_proj.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.19 GB, cond_stage_model.transformer.text_model.encoder.layers.9.self_attn.out_proj.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.19 GB, cond_stage_model.transformer.text_model.encoder.layers.9.layer_norm1.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]     
ram used:  4.19 GB, cond_stage_model.transformer.text_model.encoder.layers.9.layer_norm1.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.19 GB, cond_stage_model.transformer.text_model.encoder.layers.9.mlp.fc1.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.20 GB, cond_stage_model.transformer.text_model.encoder.layers.9.mlp.fc1.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.20 GB, cond_stage_model.transformer.text_model.encoder.layers.9.mlp.fc2.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.21 GB, cond_stage_model.transformer.text_model.encoder.layers.9.mlp.fc2.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.21 GB, cond_stage_model.transformer.text_model.encoder.layers.9.layer_norm2.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.21 GB, cond_stage_model.transformer.text_model.encoder.layers.9.layer_norm2.weight:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]
ram used:  4.21 GB, cond_stage_model.transformer.text_model.encoder.layers.9.layer_norm2.bias:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]  
ram used:  4.21 GB, cond_stage_model.transformer.text_model.encoder.layers.10.self_attn.k_proj.weight:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]
ram used:  4.21 GB, cond_stage_model.transformer.text_model.encoder.layers.10.self_attn.k_proj.bias:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]  
ram used:  4.21 GB, cond_stage_model.transformer.text_model.encoder.layers.10.self_attn.v_proj.weight:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]
ram used:  4.21 GB, cond_stage_model.transformer.text_model.encoder.layers.10.self_attn.v_proj.bias:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]  
ram used:  4.21 GB, cond_stage_model.transformer.text_model.encoder.layers.10.self_attn.q_proj.weight:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]
ram used:  4.22 GB, cond_stage_model.transformer.text_model.encoder.layers.10.self_attn.q_proj.bias:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]  
ram used:  4.22 GB, cond_stage_model.transformer.text_model.encoder.layers.10.self_attn.out_proj.weight:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]
ram used:  4.22 GB, cond_stage_model.transformer.text_model.encoder.layers.10.self_attn.out_proj.bias:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]  
ram used:  4.22 GB, cond_stage_model.transformer.text_model.encoder.layers.10.layer_norm1.weight:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]     
ram used:  4.22 GB, cond_stage_model.transformer.text_model.encoder.layers.10.layer_norm1.bias:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]  
ram used:  4.22 GB, cond_stage_model.transformer.text_model.encoder.layers.10.mlp.fc1.weight:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]  
ram used:  4.23 GB, cond_stage_model.transformer.text_model.encoder.layers.10.mlp.fc1.bias:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]  
ram used:  4.23 GB, cond_stage_model.transformer.text_model.encoder.layers.10.mlp.fc2.weight:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]
ram used:  4.24 GB, cond_stage_model.transformer.text_model.encoder.layers.10.mlp.fc2.bias:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]  
ram used:  4.24 GB, cond_stage_model.transformer.text_model.encoder.layers.10.layer_norm2.weight:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]
ram used:  4.24 GB, cond_stage_model.transformer.text_model.encoder.layers.10.layer_norm2.bias:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]  
ram used:  4.24 GB, cond_stage_model.transformer.text_model.encoder.layers.11.self_attn.k_proj.weight:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]
ram used:  4.24 GB, cond_stage_model.transformer.text_model.encoder.layers.11.self_attn.k_proj.bias:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]  
ram used:  4.24 GB, cond_stage_model.transformer.text_model.encoder.layers.11.self_attn.v_proj.weight:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]
ram used:  4.24 GB, cond_stage_model.transformer.text_model.encoder.layers.11.self_attn.v_proj.bias:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]  
ram used:  4.24 GB, cond_stage_model.transformer.text_model.encoder.layers.11.self_attn.q_proj.weight:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]
ram used:  4.24 GB, cond_stage_model.transformer.text_model.encoder.layers.11.self_attn.q_proj.bias:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]  
ram used:  4.24 GB, cond_stage_model.transformer.text_model.encoder.layers.11.self_attn.out_proj.weight:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]
ram used:  4.25 GB, cond_stage_model.transformer.text_model.encoder.layers.11.self_attn.out_proj.bias:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]  
ram used:  4.25 GB, cond_stage_model.transformer.text_model.encoder.layers.11.layer_norm1.weight:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]     
ram used:  4.25 GB, cond_stage_model.transformer.text_model.encoder.layers.11.layer_norm1.bias:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]  
ram used:  4.25 GB, cond_stage_model.transformer.text_model.encoder.layers.11.mlp.fc1.weight:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]  
ram used:  4.26 GB, cond_stage_model.transformer.text_model.encoder.layers.11.mlp.fc1.bias:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]  
ram used:  4.26 GB, cond_stage_model.transformer.text_model.encoder.layers.11.mlp.fc2.weight:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]
ram used:  4.26 GB, cond_stage_model.transformer.text_model.encoder.layers.11.mlp.fc2.bias:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]  
ram used:  4.26 GB, cond_stage_model.transformer.text_model.encoder.layers.11.layer_norm2.weight:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]
ram used:  4.26 GB, cond_stage_model.transformer.text_model.encoder.layers.11.layer_norm2.bias:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]  
ram used:  4.26 GB, cond_stage_model.transformer.text_model.final_layer_norm.weight:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]           
ram used:  4.26 GB, cond_stage_model.transformer.text_model.final_layer_norm.bias:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]  
ram used:  4.26 GB, cond_stage_model.transformer.text_model.final_layer_norm.bias: 100%|██████████| 1131/1131 [00:02<00:00, 440.76it/s]
loaded weights in 2571.99 ms, 4.26 GB loaded at 1.66 GB/s
got CLIP context (1, 77, 768)
got unconditional CLIP context (1, 77, 768)
running for [1, 201, 401, 601, 801] timesteps

  0%|          | 0/5 [00:00<?, ?it/s]
  4 801:   0%|          | 0/5 [00:00<?, ?it/s]
  4 801:  20%|██        | 1/5 [00:13<00:55, 13.99s/it]
  3 601:  20%|██        | 1/5 [00:13<00:55, 13.99s/it]
  3 601:  40%|████      | 2/5 [00:14<00:18,  6.01s/it]
  2 401:  40%|████      | 2/5 [00:14<00:18,  6.01s/it]
  1 201:  40%|████      | 2/5 [00:14<00:18,  6.01s/it]
  0   1:  40%|████      | 2/5 [00:14<00:18,  6.01s/it]
  0   1: 100%|██████████| 5/5 [00:14<00:00,  2.88s/it]
decode (1, 512, 64, 64)
decode (1, 512, 128, 128)
decode (1, 512, 256, 256)
decode (1, 256, 512, 512)
(512, 512, 3)
saving /tmp/rendered.png
Error: no "view" mailcap rules found for type "image/png"
/usr/bin/xdg-open: 882: www-browser: not found
/usr/bin/xdg-open: 882: links2: not found
/usr/bin/xdg-open: 882: elinks: not found
/usr/bin/xdg-open: 882: links: not found
/usr/bin/xdg-open: 882: lynx: not found
/usr/bin/xdg-open: 882: w3m: not found
xdg-open: no method available for opening '/tmp/tmphj455rbh.PNG'
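
A few figures in the Stable Diffusion output above can be cross-checked with simple arithmetic. The sketch below is not taken from the example script. It only re-derives the reported load throughput from the logged time and size, and shows one way the five logged timesteps could arise. The 1000-step schedule length and the variable names are assumptions made for illustration.

```python
# Figures copied from the "loaded weights" line of the log above.
load_ms = 2571.99
loaded_gb = 4.26

# Throughput is size divided by elapsed time in seconds.
throughput = loaded_gb / (load_ms / 1000.0)
print(f"{throughput:.2f} GB/s")

# Assumption: 5 evenly spaced steps over a 1000-step schedule.
# The schedule length is not stated in the log.
steps = 5
timesteps = list(range(1, 1000, 1000 // steps))
print(timesteps)

# The progress bar labels count down, so the timesteps are visited in reverse.
for index, t in reversed(list(enumerate(timesteps))):
    print(index, t)
```

Under these assumptions, the computed throughput rounds to the logged 1.66 GB/s, and the index and timestep pairs follow the same order as the progress bar labels (4 801 down to 0 1).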

train_efficientnet.py

parameter count 296
training with batch size 16 for 2048 steps

  0%|          | 0/2048 [00:00<?, ?it/s]
loss 2.68 accuracy 0.00 -- 60.07 + 59.54 + 31476.56 + 219.46 = 31815.64:   0%|          | 0/2048 [00:31<?, ?it/s]
loss 2.68 accuracy 0.00 -- 60.07 + 59.54 + 31476.56 + 219.46 = 31815.64:   0%|          | 1/2048 [00:31<18:06:37, 31.85s/it]
loss 2.68 accuracy 0.06 -- 54.64 + 153.94 + 14521.00 + 5.87 = 14735.45:   0%|          | 1/2048 [00:46<18:06:37, 31.85s/it] 
loss 2.68 accuracy 0.06 -- 54.64 + 153.94 + 14521.00 + 5.87 = 14735.45:   0%|          | 2/2048 [00:46<12:23:21, 21.80s/it]
loss 2.92 accuracy 0.12 -- 55.55 + 56.57 + 620.43 + 4.97 = 737.52:   0%|          | 2/2048 [00:47<12:23:21, 21.80s/it]     
loss 2.92 accuracy 0.12 -- 55.55 + 56.57 + 620.43 + 4.97 = 737.52:   0%|          | 3/2048 [00:47<6:55:39, 12.20s/it] 
loss 3.24 accuracy 0.12 -- 56.70 + 56.73 + 501.23 + 5.02 = 619.68:   0%|          | 3/2048 [00:48<6:55:39, 12.20s/it]
loss 3.24 accuracy 0.12 -- 56.70 + 56.73 + 501.23 + 5.02 = 619.68:   0%|          | 4/2048 [00:48<4:20:08,  7.64s/it]
loss 2.99 accuracy 0.19 -- 166.41 + 57.14 + 505.88 + 4.99 = 734.43:   0%|          | 4/2048 [00:48<4:20:08,  7.64s/it]
loss 2.99 accuracy 0.19 -- 166.41 + 57.14 + 505.88 + 4.99 = 734.43:   0%|          | 5/2048 [00:48<2:55:36,  5.16s/it]
loss 3.15 accuracy 0.12 -- 57.06 + 57.74 + 633.05 + 5.03 = 752.89:   0%|          | 5/2048 [00:49<2:55:36,  5.16s/it] 
loss 3.15 accuracy 0.12 -- 57.06 + 57.74 + 633.05 + 5.03 = 752.89:   0%|          | 6/2048 [00:49<2:04:52,  3.67s/it]
loss 3.58 accuracy 0.19 -- 57.63 + 57.02 + 513.18 + 5.00 = 632.83:   0%|          | 6/2048 [00:50<2:04:52,  3.67s/it]
loss 3.58 accuracy 0.19 -- 57.63 + 57.02 + 513.18 + 5.00 = 632.83:   0%|          | 7/2048 [00:50<1:31:21,  2.69s/it]
loss 3.65 accuracy 0.12 -- 56.84 + 57.49 + 629.53 + 5.02 = 748.88:   0%|          | 7/2048 [00:51<1:31:21,  2.69s/it]
loss 3.65 accuracy 0.12 -- 56.84 + 57.49 + 629.53 + 5.02 = 748.88:   0%|          | 8/2048 [00:51<1:10:38,  2.08s/it]
loss 3.52 accuracy 0.12 -- 57.34 + 57.04 + 511.09 + 4.98 = 630.45:   0%|          | 8/2048 [00:51<1:10:38,  2.08s/it]
loss 3.52 accuracy 0.12 -- 57.34 + 57.04 + 511.09 + 4.98 = 630.45:   0%|          | 9/2048 [00:51<55:31,  1.63s/it]  
loss 3.62 accuracy 0.12 -- 56.32 + 170.81 + 509.88 + 4.98 = 741.99:   0%|          | 9/2048 [00:52<55:31,  1.63s/it]
loss 3.62 accuracy 0.12 -- 56.32 + 170.81 + 509.88 + 4.98 = 741.99:   0%|          | 10/2048 [00:52<46:48,  1.38s/it]
loss 3.23 accuracy 0.06 -- 56.67 + 56.50 + 509.40 + 4.98 = 627.56:   0%|          | 10/2048 [00:53<46:48,  1.38s/it] 
loss 3.23 accuracy 0.06 -- 56.67 + 56.50 + 509.40 + 4.98 = 627.56:   1%|          | 11/2048 [00:53<39:16,  1.16s/it]
loss 3.66 accuracy 0.00 -- 57.39 + 57.07 + 506.83 + 5.00 = 626.28:   1%|          | 11/2048 [00:53<39:16,  1.16s/it]
loss 3.66 accuracy 0.00 -- 57.39 + 57.07 + 506.83 + 5.00 = 626.28:   1%|          | 12/2048 [00:53<35:12,  1.04s/it]
loss 3.66 accuracy 0.06 -- 56.38 + 57.46 + 501.93 + 4.92 = 620.70:   1%|          | 12/2048 [00:54<35:12,  1.04s/it]
loss 3.66 accuracy 0.06 -- 56.38 + 57.46 + 501.93 + 4.92 = 620.70:   1%|          | 13/2048 [00:54<31:11,  1.09it/s]
loss 2.83 accuracy 0.19 -- 161.58 + 56.90 + 497.35 + 4.97 = 720.81:   1%|          | 13/2048 [00:55<31:11,  1.09it/s]
loss 2.83 accuracy 0.19 -- 161.58 + 56.90 + 497.35 + 4.97 = 720.81:   1%|          | 14/2048 [00:55<29:25,  1.15it/s]
loss 2.59 accuracy 0.25 -- 56.46 + 171.02 + 510.33 + 5.01 = 742.81:   1%|          | 14/2048 [00:56<29:25,  1.15it/s]
loss 2.59 accuracy 0.25 -- 56.46 + 171.02 + 510.33 + 5.01 = 742.81:   1%|          | 15/2048 [00:56<28:24,  1.19it/s]
loss 4.20 accuracy 0.00 -- 57.55 + 57.12 + 506.81 + 4.95 = 626.43:   1%|          | 15/2048 [00:56<28:24,  1.19it/s] 
loss 4.20 accuracy 0.00 -- 57.55 + 57.12 + 506.81 + 4.95 = 626.43:   1%|          | 16/2048 [00:56<26:31,  1.28it/s]
loss 3.20 accuracy 0.25 -- 166.51 + 57.89 + 506.22 + 4.98 = 735.60:   1%|          | 16/2048 [00:57<26:31,  1.28it/s]
loss 3.20 accuracy 0.25 -- 166.51 + 57.89 + 506.22 + 4.98 = 735.60:   1%|          | 17/2048 [00:57<26:40,  1.27it/s]
loss 2.69 accuracy 0.19 -- 56.37 + 57.45 + 632.64 + 4.96 = 751.41:   1%|          | 17/2048 [00:58<26:40,  1.27it/s] 
loss 2.69 accuracy 0.19 -- 56.37 + 57.45 + 632.64 + 4.96 = 751.41:   1%|          | 18/2048 [00:58<26:33,  1.27it/s]
loss 3.83 accuracy 0.12 -- 57.34 + 56.92 + 513.08 + 4.96 = 632.30:   1%|          | 18/2048 [00:58<26:33,  1.27it/s]
loss 3.83 accuracy 0.12 -- 57.34 + 56.92 + 513.08 + 4.96 = 632.30:   1%|          | 19/2048 [00:58<25:16,  1.34it/s]
loss 3.53 accuracy 0.00 -- 56.67 + 57.76 + 627.03 + 4.96 = 746.42:   1%|          | 19/2048 [00:59<25:16,  1.34it/s]
loss 3.53 accuracy 0.00 -- 56.67 + 57.76 + 627.03 + 4.96 = 746.42:   1%|          | 20/2048 [00:59<25:32,  1.32it/s]
loss 2.85 accuracy 0.12 -- 57.14 + 56.67 + 511.39 + 5.01 = 630.21:   1%|          | 20/2048 [01:00<25:32,  1.32it/s]
loss 2.85 accuracy 0.12 -- 57.14 + 56.67 + 511.39 + 5.01 = 630.21:   1%|          | 21/2048 [01:00<24:32,  1.38it/s]
loss 3.75 accuracy 0.00 -- 56.73 + 170.30 + 510.66 + 4.94 = 742.63:   1%|          | 21/2048 [01:01<24:32,  1.38it/s]
loss 3.75 accuracy 0.00 -- 56.73 + 170.30 + 510.66 + 4.94 = 742.63:   1%|          | 22/2048 [01:01<24:58,  1.35it/s]
loss 3.16 accuracy 0.00 -- 56.37 + 56.61 + 508.78 + 4.96 = 626.71:   1%|          | 22/2048 [01:01<24:58,  1.35it/s] 
loss 3.16 accuracy 0.00 -- 56.37 + 56.61 + 508.78 + 4.96 = 626.71:   1%|          | 23/2048 [01:01<24:06,  1.40it/s]
loss 2.89 accuracy 0.25 -- 57.06 + 56.84 + 506.63 + 4.96 = 625.48:   1%|          | 23/2048 [01:02<24:06,  1.40it/s]
loss 2.89 accuracy 0.25 -- 57.06 + 56.84 + 506.63 + 4.96 = 625.48:   1%|          | 24/2048 [01:02<24:35,  1.37it/s]
loss 3.16 accuracy 0.19 -- 56.42 + 57.74 + 510.59 + 4.93 = 629.68:   1%|          | 24/2048 [01:03<24:35,  1.37it/s]
loss 3.16 accuracy 0.19 -- 56.42 + 57.74 + 510.59 + 4.93 = 629.68:   1%|          | 25/2048 [01:03<23:51,  1.41it/s]
loss 3.20 accuracy 0.06 -- 161.59 + 57.25 + 497.86 + 4.92 = 721.62:   1%|          | 25/2048 [01:03<23:51,  1.41it/s]
loss 3.20 accuracy 0.06 -- 161.59 + 57.25 + 497.86 + 4.92 = 721.62:   1%|▏         | 26/2048 [01:03<24:16,  1.39it/s]
loss 3.72 accuracy 0.19 -- 56.32 + 171.12 + 509.18 + 4.96 = 741.58:   1%|▏         | 26/2048 [01:04<24:16,  1.39it/s]
loss 3.72 accuracy 0.19 -- 56.32 + 171.12 + 509.18 + 4.96 = 741.58:   1%|▏         | 27/2048 [01:04<24:45,  1.36it/s]
loss 2.21 accuracy 0.25 -- 57.25 + 56.99 + 505.71 + 4.99 = 624.95:   1%|▏         | 27/2048 [01:05<24:45,  1.36it/s] 
loss 2.21 accuracy 0.25 -- 57.25 + 56.99 + 505.71 + 4.99 = 624.95:   1%|▏         | 28/2048 [01:05<23:54,  1.41it/s]
loss 2.45 accuracy 0.12 -- 166.51 + 57.68 + 504.50 + 4.97 = 733.66:   1%|▏         | 28/2048 [01:06<23:54,  1.41it/s]
loss 2.45 accuracy 0.12 -- 166.51 + 57.68 + 504.50 + 4.97 = 733.66:   1%|▏         | 29/2048 [01:06<24:24,  1.38it/s]
loss 2.34 accuracy 0.31 -- 56.83 + 57.68 + 631.80 + 4.96 = 751.27:   1%|▏         | 29/2048 [01:06<24:24,  1.38it/s] 
loss 2.34 accuracy 0.31 -- 56.83 + 57.68 + 631.80 + 4.96 = 751.27:   1%|▏         | 30/2048 [01:06<24:56,  1.35it/s]
loss 3.25 accuracy 0.12 -- 57.14 + 56.79 + 512.39 + 5.04 = 631.36:   1%|▏         | 30/2048 [01:07<24:56,  1.35it/s]
loss 3.25 accuracy 0.12 -- 57.14 + 56.79 + 512.39 + 5.04 = 631.36:   2%|▏         | 31/2048 [01:07<24:06,  1.39it/s]
loss 2.57 accuracy 0.25 -- 56.36 + 57.58 + 627.26 + 5.01 = 746.21:   2%|▏         | 31/2048 [01:08<24:06,  1.39it/s]
loss 2.57 accuracy 0.25 -- 56.36 + 57.58 + 627.26 + 5.01 = 746.21:   2%|▏         | 32/2048 [01:08<24:39,  1.36it/s]
loss 3.08 accuracy 0.06 -- 57.02 + 56.94 + 512.38 + 4.95 = 631.28:   2%|▏         | 32/2048 [01:09<24:39,  1.36it/s]
loss 3.08 accuracy 0.06 -- 57.02 + 56.94 + 512.38 + 4.95 = 631.28:   2%|▏         | 33/2048 [01:09<23:53,  1.41it/s]
loss 2.50 accuracy 0.19 -- 56.46 + 170.47 + 510.74 + 4.92 = 742.59:   2%|▏         | 33/2048 [01:09<23:53,  1.41it/s]
loss 2.50 accuracy 0.19 -- 56.46 + 170.47 + 510.74 + 4.92 = 742.59:   2%|▏         | 34/2048 [01:09<24:28,  1.37it/s]
loss 2.49 accuracy 0.25 -- 56.65 + 56.58 + 509.35 + 4.95 = 627.53:   2%|▏         | 34/2048 [01:10<24:28,  1.37it/s] 
loss 2.49 accuracy 0.25 -- 56.65 + 56.58 + 509.35 + 4.95 = 627.53:   2%|▏         | 35/2048 [01:10<23:42,  1.41it/s]
loss 2.66 accuracy 0.19 -- 57.15 + 56.66 + 506.31 + 4.96 = 625.09:   2%|▏         | 35/2048 [01:11<23:42,  1.41it/s]
loss 2.66 accuracy 0.19 -- 57.15 + 56.66 + 506.31 + 4.96 = 625.09:   2%|▏         | 36/2048 [01:11<24:16,  1.38it/s]
loss 2.45 accuracy 0.06 -- 56.72 + 57.53 + 503.47 + 4.93 = 622.65:   2%|▏         | 36/2048 [01:11<24:16,  1.38it/s]
loss 2.45 accuracy 0.06 -- 56.72 + 57.53 + 503.47 + 4.93 = 622.65:   2%|▏         | 37/2048 [01:11<23:31,  1.42it/s]
loss 2.52 accuracy 0.12 -- 161.43 + 57.13 + 498.34 + 4.95 = 721.86:   2%|▏         | 37/2048 [01:12<23:31,  1.42it/s]
loss 2.52 accuracy 0.12 -- 161.43 + 57.13 + 498.34 + 4.95 = 721.86:   2%|▏         | 38/2048 [01:12<23:59,  1.40it/s]
loss 2.90 accuracy 0.19 -- 56.60 + 172.00 + 510.17 + 4.94 = 743.70:   2%|▏         | 38/2048 [01:13<23:59,  1.40it/s]
loss 2.90 accuracy 0.19 -- 56.60 + 172.00 + 510.17 + 4.94 = 743.70:   2%|▏         | 39/2048 [01:13<24:32,  1.36it/s]
loss 3.13 accuracy 0.12 -- 57.61 + 56.68 + 507.47 + 4.94 = 626.71:   2%|▏         | 39/2048 [01:14<24:32,  1.36it/s] 
loss 3.13 accuracy 0.12 -- 57.61 + 56.68 + 507.47 + 4.94 = 626.71:   2%|▏         | 40/2048 [01:14<23:44,  1.41it/s]
loss 2.63 accuracy 0.12 -- 166.51 + 57.57 + 504.92 + 4.96 = 733.95:   2%|▏         | 40/2048 [01:14<23:44,  1.41it/s]
loss 2.63 accuracy 0.12 -- 166.51 + 57.57 + 504.92 + 4.96 = 733.95:   2%|▏         | 41/2048 [01:14<24:14,  1.38it/s]
loss 3.17 accuracy 0.06 -- 56.51 + 57.76 + 632.84 + 4.97 = 752.08:   2%|▏         | 41/2048 [01:15<24:14,  1.38it/s] 
loss 3.17 accuracy 0.06 -- 56.51 + 57.76 + 632.84 + 4.97 = 752.08:   2%|▏         | 42/2048 [01:15<24:46,  1.35it/s]
loss 2.67 accuracy 0.19 -- 57.36 + 57.28 + 515.07 + 4.93 = 634.63:   2%|▏         | 42/2048 [01:16<24:46,  1.35it/s]
loss 2.67 accuracy 0.19 -- 57.36 + 57.28 + 515.07 + 4.93 = 634.63:   2%|▏         | 43/2048 [01:16<23:58,  1.39it/s]
loss 2.22 accuracy 0.25 -- 57.00 + 57.66 + 626.79 + 4.93 = 746.37:   2%|▏         | 43/2048 [01:17<23:58,  1.39it/s]
loss 2.22 accuracy 0.25 -- 57.00 + 57.66 + 626.79 + 4.93 = 746.37:   2%|▏         | 44/2048 [01:17<24:31,  1.36it/s]
loss 2.57 accuracy 0.19 -- 57.19 + 56.80 + 509.33 + 4.97 = 628.30:   2%|▏         | 44/2048 [01:17<24:31,  1.36it/s]
loss 2.57 accuracy 0.19 -- 57.19 + 56.80 + 509.33 + 4.97 = 628.30:   2%|▏         | 45/2048 [01:17<23:43,  1.41it/s]
loss 2.96 accuracy 0.12 -- 56.87 + 171.29 + 512.56 + 4.99 = 745.72:   2%|▏         | 45/2048 [01:18<23:43,  1.41it/s]
loss 2.96 accuracy 0.12 -- 56.87 + 171.29 + 512.56 + 4.99 = 745.72:   2%|▏         | 46/2048 [01:18<24:20,  1.37it/s]
loss 2.29 accuracy 0.19 -- 56.80 + 56.98 + 508.70 + 4.93 = 627.41:   2%|▏         | 46/2048 [01:19<24:20,  1.37it/s] 
loss 2.29 accuracy 0.19 -- 56.80 + 56.98 + 508.70 + 4.93 = 627.41:   2%|▏         | 47/2048 [01:19<23:35,  1.41it/s]
loss 2.38 accuracy 0.31 -- 56.89 + 56.38 + 505.75 + 4.96 = 623.98:   2%|▏         | 47/2048 [01:19<23:35,  1.41it/s]
loss 2.38 accuracy 0.31 -- 56.89 + 56.38 + 505.75 + 4.96 = 623.98:   2%|▏         | 48/2048 [01:19<24:07,  1.38it/s]
loss 2.82 accuracy 0.12 -- 56.45 + 57.53 + 503.36 + 4.99 = 622.34:   2%|▏         | 48/2048 [01:20<24:07,  1.38it/s]
loss 2.82 accuracy 0.12 -- 56.45 + 57.53 + 503.36 + 4.99 = 622.34:   2%|▏         | 49/2048 [01:20<23:22,  1.43it/s]
loss 2.84 accuracy 0.06 -- 161.31 + 57.22 + 498.24 + 4.95 = 721.72:   2%|▏         | 49/2048 [01:21<23:22,  1.43it/s]
loss 2.84 accuracy 0.06 -- 161.31 + 57.22 + 498.24 + 4.95 = 721.72:   2%|▏         | 50/2048 [01:21<23:50,  1.40it/s]
loss 2.82 accuracy 0.25 -- 56.49 + 170.20 + 509.57 + 4.94 = 741.20:   2%|▏         | 50/2048 [01:22<23:50,  1.40it/s]
loss 2.82 accuracy 0.25 -- 56.49 + 170.20 + 509.57 + 4.94 = 741.20:   2%|▏         | 51/2048 [01:22<24:21,  1.37it/s]
loss 2.55 accuracy 0.06 -- 57.40 + 56.92 + 505.32 + 4.97 = 624.61:   2%|▏         | 51/2048 [01:22<24:21,  1.37it/s] 
loss 2.55 accuracy 0.06 -- 57.40 + 56.92 + 505.32 + 4.97 = 624.61:   3%|▎         | 52/2048 [01:22<23:33,  1.41it/s]
loss 2.79 accuracy 0.06 -- 166.14 + 57.77 + 505.58 + 4.95 = 734.43:   3%|▎         | 52/2048 [01:23<23:33,  1.41it/s]
loss 2.79 accuracy 0.06 -- 166.14 + 57.77 + 505.58 + 4.95 = 734.43:   3%|▎         | 53/2048 [01:23<24:04,  1.38it/s]
loss 3.23 accuracy 0.00 -- 56.58 + 57.94 + 635.67 + 4.95 = 755.14:   3%|▎         | 53/2048 [01:24<24:04,  1.38it/s] 
loss 3.23 accuracy 0.00 -- 56.58 + 57.94 + 635.67 + 4.95 = 755.14:   3%|▎         | 54/2048 [01:24<24:39,  1.35it/s]
loss 2.25 accuracy 0.25 -- 57.36 + 56.66 + 514.99 + 4.96 = 633.97:   3%|▎         | 54/2048 [01:24<24:39,  1.35it/s]
loss 2.25 accuracy 0.25 -- 57.36 + 56.66 + 514.99 + 4.96 = 633.97:   3%|▎         | 55/2048 [01:24<23:50,  1.39it/s]
loss 2.72 accuracy 0.00 -- 56.95 + 57.45 + 626.68 + 4.96 = 746.04:   3%|▎         | 55/2048 [01:25<23:50,  1.39it/s]
loss 2.72 accuracy 0.00 -- 56.95 + 57.45 + 626.68 + 4.96 = 746.04:   3%|▎         | 56/2048 [01:25<24:23,  1.36it/s]
loss 2.89 accuracy 0.19 -- 57.37 + 56.63 + 510.67 + 4.92 = 629.59:   3%|▎         | 56/2048 [01:26<24:23,  1.36it/s]
loss 2.89 accuracy 0.19 -- 57.37 + 56.63 + 510.67 + 4.92 = 629.59:   3%|▎         | 57/2048 [01:26<23:36,  1.41it/s]
loss 2.52 accuracy 0.19 -- 56.62 + 169.89 + 510.45 + 4.96 = 741.92:   3%|▎         | 57/2048 [01:27<23:36,  1.41it/s]
loss 2.52 accuracy 0.19 -- 56.62 + 169.89 + 510.45 + 4.96 = 741.92:   3%|▎         | 58/2048 [01:27<24:10,  1.37it/s]
loss 2.18 accuracy 0.19 -- 56.57 + 56.86 + 508.15 + 4.95 = 626.53:   3%|▎         | 58/2048 [01:27<24:10,  1.37it/s] 
loss 2.18 accuracy 0.19 -- 56.57 + 56.86 + 508.15 + 4.95 = 626.53:   3%|▎         | 59/2048 [01:27<23:25,  1.42it/s]
loss 2.77 accuracy 0.19 -- 57.27 + 56.81 + 505.29 + 4.92 = 624.28:   3%|▎         | 59/2048 [01:28<23:25,  1.42it/s]
loss 2.77 accuracy 0.19 -- 57.27 + 56.81 + 505.29 + 4.92 = 624.28:   3%|▎         | 60/2048 [01:28<23:57,  1.38it/s]
loss 2.40 accuracy 0.06 -- 56.41 + 57.76 + 504.98 + 4.95 = 624.09:   3%|▎         | 60/2048 [01:29<23:57,  1.38it/s]
loss 2.40 accuracy 0.06 -- 56.41 + 57.76 + 504.98 + 4.95 = 624.09:   3%|▎         | 61/2048 [01:29<23:14,  1.42it/s]
loss 2.54 accuracy 0.25 -- 161.78 + 57.10 + 497.81 + 4.95 = 721.64:   3%|▎         | 61/2048 [01:29<23:14,  1.42it/s]
loss 2.54 accuracy 0.25 -- 161.78 + 57.10 + 497.81 + 4.95 = 721.64:   3%|▎         | 62/2048 [01:29<23:42,  1.40it/s]
loss 1.99 accuracy 0.31 -- 56.19 + 169.55 + 508.93 + 4.96 = 739.63:   3%|▎         | 62/2048 [01:30<23:42,  1.40it/s]
loss 1.99 accuracy 0.31 -- 56.19 + 169.55 + 508.93 + 4.96 = 739.63:   3%|▎         | 63/2048 [01:30<24:12,  1.37it/s]
loss 2.31 accuracy 0.19 -- 57.19 + 56.49 + 504.98 + 4.96 = 623.62:   3%|▎         | 63/2048 [01:31<24:12,  1.37it/s] 
loss 2.31 accuracy 0.19 -- 57.19 + 56.49 + 504.98 + 4.96 = 623.62:   3%|▎         | 64/2048 [01:31<23:24,  1.41it/s]
loss 2.24 accuracy 0.25 -- 165.95 + 57.86 + 503.87 + 4.96 = 732.64:   3%|▎         | 64/2048 [01:32<23:24,  1.41it/s]
loss 2.24 accuracy 0.25 -- 165.95 + 57.86 + 503.87 + 4.96 = 732.64:   3%|▎         | 65/2048 [01:32<23:54,  1.38it/s]
loss 2.30 accuracy 0.06 -- 56.39 + 57.30 + 634.27 + 4.99 = 752.95:   3%|▎         | 65/2048 [01:32<23:54,  1.38it/s] 
loss 2.30 accuracy 0.06 -- 56.39 + 57.30 + 634.27 + 4.99 = 752.95:   3%|▎         | 66/2048 [01:32<24:28,  1.35it/s]
loss 2.42 accuracy 0.12 -- 57.70 + 57.06 + 512.80 + 4.92 = 632.48:   3%|▎         | 66/2048 [01:33<24:28,  1.35it/s]
loss 2.42 accuracy 0.12 -- 57.70 + 57.06 + 512.80 + 4.92 = 632.48:   3%|▎         | 67/2048 [01:33<23:39,  1.40it/s]
loss 2.42 accuracy 0.19 -- 56.63 + 57.77 + 627.63 + 4.95 = 746.98:   3%|▎         | 67/2048 [01:34<23:39,  1.40it/s]
loss 2.42 accuracy 0.19 -- 56.63 + 57.77 + 627.63 + 4.95 = 746.98:   3%|▎         | 68/2048 [01:34<24:13,  1.36it/s]
loss 2.10 accuracy 0.19 -- 57.41 + 57.06 + 510.26 + 4.95 = 629.68:   3%|▎         | 68/2048 [01:34<24:13,  1.36it/s]
loss 2.10 accuracy 0.19 -- 57.41 + 57.06 + 510.26 + 4.95 = 629.68:   3%|▎         | 69/2048 [01:34<23:27,  1.41it/s]
loss 2.33 accuracy 0.25 -- 56.52 + 169.98 + 509.45 + 4.97 = 740.92:   3%|▎         | 69/2048 [01:35<23:27,  1.41it/s]
loss 2.33 accuracy 0.25 -- 56.52 + 169.98 + 509.45 + 4.97 = 740.92:   3%|▎         | 70/2048 [01:35<24:21,  1.35it/s]
loss 2.46 accuracy 0.25 -- 56.48 + 56.69 + 507.07 + 4.96 = 625.19:   3%|▎         | 70/2048 [01:36<24:21,  1.35it/s] 
loss 2.46 accuracy 0.25 -- 56.48 + 56.69 + 507.07 + 4.96 = 625.19:   3%|▎         | 71/2048 [01:36<23:30,  1.40it/s]
loss 2.55 accuracy 0.25 -- 57.14 + 57.07 + 506.64 + 4.92 = 625.77:   3%|▎         | 71/2048 [01:37<23:30,  1.40it/s]
loss 2.55 accuracy 0.25 -- 57.14 + 57.07 + 506.64 + 4.92 = 625.77:   4%|▎         | 72/2048 [01:37<23:59,  1.37it/s]
loss 2.54 accuracy 0.19 -- 56.46 + 57.77 + 502.40 + 4.94 = 621.57:   4%|▎         | 72/2048 [01:37<23:59,  1.37it/s]
loss 2.54 accuracy 0.19 -- 56.46 + 57.77 + 502.40 + 4.94 = 621.57:   4%|▎         | 73/2048 [01:37<23:11,  1.42it/s]
loss 2.63 accuracy 0.06 -- 161.08 + 57.27 + 498.39 + 4.91 = 721.65:   4%|▎         | 73/2048 [01:38<23:11,  1.42it/s]
loss 2.63 accuracy 0.06 -- 161.08 + 57.27 + 498.39 + 4.91 = 721.65:   4%|▎         | 74/2048 [01:38<23:37,  1.39it/s]
loss 2.35 accuracy 0.06 -- 56.17 + 170.50 + 508.45 + 4.92 = 740.04:   4%|▎         | 74/2048 [01:39<23:37,  1.39it/s]
loss 2.35 accuracy 0.06 -- 56.17 + 170.50 + 508.45 + 4.92 = 740.04:   4%|▎         | 75/2048 [01:39<24:06,  1.36it/s]
loss 2.07 accuracy 0.19 -- 56.85 + 56.61 + 504.88 + 4.88 = 623.21:   4%|▎         | 75/2048 [01:40<24:06,  1.36it/s] 
loss 2.07 accuracy 0.19 -- 56.85 + 56.61 + 504.88 + 4.88 = 623.21:   4%|▎         | 76/2048 [01:40<23:17,  1.41it/s]
loss 2.45 accuracy 0.12 -- 299.43 + 102.71 + 563.90 + 5.17 = 971.21:   4%|▎         | 76/2048 [01:41<23:17,  1.41it/s]
loss 2.45 accuracy 0.12 -- 299.43 + 102.71 + 563.90 + 5.17 = 971.21:   4%|▍         | 77/2048 [01:41<26:09,  1.26it/s]
loss 2.24 accuracy 0.12 -- 60.17 + 60.60 + 666.19 + 5.00 = 791.96:   4%|▍         | 77/2048 [01:41<26:09,  1.26it/s]  
loss 2.24 accuracy 0.12 -- 60.17 + 60.60 + 666.19 + 5.00 = 791.96:   4%|▍         | 78/2048 [01:41<26:24,  1.24it/s]
loss 2.33 accuracy 0.12 -- 58.22 + 58.99 + 533.25 + 5.00 = 655.47:   4%|▍         | 78/2048 [01:42<26:24,  1.24it/s]
loss 2.33 accuracy 0.12 -- 58.22 + 58.99 + 533.25 + 5.00 = 655.47:   4%|▍         | 79/2048 [01:42<25:12,  1.30it/s]
loss 2.39 accuracy 0.19 -- 58.16 + 59.63 + 657.85 + 5.02 = 780.67:   4%|▍         | 79/2048 [01:43<25:12,  1.30it/s]
loss 2.39 accuracy 0.19 -- 58.16 + 59.63 + 657.85 + 5.02 = 780.67:   4%|▍         | 80/2048 [01:43<25:36,  1.28it/s]
loss 2.35 accuracy 0.19 -- 57.73 + 56.80 + 510.00 + 4.88 = 629.41:   4%|▍         | 80/2048 [01:43<25:36,  1.28it/s]
loss 2.35 accuracy 0.19 -- 57.73 + 56.80 + 510.00 + 4.88 = 629.41:   4%|▍         | 81/2048 [01:43<24:22,  1.34it/s]
loss 2.18 accuracy 0.25 -- 56.52 + 171.05 + 508.90 + 4.89 = 741.36:   4%|▍         | 81/2048 [01:44<24:22,  1.34it/s]
loss 2.18 accuracy 0.25 -- 56.52 + 171.05 + 508.90 + 4.89 = 741.36:   4%|▍         | 82/2048 [01:44<24:37,  1.33it/s]
loss 2.37 accuracy 0.25 -- 57.18 + 56.73 + 507.95 + 4.88 = 626.74:   4%|▍         | 82/2048 [01:45<24:37,  1.33it/s] 
loss 2.37 accuracy 0.25 -- 57.18 + 56.73 + 507.95 + 4.88 = 626.74:   4%|▍         | 83/2048 [01:45<23:39,  1.38it/s]
loss 2.51 accuracy 0.19 -- 56.79 + 56.61 + 506.11 + 4.88 = 624.39:   4%|▍         | 83/2048 [01:46<23:39,  1.38it/s]
loss 2.51 accuracy 0.19 -- 56.79 + 56.61 + 506.11 + 4.88 = 624.39:   4%|▍         | 84/2048 [01:46<24:03,  1.36it/s]
loss 2.49 accuracy 0.19 -- 56.30 + 57.36 + 503.51 + 4.90 = 622.07:   4%|▍         | 84/2048 [01:46<24:03,  1.36it/s]
loss 2.49 accuracy 0.19 -- 56.30 + 57.36 + 503.51 + 4.90 = 622.07:   4%|▍         | 85/2048 [01:46<23:12,  1.41it/s]
loss 2.18 accuracy 0.19 -- 163.99 + 56.96 + 498.16 + 4.87 = 723.98:   4%|▍         | 85/2048 [01:47<23:12,  1.41it/s]
loss 2.18 accuracy 0.19 -- 163.99 + 56.96 + 498.16 + 4.87 = 723.98:   4%|▍         | 86/2048 [01:47<23:36,  1.38it/s]
loss 2.32 accuracy 0.06 -- 56.21 + 171.62 + 510.57 + 4.94 = 743.33:   4%|▍         | 86/2048 [01:48<23:36,  1.38it/s]
loss 2.32 accuracy 0.06 -- 56.21 + 171.62 + 510.57 + 4.94 = 743.33:   4%|▍         | 87/2048 [01:48<24:05,  1.36it/s]
loss 2.29 accuracy 0.12 -- 56.90 + 56.52 + 505.62 + 4.92 = 623.96:   4%|▍         | 87/2048 [01:48<24:05,  1.36it/s] 
loss 2.29 accuracy 0.12 -- 56.90 + 56.52 + 505.62 + 4.92 = 623.96:   4%|▍         | 88/2048 [01:48<23:14,  1.41it/s]
loss 2.05 accuracy 0.38 -- 168.24 + 57.38 + 504.51 + 4.89 = 735.02:   4%|▍         | 88/2048 [01:49<23:14,  1.41it/s]
loss 2.05 accuracy 0.38 -- 168.24 + 57.38 + 504.51 + 4.89 = 735.02:   4%|▍         | 89/2048 [01:49<23:43,  1.38it/s]
loss 2.09 accuracy 0.06 -- 56.53 + 57.52 + 634.54 + 4.89 = 753.47:   4%|▍         | 89/2048 [01:50<23:43,  1.38it/s] 
loss 2.09 accuracy 0.06 -- 56.53 + 57.52 + 634.54 + 4.89 = 753.47:   4%|▍         | 90/2048 [01:50<24:14,  1.35it/s]
loss 2.06 accuracy 0.19 -- 57.51 + 57.05 + 514.55 + 4.89 = 634.00:   4%|▍         | 90/2048 [01:51<24:14,  1.35it/s]
loss 2.06 accuracy 0.19 -- 57.51 + 57.05 + 514.55 + 4.89 = 634.00:   4%|▍         | 91/2048 [01:51<23:26,  1.39it/s]
loss 2.11 accuracy 0.25 -- 56.34 + 57.30 + 630.04 + 4.89 = 748.57:   4%|▍         | 91/2048 [01:51<23:26,  1.39it/s]
loss 2.11 accuracy 0.25 -- 56.34 + 57.30 + 630.04 + 4.89 = 748.57:   4%|▍         | 92/2048 [01:51<23:59,  1.36it/s]
loss 2.39 accuracy 0.06 -- 57.09 + 56.58 + 512.42 + 4.89 = 630.98:   4%|▍         | 92/2048 [01:52<23:59,  1.36it/s]
loss 2.39 accuracy 0.06 -- 57.09 + 56.58 + 512.42 + 4.89 = 630.98:   5%|▍         | 93/2048 [01:52<23:13,  1.40it/s]
loss 2.29 accuracy 0.00 -- 56.31 + 173.65 + 511.67 + 4.89 = 746.52:   5%|▍         | 93/2048 [01:53<23:13,  1.40it/s]
loss 2.29 accuracy 0.00 -- 56.31 + 173.65 + 511.67 + 4.89 = 746.52:   5%|▍         | 94/2048 [01:53<23:48,  1.37it/s]
loss 2.99 accuracy 0.06 -- 56.38 + 56.44 + 509.84 + 4.89 = 627.55:   5%|▍         | 94/2048 [01:54<23:48,  1.37it/s] 
loss 2.99 accuracy 0.06 -- 56.38 + 56.44 + 509.84 + 4.89 = 627.55:   5%|▍         | 95/2048 [01:54<23:03,  1.41it/s]
loss 2.44 accuracy 0.00 -- 56.81 + 56.63 + 506.91 + 4.90 = 625.25:   5%|▍         | 95/2048 [01:54<23:03,  1.41it/s]
loss 2.44 accuracy 0.00 -- 56.81 + 56.63 + 506.91 + 4.90 = 625.25:   5%|▍         | 96/2048 [01:54<23:36,  1.38it/s]
loss 2.11 accuracy 0.25 -- 56.16 + 57.61 + 503.34 + 4.91 = 622.02:   5%|▍         | 96/2048 [01:55<23:36,  1.38it/s]
loss 2.11 accuracy 0.25 -- 56.16 + 57.61 + 503.34 + 4.91 = 622.02:   5%|▍         | 97/2048 [01:55<22:50,  1.42it/s]
loss 2.00 accuracy 0.19 -- 164.18 + 57.11 + 497.57 + 4.91 = 723.77:   5%|▍         | 97/2048 [01:56<22:50,  1.42it/s]
loss 2.00 accuracy 0.19 -- 164.18 + 57.11 + 497.57 + 4.91 = 723.77:   5%|▍         | 98/2048 [01:56<23:18,  1.39it/s]
loss 2.82 accuracy 0.25 -- 56.15 + 171.95 + 509.83 + 4.89 = 742.82:   5%|▍         | 98/2048 [01:57<23:18,  1.39it/s]
loss 2.82 accuracy 0.25 -- 56.15 + 171.95 + 509.83 + 4.89 = 742.82:   5%|▍         | 99/2048 [01:57<23:49,  1.36it/s]
loss 2.33 accuracy 0.19 -- 57.16 + 56.40 + 505.48 + 4.91 = 623.96:   5%|▍         | 99/2048 [01:57<23:49,  1.36it/s] 
loss 2.33 accuracy 0.19 -- 57.16 + 56.40 + 505.48 + 4.91 = 623.96:   5%|▍         | 100/2048 [01:57<23:01,  1.41it/s]
loss 2.19 accuracy 0.25 -- 168.88 + 57.41 + 505.31 + 4.92 = 736.52:   5%|▍         | 100/2048 [01:58<23:01,  1.41it/s]
loss 2.19 accuracy 0.25 -- 168.88 + 57.41 + 505.31 + 4.92 = 736.52:   5%|▍         | 101/2048 [01:58<23:32,  1.38it/s]
loss 2.25 accuracy 0.12 -- 56.40 + 58.54 + 636.19 + 4.92 = 756.04:   5%|▍         | 101/2048 [01:59<23:32,  1.38it/s] 
loss 2.25 accuracy 0.12 -- 56.40 + 58.54 + 636.19 + 4.92 = 756.04:   5%|▍         | 102/2048 [01:59<24:06,  1.35it/s]
loss 2.18 accuracy 0.06 -- 57.06 + 56.66 + 513.96 + 4.90 = 632.57:   5%|▍         | 102/2048 [01:59<24:06,  1.35it/s]
loss 2.18 accuracy 0.06 -- 57.06 + 56.66 + 513.96 + 4.90 = 632.57:   5%|▌         | 103/2048 [01:59<23:17,  1.39it/s]
loss 2.83 accuracy 0.00 -- 56.38 + 57.24 + 630.24 + 4.92 = 748.78:   5%|▌         | 103/2048 [02:00<23:17,  1.39it/s]
loss 2.83 accuracy 0.00 -- 56.38 + 57.24 + 630.24 + 4.92 = 748.78:   5%|▌         | 104/2048 [02:00<23:50,  1.36it/s]
loss 2.13 accuracy 0.19 -- 57.32 + 56.92 + 510.92 + 4.91 = 630.07:   5%|▌         | 104/2048 [02:01<23:50,  1.36it/s]
loss 2.13 accuracy 0.19 -- 57.32 + 56.92 + 510.92 + 4.91 = 630.07:   5%|▌         | 105/2048 [02:01<23:04,  1.40it/s]
loss 2.14 accuracy 0.06 -- 56.20 + 173.14 + 510.19 + 4.90 = 744.43:   5%|▌         | 105/2048 [02:02<23:04,  1.40it/s]
loss 2.14 accuracy 0.06 -- 56.20 + 173.14 + 510.19 + 4.90 = 744.43:   5%|▌         | 106/2048 [02:02<23:38,  1.37it/s]
loss 2.32 accuracy 0.19 -- 56.71 + 56.60 + 508.37 + 4.94 = 626.62:   5%|▌         | 106/2048 [02:02<23:38,  1.37it/s] 
loss 2.32 accuracy 0.19 -- 56.71 + 56.60 + 508.37 + 4.94 = 626.62:   5%|▌         | 107/2048 [02:02<22:53,  1.41it/s]
loss 2.05 accuracy 0.25 -- 66.62 + 67.52 + 513.98 + 4.89 = 653.01:   5%|▌         | 107/2048 [02:03<22:53,  1.41it/s]
loss 2.05 accuracy 0.25 -- 66.62 + 67.52 + 513.98 + 4.89 = 653.01:   5%|▌         | 108/2048 [02:03<24:42,  1.31it/s]
loss 1.98 accuracy 0.25 -- 56.12 + 57.40 + 503.46 + 4.89 = 621.88:   5%|▌         | 108/2048 [02:04<24:42,  1.31it/s]
loss 1.98 accuracy 0.25 -- 56.12 + 57.40 + 503.46 + 4.89 = 621.88:   5%|▌         | 109/2048 [02:04<23:36,  1.37it/s]
loss 2.48 accuracy 0.12 -- 164.12 + 56.85 + 497.11 + 4.88 = 722.96:   5%|▌         | 109/2048 [02:05<23:36,  1.37it/s]
loss 2.48 accuracy 0.12 -- 164.12 + 56.85 + 497.11 + 4.88 = 722.96:   5%|▌         | 110/2048 [02:05<23:47,  1.36it/s]
loss 2.08 accuracy 0.25 -- 56.28 + 171.83 + 512.36 + 4.91 = 745.37:   5%|▌         | 110/2048 [02:05<23:47,  1.36it/s]
loss 2.08 accuracy 0.25 -- 56.28 + 171.83 + 512.36 + 4.91 = 745.37:   5%|▌         | 111/2048 [02:05<24:07,  1.34it/s]
loss 2.20 accuracy 0.19 -- 57.54 + 57.25 + 507.58 + 4.94 = 627.30:   5%|▌         | 111/2048 [02:06<24:07,  1.34it/s] 
loss 2.20 accuracy 0.19 -- 57.54 + 57.25 + 507.58 + 4.94 = 627.30:   5%|▌         | 112/2048 [02:06<23:13,  1.39it/s]
loss 2.21 accuracy 0.25 -- 168.50 + 57.37 + 506.37 + 4.89 = 737.13:   5%|▌         | 112/2048 [02:07<23:13,  1.39it/s]
loss 2.21 accuracy 0.25 -- 168.50 + 57.37 + 506.37 + 4.89 = 737.13:   6%|▌         | 113/2048 [02:07<23:38,  1.36it/s]
loss 2.33 accuracy 0.19 -- 56.80 + 57.45 + 637.17 + 4.97 = 756.39:   6%|▌         | 113/2048 [02:08<23:38,  1.36it/s] 
loss 2.33 accuracy 0.19 -- 56.80 + 57.45 + 637.17 + 4.97 = 756.39:   6%|▌         | 114/2048 [02:08<24:07,  1.34it/s]
loss 2.30 accuracy 0.06 -- 57.42 + 56.76 + 516.45 + 4.91 = 635.54:   6%|▌         | 114/2048 [02:08<24:07,  1.34it/s]
loss 2.30 accuracy 0.06 -- 57.42 + 56.76 + 516.45 + 4.91 = 635.54:   6%|▌         | 115/2048 [02:08<23:16,  1.38it/s]
loss 2.09 accuracy 0.25 -- 56.93 + 59.77 + 632.29 + 4.93 = 753.91:   6%|▌         | 115/2048 [02:09<23:16,  1.38it/s]
loss 2.09 accuracy 0.25 -- 56.93 + 59.77 + 632.29 + 4.93 = 753.91:   6%|▌         | 116/2048 [02:09<23:50,  1.35it/s]
loss 2.21 accuracy 0.31 -- 57.53 + 56.92 + 512.99 + 4.91 = 632.36:   6%|▌         | 116/2048 [02:10<23:50,  1.35it/s]
loss 2.21 accuracy 0.31 -- 57.53 + 56.92 + 512.99 + 4.91 = 632.36:   6%|▌         | 117/2048 [02:10<23:02,  1.40it/s]
loss 2.14 accuracy 0.25 -- 56.98 + 172.06 + 513.29 + 4.95 = 747.27:   6%|▌         | 117/2048 [02:10<23:02,  1.40it/s]
loss 2.14 accuracy 0.25 -- 56.98 + 172.06 + 513.29 + 4.95 = 747.27:   6%|▌         | 118/2048 [02:10<23:36,  1.36it/s]
loss 2.01 accuracy 0.31 -- 57.46 + 56.96 + 510.27 + 4.93 = 629.62:   6%|▌         | 118/2048 [02:11<23:36,  1.36it/s] 
loss 2.01 accuracy 0.31 -- 57.46 + 56.96 + 510.27 + 4.93 = 629.62:   6%|▌         | 119/2048 [02:11<23:11,  1.39it/s]
loss 2.31 accuracy 0.12 -- 57.30 + 56.82 + 508.06 + 4.92 = 627.10:   6%|▌         | 119/2048 [02:12<23:11,  1.39it/s]
loss 2.31 accuracy 0.12 -- 57.30 + 56.82 + 508.06 + 4.92 = 627.10:   6%|▌         | 120/2048 [02:12<23:37,  1.36it/s]
loss 2.35 accuracy 0.12 -- 56.79 + 57.69 + 507.46 + 4.92 = 626.86:   6%|▌         | 120/2048 [02:12<23:37,  1.36it/s]
loss 2.35 accuracy 0.12 -- 56.79 + 57.69 + 507.46 + 4.92 = 626.86:   6%|▌         | 121/2048 [02:12<22:50,  1.41it/s]
loss 2.48 accuracy 0.12 -- 164.00 + 57.10 + 499.85 + 4.94 = 725.90:   6%|▌         | 121/2048 [02:13<22:50,  1.41it/s]
loss 2.48 accuracy 0.12 -- 164.00 + 57.10 + 499.85 + 4.94 = 725.90:   6%|▌         | 122/2048 [02:13<23:13,  1.38it/s]
loss 2.10 accuracy 0.25 -- 56.52 + 172.57 + 512.89 + 4.95 = 746.94:   6%|▌         | 122/2048 [02:14<23:13,  1.38it/s]
loss 2.10 accuracy 0.25 -- 56.52 + 172.57 + 512.89 + 4.95 = 746.94:   6%|▌         | 123/2048 [02:14<23:42,  1.35it/s]
loss 2.35 accuracy 0.19 -- 57.54 + 56.75 + 508.21 + 4.91 = 627.42:   6%|▌         | 123/2048 [02:15<23:42,  1.35it/s] 
loss 2.35 accuracy 0.19 -- 57.54 + 56.75 + 508.21 + 4.91 = 627.42:   6%|▌         | 124/2048 [02:15<22:53,  1.40it/s]
loss 2.11 accuracy 0.19 -- 168.71 + 57.16 + 506.84 + 4.94 = 737.66:   6%|▌         | 124/2048 [02:15<22:53,  1.40it/s]
loss 2.11 accuracy 0.19 -- 168.71 + 57.16 + 506.84 + 4.94 = 737.66:   6%|▌         | 125/2048 [02:15<23:22,  1.37it/s]
loss 2.33 accuracy 0.25 -- 57.03 + 57.84 + 637.25 + 4.93 = 757.06:   6%|▌         | 125/2048 [02:16<23:22,  1.37it/s] 
loss 2.33 accuracy 0.25 -- 57.03 + 57.84 + 637.25 + 4.93 = 757.06:   6%|▌         | 126/2048 [02:16<24:13,  1.32it/s]
loss 1.86 accuracy 0.38 -- 57.79 + 56.64 + 516.31 + 4.95 = 635.70:   6%|▌         | 126/2048 [02:17<24:13,  1.32it/s]
loss 1.86 accuracy 0.38 -- 57.79 + 56.64 + 516.31 + 4.95 = 635.70:   6%|▌         | 127/2048 [02:17<23:19,  1.37it/s]
loss 2.07 accuracy 0.25 -- 56.49 + 57.88 + 631.75 + 4.90 = 751.02:   6%|▌         | 127/2048 [02:18<23:19,  1.37it/s]
loss 2.07 accuracy 0.25 -- 56.49 + 57.88 + 631.75 + 4.90 = 751.02:   6%|▋         | 128/2048 [02:18<23:47,  1.35it/s]
loss 2.44 accuracy 0.19 -- 57.35 + 56.65 + 512.75 + 4.96 = 631.70:   6%|▋         | 128/2048 [02:18<23:47,  1.35it/s]
loss 2.44 accuracy 0.19 -- 57.35 + 56.65 + 512.75 + 4.96 = 631.70:   6%|▋         | 129/2048 [02:18<22:57,  1.39it/s]
loss 2.72 accuracy 0.06 -- 57.21 + 172.68 + 512.15 + 4.94 = 746.98:   6%|▋         | 129/2048 [02:19<22:57,  1.39it/s]
loss 2.72 accuracy 0.06 -- 57.21 + 172.68 + 512.15 + 4.94 = 746.98:   6%|▋         | 130/2048 [02:19<23:29,  1.36it/s]
loss 1.97 accuracy 0.31 -- 56.74 + 56.52 + 509.16 + 4.92 = 627.34:   6%|▋         | 130/2048 [02:20<23:29,  1.36it/s] 
loss 1.97 accuracy 0.31 -- 56.74 + 56.52 + 509.16 + 4.92 = 627.34:   6%|▋         | 131/2048 [02:20<22:42,  1.41it/s]
loss 1.96 accuracy 0.25 -- 57.28 + 56.70 + 510.21 + 4.91 = 629.10:   6%|▋         | 131/2048 [02:21<22:42,  1.41it/s]
loss 1.96 accuracy 0.25 -- 57.28 + 56.70 + 510.21 + 4.91 = 629.10:   6%|▋         | 132/2048 [02:21<23:14,  1.37it/s]
loss 1.83 accuracy 0.38 -- 56.41 + 57.49 + 504.92 + 4.93 = 623.75:   6%|▋         | 132/2048 [02:21<23:14,  1.37it/s]
loss 1.83 accuracy 0.38 -- 56.41 + 57.49 + 504.92 + 4.93 = 623.75:   6%|▋         | 133/2048 [02:21<22:30,  1.42it/s]
loss 2.27 accuracy 0.12 -- 164.79 + 57.08 + 499.86 + 4.91 = 726.65:   6%|▋         | 133/2048 [02:22<22:30,  1.42it/s]
loss 2.27 accuracy 0.12 -- 164.79 + 57.08 + 499.86 + 4.91 = 726.65:   7%|▋         | 134/2048 [02:22<22:57,  1.39it/s]
loss 2.22 accuracy 0.19 -- 56.63 + 172.73 + 511.81 + 4.95 = 746.13:   7%|▋         | 134/2048 [02:23<22:57,  1.39it/s]
loss 2.22 accuracy 0.19 -- 56.63 + 172.73 + 511.81 + 4.95 = 746.13:   7%|▋         | 135/2048 [02:23<23:27,  1.36it/s]
loss 2.32 accuracy 0.19 -- 57.06 + 56.75 + 507.38 + 4.89 = 626.08:   7%|▋         | 135/2048 [02:23<23:27,  1.36it/s] 
loss 2.32 accuracy 0.19 -- 57.06 + 56.75 + 507.38 + 4.89 = 626.08:   7%|▋         | 136/2048 [02:23<22:40,  1.41it/s]
loss 2.10 accuracy 0.12 -- 169.74 + 57.23 + 506.27 + 4.91 = 738.14:   7%|▋         | 136/2048 [02:24<22:40,  1.41it/s]
loss 2.10 accuracy 0.12 -- 169.74 + 57.23 + 506.27 + 4.91 = 738.14:   7%|▋         | 137/2048 [02:24<23:10,  1.37it/s]
loss 2.07 accuracy 0.25 -- 56.96 + 58.00 + 635.54 + 4.96 = 755.47:   7%|▋         | 137/2048 [02:25<23:10,  1.37it/s] 
loss 2.07 accuracy 0.25 -- 56.96 + 58.00 + 635.54 + 4.96 = 755.47:   7%|▋         | 138/2048 [02:25<23:41,  1.34it/s]
loss 2.37 accuracy 0.06 -- 57.04 + 56.86 + 515.98 + 4.90 = 634.79:   7%|▋         | 138/2048 [02:26<23:41,  1.34it/s]
loss 2.37 accuracy 0.06 -- 57.04 + 56.86 + 515.98 + 4.90 = 634.79:   7%|▋         | 139/2048 [02:26<22:53,  1.39it/s]
loss 2.22 accuracy 0.25 -- 57.00 + 57.28 + 631.30 + 4.94 = 750.53:   7%|▋         | 139/2048 [02:26<22:53,  1.39it/s]
loss 2.22 accuracy 0.25 -- 57.00 + 57.28 + 631.30 + 4.94 = 750.53:   7%|▋         | 140/2048 [02:26<23:46,  1.34it/s]
loss 2.31 accuracy 0.12 -- 57.24 + 56.69 + 511.23 + 4.93 = 630.09:   7%|▋         | 140/2048 [02:27<23:46,  1.34it/s]
loss 2.31 accuracy 0.12 -- 57.24 + 56.69 + 511.23 + 4.93 = 630.09:   7%|▋         | 141/2048 [02:27<22:54,  1.39it/s]
loss 2.09 accuracy 0.44 -- 56.88 + 171.64 + 512.40 + 4.94 = 745.86:   7%|▋         | 141/2048 [02:28<22:54,  1.39it/s]
loss 2.09 accuracy 0.44 -- 56.88 + 171.64 + 512.40 + 4.94 = 745.86:   7%|▋         | 142/2048 [02:28<23:23,  1.36it/s]
loss 2.37 accuracy 0.12 -- 56.85 + 56.98 + 511.36 + 4.88 = 630.07:   7%|▋         | 142/2048 [02:29<23:23,  1.36it/s] 
loss 2.37 accuracy 0.12 -- 56.85 + 56.98 + 511.36 + 4.88 = 630.07:   7%|▋         | 143/2048 [02:29<22:37,  1.40it/s]
loss 2.15 accuracy 0.25 -- 57.32 + 56.56 + 620.44 + 5.27 = 739.59:   7%|▋         | 143/2048 [02:29<22:37,  1.40it/s]
loss 2.15 accuracy 0.25 -- 57.32 + 56.56 + 620.44 + 5.27 = 739.59:   7%|▋         | 144/2048 [02:29<24:11,  1.31it/s]
loss 2.00 accuracy 0.38 -- 64.58 + 62.25 + 506.10 + 4.92 = 637.85:   7%|▋         | 144/2048 [02:30<24:11,  1.31it/s]
loss 2.00 accuracy 0.38 -- 64.58 + 62.25 + 506.10 + 4.92 = 637.85:   7%|▋         | 145/2048 [02:30<23:20,  1.36it/s]
loss 1.91 accuracy 0.25 -- 163.47 + 56.87 + 499.36 + 4.89 = 724.59:   7%|▋         | 145/2048 [02:31<23:20,  1.36it/s]
loss 1.91 accuracy 0.25 -- 163.47 + 56.87 + 499.36 + 4.89 = 724.59:   7%|▋         | 146/2048 [02:31<23:29,  1.35it/s]
loss 2.23 accuracy 0.19 -- 56.77 + 172.28 + 513.99 + 4.92 = 747.96:   7%|▋         | 146/2048 [02:32<23:29,  1.35it/s]
loss 2.23 accuracy 0.19 -- 56.77 + 172.28 + 513.99 + 4.92 = 747.96:   7%|▋         | 147/2048 [02:32<23:48,  1.33it/s]
loss 2.77 accuracy 0.12 -- 57.02 + 56.90 + 509.43 + 4.88 = 628.23:   7%|▋         | 147/2048 [02:32<23:48,  1.33it/s] 
loss 2.77 accuracy 0.12 -- 57.02 + 56.90 + 509.43 + 4.88 = 628.23:   7%|▋         | 148/2048 [02:32<22:53,  1.38it/s]
loss 2.19 accuracy 0.12 -- 168.88 + 57.69 + 507.05 + 4.90 = 738.53:   7%|▋         | 148/2048 [02:33<22:53,  1.38it/s]
loss 2.19 accuracy 0.12 -- 168.88 + 57.69 + 507.05 + 4.90 = 738.53:   7%|▋         | 149/2048 [02:33<23:17,  1.36it/s]
loss 2.25 accuracy 0.12 -- 56.84 + 57.61 + 636.93 + 4.89 = 756.27:   7%|▋         | 149/2048 [02:34<23:17,  1.36it/s] 
loss 2.25 accuracy 0.12 -- 56.84 + 57.61 + 636.93 + 4.89 = 756.27:   7%|▋         | 150/2048 [02:34<23:43,  1.33it/s]
loss 1.90 accuracy 0.25 -- 57.47 + 56.70 + 514.85 + 4.90 = 633.92:   7%|▋         | 150/2048 [02:34<23:43,  1.33it/s]
loss 1.90 accuracy 0.25 -- 57.47 + 56.70 + 514.85 + 4.90 = 633.92:   7%|▋         | 151/2048 [02:34<22:53,  1.38it/s]
loss 2.24 accuracy 0.25 -- 56.65 + 57.49 + 629.80 + 4.91 = 748.84:   7%|▋         | 151/2048 [02:35<22:53,  1.38it/s]
loss 2.24 accuracy 0.25 -- 56.65 + 57.49 + 629.80 + 4.91 = 748.84:   7%|▋         | 152/2048 [02:35<23:22,  1.35it/s]
loss 2.01 accuracy 0.31 -- 57.27 + 56.62 + 512.15 + 4.93 = 630.97:   7%|▋         | 152/2048 [02:36<23:22,  1.35it/s]
loss 2.01 accuracy 0.31 -- 57.27 + 56.62 + 512.15 + 4.93 = 630.97:   7%|▋         | 153/2048 [02:36<22:35,  1.40it/s]
loss 2.15 accuracy 0.12 -- 57.08 + 171.91 + 513.96 + 4.95 = 747.91:   7%|▋         | 153/2048 [02:37<22:35,  1.40it/s]
loss 2.15 accuracy 0.12 -- 57.08 + 171.91 + 513.96 + 4.95 = 747.91:   8%|▊         | 154/2048 [02:37<23:09,  1.36it/s]
loss 2.17 accuracy 0.12 -- 56.86 + 56.93 + 510.64 + 4.93 = 629.36:   8%|▊         | 154/2048 [02:37<23:09,  1.36it/s] 
loss 2.17 accuracy 0.12 -- 56.86 + 56.93 + 510.64 + 4.93 = 629.36:   8%|▊         | 155/2048 [02:37<22:25,  1.41it/s]
loss 2.52 accuracy 0.12 -- 57.06 + 56.65 + 507.95 + 4.93 = 626.58:   8%|▊         | 155/2048 [02:38<22:25,  1.41it/s]
loss 2.52 accuracy 0.12 -- 57.06 + 56.65 + 507.95 + 4.93 = 626.58:   8%|▊         | 156/2048 [02:38<22:55,  1.38it/s]
loss 2.81 accuracy 0.12 -- 56.52 + 57.54 + 505.01 + 4.93 = 624.01:   8%|▊         | 156/2048 [02:39<22:55,  1.38it/s]
loss 2.81 accuracy 0.12 -- 56.52 + 57.54 + 505.01 + 4.93 = 624.01:   8%|▊         | 157/2048 [02:39<22:12,  1.42it/s]
loss 2.05 accuracy 0.06 -- 163.52 + 57.22 + 499.79 + 4.88 = 725.41:   8%|▊         | 157/2048 [02:39<22:12,  1.42it/s]
loss 2.05 accuracy 0.06 -- 163.52 + 57.22 + 499.79 + 4.88 = 725.41:   8%|▊         | 158/2048 [02:39<22:38,  1.39it/s]
loss 1.97 accuracy 0.19 -- 56.30 + 171.44 + 510.95 + 4.88 = 743.58:   8%|▊         | 158/2048 [02:40<22:38,  1.39it/s]
loss 1.97 accuracy 0.19 -- 56.30 + 171.44 + 510.95 + 4.88 = 743.58:   8%|▊         | 159/2048 [02:40<23:07,  1.36it/s]
loss 2.12 accuracy 0.25 -- 56.95 + 56.33 + 505.41 + 4.95 = 623.65:   8%|▊         | 159/2048 [02:41<23:07,  1.36it/s] 
loss 2.12 accuracy 0.25 -- 56.95 + 56.33 + 505.41 + 4.95 = 623.65:   8%|▊         | 160/2048 [02:41<22:19,  1.41it/s]
loss 2.07 accuracy 0.19 -- 169.23 + 57.17 + 506.95 + 4.93 = 738.29:   8%|▊         | 160/2048 [02:42<22:19,  1.41it/s]
loss 2.07 accuracy 0.19 -- 169.23 + 57.17 + 506.95 + 4.93 = 738.29:   8%|▊         | 161/2048 [02:42<22:51,  1.38it/s]
loss 2.32 accuracy 0.00 -- 56.79 + 57.18 + 636.49 + 4.94 = 755.40:   8%|▊         | 161/2048 [02:42<22:51,  1.38it/s] 
loss 2.32 accuracy 0.00 -- 56.79 + 57.18 + 636.49 + 4.94 = 755.40:   8%|▊         | 162/2048 [02:42<23:22,  1.34it/s]
loss 1.97 accuracy 0.31 -- 57.67 + 56.96 + 514.96 + 4.92 = 634.51:   8%|▊         | 162/2048 [02:43<23:22,  1.34it/s]
loss 1.97 accuracy 0.31 -- 57.67 + 56.96 + 514.96 + 4.92 = 634.51:   8%|▊         | 163/2048 [02:43<22:35,  1.39it/s]
loss 2.74 accuracy 0.12 -- 56.47 + 57.61 + 630.52 + 4.93 = 749.53:   8%|▊         | 163/2048 [02:44<22:35,  1.39it/s]
loss 2.74 accuracy 0.12 -- 56.47 + 57.61 + 630.52 + 4.93 = 749.53:   8%|▊         | 164/2048 [02:44<23:07,  1.36it/s]
loss 2.02 accuracy 0.25 -- 57.34 + 56.89 + 513.60 + 4.95 = 632.78:   8%|▊         | 164/2048 [02:45<23:07,  1.36it/s]
loss 2.02 accuracy 0.25 -- 57.34 + 56.89 + 513.60 + 4.95 = 632.78:   8%|▊         | 165/2048 [02:45<22:23,  1.40it/s]
loss 2.11 accuracy 0.12 -- 56.52 + 171.89 + 510.56 + 4.92 = 743.89:   8%|▊         | 165/2048 [02:45<22:23,  1.40it/s]
loss 2.11 accuracy 0.12 -- 56.52 + 171.89 + 510.56 + 4.92 = 743.89:   8%|▊         | 166/2048 [02:45<22:55,  1.37it/s]
loss 1.97 accuracy 0.31 -- 57.26 + 56.55 + 509.10 + 4.89 = 627.81:   8%|▊         | 166/2048 [02:46<22:55,  1.37it/s] 
loss 1.97 accuracy 0.31 -- 57.26 + 56.55 + 509.10 + 4.89 = 627.81:   8%|▊         | 167/2048 [02:46<22:12,  1.41it/s]
loss 2.23 accuracy 0.12 -- 57.19 + 56.49 + 508.97 + 4.92 = 627.58:   8%|▊         | 167/2048 [02:47<22:12,  1.41it/s]
loss 2.23 accuracy 0.12 -- 57.19 + 56.49 + 508.97 + 4.92 = 627.58:   8%|▊         | 168/2048 [02:47<22:45,  1.38it/s]
loss 1.89 accuracy 0.31 -- 56.58 + 57.20 + 504.29 + 4.90 = 622.97:   8%|▊         | 168/2048 [02:47<22:45,  1.38it/s]
loss 1.89 accuracy 0.31 -- 56.58 + 57.20 + 504.29 + 4.90 = 622.97:   8%|▊         | 169/2048 [02:47<22:01,  1.42it/s]
loss 2.49 accuracy 0.19 -- 163.59 + 57.06 + 498.10 + 4.97 = 723.73:   8%|▊         | 169/2048 [02:48<22:01,  1.42it/s]
loss 2.49 accuracy 0.19 -- 163.59 + 57.06 + 498.10 + 4.97 = 723.73:   8%|▊         | 170/2048 [02:48<22:27,  1.39it/s]
loss 2.04 accuracy 0.19 -- 57.12 + 172.37 + 512.25 + 4.91 = 746.65:   8%|▊         | 170/2048 [02:49<22:27,  1.39it/s]
loss 2.04 accuracy 0.19 -- 57.12 + 172.37 + 512.25 + 4.91 = 746.65:   8%|▊         | 171/2048 [02:49<22:59,  1.36it/s]
loss 2.29 accuracy 0.19 -- 57.48 + 56.50 + 510.72 + 4.92 = 629.61:   8%|▊         | 171/2048 [02:50<22:59,  1.36it/s] 
loss 2.29 accuracy 0.19 -- 57.48 + 56.50 + 510.72 + 4.92 = 629.61:   8%|▊         | 172/2048 [02:50<22:14,  1.41it/s]
loss 1.98 accuracy 0.31 -- 169.24 + 57.41 + 505.79 + 4.90 = 737.35:   8%|▊         | 172/2048 [02:50<22:14,  1.41it/s]
loss 1.98 accuracy 0.31 -- 169.24 + 57.41 + 505.79 + 4.90 = 737.35:   8%|▊         | 173/2048 [02:50<22:43,  1.37it/s]
loss 2.11 accuracy 0.06 -- 56.80 + 57.32 + 636.36 + 4.90 = 755.38:   8%|▊         | 173/2048 [02:51<22:43,  1.37it/s] 
loss 2.11 accuracy 0.06 -- 56.80 + 57.32 + 636.36 + 4.90 = 755.38:   8%|▊         | 174/2048 [02:51<23:14,  1.34it/s]
loss 2.19 accuracy 0.12 -- 57.36 + 56.94 + 516.25 + 4.93 = 635.49:   8%|▊         | 174/2048 [02:52<23:14,  1.34it/s]
loss 2.19 accuracy 0.12 -- 57.36 + 56.94 + 516.25 + 4.93 = 635.49:   9%|▊         | 175/2048 [02:52<22:28,  1.39it/s]
loss 2.35 accuracy 0.06 -- 56.71 + 57.53 + 634.73 + 4.90 = 753.86:   9%|▊         | 175/2048 [02:53<22:28,  1.39it/s]
loss 2.35 accuracy 0.06 -- 56.71 + 57.53 + 634.73 + 4.90 = 753.86:   9%|▊         | 176/2048 [02:53<23:02,  1.35it/s]
loss 2.22 accuracy 0.25 -- 57.22 + 56.85 + 512.31 + 4.91 = 631.29:   9%|▊         | 176/2048 [02:53<23:02,  1.35it/s]
loss 2.22 accuracy 0.25 -- 57.22 + 56.85 + 512.31 + 4.91 = 631.29:   9%|▊         | 177/2048 [02:53<22:16,  1.40it/s]
loss 2.32 accuracy 0.06 -- 56.72 + 172.46 + 512.29 + 4.94 = 746.41:   9%|▊         | 177/2048 [02:54<22:16,  1.40it/s]
loss 2.32 accuracy 0.06 -- 56.72 + 172.46 + 512.29 + 4.94 = 746.41:   9%|▊         | 178/2048 [02:54<22:49,  1.37it/s]
loss 1.94 accuracy 0.38 -- 56.87 + 56.51 + 509.52 + 4.94 = 627.83:   9%|▊         | 178/2048 [02:55<22:49,  1.37it/s] 
loss 1.94 accuracy 0.38 -- 56.87 + 56.51 + 509.52 + 4.94 = 627.83:   9%|▊         | 179/2048 [02:55<22:05,  1.41it/s]
loss 2.28 accuracy 0.06 -- 57.41 + 56.51 + 507.89 + 4.93 = 626.74:   9%|▊         | 179/2048 [02:55<22:05,  1.41it/s]
loss 2.28 accuracy 0.06 -- 57.41 + 56.51 + 507.89 + 4.93 = 626.74:   9%|▉         | 180/2048 [02:55<22:37,  1.38it/s]
loss 1.94 accuracy 0.25 -- 56.52 + 57.80 + 504.99 + 4.92 = 624.24:   9%|▉         | 180/2048 [02:56<22:37,  1.38it/s]
loss 1.94 accuracy 0.25 -- 56.52 + 57.80 + 504.99 + 4.92 = 624.24:   9%|▉         | 181/2048 [02:56<21:54,  1.42it/s]
loss 2.05 accuracy 0.38 -- 164.59 + 57.28 + 500.50 + 4.94 = 727.30:   9%|▉         | 181/2048 [02:57<21:54,  1.42it/s]
loss 2.05 accuracy 0.38 -- 164.59 + 57.28 + 500.50 + 4.94 = 727.30:   9%|▉         | 182/2048 [02:57<22:42,  1.37it/s]
loss 2.28 accuracy 0.19 -- 56.40 + 172.88 + 511.83 + 4.95 = 746.06:   9%|▉         | 182/2048 [02:58<22:42,  1.37it/s]
loss 2.28 accuracy 0.19 -- 56.40 + 172.88 + 511.83 + 4.95 = 746.06:   9%|▉         | 183/2048 [02:58<23:05,  1.35it/s]
loss 2.40 accuracy 0.06 -- 57.77 + 56.96 + 509.54 + 4.91 = 629.18:   9%|▉         | 183/2048 [02:58<23:05,  1.35it/s] 
loss 2.40 accuracy 0.06 -- 57.77 + 56.96 + 509.54 + 4.91 = 629.18:   9%|▉         | 184/2048 [02:58<22:16,  1.39it/s]
loss 1.92 accuracy 0.19 -- 169.13 + 57.23 + 507.24 + 4.93 = 738.54:   9%|▉         | 184/2048 [02:59<22:16,  1.39it/s]
loss 1.92 accuracy 0.19 -- 169.13 + 57.23 + 507.24 + 4.93 = 738.54:   9%|▉         | 185/2048 [02:59<22:43,  1.37it/s]
loss 2.04 accuracy 0.25 -- 56.77 + 57.83 + 637.08 + 4.91 = 756.58:   9%|▉         | 185/2048 [03:00<22:43,  1.37it/s] 
loss 2.04 accuracy 0.25 -- 56.77 + 57.83 + 637.08 + 4.91 = 756.58:   9%|▉         | 186/2048 [03:00<23:11,  1.34it/s]
loss 2.33 accuracy 0.19 -- 57.50 + 56.78 + 517.36 + 4.92 = 636.56:   9%|▉         | 186/2048 [03:01<23:11,  1.34it/s]
loss 2.33 accuracy 0.19 -- 57.50 + 56.78 + 517.36 + 4.92 = 636.56:   9%|▉         | 187/2048 [03:01<22:24,  1.38it/s]
loss 2.19 accuracy 0.25 -- 56.87 + 57.57 + 632.57 + 4.94 = 751.95:   9%|▉         | 187/2048 [03:01<22:24,  1.38it/s]
loss 2.19 accuracy 0.25 -- 56.87 + 57.57 + 632.57 + 4.94 = 751.95:   9%|▉         | 188/2048 [03:01<22:55,  1.35it/s]
loss 2.34 accuracy 0.12 -- 57.35 + 57.10 + 511.51 + 4.91 = 630.87:   9%|▉         | 188/2048 [03:02<22:55,  1.35it/s]
loss 2.34 accuracy 0.12 -- 57.35 + 57.10 + 511.51 + 4.91 = 630.87:   9%|▉         | 189/2048 [03:02<22:09,  1.40it/s]
loss 2.06 accuracy 0.31 -- 56.61 + 171.98 + 512.44 + 4.89 = 745.92:   9%|▉         | 189/2048 [03:03<22:09,  1.40it/s]
loss 2.06 accuracy 0.31 -- 56.61 + 171.98 + 512.44 + 4.89 = 745.92:   9%|▉         | 190/2048 [03:03<22:41,  1.36it/s]
loss 1.74 accuracy 0.44 -- 56.83 + 56.69 + 509.75 + 4.92 = 628.20:   9%|▉         | 190/2048 [03:03<22:41,  1.36it/s] 
loss 1.74 accuracy 0.44 -- 56.83 + 56.69 + 509.75 + 4.92 = 628.20:   9%|▉         | 191/2048 [03:03<21:57,  1.41it/s]
loss 2.21 accuracy 0.19 -- 57.00 + 56.49 + 507.99 + 4.95 = 626.42:   9%|▉         | 191/2048 [03:04<21:57,  1.41it/s]
loss 2.21 accuracy 0.19 -- 57.00 + 56.49 + 507.99 + 4.95 = 626.42:   9%|▉         | 192/2048 [03:04<22:28,  1.38it/s]
loss 2.49 accuracy 0.19 -- 57.09 + 58.09 + 508.30 + 4.93 = 628.41:   9%|▉         | 192/2048 [03:05<22:28,  1.38it/s]
loss 2.49 accuracy 0.19 -- 57.09 + 58.09 + 508.30 + 4.93 = 628.41:   9%|▉         | 193/2048 [03:05<21:48,  1.42it/s]
loss 1.81 accuracy 0.38 -- 163.69 + 57.27 + 500.14 + 4.94 = 726.03:   9%|▉         | 193/2048 [03:06<21:48,  1.42it/s]
loss 1.81 accuracy 0.38 -- 163.69 + 57.27 + 500.14 + 4.94 = 726.03:   9%|▉         | 194/2048 [03:06<22:14,  1.39it/s]
loss 2.65 accuracy 0.25 -- 56.18 + 172.23 + 511.22 + 4.93 = 744.56:   9%|▉         | 194/2048 [03:06<22:14,  1.39it/s]
loss 2.65 accuracy 0.25 -- 56.18 + 172.23 + 511.22 + 4.93 = 744.56:  10%|▉         | 195/2048 [03:06<22:42,  1.36it/s]
loss 1.69 accuracy 0.31 -- 57.36 + 56.79 + 508.60 + 4.95 = 627.69:  10%|▉         | 195/2048 [03:07<22:42,  1.36it/s] 
loss 1.69 accuracy 0.31 -- 57.36 + 56.79 + 508.60 + 4.95 = 627.69:  10%|▉         | 196/2048 [03:07<22:17,  1.38it/s]
loss 1.94 accuracy 0.38 -- 168.32 + 57.32 + 507.41 + 4.93 = 737.98:  10%|▉         | 196/2048 [03:08<22:17,  1.38it/s]
loss 1.94 accuracy 0.38 -- 168.32 + 57.32 + 507.41 + 4.93 = 737.98:  10%|▉         | 197/2048 [03:08<22:40,  1.36it/s]
loss 2.44 accuracy 0.12 -- 57.02 + 57.82 + 638.13 + 5.01 = 757.98:  10%|▉         | 197/2048 [03:09<22:40,  1.36it/s] 
loss 2.44 accuracy 0.12 -- 57.02 + 57.82 + 638.13 + 5.01 = 757.98:  10%|▉         | 198/2048 [03:09<23:08,  1.33it/s]
loss 2.11 accuracy 0.19 -- 57.63 + 57.05 + 516.01 + 4.94 = 635.62:  10%|▉         | 198/2048 [03:09<23:08,  1.33it/s]
loss 2.11 accuracy 0.19 -- 57.63 + 57.05 + 516.01 + 4.94 = 635.62:  10%|▉         | 199/2048 [03:09<22:18,  1.38it/s]
loss 1.91 accuracy 0.31 -- 56.81 + 57.88 + 632.58 + 4.94 = 752.22:  10%|▉         | 199/2048 [03:10<22:18,  1.38it/s]
loss 1.91 accuracy 0.31 -- 56.81 + 57.88 + 632.58 + 4.94 = 752.22:  10%|▉         | 200/2048 [03:10<22:48,  1.35it/s]
loss 2.18 accuracy 0.25 -- 57.42 + 57.02 + 511.97 + 4.95 = 631.35:  10%|▉         | 200/2048 [03:11<22:48,  1.35it/s]
loss 2.18 accuracy 0.25 -- 57.42 + 57.02 + 511.97 + 4.95 = 631.35:  10%|▉         | 201/2048 [03:11<22:02,  1.40it/s]
loss 2.10 accuracy 0.12 -- 56.63 + 171.83 + 511.85 + 4.92 = 745.22:  10%|▉         | 201/2048 [03:11<22:02,  1.40it/s]
loss 2.10 accuracy 0.12 -- 56.63 + 171.83 + 511.85 + 4.92 = 745.22:  10%|▉         | 202/2048 [03:11<22:33,  1.36it/s]
loss 2.07 accuracy 0.12 -- 56.89 + 56.81 + 509.83 + 4.92 = 628.44:  10%|▉         | 202/2048 [03:12<22:33,  1.36it/s] 
loss 2.07 accuracy 0.12 -- 56.89 + 56.81 + 509.83 + 4.92 = 628.44:  10%|▉         | 203/2048 [03:12<22:09,  1.39it/s]
loss 1.77 accuracy 0.38 -- 57.74 + 56.96 + 507.62 + 4.92 = 627.25:  10%|▉         | 203/2048 [03:13<22:09,  1.39it/s]
loss 1.77 accuracy 0.38 -- 57.74 + 56.96 + 507.62 + 4.92 = 627.25:  10%|▉         | 204/2048 [03:13<22:34,  1.36it/s]
loss 2.33 accuracy 0.31 -- 56.72 + 57.69 + 505.65 + 4.91 = 624.97:  10%|▉         | 204/2048 [03:14<22:34,  1.36it/s]
loss 2.33 accuracy 0.31 -- 56.72 + 57.69 + 505.65 + 4.91 = 624.97:  10%|█         | 205/2048 [03:14<21:48,  1.41it/s]
loss 2.38 accuracy 0.38 -- 164.29 + 57.38 + 499.15 + 4.93 = 725.75:  10%|█         | 205/2048 [03:14<21:48,  1.41it/s]
loss 2.38 accuracy 0.38 -- 164.29 + 57.38 + 499.15 + 4.93 = 725.75:  10%|█         | 206/2048 [03:14<22:11,  1.38it/s]
loss 2.31 accuracy 0.44 -- 56.76 + 172.77 + 511.93 + 4.93 = 746.37:  10%|█         | 206/2048 [03:15<22:11,  1.38it/s]
loss 2.31 accuracy 0.44 -- 56.76 + 172.77 + 511.93 + 4.93 = 746.37:  10%|█         | 207/2048 [03:15<22:38,  1.35it/s]
loss 2.28 accuracy 0.25 -- 57.23 + 56.98 + 508.28 + 4.96 = 627.46:  10%|█         | 207/2048 [03:16<22:38,  1.35it/s] 
loss 2.28 accuracy 0.25 -- 57.23 + 56.98 + 508.28 + 4.96 = 627.46:  10%|█         | 208/2048 [03:16<21:52,  1.40it/s]
loss 2.10 accuracy 0.19 -- 170.13 + 57.33 + 508.59 + 4.92 = 740.97:  10%|█         | 208/2048 [03:17<21:52,  1.40it/s]
loss 2.10 accuracy 0.19 -- 170.13 + 57.33 + 508.59 + 4.92 = 740.97:  10%|█         | 209/2048 [03:17<22:21,  1.37it/s]
loss 2.04 accuracy 0.25 -- 56.47 + 57.74 + 636.53 + 4.94 = 755.68:  10%|█         | 209/2048 [03:17<22:21,  1.37it/s] 
loss 2.04 accuracy 0.25 -- 56.47 + 57.74 + 636.53 + 4.94 = 755.68:  10%|█         | 210/2048 [03:17<23:10,  1.32it/s]
loss 2.06 accuracy 0.12 -- 57.73 + 56.99 + 515.56 + 4.95 = 635.24:  10%|█         | 210/2048 [03:18<23:10,  1.32it/s]
loss 2.06 accuracy 0.12 -- 57.73 + 56.99 + 515.56 + 4.95 = 635.24:  10%|█         | 211/2048 [03:18<22:17,  1.37it/s]
loss 1.95 accuracy 0.19 -- 56.64 + 57.67 + 630.68 + 4.92 = 749.92:  10%|█         | 211/2048 [03:19<22:17,  1.37it/s]
loss 1.95 accuracy 0.19 -- 56.64 + 57.67 + 630.68 + 4.92 = 749.92:  10%|█         | 212/2048 [03:19<22:43,  1.35it/s]
loss 2.15 accuracy 0.31 -- 57.47 + 56.67 + 512.03 + 4.92 = 631.10:  10%|█         | 212/2048 [03:19<22:43,  1.35it/s]
loss 2.15 accuracy 0.31 -- 57.47 + 56.67 + 512.03 + 4.92 = 631.10:  10%|█         | 213/2048 [03:19<21:56,  1.39it/s]
loss 2.36 accuracy 0.38 -- 56.99 + 172.26 + 512.35 + 4.91 = 746.52:  10%|█         | 213/2048 [03:20<21:56,  1.39it/s]
loss 2.36 accuracy 0.38 -- 56.99 + 172.26 + 512.35 + 4.91 = 746.52:  10%|█         | 214/2048 [03:20<22:26,  1.36it/s]
loss 2.56 accuracy 0.06 -- 57.11 + 56.92 + 509.65 + 4.88 = 628.56:  10%|█         | 214/2048 [03:21<22:26,  1.36it/s] 
loss 2.56 accuracy 0.06 -- 57.11 + 56.92 + 509.65 + 4.88 = 628.56:  10%|█         | 215/2048 [03:21<21:43,  1.41it/s]
loss 1.86 accuracy 0.38 -- 57.22 + 56.76 + 509.52 + 4.93 = 628.44:  10%|█         | 215/2048 [03:22<21:43,  1.41it/s]
loss 1.86 accuracy 0.38 -- 57.22 + 56.76 + 509.52 + 4.93 = 628.44:  11%|█         | 216/2048 [03:22<22:13,  1.37it/s]
loss 1.81 accuracy 0.50 -- 56.43 + 57.95 + 504.49 + 4.93 = 623.80:  11%|█         | 216/2048 [03:22<22:13,  1.37it/s]
loss 1.81 accuracy 0.50 -- 56.43 + 57.95 + 504.49 + 4.93 = 623.80:  11%|█         | 217/2048 [03:22<21:30,  1.42it/s]
loss 2.34 accuracy 0.44 -- 164.31 + 57.23 + 500.07 + 4.92 = 726.53:  11%|█         | 217/2048 [03:23<21:30,  1.42it/s]
loss 2.34 accuracy 0.44 -- 164.31 + 57.23 + 500.07 + 4.92 = 726.53:  11%|█         | 218/2048 [03:23<21:56,  1.39it/s]
loss 2.13 accuracy 0.12 -- 56.16 + 172.72 + 510.93 + 4.87 = 744.68:  11%|█         | 218/2048 [03:24<21:56,  1.39it/s]
loss 2.13 accuracy 0.12 -- 56.16 + 172.72 + 510.93 + 4.87 = 744.68:  11%|█         | 219/2048 [03:24<22:25,  1.36it/s]
loss 2.30 accuracy 0.12 -- 57.26 + 56.70 + 508.37 + 4.89 = 627.22:  11%|█         | 219/2048 [03:24<22:25,  1.36it/s] 
loss 2.30 accuracy 0.12 -- 57.26 + 56.70 + 508.37 + 4.89 = 627.22:  11%|█         | 220/2048 [03:24<21:40,  1.41it/s]
loss 2.17 accuracy 0.31 -- 168.09 + 57.29 + 508.04 + 4.93 = 738.35:  11%|█         | 220/2048 [03:25<21:40,  1.41it/s]
loss 2.17 accuracy 0.31 -- 168.09 + 57.29 + 508.04 + 4.93 = 738.35:  11%|█         | 221/2048 [03:25<22:10,  1.37it/s]
loss 2.07 accuracy 0.38 -- 56.72 + 57.71 + 637.06 + 4.98 = 756.47:  11%|█         | 221/2048 [03:26<22:10,  1.37it/s] 
loss 2.07 accuracy 0.38 -- 56.72 + 57.71 + 637.06 + 4.98 = 756.47:  11%|█         | 222/2048 [03:26<22:39,  1.34it/s]
loss 1.87 accuracy 0.19 -- 57.10 + 56.67 + 514.90 + 4.91 = 633.58:  11%|█         | 222/2048 [03:27<22:39,  1.34it/s]
loss 1.87 accuracy 0.19 -- 57.10 + 56.67 + 514.90 + 4.91 = 633.58:  11%|█         | 223/2048 [03:27<21:53,  1.39it/s]
loss 1.91 accuracy 0.25 -- 56.77 + 57.59 + 630.70 + 4.95 = 750.02:  11%|█         | 223/2048 [03:27<21:53,  1.39it/s]
loss 1.91 accuracy 0.25 -- 56.77 + 57.59 + 630.70 + 4.95 = 750.02:  11%|█         | 224/2048 [03:27<22:43,  1.34it/s]
loss 2.20 accuracy 0.12 -- 57.39 + 56.57 + 511.18 + 4.93 = 630.07:  11%|█         | 224/2048 [03:28<22:43,  1.34it/s]
loss 2.20 accuracy 0.12 -- 57.39 + 56.57 + 511.18 + 4.93 = 630.07:  11%|█         | 225/2048 [03:28<21:53,  1.39it/s]
loss 1.94 accuracy 0.19 -- 56.93 + 172.48 + 511.92 + 4.92 = 746.25:  11%|█         | 225/2048 [03:29<21:53,  1.39it/s]
loss 1.94 accuracy 0.19 -- 56.93 + 172.48 + 511.92 + 4.92 = 746.25:  11%|█         | 226/2048 [03:29<22:21,  1.36it/s]
loss 2.24 accuracy 0.25 -- 56.71 + 56.69 + 509.32 + 4.92 = 627.64:  11%|█         | 226/2048 [03:30<22:21,  1.36it/s] 
loss 2.24 accuracy 0.25 -- 56.71 + 56.69 + 509.32 + 4.92 = 627.64:  11%|█         | 227/2048 [03:30<21:36,  1.40it/s]
loss 2.17 accuracy 0.12 -- 57.19 + 56.76 + 510.94 + 4.94 = 629.83:  11%|█         | 227/2048 [03:30<21:36,  1.40it/s]
loss 2.17 accuracy 0.12 -- 57.19 + 56.76 + 510.94 + 4.94 = 629.83:  11%|█         | 228/2048 [03:30<22:07,  1.37it/s]
loss 1.93 accuracy 0.25 -- 56.53 + 57.88 + 505.77 + 4.94 = 625.11:  11%|█         | 228/2048 [03:31<22:07,  1.37it/s]
loss 1.93 accuracy 0.25 -- 56.53 + 57.88 + 505.77 + 4.94 = 625.11:  11%|█         | 229/2048 [03:31<21:24,  1.42it/s]
loss 2.05 accuracy 0.31 -- 163.16 + 57.54 + 499.64 + 4.92 = 725.25:  11%|█         | 229/2048 [03:32<21:24,  1.42it/s]
loss 2.05 accuracy 0.31 -- 163.16 + 57.54 + 499.64 + 4.92 = 725.25:  11%|█         | 230/2048 [03:32<21:49,  1.39it/s]
loss 1.81 accuracy 0.25 -- 56.38 + 172.27 + 512.92 + 4.91 = 746.47:  11%|█         | 230/2048 [03:33<21:49,  1.39it/s]
loss 1.81 accuracy 0.25 -- 56.38 + 172.27 + 512.92 + 4.91 = 746.47:  11%|█▏        | 231/2048 [03:33<22:18,  1.36it/s]
loss 2.12 accuracy 0.19 -- 57.66 + 57.07 + 505.25 + 4.91 = 624.90:  11%|█▏        | 231/2048 [03:33<22:18,  1.36it/s] 
loss 2.12 accuracy 0.19 -- 57.66 + 57.07 + 505.25 + 4.91 = 624.90:  11%|█▏        | 232/2048 [03:33<21:31,  1.41it/s]
loss 1.81 accuracy 0.25 -- 168.09 + 57.09 + 505.84 + 4.94 = 735.97:  11%|█▏        | 232/2048 [03:34<21:31,  1.41it/s]
loss 1.81 accuracy 0.25 -- 168.09 + 57.09 + 505.84 + 4.94 = 735.97:  11%|█▏        | 233/2048 [03:34<21:59,  1.38it/s]
loss 2.00 accuracy 0.31 -- 56.89 + 57.72 + 636.23 + 4.92 = 755.76:  11%|█▏        | 233/2048 [03:35<21:59,  1.38it/s] 
loss 2.00 accuracy 0.31 -- 56.89 + 57.72 + 636.23 + 4.92 = 755.76:  11%|█▏        | 234/2048 [03:35<22:29,  1.34it/s]
loss 1.80 accuracy 0.38 -- 57.39 + 56.74 + 515.63 + 4.93 = 634.70:  11%|█▏        | 234/2048 [03:35<22:29,  1.34it/s]
loss 1.80 accuracy 0.38 -- 57.39 + 56.74 + 515.63 + 4.93 = 634.70:  11%|█▏        | 235/2048 [03:35<21:43,  1.39it/s]
loss 1.81 accuracy 0.38 -- 56.70 + 57.68 + 631.36 + 4.94 = 750.68:  11%|█▏        | 235/2048 [03:36<21:43,  1.39it/s]
loss 1.81 accuracy 0.38 -- 56.70 + 57.68 + 631.36 + 4.94 = 750.68:  12%|█▏        | 236/2048 [03:36<22:14,  1.36it/s]
loss 1.61 accuracy 0.62 -- 57.75 + 57.18 + 513.12 + 4.92 = 632.98:  12%|█▏        | 236/2048 [03:37<22:14,  1.36it/s]
loss 1.61 accuracy 0.62 -- 57.75 + 57.18 + 513.12 + 4.92 = 632.98:  12%|█▏        | 237/2048 [03:37<21:42,  1.39it/s]
loss 2.14 accuracy 0.25 -- 56.78 + 172.18 + 512.48 + 4.92 = 746.36:  12%|█▏        | 237/2048 [03:38<21:42,  1.39it/s]
loss 2.14 accuracy 0.25 -- 56.78 + 172.18 + 512.48 + 4.92 = 746.36:  12%|█▏        | 238/2048 [03:38<22:11,  1.36it/s]
loss 1.85 accuracy 0.25 -- 56.79 + 56.68 + 508.50 + 4.90 = 626.86:  12%|█▏        | 238/2048 [03:38<22:11,  1.36it/s] 
loss 1.85 accuracy 0.25 -- 56.79 + 56.68 + 508.50 + 4.90 = 626.86:  12%|█▏        | 239/2048 [03:38<21:26,  1.41it/s]
loss 1.77 accuracy 0.50 -- 57.48 + 56.99 + 509.82 + 4.94 = 629.22:  12%|█▏        | 239/2048 [03:39<21:26,  1.41it/s]
loss 1.77 accuracy 0.50 -- 57.48 + 56.99 + 509.82 + 4.94 = 629.22:  12%|█▏        | 240/2048 [03:39<21:57,  1.37it/s]
loss 2.57 accuracy 0.12 -- 56.87 + 57.56 + 505.73 + 4.92 = 625.09:  12%|█▏        | 240/2048 [03:40<21:57,  1.37it/s]
loss 2.57 accuracy 0.12 -- 56.87 + 57.56 + 505.73 + 4.92 = 625.09:  12%|█▏        | 241/2048 [03:40<21:15,  1.42it/s]
loss 2.70 accuracy 0.06 -- 163.39 + 57.46 + 500.06 + 4.93 = 725.83:  12%|█▏        | 241/2048 [03:40<21:15,  1.42it/s]
loss 2.70 accuracy 0.06 -- 163.39 + 57.46 + 500.06 + 4.93 = 725.83:  12%|█▏        | 242/2048 [03:40<21:40,  1.39it/s]
loss 2.01 accuracy 0.38 -- 56.35 + 172.59 + 511.52 + 4.93 = 745.39:  12%|█▏        | 242/2048 [03:41<21:40,  1.39it/s]
loss 2.01 accuracy 0.38 -- 56.35 + 172.59 + 511.52 + 4.93 = 745.39:  12%|█▏        | 243/2048 [03:41<22:08,  1.36it/s]
loss 1.81 accuracy 0.25 -- 56.94 + 56.70 + 507.62 + 4.95 = 626.22:  12%|█▏        | 243/2048 [03:42<22:08,  1.36it/s] 
loss 1.81 accuracy 0.25 -- 56.94 + 56.70 + 507.62 + 4.95 = 626.22:  12%|█▏        | 244/2048 [03:42<21:23,  1.41it/s]
loss 1.62 accuracy 0.38 -- 169.45 + 57.36 + 507.84 + 4.93 = 739.57:  12%|█▏        | 244/2048 [03:43<21:23,  1.41it/s]
loss 1.62 accuracy 0.38 -- 169.45 + 57.36 + 507.84 + 4.93 = 739.57:  12%|█▏        | 245/2048 [03:43<22:12,  1.35it/s]
loss 2.19 accuracy 0.19 -- 56.75 + 58.01 + 636.82 + 4.93 = 756.51:  12%|█▏        | 245/2048 [03:43<22:12,  1.35it/s] 
loss 2.19 accuracy 0.19 -- 56.75 + 58.01 + 636.82 + 4.93 = 756.51:  12%|█▏        | 246/2048 [03:43<22:35,  1.33it/s]
loss 2.37 accuracy 0.12 -- 57.45 + 57.20 + 518.44 + 4.89 = 637.99:  12%|█▏        | 246/2048 [03:44<22:35,  1.33it/s]
loss 2.37 accuracy 0.12 -- 57.45 + 57.20 + 518.44 + 4.89 = 637.99:  12%|█▏        | 247/2048 [03:44<21:48,  1.38it/s]
loss 2.56 accuracy 0.19 -- 56.94 + 57.70 + 630.43 + 4.92 = 749.99:  12%|█▏        | 247/2048 [03:45<21:48,  1.38it/s]
loss 2.56 accuracy 0.19 -- 56.94 + 57.70 + 630.43 + 4.92 = 749.99:  12%|█▏        | 248/2048 [03:45<22:14,  1.35it/s]
loss 1.57 accuracy 0.44 -- 57.23 + 56.74 + 511.75 + 4.90 = 630.62:  12%|█▏        | 248/2048 [03:46<22:14,  1.35it/s]
loss 1.57 accuracy 0.44 -- 57.23 + 56.74 + 511.75 + 4.90 = 630.62:  12%|█▏        | 249/2048 [03:46<21:29,  1.40it/s]
loss 2.67 accuracy 0.12 -- 56.55 + 171.81 + 512.12 + 4.96 = 745.44:  12%|█▏        | 249/2048 [03:46<21:29,  1.40it/s]
loss 2.67 accuracy 0.12 -- 56.55 + 171.81 + 512.12 + 4.96 = 745.44:  12%|█▏        | 250/2048 [03:46<21:58,  1.36it/s]
loss 2.05 accuracy 0.25 -- 57.08 + 57.04 + 512.03 + 4.94 = 631.08:  12%|█▏        | 250/2048 [03:47<21:58,  1.36it/s] 
loss 2.05 accuracy 0.25 -- 57.08 + 57.04 + 512.03 + 4.94 = 631.08:  12%|█▏        | 251/2048 [03:47<21:17,  1.41it/s]
loss 1.82 accuracy 0.25 -- 57.53 + 56.89 + 510.12 + 4.91 = 629.44:  12%|█▏        | 251/2048 [03:48<21:17,  1.41it/s]
loss 1.82 accuracy 0.25 -- 57.53 + 56.89 + 510.12 + 4.91 = 629.44:  12%|█▏        | 252/2048 [03:48<21:48,  1.37it/s]
loss 1.90 accuracy 0.12 -- 56.76 + 57.68 + 506.84 + 4.92 = 626.20:  12%|█▏        | 252/2048 [03:48<21:48,  1.37it/s]
loss 1.90 accuracy 0.12 -- 56.76 + 57.68 + 506.84 + 4.92 = 626.20:  12%|█▏        | 253/2048 [03:48<21:07,  1.42it/s]
loss 1.65 accuracy 0.19 -- 163.22 + 57.32 + 499.41 + 4.93 = 724.89:  12%|█▏        | 253/2048 [03:49<21:07,  1.42it/s]
loss 1.65 accuracy 0.19 -- 163.22 + 57.32 + 499.41 + 4.93 = 724.89:  12%|█▏        | 254/2048 [03:49<21:31,  1.39it/s]
loss 2.02 accuracy 0.06 -- 56.28 + 172.01 + 511.96 + 4.94 = 745.19:  12%|█▏        | 254/2048 [03:50<21:31,  1.39it/s]
loss 2.02 accuracy 0.06 -- 56.28 + 172.01 + 511.96 + 4.94 = 745.19:  12%|█▏        | 255/2048 [03:50<21:59,  1.36it/s]
loss 1.94 accuracy 0.44 -- 57.52 + 56.94 + 507.49 + 4.94 = 626.89:  12%|█▏        | 255/2048 [03:51<21:59,  1.36it/s] 
loss 1.94 accuracy 0.44 -- 57.52 + 56.94 + 507.49 + 4.94 = 626.89:  12%|█▎        | 256/2048 [03:51<21:14,  1.41it/s]
loss 2.03 accuracy 0.38 -- 170.15 + 57.36 + 506.24 + 4.92 = 738.66:  12%|█▎        | 256/2048 [03:51<21:14,  1.41it/s]
loss 2.03 accuracy 0.38 -- 170.15 + 57.36 + 506.24 + 4.92 = 738.66:  13%|█▎        | 257/2048 [03:51<21:43,  1.37it/s]
loss 2.14 accuracy 0.12 -- 56.54 + 57.65 + 635.40 + 4.93 = 754.52:  13%|█▎        | 257/2048 [03:52<21:43,  1.37it/s] 
loss 2.14 accuracy 0.12 -- 56.54 + 57.65 + 635.40 + 4.93 = 754.52:  13%|█▎        | 258/2048 [03:52<22:11,  1.34it/s]
loss 2.35 accuracy 0.25 -- 57.75 + 56.90 + 517.26 + 4.92 = 636.84:  13%|█▎        | 258/2048 [03:53<22:11,  1.34it/s]
loss 2.35 accuracy 0.25 -- 57.75 + 56.90 + 517.26 + 4.92 = 636.84:  13%|█▎        | 259/2048 [03:53<21:28,  1.39it/s]
loss 2.11 accuracy 0.19 -- 56.73 + 57.38 + 630.77 + 4.95 = 749.83:  13%|█▎        | 259/2048 [03:54<21:28,  1.39it/s]
loss 2.11 accuracy 0.19 -- 56.73 + 57.38 + 630.77 + 4.95 = 749.83:  13%|█▎        | 260/2048 [03:54<21:58,  1.36it/s]
loss 2.46 accuracy 0.19 -- 57.45 + 56.76 + 512.58 + 4.92 = 631.71:  13%|█▎        | 260/2048 [03:54<21:58,  1.36it/s]
loss 2.46 accuracy 0.19 -- 57.45 + 56.76 + 512.58 + 4.92 = 631.71:  13%|█▎        | 261/2048 [03:54<21:15,  1.40it/s]
loss 2.42 accuracy 0.19 -- 56.66 + 171.69 + 511.43 + 4.93 = 744.72:  13%|█▎        | 261/2048 [03:55<21:15,  1.40it/s]
loss 2.42 accuracy 0.19 -- 56.66 + 171.69 + 511.43 + 4.93 = 744.72:  13%|█▎        | 262/2048 [03:55<21:46,  1.37it/s]
loss 1.97 accuracy 0.25 -- 56.65 + 56.64 + 509.10 + 4.94 = 627.33:  13%|█▎        | 262/2048 [03:56<21:46,  1.37it/s] 
loss 1.97 accuracy 0.25 -- 56.65 + 56.64 + 509.10 + 4.94 = 627.33:  13%|█▎        | 263/2048 [03:56<21:04,  1.41it/s]
loss 2.83 accuracy 0.06 -- 57.32 + 56.68 + 508.42 + 4.90 = 627.32:  13%|█▎        | 263/2048 [03:56<21:04,  1.41it/s]
loss 2.83 accuracy 0.06 -- 57.32 + 56.68 + 508.42 + 4.90 = 627.32:  13%|█▎        | 264/2048 [03:56<21:53,  1.36it/s]
loss 1.86 accuracy 0.44 -- 56.33 + 57.64 + 504.85 + 4.96 = 623.78:  13%|█▎        | 264/2048 [03:57<21:53,  1.36it/s]
loss 1.86 accuracy 0.44 -- 56.33 + 57.64 + 504.85 + 4.96 = 623.78:  13%|█▎        | 265/2048 [03:57<21:07,  1.41it/s]
loss 2.68 accuracy 0.44 -- 163.25 + 57.08 + 498.53 + 4.90 = 723.76:  13%|█▎        | 265/2048 [03:58<21:07,  1.41it/s]
loss 2.68 accuracy 0.44 -- 163.25 + 57.08 + 498.53 + 4.90 = 723.76:  13%|█▎        | 266/2048 [03:58<21:28,  1.38it/s]
loss 1.99 accuracy 0.31 -- 56.39 + 172.53 + 510.43 + 4.93 = 744.28:  13%|█▎        | 266/2048 [03:59<21:28,  1.38it/s]
loss 1.99 accuracy 0.31 -- 56.39 + 172.53 + 510.43 + 4.93 = 744.28:  13%|█▎        | 267/2048 [03:59<21:53,  1.36it/s]
loss 2.00 accuracy 0.31 -- 57.21 + 56.83 + 507.28 + 4.90 = 626.23:  13%|█▎        | 267/2048 [03:59<21:53,  1.36it/s] 
loss 2.00 accuracy 0.31 -- 57.21 + 56.83 + 507.28 + 4.90 = 626.23:  13%|█▎        | 268/2048 [03:59<21:08,  1.40it/s]
loss 1.75 accuracy 0.31 -- 168.90 + 57.69 + 506.47 + 4.92 = 737.98:  13%|█▎        | 268/2048 [04:00<21:08,  1.40it/s]
loss 1.75 accuracy 0.31 -- 168.90 + 57.69 + 506.47 + 4.92 = 737.98:  13%|█▎        | 269/2048 [04:00<21:36,  1.37it/s]
loss 1.96 accuracy 0.19 -- 56.49 + 57.56 + 636.88 + 4.93 = 755.85:  13%|█▎        | 269/2048 [04:01<21:36,  1.37it/s] 
loss 1.96 accuracy 0.19 -- 56.49 + 57.56 + 636.88 + 4.93 = 755.85:  13%|█▎        | 270/2048 [04:01<22:04,  1.34it/s]
loss 1.86 accuracy 0.25 -- 57.43 + 58.72 + 515.15 + 4.92 = 636.22:  13%|█▎        | 270/2048 [04:02<22:04,  1.34it/s]
loss 1.86 accuracy 0.25 -- 57.43 + 58.72 + 515.15 + 4.92 = 636.22:  13%|█▎        | 271/2048 [04:02<21:20,  1.39it/s]
loss 1.78 accuracy 0.31 -- 56.62 + 57.62 + 630.36 + 4.96 = 749.56:  13%|█▎        | 271/2048 [04:02<21:20,  1.39it/s]
loss 1.78 accuracy 0.31 -- 56.62 + 57.62 + 630.36 + 4.96 = 749.56:  13%|█▎        | 272/2048 [04:02<21:50,  1.36it/s]
loss 2.01 accuracy 0.25 -- 57.35 + 57.14 + 512.02 + 4.91 = 631.41:  13%|█▎        | 272/2048 [04:03<21:50,  1.36it/s]
loss 2.01 accuracy 0.25 -- 57.35 + 57.14 + 512.02 + 4.91 = 631.41:  13%|█▎        | 273/2048 [04:03<21:07,  1.40it/s]
loss 2.58 accuracy 0.25 -- 56.69 + 172.20 + 511.64 + 4.89 = 745.43:  13%|█▎        | 273/2048 [04:04<21:07,  1.40it/s]
loss 2.58 accuracy 0.25 -- 56.69 + 172.20 + 511.64 + 4.89 = 745.43:  13%|█▎        | 274/2048 [04:04<21:38,  1.37it/s]
loss 1.98 accuracy 0.50 -- 56.63 + 56.67 + 507.98 + 4.90 = 626.18:  13%|█▎        | 274/2048 [04:04<21:38,  1.37it/s] 
loss 1.98 accuracy 0.50 -- 56.63 + 56.67 + 507.98 + 4.90 = 626.18:  13%|█▎        | 275/2048 [04:04<20:55,  1.41it/s]
loss 1.96 accuracy 0.31 -- 56.76 + 56.39 + 507.06 + 4.90 = 625.11:  13%|█▎        | 275/2048 [04:05<20:55,  1.41it/s]
loss 1.96 accuracy 0.31 -- 56.76 + 56.39 + 507.06 + 4.90 = 625.11:  13%|█▎        | 276/2048 [04:05<21:25,  1.38it/s]
loss 2.47 accuracy 0.06 -- 56.19 + 57.50 + 504.12 + 4.88 = 622.69:  13%|█▎        | 276/2048 [04:06<21:25,  1.38it/s]
loss 2.47 accuracy 0.06 -- 56.19 + 57.50 + 504.12 + 4.88 = 622.69:  14%|█▎        | 277/2048 [04:06<20:45,  1.42it/s]
loss 2.14 accuracy 0.19 -- 163.47 + 57.33 + 499.18 + 4.89 = 724.87:  14%|█▎        | 277/2048 [04:07<20:45,  1.42it/s]
loss 2.14 accuracy 0.19 -- 163.47 + 57.33 + 499.18 + 4.89 = 724.87:  14%|█▎        | 278/2048 [04:07<21:10,  1.39it/s]
loss 2.04 accuracy 0.31 -- 56.69 + 172.04 + 511.21 + 4.90 = 744.83:  14%|█▎        | 278/2048 [04:07<21:10,  1.39it/s]
loss 2.04 accuracy 0.31 -- 56.69 + 172.04 + 511.21 + 4.90 = 744.83:  14%|█▎        | 279/2048 [04:07<21:38,  1.36it/s]
loss 1.98 accuracy 0.12 -- 57.11 + 56.84 + 507.79 + 4.92 = 626.67:  14%|█▎        | 279/2048 [04:08<21:38,  1.36it/s] 
loss 1.98 accuracy 0.12 -- 57.11 + 56.84 + 507.79 + 4.92 = 626.67:  14%|█▎        | 280/2048 [04:08<20:55,  1.41it/s]
loss 1.96 accuracy 0.44 -- 169.02 + 57.23 + 506.40 + 4.92 = 737.57:  14%|█▎        | 280/2048 [04:09<20:55,  1.41it/s]
loss 1.96 accuracy 0.44 -- 169.02 + 57.23 + 506.40 + 4.92 = 737.57:  14%|█▎        | 281/2048 [04:09<21:24,  1.38it/s]
loss 2.25 accuracy 0.06 -- 56.56 + 57.74 + 633.22 + 4.89 = 752.41:  14%|█▎        | 281/2048 [04:10<21:24,  1.38it/s] 
loss 2.25 accuracy 0.06 -- 56.56 + 57.74 + 633.22 + 4.89 = 752.41:  14%|█▍        | 282/2048 [04:10<21:51,  1.35it/s]
loss 2.28 accuracy 0.19 -- 57.09 + 56.85 + 515.70 + 4.90 = 634.54:  14%|█▍        | 282/2048 [04:10<21:51,  1.35it/s]
loss 2.28 accuracy 0.19 -- 57.09 + 56.85 + 515.70 + 4.90 = 634.54:  14%|█▍        | 283/2048 [04:10<21:08,  1.39it/s]
loss 1.79 accuracy 0.12 -- 56.62 + 57.89 + 631.31 + 4.90 = 750.72:  14%|█▍        | 283/2048 [04:11<21:08,  1.39it/s]
loss 1.79 accuracy 0.12 -- 56.62 + 57.89 + 631.31 + 4.90 = 750.72:  14%|█▍        | 284/2048 [04:11<21:39,  1.36it/s]
loss 2.35 accuracy 0.06 -- 57.50 + 57.05 + 513.52 + 4.93 = 632.99:  14%|█▍        | 284/2048 [04:12<21:39,  1.36it/s]
loss 2.35 accuracy 0.06 -- 57.50 + 57.05 + 513.52 + 4.93 = 632.99:  14%|█▍        | 285/2048 [04:12<20:58,  1.40it/s]
loss 1.89 accuracy 0.25 -- 56.70 + 171.73 + 513.02 + 4.89 = 746.34:  14%|█▍        | 285/2048 [04:12<20:58,  1.40it/s]
loss 1.89 accuracy 0.25 -- 56.70 + 171.73 + 513.02 + 4.89 = 746.34:  14%|█▍        | 286/2048 [04:12<21:29,  1.37it/s]
loss 2.74 accuracy 0.25 -- 57.19 + 56.49 + 507.68 + 4.89 = 626.24:  14%|█▍        | 286/2048 [04:13<21:29,  1.37it/s] 
loss 2.74 accuracy 0.25 -- 57.19 + 56.49 + 507.68 + 4.89 = 626.24:  14%|█▍        | 287/2048 [04:13<20:47,  1.41it/s]
loss 2.32 accuracy 0.25 -- 57.33 + 56.96 + 508.47 + 4.93 = 627.70:  14%|█▍        | 287/2048 [04:14<20:47,  1.41it/s]
loss 2.32 accuracy 0.25 -- 57.33 + 56.96 + 508.47 + 4.93 = 627.70:  14%|█▍        | 288/2048 [04:14<21:17,  1.38it/s]
loss 2.05 accuracy 0.19 -- 56.65 + 57.99 + 505.28 + 4.87 = 624.80:  14%|█▍        | 288/2048 [04:14<21:17,  1.38it/s]
loss 2.05 accuracy 0.19 -- 56.65 + 57.99 + 505.28 + 4.87 = 624.80:  14%|█▍        | 289/2048 [04:14<20:38,  1.42it/s]
loss 2.04 accuracy 0.31 -- 163.56 + 57.12 + 498.36 + 4.96 = 723.99:  14%|█▍        | 289/2048 [04:15<20:38,  1.42it/s]
loss 2.04 accuracy 0.31 -- 163.56 + 57.12 + 498.36 + 4.96 = 723.99:  14%|█▍        | 290/2048 [04:15<21:02,  1.39it/s]
loss 2.20 accuracy 0.12 -- 56.24 + 172.13 + 512.34 + 4.89 = 745.61:  14%|█▍        | 290/2048 [04:16<21:02,  1.39it/s]
loss 2.20 accuracy 0.12 -- 56.24 + 172.13 + 512.34 + 4.89 = 745.61:  14%|█▍        | 291/2048 [04:16<21:31,  1.36it/s]
loss 2.05 accuracy 0.12 -- 57.28 + 56.61 + 507.92 + 4.89 = 626.70:  14%|█▍        | 291/2048 [04:17<21:31,  1.36it/s] 
loss 2.05 accuracy 0.12 -- 57.28 + 56.61 + 507.92 + 4.89 = 626.70:  14%|█▍        | 292/2048 [04:17<20:48,  1.41it/s]
loss 2.12 accuracy 0.12 -- 169.60 + 57.31 + 507.02 + 4.88 = 738.81:  14%|█▍        | 292/2048 [04:17<20:48,  1.41it/s]
loss 2.12 accuracy 0.12 -- 169.60 + 57.31 + 507.02 + 4.88 = 738.81:  14%|█▍        | 293/2048 [04:17<21:16,  1.37it/s]
loss 2.07 accuracy 0.25 -- 56.97 + 57.98 + 635.99 + 4.90 = 755.84:  14%|█▍        | 293/2048 [04:18<21:16,  1.37it/s] 
loss 2.07 accuracy 0.25 -- 56.97 + 57.98 + 635.99 + 4.90 = 755.84:  14%|█▍        | 294/2048 [04:18<21:45,  1.34it/s]
loss 2.18 accuracy 0.12 -- 57.78 + 56.79 + 515.48 + 4.89 = 634.93:  14%|█▍        | 294/2048 [04:19<21:45,  1.34it/s]
loss 2.18 accuracy 0.12 -- 57.78 + 56.79 + 515.48 + 4.89 = 634.93:  14%|█▍        | 295/2048 [04:19<21:01,  1.39it/s]
loss 2.08 accuracy 0.31 -- 56.88 + 57.89 + 631.99 + 4.90 = 751.67:  14%|█▍        | 295/2048 [04:20<21:01,  1.39it/s]
loss 2.08 accuracy 0.31 -- 56.88 + 57.89 + 631.99 + 4.90 = 751.67:  14%|█▍        | 296/2048 [04:20<21:32,  1.36it/s]
loss 1.69 accuracy 0.38 -- 57.00 + 57.11 + 511.18 + 4.88 = 630.18:  14%|█▍        | 296/2048 [04:20<21:32,  1.36it/s]
loss 1.69 accuracy 0.38 -- 57.00 + 57.11 + 511.18 + 4.88 = 630.18:  15%|█▍        | 297/2048 [04:20<20:49,  1.40it/s]
loss 2.20 accuracy 0.25 -- 56.26 + 172.38 + 510.94 + 4.90 = 744.48:  15%|█▍        | 297/2048 [04:21<20:49,  1.40it/s]
loss 2.20 accuracy 0.25 -- 56.26 + 172.38 + 510.94 + 4.90 = 744.48:  15%|█▍        | 298/2048 [04:21<21:19,  1.37it/s]
loss 2.56 accuracy 0.12 -- 56.67 + 56.64 + 511.02 + 4.92 = 629.24:  15%|█▍        | 298/2048 [04:22<21:19,  1.37it/s] 
loss 2.56 accuracy 0.12 -- 56.67 + 56.64 + 511.02 + 4.92 = 629.24:  15%|█▍        | 299/2048 [04:22<20:39,  1.41it/s]
loss 2.07 accuracy 0.19 -- 56.82 + 56.43 + 507.58 + 4.89 = 625.71:  15%|█▍        | 299/2048 [04:22<20:39,  1.41it/s]
loss 2.07 accuracy 0.19 -- 56.82 + 56.43 + 507.58 + 4.89 = 625.71:  15%|█▍        | 300/2048 [04:22<21:08,  1.38it/s]
loss 2.14 accuracy 0.25 -- 56.97 + 57.47 + 503.31 + 4.89 = 622.64:  15%|█▍        | 300/2048 [04:23<21:08,  1.38it/s]
loss 2.14 accuracy 0.25 -- 56.97 + 57.47 + 503.31 + 4.89 = 622.64:  15%|█▍        | 301/2048 [04:23<20:28,  1.42it/s]
loss 1.99 accuracy 0.25 -- 163.25 + 56.72 + 497.43 + 4.88 = 722.29:  15%|█▍        | 301/2048 [04:24<20:28,  1.42it/s]
loss 1.99 accuracy 0.25 -- 163.25 + 56.72 + 497.43 + 4.88 = 722.29:  15%|█▍        | 302/2048 [04:24<20:51,  1.39it/s]
loss 2.19 accuracy 0.25 -- 56.39 + 172.01 + 510.66 + 4.91 = 743.98:  15%|█▍        | 302/2048 [04:25<20:51,  1.39it/s]
loss 2.19 accuracy 0.25 -- 56.39 + 172.01 + 510.66 + 4.91 = 743.98:  15%|█▍        | 303/2048 [04:25<21:19,  1.36it/s]
loss 2.18 accuracy 0.19 -- 57.28 + 56.63 + 505.35 + 4.90 = 624.16:  15%|█▍        | 303/2048 [04:25<21:19,  1.36it/s] 
loss 2.18 accuracy 0.19 -- 57.28 + 56.63 + 505.35 + 4.90 = 624.16:  15%|█▍        | 304/2048 [04:25<20:36,  1.41it/s]
loss 2.28 accuracy 0.19 -- 168.55 + 57.58 + 504.83 + 4.88 = 735.84:  15%|█▍        | 304/2048 [04:26<20:36,  1.41it/s]
loss 2.28 accuracy 0.19 -- 168.55 + 57.58 + 504.83 + 4.88 = 735.84:  15%|█▍        | 305/2048 [04:26<21:04,  1.38it/s]
loss 2.22 accuracy 0.19 -- 56.66 + 57.48 + 633.45 + 4.91 = 752.50:  15%|█▍        | 305/2048 [04:27<21:04,  1.38it/s] 
loss 2.22 accuracy 0.19 -- 56.66 + 57.48 + 633.45 + 4.91 = 752.50:  15%|█▍        | 306/2048 [04:27<21:32,  1.35it/s]
loss 2.22 accuracy 0.25 -- 57.08 + 56.57 + 514.13 + 4.90 = 632.68:  15%|█▍        | 306/2048 [04:28<21:32,  1.35it/s]
loss 2.22 accuracy 0.25 -- 57.08 + 56.57 + 514.13 + 4.90 = 632.68:  15%|█▍        | 307/2048 [04:28<20:48,  1.39it/s]
loss 2.28 accuracy 0.19 -- 56.33 + 57.44 + 629.81 + 4.91 = 748.50:  15%|█▍        | 307/2048 [04:28<20:48,  1.39it/s]
loss 2.28 accuracy 0.19 -- 56.33 + 57.44 + 629.81 + 4.91 = 748.50:  15%|█▌        | 308/2048 [04:28<21:18,  1.36it/s]
loss 2.18 accuracy 0.12 -- 57.33 + 56.75 + 510.54 + 4.89 = 629.50:  15%|█▌        | 308/2048 [04:29<21:18,  1.36it/s]
loss 2.18 accuracy 0.12 -- 57.33 + 56.75 + 510.54 + 4.89 = 629.50:  15%|█▌        | 309/2048 [04:29<20:37,  1.41it/s]
loss 1.78 accuracy 0.31 -- 56.79 + 171.28 + 508.91 + 4.88 = 741.86:  15%|█▌        | 309/2048 [04:30<20:37,  1.41it/s]
loss 1.78 accuracy 0.31 -- 56.79 + 171.28 + 508.91 + 4.88 = 741.86:  15%|█▌        | 310/2048 [04:30<21:06,  1.37it/s]
loss 2.10 accuracy 0.12 -- 56.33 + 56.57 + 507.60 + 4.87 = 625.37:  15%|█▌        | 310/2048 [04:30<21:06,  1.37it/s] 
loss 2.10 accuracy 0.12 -- 56.33 + 56.57 + 507.60 + 4.87 = 625.37:  15%|█▌        | 311/2048 [04:30<20:26,  1.42it/s]
loss 1.93 accuracy 0.12 -- 57.36 + 56.32 + 506.38 + 4.90 = 624.95:  15%|█▌        | 311/2048 [04:31<20:26,  1.42it/s]
loss 1.93 accuracy 0.12 -- 57.36 + 56.32 + 506.38 + 4.90 = 624.95:  15%|█▌        | 312/2048 [04:31<20:56,  1.38it/s]
loss 2.14 accuracy 0.12 -- 56.31 + 57.24 + 503.99 + 4.88 = 622.42:  15%|█▌        | 312/2048 [04:32<20:56,  1.38it/s]
loss 2.14 accuracy 0.12 -- 56.31 + 57.24 + 503.99 + 4.88 = 622.42:  15%|█▌        | 313/2048 [04:32<20:16,  1.43it/s]
loss 1.68 accuracy 0.38 -- 163.31 + 56.74 + 497.49 + 4.90 = 722.43:  15%|█▌        | 313/2048 [04:33<20:16,  1.43it/s]
loss 1.68 accuracy 0.38 -- 163.31 + 56.74 + 497.49 + 4.90 = 722.43:  15%|█▌        | 314/2048 [04:33<20:41,  1.40it/s]
loss 1.86 accuracy 0.25 -- 56.18 + 172.05 + 514.18 + 4.94 = 747.35:  15%|█▌        | 314/2048 [04:33<20:41,  1.40it/s]
loss 1.86 accuracy 0.25 -- 56.18 + 172.05 + 514.18 + 4.94 = 747.35:  15%|█▌        | 315/2048 [04:33<21:11,  1.36it/s]
loss 2.54 accuracy 0.19 -- 56.88 + 56.58 + 507.79 + 4.87 = 626.11:  15%|█▌        | 315/2048 [04:34<21:11,  1.36it/s] 
loss 2.54 accuracy 0.19 -- 56.88 + 56.58 + 507.79 + 4.87 = 626.11:  15%|█▌        | 316/2048 [04:34<20:29,  1.41it/s]
loss 1.98 accuracy 0.12 -- 168.15 + 57.29 + 504.57 + 4.92 = 734.93:  15%|█▌        | 316/2048 [04:35<20:29,  1.41it/s]
loss 1.98 accuracy 0.12 -- 168.15 + 57.29 + 504.57 + 4.92 = 734.93:  15%|█▌        | 317/2048 [04:35<20:56,  1.38it/s]
loss 1.86 accuracy 0.31 -- 56.57 + 57.21 + 634.13 + 4.92 = 752.84:  15%|█▌        | 317/2048 [04:36<20:56,  1.38it/s] 
loss 1.86 accuracy 0.31 -- 56.57 + 57.21 + 634.13 + 4.92 = 752.84:  16%|█▌        | 318/2048 [04:36<21:23,  1.35it/s]
loss 2.16 accuracy 0.44 -- 56.91 + 56.43 + 513.11 + 4.89 = 631.34:  16%|█▌        | 318/2048 [04:36<21:23,  1.35it/s]
loss 2.16 accuracy 0.44 -- 56.91 + 56.43 + 513.11 + 4.89 = 631.34:  16%|█▌        | 319/2048 [04:36<20:39,  1.39it/s]
loss 2.47 accuracy 0.12 -- 56.91 + 57.89 + 630.56 + 4.94 = 750.30:  16%|█▌        | 319/2048 [04:37<20:39,  1.39it/s]
loss 2.47 accuracy 0.12 -- 56.91 + 57.89 + 630.56 + 4.94 = 750.30:  16%|█▌        | 320/2048 [04:37<21:10,  1.36it/s]
loss 2.49 accuracy 0.06 -- 57.43 + 56.78 + 511.17 + 4.90 = 630.28:  16%|█▌        | 320/2048 [04:38<21:10,  1.36it/s]
loss 2.49 accuracy 0.06 -- 57.43 + 56.78 + 511.17 + 4.90 = 630.28:  16%|█▌        | 321/2048 [04:38<20:29,  1.40it/s]
loss 2.09 accuracy 0.12 -- 56.21 + 171.15 + 510.10 + 4.89 = 742.36:  16%|█▌        | 321/2048 [04:38<20:29,  1.40it/s]
loss 2.09 accuracy 0.12 -- 56.21 + 171.15 + 510.10 + 4.89 = 742.36:  16%|█▌        | 322/2048 [04:38<20:58,  1.37it/s]
loss 2.01 accuracy 0.19 -- 56.77 + 56.59 + 510.07 + 4.94 = 628.37:  16%|█▌        | 322/2048 [04:39<20:58,  1.37it/s] 
loss 2.01 accuracy 0.19 -- 56.77 + 56.59 + 510.07 + 4.94 = 628.37:  16%|█▌        | 323/2048 [04:39<20:19,  1.41it/s]
loss 2.18 accuracy 0.50 -- 57.45 + 56.89 + 507.60 + 4.92 = 626.87:  16%|█▌        | 323/2048 [04:40<20:19,  1.41it/s]
loss 2.18 accuracy 0.50 -- 57.45 + 56.89 + 507.60 + 4.92 = 626.87:  16%|█▌        | 324/2048 [04:40<20:49,  1.38it/s]
loss 2.48 accuracy 0.06 -- 56.18 + 57.35 + 504.61 + 4.93 = 623.07:  16%|█▌        | 324/2048 [04:40<20:49,  1.38it/s]
loss 2.48 accuracy 0.06 -- 56.18 + 57.35 + 504.61 + 4.93 = 623.07:  16%|█▌        | 325/2048 [04:40<20:10,  1.42it/s]
loss 1.85 accuracy 0.25 -- 163.54 + 57.11 + 499.07 + 4.93 = 724.65:  16%|█▌        | 325/2048 [04:41<20:10,  1.42it/s]
loss 1.85 accuracy 0.25 -- 163.54 + 57.11 + 499.07 + 4.93 = 724.65:  16%|█▌        | 326/2048 [04:41<20:35,  1.39it/s]
loss 1.66 accuracy 0.44 -- 56.43 + 171.68 + 512.81 + 4.89 = 745.82:  16%|█▌        | 326/2048 [04:42<20:35,  1.39it/s]
loss 1.66 accuracy 0.44 -- 56.43 + 171.68 + 512.81 + 4.89 = 745.82:  16%|█▌        | 327/2048 [04:42<21:03,  1.36it/s]
loss 1.89 accuracy 0.25 -- 56.93 + 57.03 + 508.02 + 4.95 = 626.93:  16%|█▌        | 327/2048 [04:43<21:03,  1.36it/s] 
loss 1.89 accuracy 0.25 -- 56.93 + 57.03 + 508.02 + 4.95 = 626.93:  16%|█▌        | 328/2048 [04:43<20:21,  1.41it/s]
loss 1.84 accuracy 0.31 -- 168.16 + 57.34 + 507.41 + 4.91 = 737.82:  16%|█▌        | 328/2048 [04:43<20:21,  1.41it/s]
loss 1.84 accuracy 0.31 -- 168.16 + 57.34 + 507.41 + 4.91 = 737.82:  16%|█▌        | 329/2048 [04:43<20:49,  1.38it/s]
loss 2.56 accuracy 0.38 -- 56.69 + 57.44 + 635.50 + 4.90 = 754.54:  16%|█▌        | 329/2048 [04:44<20:49,  1.38it/s] 
loss 2.56 accuracy 0.38 -- 56.69 + 57.44 + 635.50 + 4.90 = 754.54:  16%|█▌        | 330/2048 [04:44<21:17,  1.35it/s]
loss 1.76 accuracy 0.31 -- 57.39 + 57.57 + 516.10 + 4.94 = 636.00:  16%|█▌        | 330/2048 [04:45<21:17,  1.35it/s]
loss 1.76 accuracy 0.31 -- 57.39 + 57.57 + 516.10 + 4.94 = 636.00:  16%|█▌        | 331/2048 [04:45<20:35,  1.39it/s]
loss 1.97 accuracy 0.25 -- 56.68 + 57.56 + 631.26 + 4.92 = 750.42:  16%|█▌        | 331/2048 [04:46<20:35,  1.39it/s]
loss 1.97 accuracy 0.25 -- 56.68 + 57.56 + 631.26 + 4.92 = 750.42:  16%|█▌        | 332/2048 [04:46<21:04,  1.36it/s]
loss 2.28 accuracy 0.19 -- 57.55 + 56.57 + 513.04 + 4.92 = 632.08:  16%|█▌        | 332/2048 [04:46<21:04,  1.36it/s]
loss 2.28 accuracy 0.19 -- 57.55 + 56.57 + 513.04 + 4.92 = 632.08:  16%|█▋        | 333/2048 [04:46<20:24,  1.40it/s]
loss 2.08 accuracy 0.38 -- 56.63 + 173.11 + 512.03 + 4.93 = 746.70:  16%|█▋        | 333/2048 [04:47<20:24,  1.40it/s]
loss 2.08 accuracy 0.38 -- 56.63 + 173.11 + 512.03 + 4.93 = 746.70:  16%|█▋        | 334/2048 [04:47<20:54,  1.37it/s]
loss 1.98 accuracy 0.38 -- 56.98 + 56.79 + 508.06 + 4.91 = 626.75:  16%|█▋        | 334/2048 [04:48<20:54,  1.37it/s] 
loss 1.98 accuracy 0.38 -- 56.98 + 56.79 + 508.06 + 4.91 = 626.75:  16%|█▋        | 335/2048 [04:48<20:13,  1.41it/s]
loss 2.38 accuracy 0.25 -- 57.43 + 56.61 + 508.51 + 4.93 = 627.48:  16%|█▋        | 335/2048 [04:48<20:13,  1.41it/s]
loss 2.38 accuracy 0.25 -- 57.43 + 56.61 + 508.51 + 4.93 = 627.48:  16%|█▋        | 336/2048 [04:48<20:42,  1.38it/s]
loss 2.36 accuracy 0.25 -- 56.74 + 57.47 + 503.42 + 4.92 = 622.55:  16%|█▋        | 336/2048 [04:49<20:42,  1.38it/s]
loss 2.36 accuracy 0.25 -- 56.74 + 57.47 + 503.42 + 4.92 = 622.55:  16%|█▋        | 337/2048 [04:49<20:03,  1.42it/s]
loss 1.80 accuracy 0.44 -- 163.62 + 57.25 + 498.30 + 4.87 = 724.05:  16%|█▋        | 337/2048 [04:50<20:03,  1.42it/s]
loss 1.80 accuracy 0.44 -- 163.62 + 57.25 + 498.30 + 4.87 = 724.05:  17%|█▋        | 338/2048 [04:50<20:27,  1.39it/s]
loss 1.94 accuracy 0.25 -- 56.17 + 172.15 + 511.16 + 4.95 = 744.42:  17%|█▋        | 338/2048 [04:51<20:27,  1.39it/s]
loss 1.94 accuracy 0.25 -- 56.17 + 172.15 + 511.16 + 4.95 = 744.42:  17%|█▋        | 339/2048 [04:51<20:54,  1.36it/s]
loss 1.68 accuracy 0.38 -- 57.53 + 57.33 + 508.26 + 4.92 = 628.04:  17%|█▋        | 339/2048 [04:51<20:54,  1.36it/s] 
loss 1.68 accuracy 0.38 -- 57.53 + 57.33 + 508.26 + 4.92 = 628.04:  17%|█▋        | 340/2048 [04:51<20:13,  1.41it/s]
loss 2.35 accuracy 0.12 -- 168.41 + 57.92 + 506.52 + 4.95 = 737.80:  17%|█▋        | 340/2048 [04:52<20:13,  1.41it/s]
loss 2.35 accuracy 0.12 -- 168.41 + 57.92 + 506.52 + 4.95 = 737.80:  17%|█▋        | 341/2048 [04:52<20:40,  1.38it/s]
loss 2.29 accuracy 0.25 -- 56.78 + 59.62 + 637.24 + 4.93 = 758.56:  17%|█▋        | 341/2048 [04:53<20:40,  1.38it/s] 
loss 2.29 accuracy 0.25 -- 56.78 + 59.62 + 637.24 + 4.93 = 758.56:  17%|█▋        | 342/2048 [04:53<21:10,  1.34it/s]
loss 2.23 accuracy 0.25 -- 57.41 + 56.65 + 514.71 + 4.94 = 633.71:  17%|█▋        | 342/2048 [04:54<21:10,  1.34it/s]
loss 2.23 accuracy 0.25 -- 57.41 + 56.65 + 514.71 + 4.94 = 633.71:  17%|█▋        | 343/2048 [04:54<20:45,  1.37it/s]
loss 2.71 accuracy 0.12 -- 56.73 + 57.91 + 631.18 + 4.95 = 750.77:  17%|█▋        | 343/2048 [04:54<20:45,  1.37it/s]
loss 2.71 accuracy 0.12 -- 56.73 + 57.91 + 631.18 + 4.95 = 750.77:  17%|█▋        | 344/2048 [04:54<21:09,  1.34it/s]
loss 1.93 accuracy 0.31 -- 57.38 + 57.18 + 512.71 + 4.89 = 632.17:  17%|█▋        | 344/2048 [04:55<21:09,  1.34it/s]
loss 1.93 accuracy 0.31 -- 57.38 + 57.18 + 512.71 + 4.89 = 632.17:  17%|█▋        | 345/2048 [04:55<20:25,  1.39it/s]
loss 2.16 accuracy 0.31 -- 56.37 + 171.91 + 511.63 + 4.89 = 744.80:  17%|█▋        | 345/2048 [04:56<20:25,  1.39it/s]
loss 2.16 accuracy 0.31 -- 56.37 + 171.91 + 511.63 + 4.89 = 744.80:  17%|█▋        | 346/2048 [04:56<20:51,  1.36it/s]
loss 1.98 accuracy 0.12 -- 57.17 + 57.52 + 510.84 + 4.94 = 630.47:  17%|█▋        | 346/2048 [04:56<20:51,  1.36it/s] 
loss 1.98 accuracy 0.12 -- 57.17 + 57.52 + 510.84 + 4.94 = 630.47:  17%|█▋        | 347/2048 [04:56<20:11,  1.40it/s]
loss 2.23 accuracy 0.12 -- 56.89 + 56.89 + 509.18 + 4.93 = 627.89:  17%|█▋        | 347/2048 [04:57<20:11,  1.40it/s]
loss 2.23 accuracy 0.12 -- 56.89 + 56.89 + 509.18 + 4.93 = 627.89:  17%|█▋        | 348/2048 [04:57<20:38,  1.37it/s]
loss 2.10 accuracy 0.31 -- 56.82 + 57.52 + 505.22 + 4.94 = 624.51:  17%|█▋        | 348/2048 [04:58<20:38,  1.37it/s]
loss 2.10 accuracy 0.31 -- 56.82 + 57.52 + 505.22 + 4.94 = 624.51:  17%|█▋        | 349/2048 [04:58<19:58,  1.42it/s]
loss 2.31 accuracy 0.12 -- 163.31 + 57.23 + 498.97 + 4.95 = 724.47:  17%|█▋        | 349/2048 [04:59<19:58,  1.42it/s]
loss 2.31 accuracy 0.12 -- 163.31 + 57.23 + 498.97 + 4.95 = 724.47:  17%|█▋        | 350/2048 [04:59<20:21,  1.39it/s]
loss 2.34 accuracy 0.25 -- 56.55 + 172.68 + 511.37 + 4.95 = 745.56:  17%|█▋        | 350/2048 [04:59<20:21,  1.39it/s]
loss 2.34 accuracy 0.25 -- 56.55 + 172.68 + 511.37 + 4.95 = 745.56:  17%|█▋        | 351/2048 [04:59<20:48,  1.36it/s]
loss 1.95 accuracy 0.38 -- 56.93 + 56.66 + 506.98 + 4.94 = 625.50:  17%|█▋        | 351/2048 [05:00<20:48,  1.36it/s] 
loss 1.95 accuracy 0.38 -- 56.93 + 56.66 + 506.98 + 4.94 = 625.50:  17%|█▋        | 352/2048 [05:00<20:05,  1.41it/s]
loss 2.10 accuracy 0.31 -- 168.99 + 57.81 + 508.35 + 4.90 = 740.05:  17%|█▋        | 352/2048 [05:01<20:05,  1.41it/s]
loss 2.10 accuracy 0.31 -- 168.99 + 57.81 + 508.35 + 4.90 = 740.05:  17%|█▋        | 353/2048 [05:01<20:33,  1.37it/s]
loss 2.17 accuracy 0.25 -- 56.60 + 58.04 + 637.37 + 4.95 = 756.96:  17%|█▋        | 353/2048 [05:02<20:33,  1.37it/s] 
loss 2.17 accuracy 0.25 -- 56.60 + 58.04 + 637.37 + 4.95 = 756.96:  17%|█▋        | 354/2048 [05:02<21:01,  1.34it/s]
loss 1.92 accuracy 0.25 -- 57.70 + 57.33 + 515.13 + 4.95 = 635.11:  17%|█▋        | 354/2048 [05:02<21:01,  1.34it/s]
loss 1.92 accuracy 0.25 -- 57.70 + 57.33 + 515.13 + 4.95 = 635.11:  17%|█▋        | 355/2048 [05:02<20:19,  1.39it/s]
loss 2.20 accuracy 0.19 -- 56.64 + 57.42 + 632.15 + 4.89 = 751.09:  17%|█▋        | 355/2048 [05:03<20:19,  1.39it/s]
loss 2.20 accuracy 0.19 -- 56.64 + 57.42 + 632.15 + 4.89 = 751.09:  17%|█▋        | 356/2048 [05:03<20:47,  1.36it/s]
loss 2.23 accuracy 0.25 -- 57.05 + 56.80 + 512.16 + 4.91 = 630.93:  17%|█▋        | 356/2048 [05:04<20:47,  1.36it/s]
loss 2.23 accuracy 0.25 -- 57.05 + 56.80 + 512.16 + 4.91 = 630.93:  17%|█▋        | 357/2048 [05:04<20:07,  1.40it/s]
loss 2.05 accuracy 0.12 -- 56.75 + 171.50 + 512.32 + 4.92 = 745.49:  17%|█▋        | 357/2048 [05:04<20:07,  1.40it/s]
loss 2.05 accuracy 0.12 -- 56.75 + 171.50 + 512.32 + 4.92 = 745.49:  17%|█▋        | 358/2048 [05:04<20:36,  1.37it/s]
loss 1.83 accuracy 0.38 -- 57.13 + 56.74 + 509.01 + 4.97 = 627.85:  17%|█▋        | 358/2048 [05:05<20:36,  1.37it/s] 
loss 1.83 accuracy 0.38 -- 57.13 + 56.74 + 509.01 + 4.97 = 627.85:  18%|█▊        | 359/2048 [05:05<19:57,  1.41it/s]
loss 2.13 accuracy 0.31 -- 57.29 + 56.69 + 507.47 + 4.93 = 626.37:  18%|█▊        | 359/2048 [05:06<19:57,  1.41it/s]
loss 2.13 accuracy 0.31 -- 57.29 + 56.69 + 507.47 + 4.93 = 626.37:  18%|█▊        | 360/2048 [05:06<20:25,  1.38it/s]
loss 2.14 accuracy 0.12 -- 56.59 + 57.85 + 504.18 + 4.88 = 623.50:  18%|█▊        | 360/2048 [05:07<20:25,  1.38it/s]
loss 2.14 accuracy 0.12 -- 56.59 + 57.85 + 504.18 + 4.88 = 623.50:  18%|█▊        | 361/2048 [05:07<19:46,  1.42it/s]
loss 2.57 accuracy 0.31 -- 164.24 + 56.85 + 497.77 + 4.91 = 723.76:  18%|█▊        | 361/2048 [05:07<19:46,  1.42it/s]
loss 2.57 accuracy 0.31 -- 164.24 + 56.85 + 497.77 + 4.91 = 723.76:  18%|█▊        | 362/2048 [05:07<20:10,  1.39it/s]
loss 2.40 accuracy 0.31 -- 56.44 + 172.03 + 512.91 + 4.91 = 746.28:  18%|█▊        | 362/2048 [05:08<20:10,  1.39it/s]
loss 2.40 accuracy 0.31 -- 56.44 + 172.03 + 512.91 + 4.91 = 746.28:  18%|█▊        | 363/2048 [05:08<20:37,  1.36it/s]
loss 2.48 accuracy 0.31 -- 57.12 + 57.20 + 508.57 + 4.92 = 627.81:  18%|█▊        | 363/2048 [05:09<20:37,  1.36it/s] 
loss 2.48 accuracy 0.31 -- 57.12 + 57.20 + 508.57 + 4.92 = 627.81:  18%|█▊        | 364/2048 [05:09<19:57,  1.41it/s]
loss 2.78 accuracy 0.12 -- 169.24 + 57.25 + 505.54 + 4.92 = 736.95:  18%|█▊        | 364/2048 [05:09<19:57,  1.41it/s]
loss 2.78 accuracy 0.12 -- 169.24 + 57.25 + 505.54 + 4.92 = 736.95:  18%|█▊        | 365/2048 [05:09<20:23,  1.38it/s]
loss 1.92 accuracy 0.25 -- 56.65 + 57.50 + 635.00 + 4.94 = 754.09:  18%|█▊        | 365/2048 [05:10<20:23,  1.38it/s] 
loss 1.92 accuracy 0.25 -- 56.65 + 57.50 + 635.00 + 4.94 = 754.09:  18%|█▊        | 366/2048 [05:10<20:50,  1.35it/s]
loss 2.32 accuracy 0.06 -- 57.31 + 56.64 + 514.73 + 4.93 = 633.60:  18%|█▊        | 366/2048 [05:11<20:50,  1.35it/s]
loss 2.32 accuracy 0.06 -- 57.31 + 56.64 + 514.73 + 4.93 = 633.60:  18%|█▊        | 367/2048 [05:11<20:07,  1.39it/s]
loss 2.08 accuracy 0.19 -- 56.69 + 57.84 + 629.74 + 4.94 = 749.21:  18%|█▊        | 367/2048 [05:12<20:07,  1.39it/s]
loss 2.08 accuracy 0.19 -- 56.69 + 57.84 + 629.74 + 4.94 = 749.21:  18%|█▊        | 368/2048 [05:12<20:36,  1.36it/s]
loss 2.23 accuracy 0.19 -- 57.03 + 56.53 + 512.06 + 4.92 = 630.53:  18%|█▊        | 368/2048 [05:12<20:36,  1.36it/s]
loss 2.23 accuracy 0.19 -- 57.03 + 56.53 + 512.06 + 4.92 = 630.53:  18%|█▊        | 369/2048 [05:12<19:56,  1.40it/s]
loss 2.17 accuracy 0.06 -- 57.32 + 171.82 + 512.48 + 4.88 = 746.50:  18%|█▊        | 369/2048 [05:13<19:56,  1.40it/s]
loss 2.17 accuracy 0.06 -- 57.32 + 171.82 + 512.48 + 4.88 = 746.50:  18%|█▊        | 370/2048 [05:13<20:26,  1.37it/s]
loss 1.89 accuracy 0.38 -- 56.05 + 56.56 + 508.74 + 4.93 = 626.28:  18%|█▊        | 370/2048 [05:14<20:26,  1.37it/s] 
loss 1.89 accuracy 0.38 -- 56.05 + 56.56 + 508.74 + 4.93 = 626.28:  18%|█▊        | 371/2048 [05:14<19:46,  1.41it/s]
loss 1.98 accuracy 0.38 -- 57.64 + 56.95 + 510.40 + 4.90 = 629.89:  18%|█▊        | 371/2048 [05:15<19:46,  1.41it/s]
loss 1.98 accuracy 0.38 -- 57.64 + 56.95 + 510.40 + 4.90 = 629.89:  18%|█▊        | 372/2048 [05:15<20:17,  1.38it/s]
loss 1.88 accuracy 0.12 -- 56.45 + 57.43 + 505.43 + 4.89 = 624.20:  18%|█▊        | 372/2048 [05:15<20:17,  1.38it/s]
loss 1.88 accuracy 0.12 -- 56.45 + 57.43 + 505.43 + 4.89 = 624.20:  18%|█▊        | 373/2048 [05:15<19:38,  1.42it/s]
loss 2.06 accuracy 0.50 -- 162.74 + 56.79 + 497.22 + 4.89 = 721.64:  18%|█▊        | 373/2048 [05:16<19:38,  1.42it/s]
loss 2.06 accuracy 0.50 -- 162.74 + 56.79 + 497.22 + 4.89 = 721.64:  18%|█▊        | 374/2048 [05:16<20:01,  1.39it/s]
loss 2.46 accuracy 0.00 -- 56.10 + 171.22 + 509.83 + 4.90 = 742.05:  18%|█▊        | 374/2048 [05:17<20:01,  1.39it/s]
loss 2.46 accuracy 0.00 -- 56.10 + 171.22 + 509.83 + 4.90 = 742.05:  18%|█▊        | 375/2048 [05:17<20:26,  1.36it/s]
loss 2.17 accuracy 0.25 -- 56.84 + 56.47 + 505.67 + 4.94 = 623.92:  18%|█▊        | 375/2048 [05:17<20:26,  1.36it/s] 
loss 2.17 accuracy 0.25 -- 56.84 + 56.47 + 505.67 + 4.94 = 623.92:  18%|█▊        | 376/2048 [05:17<19:44,  1.41it/s]
loss 2.22 accuracy 0.19 -- 168.23 + 57.74 + 507.55 + 4.93 = 738.44:  18%|█▊        | 376/2048 [05:18<19:44,  1.41it/s]
loss 2.22 accuracy 0.19 -- 168.23 + 57.74 + 507.55 + 4.93 = 738.44:  18%|█▊        | 377/2048 [05:18<20:12,  1.38it/s]
loss 1.92 accuracy 0.19 -- 57.38 + 57.94 + 637.82 + 4.93 = 758.07:  18%|█▊        | 377/2048 [05:19<20:12,  1.38it/s] 
loss 1.92 accuracy 0.19 -- 57.38 + 57.94 + 637.82 + 4.93 = 758.07:  18%|█▊        | 378/2048 [05:19<20:41,  1.34it/s]
loss 1.87 accuracy 0.38 -- 57.67 + 57.05 + 515.73 + 4.91 = 635.36:  18%|█▊        | 378/2048 [05:20<20:41,  1.34it/s]
loss 1.87 accuracy 0.38 -- 57.67 + 57.05 + 515.73 + 4.91 = 635.36:  19%|█▊        | 379/2048 [05:20<20:00,  1.39it/s]
loss 1.84 accuracy 0.44 -- 56.98 + 57.68 + 632.42 + 4.94 = 752.02:  19%|█▊        | 379/2048 [05:20<20:00,  1.39it/s]
loss 1.84 accuracy 0.44 -- 56.98 + 57.68 + 632.42 + 4.94 = 752.02:  19%|█▊        | 380/2048 [05:20<20:29,  1.36it/s]
loss 2.02 accuracy 0.25 -- 57.73 + 57.42 + 513.67 + 4.93 = 633.75:  19%|█▊        | 380/2048 [05:21<20:29,  1.36it/s]
loss 2.02 accuracy 0.25 -- 57.73 + 57.42 + 513.67 + 4.93 = 633.75:  19%|█▊        | 381/2048 [05:21<19:50,  1.40it/s]
loss 1.85 accuracy 0.31 -- 57.05 + 171.82 + 510.60 + 4.96 = 744.43:  19%|█▊        | 381/2048 [05:22<19:50,  1.40it/s]
loss 1.85 accuracy 0.31 -- 57.05 + 171.82 + 510.60 + 4.96 = 744.43:  19%|█▊        | 382/2048 [05:22<20:19,  1.37it/s]
loss 1.83 accuracy 0.19 -- 56.63 + 57.02 + 509.68 + 4.92 = 628.26:  19%|█▊        | 382/2048 [05:22<20:19,  1.37it/s] 
loss 1.83 accuracy 0.19 -- 56.63 + 57.02 + 509.68 + 4.92 = 628.26:  19%|█▊        | 383/2048 [05:22<19:40,  1.41it/s]
loss 1.70 accuracy 0.19 -- 57.46 + 56.80 + 509.08 + 4.93 = 628.28:  19%|█▊        | 383/2048 [05:23<19:40,  1.41it/s]
loss 1.70 accuracy 0.19 -- 57.46 + 56.80 + 509.08 + 4.93 = 628.28:  19%|█▉        | 384/2048 [05:23<20:09,  1.38it/s]
loss 1.74 accuracy 0.38 -- 56.66 + 57.75 + 505.47 + 4.92 = 624.81:  19%|█▉        | 384/2048 [05:24<20:09,  1.38it/s]
loss 1.74 accuracy 0.38 -- 56.66 + 57.75 + 505.47 + 4.92 = 624.81:  19%|█▉        | 385/2048 [05:24<19:31,  1.42it/s]
loss 2.34 accuracy 0.19 -- 163.98 + 57.11 + 500.71 + 4.93 = 726.73:  19%|█▉        | 385/2048 [05:25<19:31,  1.42it/s]
loss 2.34 accuracy 0.19 -- 163.98 + 57.11 + 500.71 + 4.93 = 726.73:  19%|█▉        | 386/2048 [05:25<19:55,  1.39it/s]
loss 2.11 accuracy 0.25 -- 56.34 + 172.12 + 510.85 + 4.95 = 744.27:  19%|█▉        | 386/2048 [05:25<19:55,  1.39it/s]
loss 2.11 accuracy 0.25 -- 56.34 + 172.12 + 510.85 + 4.95 = 744.27:  19%|█▉        | 387/2048 [05:25<20:20,  1.36it/s]
loss 1.95 accuracy 0.25 -- 57.09 + 56.75 + 506.54 + 4.91 = 625.28:  19%|█▉        | 387/2048 [05:26<20:20,  1.36it/s] 
loss 1.95 accuracy 0.25 -- 57.09 + 56.75 + 506.54 + 4.91 = 625.28:  19%|█▉        | 388/2048 [05:26<19:39,  1.41it/s]
loss 2.23 accuracy 0.38 -- 169.28 + 57.46 + 506.65 + 4.92 = 738.31:  19%|█▉        | 388/2048 [05:27<19:39,  1.41it/s]
loss 2.23 accuracy 0.38 -- 169.28 + 57.46 + 506.65 + 4.92 = 738.31:  19%|█▉        | 389/2048 [05:27<20:06,  1.38it/s]
loss 1.79 accuracy 0.38 -- 56.72 + 57.43 + 635.45 + 4.93 = 754.52:  19%|█▉        | 389/2048 [05:28<20:06,  1.38it/s] 
loss 1.79 accuracy 0.38 -- 56.72 + 57.43 + 635.45 + 4.93 = 754.52:  19%|█▉        | 390/2048 [05:28<20:32,  1.34it/s]
loss 2.49 accuracy 0.31 -- 57.16 + 56.84 + 515.18 + 4.91 = 634.09:  19%|█▉        | 390/2048 [05:28<20:32,  1.34it/s]
loss 2.49 accuracy 0.31 -- 57.16 + 56.84 + 515.18 + 4.91 = 634.09:  19%|█▉        | 391/2048 [05:28<19:51,  1.39it/s]
loss 1.96 accuracy 0.19 -- 57.34 + 58.26 + 631.39 + 4.89 = 751.88:  19%|█▉        | 391/2048 [05:29<19:51,  1.39it/s]
loss 1.96 accuracy 0.19 -- 57.34 + 58.26 + 631.39 + 4.89 = 751.88:  19%|█▉        | 392/2048 [05:29<20:20,  1.36it/s]
loss 2.19 accuracy 0.19 -- 57.60 + 56.70 + 512.06 + 4.93 = 631.28:  19%|█▉        | 392/2048 [05:30<20:20,  1.36it/s]
loss 2.19 accuracy 0.19 -- 57.60 + 56.70 + 512.06 + 4.93 = 631.28:  19%|█▉        | 393/2048 [05:30<19:40,  1.40it/s]
loss 1.77 accuracy 0.31 -- 56.72 + 171.34 + 511.50 + 4.94 = 744.50:  19%|█▉        | 393/2048 [05:30<19:40,  1.40it/s]
loss 1.77 accuracy 0.31 -- 56.72 + 171.34 + 511.50 + 4.94 = 744.50:  19%|█▉        | 394/2048 [05:30<20:09,  1.37it/s]
loss 1.72 accuracy 0.38 -- 56.49 + 56.72 + 508.94 + 4.91 = 627.06:  19%|█▉        | 394/2048 [05:31<20:09,  1.37it/s] 
loss 1.72 accuracy 0.38 -- 56.49 + 56.72 + 508.94 + 4.91 = 627.06:  19%|█▉        | 395/2048 [05:31<19:30,  1.41it/s]
loss 1.99 accuracy 0.31 -- 57.38 + 56.50 + 507.55 + 4.93 = 626.36:  19%|█▉        | 395/2048 [05:32<19:30,  1.41it/s]
loss 1.99 accuracy 0.31 -- 57.38 + 56.50 + 507.55 + 4.93 = 626.36:  19%|█▉        | 396/2048 [05:32<19:58,  1.38it/s]
loss 2.34 accuracy 0.06 -- 56.74 + 57.32 + 505.39 + 4.89 = 624.34:  19%|█▉        | 396/2048 [05:33<19:58,  1.38it/s]
loss 2.34 accuracy 0.06 -- 56.74 + 57.32 + 505.39 + 4.89 = 624.34:  19%|█▉        | 397/2048 [05:33<19:21,  1.42it/s]
loss 2.08 accuracy 0.31 -- 164.21 + 57.26 + 499.96 + 4.92 = 726.35:  19%|█▉        | 397/2048 [05:33<19:21,  1.42it/s]
loss 2.08 accuracy 0.31 -- 164.21 + 57.26 + 499.96 + 4.92 = 726.35:  19%|█▉        | 398/2048 [05:33<19:45,  1.39it/s]
loss 2.04 accuracy 0.25 -- 56.15 + 171.81 + 511.96 + 4.92 = 744.84:  19%|█▉        | 398/2048 [05:34<19:45,  1.39it/s]
loss 2.04 accuracy 0.25 -- 56.15 + 171.81 + 511.96 + 4.92 = 744.84:  19%|█▉        | 399/2048 [05:34<20:29,  1.34it/s]
loss 2.87 accuracy 0.12 -- 57.12 + 56.74 + 506.21 + 4.94 = 625.01:  19%|█▉        | 399/2048 [05:35<20:29,  1.34it/s] 
loss 2.87 accuracy 0.12 -- 57.12 + 56.74 + 506.21 + 4.94 = 625.01:  20%|█▉        | 400/2048 [05:35<19:42,  1.39it/s]
loss 2.19 accuracy 0.31 -- 168.22 + 57.38 + 507.09 + 4.94 = 737.63:  20%|█▉        | 400/2048 [05:36<19:42,  1.39it/s]
loss 2.19 accuracy 0.31 -- 168.22 + 57.38 + 507.09 + 4.94 = 737.63:  20%|█▉        | 401/2048 [05:36<20:05,  1.37it/s]
loss 1.66 accuracy 0.44 -- 56.62 + 57.80 + 637.74 + 4.95 = 757.11:  20%|█▉        | 401/2048 [05:36<20:05,  1.37it/s] 
loss 1.66 accuracy 0.44 -- 56.62 + 57.80 + 637.74 + 4.95 = 757.11:  20%|█▉        | 402/2048 [05:36<20:30,  1.34it/s]
loss 2.80 accuracy 0.06 -- 57.51 + 57.18 + 515.84 + 4.91 = 635.44:  20%|█▉        | 402/2048 [05:37<20:30,  1.34it/s]
loss 2.80 accuracy 0.06 -- 57.51 + 57.18 + 515.84 + 4.91 = 635.44:  20%|█▉        | 403/2048 [05:37<19:48,  1.38it/s]
loss 1.66 accuracy 0.50 -- 56.68 + 57.32 + 628.46 + 4.89 = 747.35:  20%|█▉        | 403/2048 [05:38<19:48,  1.38it/s]
loss 1.66 accuracy 0.50 -- 56.68 + 57.32 + 628.46 + 4.89 = 747.35:  20%|█▉        | 404/2048 [05:38<20:13,  1.36it/s]
loss 2.40 accuracy 0.12 -- 56.86 + 56.46 + 510.57 + 4.94 = 628.83:  20%|█▉        | 404/2048 [05:38<20:13,  1.36it/s]
loss 2.40 accuracy 0.12 -- 56.86 + 56.46 + 510.57 + 4.94 = 628.83:  20%|█▉        | 405/2048 [05:38<19:32,  1.40it/s]
loss 2.19 accuracy 0.12 -- 56.97 + 171.66 + 511.12 + 4.95 = 744.71:  20%|█▉        | 405/2048 [05:39<19:32,  1.40it/s]
loss 2.19 accuracy 0.12 -- 56.97 + 171.66 + 511.12 + 4.95 = 744.71:  20%|█▉        | 406/2048 [05:39<20:17,  1.35it/s]
loss 2.01 accuracy 0.25 -- 56.87 + 56.63 + 509.33 + 4.93 = 627.75:  20%|█▉        | 406/2048 [05:40<20:17,  1.35it/s] 
loss 2.01 accuracy 0.25 -- 56.87 + 56.63 + 509.33 + 4.93 = 627.75:  20%|█▉        | 407/2048 [05:40<19:34,  1.40it/s]
loss 2.30 accuracy 0.19 -- 57.09 + 56.82 + 509.81 + 4.94 = 628.66:  20%|█▉        | 407/2048 [05:41<19:34,  1.40it/s]
loss 2.30 accuracy 0.19 -- 57.09 + 56.82 + 509.81 + 4.94 = 628.66:  20%|█▉        | 408/2048 [05:41<19:59,  1.37it/s]
loss 1.97 accuracy 0.19 -- 56.54 + 57.57 + 504.41 + 4.95 = 623.46:  20%|█▉        | 408/2048 [05:41<19:59,  1.37it/s]
loss 1.97 accuracy 0.19 -- 56.54 + 57.57 + 504.41 + 4.95 = 623.46:  20%|█▉        | 409/2048 [05:41<19:19,  1.41it/s]
loss 2.04 accuracy 0.19 -- 164.06 + 57.33 + 499.20 + 4.94 = 725.54:  20%|█▉        | 409/2048 [05:42<19:19,  1.41it/s]
loss 2.04 accuracy 0.19 -- 164.06 + 57.33 + 499.20 + 4.94 = 725.54:  20%|██        | 410/2048 [05:42<19:40,  1.39it/s]
loss 2.54 accuracy 0.06 -- 56.13 + 172.33 + 510.93 + 4.89 = 744.27:  20%|██        | 410/2048 [05:43<19:40,  1.39it/s]
loss 2.54 accuracy 0.06 -- 56.13 + 172.33 + 510.93 + 4.89 = 744.27:  20%|██        | 411/2048 [05:43<20:05,  1.36it/s]
loss 2.37 accuracy 0.25 -- 57.20 + 56.84 + 509.12 + 4.92 = 628.08:  20%|██        | 411/2048 [05:43<20:05,  1.36it/s] 
loss 2.37 accuracy 0.25 -- 57.20 + 56.84 + 509.12 + 4.92 = 628.08:  20%|██        | 412/2048 [05:43<19:24,  1.40it/s]
loss 2.81 accuracy 0.00 -- 168.13 + 57.27 + 506.44 + 4.96 = 736.80:  20%|██        | 412/2048 [05:44<19:24,  1.40it/s]
loss 2.81 accuracy 0.00 -- 168.13 + 57.27 + 506.44 + 4.96 = 736.80:  20%|██        | 413/2048 [05:44<20:07,  1.35it/s]
loss 2.43 accuracy 0.38 -- 57.29 + 58.13 + 636.60 + 4.91 = 756.94:  20%|██        | 413/2048 [05:45<20:07,  1.35it/s] 
loss 2.43 accuracy 0.38 -- 57.29 + 58.13 + 636.60 + 4.91 = 756.94:  20%|██        | 414/2048 [05:45<20:28,  1.33it/s]
loss 2.02 accuracy 0.25 -- 57.59 + 56.41 + 515.86 + 4.88 = 634.74:  20%|██        | 414/2048 [05:46<20:28,  1.33it/s]
loss 2.02 accuracy 0.25 -- 57.59 + 56.41 + 515.86 + 4.88 = 634.74:  20%|██        | 415/2048 [05:46<19:43,  1.38it/s]
loss 2.88 accuracy 0.19 -- 56.64 + 57.75 + 630.37 + 4.94 = 749.69:  20%|██        | 415/2048 [05:46<19:43,  1.38it/s]
loss 2.88 accuracy 0.19 -- 56.64 + 57.75 + 630.37 + 4.94 = 749.69:  20%|██        | 416/2048 [05:46<20:08,  1.35it/s]
loss 2.02 accuracy 0.31 -- 56.97 + 56.56 + 512.42 + 4.93 = 630.89:  20%|██        | 416/2048 [05:47<20:08,  1.35it/s]
loss 2.02 accuracy 0.31 -- 56.97 + 56.56 + 512.42 + 4.93 = 630.89:  20%|██        | 417/2048 [05:47<19:27,  1.40it/s]
loss 2.18 accuracy 0.12 -- 56.61 + 171.84 + 512.49 + 4.95 = 745.89:  20%|██        | 417/2048 [05:48<19:27,  1.40it/s]
loss 2.18 accuracy 0.12 -- 56.61 + 171.84 + 512.49 + 4.95 = 745.89:  20%|██        | 418/2048 [05:48<19:55,  1.36it/s]
loss 2.32 accuracy 0.25 -- 56.58 + 57.08 + 510.75 + 4.93 = 629.34:  20%|██        | 418/2048 [05:49<19:55,  1.36it/s] 
loss 2.32 accuracy 0.25 -- 56.58 + 57.08 + 510.75 + 4.93 = 629.34:  20%|██        | 419/2048 [05:49<19:17,  1.41it/s]
loss 2.29 accuracy 0.12 -- 57.46 + 56.70 + 507.43 + 4.93 = 626.52:  20%|██        | 419/2048 [05:49<19:17,  1.41it/s]
loss 2.29 accuracy 0.12 -- 57.46 + 56.70 + 507.43 + 4.93 = 626.52:  21%|██        | 420/2048 [05:49<20:01,  1.35it/s]
loss 2.14 accuracy 0.25 -- 56.26 + 57.62 + 504.86 + 4.92 = 623.66:  21%|██        | 420/2048 [05:50<20:01,  1.35it/s]
loss 2.14 accuracy 0.25 -- 56.26 + 57.62 + 504.86 + 4.92 = 623.66:  21%|██        | 421/2048 [05:50<19:18,  1.40it/s]
loss 1.79 accuracy 0.44 -- 163.98 + 56.96 + 499.85 + 4.90 = 725.69:  21%|██        | 421/2048 [05:51<19:18,  1.40it/s]
loss 1.79 accuracy 0.44 -- 163.98 + 56.96 + 499.85 + 4.90 = 725.69:  21%|██        | 422/2048 [05:51<19:37,  1.38it/s]
loss 2.08 accuracy 0.31 -- 56.27 + 172.36 + 511.69 + 4.91 = 745.23:  21%|██        | 422/2048 [05:52<19:37,  1.38it/s]
loss 2.08 accuracy 0.31 -- 56.27 + 172.36 + 511.69 + 4.91 = 745.23:  21%|██        | 423/2048 [05:52<20:01,  1.35it/s]
loss 2.12 accuracy 0.19 -- 57.18 + 56.77 + 508.57 + 4.95 = 627.48:  21%|██        | 423/2048 [05:52<20:01,  1.35it/s] 
loss 2.12 accuracy 0.19 -- 57.18 + 56.77 + 508.57 + 4.95 = 627.48:  21%|██        | 424/2048 [05:52<19:19,  1.40it/s]
loss 2.41 accuracy 0.12 -- 169.07 + 57.75 + 505.58 + 4.92 = 737.32:  21%|██        | 424/2048 [05:53<19:19,  1.40it/s]
loss 2.41 accuracy 0.12 -- 169.07 + 57.75 + 505.58 + 4.92 = 737.32:  21%|██        | 425/2048 [05:53<19:43,  1.37it/s]
loss 2.08 accuracy 0.25 -- 56.85 + 57.58 + 638.67 + 4.90 = 758.00:  21%|██        | 425/2048 [05:54<19:43,  1.37it/s] 
loss 2.08 accuracy 0.25 -- 56.85 + 57.58 + 638.67 + 4.90 = 758.00:  21%|██        | 426/2048 [05:54<20:09,  1.34it/s]
loss 2.10 accuracy 0.12 -- 57.18 + 56.91 + 515.17 + 4.93 = 634.18:  21%|██        | 426/2048 [05:54<20:09,  1.34it/s]
loss 2.10 accuracy 0.12 -- 57.18 + 56.91 + 515.17 + 4.93 = 634.18:  21%|██        | 427/2048 [05:54<19:28,  1.39it/s]
loss 2.35 accuracy 0.19 -- 56.95 + 57.70 + 631.33 + 4.91 = 750.90:  21%|██        | 427/2048 [05:55<19:28,  1.39it/s]
loss 2.35 accuracy 0.19 -- 56.95 + 57.70 + 631.33 + 4.91 = 750.90:  21%|██        | 428/2048 [05:55<19:55,  1.36it/s]
loss 1.74 accuracy 0.44 -- 57.38 + 56.64 + 514.41 + 4.93 = 633.36:  21%|██        | 428/2048 [05:56<19:55,  1.36it/s]
loss 1.74 accuracy 0.44 -- 57.38 + 56.64 + 514.41 + 4.93 = 633.36:  21%|██        | 429/2048 [05:56<19:16,  1.40it/s]
loss 2.39 accuracy 0.25 -- 56.30 + 171.58 + 513.91 + 4.95 = 746.75:  21%|██        | 429/2048 [05:57<19:16,  1.40it/s]
loss 2.39 accuracy 0.25 -- 56.30 + 171.58 + 513.91 + 4.95 = 746.75:  21%|██        | 430/2048 [05:57<19:44,  1.37it/s]
loss 2.13 accuracy 0.38 -- 56.55 + 56.47 + 508.77 + 4.95 = 626.74:  21%|██        | 430/2048 [05:57<19:44,  1.37it/s] 
loss 2.13 accuracy 0.38 -- 56.55 + 56.47 + 508.77 + 4.95 = 626.74:  21%|██        | 431/2048 [05:57<19:06,  1.41it/s]
loss 1.89 accuracy 0.38 -- 57.07 + 56.90 + 509.17 + 4.91 = 628.05:  21%|██        | 431/2048 [05:58<19:06,  1.41it/s]
loss 1.89 accuracy 0.38 -- 57.07 + 56.90 + 509.17 + 4.91 = 628.05:  21%|██        | 432/2048 [05:58<19:33,  1.38it/s]
loss 1.86 accuracy 0.31 -- 56.33 + 57.58 + 505.17 + 4.96 = 624.03:  21%|██        | 432/2048 [05:59<19:33,  1.38it/s]
loss 1.86 accuracy 0.31 -- 56.33 + 57.58 + 505.17 + 4.96 = 624.03:  21%|██        | 433/2048 [05:59<18:56,  1.42it/s]
loss 2.00 accuracy 0.19 -- 163.80 + 57.36 + 499.65 + 4.91 = 725.72:  21%|██        | 433/2048 [05:59<18:56,  1.42it/s]
loss 2.00 accuracy 0.19 -- 163.80 + 57.36 + 499.65 + 4.91 = 725.72:  21%|██        | 434/2048 [05:59<19:19,  1.39it/s]
loss 1.81 accuracy 0.25 -- 56.69 + 171.92 + 512.99 + 4.93 = 746.53:  21%|██        | 434/2048 [06:00<19:19,  1.39it/s]
loss 1.81 accuracy 0.25 -- 56.69 + 171.92 + 512.99 + 4.93 = 746.53:  21%|██        | 435/2048 [06:00<19:45,  1.36it/s]
loss 2.04 accuracy 0.25 -- 57.48 + 56.87 + 507.87 + 4.93 = 627.15:  21%|██        | 435/2048 [06:01<19:45,  1.36it/s] 
loss 2.04 accuracy 0.25 -- 57.48 + 56.87 + 507.87 + 4.93 = 627.15:  21%|██▏       | 436/2048 [06:01<19:06,  1.41it/s]
loss 2.20 accuracy 0.06 -- 168.72 + 57.84 + 508.53 + 4.93 = 740.02:  21%|██▏       | 436/2048 [06:02<19:06,  1.41it/s]
loss 2.20 accuracy 0.06 -- 168.72 + 57.84 + 508.53 + 4.93 = 740.02:  21%|██▏       | 437/2048 [06:02<19:32,  1.37it/s]
loss 2.25 accuracy 0.19 -- 56.76 + 57.28 + 637.45 + 4.92 = 756.40:  21%|██▏       | 437/2048 [06:02<19:32,  1.37it/s] 
loss 2.25 accuracy 0.19 -- 56.76 + 57.28 + 637.45 + 4.92 = 756.40:  21%|██▏       | 438/2048 [06:02<19:58,  1.34it/s]
loss 2.09 accuracy 0.31 -- 57.14 + 56.60 + 518.27 + 4.93 = 636.95:  21%|██▏       | 438/2048 [06:03<19:58,  1.34it/s]
loss 2.09 accuracy 0.31 -- 57.14 + 56.60 + 518.27 + 4.93 = 636.95:  21%|██▏       | 439/2048 [06:03<19:19,  1.39it/s]
loss 1.98 accuracy 0.19 -- 56.48 + 57.46 + 631.64 + 4.94 = 750.52:  21%|██▏       | 439/2048 [06:04<19:19,  1.39it/s]
loss 1.98 accuracy 0.19 -- 56.48 + 57.46 + 631.64 + 4.94 = 750.52:  21%|██▏       | 440/2048 [06:04<19:46,  1.36it/s]
loss 2.19 accuracy 0.19 -- 57.01 + 56.50 + 512.87 + 4.91 = 631.29:  21%|██▏       | 440/2048 [06:05<19:46,  1.36it/s]
loss 2.19 accuracy 0.19 -- 57.01 + 56.50 + 512.87 + 4.91 = 631.29:  22%|██▏       | 441/2048 [06:05<19:07,  1.40it/s]
loss 2.01 accuracy 0.25 -- 56.74 + 171.28 + 511.57 + 4.92 = 744.51:  22%|██▏       | 441/2048 [06:05<19:07,  1.40it/s]
loss 2.01 accuracy 0.25 -- 56.74 + 171.28 + 511.57 + 4.92 = 744.51:  22%|██▏       | 442/2048 [06:05<19:34,  1.37it/s]
loss 2.06 accuracy 0.25 -- 56.68 + 56.80 + 509.52 + 4.94 = 627.94:  22%|██▏       | 442/2048 [06:06<19:34,  1.37it/s] 
loss 2.06 accuracy 0.25 -- 56.68 + 56.80 + 509.52 + 4.94 = 627.94:  22%|██▏       | 443/2048 [06:06<18:57,  1.41it/s]
loss 2.70 accuracy 0.38 -- 57.20 + 56.66 + 507.58 + 4.91 = 626.36:  22%|██▏       | 443/2048 [06:07<18:57,  1.41it/s]
loss 2.70 accuracy 0.38 -- 57.20 + 56.66 + 507.58 + 4.91 = 626.36:  22%|██▏       | 444/2048 [06:07<19:24,  1.38it/s]
loss 2.10 accuracy 0.12 -- 56.53 + 57.76 + 504.43 + 4.92 = 623.65:  22%|██▏       | 444/2048 [06:07<19:24,  1.38it/s]
loss 2.10 accuracy 0.12 -- 56.53 + 57.76 + 504.43 + 4.92 = 623.65:  22%|██▏       | 445/2048 [06:07<18:47,  1.42it/s]
loss 2.47 accuracy 0.19 -- 163.85 + 57.28 + 499.32 + 4.93 = 725.38:  22%|██▏       | 445/2048 [06:08<18:47,  1.42it/s]
loss 2.47 accuracy 0.19 -- 163.85 + 57.28 + 499.32 + 4.93 = 725.38:  22%|██▏       | 446/2048 [06:08<19:10,  1.39it/s]
loss 1.63 accuracy 0.56 -- 56.71 + 172.32 + 514.26 + 4.94 = 748.23:  22%|██▏       | 446/2048 [06:09<19:10,  1.39it/s]
loss 1.63 accuracy 0.56 -- 56.71 + 172.32 + 514.26 + 4.94 = 748.23:  22%|██▏       | 447/2048 [06:09<19:37,  1.36it/s]
loss 2.06 accuracy 0.31 -- 57.26 + 56.82 + 507.24 + 4.91 = 626.23:  22%|██▏       | 447/2048 [06:10<19:37,  1.36it/s] 
loss 2.06 accuracy 0.31 -- 57.26 + 56.82 + 507.24 + 4.91 = 626.23:  22%|██▏       | 448/2048 [06:10<18:57,  1.41it/s]
loss 2.51 accuracy 0.06 -- 168.00 + 57.79 + 507.40 + 4.94 = 738.14:  22%|██▏       | 448/2048 [06:10<18:57,  1.41it/s]
loss 2.51 accuracy 0.06 -- 168.00 + 57.79 + 507.40 + 4.94 = 738.14:  22%|██▏       | 449/2048 [06:10<19:23,  1.37it/s]
loss 2.75 accuracy 0.06 -- 56.63 + 57.72 + 639.72 + 4.93 = 759.00:  22%|██▏       | 449/2048 [06:11<19:23,  1.37it/s] 
loss 2.75 accuracy 0.06 -- 56.63 + 57.72 + 639.72 + 4.93 = 759.00:  22%|██▏       | 450/2048 [06:11<19:50,  1.34it/s]
loss 2.07 accuracy 0.06 -- 57.46 + 56.96 + 515.64 + 4.93 = 634.99:  22%|██▏       | 450/2048 [06:12<19:50,  1.34it/s]
loss 2.07 accuracy 0.06 -- 57.46 + 56.96 + 515.64 + 4.93 = 634.99:  22%|██▏       | 451/2048 [06:12<19:10,  1.39it/s]
loss 2.79 accuracy 0.19 -- 56.53 + 57.63 + 632.74 + 4.96 = 751.87:  22%|██▏       | 451/2048 [06:13<19:10,  1.39it/s]
loss 2.79 accuracy 0.19 -- 56.53 + 57.63 + 632.74 + 4.96 = 751.87:  22%|██▏       | 452/2048 [06:13<19:37,  1.36it/s]
loss 2.31 accuracy 0.25 -- 57.33 + 56.69 + 512.52 + 4.93 = 631.48:  22%|██▏       | 452/2048 [06:13<19:37,  1.36it/s]
loss 2.31 accuracy 0.25 -- 57.33 + 56.69 + 512.52 + 4.93 = 631.48:  22%|██▏       | 453/2048 [06:13<18:59,  1.40it/s]
loss 2.09 accuracy 0.12 -- 56.73 + 185.80 + 511.48 + 4.94 = 758.95:  22%|██▏       | 453/2048 [06:14<18:59,  1.40it/s]
loss 2.09 accuracy 0.12 -- 56.73 + 185.80 + 511.48 + 4.94 = 758.95:  22%|██▏       | 454/2048 [06:14<19:32,  1.36it/s]
loss 2.50 accuracy 0.19 -- 57.43 + 57.43 + 508.14 + 4.89 = 627.89:  22%|██▏       | 454/2048 [06:15<19:32,  1.36it/s] 
loss 2.50 accuracy 0.19 -- 57.43 + 57.43 + 508.14 + 4.89 = 627.89:  22%|██▏       | 455/2048 [06:15<18:54,  1.40it/s]
loss 2.06 accuracy 0.38 -- 56.95 + 56.47 + 508.40 + 4.91 = 626.73:  22%|██▏       | 455/2048 [06:15<18:54,  1.40it/s]
loss 2.06 accuracy 0.38 -- 56.95 + 56.47 + 508.40 + 4.91 = 626.73:  22%|██▏       | 456/2048 [06:15<19:20,  1.37it/s]
loss 2.00 accuracy 0.50 -- 56.04 + 57.45 + 503.69 + 4.90 = 622.09:  22%|██▏       | 456/2048 [06:16<19:20,  1.37it/s]
loss 2.00 accuracy 0.50 -- 56.04 + 57.45 + 503.69 + 4.90 = 622.09:  22%|██▏       | 457/2048 [06:16<18:42,  1.42it/s]
loss 1.86 accuracy 0.38 -- 163.91 + 57.21 + 498.78 + 4.89 = 724.80:  22%|██▏       | 457/2048 [06:17<18:42,  1.42it/s]
loss 1.86 accuracy 0.38 -- 163.91 + 57.21 + 498.78 + 4.89 = 724.80:  22%|██▏       | 458/2048 [06:17<19:03,  1.39it/s]
loss 2.12 accuracy 0.38 -- 56.31 + 172.53 + 509.65 + 4.92 = 743.40:  22%|██▏       | 458/2048 [06:18<19:03,  1.39it/s]
loss 2.12 accuracy 0.38 -- 56.31 + 172.53 + 509.65 + 4.92 = 743.40:  22%|██▏       | 459/2048 [06:18<19:27,  1.36it/s]
loss 2.13 accuracy 0.25 -- 56.90 + 56.34 + 507.11 + 4.89 = 625.25:  22%|██▏       | 459/2048 [06:18<19:27,  1.36it/s] 
loss 2.13 accuracy 0.25 -- 56.90 + 56.34 + 507.11 + 4.89 = 625.25:  22%|██▏       | 460/2048 [06:18<18:47,  1.41it/s]
loss 1.97 accuracy 0.38 -- 168.91 + 57.53 + 506.28 + 4.89 = 737.60:  22%|██▏       | 460/2048 [06:19<18:47,  1.41it/s]
loss 1.97 accuracy 0.38 -- 168.91 + 57.53 + 506.28 + 4.89 = 737.60:  23%|██▎       | 461/2048 [06:19<19:13,  1.38it/s]
loss 1.95 accuracy 0.25 -- 56.66 + 57.53 + 634.33 + 4.91 = 753.43:  23%|██▎       | 461/2048 [06:20<19:13,  1.38it/s] 
loss 1.95 accuracy 0.25 -- 56.66 + 57.53 + 634.33 + 4.91 = 753.43:  23%|██▎       | 462/2048 [06:20<19:38,  1.35it/s]
loss 2.11 accuracy 0.19 -- 57.00 + 56.73 + 515.19 + 4.90 = 633.82:  23%|██▎       | 462/2048 [06:20<19:38,  1.35it/s]
loss 2.11 accuracy 0.19 -- 57.00 + 56.73 + 515.19 + 4.90 = 633.82:  23%|██▎       | 463/2048 [06:20<18:58,  1.39it/s]
loss 2.01 accuracy 0.19 -- 56.35 + 57.33 + 631.29 + 4.90 = 749.88:  23%|██▎       | 463/2048 [06:21<18:58,  1.39it/s]
loss 2.01 accuracy 0.19 -- 56.35 + 57.33 + 631.29 + 4.90 = 749.88:  23%|██▎       | 464/2048 [06:21<19:25,  1.36it/s]
loss 1.89 accuracy 0.31 -- 56.97 + 56.42 + 510.67 + 4.88 = 628.95:  23%|██▎       | 464/2048 [06:22<19:25,  1.36it/s]
loss 1.89 accuracy 0.31 -- 56.97 + 56.42 + 510.67 + 4.88 = 628.95:  23%|██▎       | 465/2048 [06:22<18:47,  1.40it/s]
loss 1.92 accuracy 0.38 -- 56.73 + 171.95 + 510.25 + 4.91 = 743.85:  23%|██▎       | 465/2048 [06:23<18:47,  1.40it/s]
loss 1.92 accuracy 0.38 -- 56.73 + 171.95 + 510.25 + 4.91 = 743.85:  23%|██▎       | 466/2048 [06:23<19:14,  1.37it/s]
loss 1.68 accuracy 0.38 -- 56.71 + 56.50 + 508.54 + 4.88 = 626.64:  23%|██▎       | 466/2048 [06:23<19:14,  1.37it/s] 
loss 1.68 accuracy 0.38 -- 56.71 + 56.50 + 508.54 + 4.88 = 626.64:  23%|██▎       | 467/2048 [06:23<18:37,  1.41it/s]
loss 1.68 accuracy 0.31 -- 56.95 + 56.71 + 508.34 + 4.93 = 626.93:  23%|██▎       | 467/2048 [06:24<18:37,  1.41it/s]
loss 1.68 accuracy 0.31 -- 56.95 + 56.71 + 508.34 + 4.93 = 626.93:  23%|██▎       | 468/2048 [06:24<19:05,  1.38it/s]
loss 2.16 accuracy 0.19 -- 56.34 + 57.68 + 506.89 + 4.91 = 625.82:  23%|██▎       | 468/2048 [06:25<19:05,  1.38it/s]
loss 2.16 accuracy 0.19 -- 56.34 + 57.68 + 506.89 + 4.91 = 625.82:  23%|██▎       | 469/2048 [06:25<18:30,  1.42it/s]
loss 1.83 accuracy 0.31 -- 163.60 + 57.42 + 498.30 + 4.91 = 724.22:  23%|██▎       | 469/2048 [06:25<18:30,  1.42it/s]
loss 1.83 accuracy 0.31 -- 163.60 + 57.42 + 498.30 + 4.91 = 724.22:  23%|██▎       | 470/2048 [06:25<18:52,  1.39it/s]
loss 2.06 accuracy 0.25 -- 56.29 + 171.57 + 509.96 + 4.92 = 742.73:  23%|██▎       | 470/2048 [06:26<18:52,  1.39it/s]
loss 2.06 accuracy 0.25 -- 56.29 + 171.57 + 509.96 + 4.92 = 742.73:  23%|██▎       | 471/2048 [06:26<19:33,  1.34it/s]
loss 1.99 accuracy 0.19 -- 57.30 + 56.36 + 506.39 + 4.96 = 625.00:  23%|██▎       | 471/2048 [06:27<19:33,  1.34it/s] 
loss 1.99 accuracy 0.19 -- 57.30 + 56.36 + 506.39 + 4.96 = 625.00:  23%|██▎       | 472/2048 [06:27<18:49,  1.40it/s]
loss 1.88 accuracy 0.25 -- 170.36 + 57.60 + 506.43 + 4.88 = 739.27:  23%|██▎       | 472/2048 [06:28<18:49,  1.40it/s]
loss 1.88 accuracy 0.25 -- 170.36 + 57.60 + 506.43 + 4.88 = 739.27:  23%|██▎       | 473/2048 [06:28<19:12,  1.37it/s]
loss 2.26 accuracy 0.19 -- 56.48 + 57.37 + 637.92 + 4.89 = 756.67:  23%|██▎       | 473/2048 [06:28<19:12,  1.37it/s] 
loss 2.26 accuracy 0.19 -- 56.48 + 57.37 + 637.92 + 4.89 = 756.67:  23%|██▎       | 474/2048 [06:28<19:36,  1.34it/s]
loss 1.83 accuracy 0.38 -- 57.32 + 56.86 + 514.21 + 4.87 = 633.27:  23%|██▎       | 474/2048 [06:29<19:36,  1.34it/s]
loss 1.83 accuracy 0.38 -- 57.32 + 56.86 + 514.21 + 4.87 = 633.27:  23%|██▎       | 475/2048 [06:29<18:55,  1.39it/s]
loss 1.88 accuracy 0.44 -- 56.39 + 57.61 + 629.80 + 4.92 = 748.72:  23%|██▎       | 475/2048 [06:30<18:55,  1.39it/s]
loss 1.88 accuracy 0.44 -- 56.39 + 57.61 + 629.80 + 4.92 = 748.72:  23%|██▎       | 476/2048 [06:30<19:19,  1.36it/s]
loss 1.90 accuracy 0.31 -- 57.63 + 57.04 + 510.73 + 4.89 = 630.29:  23%|██▎       | 476/2048 [06:31<19:19,  1.36it/s]
loss 1.90 accuracy 0.31 -- 57.63 + 57.04 + 510.73 + 4.89 = 630.29:  23%|██▎       | 477/2048 [06:31<18:41,  1.40it/s]
loss 2.21 accuracy 0.25 -- 56.64 + 171.05 + 510.31 + 4.90 = 742.89:  23%|██▎       | 477/2048 [06:31<18:41,  1.40it/s]
loss 2.21 accuracy 0.25 -- 56.64 + 171.05 + 510.31 + 4.90 = 742.89:  23%|██▎       | 478/2048 [06:31<19:23,  1.35it/s]
loss 2.60 accuracy 0.19 -- 56.39 + 56.37 + 508.48 + 4.90 = 626.14:  23%|██▎       | 478/2048 [06:32<19:23,  1.35it/s] 
loss 2.60 accuracy 0.19 -- 56.39 + 56.37 + 508.48 + 4.90 = 626.14:  23%|██▎       | 479/2048 [06:32<18:41,  1.40it/s]
loss 1.80 accuracy 0.12 -- 56.97 + 56.73 + 508.09 + 4.88 = 626.67:  23%|██▎       | 479/2048 [06:33<18:41,  1.40it/s]
loss 1.80 accuracy 0.12 -- 56.97 + 56.73 + 508.09 + 4.88 = 626.67:  23%|██▎       | 480/2048 [06:33<19:05,  1.37it/s]
loss 2.42 accuracy 0.19 -- 56.44 + 57.42 + 503.49 + 4.89 = 622.24:  23%|██▎       | 480/2048 [06:33<19:05,  1.37it/s]
loss 2.42 accuracy 0.19 -- 56.44 + 57.42 + 503.49 + 4.89 = 622.24:  23%|██▎       | 481/2048 [06:33<18:26,  1.42it/s]
loss 2.24 accuracy 0.25 -- 163.47 + 57.00 + 498.69 + 4.93 = 724.09:  23%|██▎       | 481/2048 [06:34<18:26,  1.42it/s]
loss 2.24 accuracy 0.25 -- 163.47 + 57.00 + 498.69 + 4.93 = 724.09:  24%|██▎       | 482/2048 [06:34<18:47,  1.39it/s]
loss 2.12 accuracy 0.19 -- 56.28 + 172.26 + 512.25 + 4.87 = 745.66:  24%|██▎       | 482/2048 [06:35<18:47,  1.39it/s]
loss 2.12 accuracy 0.19 -- 56.28 + 172.26 + 512.25 + 4.87 = 745.66:  24%|██▎       | 483/2048 [06:35<19:11,  1.36it/s]
loss 1.90 accuracy 0.31 -- 57.12 + 56.62 + 504.76 + 4.87 = 623.37:  24%|██▎       | 483/2048 [06:36<19:11,  1.36it/s] 
loss 1.90 accuracy 0.31 -- 57.12 + 56.62 + 504.76 + 4.87 = 623.37:  24%|██▎       | 484/2048 [06:36<18:30,  1.41it/s]
loss 1.64 accuracy 0.44 -- 167.77 + 57.06 + 505.33 + 4.89 = 735.05:  24%|██▎       | 484/2048 [06:36<18:30,  1.41it/s]
loss 1.64 accuracy 0.44 -- 167.77 + 57.06 + 505.33 + 4.89 = 735.05:  24%|██▎       | 485/2048 [06:36<19:11,  1.36it/s]
loss 2.03 accuracy 0.25 -- 56.64 + 57.55 + 632.17 + 4.87 = 751.23:  24%|██▎       | 485/2048 [06:37<19:11,  1.36it/s] 
loss 2.03 accuracy 0.25 -- 56.64 + 57.55 + 632.17 + 4.87 = 751.23:  24%|██▎       | 486/2048 [06:37<19:30,  1.33it/s]
loss 2.08 accuracy 0.25 -- 57.08 + 56.72 + 512.28 + 4.89 = 630.97:  24%|██▎       | 486/2048 [06:38<19:30,  1.33it/s]
loss 2.08 accuracy 0.25 -- 57.08 + 56.72 + 512.28 + 4.89 = 630.97:  24%|██▍       | 487/2048 [06:38<18:46,  1.39it/s]
loss 2.79 accuracy 0.25 -- 56.26 + 57.31 + 627.79 + 4.89 = 746.25:  24%|██▍       | 487/2048 [06:39<18:46,  1.39it/s]
loss 2.79 accuracy 0.25 -- 56.26 + 57.31 + 627.79 + 4.89 = 746.25:  24%|██▍       | 488/2048 [06:39<19:10,  1.36it/s]
loss 1.96 accuracy 0.25 -- 57.35 + 56.59 + 508.65 + 4.87 = 627.46:  24%|██▍       | 488/2048 [06:39<19:10,  1.36it/s]
loss 1.96 accuracy 0.25 -- 57.35 + 56.59 + 508.65 + 4.87 = 627.46:  24%|██▍       | 489/2048 [06:39<18:30,  1.40it/s]
loss 1.79 accuracy 0.31 -- 56.68 + 171.58 + 509.03 + 4.88 = 742.17:  24%|██▍       | 489/2048 [06:40<18:30,  1.40it/s]
loss 1.79 accuracy 0.31 -- 56.68 + 171.58 + 509.03 + 4.88 = 742.17:  24%|██▍       | 490/2048 [06:40<18:56,  1.37it/s]
loss 1.79 accuracy 0.38 -- 56.32 + 56.83 + 508.21 + 4.88 = 626.25:  24%|██▍       | 490/2048 [06:41<18:56,  1.37it/s] 
loss 1.79 accuracy 0.38 -- 56.32 + 56.83 + 508.21 + 4.88 = 626.25:  24%|██▍       | 491/2048 [06:41<18:20,  1.41it/s]
loss 2.10 accuracy 0.19 -- 57.03 + 56.78 + 507.50 + 4.87 = 626.18:  24%|██▍       | 491/2048 [06:42<18:20,  1.41it/s]
loss 2.10 accuracy 0.19 -- 57.03 + 56.78 + 507.50 + 4.87 = 626.18:  24%|██▍       | 492/2048 [06:42<19:03,  1.36it/s]
loss 2.37 accuracy 0.12 -- 56.36 + 57.35 + 502.85 + 4.87 = 621.43:  24%|██▍       | 492/2048 [06:42<19:03,  1.36it/s]
loss 2.37 accuracy 0.12 -- 56.36 + 57.35 + 502.85 + 4.87 = 621.43:  24%|██▍       | 493/2048 [06:42<18:22,  1.41it/s]
loss 1.86 accuracy 0.38 -- 162.86 + 56.78 + 495.88 + 4.89 = 720.41:  24%|██▍       | 493/2048 [06:43<18:22,  1.41it/s]
loss 1.86 accuracy 0.38 -- 162.86 + 56.78 + 495.88 + 4.89 = 720.41:  24%|██▍       | 494/2048 [06:43<18:40,  1.39it/s]
loss 2.28 accuracy 0.12 -- 55.81 + 171.85 + 509.29 + 4.87 = 741.83:  24%|██▍       | 494/2048 [06:44<18:40,  1.39it/s]
loss 2.28 accuracy 0.12 -- 55.81 + 171.85 + 509.29 + 4.87 = 741.83:  24%|██▍       | 495/2048 [06:44<19:02,  1.36it/s]
loss 1.92 accuracy 0.25 -- 56.71 + 56.62 + 504.79 + 4.90 = 623.02:  24%|██▍       | 495/2048 [06:44<19:02,  1.36it/s] 
loss 1.92 accuracy 0.25 -- 56.71 + 56.62 + 504.79 + 4.90 = 623.02:  24%|██▍       | 496/2048 [06:44<18:21,  1.41it/s]
loss 2.41 accuracy 0.12 -- 168.09 + 57.14 + 505.20 + 4.84 = 735.26:  24%|██▍       | 496/2048 [06:45<18:21,  1.41it/s]
loss 2.41 accuracy 0.12 -- 168.09 + 57.14 + 505.20 + 4.84 = 735.26:  24%|██▍       | 497/2048 [06:45<18:45,  1.38it/s]
loss 1.64 accuracy 0.44 -- 56.21 + 56.98 + 631.85 + 4.85 = 749.88:  24%|██▍       | 497/2048 [06:46<18:45,  1.38it/s] 
loss 1.64 accuracy 0.44 -- 56.21 + 56.98 + 631.85 + 4.85 = 749.88:  24%|██▍       | 498/2048 [06:46<19:08,  1.35it/s]
loss 2.49 accuracy 0.19 -- 57.04 + 56.29 + 511.37 + 4.85 = 629.55:  24%|██▍       | 498/2048 [06:47<19:08,  1.35it/s]
loss 2.49 accuracy 0.19 -- 57.04 + 56.29 + 511.37 + 4.85 = 629.55:  24%|██▍       | 499/2048 [06:47<18:45,  1.38it/s]
loss 1.61 accuracy 0.44 -- 56.31 + 57.35 + 627.44 + 4.83 = 745.92:  24%|██▍       | 499/2048 [06:47<18:45,  1.38it/s]
loss 1.61 accuracy 0.44 -- 56.31 + 57.35 + 627.44 + 4.83 = 745.92:  24%|██▍       | 500/2048 [06:47<19:06,  1.35it/s]
loss 2.44 accuracy 0.12 -- 57.16 + 56.49 + 509.96 + 4.85 = 628.47:  24%|██▍       | 500/2048 [06:48<19:06,  1.35it/s]
loss 2.44 accuracy 0.12 -- 57.16 + 56.49 + 509.96 + 4.85 = 628.47:  24%|██▍       | 501/2048 [06:48<18:26,  1.40it/s]
loss 1.84 accuracy 0.31 -- 56.27 + 171.71 + 508.52 + 4.85 = 741.35:  24%|██▍       | 501/2048 [06:49<18:26,  1.40it/s]
loss 1.84 accuracy 0.31 -- 56.27 + 171.71 + 508.52 + 4.85 = 741.35:  25%|██▍       | 502/2048 [06:49<18:50,  1.37it/s]
loss 2.20 accuracy 0.12 -- 56.12 + 56.63 + 505.56 + 4.82 = 623.15:  25%|██▍       | 502/2048 [06:49<18:50,  1.37it/s] 
loss 2.20 accuracy 0.12 -- 56.12 + 56.63 + 505.56 + 4.82 = 623.15:  25%|██▍       | 503/2048 [06:49<18:12,  1.41it/s]
loss 2.28 accuracy 0.38 -- 56.67 + 56.43 + 504.60 + 4.84 = 622.54:  25%|██▍       | 503/2048 [06:50<18:12,  1.41it/s]
loss 2.28 accuracy 0.38 -- 56.67 + 56.43 + 504.60 + 4.84 = 622.54:  25%|██▍       | 504/2048 [06:50<18:37,  1.38it/s]
loss 2.27 accuracy 0.38 -- 56.32 + 57.17 + 500.86 + 4.84 = 619.20:  25%|██▍       | 504/2048 [06:51<18:37,  1.38it/s]
loss 2.27 accuracy 0.38 -- 56.32 + 57.17 + 500.86 + 4.84 = 619.20:  25%|██▍       | 505/2048 [06:51<18:00,  1.43it/s]
loss 1.61 accuracy 0.25 -- 162.72 + 56.95 + 496.47 + 4.85 = 720.99:  25%|██▍       | 505/2048 [06:52<18:00,  1.43it/s]
loss 1.61 accuracy 0.25 -- 162.72 + 56.95 + 496.47 + 4.85 = 720.99:  25%|██▍       | 506/2048 [06:52<18:22,  1.40it/s]
loss 2.30 accuracy 0.06 -- 55.94 + 171.66 + 509.60 + 4.83 = 742.03:  25%|██▍       | 506/2048 [06:52<18:22,  1.40it/s]
loss 2.30 accuracy 0.06 -- 55.94 + 171.66 + 509.60 + 4.83 = 742.03:  25%|██▍       | 507/2048 [06:52<19:03,  1.35it/s]
loss 1.92 accuracy 0.25 -- 56.99 + 56.66 + 503.60 + 4.84 = 622.09:  25%|██▍       | 507/2048 [06:53<19:03,  1.35it/s] 
loss 1.92 accuracy 0.25 -- 56.99 + 56.66 + 503.60 + 4.84 = 622.09:  25%|██▍       | 508/2048 [06:53<18:20,  1.40it/s]
loss 2.45 accuracy 0.19 -- 168.10 + 57.07 + 503.05 + 4.84 = 733.05:  25%|██▍       | 508/2048 [06:54<18:20,  1.40it/s]
loss 2.45 accuracy 0.19 -- 168.10 + 57.07 + 503.05 + 4.84 = 733.05:  25%|██▍       | 509/2048 [06:54<18:40,  1.37it/s]
loss 2.01 accuracy 0.25 -- 56.97 + 57.31 + 634.60 + 4.83 = 753.70:  25%|██▍       | 509/2048 [06:55<18:40,  1.37it/s] 
loss 2.01 accuracy 0.25 -- 56.97 + 57.31 + 634.60 + 4.83 = 753.70:  25%|██▍       | 510/2048 [06:55<19:04,  1.34it/s]
loss 1.87 accuracy 0.12 -- 57.24 + 56.79 + 515.05 + 4.88 = 633.97:  25%|██▍       | 510/2048 [06:55<19:04,  1.34it/s]
loss 1.87 accuracy 0.12 -- 57.24 + 56.79 + 515.05 + 4.88 = 633.97:  25%|██▍       | 511/2048 [06:55<18:25,  1.39it/s]
loss 1.97 accuracy 0.12 -- 56.56 + 57.40 + 627.91 + 4.82 = 746.68:  25%|██▍       | 511/2048 [06:56<18:25,  1.39it/s]
loss 1.97 accuracy 0.12 -- 56.56 + 57.40 + 627.91 + 4.82 = 746.68:  25%|██▌       | 512/2048 [06:56<18:50,  1.36it/s]
loss 1.64 accuracy 0.38 -- 57.35 + 56.95 + 508.46 + 4.82 = 627.59:  25%|██▌       | 512/2048 [06:57<18:50,  1.36it/s]
loss 1.64 accuracy 0.38 -- 57.35 + 56.95 + 508.46 + 4.82 = 627.59:  25%|██▌       | 513/2048 [06:57<18:12,  1.41it/s]
loss 2.01 accuracy 0.44 -- 56.45 + 170.55 + 507.50 + 4.83 = 739.33:  25%|██▌       | 513/2048 [06:57<18:12,  1.41it/s]
loss 2.01 accuracy 0.44 -- 56.45 + 170.55 + 507.50 + 4.83 = 739.33:  25%|██▌       | 514/2048 [06:57<18:37,  1.37it/s]
loss 2.71 accuracy 0.06 -- 56.05 + 56.38 + 505.05 + 4.82 = 622.29:  25%|██▌       | 514/2048 [06:58<18:37,  1.37it/s] 
loss 2.71 accuracy 0.06 -- 56.05 + 56.38 + 505.05 + 4.82 = 622.29:  25%|██▌       | 515/2048 [06:58<18:00,  1.42it/s]
loss 1.82 accuracy 0.25 -- 57.22 + 56.41 + 504.23 + 4.82 = 622.68:  25%|██▌       | 515/2048 [06:59<18:00,  1.42it/s]
loss 1.82 accuracy 0.25 -- 57.22 + 56.41 + 504.23 + 4.82 = 622.68:  25%|██▌       | 516/2048 [06:59<18:26,  1.39it/s]
loss 2.51 accuracy 0.12 -- 56.26 + 57.10 + 500.66 + 4.83 = 618.84:  25%|██▌       | 516/2048 [06:59<18:26,  1.39it/s]
loss 2.51 accuracy 0.12 -- 56.26 + 57.10 + 500.66 + 4.83 = 618.84:  25%|██▌       | 517/2048 [06:59<17:50,  1.43it/s]
loss 2.38 accuracy 0.44 -- 162.64 + 56.88 + 497.07 + 4.84 = 721.44:  25%|██▌       | 517/2048 [07:00<17:50,  1.43it/s]
loss 2.38 accuracy 0.44 -- 162.64 + 56.88 + 497.07 + 4.84 = 721.44:  25%|██▌       | 518/2048 [07:00<18:12,  1.40it/s]
loss 2.12 accuracy 0.12 -- 56.51 + 172.22 + 508.10 + 4.82 = 741.65:  25%|██▌       | 518/2048 [07:01<18:12,  1.40it/s]
loss 2.12 accuracy 0.12 -- 56.51 + 172.22 + 508.10 + 4.82 = 741.65:  25%|██▌       | 519/2048 [07:01<18:37,  1.37it/s]
loss 1.83 accuracy 0.31 -- 57.10 + 56.20 + 505.52 + 4.84 = 623.67:  25%|██▌       | 519/2048 [07:02<18:37,  1.37it/s] 
loss 1.83 accuracy 0.31 -- 57.10 + 56.20 + 505.52 + 4.84 = 623.67:  25%|██▌       | 520/2048 [07:02<18:00,  1.41it/s]
loss 2.00 accuracy 0.19 -- 167.69 + 56.97 + 503.00 + 4.83 = 732.48:  25%|██▌       | 520/2048 [07:02<18:00,  1.41it/s]
loss 2.00 accuracy 0.19 -- 167.69 + 56.97 + 503.00 + 4.83 = 732.48:  25%|██▌       | 521/2048 [07:02<18:23,  1.38it/s]
loss 1.83 accuracy 0.31 -- 56.12 + 57.73 + 631.86 + 4.81 = 750.52:  25%|██▌       | 521/2048 [07:03<18:23,  1.38it/s] 
loss 1.83 accuracy 0.31 -- 56.12 + 57.73 + 631.86 + 4.81 = 750.52:  25%|██▌       | 522/2048 [07:03<19:04,  1.33it/s]
loss 1.84 accuracy 0.38 -- 57.05 + 56.54 + 511.62 + 4.82 = 630.02:  25%|██▌       | 522/2048 [07:04<19:04,  1.33it/s]
loss 1.84 accuracy 0.38 -- 57.05 + 56.54 + 511.62 + 4.82 = 630.02:  26%|██▌       | 523/2048 [07:04<18:21,  1.38it/s]
loss 2.16 accuracy 0.19 -- 56.35 + 57.33 + 628.02 + 4.83 = 746.53:  26%|██▌       | 523/2048 [07:05<18:21,  1.38it/s]
loss 2.16 accuracy 0.19 -- 56.35 + 57.33 + 628.02 + 4.83 = 746.53:  26%|██▌       | 524/2048 [07:05<18:44,  1.36it/s]
loss 1.68 accuracy 0.38 -- 56.94 + 56.73 + 509.25 + 4.84 = 627.77:  26%|██▌       | 524/2048 [07:05<18:44,  1.36it/s]
loss 1.68 accuracy 0.38 -- 56.94 + 56.73 + 509.25 + 4.84 = 627.77:  26%|██▌       | 525/2048 [07:05<18:05,  1.40it/s]
loss 2.27 accuracy 0.25 -- 56.31 + 171.19 + 507.43 + 4.85 = 739.78:  26%|██▌       | 525/2048 [07:06<18:05,  1.40it/s]
loss 2.27 accuracy 0.25 -- 56.31 + 171.19 + 507.43 + 4.85 = 739.78:  26%|██▌       | 526/2048 [07:06<18:29,  1.37it/s]
loss 2.16 accuracy 0.12 -- 56.55 + 56.66 + 505.14 + 4.82 = 623.16:  26%|██▌       | 526/2048 [07:07<18:29,  1.37it/s] 
loss 2.16 accuracy 0.12 -- 56.55 + 56.66 + 505.14 + 4.82 = 623.16:  26%|██▌       | 527/2048 [07:07<17:53,  1.42it/s]
loss 2.04 accuracy 0.25 -- 56.98 + 56.75 + 505.28 + 4.82 = 623.84:  26%|██▌       | 527/2048 [07:07<17:53,  1.42it/s]
loss 2.04 accuracy 0.25 -- 56.98 + 56.75 + 505.28 + 4.82 = 623.84:  26%|██▌       | 528/2048 [07:07<18:18,  1.38it/s]
loss 2.03 accuracy 0.25 -- 56.52 + 57.34 + 501.99 + 4.86 = 620.72:  26%|██▌       | 528/2048 [07:08<18:18,  1.38it/s]
loss 2.03 accuracy 0.25 -- 56.52 + 57.34 + 501.99 + 4.86 = 620.72:  26%|██▌       | 529/2048 [07:08<17:59,  1.41it/s]
loss 1.62 accuracy 0.50 -- 163.41 + 57.20 + 496.01 + 4.83 = 721.45:  26%|██▌       | 529/2048 [07:09<17:59,  1.41it/s]
loss 1.62 accuracy 0.50 -- 163.41 + 57.20 + 496.01 + 4.83 = 721.45:  26%|██▌       | 530/2048 [07:09<18:16,  1.38it/s]
loss 2.07 accuracy 0.25 -- 56.28 + 171.67 + 509.30 + 4.83 = 742.08:  26%|██▌       | 530/2048 [07:10<18:16,  1.38it/s]
loss 2.07 accuracy 0.25 -- 56.28 + 171.67 + 509.30 + 4.83 = 742.08:  26%|██▌       | 531/2048 [07:10<18:37,  1.36it/s]
loss 1.56 accuracy 0.44 -- 57.06 + 56.57 + 503.82 + 4.82 = 622.27:  26%|██▌       | 531/2048 [07:10<18:37,  1.36it/s] 
loss 1.56 accuracy 0.44 -- 57.06 + 56.57 + 503.82 + 4.82 = 622.27:  26%|██▌       | 532/2048 [07:10<17:57,  1.41it/s]
loss 2.13 accuracy 0.25 -- 167.54 + 57.13 + 502.52 + 4.83 = 732.03:  26%|██▌       | 532/2048 [07:11<17:57,  1.41it/s]
loss 2.13 accuracy 0.25 -- 167.54 + 57.13 + 502.52 + 4.83 = 732.03:  26%|██▌       | 533/2048 [07:11<18:18,  1.38it/s]
loss 1.84 accuracy 0.44 -- 56.24 + 57.12 + 632.62 + 4.83 = 750.82:  26%|██▌       | 533/2048 [07:12<18:18,  1.38it/s] 
loss 1.84 accuracy 0.44 -- 56.24 + 57.12 + 632.62 + 4.83 = 750.82:  26%|██▌       | 534/2048 [07:12<18:42,  1.35it/s]
loss 2.24 accuracy 0.12 -- 56.93 + 56.34 + 513.43 + 4.83 = 631.52:  26%|██▌       | 534/2048 [07:13<18:42,  1.35it/s]
loss 2.24 accuracy 0.12 -- 56.93 + 56.34 + 513.43 + 4.83 = 631.52:  26%|██▌       | 535/2048 [07:13<18:04,  1.40it/s]
loss 2.89 accuracy 0.19 -- 56.22 + 57.27 + 626.85 + 4.84 = 745.17:  26%|██▌       | 535/2048 [07:13<18:04,  1.40it/s]
loss 2.89 accuracy 0.19 -- 56.22 + 57.27 + 626.85 + 4.84 = 745.17:  26%|██▌       | 536/2048 [07:13<18:29,  1.36it/s]
loss 2.13 accuracy 0.25 -- 56.99 + 56.39 + 508.28 + 4.84 = 626.49:  26%|██▌       | 536/2048 [07:14<18:29,  1.36it/s]
loss 2.13 accuracy 0.25 -- 56.99 + 56.39 + 508.28 + 4.84 = 626.49:  26%|██▌       | 537/2048 [07:14<17:52,  1.41it/s]
loss 2.18 accuracy 0.19 -- 56.29 + 171.17 + 508.77 + 4.84 = 741.06:  26%|██▌       | 537/2048 [07:15<17:52,  1.41it/s]
loss 2.18 accuracy 0.19 -- 56.29 + 171.17 + 508.77 + 4.84 = 741.06:  26%|██▋       | 538/2048 [07:15<18:18,  1.37it/s]
loss 3.52 accuracy 0.12 -- 56.51 + 56.44 + 504.93 + 4.82 = 622.71:  26%|██▋       | 538/2048 [07:15<18:18,  1.37it/s] 
loss 3.52 accuracy 0.12 -- 56.51 + 56.44 + 504.93 + 4.82 = 622.71:  26%|██▋       | 539/2048 [07:15<17:42,  1.42it/s]
loss 2.46 accuracy 0.19 -- 56.91 + 56.53 + 504.45 + 4.86 = 622.75:  26%|██▋       | 539/2048 [07:16<17:42,  1.42it/s]
loss 2.46 accuracy 0.19 -- 56.91 + 56.53 + 504.45 + 4.86 = 622.75:  26%|██▋       | 540/2048 [07:16<18:08,  1.38it/s]
loss 2.50 accuracy 0.25 -- 56.52 + 57.70 + 501.63 + 4.83 = 620.69:  26%|██▋       | 540/2048 [07:17<18:08,  1.38it/s]
loss 2.50 accuracy 0.25 -- 56.52 + 57.70 + 501.63 + 4.83 = 620.69:  26%|██▋       | 541/2048 [07:17<17:34,  1.43it/s]
loss 2.72 accuracy 0.25 -- 162.27 + 57.07 + 495.34 + 4.82 = 719.51:  26%|██▋       | 541/2048 [07:18<17:34,  1.43it/s]
loss 2.72 accuracy 0.25 -- 162.27 + 57.07 + 495.34 + 4.82 = 719.51:  26%|██▋       | 542/2048 [07:18<17:55,  1.40it/s]
loss 2.48 accuracy 0.12 -- 55.80 + 171.39 + 509.40 + 4.83 = 741.42:  26%|██▋       | 542/2048 [07:18<17:55,  1.40it/s]
loss 2.48 accuracy 0.12 -- 55.80 + 171.39 + 509.40 + 4.83 = 741.42:  27%|██▋       | 543/2048 [07:18<18:35,  1.35it/s]
loss 2.19 accuracy 0.38 -- 56.83 + 56.45 + 503.44 + 4.83 = 621.55:  27%|██▋       | 543/2048 [07:19<18:35,  1.35it/s] 
loss 2.19 accuracy 0.38 -- 56.83 + 56.45 + 503.44 + 4.83 = 621.55:  27%|██▋       | 544/2048 [07:19<17:52,  1.40it/s]
loss 1.94 accuracy 0.50 -- 168.21 + 57.13 + 502.94 + 4.82 = 733.10:  27%|██▋       | 544/2048 [07:20<17:52,  1.40it/s]
loss 1.94 accuracy 0.50 -- 168.21 + 57.13 + 502.94 + 4.82 = 733.10:  27%|██▋       | 545/2048 [07:20<18:13,  1.37it/s]
loss 2.00 accuracy 0.25 -- 56.32 + 57.30 + 632.82 + 4.83 = 751.27:  27%|██▋       | 545/2048 [07:21<18:13,  1.37it/s] 
loss 2.00 accuracy 0.25 -- 56.32 + 57.30 + 632.82 + 4.83 = 751.27:  27%|██▋       | 546/2048 [07:21<18:35,  1.35it/s]
loss 2.29 accuracy 0.12 -- 57.09 + 56.51 + 511.20 + 4.83 = 629.62:  27%|██▋       | 546/2048 [07:21<18:35,  1.35it/s]
loss 2.29 accuracy 0.12 -- 57.09 + 56.51 + 511.20 + 4.83 = 629.62:  27%|██▋       | 547/2048 [07:21<17:56,  1.39it/s]
loss 1.98 accuracy 0.25 -- 56.67 + 57.62 + 627.88 + 4.86 = 747.01:  27%|██▋       | 547/2048 [07:22<17:56,  1.39it/s]
loss 1.98 accuracy 0.25 -- 56.67 + 57.62 + 627.88 + 4.86 = 747.01:  27%|██▋       | 548/2048 [07:22<18:21,  1.36it/s]
loss 1.91 accuracy 0.25 -- 56.98 + 56.75 + 507.79 + 4.83 = 626.36:  27%|██▋       | 548/2048 [07:23<18:21,  1.36it/s]
loss 1.91 accuracy 0.25 -- 56.98 + 56.75 + 507.79 + 4.83 = 626.36:  27%|██▋       | 549/2048 [07:23<17:44,  1.41it/s]
loss 2.08 accuracy 0.25 -- 56.62 + 171.07 + 508.62 + 4.83 = 741.13:  27%|██▋       | 549/2048 [07:23<17:44,  1.41it/s]
loss 2.08 accuracy 0.25 -- 56.62 + 171.07 + 508.62 + 4.83 = 741.13:  27%|██▋       | 550/2048 [07:23<18:26,  1.35it/s]
loss 1.87 accuracy 0.38 -- 56.15 + 56.79 + 505.88 + 4.82 = 623.63:  27%|██▋       | 550/2048 [07:24<18:26,  1.35it/s] 
loss 1.87 accuracy 0.38 -- 56.15 + 56.79 + 505.88 + 4.82 = 623.63:  27%|██▋       | 551/2048 [07:24<17:46,  1.40it/s]
loss 2.41 accuracy 0.25 -- 57.21 + 57.26 + 505.72 + 4.83 = 625.02:  27%|██▋       | 551/2048 [07:25<17:46,  1.40it/s]
loss 2.41 accuracy 0.25 -- 57.21 + 57.26 + 505.72 + 4.83 = 625.02:  27%|██▋       | 552/2048 [07:25<18:08,  1.37it/s]
loss 2.11 accuracy 0.19 -- 56.36 + 57.51 + 501.38 + 4.84 = 620.08:  27%|██▋       | 552/2048 [07:25<18:08,  1.37it/s]
loss 2.11 accuracy 0.19 -- 56.36 + 57.51 + 501.38 + 4.84 = 620.08:  27%|██▋       | 553/2048 [07:25<17:32,  1.42it/s]
loss 2.07 accuracy 0.19 -- 163.49 + 57.97 + 496.01 + 4.82 = 722.29:  27%|██▋       | 553/2048 [07:26<17:32,  1.42it/s]
loss 2.07 accuracy 0.19 -- 163.49 + 57.97 + 496.01 + 4.82 = 722.29:  27%|██▋       | 554/2048 [07:26<17:52,  1.39it/s]
loss 2.13 accuracy 0.25 -- 56.33 + 172.09 + 508.22 + 4.83 = 741.47:  27%|██▋       | 554/2048 [07:27<17:52,  1.39it/s]
loss 2.13 accuracy 0.25 -- 56.33 + 172.09 + 508.22 + 4.83 = 741.47:  27%|██▋       | 555/2048 [07:27<18:14,  1.36it/s]
loss 1.99 accuracy 0.25 -- 56.97 + 56.60 + 504.37 + 4.82 = 622.76:  27%|██▋       | 555/2048 [07:28<18:14,  1.36it/s] 
loss 1.99 accuracy 0.25 -- 56.97 + 56.60 + 504.37 + 4.82 = 622.76:  27%|██▋       | 556/2048 [07:28<17:36,  1.41it/s]
loss 1.75 accuracy 0.31 -- 166.92 + 57.50 + 504.55 + 4.82 = 733.78:  27%|██▋       | 556/2048 [07:28<17:36,  1.41it/s]
loss 1.75 accuracy 0.31 -- 166.92 + 57.50 + 504.55 + 4.82 = 733.78:  27%|██▋       | 557/2048 [07:28<18:15,  1.36it/s]
loss 2.44 accuracy 0.25 -- 56.52 + 57.35 + 631.24 + 4.82 = 749.93:  27%|██▋       | 557/2048 [07:29<18:15,  1.36it/s] 
loss 2.44 accuracy 0.25 -- 56.52 + 57.35 + 631.24 + 4.82 = 749.93:  27%|██▋       | 558/2048 [07:29<18:33,  1.34it/s]
loss 1.81 accuracy 0.44 -- 57.36 + 56.60 + 511.67 + 4.84 = 630.47:  27%|██▋       | 558/2048 [07:30<18:33,  1.34it/s]
loss 1.81 accuracy 0.44 -- 57.36 + 56.60 + 511.67 + 4.84 = 630.47:  27%|██▋       | 559/2048 [07:30<17:52,  1.39it/s]
loss 2.57 accuracy 0.06 -- 56.53 + 57.74 + 626.47 + 4.82 = 745.55:  27%|██▋       | 559/2048 [07:31<17:52,  1.39it/s]
loss 2.57 accuracy 0.06 -- 56.53 + 57.74 + 626.47 + 4.82 = 745.55:  27%|██▋       | 560/2048 [07:31<18:15,  1.36it/s]
loss 2.18 accuracy 0.19 -- 56.99 + 56.77 + 507.75 + 4.86 = 626.37:  27%|██▋       | 560/2048 [07:31<18:15,  1.36it/s]
loss 2.18 accuracy 0.19 -- 56.99 + 56.77 + 507.75 + 4.86 = 626.37:  27%|██▋       | 561/2048 [07:31<17:38,  1.41it/s]
loss 2.08 accuracy 0.31 -- 56.73 + 171.46 + 507.24 + 4.85 = 740.28:  27%|██▋       | 561/2048 [07:32<17:38,  1.41it/s]
loss 2.08 accuracy 0.31 -- 56.73 + 171.46 + 507.24 + 4.85 = 740.28:  27%|██▋       | 562/2048 [07:32<18:02,  1.37it/s]
loss 1.74 accuracy 0.50 -- 56.94 + 56.74 + 506.91 + 4.83 = 625.41:  27%|██▋       | 562/2048 [07:33<18:02,  1.37it/s] 
loss 1.74 accuracy 0.50 -- 56.94 + 56.74 + 506.91 + 4.83 = 625.41:  27%|██▋       | 563/2048 [07:33<17:27,  1.42it/s]
loss 2.17 accuracy 0.06 -- 57.08 + 56.56 + 504.61 + 4.86 = 623.11:  27%|██▋       | 563/2048 [07:34<17:27,  1.42it/s]
loss 2.17 accuracy 0.06 -- 57.08 + 56.56 + 504.61 + 4.86 = 623.11:  28%|██▊       | 564/2048 [07:34<18:08,  1.36it/s]
loss 1.86 accuracy 0.19 -- 56.58 + 56.93 + 500.44 + 4.85 = 618.80:  28%|██▊       | 564/2048 [07:34<18:08,  1.36it/s]
loss 1.86 accuracy 0.19 -- 56.58 + 56.93 + 500.44 + 4.85 = 618.80:  28%|██▊       | 565/2048 [07:34<17:28,  1.41it/s]
loss 2.32 accuracy 0.25 -- 162.70 + 57.00 + 494.58 + 4.82 = 719.11:  28%|██▊       | 565/2048 [07:35<17:28,  1.41it/s]
loss 2.32 accuracy 0.25 -- 162.70 + 57.00 + 494.58 + 4.82 = 719.11:  28%|██▊       | 566/2048 [07:35<17:45,  1.39it/s]
loss 2.66 accuracy 0.12 -- 55.92 + 171.56 + 507.27 + 4.83 = 739.57:  28%|██▊       | 566/2048 [07:36<17:45,  1.39it/s]
loss 2.66 accuracy 0.12 -- 55.92 + 171.56 + 507.27 + 4.83 = 739.57:  28%|██▊       | 567/2048 [07:36<18:05,  1.36it/s]
loss 2.11 accuracy 0.38 -- 56.95 + 56.43 + 505.85 + 4.80 = 624.04:  28%|██▊       | 567/2048 [07:36<18:05,  1.36it/s] 
loss 2.11 accuracy 0.38 -- 56.95 + 56.43 + 505.85 + 4.80 = 624.04:  28%|██▊       | 568/2048 [07:36<17:28,  1.41it/s]
loss 2.50 accuracy 0.19 -- 167.86 + 57.04 + 502.96 + 4.87 = 732.74:  28%|██▊       | 568/2048 [07:37<17:28,  1.41it/s]
loss 2.50 accuracy 0.19 -- 167.86 + 57.04 + 502.96 + 4.87 = 732.74:  28%|██▊       | 569/2048 [07:37<17:50,  1.38it/s]
loss 2.30 accuracy 0.25 -- 56.47 + 57.60 + 632.25 + 4.84 = 751.15:  28%|██▊       | 569/2048 [07:38<17:50,  1.38it/s] 
loss 2.30 accuracy 0.25 -- 56.47 + 57.60 + 632.25 + 4.84 = 751.15:  28%|██▊       | 570/2048 [07:38<18:14,  1.35it/s]
loss 2.02 accuracy 0.44 -- 57.17 + 56.75 + 510.60 + 4.84 = 629.36:  28%|██▊       | 570/2048 [07:39<18:14,  1.35it/s]
loss 2.02 accuracy 0.44 -- 57.17 + 56.75 + 510.60 + 4.84 = 629.36:  28%|██▊       | 571/2048 [07:39<17:52,  1.38it/s]
loss 2.75 accuracy 0.19 -- 56.36 + 57.16 + 626.04 + 4.84 = 744.39:  28%|██▊       | 571/2048 [07:39<17:52,  1.38it/s]
loss 2.75 accuracy 0.19 -- 56.36 + 57.16 + 626.04 + 4.84 = 744.39:  28%|██▊       | 572/2048 [07:39<18:11,  1.35it/s]
loss 1.93 accuracy 0.25 -- 57.28 + 56.81 + 507.60 + 4.82 = 626.51:  28%|██▊       | 572/2048 [07:40<18:11,  1.35it/s]
loss 1.93 accuracy 0.25 -- 57.28 + 56.81 + 507.60 + 4.82 = 626.51:  28%|██▊       | 573/2048 [07:40<17:33,  1.40it/s]
loss 2.71 accuracy 0.19 -- 56.76 + 171.41 + 510.33 + 4.82 = 743.31:  28%|██▊       | 573/2048 [07:41<17:33,  1.40it/s]
loss 2.71 accuracy 0.19 -- 56.76 + 171.41 + 510.33 + 4.82 = 743.31:  28%|██▊       | 574/2048 [07:41<17:57,  1.37it/s]
loss 1.97 accuracy 0.12 -- 56.43 + 56.47 + 505.95 + 4.81 = 623.66:  28%|██▊       | 574/2048 [07:41<17:57,  1.37it/s] 
loss 1.97 accuracy 0.12 -- 56.43 + 56.47 + 505.95 + 4.81 = 623.66:  28%|██▊       | 575/2048 [07:41<17:21,  1.41it/s]
loss 2.22 accuracy 0.12 -- 57.20 + 56.51 + 505.21 + 4.84 = 623.77:  28%|██▊       | 575/2048 [07:42<17:21,  1.41it/s]
loss 2.22 accuracy 0.12 -- 57.20 + 56.51 + 505.21 + 4.84 = 623.77:  28%|██▊       | 576/2048 [07:42<17:45,  1.38it/s]
loss 1.95 accuracy 0.31 -- 56.32 + 57.27 + 502.58 + 4.82 = 620.99:  28%|██▊       | 576/2048 [07:43<17:45,  1.38it/s]
loss 1.95 accuracy 0.31 -- 56.32 + 57.27 + 502.58 + 4.82 = 620.99:  28%|██▊       | 577/2048 [07:43<17:11,  1.43it/s]
loss 2.00 accuracy 0.31 -- 162.49 + 56.98 + 495.67 + 4.81 = 719.94:  28%|██▊       | 577/2048 [07:44<17:11,  1.43it/s]
loss 2.00 accuracy 0.31 -- 162.49 + 56.98 + 495.67 + 4.81 = 719.94:  28%|██▊       | 578/2048 [07:44<17:30,  1.40it/s]
loss 2.02 accuracy 0.12 -- 55.95 + 171.05 + 508.67 + 4.83 = 740.50:  28%|██▊       | 578/2048 [07:44<17:30,  1.40it/s]
loss 2.02 accuracy 0.12 -- 55.95 + 171.05 + 508.67 + 4.83 = 740.50:  28%|██▊       | 579/2048 [07:44<18:09,  1.35it/s]
loss 2.50 accuracy 0.19 -- 57.71 + 56.69 + 502.92 + 4.86 = 622.18:  28%|██▊       | 579/2048 [07:45<18:09,  1.35it/s] 
loss 2.50 accuracy 0.19 -- 57.71 + 56.69 + 502.92 + 4.86 = 622.18:  28%|██▊       | 580/2048 [07:45<17:27,  1.40it/s]
loss 2.30 accuracy 0.38 -- 168.37 + 57.40 + 502.13 + 4.82 = 732.73:  28%|██▊       | 580/2048 [07:46<17:27,  1.40it/s]
loss 2.30 accuracy 0.38 -- 168.37 + 57.40 + 502.13 + 4.82 = 732.73:  28%|██▊       | 581/2048 [07:46<17:47,  1.37it/s]
loss 1.89 accuracy 0.19 -- 56.70 + 57.25 + 634.05 + 4.82 = 752.81:  28%|██▊       | 581/2048 [07:47<17:47,  1.37it/s] 
loss 1.89 accuracy 0.19 -- 56.70 + 57.25 + 634.05 + 4.82 = 752.81:  28%|██▊       | 582/2048 [07:47<18:09,  1.35it/s]
loss 2.47 accuracy 0.19 -- 56.78 + 56.18 + 510.58 + 4.83 = 628.37:  28%|██▊       | 582/2048 [07:47<18:09,  1.35it/s]
loss 2.47 accuracy 0.19 -- 56.78 + 56.18 + 510.58 + 4.83 = 628.37:  28%|██▊       | 583/2048 [07:47<17:30,  1.39it/s]
loss 2.32 accuracy 0.19 -- 56.33 + 57.35 + 626.43 + 4.84 = 744.94:  28%|██▊       | 583/2048 [07:48<17:30,  1.39it/s]
loss 2.32 accuracy 0.19 -- 56.33 + 57.35 + 626.43 + 4.84 = 744.94:  29%|██▊       | 584/2048 [07:48<17:53,  1.36it/s]
loss 2.17 accuracy 0.31 -- 57.11 + 56.72 + 509.83 + 4.82 = 628.47:  29%|██▊       | 584/2048 [07:49<17:53,  1.36it/s]
loss 2.17 accuracy 0.31 -- 57.11 + 56.72 + 509.83 + 4.82 = 628.47:  29%|██▊       | 585/2048 [07:49<17:18,  1.41it/s]
loss 2.15 accuracy 0.31 -- 56.58 + 170.50 + 506.99 + 4.85 = 738.92:  29%|██▊       | 585/2048 [07:49<17:18,  1.41it/s]
loss 2.15 accuracy 0.31 -- 56.58 + 170.50 + 506.99 + 4.85 = 738.92:  29%|██▊       | 586/2048 [07:49<17:58,  1.36it/s]
loss 2.18 accuracy 0.25 -- 56.52 + 56.35 + 505.09 + 4.82 = 622.78:  29%|██▊       | 586/2048 [07:50<17:58,  1.36it/s] 
loss 2.18 accuracy 0.25 -- 56.52 + 56.35 + 505.09 + 4.82 = 622.78:  29%|██▊       | 587/2048 [07:50<17:19,  1.41it/s]
loss 1.87 accuracy 0.31 -- 57.31 + 56.55 + 504.17 + 4.82 = 622.84:  29%|██▊       | 587/2048 [07:51<17:19,  1.41it/s]
loss 1.87 accuracy 0.31 -- 57.31 + 56.55 + 504.17 + 4.82 = 622.84:  29%|██▊       | 588/2048 [07:51<17:40,  1.38it/s]
loss 2.03 accuracy 0.06 -- 56.34 + 57.11 + 500.76 + 4.82 = 619.03:  29%|██▊       | 588/2048 [07:52<17:40,  1.38it/s]
loss 2.03 accuracy 0.06 -- 56.34 + 57.11 + 500.76 + 4.82 = 619.03:  29%|██▉       | 589/2048 [07:52<17:04,  1.42it/s]
loss 2.17 accuracy 0.19 -- 162.41 + 56.89 + 495.86 + 4.82 = 719.98:  29%|██▉       | 589/2048 [07:52<17:04,  1.42it/s]
loss 2.17 accuracy 0.19 -- 162.41 + 56.89 + 495.86 + 4.82 = 719.98:  29%|██▉       | 590/2048 [07:52<17:23,  1.40it/s]
loss 2.01 accuracy 0.31 -- 56.51 + 172.36 + 507.11 + 4.83 = 740.81:  29%|██▉       | 590/2048 [07:53<17:23,  1.40it/s]
loss 2.01 accuracy 0.31 -- 56.51 + 172.36 + 507.11 + 4.83 = 740.81:  29%|██▉       | 591/2048 [07:53<17:45,  1.37it/s]
loss 2.26 accuracy 0.12 -- 56.74 + 56.77 + 503.50 + 4.83 = 621.83:  29%|██▉       | 591/2048 [07:54<17:45,  1.37it/s] 
loss 2.26 accuracy 0.12 -- 56.74 + 56.77 + 503.50 + 4.83 = 621.83:  29%|██▉       | 592/2048 [07:54<17:08,  1.41it/s]
loss 1.97 accuracy 0.12 -- 167.37 + 56.64 + 502.75 + 4.82 = 731.58:  29%|██▉       | 592/2048 [07:54<17:08,  1.41it/s]
loss 1.97 accuracy 0.12 -- 167.37 + 56.64 + 502.75 + 4.82 = 731.58:  29%|██▉       | 593/2048 [07:54<17:46,  1.36it/s]
loss 2.21 accuracy 0.12 -- 56.21 + 57.70 + 631.20 + 4.84 = 749.96:  29%|██▉       | 593/2048 [07:55<17:46,  1.36it/s] 
loss 2.21 accuracy 0.12 -- 56.21 + 57.70 + 631.20 + 4.84 = 749.96:  29%|██▉       | 594/2048 [07:55<18:04,  1.34it/s]
loss 2.11 accuracy 0.06 -- 57.20 + 56.49 + 512.21 + 4.82 = 630.72:  29%|██▉       | 594/2048 [07:56<18:04,  1.34it/s]
loss 2.11 accuracy 0.06 -- 57.20 + 56.49 + 512.21 + 4.82 = 630.72:  29%|██▉       | 595/2048 [07:56<17:25,  1.39it/s]
loss 2.32 accuracy 0.19 -- 56.16 + 57.28 + 630.51 + 4.82 = 748.77:  29%|██▉       | 595/2048 [07:57<17:25,  1.39it/s]
loss 2.32 accuracy 0.19 -- 56.16 + 57.28 + 630.51 + 4.82 = 748.77:  29%|██▉       | 596/2048 [07:57<17:49,  1.36it/s]
loss 2.30 accuracy 0.19 -- 57.09 + 56.59 + 507.70 + 4.82 = 626.21:  29%|██▉       | 596/2048 [07:57<17:49,  1.36it/s]
loss 2.30 accuracy 0.19 -- 57.09 + 56.59 + 507.70 + 4.82 = 626.21:  29%|██▉       | 597/2048 [07:57<17:12,  1.41it/s]
loss 2.32 accuracy 0.25 -- 56.45 + 171.50 + 506.60 + 4.86 = 739.41:  29%|██▉       | 597/2048 [07:58<17:12,  1.41it/s]
loss 2.32 accuracy 0.25 -- 56.45 + 171.50 + 506.60 + 4.86 = 739.41:  29%|██▉       | 598/2048 [07:58<17:35,  1.37it/s]
loss 2.24 accuracy 0.31 -- 56.75 + 56.69 + 506.08 + 4.81 = 624.34:  29%|██▉       | 598/2048 [07:59<17:35,  1.37it/s] 
loss 2.24 accuracy 0.31 -- 56.75 + 56.69 + 506.08 + 4.81 = 624.34:  29%|██▉       | 599/2048 [07:59<17:01,  1.42it/s]
loss 1.73 accuracy 0.44 -- 56.91 + 56.53 + 504.64 + 4.81 = 622.90:  29%|██▉       | 599/2048 [08:00<17:01,  1.42it/s]
loss 1.73 accuracy 0.44 -- 56.91 + 56.53 + 504.64 + 4.81 = 622.90:  29%|██▉       | 600/2048 [08:00<17:26,  1.38it/s]
loss 1.88 accuracy 0.38 -- 56.58 + 57.36 + 501.22 + 4.83 = 619.99:  29%|██▉       | 600/2048 [08:00<17:26,  1.38it/s]
loss 1.88 accuracy 0.38 -- 56.58 + 57.36 + 501.22 + 4.83 = 619.99:  29%|██▉       | 601/2048 [08:00<16:52,  1.43it/s]
loss 1.93 accuracy 0.31 -- 163.11 + 57.22 + 496.37 + 4.82 = 721.52:  29%|██▉       | 601/2048 [08:01<16:52,  1.43it/s]
loss 1.93 accuracy 0.31 -- 163.11 + 57.22 + 496.37 + 4.82 = 721.52:  29%|██▉       | 602/2048 [08:01<17:13,  1.40it/s]
loss 1.89 accuracy 0.38 -- 55.89 + 191.29 + 557.43 + 5.10 = 809.71:  29%|██▉       | 602/2048 [08:02<17:13,  1.40it/s]
loss 1.89 accuracy 0.38 -- 55.89 + 191.29 + 557.43 + 5.10 = 809.71:  29%|██▉       | 603/2048 [08:02<18:05,  1.33it/s]
loss 1.99 accuracy 0.19 -- 57.71 + 59.25 + 542.88 + 5.03 = 664.87:  29%|██▉       | 603/2048 [08:02<18:05,  1.33it/s] 
loss 1.99 accuracy 0.19 -- 57.71 + 59.25 + 542.88 + 5.03 = 664.87:  29%|██▉       | 604/2048 [08:02<17:40,  1.36it/s]
loss 2.46 accuracy 0.12 -- 187.78 + 58.21 + 540.41 + 5.03 = 791.44:  29%|██▉       | 604/2048 [08:03<17:40,  1.36it/s]
loss 2.46 accuracy 0.12 -- 187.78 + 58.21 + 540.41 + 5.03 = 791.44:  30%|██▉       | 605/2048 [08:03<18:17,  1.31it/s]
loss 2.19 accuracy 0.31 -- 58.11 + 59.82 + 669.05 + 5.02 = 791.99:  30%|██▉       | 605/2048 [08:04<18:17,  1.31it/s] 
loss 2.19 accuracy 0.31 -- 58.11 + 59.82 + 669.05 + 5.02 = 791.99:  30%|██▉       | 606/2048 [08:04<18:43,  1.28it/s]
loss 1.83 accuracy 0.19 -- 58.78 + 57.32 + 547.80 + 5.05 = 668.95:  30%|██▉       | 606/2048 [08:05<18:43,  1.28it/s]
loss 1.83 accuracy 0.19 -- 58.78 + 57.32 + 547.80 + 5.05 = 668.95:  30%|██▉       | 607/2048 [08:05<18:08,  1.32it/s]
loss 2.00 accuracy 0.25 -- 58.88 + 60.82 + 687.05 + 5.01 = 811.77:  30%|██▉       | 607/2048 [08:06<18:08,  1.32it/s]
loss 2.00 accuracy 0.25 -- 58.88 + 60.82 + 687.05 + 5.01 = 811.77:  30%|██▉       | 608/2048 [08:06<18:44,  1.28it/s]
loss 1.80 accuracy 0.31 -- 59.21 + 59.23 + 551.34 + 5.06 = 674.83:  30%|██▉       | 608/2048 [08:06<18:44,  1.28it/s]
loss 1.80 accuracy 0.31 -- 59.21 + 59.23 + 551.34 + 5.06 = 674.83:  30%|██▉       | 609/2048 [08:06<18:11,  1.32it/s]
loss 1.89 accuracy 0.25 -- 58.51 + 188.35 + 546.38 + 4.89 = 798.13:  30%|██▉       | 609/2048 [08:07<18:11,  1.32it/s]
loss 1.89 accuracy 0.25 -- 58.51 + 188.35 + 546.38 + 4.89 = 798.13:  30%|██▉       | 610/2048 [08:07<18:40,  1.28it/s]
loss 2.12 accuracy 0.12 -- 56.53 + 56.99 + 508.78 + 4.83 = 627.13:  30%|██▉       | 610/2048 [08:08<18:40,  1.28it/s] 
loss 2.12 accuracy 0.12 -- 56.53 + 56.99 + 508.78 + 4.83 = 627.13:  30%|██▉       | 611/2048 [08:08<17:46,  1.35it/s]
loss 1.94 accuracy 0.44 -- 56.91 + 56.39 + 506.03 + 4.84 = 624.18:  30%|██▉       | 611/2048 [08:09<17:46,  1.35it/s]
loss 1.94 accuracy 0.44 -- 56.91 + 56.39 + 506.03 + 4.84 = 624.18:  30%|██▉       | 612/2048 [08:09<17:56,  1.33it/s]
loss 2.62 accuracy 0.25 -- 56.05 + 57.15 + 501.24 + 4.84 = 619.28:  30%|██▉       | 612/2048 [08:09<17:56,  1.33it/s]
loss 2.62 accuracy 0.25 -- 56.05 + 57.15 + 501.24 + 4.84 = 619.28:  30%|██▉       | 613/2048 [08:09<17:11,  1.39it/s]
loss 1.67 accuracy 0.44 -- 162.55 + 56.92 + 495.63 + 4.81 = 719.91:  30%|██▉       | 613/2048 [08:10<17:11,  1.39it/s]
loss 1.67 accuracy 0.44 -- 162.55 + 56.92 + 495.63 + 4.81 = 719.91:  30%|██▉       | 614/2048 [08:10<17:23,  1.37it/s]
loss 1.98 accuracy 0.06 -- 56.17 + 171.38 + 508.90 + 4.83 = 741.27:  30%|██▉       | 614/2048 [08:11<17:23,  1.37it/s]
loss 1.98 accuracy 0.06 -- 56.17 + 171.38 + 508.90 + 4.83 = 741.27:  30%|███       | 615/2048 [08:11<17:40,  1.35it/s]
loss 2.12 accuracy 0.06 -- 56.62 + 56.66 + 503.69 + 4.82 = 621.79:  30%|███       | 615/2048 [08:11<17:40,  1.35it/s] 
loss 2.12 accuracy 0.06 -- 56.62 + 56.66 + 503.69 + 4.82 = 621.79:  30%|███       | 616/2048 [08:11<17:01,  1.40it/s]
loss 1.56 accuracy 0.44 -- 168.16 + 57.16 + 503.86 + 4.83 = 734.02:  30%|███       | 616/2048 [08:12<17:01,  1.40it/s]
loss 1.56 accuracy 0.44 -- 168.16 + 57.16 + 503.86 + 4.83 = 734.02:  30%|███       | 617/2048 [08:12<17:21,  1.37it/s]
loss 2.10 accuracy 0.25 -- 56.12 + 57.50 + 632.01 + 4.82 = 750.45:  30%|███       | 617/2048 [08:13<17:21,  1.37it/s] 
loss 2.10 accuracy 0.25 -- 56.12 + 57.50 + 632.01 + 4.82 = 750.45:  30%|███       | 618/2048 [08:13<17:42,  1.35it/s]
loss 2.16 accuracy 0.12 -- 57.09 + 56.32 + 512.64 + 4.81 = 630.86:  30%|███       | 618/2048 [08:14<17:42,  1.35it/s]
loss 2.16 accuracy 0.12 -- 57.09 + 56.32 + 512.64 + 4.81 = 630.86:  30%|███       | 619/2048 [08:14<17:05,  1.39it/s]
loss 2.43 accuracy 0.06 -- 56.60 + 57.42 + 626.97 + 4.83 = 745.82:  30%|███       | 619/2048 [08:14<17:05,  1.39it/s]
loss 2.43 accuracy 0.06 -- 56.60 + 57.42 + 626.97 + 4.83 = 745.82:  30%|███       | 620/2048 [08:14<17:28,  1.36it/s]
loss 2.27 accuracy 0.12 -- 56.74 + 56.66 + 507.89 + 4.82 = 626.11:  30%|███       | 620/2048 [08:15<17:28,  1.36it/s]
loss 2.27 accuracy 0.12 -- 56.74 + 56.66 + 507.89 + 4.82 = 626.11:  30%|███       | 621/2048 [08:15<16:53,  1.41it/s]
loss 1.86 accuracy 0.38 -- 56.27 + 171.09 + 507.02 + 4.82 = 739.20:  30%|███       | 621/2048 [08:16<16:53,  1.41it/s]
loss 1.86 accuracy 0.38 -- 56.27 + 171.09 + 507.02 + 4.82 = 739.20:  30%|███       | 622/2048 [08:16<17:17,  1.37it/s]
loss 1.89 accuracy 0.31 -- 56.15 + 56.39 + 508.44 + 4.82 = 625.81:  30%|███       | 622/2048 [08:16<17:17,  1.37it/s] 
loss 1.89 accuracy 0.31 -- 56.15 + 56.39 + 508.44 + 4.82 = 625.81:  30%|███       | 623/2048 [08:16<16:45,  1.42it/s]
loss 1.76 accuracy 0.38 -- 57.01 + 56.43 + 505.48 + 4.82 = 623.74:  30%|███       | 623/2048 [08:17<16:45,  1.42it/s]
loss 1.76 accuracy 0.38 -- 57.01 + 56.43 + 505.48 + 4.82 = 623.74:  30%|███       | 624/2048 [08:17<17:09,  1.38it/s]
loss 1.95 accuracy 0.25 -- 56.54 + 57.13 + 500.71 + 4.81 = 619.19:  30%|███       | 624/2048 [08:18<17:09,  1.38it/s]
loss 1.95 accuracy 0.25 -- 56.54 + 57.13 + 500.71 + 4.81 = 619.19:  31%|███       | 625/2048 [08:18<16:35,  1.43it/s]
loss 2.25 accuracy 0.19 -- 162.57 + 56.92 + 494.94 + 4.80 = 719.24:  31%|███       | 625/2048 [08:19<16:35,  1.43it/s]
loss 2.25 accuracy 0.19 -- 162.57 + 56.92 + 494.94 + 4.80 = 719.24:  31%|███       | 626/2048 [08:19<16:55,  1.40it/s]
loss 1.95 accuracy 0.38 -- 56.02 + 170.75 + 507.07 + 4.83 = 738.67:  31%|███       | 626/2048 [08:19<16:55,  1.40it/s]
loss 1.95 accuracy 0.38 -- 56.02 + 170.75 + 507.07 + 4.83 = 738.67:  31%|███       | 627/2048 [08:19<17:16,  1.37it/s]
loss 2.49 accuracy 0.06 -- 56.86 + 56.52 + 503.57 + 4.81 = 621.77:  31%|███       | 627/2048 [08:20<17:16,  1.37it/s] 
loss 2.49 accuracy 0.06 -- 56.86 + 56.52 + 503.57 + 4.81 = 621.77:  31%|███       | 628/2048 [08:20<16:42,  1.42it/s]
loss 1.81 accuracy 0.50 -- 167.39 + 57.03 + 502.37 + 4.81 = 731.61:  31%|███       | 628/2048 [08:21<16:42,  1.42it/s]
loss 1.81 accuracy 0.50 -- 167.39 + 57.03 + 502.37 + 4.81 = 731.61:  31%|███       | 629/2048 [08:21<17:04,  1.39it/s]
loss 1.75 accuracy 0.31 -- 56.08 + 57.07 + 631.59 + 4.81 = 749.56:  31%|███       | 629/2048 [08:22<17:04,  1.39it/s] 
loss 1.75 accuracy 0.31 -- 56.08 + 57.07 + 631.59 + 4.81 = 749.56:  31%|███       | 630/2048 [08:22<17:27,  1.35it/s]
loss 2.37 accuracy 0.25 -- 56.71 + 56.28 + 512.52 + 4.84 = 630.35:  31%|███       | 630/2048 [08:22<17:27,  1.35it/s]
loss 2.37 accuracy 0.25 -- 56.71 + 56.28 + 512.52 + 4.84 = 630.35:  31%|███       | 631/2048 [08:22<16:52,  1.40it/s]
loss 2.16 accuracy 0.12 -- 56.47 + 57.36 + 683.41 + 4.82 = 802.06:  31%|███       | 631/2048 [08:23<16:52,  1.40it/s]
loss 2.16 accuracy 0.12 -- 56.47 + 57.36 + 683.41 + 4.82 = 802.06:  31%|███       | 632/2048 [08:23<17:40,  1.33it/s]
loss 2.08 accuracy 0.25 -- 57.19 + 56.49 + 562.20 + 4.80 = 680.68:  31%|███       | 632/2048 [08:24<17:40,  1.33it/s]
loss 2.08 accuracy 0.25 -- 57.19 + 56.49 + 562.20 + 4.80 = 680.68:  31%|███       | 633/2048 [08:24<17:22,  1.36it/s]
loss 1.95 accuracy 0.25 -- 56.36 + 221.20 + 503.43 + 4.79 = 785.78:  31%|███       | 633/2048 [08:25<17:22,  1.36it/s]
loss 1.95 accuracy 0.25 -- 56.36 + 221.20 + 503.43 + 4.79 = 785.78:  31%|███       | 634/2048 [08:25<17:54,  1.32it/s]
loss 2.09 accuracy 0.25 -- 56.18 + 56.50 + 498.83 + 4.77 = 616.28:  31%|███       | 634/2048 [08:25<17:54,  1.32it/s] 
loss 2.09 accuracy 0.25 -- 56.18 + 56.50 + 498.83 + 4.77 = 616.28:  31%|███       | 635/2048 [08:25<17:05,  1.38it/s]
loss 1.89 accuracy 0.31 -- 56.62 + 56.21 + 498.20 + 4.81 = 615.83:  31%|███       | 635/2048 [08:26<17:05,  1.38it/s]
loss 1.89 accuracy 0.31 -- 56.62 + 56.21 + 498.20 + 4.81 = 615.83:  31%|███       | 636/2048 [08:26<17:15,  1.36it/s]
loss 2.11 accuracy 0.25 -- 55.97 + 57.44 + 495.70 + 4.79 = 613.91:  31%|███       | 636/2048 [08:27<17:15,  1.36it/s]
loss 2.11 accuracy 0.25 -- 55.97 + 57.44 + 495.70 + 4.79 = 613.91:  31%|███       | 637/2048 [08:27<16:35,  1.42it/s]
loss 1.64 accuracy 0.50 -- 157.50 + 56.86 + 490.79 + 4.78 = 709.93:  31%|███       | 637/2048 [08:27<16:35,  1.42it/s]
loss 1.64 accuracy 0.50 -- 157.50 + 56.86 + 490.79 + 4.78 = 709.93:  31%|███       | 638/2048 [08:27<16:48,  1.40it/s]
loss 2.20 accuracy 0.25 -- 55.86 + 166.91 + 501.83 + 4.79 = 729.38:  31%|███       | 638/2048 [08:28<16:48,  1.40it/s]
loss 2.20 accuracy 0.25 -- 55.86 + 166.91 + 501.83 + 4.79 = 729.38:  31%|███       | 639/2048 [08:28<17:05,  1.37it/s]
loss 1.65 accuracy 0.44 -- 56.45 + 56.47 + 496.86 + 4.77 = 614.55:  31%|███       | 639/2048 [08:29<17:05,  1.37it/s] 
loss 1.65 accuracy 0.44 -- 56.45 + 56.47 + 496.86 + 4.77 = 614.55:  31%|███▏      | 640/2048 [08:29<16:29,  1.42it/s]
loss 2.01 accuracy 0.31 -- 162.05 + 57.27 + 497.82 + 4.78 = 721.93:  31%|███▏      | 640/2048 [08:29<16:29,  1.42it/s]
loss 2.01 accuracy 0.31 -- 162.05 + 57.27 + 497.82 + 4.78 = 721.93:  31%|███▏      | 641/2048 [08:29<16:48,  1.40it/s]
loss 1.52 accuracy 0.56 -- 55.92 + 57.11 + 621.34 + 4.79 = 739.16:  31%|███▏      | 641/2048 [08:30<16:48,  1.40it/s] 
loss 1.52 accuracy 0.56 -- 55.92 + 57.11 + 621.34 + 4.79 = 739.16:  31%|███▏      | 642/2048 [08:30<17:08,  1.37it/s]
loss 2.08 accuracy 0.06 -- 56.86 + 56.41 + 505.30 + 4.77 = 623.34:  31%|███▏      | 642/2048 [08:31<17:08,  1.37it/s]
loss 2.08 accuracy 0.06 -- 56.86 + 56.41 + 505.30 + 4.77 = 623.34:  31%|███▏      | 643/2048 [08:31<16:33,  1.41it/s]
loss 1.94 accuracy 0.00 -- 55.94 + 57.19 + 615.58 + 4.80 = 733.52:  31%|███▏      | 643/2048 [08:32<16:33,  1.41it/s]
loss 1.94 accuracy 0.00 -- 55.94 + 57.19 + 615.58 + 4.80 = 733.52:  31%|███▏      | 644/2048 [08:32<16:55,  1.38it/s]
loss 1.67 accuracy 0.31 -- 56.63 + 56.97 + 502.47 + 4.86 = 620.93:  31%|███▏      | 644/2048 [08:32<16:55,  1.38it/s]
loss 1.67 accuracy 0.31 -- 56.63 + 56.97 + 502.47 + 4.86 = 620.93:  31%|███▏      | 645/2048 [08:32<16:23,  1.43it/s]
loss 2.07 accuracy 0.44 -- 56.49 + 166.39 + 501.02 + 4.76 = 728.66:  31%|███▏      | 645/2048 [08:33<16:23,  1.43it/s]
loss 2.07 accuracy 0.44 -- 56.49 + 166.39 + 501.02 + 4.76 = 728.66:  32%|███▏      | 646/2048 [08:33<16:45,  1.39it/s]
loss 2.24 accuracy 0.25 -- 56.20 + 56.20 + 498.92 + 4.76 = 616.08:  32%|███▏      | 646/2048 [08:34<16:45,  1.39it/s] 
loss 2.24 accuracy 0.25 -- 56.20 + 56.20 + 498.92 + 4.76 = 616.08:  32%|███▏      | 647/2048 [08:34<16:13,  1.44it/s]
loss 1.83 accuracy 0.31 -- 56.92 + 56.21 + 498.46 + 4.77 = 616.37:  32%|███▏      | 647/2048 [08:34<16:13,  1.44it/s]
loss 1.83 accuracy 0.31 -- 56.92 + 56.21 + 498.46 + 4.77 = 616.37:  32%|███▏      | 648/2048 [08:34<16:36,  1.40it/s]
loss 1.61 accuracy 0.44 -- 56.19 + 57.38 + 495.23 + 4.76 = 613.57:  32%|███▏      | 648/2048 [08:35<16:36,  1.40it/s]
loss 1.61 accuracy 0.44 -- 56.19 + 57.38 + 495.23 + 4.76 = 613.57:  32%|███▏      | 649/2048 [08:35<16:06,  1.45it/s]
loss 2.00 accuracy 0.44 -- 158.16 + 56.75 + 490.99 + 4.78 = 710.68:  32%|███▏      | 649/2048 [08:36<16:06,  1.45it/s]
loss 2.00 accuracy 0.44 -- 158.16 + 56.75 + 490.99 + 4.78 = 710.68:  32%|███▏      | 650/2048 [08:36<16:25,  1.42it/s]
loss 2.02 accuracy 0.19 -- 55.67 + 166.22 + 505.11 + 4.77 = 731.78:  32%|███▏      | 650/2048 [08:37<16:25,  1.42it/s]
loss 2.02 accuracy 0.19 -- 55.67 + 166.22 + 505.11 + 4.77 = 731.78:  32%|███▏      | 651/2048 [08:37<16:47,  1.39it/s]
loss 1.81 accuracy 0.31 -- 56.62 + 56.04 + 499.64 + 4.81 = 617.11:  32%|███▏      | 651/2048 [08:37<16:47,  1.39it/s] 
loss 1.81 accuracy 0.31 -- 56.62 + 56.04 + 499.64 + 4.81 = 617.11:  32%|███▏      | 652/2048 [08:37<16:14,  1.43it/s]
loss 1.81 accuracy 0.38 -- 162.53 + 57.28 + 497.72 + 4.78 = 722.31:  32%|███▏      | 652/2048 [08:38<16:14,  1.43it/s]
loss 1.81 accuracy 0.38 -- 162.53 + 57.28 + 497.72 + 4.78 = 722.31:  32%|███▏      | 653/2048 [08:38<16:35,  1.40it/s]
loss 2.09 accuracy 0.31 -- 56.10 + 57.52 + 619.55 + 4.77 = 737.94:  32%|███▏      | 653/2048 [08:39<16:35,  1.40it/s] 
loss 2.09 accuracy 0.31 -- 56.10 + 57.52 + 619.55 + 4.77 = 737.94:  32%|███▏      | 654/2048 [08:39<16:56,  1.37it/s]
loss 1.65 accuracy 0.31 -- 56.64 + 56.80 + 504.67 + 4.78 = 622.88:  32%|███▏      | 654/2048 [08:39<16:56,  1.37it/s]
loss 1.65 accuracy 0.31 -- 56.64 + 56.80 + 504.67 + 4.78 = 622.88:  32%|███▏      | 655/2048 [08:39<16:22,  1.42it/s]
loss 2.08 accuracy 0.19 -- 56.25 + 57.41 + 616.63 + 4.79 = 735.08:  32%|███▏      | 655/2048 [08:40<16:22,  1.42it/s]
loss 2.08 accuracy 0.19 -- 56.25 + 57.41 + 616.63 + 4.79 = 735.08:  32%|███▏      | 656/2048 [08:40<16:45,  1.38it/s]
loss 2.16 accuracy 0.25 -- 57.13 + 57.16 + 503.06 + 4.77 = 622.13:  32%|███▏      | 656/2048 [08:41<16:45,  1.38it/s]
loss 2.16 accuracy 0.25 -- 57.13 + 57.16 + 503.06 + 4.77 = 622.13:  32%|███▏      | 657/2048 [08:41<16:14,  1.43it/s]
loss 2.49 accuracy 0.19 -- 56.42 + 166.19 + 503.90 + 4.77 = 731.28:  32%|███▏      | 657/2048 [08:42<16:14,  1.43it/s]
loss 2.49 accuracy 0.19 -- 56.42 + 166.19 + 503.90 + 4.77 = 731.28:  32%|███▏      | 658/2048 [08:42<16:37,  1.39it/s]
loss 2.04 accuracy 0.25 -- 56.51 + 56.24 + 500.58 + 4.77 = 618.10:  32%|███▏      | 658/2048 [08:42<16:37,  1.39it/s] 
loss 2.04 accuracy 0.25 -- 56.51 + 56.24 + 500.58 + 4.77 = 618.10:  32%|███▏      | 659/2048 [08:42<16:06,  1.44it/s]
loss 1.92 accuracy 0.44 -- 56.53 + 56.25 + 498.66 + 4.77 = 616.22:  32%|███▏      | 659/2048 [08:43<16:06,  1.44it/s]
loss 1.92 accuracy 0.44 -- 56.53 + 56.25 + 498.66 + 4.77 = 616.22:  32%|███▏      | 660/2048 [08:43<16:28,  1.40it/s]
loss 1.65 accuracy 0.50 -- 56.16 + 57.22 + 497.80 + 4.76 = 615.94:  32%|███▏      | 660/2048 [08:44<16:28,  1.40it/s]
loss 1.65 accuracy 0.50 -- 56.16 + 57.22 + 497.80 + 4.76 = 615.94:  32%|███▏      | 661/2048 [08:44<15:59,  1.45it/s]
loss 1.56 accuracy 0.38 -- 157.45 + 57.04 + 489.66 + 4.79 = 708.93:  32%|███▏      | 661/2048 [08:44<15:59,  1.45it/s]
loss 1.56 accuracy 0.38 -- 157.45 + 57.04 + 489.66 + 4.79 = 708.93:  32%|███▏      | 662/2048 [08:44<16:17,  1.42it/s]
loss 2.45 accuracy 0.19 -- 56.27 + 166.72 + 500.84 + 4.76 = 728.58:  32%|███▏      | 662/2048 [08:45<16:17,  1.42it/s]
loss 2.45 accuracy 0.19 -- 56.27 + 166.72 + 500.84 + 4.76 = 728.58:  32%|███▏      | 663/2048 [08:45<16:37,  1.39it/s]
loss 2.27 accuracy 0.12 -- 56.49 + 56.49 + 496.93 + 4.79 = 614.70:  32%|███▏      | 663/2048 [08:46<16:37,  1.39it/s] 
loss 2.27 accuracy 0.12 -- 56.49 + 56.49 + 496.93 + 4.79 = 614.70:  32%|███▏      | 664/2048 [08:46<16:04,  1.44it/s]
loss 1.79 accuracy 0.31 -- 162.74 + 56.84 + 497.18 + 4.77 = 721.53:  32%|███▏      | 664/2048 [08:47<16:04,  1.44it/s]
loss 1.79 accuracy 0.31 -- 162.74 + 56.84 + 497.18 + 4.77 = 721.53:  32%|███▏      | 665/2048 [08:47<16:39,  1.38it/s]
loss 2.26 accuracy 0.38 -- 56.06 + 57.34 + 619.75 + 4.78 = 737.94:  32%|███▏      | 665/2048 [08:47<16:39,  1.38it/s] 
loss 2.26 accuracy 0.38 -- 56.06 + 57.34 + 619.75 + 4.78 = 737.94:  33%|███▎      | 666/2048 [08:47<16:56,  1.36it/s]
loss 2.23 accuracy 0.31 -- 56.88 + 56.33 + 505.18 + 4.78 = 623.17:  33%|███▎      | 666/2048 [08:48<16:56,  1.36it/s]
loss 2.23 accuracy 0.31 -- 56.88 + 56.33 + 505.18 + 4.78 = 623.17:  33%|███▎      | 667/2048 [08:48<16:20,  1.41it/s]
loss 1.77 accuracy 0.25 -- 56.20 + 57.27 + 616.10 + 4.77 = 734.35:  33%|███▎      | 667/2048 [08:49<16:20,  1.41it/s]
loss 1.77 accuracy 0.25 -- 56.20 + 57.27 + 616.10 + 4.77 = 734.35:  33%|███▎      | 668/2048 [08:49<16:40,  1.38it/s]
loss 1.82 accuracy 0.31 -- 56.63 + 56.65 + 501.83 + 4.78 = 619.89:  33%|███▎      | 668/2048 [08:49<16:40,  1.38it/s]
loss 1.82 accuracy 0.31 -- 56.63 + 56.65 + 501.83 + 4.78 = 619.89:  33%|███▎      | 669/2048 [08:49<16:07,  1.42it/s]
loss 2.02 accuracy 0.38 -- 56.30 + 165.93 + 501.09 + 4.78 = 728.10:  33%|███▎      | 669/2048 [08:50<16:07,  1.42it/s]
loss 2.02 accuracy 0.38 -- 56.30 + 165.93 + 501.09 + 4.78 = 728.10:  33%|███▎      | 670/2048 [08:50<16:29,  1.39it/s]
loss 1.76 accuracy 0.12 -- 55.79 + 56.25 + 499.12 + 4.77 = 615.93:  33%|███▎      | 670/2048 [08:51<16:29,  1.39it/s] 
loss 1.76 accuracy 0.12 -- 55.79 + 56.25 + 499.12 + 4.77 = 615.93:  33%|███▎      | 671/2048 [08:51<15:57,  1.44it/s]
loss 1.91 accuracy 0.19 -- 56.48 + 56.43 + 499.61 + 4.77 = 617.30:  33%|███▎      | 671/2048 [08:52<15:57,  1.44it/s]
loss 1.91 accuracy 0.19 -- 56.48 + 56.43 + 499.61 + 4.77 = 617.30:  33%|███▎      | 672/2048 [08:52<16:20,  1.40it/s]
loss 1.54 accuracy 0.50 -- 55.99 + 56.93 + 495.38 + 4.76 = 613.07:  33%|███▎      | 672/2048 [08:52<16:20,  1.40it/s]
loss 1.54 accuracy 0.50 -- 55.99 + 56.93 + 495.38 + 4.76 = 613.07:  33%|███▎      | 673/2048 [08:52<15:49,  1.45it/s]
loss 2.13 accuracy 0.12 -- 157.43 + 57.27 + 490.69 + 4.79 = 710.19:  33%|███▎      | 673/2048 [08:53<15:49,  1.45it/s]
loss 2.13 accuracy 0.12 -- 157.43 + 57.27 + 490.69 + 4.79 = 710.19:  33%|███▎      | 674/2048 [08:53<16:08,  1.42it/s]
loss 2.16 accuracy 0.19 -- 55.62 + 166.29 + 500.51 + 4.78 = 727.20:  33%|███▎      | 674/2048 [08:54<16:08,  1.42it/s]
loss 2.16 accuracy 0.19 -- 55.62 + 166.29 + 500.51 + 4.78 = 727.20:  33%|███▎      | 675/2048 [08:54<16:27,  1.39it/s]
loss 2.35 accuracy 0.19 -- 56.70 + 56.30 + 498.44 + 4.78 = 616.23:  33%|███▎      | 675/2048 [08:54<16:27,  1.39it/s] 
loss 2.35 accuracy 0.19 -- 56.70 + 56.30 + 498.44 + 4.78 = 616.23:  33%|███▎      | 676/2048 [08:54<15:56,  1.44it/s]
loss 2.41 accuracy 0.12 -- 162.49 + 57.06 + 497.41 + 4.79 = 721.75:  33%|███▎      | 676/2048 [08:55<15:56,  1.44it/s]
loss 2.41 accuracy 0.12 -- 162.49 + 57.06 + 497.41 + 4.79 = 721.75:  33%|███▎      | 677/2048 [08:55<16:16,  1.40it/s]
loss 2.25 accuracy 0.19 -- 55.98 + 56.97 + 619.89 + 4.79 = 737.63:  33%|███▎      | 677/2048 [08:56<16:16,  1.40it/s] 
loss 2.25 accuracy 0.19 -- 55.98 + 56.97 + 619.89 + 4.79 = 737.63:  33%|███▎      | 678/2048 [08:56<16:37,  1.37it/s]
loss 1.83 accuracy 0.44 -- 56.67 + 56.45 + 505.59 + 4.78 = 623.49:  33%|███▎      | 678/2048 [08:56<16:37,  1.37it/s]
loss 1.83 accuracy 0.44 -- 56.67 + 56.45 + 505.59 + 4.78 = 623.49:  33%|███▎      | 679/2048 [08:56<16:05,  1.42it/s]
loss 2.09 accuracy 0.25 -- 56.04 + 57.14 + 616.19 + 4.79 = 734.16:  33%|███▎      | 679/2048 [08:57<16:05,  1.42it/s]
loss 2.09 accuracy 0.25 -- 56.04 + 57.14 + 616.19 + 4.79 = 734.16:  33%|███▎      | 680/2048 [08:57<16:27,  1.39it/s]
loss 1.90 accuracy 0.25 -- 56.51 + 56.48 + 514.24 + 4.85 = 632.07:  33%|███▎      | 680/2048 [08:58<16:27,  1.39it/s]
loss 1.90 accuracy 0.25 -- 56.51 + 56.48 + 514.24 + 4.85 = 632.07:  33%|███▎      | 681/2048 [08:58<16:01,  1.42it/s]
loss 1.99 accuracy 0.19 -- 56.91 + 173.12 + 509.27 + 4.86 = 744.15:  33%|███▎      | 681/2048 [08:59<16:01,  1.42it/s]
loss 1.99 accuracy 0.19 -- 56.91 + 173.12 + 509.27 + 4.86 = 744.15:  33%|███▎      | 682/2048 [08:59<16:31,  1.38it/s]
loss 2.28 accuracy 0.25 -- 56.33 + 56.70 + 507.96 + 4.88 = 625.87:  33%|███▎      | 682/2048 [08:59<16:31,  1.38it/s] 
loss 2.28 accuracy 0.25 -- 56.33 + 56.70 + 507.96 + 4.88 = 625.87:  33%|███▎      | 683/2048 [08:59<16:02,  1.42it/s]
loss 1.97 accuracy 0.44 -- 57.66 + 57.24 + 524.21 + 5.00 = 644.11:  33%|███▎      | 683/2048 [09:00<16:02,  1.42it/s]
loss 1.97 accuracy 0.44 -- 57.66 + 57.24 + 524.21 + 5.00 = 644.11:  33%|███▎      | 684/2048 [09:00<16:36,  1.37it/s]
loss 1.98 accuracy 0.31 -- 58.06 + 60.43 + 531.50 + 5.00 = 654.98:  33%|███▎      | 684/2048 [09:01<16:36,  1.37it/s]
loss 1.98 accuracy 0.31 -- 58.06 + 60.43 + 531.50 + 5.00 = 654.98:  33%|███▎      | 685/2048 [09:01<16:17,  1.39it/s]
loss 1.85 accuracy 0.31 -- 170.06 + 57.51 + 521.24 + 5.05 = 753.86:  33%|███▎      | 685/2048 [09:02<16:17,  1.39it/s]
loss 1.85 accuracy 0.31 -- 170.06 + 57.51 + 521.24 + 5.05 = 753.86:  33%|███▎      | 686/2048 [09:02<16:44,  1.36it/s]
loss 2.09 accuracy 0.19 -- 59.14 + 185.63 + 537.29 + 5.08 = 787.14:  33%|███▎      | 686/2048 [09:02<16:44,  1.36it/s]
loss 2.09 accuracy 0.19 -- 59.14 + 185.63 + 537.29 + 5.08 = 787.14:  34%|███▎      | 687/2048 [09:02<17:16,  1.31it/s]
loss 2.50 accuracy 0.19 -- 58.42 + 59.81 + 530.30 + 4.99 = 653.52:  34%|███▎      | 687/2048 [09:03<17:16,  1.31it/s] 
loss 2.50 accuracy 0.19 -- 58.42 + 59.81 + 530.30 + 4.99 = 653.52:  34%|███▎      | 688/2048 [09:03<16:43,  1.35it/s]
loss 2.21 accuracy 0.25 -- 175.08 + 59.29 + 522.88 + 4.91 = 762.17:  34%|███▎      | 688/2048 [09:04<16:43,  1.35it/s]
loss 2.21 accuracy 0.25 -- 175.08 + 59.29 + 522.88 + 4.91 = 762.17:  34%|███▎      | 689/2048 [09:04<17:05,  1.33it/s]
loss 2.20 accuracy 0.19 -- 58.13 + 59.56 + 673.00 + 5.01 = 795.70:  34%|███▎      | 689/2048 [09:05<17:05,  1.33it/s] 
loss 2.20 accuracy 0.19 -- 58.13 + 59.56 + 673.00 + 5.01 = 795.70:  34%|███▎      | 690/2048 [09:05<17:33,  1.29it/s]
loss 2.84 accuracy 0.19 -- 59.36 + 59.01 + 520.17 + 4.77 = 643.30:  34%|███▎      | 690/2048 [09:05<17:33,  1.29it/s]
loss 2.84 accuracy 0.19 -- 59.36 + 59.01 + 520.17 + 4.77 = 643.30:  34%|███▎      | 691/2048 [09:05<16:50,  1.34it/s]
loss 2.27 accuracy 0.12 -- 56.14 + 57.32 + 615.16 + 4.80 = 733.42:  34%|███▎      | 691/2048 [09:06<16:50,  1.34it/s]
loss 2.27 accuracy 0.12 -- 56.14 + 57.32 + 615.16 + 4.80 = 733.42:  34%|███▍      | 692/2048 [09:06<16:56,  1.33it/s]
loss 2.05 accuracy 0.38 -- 57.02 + 56.67 + 500.75 + 4.78 = 619.21:  34%|███▍      | 692/2048 [09:07<16:56,  1.33it/s]
loss 2.05 accuracy 0.38 -- 57.02 + 56.67 + 500.75 + 4.78 = 619.21:  34%|███▍      | 693/2048 [09:07<16:14,  1.39it/s]
loss 2.42 accuracy 0.19 -- 56.02 + 165.76 + 501.45 + 4.79 = 728.02:  34%|███▍      | 693/2048 [09:08<16:14,  1.39it/s]
loss 2.42 accuracy 0.19 -- 56.02 + 165.76 + 501.45 + 4.79 = 728.02:  34%|███▍      | 694/2048 [09:08<16:28,  1.37it/s]
loss 2.44 accuracy 0.19 -- 56.19 + 56.66 + 499.51 + 4.79 = 617.15:  34%|███▍      | 694/2048 [09:08<16:28,  1.37it/s] 
loss 2.44 accuracy 0.19 -- 56.19 + 56.66 + 499.51 + 4.79 = 617.15:  34%|███▍      | 695/2048 [09:08<15:53,  1.42it/s]
loss 2.04 accuracy 0.31 -- 57.08 + 56.84 + 499.99 + 4.82 = 618.72:  34%|███▍      | 695/2048 [09:09<15:53,  1.42it/s]
loss 2.04 accuracy 0.31 -- 57.08 + 56.84 + 499.99 + 4.82 = 618.72:  34%|███▍      | 696/2048 [09:09<16:12,  1.39it/s]
loss 1.67 accuracy 0.31 -- 56.02 + 57.53 + 496.76 + 4.84 = 615.14:  34%|███▍      | 696/2048 [09:10<16:12,  1.39it/s]
loss 1.67 accuracy 0.31 -- 56.02 + 57.53 + 496.76 + 4.84 = 615.14:  34%|███▍      | 697/2048 [09:10<15:40,  1.44it/s]
loss 1.76 accuracy 0.50 -- 157.62 + 57.06 + 489.96 + 4.80 = 709.44:  34%|███▍      | 697/2048 [09:10<15:40,  1.44it/s]
loss 1.76 accuracy 0.50 -- 157.62 + 57.06 + 489.96 + 4.80 = 709.44:  34%|███▍      | 698/2048 [09:10<15:56,  1.41it/s]
loss 1.70 accuracy 0.31 -- 55.98 + 166.38 + 501.76 + 4.80 = 728.92:  34%|███▍      | 698/2048 [09:11<15:56,  1.41it/s]
loss 1.70 accuracy 0.31 -- 55.98 + 166.38 + 501.76 + 4.80 = 728.92:  34%|███▍      | 699/2048 [09:11<16:15,  1.38it/s]
loss 2.33 accuracy 0.38 -- 56.70 + 56.50 + 498.57 + 4.80 = 616.57:  34%|███▍      | 699/2048 [09:12<16:15,  1.38it/s] 
loss 2.33 accuracy 0.38 -- 56.70 + 56.50 + 498.57 + 4.80 = 616.57:  34%|███▍      | 700/2048 [09:12<15:42,  1.43it/s]
loss 2.62 accuracy 0.31 -- 163.29 + 57.26 + 500.00 + 4.79 = 725.34:  34%|███▍      | 700/2048 [09:12<15:42,  1.43it/s]
loss 2.62 accuracy 0.31 -- 163.29 + 57.26 + 500.00 + 4.79 = 725.34:  34%|███▍      | 701/2048 [09:12<16:03,  1.40it/s]
loss 1.96 accuracy 0.19 -- 56.38 + 57.36 + 623.80 + 4.84 = 742.37:  34%|███▍      | 701/2048 [09:13<16:03,  1.40it/s] 
loss 1.96 accuracy 0.19 -- 56.38 + 57.36 + 623.80 + 4.84 = 742.37:  34%|███▍      | 702/2048 [09:13<16:24,  1.37it/s]
loss 1.65 accuracy 0.56 -- 57.00 + 56.69 + 506.12 + 4.82 = 624.63:  34%|███▍      | 702/2048 [09:14<16:24,  1.37it/s]
loss 1.65 accuracy 0.56 -- 57.00 + 56.69 + 506.12 + 4.82 = 624.63:  34%|███▍      | 703/2048 [09:14<15:51,  1.41it/s]
loss 1.73 accuracy 0.38 -- 56.23 + 57.52 + 616.42 + 4.79 = 734.97:  34%|███▍      | 703/2048 [09:15<15:51,  1.41it/s]
loss 1.73 accuracy 0.38 -- 56.23 + 57.52 + 616.42 + 4.79 = 734.97:  34%|███▍      | 704/2048 [09:15<16:13,  1.38it/s]
loss 1.77 accuracy 0.31 -- 56.69 + 56.77 + 502.89 + 4.82 = 621.16:  34%|███▍      | 704/2048 [09:15<16:13,  1.38it/s]
loss 1.77 accuracy 0.31 -- 56.69 + 56.77 + 502.89 + 4.82 = 621.16:  34%|███▍      | 705/2048 [09:15<15:42,  1.43it/s]
loss 1.90 accuracy 0.25 -- 56.25 + 166.93 + 502.90 + 4.81 = 730.88:  34%|███▍      | 705/2048 [09:16<15:42,  1.43it/s]
loss 1.90 accuracy 0.25 -- 56.25 + 166.93 + 502.90 + 4.81 = 730.88:  34%|███▍      | 706/2048 [09:16<16:04,  1.39it/s]
loss 1.92 accuracy 0.19 -- 56.97 + 57.06 + 501.18 + 4.81 = 620.01:  34%|███▍      | 706/2048 [09:17<16:04,  1.39it/s] 
loss 1.92 accuracy 0.19 -- 56.97 + 57.06 + 501.18 + 4.81 = 620.01:  35%|███▍      | 707/2048 [09:17<15:35,  1.43it/s]
loss 2.04 accuracy 0.19 -- 56.58 + 56.52 + 499.87 + 4.78 = 617.75:  35%|███▍      | 707/2048 [09:17<15:35,  1.43it/s]
loss 2.04 accuracy 0.19 -- 56.58 + 56.52 + 499.87 + 4.78 = 617.75:  35%|███▍      | 708/2048 [09:17<15:56,  1.40it/s]
loss 2.84 accuracy 0.19 -- 55.99 + 57.19 + 495.82 + 4.79 = 613.79:  35%|███▍      | 708/2048 [09:18<15:56,  1.40it/s]
loss 2.84 accuracy 0.19 -- 55.99 + 57.19 + 495.82 + 4.79 = 613.79:  35%|███▍      | 709/2048 [09:18<15:26,  1.45it/s]
loss 1.66 accuracy 0.25 -- 157.80 + 57.03 + 490.61 + 4.80 = 710.25:  35%|███▍      | 709/2048 [09:19<15:26,  1.45it/s]
loss 1.66 accuracy 0.25 -- 157.80 + 57.03 + 490.61 + 4.80 = 710.25:  35%|███▍      | 710/2048 [09:19<15:43,  1.42it/s]
loss 2.45 accuracy 0.25 -- 55.89 + 166.98 + 501.57 + 4.78 = 729.21:  35%|███▍      | 710/2048 [09:20<15:43,  1.42it/s]
loss 2.45 accuracy 0.25 -- 55.89 + 166.98 + 501.57 + 4.78 = 729.21:  35%|███▍      | 711/2048 [09:20<16:03,  1.39it/s]
loss 1.90 accuracy 0.44 -- 56.50 + 56.24 + 498.31 + 4.84 = 615.89:  35%|███▍      | 711/2048 [09:20<16:03,  1.39it/s] 
loss 1.90 accuracy 0.44 -- 56.50 + 56.24 + 498.31 + 4.84 = 615.89:  35%|███▍      | 712/2048 [09:20<15:31,  1.43it/s]
loss 1.97 accuracy 0.38 -- 163.11 + 57.41 + 496.75 + 4.78 = 722.05:  35%|███▍      | 712/2048 [09:21<15:31,  1.43it/s]
loss 1.97 accuracy 0.38 -- 163.11 + 57.41 + 496.75 + 4.78 = 722.05:  35%|███▍      | 713/2048 [09:21<15:51,  1.40it/s]
loss 1.87 accuracy 0.31 -- 56.13 + 57.48 + 621.72 + 4.79 = 740.11:  35%|███▍      | 713/2048 [09:22<15:51,  1.40it/s] 
loss 1.87 accuracy 0.31 -- 56.13 + 57.48 + 621.72 + 4.79 = 740.11:  35%|███▍      | 714/2048 [09:22<16:12,  1.37it/s]
loss 2.07 accuracy 0.31 -- 56.70 + 56.59 + 505.87 + 4.82 = 623.98:  35%|███▍      | 714/2048 [09:22<16:12,  1.37it/s]
loss 2.07 accuracy 0.31 -- 56.70 + 56.59 + 505.87 + 4.82 = 623.98:  35%|███▍      | 715/2048 [09:22<15:41,  1.42it/s]
loss 2.05 accuracy 0.31 -- 55.99 + 57.22 + 616.59 + 4.78 = 734.58:  35%|███▍      | 715/2048 [09:23<15:41,  1.42it/s]
loss 2.05 accuracy 0.31 -- 55.99 + 57.22 + 616.59 + 4.78 = 734.58:  35%|███▍      | 716/2048 [09:23<16:02,  1.38it/s]
loss 1.74 accuracy 0.31 -- 56.84 + 56.31 + 501.08 + 4.80 = 619.03:  35%|███▍      | 716/2048 [09:24<16:02,  1.38it/s]
loss 1.74 accuracy 0.31 -- 56.84 + 56.31 + 501.08 + 4.80 = 619.03:  35%|███▌      | 717/2048 [09:24<15:31,  1.43it/s]
loss 2.16 accuracy 0.19 -- 56.09 + 166.44 + 503.21 + 4.79 = 730.54:  35%|███▌      | 717/2048 [09:25<15:31,  1.43it/s]
loss 2.16 accuracy 0.19 -- 56.09 + 166.44 + 503.21 + 4.79 = 730.54:  35%|███▌      | 718/2048 [09:25<15:54,  1.39it/s]
loss 1.64 accuracy 0.50 -- 55.89 + 56.09 + 498.72 + 4.78 = 615.48:  35%|███▌      | 718/2048 [09:25<15:54,  1.39it/s] 
loss 1.64 accuracy 0.50 -- 55.89 + 56.09 + 498.72 + 4.78 = 615.48:  35%|███▌      | 719/2048 [09:25<15:23,  1.44it/s]
loss 1.89 accuracy 0.25 -- 56.65 + 56.58 + 497.86 + 4.77 = 615.86:  35%|███▌      | 719/2048 [09:26<15:23,  1.44it/s]
loss 1.89 accuracy 0.25 -- 56.65 + 56.58 + 497.86 + 4.77 = 615.86:  35%|███▌      | 720/2048 [09:26<15:44,  1.41it/s]
loss 2.41 accuracy 0.06 -- 56.21 + 57.34 + 494.63 + 4.78 = 612.97:  35%|███▌      | 720/2048 [09:27<15:44,  1.41it/s]
loss 2.41 accuracy 0.06 -- 56.21 + 57.34 + 494.63 + 4.78 = 612.97:  35%|███▌      | 721/2048 [09:27<15:15,  1.45it/s]
loss 2.17 accuracy 0.12 -- 157.96 + 56.66 + 489.57 + 4.82 = 709.01:  35%|███▌      | 721/2048 [09:27<15:15,  1.45it/s]
loss 2.17 accuracy 0.12 -- 157.96 + 56.66 + 489.57 + 4.82 = 709.01:  35%|███▌      | 722/2048 [09:27<15:33,  1.42it/s]
loss 2.02 accuracy 0.38 -- 55.86 + 166.27 + 501.03 + 4.78 = 727.93:  35%|███▌      | 722/2048 [09:28<15:33,  1.42it/s]
loss 2.02 accuracy 0.38 -- 55.86 + 166.27 + 501.03 + 4.78 = 727.93:  35%|███▌      | 723/2048 [09:28<15:53,  1.39it/s]
loss 1.88 accuracy 0.19 -- 57.16 + 56.97 + 498.94 + 4.78 = 617.84:  35%|███▌      | 723/2048 [09:29<15:53,  1.39it/s] 
loss 1.88 accuracy 0.19 -- 57.16 + 56.97 + 498.94 + 4.78 = 617.84:  35%|███▌      | 724/2048 [09:29<15:22,  1.43it/s]
loss 1.83 accuracy 0.44 -- 162.72 + 56.84 + 497.00 + 4.80 = 721.36:  35%|███▌      | 724/2048 [09:29<15:22,  1.43it/s]
loss 1.83 accuracy 0.44 -- 162.72 + 56.84 + 497.00 + 4.80 = 721.36:  35%|███▌      | 725/2048 [09:29<15:42,  1.40it/s]
loss 1.74 accuracy 0.44 -- 56.54 + 57.64 + 621.47 + 4.81 = 740.46:  35%|███▌      | 725/2048 [09:30<15:42,  1.40it/s] 
loss 1.74 accuracy 0.44 -- 56.54 + 57.64 + 621.47 + 4.81 = 740.46:  35%|███▌      | 726/2048 [09:30<16:03,  1.37it/s]
loss 1.91 accuracy 0.38 -- 56.85 + 56.71 + 504.86 + 4.79 = 623.21:  35%|███▌      | 726/2048 [09:31<16:03,  1.37it/s]
loss 1.91 accuracy 0.38 -- 56.85 + 56.71 + 504.86 + 4.79 = 623.21:  35%|███▌      | 727/2048 [09:31<15:32,  1.42it/s]
loss 1.94 accuracy 0.25 -- 56.05 + 57.34 + 616.25 + 4.77 = 734.42:  35%|███▌      | 727/2048 [09:32<15:32,  1.42it/s]
loss 1.94 accuracy 0.25 -- 56.05 + 57.34 + 616.25 + 4.77 = 734.42:  36%|███▌      | 728/2048 [09:32<15:53,  1.38it/s]
loss 1.68 accuracy 0.44 -- 56.64 + 56.65 + 503.49 + 4.79 = 621.57:  36%|███▌      | 728/2048 [09:32<15:53,  1.38it/s]
loss 1.68 accuracy 0.44 -- 56.64 + 56.65 + 503.49 + 4.79 = 621.57:  36%|███▌      | 729/2048 [09:32<15:23,  1.43it/s]
loss 1.81 accuracy 0.12 -- 56.54 + 166.79 + 501.16 + 4.77 = 729.26:  36%|███▌      | 729/2048 [09:33<15:23,  1.43it/s]
loss 1.81 accuracy 0.12 -- 56.54 + 166.79 + 501.16 + 4.77 = 729.26:  36%|███▌      | 730/2048 [09:33<15:45,  1.39it/s]
loss 1.86 accuracy 0.31 -- 56.09 + 56.41 + 499.66 + 4.78 = 616.95:  36%|███▌      | 730/2048 [09:34<15:45,  1.39it/s] 
loss 1.86 accuracy 0.31 -- 56.09 + 56.41 + 499.66 + 4.78 = 616.95:  36%|███▌      | 731/2048 [09:34<15:15,  1.44it/s]
loss 2.12 accuracy 0.12 -- 56.36 + 56.45 + 498.28 + 4.79 = 615.89:  36%|███▌      | 731/2048 [09:34<15:15,  1.44it/s]
loss 2.12 accuracy 0.12 -- 56.36 + 56.45 + 498.28 + 4.79 = 615.89:  36%|███▌      | 732/2048 [09:34<15:36,  1.41it/s]
loss 2.19 accuracy 0.25 -- 56.04 + 57.31 + 495.15 + 4.79 = 613.29:  36%|███▌      | 732/2048 [09:35<15:36,  1.41it/s]
loss 2.19 accuracy 0.25 -- 56.04 + 57.31 + 495.15 + 4.79 = 613.29:  36%|███▌      | 733/2048 [09:35<15:07,  1.45it/s]
loss 2.01 accuracy 0.25 -- 156.98 + 56.66 + 490.45 + 4.77 = 708.87:  36%|███▌      | 733/2048 [09:36<15:07,  1.45it/s]
loss 2.01 accuracy 0.25 -- 156.98 + 56.66 + 490.45 + 4.77 = 708.87:  36%|███▌      | 734/2048 [09:36<15:25,  1.42it/s]
loss 2.07 accuracy 0.25 -- 55.90 + 166.42 + 502.91 + 4.76 = 730.00:  36%|███▌      | 734/2048 [09:37<15:25,  1.42it/s]
loss 2.07 accuracy 0.25 -- 55.90 + 166.42 + 502.91 + 4.76 = 730.00:  36%|███▌      | 735/2048 [09:37<15:45,  1.39it/s]
loss 1.44 accuracy 0.44 -- 56.48 + 56.19 + 497.12 + 4.78 = 614.58:  36%|███▌      | 735/2048 [09:37<15:45,  1.39it/s] 
loss 1.44 accuracy 0.44 -- 56.48 + 56.19 + 497.12 + 4.78 = 614.58:  36%|███▌      | 736/2048 [09:37<15:13,  1.44it/s]
loss 2.10 accuracy 0.12 -- 162.23 + 56.82 + 496.32 + 4.79 = 720.15:  36%|███▌      | 736/2048 [09:38<15:13,  1.44it/s]
loss 2.10 accuracy 0.12 -- 162.23 + 56.82 + 496.32 + 4.79 = 720.15:  36%|███▌      | 737/2048 [09:38<15:33,  1.40it/s]
loss 1.74 accuracy 0.31 -- 56.28 + 57.64 + 620.88 + 4.82 = 739.61:  36%|███▌      | 737/2048 [09:39<15:33,  1.40it/s] 
loss 1.74 accuracy 0.31 -- 56.28 + 57.64 + 620.88 + 4.82 = 739.61:  36%|███▌      | 738/2048 [09:39<15:54,  1.37it/s]
loss 1.70 accuracy 0.25 -- 56.78 + 56.27 + 505.30 + 4.76 = 623.12:  36%|███▌      | 738/2048 [09:39<15:54,  1.37it/s]
loss 1.70 accuracy 0.25 -- 56.78 + 56.27 + 505.30 + 4.76 = 623.12:  36%|███▌      | 739/2048 [09:39<15:36,  1.40it/s]
loss 1.82 accuracy 0.19 -- 56.14 + 57.20 + 617.69 + 4.77 = 735.80:  36%|███▌      | 739/2048 [09:40<15:36,  1.40it/s]
loss 1.82 accuracy 0.19 -- 56.14 + 57.20 + 617.69 + 4.77 = 735.80:  36%|███▌      | 740/2048 [09:40<15:54,  1.37it/s]
loss 2.16 accuracy 0.31 -- 57.82 + 56.84 + 504.10 + 4.78 = 623.55:  36%|███▌      | 740/2048 [09:41<15:54,  1.37it/s]
loss 2.16 accuracy 0.31 -- 57.82 + 56.84 + 504.10 + 4.78 = 623.55:  36%|███▌      | 741/2048 [09:41<15:22,  1.42it/s]
loss 1.78 accuracy 0.25 -- 56.36 + 166.29 + 503.50 + 4.77 = 730.91:  36%|███▌      | 741/2048 [09:42<15:22,  1.42it/s]
loss 1.78 accuracy 0.25 -- 56.36 + 166.29 + 503.50 + 4.77 = 730.91:  36%|███▌      | 742/2048 [09:42<15:42,  1.39it/s]
loss 2.20 accuracy 0.31 -- 56.51 + 56.33 + 500.75 + 4.77 = 618.36:  36%|███▌      | 742/2048 [09:42<15:42,  1.39it/s] 
loss 2.20 accuracy 0.31 -- 56.51 + 56.33 + 500.75 + 4.77 = 618.36:  36%|███▋      | 743/2048 [09:42<15:11,  1.43it/s]
loss 1.80 accuracy 0.25 -- 57.19 + 56.84 + 499.47 + 4.77 = 618.26:  36%|███▋      | 743/2048 [09:43<15:11,  1.43it/s]
loss 1.80 accuracy 0.25 -- 57.19 + 56.84 + 499.47 + 4.77 = 618.26:  36%|███▋      | 744/2048 [09:43<15:32,  1.40it/s]
loss 1.98 accuracy 0.19 -- 56.20 + 57.24 + 496.62 + 4.78 = 614.84:  36%|███▋      | 744/2048 [09:44<15:32,  1.40it/s]
loss 1.98 accuracy 0.19 -- 56.20 + 57.24 + 496.62 + 4.78 = 614.84:  36%|███▋      | 745/2048 [09:44<15:02,  1.44it/s]
loss 1.92 accuracy 0.25 -- 157.60 + 57.03 + 490.23 + 4.81 = 709.66:  36%|███▋      | 745/2048 [09:44<15:02,  1.44it/s]
loss 1.92 accuracy 0.25 -- 157.60 + 57.03 + 490.23 + 4.81 = 709.66:  36%|███▋      | 746/2048 [09:44<15:19,  1.42it/s]
loss 2.35 accuracy 0.25 -- 56.05 + 166.99 + 501.39 + 4.79 = 729.22:  36%|███▋      | 746/2048 [09:45<15:19,  1.42it/s]
loss 2.35 accuracy 0.25 -- 56.05 + 166.99 + 501.39 + 4.79 = 729.22:  36%|███▋      | 747/2048 [09:45<15:38,  1.39it/s]
loss 1.72 accuracy 0.38 -- 56.90 + 56.86 + 497.53 + 4.77 = 616.07:  36%|███▋      | 747/2048 [09:46<15:38,  1.39it/s] 
loss 1.72 accuracy 0.38 -- 56.90 + 56.86 + 497.53 + 4.77 = 616.07:  37%|███▋      | 748/2048 [09:46<15:07,  1.43it/s]
loss 1.96 accuracy 0.31 -- 162.37 + 56.96 + 498.23 + 4.79 = 722.35:  37%|███▋      | 748/2048 [09:47<15:07,  1.43it/s]
loss 1.96 accuracy 0.31 -- 162.37 + 56.96 + 498.23 + 4.79 = 722.35:  37%|███▋      | 749/2048 [09:47<15:26,  1.40it/s]
loss 2.38 accuracy 0.25 -- 56.06 + 57.17 + 620.85 + 4.80 = 738.88:  37%|███▋      | 749/2048 [09:47<15:26,  1.40it/s] 
loss 2.38 accuracy 0.25 -- 56.06 + 57.17 + 620.85 + 4.80 = 738.88:  37%|███▋      | 750/2048 [09:47<15:46,  1.37it/s]
loss 1.78 accuracy 0.31 -- 57.13 + 56.49 + 504.03 + 4.79 = 622.43:  37%|███▋      | 750/2048 [09:48<15:46,  1.37it/s]
loss 1.78 accuracy 0.31 -- 57.13 + 56.49 + 504.03 + 4.79 = 622.43:  37%|███▋      | 751/2048 [09:48<15:14,  1.42it/s]
loss 1.63 accuracy 0.38 -- 56.42 + 57.10 + 617.77 + 4.79 = 736.07:  37%|███▋      | 751/2048 [09:49<15:14,  1.42it/s]
loss 1.63 accuracy 0.38 -- 56.42 + 57.10 + 617.77 + 4.79 = 736.07:  37%|███▋      | 752/2048 [09:49<15:36,  1.38it/s]
loss 2.07 accuracy 0.50 -- 56.56 + 56.29 + 502.39 + 4.79 = 620.02:  37%|███▋      | 752/2048 [09:49<15:36,  1.38it/s]
loss 2.07 accuracy 0.50 -- 56.56 + 56.29 + 502.39 + 4.79 = 620.02:  37%|███▋      | 753/2048 [09:49<15:06,  1.43it/s]
loss 1.86 accuracy 0.19 -- 56.34 + 166.19 + 501.25 + 4.78 = 728.56:  37%|███▋      | 753/2048 [09:50<15:06,  1.43it/s]
loss 1.86 accuracy 0.19 -- 56.34 + 166.19 + 501.25 + 4.78 = 728.56:  37%|███▋      | 754/2048 [09:50<15:27,  1.39it/s]
loss 1.71 accuracy 0.25 -- 56.14 + 58.34 + 531.39 + 4.95 = 650.82:  37%|███▋      | 754/2048 [09:51<15:27,  1.39it/s] 
loss 1.71 accuracy 0.25 -- 56.14 + 58.34 + 531.39 + 4.95 = 650.82:  37%|███▋      | 755/2048 [09:51<15:12,  1.42it/s]
loss 1.84 accuracy 0.25 -- 58.40 + 58.79 + 509.60 + 4.79 = 631.59:  37%|███▋      | 755/2048 [09:52<15:12,  1.42it/s]
loss 1.84 accuracy 0.25 -- 58.40 + 58.79 + 509.60 + 4.79 = 631.59:  37%|███▋      | 756/2048 [09:52<15:38,  1.38it/s]
loss 1.95 accuracy 0.19 -- 55.79 + 57.46 + 495.06 + 4.78 = 613.09:  37%|███▋      | 756/2048 [09:52<15:38,  1.38it/s]
loss 1.95 accuracy 0.19 -- 55.79 + 57.46 + 495.06 + 4.78 = 613.09:  37%|███▋      | 757/2048 [09:52<15:04,  1.43it/s]
loss 2.17 accuracy 0.12 -- 156.82 + 56.71 + 489.37 + 4.77 = 707.66:  37%|███▋      | 757/2048 [09:53<15:04,  1.43it/s]
loss 2.17 accuracy 0.12 -- 156.82 + 56.71 + 489.37 + 4.77 = 707.66:  37%|███▋      | 758/2048 [09:53<15:17,  1.41it/s]
loss 1.81 accuracy 0.19 -- 55.90 + 166.73 + 501.84 + 4.78 = 729.25:  37%|███▋      | 758/2048 [09:54<15:17,  1.41it/s]
loss 1.81 accuracy 0.19 -- 55.90 + 166.73 + 501.84 + 4.78 = 729.25:  37%|███▋      | 759/2048 [09:54<15:34,  1.38it/s]
loss 2.70 accuracy 0.12 -- 56.38 + 56.04 + 504.61 + 4.79 = 621.82:  37%|███▋      | 759/2048 [09:54<15:34,  1.38it/s] 
loss 2.70 accuracy 0.12 -- 56.38 + 56.04 + 504.61 + 4.79 = 621.82:  37%|███▋      | 760/2048 [09:54<15:04,  1.42it/s]
loss 1.92 accuracy 0.25 -- 162.77 + 57.41 + 497.22 + 4.77 = 722.18:  37%|███▋      | 760/2048 [09:55<15:04,  1.42it/s]
loss 1.92 accuracy 0.25 -- 162.77 + 57.41 + 497.22 + 4.77 = 722.18:  37%|███▋      | 761/2048 [09:55<15:22,  1.40it/s]
loss 1.70 accuracy 0.50 -- 56.30 + 57.83 + 626.48 + 4.77 = 745.37:  37%|███▋      | 761/2048 [09:56<15:22,  1.40it/s] 
loss 1.70 accuracy 0.50 -- 56.30 + 57.83 + 626.48 + 4.77 = 745.37:  37%|███▋      | 762/2048 [09:56<15:43,  1.36it/s]
loss 1.99 accuracy 0.31 -- 57.02 + 56.44 + 507.60 + 4.77 = 625.84:  37%|███▋      | 762/2048 [09:57<15:43,  1.36it/s]
loss 1.99 accuracy 0.31 -- 57.02 + 56.44 + 507.60 + 4.77 = 625.84:  37%|███▋      | 763/2048 [09:57<15:11,  1.41it/s]
loss 2.07 accuracy 0.31 -- 56.31 + 57.71 + 615.58 + 4.79 = 734.38:  37%|███▋      | 763/2048 [09:57<15:11,  1.41it/s]
loss 2.07 accuracy 0.31 -- 56.31 + 57.71 + 615.58 + 4.79 = 734.38:  37%|███▋      | 764/2048 [09:57<15:30,  1.38it/s]
loss 2.08 accuracy 0.12 -- 56.87 + 56.56 + 501.64 + 4.78 = 619.85:  37%|███▋      | 764/2048 [09:58<15:30,  1.38it/s]
loss 2.08 accuracy 0.12 -- 56.87 + 56.56 + 501.64 + 4.78 = 619.85:  37%|███▋      | 765/2048 [09:58<15:00,  1.43it/s]
loss 1.90 accuracy 0.25 -- 56.27 + 166.02 + 503.20 + 4.81 = 730.31:  37%|███▋      | 765/2048 [09:59<15:00,  1.43it/s]
loss 1.90 accuracy 0.25 -- 56.27 + 166.02 + 503.20 + 4.81 = 730.31:  37%|███▋      | 766/2048 [09:59<15:20,  1.39it/s]
loss 1.59 accuracy 0.56 -- 56.06 + 56.31 + 500.68 + 4.82 = 617.87:  37%|███▋      | 766/2048 [09:59<15:20,  1.39it/s] 
loss 1.59 accuracy 0.56 -- 56.06 + 56.31 + 500.68 + 4.82 = 617.87:  37%|███▋      | 767/2048 [09:59<14:52,  1.44it/s]
loss 1.64 accuracy 0.38 -- 56.80 + 56.14 + 497.95 + 4.77 = 615.67:  37%|███▋      | 767/2048 [10:00<14:52,  1.44it/s]
loss 1.64 accuracy 0.38 -- 56.80 + 56.14 + 497.95 + 4.77 = 615.67:  38%|███▊      | 768/2048 [10:00<15:11,  1.40it/s]
loss 1.84 accuracy 0.31 -- 56.09 + 57.93 + 496.62 + 4.78 = 615.43:  38%|███▊      | 768/2048 [10:01<15:11,  1.40it/s]
loss 1.84 accuracy 0.31 -- 56.09 + 57.93 + 496.62 + 4.78 = 615.43:  38%|███▊      | 769/2048 [10:01<14:44,  1.45it/s]
loss 1.99 accuracy 0.19 -- 157.53 + 56.72 + 497.95 + 4.77 = 716.98:  38%|███▊      | 769/2048 [10:01<14:44,  1.45it/s]
loss 1.99 accuracy 0.19 -- 157.53 + 56.72 + 497.95 + 4.77 = 716.98:  38%|███▊      | 770/2048 [10:01<15:03,  1.41it/s]
loss 2.11 accuracy 0.31 -- 55.91 + 167.27 + 501.64 + 4.78 = 729.60:  38%|███▊      | 770/2048 [10:02<15:03,  1.41it/s]
loss 2.11 accuracy 0.31 -- 55.91 + 167.27 + 501.64 + 4.78 = 729.60:  38%|███▊      | 771/2048 [10:02<15:22,  1.38it/s]
loss 1.80 accuracy 0.38 -- 56.70 + 56.60 + 497.74 + 4.79 = 615.82:  38%|███▊      | 771/2048 [10:03<15:22,  1.38it/s] 
loss 1.80 accuracy 0.38 -- 56.70 + 56.60 + 497.74 + 4.79 = 615.82:  38%|███▊      | 772/2048 [10:03<14:51,  1.43it/s]
loss 2.16 accuracy 0.38 -- 162.29 + 57.06 + 497.14 + 4.79 = 721.28:  38%|███▊      | 772/2048 [10:04<14:51,  1.43it/s]
loss 2.16 accuracy 0.38 -- 162.29 + 57.06 + 497.14 + 4.79 = 721.28:  38%|███▊      | 773/2048 [10:04<15:09,  1.40it/s]
loss 2.04 accuracy 0.31 -- 56.01 + 57.35 + 622.45 + 4.82 = 740.64:  38%|███▊      | 773/2048 [10:04<15:09,  1.40it/s] 
loss 2.04 accuracy 0.31 -- 56.01 + 57.35 + 622.45 + 4.82 = 740.64:  38%|███▊      | 774/2048 [10:04<15:29,  1.37it/s]
loss 2.03 accuracy 0.25 -- 57.56 + 57.05 + 504.96 + 4.79 = 624.36:  38%|███▊      | 774/2048 [10:05<15:29,  1.37it/s]
loss 2.03 accuracy 0.25 -- 57.56 + 57.05 + 504.96 + 4.79 = 624.36:  38%|███▊      | 775/2048 [10:05<14:59,  1.42it/s]
loss 1.77 accuracy 0.44 -- 56.52 + 57.26 + 616.56 + 4.78 = 735.12:  38%|███▊      | 775/2048 [10:06<14:59,  1.42it/s]
loss 1.77 accuracy 0.44 -- 56.52 + 57.26 + 616.56 + 4.78 = 735.12:  38%|███▊      | 776/2048 [10:06<15:19,  1.38it/s]
loss 2.60 accuracy 0.19 -- 56.98 + 56.69 + 503.09 + 4.78 = 621.54:  38%|███▊      | 776/2048 [10:06<15:19,  1.38it/s]
loss 2.60 accuracy 0.19 -- 56.98 + 56.69 + 503.09 + 4.78 = 621.54:  38%|███▊      | 777/2048 [10:06<14:50,  1.43it/s]
loss 1.74 accuracy 0.31 -- 56.40 + 165.77 + 501.16 + 4.78 = 728.12:  38%|███▊      | 777/2048 [10:07<14:50,  1.43it/s]
loss 1.74 accuracy 0.31 -- 56.40 + 165.77 + 501.16 + 4.78 = 728.12:  38%|███▊      | 778/2048 [10:07<15:10,  1.39it/s]
loss 1.76 accuracy 0.44 -- 56.22 + 56.69 + 501.20 + 4.76 = 618.88:  38%|███▊      | 778/2048 [10:08<15:10,  1.39it/s] 
loss 1.76 accuracy 0.44 -- 56.22 + 56.69 + 501.20 + 4.76 = 618.88:  38%|███▊      | 779/2048 [10:08<14:42,  1.44it/s]
loss 2.10 accuracy 0.19 -- 56.53 + 56.36 + 499.89 + 4.75 = 617.53:  38%|███▊      | 779/2048 [10:09<14:42,  1.44it/s]
loss 2.10 accuracy 0.19 -- 56.53 + 56.36 + 499.89 + 4.75 = 617.53:  38%|███▊      | 780/2048 [10:09<15:03,  1.40it/s]
loss 2.16 accuracy 0.12 -- 56.15 + 57.51 + 496.80 + 4.79 = 615.24:  38%|███▊      | 780/2048 [10:09<15:03,  1.40it/s]
loss 2.16 accuracy 0.12 -- 56.15 + 57.51 + 496.80 + 4.79 = 615.24:  38%|███▊      | 781/2048 [10:09<14:36,  1.45it/s]
loss 1.87 accuracy 0.38 -- 157.95 + 56.87 + 490.90 + 4.81 = 710.53:  38%|███▊      | 781/2048 [10:10<14:36,  1.45it/s]
loss 1.87 accuracy 0.38 -- 157.95 + 56.87 + 490.90 + 4.81 = 710.53:  38%|███▊      | 782/2048 [10:10<14:52,  1.42it/s]
loss 1.92 accuracy 0.12 -- 56.14 + 166.82 + 501.48 + 4.80 = 729.24:  38%|███▊      | 782/2048 [10:11<14:52,  1.42it/s]
loss 1.92 accuracy 0.12 -- 56.14 + 166.82 + 501.48 + 4.80 = 729.24:  38%|███▊      | 783/2048 [10:11<15:11,  1.39it/s]
loss 1.69 accuracy 0.56 -- 56.76 + 56.30 + 498.50 + 4.77 = 616.33:  38%|███▊      | 783/2048 [10:11<15:11,  1.39it/s] 
loss 1.69 accuracy 0.56 -- 56.76 + 56.30 + 498.50 + 4.77 = 616.33:  38%|███▊      | 784/2048 [10:11<14:41,  1.43it/s]
loss 1.68 accuracy 0.25 -- 162.51 + 57.09 + 496.43 + 4.78 = 720.81:  38%|███▊      | 784/2048 [10:12<14:41,  1.43it/s]
loss 1.68 accuracy 0.25 -- 162.51 + 57.09 + 496.43 + 4.78 = 720.81:  38%|███▊      | 785/2048 [10:12<15:00,  1.40it/s]
loss 2.00 accuracy 0.31 -- 56.59 + 57.70 + 621.74 + 4.78 = 740.81:  38%|███▊      | 785/2048 [10:13<15:00,  1.40it/s] 
loss 2.00 accuracy 0.31 -- 56.59 + 57.70 + 621.74 + 4.78 = 740.81:  38%|███▊      | 786/2048 [10:13<15:20,  1.37it/s]
loss 1.63 accuracy 0.50 -- 56.90 + 56.72 + 505.72 + 4.78 = 624.11:  38%|███▊      | 786/2048 [10:14<15:20,  1.37it/s]
loss 1.63 accuracy 0.50 -- 56.90 + 56.72 + 505.72 + 4.78 = 624.11:  38%|███▊      | 787/2048 [10:14<14:50,  1.42it/s]
loss 2.66 accuracy 0.12 -- 56.28 + 57.52 + 616.72 + 4.79 = 735.30:  38%|███▊      | 787/2048 [10:14<14:50,  1.42it/s]
loss 2.66 accuracy 0.12 -- 56.28 + 57.52 + 616.72 + 4.79 = 735.30:  38%|███▊      | 788/2048 [10:14<15:10,  1.38it/s]
loss 1.94 accuracy 0.25 -- 56.96 + 56.91 + 503.10 + 4.78 = 621.74:  38%|███▊      | 788/2048 [10:15<15:10,  1.38it/s]
loss 1.94 accuracy 0.25 -- 56.96 + 56.91 + 503.10 + 4.78 = 621.74:  39%|███▊      | 789/2048 [10:15<14:42,  1.43it/s]
loss 1.88 accuracy 0.31 -- 56.39 + 166.21 + 502.97 + 4.79 = 730.35:  39%|███▊      | 789/2048 [10:16<14:42,  1.43it/s]
loss 1.88 accuracy 0.31 -- 56.39 + 166.21 + 502.97 + 4.79 = 730.35:  39%|███▊      | 790/2048 [10:16<15:02,  1.39it/s]
loss 1.94 accuracy 0.19 -- 56.09 + 56.32 + 501.03 + 4.79 = 618.22:  39%|███▊      | 790/2048 [10:16<15:02,  1.39it/s] 
loss 1.94 accuracy 0.19 -- 56.09 + 56.32 + 501.03 + 4.79 = 618.22:  39%|███▊      | 791/2048 [10:16<14:34,  1.44it/s]
loss 1.78 accuracy 0.44 -- 57.16 + 56.36 + 498.48 + 4.79 = 616.79:  39%|███▊      | 791/2048 [10:17<14:34,  1.44it/s]
loss 1.78 accuracy 0.44 -- 57.16 + 56.36 + 498.48 + 4.79 = 616.79:  39%|███▊      | 792/2048 [10:17<14:54,  1.40it/s]
loss 1.88 accuracy 0.19 -- 56.25 + 57.21 + 495.75 + 4.77 = 613.99:  39%|███▊      | 792/2048 [10:18<14:54,  1.40it/s]
loss 1.88 accuracy 0.19 -- 56.25 + 57.21 + 495.75 + 4.77 = 613.99:  39%|███▊      | 793/2048 [10:18<14:27,  1.45it/s]
loss 1.97 accuracy 0.44 -- 157.17 + 57.07 + 491.07 + 4.77 = 710.08:  39%|███▊      | 793/2048 [10:19<14:27,  1.45it/s]
loss 1.97 accuracy 0.44 -- 157.17 + 57.07 + 491.07 + 4.77 = 710.08:  39%|███▉      | 794/2048 [10:19<14:43,  1.42it/s]
loss 1.74 accuracy 0.38 -- 55.96 + 166.21 + 501.24 + 4.77 = 728.18:  39%|███▉      | 794/2048 [10:19<14:43,  1.42it/s]
loss 1.74 accuracy 0.38 -- 55.96 + 166.21 + 501.24 + 4.77 = 728.18:  39%|███▉      | 795/2048 [10:19<15:02,  1.39it/s]
loss 1.78 accuracy 0.31 -- 56.61 + 56.72 + 498.32 + 4.78 = 616.43:  39%|███▉      | 795/2048 [10:20<15:02,  1.39it/s] 
loss 1.78 accuracy 0.31 -- 56.61 + 56.72 + 498.32 + 4.78 = 616.43:  39%|███▉      | 796/2048 [10:20<14:32,  1.43it/s]
loss 1.78 accuracy 0.38 -- 162.55 + 57.19 + 497.42 + 4.78 = 721.93:  39%|███▉      | 796/2048 [10:21<14:32,  1.43it/s]
loss 1.78 accuracy 0.38 -- 162.55 + 57.19 + 497.42 + 4.78 = 721.93:  39%|███▉      | 797/2048 [10:21<14:51,  1.40it/s]
loss 1.76 accuracy 0.25 -- 55.83 + 57.15 + 618.63 + 4.80 = 736.41:  39%|███▉      | 797/2048 [10:21<14:51,  1.40it/s] 
loss 1.76 accuracy 0.25 -- 55.83 + 57.15 + 618.63 + 4.80 = 736.41:  39%|███▉      | 798/2048 [10:21<15:09,  1.37it/s]
loss 1.57 accuracy 0.44 -- 57.06 + 56.83 + 505.29 + 4.77 = 623.95:  39%|███▉      | 798/2048 [10:22<15:09,  1.37it/s]
loss 1.57 accuracy 0.44 -- 57.06 + 56.83 + 505.29 + 4.77 = 623.95:  39%|███▉      | 799/2048 [10:22<14:40,  1.42it/s]
loss 2.18 accuracy 0.25 -- 56.40 + 57.22 + 616.22 + 4.81 = 734.64:  39%|███▉      | 799/2048 [10:23<14:40,  1.42it/s]
loss 2.18 accuracy 0.25 -- 56.40 + 57.22 + 616.22 + 4.81 = 734.64:  39%|███▉      | 800/2048 [10:23<15:01,  1.39it/s]
loss 2.30 accuracy 0.25 -- 56.70 + 56.73 + 501.64 + 4.78 = 619.86:  39%|███▉      | 800/2048 [10:24<15:01,  1.39it/s]
loss 2.30 accuracy 0.25 -- 56.70 + 56.73 + 501.64 + 4.78 = 619.86:  39%|███▉      | 801/2048 [10:24<14:45,  1.41it/s]
loss 1.53 accuracy 0.56 -- 56.38 + 166.55 + 502.96 + 4.80 = 730.68:  39%|███▉      | 801/2048 [10:24<14:45,  1.41it/s]
loss 1.53 accuracy 0.56 -- 56.38 + 166.55 + 502.96 + 4.80 = 730.68:  39%|███▉      | 802/2048 [10:24<15:02,  1.38it/s]
loss 1.79 accuracy 0.31 -- 56.33 + 56.82 + 498.89 + 4.77 = 616.80:  39%|███▉      | 802/2048 [10:25<15:02,  1.38it/s] 
loss 1.79 accuracy 0.31 -- 56.33 + 56.82 + 498.89 + 4.77 = 616.80:  39%|███▉      | 803/2048 [10:25<14:31,  1.43it/s]
loss 1.82 accuracy 0.25 -- 56.79 + 56.18 + 498.49 + 4.77 = 616.23:  39%|███▉      | 803/2048 [10:26<14:31,  1.43it/s]
loss 1.82 accuracy 0.25 -- 56.79 + 56.18 + 498.49 + 4.77 = 616.23:  39%|███▉      | 804/2048 [10:26<14:49,  1.40it/s]
loss 1.79 accuracy 0.38 -- 56.05 + 57.51 + 494.96 + 4.77 = 613.29:  39%|███▉      | 804/2048 [10:26<14:49,  1.40it/s]
loss 1.79 accuracy 0.38 -- 56.05 + 57.51 + 494.96 + 4.77 = 613.29:  39%|███▉      | 805/2048 [10:26<14:21,  1.44it/s]
loss 2.28 accuracy 0.12 -- 157.75 + 57.02 + 488.34 + 4.79 = 707.90:  39%|███▉      | 805/2048 [10:27<14:21,  1.44it/s]
loss 2.28 accuracy 0.12 -- 157.75 + 57.02 + 488.34 + 4.79 = 707.90:  39%|███▉      | 806/2048 [10:27<14:36,  1.42it/s]
loss 1.70 accuracy 0.44 -- 55.83 + 166.65 + 501.78 + 4.78 = 729.03:  39%|███▉      | 806/2048 [10:28<14:36,  1.42it/s]
loss 1.70 accuracy 0.44 -- 55.83 + 166.65 + 501.78 + 4.78 = 729.03:  39%|███▉      | 807/2048 [10:28<14:54,  1.39it/s]
loss 2.17 accuracy 0.44 -- 56.72 + 56.31 + 499.35 + 4.80 = 617.17:  39%|███▉      | 807/2048 [10:28<14:54,  1.39it/s] 
loss 2.17 accuracy 0.44 -- 56.72 + 56.31 + 499.35 + 4.80 = 617.17:  39%|███▉      | 808/2048 [10:28<14:25,  1.43it/s]
loss 2.48 accuracy 0.31 -- 162.84 + 57.13 + 497.70 + 4.78 = 722.46:  39%|███▉      | 808/2048 [10:29<14:25,  1.43it/s]
loss 2.48 accuracy 0.31 -- 162.84 + 57.13 + 497.70 + 4.78 = 722.46:  40%|███▉      | 809/2048 [10:29<14:56,  1.38it/s]
loss 2.21 accuracy 0.12 -- 55.93 + 57.03 + 621.32 + 4.79 = 739.07:  40%|███▉      | 809/2048 [10:30<14:56,  1.38it/s] 
loss 2.21 accuracy 0.12 -- 55.93 + 57.03 + 621.32 + 4.79 = 739.07:  40%|███▉      | 810/2048 [10:30<15:11,  1.36it/s]
loss 2.33 accuracy 0.19 -- 56.90 + 57.05 + 505.12 + 4.78 = 623.86:  40%|███▉      | 810/2048 [10:31<15:11,  1.36it/s]
loss 2.33 accuracy 0.19 -- 56.90 + 57.05 + 505.12 + 4.78 = 623.86:  40%|███▉      | 811/2048 [10:31<14:39,  1.41it/s]
loss 2.22 accuracy 0.38 -- 56.00 + 57.40 + 615.56 + 4.77 = 733.73:  40%|███▉      | 811/2048 [10:31<14:39,  1.41it/s]
loss 2.22 accuracy 0.38 -- 56.00 + 57.40 + 615.56 + 4.77 = 733.73:  40%|███▉      | 812/2048 [10:31<14:57,  1.38it/s]
loss 2.79 accuracy 0.06 -- 56.67 + 56.72 + 502.91 + 4.76 = 621.06:  40%|███▉      | 812/2048 [10:32<14:57,  1.38it/s]
loss 2.79 accuracy 0.06 -- 56.67 + 56.72 + 502.91 + 4.76 = 621.06:  40%|███▉      | 813/2048 [10:32<14:27,  1.42it/s]
loss 1.69 accuracy 0.38 -- 56.20 + 167.07 + 503.42 + 4.77 = 731.46:  40%|███▉      | 813/2048 [10:33<14:27,  1.42it/s]
loss 1.69 accuracy 0.38 -- 56.20 + 167.07 + 503.42 + 4.77 = 731.46:  40%|███▉      | 814/2048 [10:33<14:47,  1.39it/s]
loss 2.20 accuracy 0.19 -- 56.10 + 56.45 + 498.46 + 4.78 = 615.78:  40%|███▉      | 814/2048 [10:33<14:47,  1.39it/s] 
loss 2.20 accuracy 0.19 -- 56.10 + 56.45 + 498.46 + 4.78 = 615.78:  40%|███▉      | 815/2048 [10:33<14:18,  1.44it/s]
loss 2.38 accuracy 0.38 -- 56.65 + 56.38 + 498.15 + 4.78 = 615.95:  40%|███▉      | 815/2048 [10:34<14:18,  1.44it/s]
loss 2.38 accuracy 0.38 -- 56.65 + 56.38 + 498.15 + 4.78 = 615.95:  40%|███▉      | 816/2048 [10:34<14:50,  1.38it/s]
loss 2.17 accuracy 0.25 -- 56.10 + 57.54 + 496.06 + 4.78 = 614.47:  40%|███▉      | 816/2048 [10:35<14:50,  1.38it/s]
loss 2.17 accuracy 0.25 -- 56.10 + 57.54 + 496.06 + 4.78 = 614.47:  40%|███▉      | 817/2048 [10:35<14:20,  1.43it/s]
loss 1.43 accuracy 0.50 -- 157.21 + 56.75 + 489.52 + 4.77 = 708.24:  40%|███▉      | 817/2048 [10:36<14:20,  1.43it/s]
loss 1.43 accuracy 0.50 -- 157.21 + 56.75 + 489.52 + 4.77 = 708.24:  40%|███▉      | 818/2048 [10:36<14:32,  1.41it/s]
loss 1.81 accuracy 0.25 -- 55.72 + 166.37 + 503.13 + 4.77 = 729.99:  40%|███▉      | 818/2048 [10:36<14:32,  1.41it/s]
loss 1.81 accuracy 0.25 -- 55.72 + 166.37 + 503.13 + 4.77 = 729.99:  40%|███▉      | 819/2048 [10:36<14:49,  1.38it/s]
loss 1.81 accuracy 0.38 -- 56.87 + 56.75 + 498.72 + 4.78 = 617.11:  40%|███▉      | 819/2048 [10:37<14:49,  1.38it/s] 
loss 1.81 accuracy 0.38 -- 56.87 + 56.75 + 498.72 + 4.78 = 617.11:  40%|████      | 820/2048 [10:37<14:19,  1.43it/s]
loss 1.78 accuracy 0.19 -- 162.40 + 57.04 + 496.44 + 4.77 = 720.66:  40%|████      | 820/2048 [10:38<14:19,  1.43it/s]
loss 1.78 accuracy 0.19 -- 162.40 + 57.04 + 496.44 + 4.77 = 720.66:  40%|████      | 821/2048 [10:38<14:36,  1.40it/s]
loss 1.96 accuracy 0.31 -- 56.16 + 57.17 + 619.66 + 4.78 = 737.78:  40%|████      | 821/2048 [10:39<14:36,  1.40it/s] 
loss 1.96 accuracy 0.31 -- 56.16 + 57.17 + 619.66 + 4.78 = 737.78:  40%|████      | 822/2048 [10:39<14:54,  1.37it/s]
loss 2.03 accuracy 0.31 -- 56.92 + 56.89 + 505.71 + 4.79 = 624.30:  40%|████      | 822/2048 [10:39<14:54,  1.37it/s]
loss 2.03 accuracy 0.31 -- 56.92 + 56.89 + 505.71 + 4.79 = 624.30:  40%|████      | 823/2048 [10:39<14:38,  1.40it/s]
loss 2.53 accuracy 0.19 -- 56.17 + 57.23 + 616.05 + 4.78 = 734.22:  40%|████      | 823/2048 [10:40<14:38,  1.40it/s]
loss 2.53 accuracy 0.19 -- 56.17 + 57.23 + 616.05 + 4.78 = 734.22:  40%|████      | 824/2048 [10:40<14:53,  1.37it/s]
loss 1.85 accuracy 0.25 -- 56.82 + 56.68 + 504.45 + 4.77 = 622.71:  40%|████      | 824/2048 [10:41<14:53,  1.37it/s]
loss 1.85 accuracy 0.25 -- 56.82 + 56.68 + 504.45 + 4.77 = 622.71:  40%|████      | 825/2048 [10:41<14:23,  1.42it/s]
loss 1.90 accuracy 0.25 -- 56.07 + 166.23 + 502.10 + 4.78 = 729.18:  40%|████      | 825/2048 [10:41<14:23,  1.42it/s]
loss 1.90 accuracy 0.25 -- 56.07 + 166.23 + 502.10 + 4.78 = 729.18:  40%|████      | 826/2048 [10:41<14:41,  1.39it/s]
loss 2.23 accuracy 0.12 -- 55.87 + 56.53 + 499.77 + 4.77 = 616.93:  40%|████      | 826/2048 [10:42<14:41,  1.39it/s] 
loss 2.23 accuracy 0.12 -- 55.87 + 56.53 + 499.77 + 4.77 = 616.93:  40%|████      | 827/2048 [10:42<14:12,  1.43it/s]
loss 2.23 accuracy 0.12 -- 56.84 + 56.36 + 498.65 + 4.79 = 616.64:  40%|████      | 827/2048 [10:43<14:12,  1.43it/s]
loss 2.23 accuracy 0.12 -- 56.84 + 56.36 + 498.65 + 4.79 = 616.64:  40%|████      | 828/2048 [10:43<14:30,  1.40it/s]
loss 1.83 accuracy 0.31 -- 56.35 + 56.92 + 495.17 + 4.78 = 613.23:  40%|████      | 828/2048 [10:43<14:30,  1.40it/s]
loss 1.83 accuracy 0.31 -- 56.35 + 56.92 + 495.17 + 4.78 = 613.23:  40%|████      | 829/2048 [10:43<14:03,  1.45it/s]
loss 1.60 accuracy 0.44 -- 157.36 + 56.75 + 489.16 + 4.79 = 708.06:  40%|████      | 829/2048 [10:44<14:03,  1.45it/s]
loss 1.60 accuracy 0.44 -- 157.36 + 56.75 + 489.16 + 4.79 = 708.06:  41%|████      | 830/2048 [10:44<14:24,  1.41it/s]
loss 1.82 accuracy 0.50 -- 56.08 + 167.04 + 502.14 + 4.77 = 730.03:  41%|████      | 830/2048 [10:45<14:24,  1.41it/s]
loss 1.82 accuracy 0.50 -- 56.08 + 167.04 + 502.14 + 4.77 = 730.03:  41%|████      | 831/2048 [10:45<14:41,  1.38it/s]
loss 1.82 accuracy 0.50 -- 56.55 + 56.58 + 497.01 + 4.78 = 614.92:  41%|████      | 831/2048 [10:46<14:41,  1.38it/s] 
loss 1.82 accuracy 0.50 -- 56.55 + 56.58 + 497.01 + 4.78 = 614.92:  41%|████      | 832/2048 [10:46<14:10,  1.43it/s]
loss 1.73 accuracy 0.25 -- 163.02 + 56.95 + 497.65 + 4.77 = 722.39:  41%|████      | 832/2048 [10:46<14:10,  1.43it/s]
loss 1.73 accuracy 0.25 -- 163.02 + 56.95 + 497.65 + 4.77 = 722.39:  41%|████      | 833/2048 [10:46<14:28,  1.40it/s]
loss 1.68 accuracy 0.19 -- 56.02 + 56.86 + 619.63 + 4.78 = 737.29:  41%|████      | 833/2048 [10:47<14:28,  1.40it/s] 
loss 1.68 accuracy 0.19 -- 56.02 + 56.86 + 619.63 + 4.78 = 737.29:  41%|████      | 834/2048 [10:47<14:45,  1.37it/s]
loss 1.56 accuracy 0.38 -- 56.94 + 56.80 + 505.11 + 4.78 = 623.63:  41%|████      | 834/2048 [10:48<14:45,  1.37it/s]
loss 1.56 accuracy 0.38 -- 56.94 + 56.80 + 505.11 + 4.78 = 623.63:  41%|████      | 835/2048 [10:48<14:16,  1.42it/s]
loss 1.95 accuracy 0.12 -- 56.21 + 57.66 + 618.80 + 4.80 = 737.47:  41%|████      | 835/2048 [10:49<14:16,  1.42it/s]
loss 1.95 accuracy 0.12 -- 56.21 + 57.66 + 618.80 + 4.80 = 737.47:  41%|████      | 836/2048 [10:49<14:37,  1.38it/s]
loss 2.07 accuracy 0.25 -- 56.81 + 56.50 + 502.67 + 4.78 = 620.77:  41%|████      | 836/2048 [10:49<14:37,  1.38it/s]
loss 2.07 accuracy 0.25 -- 56.81 + 56.50 + 502.67 + 4.78 = 620.77:  41%|████      | 837/2048 [10:49<14:08,  1.43it/s]
loss 1.71 accuracy 0.31 -- 56.18 + 166.46 + 503.71 + 4.83 = 731.18:  41%|████      | 837/2048 [10:50<14:08,  1.43it/s]
loss 1.71 accuracy 0.31 -- 56.18 + 166.46 + 503.71 + 4.83 = 731.18:  41%|████      | 838/2048 [10:50<14:29,  1.39it/s]
loss 1.88 accuracy 0.31 -- 56.23 + 56.78 + 499.70 + 4.78 = 617.49:  41%|████      | 838/2048 [10:51<14:29,  1.39it/s] 
loss 1.88 accuracy 0.31 -- 56.23 + 56.78 + 499.70 + 4.78 = 617.49:  41%|████      | 839/2048 [10:51<14:02,  1.44it/s]
loss 1.71 accuracy 0.31 -- 56.71 + 56.30 + 498.30 + 4.78 = 616.09:  41%|████      | 839/2048 [10:51<14:02,  1.44it/s]
loss 1.71 accuracy 0.31 -- 56.71 + 56.30 + 498.30 + 4.78 = 616.09:  41%|████      | 840/2048 [10:51<14:20,  1.40it/s]
loss 2.21 accuracy 0.19 -- 56.28 + 57.25 + 496.71 + 4.78 = 615.01:  41%|████      | 840/2048 [10:52<14:20,  1.40it/s]
loss 2.21 accuracy 0.19 -- 56.28 + 57.25 + 496.71 + 4.78 = 615.01:  41%|████      | 841/2048 [10:52<13:54,  1.45it/s]
loss 1.78 accuracy 0.38 -- 157.69 + 57.17 + 491.34 + 4.77 = 710.97:  41%|████      | 841/2048 [10:53<13:54,  1.45it/s]
loss 1.78 accuracy 0.38 -- 157.69 + 57.17 + 491.34 + 4.77 = 710.97:  41%|████      | 842/2048 [10:53<14:10,  1.42it/s]
loss 2.01 accuracy 0.19 -- 55.83 + 166.30 + 501.28 + 4.78 = 728.19:  41%|████      | 842/2048 [10:53<14:10,  1.42it/s]
loss 2.01 accuracy 0.19 -- 55.83 + 166.30 + 501.28 + 4.78 = 728.19:  41%|████      | 843/2048 [10:53<14:27,  1.39it/s]
loss 1.73 accuracy 0.25 -- 56.34 + 56.38 + 497.59 + 4.78 = 615.10:  41%|████      | 843/2048 [10:54<14:27,  1.39it/s] 
loss 1.73 accuracy 0.25 -- 56.34 + 56.38 + 497.59 + 4.78 = 615.10:  41%|████      | 844/2048 [10:54<13:58,  1.44it/s]
loss 2.09 accuracy 0.38 -- 162.94 + 56.92 + 497.03 + 4.77 = 721.65:  41%|████      | 844/2048 [10:55<13:58,  1.44it/s]
loss 2.09 accuracy 0.38 -- 162.94 + 56.92 + 497.03 + 4.77 = 721.65:  41%|████▏     | 845/2048 [10:55<14:16,  1.40it/s]
loss 2.12 accuracy 0.38 -- 56.08 + 57.01 + 621.06 + 4.78 = 738.93:  41%|████▏     | 845/2048 [10:56<14:16,  1.40it/s] 
loss 2.12 accuracy 0.38 -- 56.08 + 57.01 + 621.06 + 4.78 = 738.93:  41%|████▏     | 846/2048 [10:56<14:35,  1.37it/s]
loss 1.98 accuracy 0.25 -- 57.32 + 56.55 + 505.52 + 4.83 = 624.21:  41%|████▏     | 846/2048 [10:56<14:35,  1.37it/s]
loss 1.98 accuracy 0.25 -- 57.32 + 56.55 + 505.52 + 4.83 = 624.21:  41%|████▏     | 847/2048 [10:56<14:06,  1.42it/s]
loss 1.81 accuracy 0.44 -- 56.68 + 57.74 + 616.46 + 4.79 = 735.68:  41%|████▏     | 847/2048 [10:57<14:06,  1.42it/s]
loss 1.81 accuracy 0.44 -- 56.68 + 57.74 + 616.46 + 4.79 = 735.68:  41%|████▏     | 848/2048 [10:57<14:26,  1.38it/s]
loss 1.76 accuracy 0.38 -- 56.62 + 56.45 + 503.42 + 4.80 = 621.29:  41%|████▏     | 848/2048 [10:58<14:26,  1.38it/s]
loss 1.76 accuracy 0.38 -- 56.62 + 56.45 + 503.42 + 4.80 = 621.29:  41%|████▏     | 849/2048 [10:58<13:59,  1.43it/s]
loss 1.68 accuracy 0.44 -- 56.40 + 166.36 + 503.49 + 4.78 = 731.03:  41%|████▏     | 849/2048 [10:58<13:59,  1.43it/s]
loss 1.68 accuracy 0.44 -- 56.40 + 166.36 + 503.49 + 4.78 = 731.03:  42%|████▏     | 850/2048 [10:58<14:19,  1.39it/s]
loss 2.47 accuracy 0.19 -- 56.56 + 56.54 + 500.61 + 4.79 = 618.50:  42%|████▏     | 850/2048 [10:59<14:19,  1.39it/s] 
loss 2.47 accuracy 0.19 -- 56.56 + 56.54 + 500.61 + 4.79 = 618.50:  42%|████▏     | 851/2048 [10:59<13:52,  1.44it/s]
loss 2.23 accuracy 0.38 -- 56.48 + 56.60 + 498.87 + 4.80 = 616.75:  42%|████▏     | 851/2048 [11:00<13:52,  1.44it/s]
loss 2.23 accuracy 0.38 -- 56.48 + 56.60 + 498.87 + 4.80 = 616.75:  42%|████▏     | 852/2048 [11:00<14:11,  1.40it/s]
loss 2.17 accuracy 0.19 -- 56.94 + 57.58 + 498.15 + 4.78 = 617.44:  42%|████▏     | 852/2048 [11:00<14:11,  1.40it/s]
loss 2.17 accuracy 0.19 -- 56.94 + 57.58 + 498.15 + 4.78 = 617.44:  42%|████▏     | 853/2048 [11:00<13:46,  1.45it/s]
loss 1.92 accuracy 0.12 -- 157.54 + 57.28 + 490.36 + 4.77 = 709.95:  42%|████▏     | 853/2048 [11:01<13:46,  1.45it/s]
loss 1.92 accuracy 0.12 -- 157.54 + 57.28 + 490.36 + 4.77 = 709.95:  42%|████▏     | 854/2048 [11:01<14:02,  1.42it/s]
loss 1.56 accuracy 0.38 -- 55.84 + 166.49 + 501.86 + 4.77 = 728.96:  42%|████▏     | 854/2048 [11:02<14:02,  1.42it/s]
loss 1.56 accuracy 0.38 -- 55.84 + 166.49 + 501.86 + 4.77 = 728.96:  42%|████▏     | 855/2048 [11:02<14:19,  1.39it/s]
loss 2.04 accuracy 0.25 -- 56.61 + 56.87 + 497.35 + 4.77 = 615.59:  42%|████▏     | 855/2048 [11:03<14:19,  1.39it/s] 
loss 2.04 accuracy 0.25 -- 56.61 + 56.87 + 497.35 + 4.77 = 615.59:  42%|████▏     | 856/2048 [11:03<13:50,  1.43it/s]
loss 1.99 accuracy 0.19 -- 162.50 + 57.05 + 495.92 + 4.76 = 720.23:  42%|████▏     | 856/2048 [11:03<13:50,  1.43it/s]
loss 1.99 accuracy 0.19 -- 162.50 + 57.05 + 495.92 + 4.76 = 720.23:  42%|████▏     | 857/2048 [11:03<14:08,  1.40it/s]
loss 1.56 accuracy 0.62 -- 56.08 + 57.10 + 621.45 + 4.78 = 739.41:  42%|████▏     | 857/2048 [11:04<14:08,  1.40it/s] 
loss 1.56 accuracy 0.62 -- 56.08 + 57.10 + 621.45 + 4.78 = 739.41:  42%|████▏     | 858/2048 [11:04<14:26,  1.37it/s]
loss 2.41 accuracy 0.25 -- 57.04 + 56.93 + 507.49 + 4.81 = 626.27:  42%|████▏     | 858/2048 [11:05<14:26,  1.37it/s]
loss 2.41 accuracy 0.25 -- 57.04 + 56.93 + 507.49 + 4.81 = 626.27:  42%|████▏     | 859/2048 [11:05<13:59,  1.42it/s]
loss 1.67 accuracy 0.31 -- 56.12 + 57.20 + 615.24 + 4.77 = 733.34:  42%|████▏     | 859/2048 [11:06<13:59,  1.42it/s]
loss 1.67 accuracy 0.31 -- 56.12 + 57.20 + 615.24 + 4.77 = 733.34:  42%|████▏     | 860/2048 [11:06<14:17,  1.38it/s]
loss 2.15 accuracy 0.25 -- 56.54 + 56.73 + 501.82 + 4.77 = 619.86:  42%|████▏     | 860/2048 [11:06<14:17,  1.38it/s]
loss 2.15 accuracy 0.25 -- 56.54 + 56.73 + 501.82 + 4.77 = 619.86:  42%|████▏     | 861/2048 [11:06<13:50,  1.43it/s]
loss 1.45 accuracy 0.50 -- 56.03 + 166.19 + 502.01 + 4.79 = 729.01:  42%|████▏     | 861/2048 [11:07<13:50,  1.43it/s]
loss 1.45 accuracy 0.50 -- 56.03 + 166.19 + 502.01 + 4.79 = 729.01:  42%|████▏     | 862/2048 [11:07<14:09,  1.40it/s]
loss 1.87 accuracy 0.44 -- 56.23 + 56.35 + 500.26 + 4.77 = 617.61:  42%|████▏     | 862/2048 [11:08<14:09,  1.40it/s] 
loss 1.87 accuracy 0.44 -- 56.23 + 56.35 + 500.26 + 4.77 = 617.61:  42%|████▏     | 863/2048 [11:08<13:43,  1.44it/s]
loss 1.73 accuracy 0.31 -- 56.58 + 56.49 + 499.31 + 4.78 = 617.16:  42%|████▏     | 863/2048 [11:08<13:43,  1.44it/s]
loss 1.73 accuracy 0.31 -- 56.58 + 56.49 + 499.31 + 4.78 = 617.16:  42%|████▏     | 864/2048 [11:08<14:02,  1.40it/s]
loss 1.86 accuracy 0.25 -- 56.39 + 57.59 + 496.32 + 4.79 = 615.08:  42%|████▏     | 864/2048 [11:09<14:02,  1.40it/s]
loss 1.86 accuracy 0.25 -- 56.39 + 57.59 + 496.32 + 4.79 = 615.08:  42%|████▏     | 865/2048 [11:09<13:37,  1.45it/s]
loss 1.45 accuracy 0.44 -- 157.84 + 56.84 + 489.39 + 4.78 = 708.85:  42%|████▏     | 865/2048 [11:10<13:37,  1.45it/s]
loss 1.45 accuracy 0.44 -- 157.84 + 56.84 + 489.39 + 4.78 = 708.85:  42%|████▏     | 866/2048 [11:10<13:52,  1.42it/s]
loss 2.11 accuracy 0.25 -- 55.65 + 166.21 + 501.14 + 4.78 = 727.78:  42%|████▏     | 866/2048 [11:10<13:52,  1.42it/s]
loss 2.11 accuracy 0.25 -- 55.65 + 166.21 + 501.14 + 4.78 = 727.78:  42%|████▏     | 867/2048 [11:10<14:09,  1.39it/s]
loss 1.85 accuracy 0.25 -- 56.95 + 56.59 + 498.15 + 4.78 = 616.48:  42%|████▏     | 867/2048 [11:11<14:09,  1.39it/s] 
loss 1.85 accuracy 0.25 -- 56.95 + 56.59 + 498.15 + 4.78 = 616.48:  42%|████▏     | 868/2048 [11:11<13:41,  1.44it/s]
loss 1.68 accuracy 0.25 -- 162.80 + 57.16 + 497.14 + 4.77 = 721.87:  42%|████▏     | 868/2048 [11:12<13:41,  1.44it/s]
loss 1.68 accuracy 0.25 -- 162.80 + 57.16 + 497.14 + 4.77 = 721.87:  42%|████▏     | 869/2048 [11:12<13:59,  1.40it/s]
loss 1.55 accuracy 0.31 -- 55.91 + 57.44 + 620.88 + 4.79 = 739.03:  42%|████▏     | 869/2048 [11:13<13:59,  1.40it/s]
loss 1.55 accuracy 0.31 -- 55.91 + 57.44 + 620.88 + 4.79 = 739.03:  42%|████▏     | 870/2048 [11:13<14:18,  1.37it/s]
loss 1.70 accuracy 0.31 -- 56.60 + 56.55 + 506.13 + 4.78 = 624.06:  42%|████▏     | 870/2048 [11:13<14:18,  1.37it/s]
loss 1.70 accuracy 0.31 -- 56.60 + 56.55 + 506.13 + 4.78 = 624.06:  43%|████▎     | 871/2048 [11:13<13:50,  1.42it/s]
loss 1.95 accuracy 0.19 -- 56.13 + 57.30 + 616.84 + 4.79 = 735.06:  43%|████▎     | 871/2048 [11:14<13:50,  1.42it/s]
loss 1.95 accuracy 0.19 -- 56.13 + 57.30 + 616.84 + 4.79 = 735.06:  43%|████▎     | 872/2048 [11:14<14:09,  1.38it/s]
loss 2.10 accuracy 0.38 -- 56.74 + 56.54 + 501.74 + 4.78 = 619.81:  43%|████▎     | 872/2048 [11:15<14:09,  1.38it/s]
loss 2.10 accuracy 0.38 -- 56.74 + 56.54 + 501.74 + 4.78 = 619.81:  43%|████▎     | 873/2048 [11:15<13:42,  1.43it/s]
loss 2.04 accuracy 0.31 -- 56.34 + 166.89 + 501.48 + 4.77 = 729.48:  43%|████▎     | 873/2048 [11:15<13:42,  1.43it/s]
loss 2.04 accuracy 0.31 -- 56.34 + 166.89 + 501.48 + 4.77 = 729.48:  43%|████▎     | 874/2048 [11:15<14:01,  1.40it/s]
loss 1.70 accuracy 0.31 -- 56.31 + 56.37 + 500.06 + 4.78 = 617.52:  43%|████▎     | 874/2048 [11:16<14:01,  1.40it/s]
loss 1.70 accuracy 0.31 -- 56.31 + 56.37 + 500.06 + 4.78 = 617.52:  43%|████▎     | 875/2048 [11:16<13:35,  1.44it/s]
loss 1.64 accuracy 0.38 -- 56.64 + 56.68 + 498.99 + 4.79 = 617.10:  43%|████▎     | 875/2048 [11:17<13:35,  1.44it/s]
loss 1.64 accuracy 0.38 -- 56.64 + 56.68 + 498.99 + 4.79 = 617.10:  43%|████▎     | 876/2048 [11:17<13:54,  1.40it/s]
loss 1.68 accuracy 0.44 -- 56.22 + 57.16 + 494.71 + 4.77 = 612.85:  43%|████▎     | 876/2048 [11:17<13:54,  1.40it/s]
loss 1.68 accuracy 0.44 -- 56.22 + 57.16 + 494.71 + 4.77 = 612.85:  43%|████▎     | 877/2048 [11:17<13:28,  1.45it/s]
loss 2.25 accuracy 0.19 -- 158.02 + 56.72 + 489.31 + 4.77 = 708.82:  43%|████▎     | 877/2048 [11:18<13:28,  1.45it/s]
loss 2.25 accuracy 0.19 -- 158.02 + 56.72 + 489.31 + 4.77 = 708.82:  43%|████▎     | 878/2048 [11:18<13:43,  1.42it/s]
loss 2.36 accuracy 0.25 -- 55.73 + 166.32 + 501.19 + 4.79 = 728.03:  43%|████▎     | 878/2048 [11:19<13:43,  1.42it/s]
loss 2.36 accuracy 0.25 -- 55.73 + 166.32 + 501.19 + 4.79 = 728.03:  43%|████▎     | 879/2048 [11:19<14:00,  1.39it/s]
loss 1.94 accuracy 0.31 -- 56.58 + 56.64 + 498.34 + 4.77 = 616.33:  43%|████▎     | 879/2048 [11:20<14:00,  1.39it/s]
loss 1.94 accuracy 0.31 -- 56.58 + 56.64 + 498.34 + 4.77 = 616.33:  43%|████▎     | 880/2048 [11:20<13:33,  1.44it/s]
loss 1.89 accuracy 0.38 -- 162.90 + 57.04 + 496.89 + 4.77 = 721.60:  43%|████▎     | 880/2048 [11:20<13:33,  1.44it/s]
loss 1.89 accuracy 0.38 -- 162.90 + 57.04 + 496.89 + 4.77 = 721.60:  43%|████▎     | 881/2048 [11:20<13:51,  1.40it/s]
loss 1.51 accuracy 0.38 -- 56.38 + 57.64 + 620.59 + 4.78 = 739.39:  43%|████▎     | 881/2048 [11:21<13:51,  1.40it/s]
loss 1.51 accuracy 0.38 -- 56.38 + 57.64 + 620.59 + 4.78 = 739.39:  43%|████▎     | 882/2048 [11:21<14:17,  1.36it/s]
loss 1.65 accuracy 0.44 -- 57.07 + 56.70 + 503.63 + 4.78 = 622.18:  43%|████▎     | 882/2048 [11:22<14:17,  1.36it/s]
loss 1.65 accuracy 0.44 -- 57.07 + 56.70 + 503.63 + 4.78 = 622.18:  43%|████▎     | 883/2048 [11:22<13:46,  1.41it/s]
loss 2.39 accuracy 0.31 -- 56.44 + 57.11 + 615.29 + 4.80 = 733.64:  43%|████▎     | 883/2048 [11:23<13:46,  1.41it/s]
loss 2.39 accuracy 0.31 -- 56.44 + 57.11 + 615.29 + 4.80 = 733.64:  43%|████▎     | 884/2048 [11:23<14:03,  1.38it/s]
loss 1.90 accuracy 0.31 -- 56.76 + 56.49 + 502.15 + 4.87 = 620.27:  43%|████▎     | 884/2048 [11:23<14:03,  1.38it/s]
loss 1.90 accuracy 0.31 -- 56.76 + 56.49 + 502.15 + 4.87 = 620.27:  43%|████▎     | 885/2048 [11:23<13:35,  1.43it/s]
loss 1.44 accuracy 0.31 -- 56.18 + 166.45 + 502.51 + 4.79 = 729.94:  43%|████▎     | 885/2048 [11:24<13:35,  1.43it/s]
loss 1.44 accuracy 0.31 -- 56.18 + 166.45 + 502.51 + 4.79 = 729.94:  43%|████▎     | 886/2048 [11:24<13:54,  1.39it/s]
loss 1.93 accuracy 0.38 -- 56.16 + 56.80 + 500.58 + 4.77 = 618.32:  43%|████▎     | 886/2048 [11:25<13:54,  1.39it/s]
loss 1.93 accuracy 0.38 -- 56.16 + 56.80 + 500.58 + 4.77 = 618.32:  43%|████▎     | 887/2048 [11:25<13:28,  1.44it/s]
loss 1.91 accuracy 0.19 -- 56.80 + 56.64 + 497.88 + 4.78 = 616.10:  43%|████▎     | 887/2048 [11:25<13:28,  1.44it/s]
loss 1.91 accuracy 0.19 -- 56.80 + 56.64 + 497.88 + 4.78 = 616.10:  43%|████▎     | 888/2048 [11:25<13:46,  1.40it/s]
loss 1.93 accuracy 0.44 -- 56.32 + 57.52 + 495.49 + 4.78 = 614.11:  43%|████▎     | 888/2048 [11:26<13:46,  1.40it/s]
loss 1.93 accuracy 0.44 -- 56.32 + 57.52 + 495.49 + 4.78 = 614.11:  43%|████▎     | 889/2048 [11:26<13:20,  1.45it/s]
loss 1.81 accuracy 0.38 -- 157.14 + 57.02 + 489.45 + 4.80 = 708.40:  43%|████▎     | 889/2048 [11:27<13:20,  1.45it/s]
loss 1.81 accuracy 0.38 -- 157.14 + 57.02 + 489.45 + 4.80 = 708.40:  43%|████▎     | 890/2048 [11:27<13:35,  1.42it/s]
loss 1.57 accuracy 0.38 -- 55.98 + 166.61 + 501.06 + 4.78 = 728.42:  43%|████▎     | 890/2048 [11:27<13:35,  1.42it/s]
loss 1.57 accuracy 0.38 -- 55.98 + 166.61 + 501.06 + 4.78 = 728.42:  44%|████▎     | 891/2048 [11:27<13:52,  1.39it/s]
loss 1.53 accuracy 0.31 -- 56.48 + 56.43 + 498.67 + 4.80 = 616.38:  44%|████▎     | 891/2048 [11:28<13:52,  1.39it/s]
loss 1.53 accuracy 0.31 -- 56.48 + 56.43 + 498.67 + 4.80 = 616.38:  44%|████▎     | 892/2048 [11:28<13:25,  1.43it/s]
loss 2.04 accuracy 0.25 -- 162.91 + 57.41 + 497.47 + 4.83 = 722.62:  44%|████▎     | 892/2048 [11:29<13:25,  1.43it/s]
loss 2.04 accuracy 0.25 -- 162.91 + 57.41 + 497.47 + 4.83 = 722.62:  44%|████▎     | 893/2048 [11:29<13:43,  1.40it/s]
loss 1.80 accuracy 0.44 -- 56.47 + 57.18 + 620.46 + 4.78 = 738.89:  44%|████▎     | 893/2048 [11:30<13:43,  1.40it/s]
loss 1.80 accuracy 0.44 -- 56.47 + 57.18 + 620.46 + 4.78 = 738.89:  44%|████▎     | 894/2048 [11:30<14:01,  1.37it/s]
loss 1.62 accuracy 0.38 -- 56.59 + 56.58 + 505.19 + 4.80 = 623.17:  44%|████▎     | 894/2048 [11:30<14:01,  1.37it/s]
loss 1.62 accuracy 0.38 -- 56.59 + 56.58 + 505.19 + 4.80 = 623.17:  44%|████▎     | 895/2048 [11:30<13:33,  1.42it/s]
loss 2.17 accuracy 0.25 -- 56.41 + 57.29 + 615.55 + 4.78 = 734.02:  44%|████▎     | 895/2048 [11:31<13:33,  1.42it/s]
loss 2.17 accuracy 0.25 -- 56.41 + 57.29 + 615.55 + 4.78 = 734.02:  44%|████▍     | 896/2048 [11:31<13:51,  1.38it/s]
loss 1.77 accuracy 0.31 -- 56.81 + 56.28 + 501.84 + 4.77 = 619.70:  44%|████▍     | 896/2048 [11:32<13:51,  1.38it/s]
loss 1.77 accuracy 0.31 -- 56.81 + 56.28 + 501.84 + 4.77 = 619.70:  44%|████▍     | 897/2048 [11:32<13:25,  1.43it/s]
loss 1.83 accuracy 0.31 -- 56.35 + 167.34 + 503.20 + 4.82 = 731.71:  44%|████▍     | 897/2048 [11:32<13:25,  1.43it/s]
loss 1.83 accuracy 0.31 -- 56.35 + 167.34 + 503.20 + 4.82 = 731.71:  44%|████▍     | 898/2048 [11:32<13:44,  1.39it/s]
loss 1.65 accuracy 0.38 -- 56.02 + 56.51 + 500.25 + 4.78 = 617.57:  44%|████▍     | 898/2048 [11:33<13:44,  1.39it/s]
loss 1.65 accuracy 0.38 -- 56.02 + 56.51 + 500.25 + 4.78 = 617.57:  44%|████▍     | 899/2048 [11:33<13:19,  1.44it/s]
loss 1.65 accuracy 0.56 -- 56.46 + 56.19 + 497.52 + 4.78 = 614.94:  44%|████▍     | 899/2048 [11:34<13:19,  1.44it/s]
loss 1.65 accuracy 0.56 -- 56.46 + 56.19 + 497.52 + 4.78 = 614.94:  44%|████▍     | 900/2048 [11:34<13:36,  1.41it/s]
loss 1.70 accuracy 0.44 -- 56.08 + 57.04 + 494.78 + 4.77 = 612.66:  44%|████▍     | 900/2048 [11:34<13:36,  1.41it/s]
loss 1.70 accuracy 0.44 -- 56.08 + 57.04 + 494.78 + 4.77 = 612.66:  44%|████▍     | 901/2048 [11:34<13:11,  1.45it/s]
loss 2.05 accuracy 0.12 -- 157.79 + 56.92 + 489.18 + 4.77 = 708.67:  44%|████▍     | 901/2048 [11:35<13:11,  1.45it/s]
loss 2.05 accuracy 0.12 -- 157.79 + 56.92 + 489.18 + 4.77 = 708.67:  44%|████▍     | 902/2048 [11:35<13:26,  1.42it/s]
loss 1.58 accuracy 0.31 -- 55.62 + 166.73 + 502.42 + 4.77 = 729.54:  44%|████▍     | 902/2048 [11:36<13:26,  1.42it/s]
loss 1.58 accuracy 0.31 -- 55.62 + 166.73 + 502.42 + 4.77 = 729.54:  44%|████▍     | 903/2048 [11:36<13:44,  1.39it/s]
loss 1.46 accuracy 0.44 -- 56.63 + 56.23 + 499.25 + 4.77 = 616.88:  44%|████▍     | 903/2048 [11:37<13:44,  1.39it/s]
loss 1.46 accuracy 0.44 -- 56.63 + 56.23 + 499.25 + 4.77 = 616.88:  44%|████▍     | 904/2048 [11:37<13:17,  1.43it/s]
loss 1.70 accuracy 0.38 -- 162.91 + 57.12 + 498.61 + 4.77 = 723.41:  44%|████▍     | 904/2048 [11:37<13:17,  1.43it/s]
loss 1.70 accuracy 0.38 -- 162.91 + 57.12 + 498.61 + 4.77 = 723.41:  44%|████▍     | 905/2048 [11:37<13:35,  1.40it/s]
loss 1.70 accuracy 0.19 -- 56.08 + 57.39 + 620.08 + 4.81 = 738.36:  44%|████▍     | 905/2048 [11:38<13:35,  1.40it/s]
loss 1.70 accuracy 0.19 -- 56.08 + 57.39 + 620.08 + 4.81 = 738.36:  44%|████▍     | 906/2048 [11:38<13:52,  1.37it/s]
loss 1.55 accuracy 0.38 -- 56.52 + 56.37 + 504.63 + 4.78 = 622.30:  44%|████▍     | 906/2048 [11:39<13:52,  1.37it/s]
loss 1.55 accuracy 0.38 -- 56.52 + 56.37 + 504.63 + 4.78 = 622.30:  44%|████▍     | 907/2048 [11:39<13:24,  1.42it/s]
loss 2.29 accuracy 0.06 -- 56.11 + 57.46 + 616.16 + 4.78 = 734.50:  44%|████▍     | 907/2048 [11:40<13:24,  1.42it/s]
loss 2.29 accuracy 0.06 -- 56.11 + 57.46 + 616.16 + 4.78 = 734.50:  44%|████▍     | 908/2048 [11:40<13:43,  1.38it/s]
loss 2.23 accuracy 0.31 -- 56.66 + 56.40 + 503.47 + 4.78 = 621.32:  44%|████▍     | 908/2048 [11:40<13:43,  1.38it/s]
loss 2.23 accuracy 0.31 -- 56.66 + 56.40 + 503.47 + 4.78 = 621.32:  44%|████▍     | 909/2048 [11:40<13:17,  1.43it/s]
loss 2.80 accuracy 0.31 -- 56.49 + 166.76 + 503.17 + 4.78 = 731.19:  44%|████▍     | 909/2048 [11:41<13:17,  1.43it/s]
loss 2.80 accuracy 0.31 -- 56.49 + 166.76 + 503.17 + 4.78 = 731.19:  44%|████▍     | 910/2048 [11:41<13:36,  1.39it/s]
loss 1.68 accuracy 0.44 -- 56.40 + 56.39 + 500.80 + 4.76 = 618.35:  44%|████▍     | 910/2048 [11:42<13:36,  1.39it/s]
loss 1.68 accuracy 0.44 -- 56.40 + 56.39 + 500.80 + 4.76 = 618.35:  44%|████▍     | 911/2048 [11:42<13:11,  1.44it/s]
loss 1.25 accuracy 0.56 -- 57.19 + 56.71 + 499.46 + 4.78 = 618.14:  44%|████▍     | 911/2048 [11:42<13:11,  1.44it/s]
loss 1.25 accuracy 0.56 -- 57.19 + 56.71 + 499.46 + 4.78 = 618.14:  45%|████▍     | 912/2048 [11:42<13:29,  1.40it/s]
loss 1.94 accuracy 0.38 -- 55.87 + 57.13 + 495.48 + 4.77 = 613.25:  45%|████▍     | 912/2048 [11:43<13:29,  1.40it/s]
loss 1.94 accuracy 0.38 -- 55.87 + 57.13 + 495.48 + 4.77 = 613.25:  45%|████▍     | 913/2048 [11:43<13:04,  1.45it/s]
loss 2.16 accuracy 0.31 -- 157.12 + 56.80 + 489.77 + 4.78 = 708.47:  45%|████▍     | 913/2048 [11:44<13:04,  1.45it/s]
loss 2.16 accuracy 0.31 -- 157.12 + 56.80 + 489.77 + 4.78 = 708.47:  45%|████▍     | 914/2048 [11:44<13:19,  1.42it/s]
loss 1.62 accuracy 0.44 -- 55.66 + 166.64 + 503.77 + 4.79 = 730.85:  45%|████▍     | 914/2048 [11:44<13:19,  1.42it/s]
loss 1.62 accuracy 0.44 -- 55.66 + 166.64 + 503.77 + 4.79 = 730.85:  45%|████▍     | 915/2048 [11:44<13:36,  1.39it/s]
loss 1.76 accuracy 0.31 -- 56.82 + 56.44 + 498.21 + 4.77 = 616.24:  45%|████▍     | 915/2048 [11:45<13:36,  1.39it/s]
loss 1.76 accuracy 0.31 -- 56.82 + 56.44 + 498.21 + 4.77 = 616.24:  45%|████▍     | 916/2048 [11:45<13:09,  1.43it/s]
loss 1.32 accuracy 0.38 -- 162.21 + 56.78 + 497.61 + 4.77 = 721.37:  45%|████▍     | 916/2048 [11:46<13:09,  1.43it/s]
loss 1.32 accuracy 0.38 -- 162.21 + 56.78 + 497.61 + 4.77 = 721.37:  45%|████▍     | 917/2048 [11:46<13:26,  1.40it/s]
loss 2.20 accuracy 0.44 -- 56.25 + 57.59 + 620.84 + 4.79 = 739.48:  45%|████▍     | 917/2048 [11:47<13:26,  1.40it/s]
loss 2.20 accuracy 0.44 -- 56.25 + 57.59 + 620.84 + 4.79 = 739.48:  45%|████▍     | 918/2048 [11:47<13:43,  1.37it/s]
loss 1.96 accuracy 0.44 -- 57.04 + 56.62 + 504.89 + 4.78 = 623.33:  45%|████▍     | 918/2048 [11:47<13:43,  1.37it/s]
loss 1.96 accuracy 0.44 -- 57.04 + 56.62 + 504.89 + 4.78 = 623.33:  45%|████▍     | 919/2048 [11:47<13:16,  1.42it/s]
loss 1.64 accuracy 0.31 -- 56.16 + 57.35 + 616.75 + 4.80 = 735.06:  45%|████▍     | 919/2048 [11:48<13:16,  1.42it/s]
loss 1.64 accuracy 0.31 -- 56.16 + 57.35 + 616.75 + 4.80 = 735.06:  45%|████▍     | 920/2048 [11:48<13:35,  1.38it/s]
loss 2.05 accuracy 0.31 -- 56.64 + 57.36 + 502.66 + 4.77 = 621.42:  45%|████▍     | 920/2048 [11:49<13:35,  1.38it/s]
loss 2.05 accuracy 0.31 -- 56.64 + 57.36 + 502.66 + 4.77 = 621.42:  45%|████▍     | 921/2048 [11:49<13:09,  1.43it/s]
loss 1.82 accuracy 0.44 -- 56.38 + 166.28 + 501.82 + 4.77 = 729.24:  45%|████▍     | 921/2048 [11:49<13:09,  1.43it/s]
loss 1.82 accuracy 0.44 -- 56.38 + 166.28 + 501.82 + 4.77 = 729.24:  45%|████▌     | 922/2048 [11:49<13:27,  1.39it/s]
loss 2.01 accuracy 0.25 -- 56.13 + 56.50 + 499.76 + 4.77 = 617.16:  45%|████▌     | 922/2048 [11:50<13:27,  1.39it/s]
loss 2.01 accuracy 0.25 -- 56.13 + 56.50 + 499.76 + 4.77 = 617.16:  45%|████▌     | 923/2048 [11:50<13:02,  1.44it/s]
loss 2.25 accuracy 0.31 -- 56.73 + 56.58 + 498.39 + 4.77 = 616.48:  45%|████▌     | 923/2048 [11:51<13:02,  1.44it/s]
loss 2.25 accuracy 0.31 -- 56.73 + 56.58 + 498.39 + 4.77 = 616.48:  45%|████▌     | 924/2048 [11:51<13:20,  1.40it/s]
loss 1.64 accuracy 0.31 -- 55.75 + 57.72 + 495.39 + 4.80 = 613.65:  45%|████▌     | 924/2048 [11:52<13:20,  1.40it/s]
loss 1.64 accuracy 0.31 -- 55.75 + 57.72 + 495.39 + 4.80 = 613.65:  45%|████▌     | 925/2048 [11:52<12:55,  1.45it/s]
loss 2.17 accuracy 0.19 -- 157.67 + 57.24 + 490.17 + 4.78 = 709.85:  45%|████▌     | 925/2048 [11:52<12:55,  1.45it/s]
loss 2.17 accuracy 0.19 -- 157.67 + 57.24 + 490.17 + 4.78 = 709.85:  45%|████▌     | 926/2048 [11:52<13:10,  1.42it/s]
loss 2.32 accuracy 0.12 -- 56.25 + 167.18 + 502.02 + 4.77 = 730.23:  45%|████▌     | 926/2048 [11:53<13:10,  1.42it/s]
loss 2.32 accuracy 0.12 -- 56.25 + 167.18 + 502.02 + 4.77 = 730.23:  45%|████▌     | 927/2048 [11:53<13:27,  1.39it/s]
loss 1.57 accuracy 0.31 -- 56.67 + 56.55 + 497.52 + 4.80 = 615.54:  45%|████▌     | 927/2048 [11:54<13:27,  1.39it/s]
loss 1.57 accuracy 0.31 -- 56.67 + 56.55 + 497.52 + 4.80 = 615.54:  45%|████▌     | 928/2048 [11:54<13:00,  1.43it/s]
loss 1.88 accuracy 0.31 -- 162.64 + 56.99 + 497.51 + 4.79 = 721.93:  45%|████▌     | 928/2048 [11:54<13:00,  1.43it/s]
loss 1.88 accuracy 0.31 -- 162.64 + 56.99 + 497.51 + 4.79 = 721.93:  45%|████▌     | 929/2048 [11:54<13:17,  1.40it/s]
loss 1.74 accuracy 0.44 -- 56.15 + 56.99 + 620.25 + 4.78 = 738.17:  45%|████▌     | 929/2048 [11:55<13:17,  1.40it/s]
loss 1.74 accuracy 0.44 -- 56.15 + 56.99 + 620.25 + 4.78 = 738.17:  45%|████▌     | 930/2048 [11:55<13:34,  1.37it/s]
loss 1.75 accuracy 0.56 -- 56.57 + 56.75 + 505.13 + 4.77 = 623.22:  45%|████▌     | 930/2048 [11:56<13:34,  1.37it/s]
loss 1.75 accuracy 0.56 -- 56.57 + 56.75 + 505.13 + 4.77 = 623.22:  45%|████▌     | 931/2048 [11:56<13:07,  1.42it/s]
loss 1.80 accuracy 0.25 -- 56.04 + 57.37 + 616.85 + 4.79 = 735.05:  45%|████▌     | 931/2048 [11:57<13:07,  1.42it/s]
loss 1.80 accuracy 0.25 -- 56.04 + 57.37 + 616.85 + 4.79 = 735.05:  46%|████▌     | 932/2048 [11:57<13:25,  1.38it/s]
loss 2.40 accuracy 0.31 -- 56.61 + 56.43 + 502.01 + 4.78 = 619.83:  46%|████▌     | 932/2048 [11:57<13:25,  1.38it/s]
loss 2.40 accuracy 0.31 -- 56.61 + 56.43 + 502.01 + 4.78 = 619.83:  46%|████▌     | 933/2048 [11:57<13:00,  1.43it/s]
loss 2.00 accuracy 0.31 -- 56.28 + 166.27 + 502.32 + 4.78 = 729.64:  46%|████▌     | 933/2048 [11:58<13:00,  1.43it/s]
loss 2.00 accuracy 0.31 -- 56.28 + 166.27 + 502.32 + 4.78 = 729.64:  46%|████▌     | 934/2048 [11:58<13:18,  1.40it/s]
loss 2.14 accuracy 0.19 -- 56.24 + 56.42 + 499.08 + 4.77 = 616.52:  46%|████▌     | 934/2048 [11:59<13:18,  1.40it/s]
loss 2.14 accuracy 0.19 -- 56.24 + 56.42 + 499.08 + 4.77 = 616.52:  46%|████▌     | 935/2048 [11:59<12:53,  1.44it/s]
loss 1.97 accuracy 0.31 -- 56.63 + 56.32 + 497.79 + 4.78 = 615.52:  46%|████▌     | 935/2048 [11:59<12:53,  1.44it/s]
loss 1.97 accuracy 0.31 -- 56.63 + 56.32 + 497.79 + 4.78 = 615.52:  46%|████▌     | 936/2048 [11:59<13:11,  1.41it/s]
loss 1.68 accuracy 0.19 -- 55.90 + 57.39 + 495.81 + 4.79 = 613.89:  46%|████▌     | 936/2048 [12:00<13:11,  1.41it/s]
loss 1.68 accuracy 0.19 -- 55.90 + 57.39 + 495.81 + 4.79 = 613.89:  46%|████▌     | 937/2048 [12:00<12:46,  1.45it/s]
loss 1.80 accuracy 0.38 -- 158.05 + 57.29 + 491.50 + 4.76 = 711.61:  46%|████▌     | 937/2048 [12:01<12:46,  1.45it/s]
loss 1.80 accuracy 0.38 -- 158.05 + 57.29 + 491.50 + 4.76 = 711.61:  46%|████▌     | 938/2048 [12:01<13:02,  1.42it/s]
loss 1.93 accuracy 0.19 -- 55.87 + 166.41 + 502.00 + 4.82 = 729.10:  46%|████▌     | 938/2048 [12:01<13:02,  1.42it/s]
loss 1.93 accuracy 0.19 -- 55.87 + 166.41 + 502.00 + 4.82 = 729.10:  46%|████▌     | 939/2048 [12:01<13:18,  1.39it/s]
loss 1.88 accuracy 0.44 -- 56.87 + 56.87 + 500.20 + 4.77 = 618.70:  46%|████▌     | 939/2048 [12:02<13:18,  1.39it/s]
loss 1.88 accuracy 0.44 -- 56.87 + 56.87 + 500.20 + 4.77 = 618.70:  46%|████▌     | 940/2048 [12:02<12:53,  1.43it/s]
loss 2.05 accuracy 0.31 -- 162.53 + 56.94 + 486.01 + 4.78 = 710.26:  46%|████▌     | 940/2048 [12:03<12:53,  1.43it/s]
loss 2.05 accuracy 0.31 -- 162.53 + 56.94 + 486.01 + 4.78 = 710.26:  46%|████▌     | 941/2048 [12:03<13:05,  1.41it/s]
loss 2.01 accuracy 0.19 -- 56.54 + 57.33 + 623.04 + 4.77 = 741.68:  46%|████▌     | 941/2048 [12:04<13:05,  1.41it/s]
loss 2.01 accuracy 0.19 -- 56.54 + 57.33 + 623.04 + 4.77 = 741.68:  46%|████▌     | 942/2048 [12:04<13:24,  1.37it/s]
loss 1.70 accuracy 0.44 -- 56.77 + 56.55 + 505.42 + 4.78 = 623.51:  46%|████▌     | 942/2048 [12:04<13:24,  1.37it/s]
loss 1.70 accuracy 0.44 -- 56.77 + 56.55 + 505.42 + 4.78 = 623.51:  46%|████▌     | 943/2048 [12:04<12:58,  1.42it/s]
loss 2.05 accuracy 0.12 -- 56.78 + 57.74 + 615.46 + 4.78 = 734.75:  46%|████▌     | 943/2048 [12:05<12:58,  1.42it/s]
loss 2.05 accuracy 0.12 -- 56.78 + 57.74 + 615.46 + 4.78 = 734.75:  46%|████▌     | 944/2048 [12:05<13:16,  1.39it/s]
loss 1.75 accuracy 0.38 -- 56.79 + 56.46 + 501.62 + 4.77 = 619.64:  46%|████▌     | 944/2048 [12:06<13:16,  1.39it/s]
loss 1.75 accuracy 0.38 -- 56.79 + 56.46 + 501.62 + 4.77 = 619.64:  46%|████▌     | 945/2048 [12:06<12:51,  1.43it/s]
loss 2.04 accuracy 0.44 -- 55.94 + 166.39 + 501.82 + 4.77 = 728.92:  46%|████▌     | 945/2048 [12:06<12:51,  1.43it/s]
loss 2.04 accuracy 0.44 -- 55.94 + 166.39 + 501.82 + 4.77 = 728.92:  46%|████▌     | 946/2048 [12:06<13:09,  1.40it/s]
loss 1.83 accuracy 0.38 -- 55.99 + 56.61 + 498.86 + 4.76 = 616.22:  46%|████▌     | 946/2048 [12:07<13:09,  1.40it/s]
loss 1.83 accuracy 0.38 -- 55.99 + 56.61 + 498.86 + 4.76 = 616.22:  46%|████▌     | 947/2048 [12:07<12:44,  1.44it/s]
loss 1.65 accuracy 0.44 -- 56.61 + 56.45 + 498.02 + 4.80 = 615.89:  46%|████▌     | 947/2048 [12:08<12:44,  1.44it/s]
loss 1.65 accuracy 0.44 -- 56.61 + 56.45 + 498.02 + 4.80 = 615.89:  46%|████▋     | 948/2048 [12:08<13:02,  1.41it/s]
loss 1.85 accuracy 0.25 -- 56.31 + 57.41 + 497.46 + 4.78 = 615.97:  46%|████▋     | 948/2048 [12:09<13:02,  1.41it/s]
loss 1.85 accuracy 0.25 -- 56.31 + 57.41 + 497.46 + 4.78 = 615.97:  46%|████▋     | 949/2048 [12:09<12:39,  1.45it/s]
loss 2.04 accuracy 0.38 -- 157.21 + 56.78 + 490.05 + 4.77 = 708.81:  46%|████▋     | 949/2048 [12:09<12:39,  1.45it/s]
loss 2.04 accuracy 0.38 -- 157.21 + 56.78 + 490.05 + 4.77 = 708.81:  46%|████▋     | 950/2048 [12:09<12:53,  1.42it/s]
loss 1.84 accuracy 0.38 -- 55.75 + 166.54 + 500.56 + 4.78 = 727.63:  46%|████▋     | 950/2048 [12:10<12:53,  1.42it/s]
loss 1.84 accuracy 0.38 -- 55.75 + 166.54 + 500.56 + 4.78 = 727.63:  46%|████▋     | 951/2048 [12:10<13:09,  1.39it/s]
loss 1.72 accuracy 0.31 -- 56.61 + 56.56 + 496.97 + 4.78 = 614.92:  46%|████▋     | 951/2048 [12:11<13:09,  1.39it/s]
loss 1.72 accuracy 0.31 -- 56.61 + 56.56 + 496.97 + 4.78 = 614.92:  46%|████▋     | 952/2048 [12:11<12:43,  1.44it/s]
loss 1.73 accuracy 0.38 -- 162.24 + 57.04 + 496.42 + 4.77 = 720.47:  46%|████▋     | 952/2048 [12:11<12:43,  1.44it/s]
loss 1.73 accuracy 0.38 -- 162.24 + 57.04 + 496.42 + 4.77 = 720.47:  47%|████▋     | 953/2048 [12:11<12:59,  1.41it/s]
loss 2.49 accuracy 0.19 -- 55.98 + 57.58 + 620.44 + 4.79 = 738.78:  47%|████▋     | 953/2048 [12:12<12:59,  1.41it/s]
loss 2.49 accuracy 0.19 -- 55.98 + 57.58 + 620.44 + 4.79 = 738.78:  47%|████▋     | 954/2048 [12:12<13:16,  1.37it/s]
loss 2.52 accuracy 0.19 -- 56.99 + 56.88 + 505.27 + 4.77 = 623.92:  47%|████▋     | 954/2048 [12:13<13:16,  1.37it/s]
loss 2.52 accuracy 0.19 -- 56.99 + 56.88 + 505.27 + 4.77 = 623.92:  47%|████▋     | 955/2048 [12:13<12:50,  1.42it/s]
loss 1.87 accuracy 0.31 -- 56.05 + 57.55 + 615.59 + 4.79 = 733.97:  47%|████▋     | 955/2048 [12:14<12:50,  1.42it/s]
loss 1.87 accuracy 0.31 -- 56.05 + 57.55 + 615.59 + 4.79 = 733.97:  47%|████▋     | 956/2048 [12:14<13:08,  1.39it/s]
loss 1.69 accuracy 0.44 -- 56.64 + 56.21 + 502.39 + 4.78 = 620.02:  47%|████▋     | 956/2048 [12:14<13:08,  1.39it/s]
loss 1.69 accuracy 0.44 -- 56.64 + 56.21 + 502.39 + 4.78 = 620.02:  47%|████▋     | 957/2048 [12:14<12:54,  1.41it/s]
loss 2.06 accuracy 0.25 -- 56.21 + 166.01 + 501.60 + 4.78 = 728.60:  47%|████▋     | 957/2048 [12:15<12:54,  1.41it/s]
loss 2.06 accuracy 0.25 -- 56.21 + 166.01 + 501.60 + 4.78 = 728.60:  47%|████▋     | 958/2048 [12:15<13:08,  1.38it/s]
loss 2.17 accuracy 0.38 -- 56.27 + 56.35 + 501.01 + 4.79 = 618.42:  47%|████▋     | 958/2048 [12:16<13:08,  1.38it/s]
loss 2.17 accuracy 0.38 -- 56.27 + 56.35 + 501.01 + 4.79 = 618.42:  47%|████▋     | 959/2048 [12:16<12:42,  1.43it/s]
loss 1.34 accuracy 0.62 -- 56.64 + 56.22 + 500.28 + 4.78 = 617.92:  47%|████▋     | 959/2048 [12:16<12:42,  1.43it/s]
loss 1.34 accuracy 0.62 -- 56.64 + 56.22 + 500.28 + 4.78 = 617.92:  47%|████▋     | 960/2048 [12:16<12:58,  1.40it/s]
loss 2.12 accuracy 0.25 -- 56.28 + 57.88 + 495.74 + 4.78 = 614.67:  47%|████▋     | 960/2048 [12:17<12:58,  1.40it/s]
loss 2.12 accuracy 0.25 -- 56.28 + 57.88 + 495.74 + 4.78 = 614.67:  47%|████▋     | 961/2048 [12:17<12:33,  1.44it/s]
loss 2.11 accuracy 0.25 -- 157.10 + 56.68 + 489.26 + 4.77 = 707.80:  47%|████▋     | 961/2048 [12:18<12:33,  1.44it/s]
loss 2.11 accuracy 0.25 -- 157.10 + 56.68 + 489.26 + 4.77 = 707.80:  47%|████▋     | 962/2048 [12:18<12:46,  1.42it/s]
loss 2.02 accuracy 0.25 -- 55.94 + 166.57 + 501.21 + 4.78 = 728.50:  47%|████▋     | 962/2048 [12:19<12:46,  1.42it/s]
loss 2.02 accuracy 0.25 -- 55.94 + 166.57 + 501.21 + 4.78 = 728.50:  47%|████▋     | 963/2048 [12:19<13:02,  1.39it/s]
loss 1.60 accuracy 0.38 -- 56.76 + 56.43 + 498.43 + 4.78 = 616.40:  47%|████▋     | 963/2048 [12:19<13:02,  1.39it/s]
loss 1.60 accuracy 0.38 -- 56.76 + 56.43 + 498.43 + 4.78 = 616.40:  47%|████▋     | 964/2048 [12:19<12:36,  1.43it/s]
loss 1.67 accuracy 0.31 -- 162.46 + 56.84 + 496.85 + 4.77 = 720.92:  47%|████▋     | 964/2048 [12:20<12:36,  1.43it/s]
loss 1.67 accuracy 0.31 -- 162.46 + 56.84 + 496.85 + 4.77 = 720.92:  47%|████▋     | 965/2048 [12:20<12:52,  1.40it/s]
loss 1.89 accuracy 0.19 -- 56.03 + 57.21 + 621.02 + 4.79 = 739.05:  47%|████▋     | 965/2048 [12:21<12:52,  1.40it/s]
loss 1.89 accuracy 0.19 -- 56.03 + 57.21 + 621.02 + 4.79 = 739.05:  47%|████▋     | 966/2048 [12:21<13:08,  1.37it/s]
loss 1.82 accuracy 0.19 -- 56.77 + 56.47 + 505.35 + 4.79 = 623.39:  47%|████▋     | 966/2048 [12:21<13:08,  1.37it/s]
loss 1.82 accuracy 0.19 -- 56.77 + 56.47 + 505.35 + 4.79 = 623.39:  47%|████▋     | 967/2048 [12:21<12:42,  1.42it/s]
loss 2.26 accuracy 0.19 -- 55.98 + 57.40 + 616.40 + 4.78 = 734.56:  47%|████▋     | 967/2048 [12:22<12:42,  1.42it/s]
loss 2.26 accuracy 0.19 -- 55.98 + 57.40 + 616.40 + 4.78 = 734.56:  47%|████▋     | 968/2048 [12:22<13:00,  1.38it/s]
loss 2.06 accuracy 0.19 -- 56.93 + 56.67 + 501.50 + 4.76 = 619.85:  47%|████▋     | 968/2048 [12:23<13:00,  1.38it/s]
loss 2.06 accuracy 0.19 -- 56.93 + 56.67 + 501.50 + 4.76 = 619.85:  47%|████▋     | 969/2048 [12:23<12:34,  1.43it/s]
loss 1.82 accuracy 0.31 -- 56.20 + 166.19 + 501.33 + 4.78 = 728.50:  47%|████▋     | 969/2048 [12:24<12:34,  1.43it/s]
loss 1.82 accuracy 0.31 -- 56.20 + 166.19 + 501.33 + 4.78 = 728.50:  47%|████▋     | 970/2048 [12:24<12:52,  1.40it/s]
loss 1.94 accuracy 0.25 -- 56.17 + 56.37 + 499.24 + 4.77 = 616.56:  47%|████▋     | 970/2048 [12:24<12:52,  1.40it/s]
loss 1.94 accuracy 0.25 -- 56.17 + 56.37 + 499.24 + 4.77 = 616.56:  47%|████▋     | 971/2048 [12:24<12:28,  1.44it/s]
loss 2.65 accuracy 0.31 -- 57.09 + 56.75 + 499.07 + 4.78 = 617.69:  47%|████▋     | 971/2048 [12:25<12:28,  1.44it/s]
loss 2.65 accuracy 0.31 -- 57.09 + 56.75 + 499.07 + 4.78 = 617.69:  47%|████▋     | 972/2048 [12:25<12:45,  1.40it/s]
loss 1.88 accuracy 0.12 -- 56.04 + 57.33 + 495.12 + 4.77 = 613.26:  47%|████▋     | 972/2048 [12:26<12:45,  1.40it/s]
loss 1.88 accuracy 0.12 -- 56.04 + 57.33 + 495.12 + 4.77 = 613.26:  48%|████▊     | 973/2048 [12:26<12:33,  1.43it/s]
loss 1.92 accuracy 0.19 -- 157.35 + 56.89 + 489.92 + 4.77 = 708.92:  48%|████▊     | 973/2048 [12:26<12:33,  1.43it/s]
loss 1.92 accuracy 0.19 -- 157.35 + 56.89 + 489.92 + 4.77 = 708.92:  48%|████▊     | 974/2048 [12:26<12:44,  1.41it/s]
loss 2.24 accuracy 0.31 -- 55.90 + 166.48 + 501.41 + 4.79 = 728.58:  48%|████▊     | 974/2048 [12:27<12:44,  1.41it/s]
loss 2.24 accuracy 0.31 -- 55.90 + 166.48 + 501.41 + 4.79 = 728.58:  48%|████▊     | 975/2048 [12:27<12:57,  1.38it/s]
loss 1.66 accuracy 0.44 -- 56.54 + 56.69 + 497.66 + 4.78 = 615.67:  48%|████▊     | 975/2048 [12:28<12:57,  1.38it/s]
loss 1.66 accuracy 0.44 -- 56.54 + 56.69 + 497.66 + 4.78 = 615.67:  48%|████▊     | 976/2048 [12:28<12:30,  1.43it/s]
loss 1.85 accuracy 0.50 -- 162.22 + 57.10 + 498.17 + 4.81 = 722.30:  48%|████▊     | 976/2048 [12:28<12:30,  1.43it/s]
loss 1.85 accuracy 0.50 -- 162.22 + 57.10 + 498.17 + 4.81 = 722.30:  48%|████▊     | 977/2048 [12:28<12:45,  1.40it/s]
loss 1.89 accuracy 0.31 -- 56.01 + 57.00 + 621.20 + 4.78 = 739.00:  48%|████▊     | 977/2048 [12:29<12:45,  1.40it/s]
loss 1.89 accuracy 0.31 -- 56.01 + 57.00 + 621.20 + 4.78 = 739.00:  48%|████▊     | 978/2048 [12:29<13:01,  1.37it/s]
loss 2.23 accuracy 0.06 -- 57.04 + 56.34 + 506.23 + 4.80 = 624.40:  48%|████▊     | 978/2048 [12:30<13:01,  1.37it/s]
loss 2.23 accuracy 0.06 -- 57.04 + 56.34 + 506.23 + 4.80 = 624.40:  48%|████▊     | 979/2048 [12:30<12:35,  1.41it/s]
loss 1.90 accuracy 0.44 -- 56.38 + 57.28 + 616.50 + 4.78 = 734.95:  48%|████▊     | 979/2048 [12:31<12:35,  1.41it/s]
loss 1.90 accuracy 0.44 -- 56.38 + 57.28 + 616.50 + 4.78 = 734.95:  48%|████▊     | 980/2048 [12:31<12:52,  1.38it/s]
loss 1.89 accuracy 0.31 -- 56.72 + 57.68 + 503.24 + 4.77 = 622.42:  48%|████▊     | 980/2048 [12:31<12:52,  1.38it/s]
loss 1.89 accuracy 0.31 -- 56.72 + 57.68 + 503.24 + 4.77 = 622.42:  48%|████▊     | 981/2048 [12:31<12:28,  1.43it/s]
loss 2.07 accuracy 0.19 -- 56.39 + 166.27 + 503.18 + 4.78 = 730.62:  48%|████▊     | 981/2048 [12:32<12:28,  1.43it/s]
loss 2.07 accuracy 0.19 -- 56.39 + 166.27 + 503.18 + 4.78 = 730.62:  48%|████▊     | 982/2048 [12:32<12:45,  1.39it/s]
loss 1.34 accuracy 0.56 -- 56.21 + 56.89 + 501.29 + 4.77 = 619.16:  48%|████▊     | 982/2048 [12:33<12:45,  1.39it/s]
loss 1.34 accuracy 0.56 -- 56.21 + 56.89 + 501.29 + 4.77 = 619.16:  48%|████▊     | 983/2048 [12:33<12:21,  1.44it/s]
loss 2.61 accuracy 0.12 -- 56.59 + 56.83 + 498.26 + 4.78 = 616.46:  48%|████▊     | 983/2048 [12:33<12:21,  1.44it/s]
loss 2.61 accuracy 0.12 -- 56.59 + 56.83 + 498.26 + 4.78 = 616.46:  48%|████▊     | 984/2048 [12:33<12:38,  1.40it/s]
loss 1.56 accuracy 0.12 -- 55.90 + 57.33 + 495.46 + 4.78 = 613.47:  48%|████▊     | 984/2048 [12:34<12:38,  1.40it/s]
loss 1.56 accuracy 0.12 -- 55.90 + 57.33 + 495.46 + 4.78 = 613.47:  48%|████▊     | 985/2048 [12:34<12:14,  1.45it/s]
loss 1.57 accuracy 0.50 -- 157.18 + 57.05 + 489.99 + 4.79 = 709.01:  48%|████▊     | 985/2048 [12:35<12:14,  1.45it/s]
loss 1.57 accuracy 0.50 -- 157.18 + 57.05 + 489.99 + 4.79 = 709.01:  48%|████▊     | 986/2048 [12:35<12:28,  1.42it/s]
loss 1.60 accuracy 0.38 -- 56.03 + 166.30 + 503.10 + 4.78 = 730.20:  48%|████▊     | 986/2048 [12:36<12:28,  1.42it/s]
loss 1.60 accuracy 0.38 -- 56.03 + 166.30 + 503.10 + 4.78 = 730.20:  48%|████▊     | 987/2048 [12:36<12:44,  1.39it/s]
loss 1.87 accuracy 0.31 -- 56.81 + 56.38 + 499.90 + 4.84 = 617.93:  48%|████▊     | 987/2048 [12:36<12:44,  1.39it/s]
loss 1.87 accuracy 0.31 -- 56.81 + 56.38 + 499.90 + 4.84 = 617.93:  48%|████▊     | 988/2048 [12:36<12:19,  1.43it/s]
loss 1.93 accuracy 0.25 -- 163.47 + 57.67 + 498.55 + 4.77 = 724.47:  48%|████▊     | 988/2048 [12:37<12:19,  1.43it/s]
loss 1.93 accuracy 0.25 -- 163.47 + 57.67 + 498.55 + 4.77 = 724.47:  48%|████▊     | 989/2048 [12:37<12:36,  1.40it/s]
loss 1.24 accuracy 0.44 -- 56.28 + 57.35 + 620.90 + 4.78 = 739.31:  48%|████▊     | 989/2048 [12:38<12:36,  1.40it/s]
loss 1.24 accuracy 0.44 -- 56.28 + 57.35 + 620.90 + 4.78 = 739.31:  48%|████▊     | 990/2048 [12:38<12:52,  1.37it/s]
loss 1.65 accuracy 0.50 -- 56.46 + 56.47 + 505.17 + 4.78 = 622.88:  48%|████▊     | 990/2048 [12:38<12:52,  1.37it/s]
loss 1.65 accuracy 0.50 -- 56.46 + 56.47 + 505.17 + 4.78 = 622.88:  48%|████▊     | 991/2048 [12:38<12:26,  1.42it/s]
loss 1.47 accuracy 0.50 -- 56.19 + 57.27 + 615.52 + 4.79 = 733.77:  48%|████▊     | 991/2048 [12:39<12:26,  1.42it/s]
loss 1.47 accuracy 0.50 -- 56.19 + 57.27 + 615.52 + 4.79 = 733.77:  48%|████▊     | 992/2048 [12:39<12:42,  1.38it/s]
loss 1.80 accuracy 0.19 -- 56.74 + 56.93 + 501.77 + 4.77 = 620.21:  48%|████▊     | 992/2048 [12:40<12:42,  1.38it/s]
loss 1.80 accuracy 0.19 -- 56.74 + 56.93 + 501.77 + 4.77 = 620.21:  48%|████▊     | 993/2048 [12:40<12:18,  1.43it/s]
loss 1.81 accuracy 0.12 -- 56.00 + 166.25 + 503.48 + 4.78 = 730.51:  48%|████▊     | 993/2048 [12:41<12:18,  1.43it/s]
loss 1.81 accuracy 0.12 -- 56.00 + 166.25 + 503.48 + 4.78 = 730.51:  49%|████▊     | 994/2048 [12:41<12:35,  1.39it/s]
loss 1.83 accuracy 0.44 -- 55.91 + 56.19 + 498.27 + 4.78 = 615.14:  49%|████▊     | 994/2048 [12:41<12:35,  1.39it/s]
loss 1.83 accuracy 0.44 -- 55.91 + 56.19 + 498.27 + 4.78 = 615.14:  49%|████▊     | 995/2048 [12:41<12:11,  1.44it/s]
loss 1.94 accuracy 0.25 -- 57.07 + 56.52 + 498.62 + 4.78 = 616.99:  49%|████▊     | 995/2048 [12:42<12:11,  1.44it/s]
loss 1.94 accuracy 0.25 -- 57.07 + 56.52 + 498.62 + 4.78 = 616.99:  49%|████▊     | 996/2048 [12:42<12:28,  1.41it/s]
loss 2.04 accuracy 0.31 -- 55.88 + 56.89 + 495.00 + 4.78 = 612.56:  49%|████▊     | 996/2048 [12:43<12:28,  1.41it/s]
loss 2.04 accuracy 0.31 -- 55.88 + 56.89 + 495.00 + 4.78 = 612.56:  49%|████▊     | 997/2048 [12:43<12:05,  1.45it/s]
loss 2.38 accuracy 0.06 -- 156.77 + 56.81 + 489.32 + 4.77 = 707.66:  49%|████▊     | 997/2048 [12:43<12:05,  1.45it/s]
loss 2.38 accuracy 0.06 -- 156.77 + 56.81 + 489.32 + 4.77 = 707.66:  49%|████▊     | 998/2048 [12:43<12:18,  1.42it/s]
loss 1.70 accuracy 0.38 -- 55.79 + 166.36 + 502.30 + 4.78 = 729.23:  49%|████▊     | 998/2048 [12:44<12:18,  1.42it/s]
loss 1.70 accuracy 0.38 -- 55.79 + 166.36 + 502.30 + 4.78 = 729.23:  49%|████▉     | 999/2048 [12:44<12:34,  1.39it/s]
loss 1.59 accuracy 0.31 -- 56.90 + 56.75 + 498.45 + 4.79 = 616.89:  49%|████▉     | 999/2048 [12:45<12:34,  1.39it/s]
loss 1.59 accuracy 0.31 -- 56.90 + 56.75 + 498.45 + 4.79 = 616.89:  49%|████▉     | 1000/2048 [12:45<12:10,  1.44it/s]
loss 1.73 accuracy 0.44 -- 162.71 + 56.96 + 497.72 + 4.76 = 722.15:  49%|████▉     | 1000/2048 [12:45<12:10,  1.44it/s]
loss 1.73 accuracy 0.44 -- 162.71 + 56.96 + 497.72 + 4.76 = 722.15:  49%|████▉     | 1001/2048 [12:45<12:26,  1.40it/s]
loss 2.03 accuracy 0.38 -- 56.41 + 57.14 + 619.66 + 4.78 = 737.98:  49%|████▉     | 1001/2048 [12:46<12:26,  1.40it/s]
loss 2.03 accuracy 0.38 -- 56.41 + 57.14 + 619.66 + 4.78 = 737.98:  49%|████▉     | 1002/2048 [12:46<12:41,  1.37it/s]
loss 1.93 accuracy 0.19 -- 56.70 + 56.52 + 504.79 + 4.79 = 622.79:  49%|████▉     | 1002/2048 [12:47<12:41,  1.37it/s]
loss 1.93 accuracy 0.19 -- 56.70 + 56.52 + 504.79 + 4.79 = 622.79:  49%|████▉     | 1003/2048 [12:47<12:16,  1.42it/s]
loss 1.89 accuracy 0.25 -- 56.31 + 57.46 + 616.15 + 4.77 = 734.70:  49%|████▉     | 1003/2048 [12:48<12:16,  1.42it/s]
loss 1.89 accuracy 0.25 -- 56.31 + 57.46 + 616.15 + 4.77 = 734.70:  49%|████▉     | 1004/2048 [12:48<12:33,  1.39it/s]
loss 1.81 accuracy 0.44 -- 56.83 + 56.49 + 503.59 + 4.78 = 621.71:  49%|████▉     | 1004/2048 [12:48<12:33,  1.39it/s]
loss 1.81 accuracy 0.44 -- 56.83 + 56.49 + 503.59 + 4.78 = 621.71:  49%|████▉     | 1005/2048 [12:48<12:21,  1.41it/s]
loss 2.68 accuracy 0.06 -- 56.49 + 166.63 + 502.43 + 4.78 = 730.33:  49%|████▉     | 1005/2048 [12:49<12:21,  1.41it/s]
loss 2.68 accuracy 0.06 -- 56.49 + 166.63 + 502.43 + 4.78 = 730.33:  49%|████▉     | 1006/2048 [12:49<12:34,  1.38it/s]
loss 2.08 accuracy 0.31 -- 56.19 + 56.36 + 499.60 + 4.80 = 616.94:  49%|████▉     | 1006/2048 [12:50<12:34,  1.38it/s] 
loss 2.08 accuracy 0.31 -- 56.19 + 56.36 + 499.60 + 4.80 = 616.94:  49%|████▉     | 1007/2048 [12:50<12:09,  1.43it/s]
loss 2.01 accuracy 0.38 -- 56.74 + 56.53 + 497.70 + 4.77 = 615.74:  49%|████▉     | 1007/2048 [12:50<12:09,  1.43it/s]
loss 2.01 accuracy 0.38 -- 56.74 + 56.53 + 497.70 + 4.77 = 615.74:  49%|████▉     | 1008/2048 [12:50<12:23,  1.40it/s]
loss 2.05 accuracy 0.31 -- 56.12 + 57.45 + 495.70 + 4.78 = 614.05:  49%|████▉     | 1008/2048 [12:51<12:23,  1.40it/s]
loss 2.05 accuracy 0.31 -- 56.12 + 57.45 + 495.70 + 4.78 = 614.05:  49%|████▉     | 1009/2048 [12:51<12:00,  1.44it/s]
loss 1.76 accuracy 0.19 -- 157.67 + 57.20 + 489.91 + 4.77 = 709.55:  49%|████▉     | 1009/2048 [12:52<12:00,  1.44it/s]
loss 1.76 accuracy 0.19 -- 157.67 + 57.20 + 489.91 + 4.77 = 709.55:  49%|████▉     | 1010/2048 [12:52<12:12,  1.42it/s]
loss 2.20 accuracy 0.25 -- 55.67 + 166.59 + 502.65 + 4.76 = 729.68:  49%|████▉     | 1010/2048 [12:53<12:12,  1.42it/s]
loss 2.20 accuracy 0.25 -- 55.67 + 166.59 + 502.65 + 4.76 = 729.68:  49%|████▉     | 1011/2048 [12:53<12:27,  1.39it/s]
loss 1.74 accuracy 0.31 -- 56.78 + 56.35 + 498.26 + 4.78 = 616.17:  49%|████▉     | 1011/2048 [12:53<12:27,  1.39it/s] 
loss 1.74 accuracy 0.31 -- 56.78 + 56.35 + 498.26 + 4.78 = 616.17:  49%|████▉     | 1012/2048 [12:53<12:02,  1.43it/s]
loss 2.29 accuracy 0.25 -- 162.27 + 56.83 + 497.01 + 4.77 = 720.87:  49%|████▉     | 1012/2048 [12:54<12:02,  1.43it/s]
loss 2.29 accuracy 0.25 -- 162.27 + 56.83 + 497.01 + 4.77 = 720.87:  49%|████▉     | 1013/2048 [12:54<12:17,  1.40it/s]
loss 1.48 accuracy 0.56 -- 56.28 + 57.26 + 620.80 + 4.79 = 739.13:  49%|████▉     | 1013/2048 [12:55<12:17,  1.40it/s] 
loss 1.48 accuracy 0.56 -- 56.28 + 57.26 + 620.80 + 4.79 = 739.13:  50%|████▉     | 1014/2048 [12:55<12:33,  1.37it/s]
loss 1.96 accuracy 0.25 -- 56.77 + 56.22 + 506.15 + 4.78 = 623.91:  50%|████▉     | 1014/2048 [12:55<12:33,  1.37it/s]
loss 1.96 accuracy 0.25 -- 56.77 + 56.22 + 506.15 + 4.78 = 623.91:  50%|████▉     | 1015/2048 [12:55<12:08,  1.42it/s]
loss 1.70 accuracy 0.25 -- 56.40 + 57.23 + 615.14 + 4.86 = 733.64:  50%|████▉     | 1015/2048 [12:56<12:08,  1.42it/s]
loss 1.70 accuracy 0.25 -- 56.40 + 57.23 + 615.14 + 4.86 = 733.64:  50%|████▉     | 1016/2048 [12:56<12:25,  1.38it/s]
loss 1.87 accuracy 0.25 -- 57.02 + 56.71 + 503.02 + 4.78 = 621.54:  50%|████▉     | 1016/2048 [12:57<12:25,  1.38it/s]
loss 1.87 accuracy 0.25 -- 57.02 + 56.71 + 503.02 + 4.78 = 621.54:  50%|████▉     | 1017/2048 [12:57<12:01,  1.43it/s]
loss 1.89 accuracy 0.31 -- 56.36 + 166.43 + 502.57 + 4.76 = 730.13:  50%|████▉     | 1017/2048 [12:58<12:01,  1.43it/s]
loss 1.89 accuracy 0.31 -- 56.36 + 166.43 + 502.57 + 4.76 = 730.13:  50%|████▉     | 1018/2048 [12:58<12:18,  1.39it/s]
loss 1.86 accuracy 0.19 -- 56.12 + 56.28 + 500.01 + 4.78 = 617.19:  50%|████▉     | 1018/2048 [12:58<12:18,  1.39it/s] 
loss 1.86 accuracy 0.19 -- 56.12 + 56.28 + 500.01 + 4.78 = 617.19:  50%|████▉     | 1019/2048 [12:58<11:55,  1.44it/s]
loss 2.00 accuracy 0.38 -- 56.40 + 56.77 + 499.56 + 4.78 = 617.51:  50%|████▉     | 1019/2048 [12:59<11:55,  1.44it/s]
loss 2.00 accuracy 0.38 -- 56.40 + 56.77 + 499.56 + 4.78 = 617.51:  50%|████▉     | 1020/2048 [12:59<12:12,  1.40it/s]
loss 1.62 accuracy 0.50 -- 55.93 + 57.49 + 495.61 + 4.78 = 613.82:  50%|████▉     | 1020/2048 [13:00<12:12,  1.40it/s]
loss 1.62 accuracy 0.50 -- 55.93 + 57.49 + 495.61 + 4.78 = 613.82:  50%|████▉     | 1021/2048 [13:00<11:49,  1.45it/s]
loss 2.06 accuracy 0.12 -- 157.21 + 56.80 + 490.41 + 4.82 = 709.23:  50%|████▉     | 1021/2048 [13:00<11:49,  1.45it/s]
loss 2.06 accuracy 0.12 -- 157.21 + 56.80 + 490.41 + 4.82 = 709.23:  50%|████▉     | 1022/2048 [13:00<12:02,  1.42it/s]
loss 1.49 accuracy 0.31 -- 56.14 + 166.58 + 501.47 + 4.78 = 728.97:  50%|████▉     | 1022/2048 [13:01<12:02,  1.42it/s]
loss 1.49 accuracy 0.31 -- 56.14 + 166.58 + 501.47 + 4.78 = 728.97:  50%|████▉     | 1023/2048 [13:01<12:18,  1.39it/s]
loss 1.94 accuracy 0.31 -- 56.87 + 56.42 + 497.66 + 4.85 = 615.80:  50%|████▉     | 1023/2048 [13:02<12:18,  1.39it/s] 
loss 1.94 accuracy 0.31 -- 56.87 + 56.42 + 497.66 + 4.85 = 615.80:  50%|█████     | 1024/2048 [13:02<11:53,  1.43it/s]
loss 1.80 accuracy 0.31 -- 162.36 + 57.05 + 496.55 + 4.77 = 720.73:  50%|█████     | 1024/2048 [13:03<11:53,  1.43it/s]
loss 1.80 accuracy 0.31 -- 162.36 + 57.05 + 496.55 + 4.77 = 720.73:  50%|█████     | 1025/2048 [13:03<12:08,  1.40it/s]
loss 1.76 accuracy 0.44 -- 56.57 + 57.39 + 620.30 + 4.78 = 739.04:  50%|█████     | 1025/2048 [13:03<12:08,  1.40it/s] 
loss 1.76 accuracy 0.44 -- 56.57 + 57.39 + 620.30 + 4.78 = 739.04:  50%|█████     | 1026/2048 [13:03<12:24,  1.37it/s]
loss 1.93 accuracy 0.19 -- 56.91 + 56.66 + 505.92 + 4.76 = 624.25:  50%|█████     | 1026/2048 [13:04<12:24,  1.37it/s]
loss 1.93 accuracy 0.19 -- 56.91 + 56.66 + 505.92 + 4.76 = 624.25:  50%|█████     | 1027/2048 [13:04<12:00,  1.42it/s]
loss 1.97 accuracy 0.25 -- 56.15 + 57.41 + 617.38 + 4.79 = 735.73:  50%|█████     | 1027/2048 [13:05<12:00,  1.42it/s]
loss 1.97 accuracy 0.25 -- 56.15 + 57.41 + 617.38 + 4.79 = 735.73:  50%|█████     | 1028/2048 [13:05<12:16,  1.38it/s]
loss 1.64 accuracy 0.38 -- 56.78 + 56.31 + 502.19 + 4.79 = 620.07:  50%|█████     | 1028/2048 [13:05<12:16,  1.38it/s]
loss 1.64 accuracy 0.38 -- 56.78 + 56.31 + 502.19 + 4.79 = 620.07:  50%|█████     | 1029/2048 [13:05<11:53,  1.43it/s]
loss 1.93 accuracy 0.19 -- 56.52 + 166.32 + 501.80 + 4.78 = 729.42:  50%|█████     | 1029/2048 [13:06<11:53,  1.43it/s]
loss 1.93 accuracy 0.19 -- 56.52 + 166.32 + 501.80 + 4.78 = 729.42:  50%|█████     | 1030/2048 [13:06<12:09,  1.39it/s]
loss 1.74 accuracy 0.38 -- 56.21 + 56.32 + 498.44 + 4.77 = 615.73:  50%|█████     | 1030/2048 [13:07<12:09,  1.39it/s] 
loss 1.74 accuracy 0.38 -- 56.21 + 56.32 + 498.44 + 4.77 = 615.73:  50%|█████     | 1031/2048 [13:07<11:46,  1.44it/s]
loss 2.09 accuracy 0.25 -- 56.78 + 56.54 + 499.31 + 4.79 = 617.41:  50%|█████     | 1031/2048 [13:07<11:46,  1.44it/s]
loss 2.09 accuracy 0.25 -- 56.78 + 56.54 + 499.31 + 4.79 = 617.41:  50%|█████     | 1032/2048 [13:07<12:03,  1.40it/s]
loss 2.12 accuracy 0.25 -- 56.27 + 57.38 + 495.24 + 4.79 = 613.68:  50%|█████     | 1032/2048 [13:08<12:03,  1.40it/s]
loss 2.12 accuracy 0.25 -- 56.27 + 57.38 + 495.24 + 4.79 = 613.68:  50%|█████     | 1033/2048 [13:08<11:40,  1.45it/s]
loss 2.00 accuracy 0.12 -- 158.51 + 57.50 + 489.82 + 4.76 = 710.59:  50%|█████     | 1033/2048 [13:09<11:40,  1.45it/s]
loss 2.00 accuracy 0.12 -- 158.51 + 57.50 + 489.82 + 4.76 = 710.59:  50%|█████     | 1034/2048 [13:09<11:54,  1.42it/s]
loss 2.34 accuracy 0.12 -- 55.71 + 166.21 + 502.53 + 4.76 = 729.22:  50%|█████     | 1034/2048 [13:10<11:54,  1.42it/s]
loss 2.34 accuracy 0.12 -- 55.71 + 166.21 + 502.53 + 4.76 = 729.22:  51%|█████     | 1035/2048 [13:10<12:09,  1.39it/s]
loss 2.29 accuracy 0.06 -- 56.72 + 56.90 + 497.94 + 4.79 = 616.35:  51%|█████     | 1035/2048 [13:10<12:09,  1.39it/s] 
loss 2.29 accuracy 0.06 -- 56.72 + 56.90 + 497.94 + 4.79 = 616.35:  51%|█████     | 1036/2048 [13:10<11:45,  1.43it/s]
loss 1.90 accuracy 0.38 -- 162.54 + 57.27 + 497.92 + 4.78 = 722.50:  51%|█████     | 1036/2048 [13:11<11:45,  1.43it/s]
loss 1.90 accuracy 0.38 -- 162.54 + 57.27 + 497.92 + 4.78 = 722.50:  51%|█████     | 1037/2048 [13:11<12:00,  1.40it/s]
loss 2.09 accuracy 0.25 -- 56.21 + 57.34 + 621.69 + 4.78 = 740.01:  51%|█████     | 1037/2048 [13:12<12:00,  1.40it/s] 
loss 2.09 accuracy 0.25 -- 56.21 + 57.34 + 621.69 + 4.78 = 740.01:  51%|█████     | 1038/2048 [13:12<12:16,  1.37it/s]
loss 2.13 accuracy 0.25 -- 57.22 + 56.56 + 509.24 + 4.78 = 627.80:  51%|█████     | 1038/2048 [13:12<12:16,  1.37it/s]
loss 2.13 accuracy 0.25 -- 57.22 + 56.56 + 509.24 + 4.78 = 627.80:  51%|█████     | 1039/2048 [13:12<11:53,  1.41it/s]
loss 1.76 accuracy 0.31 -- 56.79 + 57.70 + 619.05 + 4.81 = 738.36:  51%|█████     | 1039/2048 [13:13<11:53,  1.41it/s]
loss 1.76 accuracy 0.31 -- 56.79 + 57.70 + 619.05 + 4.81 = 738.36:  51%|█████     | 1040/2048 [13:13<12:10,  1.38it/s]
loss 1.62 accuracy 0.56 -- 56.96 + 56.48 + 502.29 + 4.77 = 620.50:  51%|█████     | 1040/2048 [13:14<12:10,  1.38it/s]
loss 1.62 accuracy 0.56 -- 56.96 + 56.48 + 502.29 + 4.77 = 620.50:  51%|█████     | 1041/2048 [13:14<11:46,  1.43it/s]
loss 1.60 accuracy 0.25 -- 56.45 + 166.09 + 502.72 + 4.79 = 730.05:  51%|█████     | 1041/2048 [13:15<11:46,  1.43it/s]
loss 1.60 accuracy 0.25 -- 56.45 + 166.09 + 502.72 + 4.79 = 730.05:  51%|█████     | 1042/2048 [13:15<12:02,  1.39it/s]
loss 2.55 accuracy 0.31 -- 56.40 + 56.64 + 499.67 + 4.77 = 617.48:  51%|█████     | 1042/2048 [13:15<12:02,  1.39it/s] 
loss 2.55 accuracy 0.31 -- 56.40 + 56.64 + 499.67 + 4.77 = 617.48:  51%|█████     | 1043/2048 [13:15<11:39,  1.44it/s]
loss 1.84 accuracy 0.25 -- 56.75 + 56.80 + 500.30 + 4.79 = 618.64:  51%|█████     | 1043/2048 [13:16<11:39,  1.44it/s]
loss 1.84 accuracy 0.25 -- 56.75 + 56.80 + 500.30 + 4.79 = 618.64:  51%|█████     | 1044/2048 [13:16<11:55,  1.40it/s]
loss 1.69 accuracy 0.38 -- 56.59 + 57.25 + 497.58 + 4.78 = 616.20:  51%|█████     | 1044/2048 [13:17<11:55,  1.40it/s]
loss 1.69 accuracy 0.38 -- 56.59 + 57.25 + 497.58 + 4.78 = 616.20:  51%|█████     | 1045/2048 [13:17<11:34,  1.44it/s]
loss 2.06 accuracy 0.25 -- 158.03 + 57.84 + 491.04 + 4.79 = 711.70:  51%|█████     | 1045/2048 [13:17<11:34,  1.44it/s]
loss 2.06 accuracy 0.25 -- 158.03 + 57.84 + 491.04 + 4.79 = 711.70:  51%|█████     | 1046/2048 [13:17<11:47,  1.42it/s]
loss 2.08 accuracy 0.31 -- 55.86 + 166.33 + 501.34 + 4.82 = 728.35:  51%|█████     | 1046/2048 [13:18<11:47,  1.42it/s]
loss 2.08 accuracy 0.31 -- 55.86 + 166.33 + 501.34 + 4.82 = 728.35:  51%|█████     | 1047/2048 [13:18<12:01,  1.39it/s]
loss 2.10 accuracy 0.31 -- 56.52 + 56.37 + 497.77 + 4.76 = 615.42:  51%|█████     | 1047/2048 [13:19<12:01,  1.39it/s] 
loss 2.10 accuracy 0.31 -- 56.52 + 56.37 + 497.77 + 4.76 = 615.42:  51%|█████     | 1048/2048 [13:19<11:37,  1.43it/s]
loss 1.52 accuracy 0.38 -- 162.49 + 57.34 + 497.11 + 4.77 = 721.71:  51%|█████     | 1048/2048 [13:20<11:37,  1.43it/s]
loss 1.52 accuracy 0.38 -- 162.49 + 57.34 + 497.11 + 4.77 = 721.71:  51%|█████     | 1049/2048 [13:20<11:52,  1.40it/s]
loss 1.64 accuracy 0.38 -- 56.34 + 57.37 + 622.58 + 4.78 = 741.07:  51%|█████     | 1049/2048 [13:20<11:52,  1.40it/s] 
loss 1.64 accuracy 0.38 -- 56.34 + 57.37 + 622.58 + 4.78 = 741.07:  51%|█████▏    | 1050/2048 [13:20<12:08,  1.37it/s]
loss 2.06 accuracy 0.38 -- 57.34 + 56.75 + 505.95 + 4.76 = 624.81:  51%|█████▏    | 1050/2048 [13:21<12:08,  1.37it/s]
loss 2.06 accuracy 0.38 -- 57.34 + 56.75 + 505.95 + 4.76 = 624.81:  51%|█████▏    | 1051/2048 [13:21<11:44,  1.42it/s]
loss 1.70 accuracy 0.44 -- 56.20 + 57.52 + 616.46 + 4.84 = 735.01:  51%|█████▏    | 1051/2048 [13:22<11:44,  1.42it/s]
loss 1.70 accuracy 0.44 -- 56.20 + 57.52 + 616.46 + 4.84 = 735.01:  51%|█████▏    | 1052/2048 [13:22<12:00,  1.38it/s]
loss 1.72 accuracy 0.38 -- 56.94 + 56.42 + 502.54 + 4.76 = 620.65:  51%|█████▏    | 1052/2048 [13:22<12:00,  1.38it/s]
loss 1.72 accuracy 0.38 -- 56.94 + 56.42 + 502.54 + 4.76 = 620.65:  51%|█████▏    | 1053/2048 [13:22<11:37,  1.43it/s]
loss 1.70 accuracy 0.38 -- 56.16 + 166.32 + 501.69 + 4.79 = 728.97:  51%|█████▏    | 1053/2048 [13:23<11:37,  1.43it/s]
loss 1.70 accuracy 0.38 -- 56.16 + 166.32 + 501.69 + 4.79 = 728.97:  51%|█████▏    | 1054/2048 [13:23<11:52,  1.39it/s]
loss 1.89 accuracy 0.38 -- 56.21 + 56.49 + 499.69 + 4.77 = 617.16:  51%|█████▏    | 1054/2048 [13:24<11:52,  1.39it/s] 
loss 1.89 accuracy 0.38 -- 56.21 + 56.49 + 499.69 + 4.77 = 617.16:  52%|█████▏    | 1055/2048 [13:24<11:30,  1.44it/s]
loss 1.57 accuracy 0.50 -- 56.36 + 56.22 + 499.61 + 4.77 = 616.96:  52%|█████▏    | 1055/2048 [13:25<11:30,  1.44it/s]
loss 1.57 accuracy 0.50 -- 56.36 + 56.22 + 499.61 + 4.77 = 616.96:  52%|█████▏    | 1056/2048 [13:25<11:46,  1.40it/s]
loss 2.17 accuracy 0.19 -- 56.33 + 57.18 + 495.42 + 4.82 = 613.74:  52%|█████▏    | 1056/2048 [13:25<11:46,  1.40it/s]
loss 2.17 accuracy 0.19 -- 56.33 + 57.18 + 495.42 + 4.82 = 613.74:  52%|█████▏    | 1057/2048 [13:25<11:24,  1.45it/s]
loss 1.81 accuracy 0.31 -- 157.89 + 56.87 + 489.74 + 4.78 = 709.28:  52%|█████▏    | 1057/2048 [13:26<11:24,  1.45it/s]
loss 1.81 accuracy 0.31 -- 157.89 + 56.87 + 489.74 + 4.78 = 709.28:  52%|█████▏    | 1058/2048 [13:26<11:37,  1.42it/s]
loss 2.38 accuracy 0.19 -- 55.91 + 166.96 + 502.13 + 4.78 = 729.78:  52%|█████▏    | 1058/2048 [13:27<11:37,  1.42it/s]
loss 2.38 accuracy 0.19 -- 55.91 + 166.96 + 502.13 + 4.78 = 729.78:  52%|█████▏    | 1059/2048 [13:27<11:52,  1.39it/s]
loss 1.73 accuracy 0.44 -- 56.43 + 56.54 + 496.79 + 4.77 = 614.53:  52%|█████▏    | 1059/2048 [13:27<11:52,  1.39it/s] 
loss 1.73 accuracy 0.44 -- 56.43 + 56.54 + 496.79 + 4.77 = 614.53:  52%|█████▏    | 1060/2048 [13:27<11:28,  1.44it/s]
loss 1.90 accuracy 0.25 -- 162.84 + 57.54 + 499.84 + 4.78 = 724.99:  52%|█████▏    | 1060/2048 [13:28<11:28,  1.44it/s]
loss 1.90 accuracy 0.25 -- 162.84 + 57.54 + 499.84 + 4.78 = 724.99:  52%|█████▏    | 1061/2048 [13:28<11:44,  1.40it/s]
loss 1.62 accuracy 0.38 -- 56.66 + 57.59 + 620.47 + 4.79 = 739.51:  52%|█████▏    | 1061/2048 [13:29<11:44,  1.40it/s] 
loss 1.62 accuracy 0.38 -- 56.66 + 57.59 + 620.47 + 4.79 = 739.51:  52%|█████▏    | 1062/2048 [13:29<11:59,  1.37it/s]
loss 1.83 accuracy 0.31 -- 56.65 + 56.57 + 504.25 + 4.78 = 622.25:  52%|█████▏    | 1062/2048 [13:29<11:59,  1.37it/s]
loss 1.83 accuracy 0.31 -- 56.65 + 56.57 + 504.25 + 4.78 = 622.25:  52%|█████▏    | 1063/2048 [13:29<11:34,  1.42it/s]
loss 2.42 accuracy 0.06 -- 56.16 + 57.28 + 615.82 + 4.77 = 734.03:  52%|█████▏    | 1063/2048 [13:30<11:34,  1.42it/s]
loss 2.42 accuracy 0.06 -- 56.16 + 57.28 + 615.82 + 4.77 = 734.03:  52%|█████▏    | 1064/2048 [13:30<11:50,  1.38it/s]
loss 1.54 accuracy 0.38 -- 56.64 + 56.43 + 502.22 + 4.78 = 620.07:  52%|█████▏    | 1064/2048 [13:31<11:50,  1.38it/s]
loss 1.54 accuracy 0.38 -- 56.64 + 56.43 + 502.22 + 4.78 = 620.07:  52%|█████▏    | 1065/2048 [13:31<11:27,  1.43it/s]
loss 2.40 accuracy 0.12 -- 56.21 + 165.86 + 501.31 + 4.78 = 728.16:  52%|█████▏    | 1065/2048 [13:32<11:27,  1.43it/s]
loss 2.40 accuracy 0.12 -- 56.21 + 165.86 + 501.31 + 4.78 = 728.16:  52%|█████▏    | 1066/2048 [13:32<11:43,  1.40it/s]
loss 1.76 accuracy 0.38 -- 55.99 + 56.41 + 500.16 + 4.77 = 617.33:  52%|█████▏    | 1066/2048 [13:32<11:43,  1.40it/s] 
loss 1.76 accuracy 0.38 -- 55.99 + 56.41 + 500.16 + 4.77 = 617.33:  52%|█████▏    | 1067/2048 [13:32<11:21,  1.44it/s]
loss 1.86 accuracy 0.38 -- 56.90 + 56.45 + 498.78 + 4.78 = 616.91:  52%|█████▏    | 1067/2048 [13:33<11:21,  1.44it/s]
loss 1.86 accuracy 0.38 -- 56.90 + 56.45 + 498.78 + 4.78 = 616.91:  52%|█████▏    | 1068/2048 [13:33<11:37,  1.41it/s]
loss 1.43 accuracy 0.31 -- 56.06 + 56.89 + 494.68 + 4.76 = 612.40:  52%|█████▏    | 1068/2048 [13:34<11:37,  1.41it/s]
loss 1.43 accuracy 0.31 -- 56.06 + 56.89 + 494.68 + 4.76 = 612.40:  52%|█████▏    | 1069/2048 [13:34<11:15,  1.45it/s]
loss 1.78 accuracy 0.38 -- 157.63 + 57.07 + 490.42 + 4.78 = 709.90:  52%|█████▏    | 1069/2048 [13:34<11:15,  1.45it/s]
loss 1.78 accuracy 0.38 -- 157.63 + 57.07 + 490.42 + 4.78 = 709.90:  52%|█████▏    | 1070/2048 [13:34<11:28,  1.42it/s]
loss 2.38 accuracy 0.06 -- 55.67 + 166.35 + 502.85 + 4.78 = 729.64:  52%|█████▏    | 1070/2048 [13:35<11:28,  1.42it/s]
loss 2.38 accuracy 0.06 -- 55.67 + 166.35 + 502.85 + 4.78 = 729.64:  52%|█████▏    | 1071/2048 [13:35<11:43,  1.39it/s]
loss 1.73 accuracy 0.44 -- 56.70 + 56.77 + 498.21 + 4.78 = 616.47:  52%|█████▏    | 1071/2048 [13:36<11:43,  1.39it/s] 
loss 1.73 accuracy 0.44 -- 56.70 + 56.77 + 498.21 + 4.78 = 616.47:  52%|█████▏    | 1072/2048 [13:36<11:20,  1.43it/s]
loss 1.56 accuracy 0.44 -- 162.64 + 57.00 + 497.08 + 4.77 = 721.50:  52%|█████▏    | 1072/2048 [13:37<11:20,  1.43it/s]
loss 1.56 accuracy 0.44 -- 162.64 + 57.00 + 497.08 + 4.77 = 721.50:  52%|█████▏    | 1073/2048 [13:37<11:34,  1.40it/s]
loss 1.70 accuracy 0.38 -- 56.33 + 57.07 + 619.90 + 4.80 = 738.09:  52%|█████▏    | 1073/2048 [13:37<11:34,  1.40it/s] 
loss 1.70 accuracy 0.38 -- 56.33 + 57.07 + 619.90 + 4.80 = 738.09:  52%|█████▏    | 1074/2048 [13:37<11:49,  1.37it/s]
loss 1.98 accuracy 0.25 -- 56.79 + 56.55 + 505.38 + 4.82 = 623.55:  52%|█████▏    | 1074/2048 [13:38<11:49,  1.37it/s]
loss 1.98 accuracy 0.25 -- 56.79 + 56.55 + 505.38 + 4.82 = 623.55:  52%|█████▏    | 1075/2048 [13:38<11:26,  1.42it/s]
loss 2.16 accuracy 0.31 -- 56.37 + 58.17 + 617.49 + 4.79 = 736.82:  52%|█████▏    | 1075/2048 [13:39<11:26,  1.42it/s]
loss 2.16 accuracy 0.31 -- 56.37 + 58.17 + 617.49 + 4.79 = 736.82:  53%|█████▎    | 1076/2048 [13:39<11:42,  1.38it/s]
loss 1.91 accuracy 0.31 -- 56.59 + 56.65 + 502.74 + 4.78 = 620.76:  53%|█████▎    | 1076/2048 [13:39<11:42,  1.38it/s]
loss 1.91 accuracy 0.31 -- 56.59 + 56.65 + 502.74 + 4.78 = 620.76:  53%|█████▎    | 1077/2048 [13:39<11:19,  1.43it/s]
loss 1.63 accuracy 0.38 -- 56.28 + 166.20 + 501.56 + 4.79 = 728.84:  53%|█████▎    | 1077/2048 [13:40<11:19,  1.43it/s]
loss 1.63 accuracy 0.38 -- 56.28 + 166.20 + 501.56 + 4.79 = 728.84:  53%|█████▎    | 1078/2048 [13:40<11:35,  1.39it/s]
loss 1.70 accuracy 0.31 -- 56.34 + 56.61 + 500.28 + 4.77 = 618.00:  53%|█████▎    | 1078/2048 [13:41<11:35,  1.39it/s] 
loss 1.70 accuracy 0.31 -- 56.34 + 56.61 + 500.28 + 4.77 = 618.00:  53%|█████▎    | 1079/2048 [13:41<11:13,  1.44it/s]
loss 1.60 accuracy 0.19 -- 56.30 + 56.35 + 498.46 + 4.77 = 615.88:  53%|█████▎    | 1079/2048 [13:42<11:13,  1.44it/s]
loss 1.60 accuracy 0.19 -- 56.30 + 56.35 + 498.46 + 4.77 = 615.88:  53%|█████▎    | 1080/2048 [13:42<11:29,  1.40it/s]
loss 1.29 accuracy 0.44 -- 56.22 + 57.31 + 495.64 + 4.78 = 613.95:  53%|█████▎    | 1080/2048 [13:42<11:29,  1.40it/s]
loss 1.29 accuracy 0.44 -- 56.22 + 57.31 + 495.64 + 4.78 = 613.95:  53%|█████▎    | 1081/2048 [13:42<11:07,  1.45it/s]
loss 2.30 accuracy 0.19 -- 158.45 + 56.88 + 490.93 + 4.79 = 711.05:  53%|█████▎    | 1081/2048 [13:43<11:07,  1.45it/s]
loss 2.30 accuracy 0.19 -- 158.45 + 56.88 + 490.93 + 4.79 = 711.05:  53%|█████▎    | 1082/2048 [13:43<11:20,  1.42it/s]
loss 1.47 accuracy 0.50 -- 55.96 + 166.11 + 501.33 + 4.81 = 728.20:  53%|█████▎    | 1082/2048 [13:44<11:20,  1.42it/s]
loss 1.47 accuracy 0.50 -- 55.96 + 166.11 + 501.33 + 4.81 = 728.20:  53%|█████▎    | 1083/2048 [13:44<11:34,  1.39it/s]
loss 1.46 accuracy 0.50 -- 56.53 + 56.63 + 499.56 + 4.80 = 617.51:  53%|█████▎    | 1083/2048 [13:44<11:34,  1.39it/s] 
loss 1.46 accuracy 0.50 -- 56.53 + 56.63 + 499.56 + 4.80 = 617.51:  53%|█████▎    | 1084/2048 [13:44<11:12,  1.43it/s]
loss 1.86 accuracy 0.38 -- 163.14 + 56.80 + 497.86 + 4.78 = 722.58:  53%|█████▎    | 1084/2048 [13:45<11:12,  1.43it/s]
loss 1.86 accuracy 0.38 -- 163.14 + 56.80 + 497.86 + 4.78 = 722.58:  53%|█████▎    | 1085/2048 [13:45<11:26,  1.40it/s]
loss 1.55 accuracy 0.38 -- 56.09 + 57.33 + 620.82 + 4.77 = 739.00:  53%|█████▎    | 1085/2048 [13:46<11:26,  1.40it/s] 
loss 1.55 accuracy 0.38 -- 56.09 + 57.33 + 620.82 + 4.77 = 739.00:  53%|█████▎    | 1086/2048 [13:46<11:41,  1.37it/s]
loss 1.67 accuracy 0.44 -- 56.56 + 56.21 + 504.01 + 4.76 = 621.54:  53%|█████▎    | 1086/2048 [13:46<11:41,  1.37it/s]
loss 1.67 accuracy 0.44 -- 56.56 + 56.21 + 504.01 + 4.76 = 621.54:  53%|█████▎    | 1087/2048 [13:46<11:17,  1.42it/s]
loss 1.84 accuracy 0.19 -- 55.84 + 57.18 + 613.84 + 4.78 = 731.64:  53%|█████▎    | 1087/2048 [13:47<11:17,  1.42it/s]
loss 1.84 accuracy 0.19 -- 55.84 + 57.18 + 613.84 + 4.78 = 731.64:  53%|█████▎    | 1088/2048 [13:47<11:32,  1.39it/s]
loss 2.07 accuracy 0.31 -- 56.45 + 56.36 + 500.81 + 4.78 = 618.40:  53%|█████▎    | 1088/2048 [13:48<11:32,  1.39it/s]
loss 2.07 accuracy 0.31 -- 56.45 + 56.36 + 500.81 + 4.78 = 618.40:  53%|█████▎    | 1089/2048 [13:48<11:09,  1.43it/s]
loss 2.41 accuracy 0.19 -- 56.05 + 165.99 + 501.76 + 4.79 = 728.59:  53%|█████▎    | 1089/2048 [13:49<11:09,  1.43it/s]
loss 2.41 accuracy 0.19 -- 56.05 + 165.99 + 501.76 + 4.79 = 728.59:  53%|█████▎    | 1090/2048 [13:49<11:25,  1.40it/s]
loss 2.21 accuracy 0.25 -- 55.87 + 56.04 + 497.96 + 4.78 = 614.65:  53%|█████▎    | 1090/2048 [13:49<11:25,  1.40it/s] 
loss 2.21 accuracy 0.25 -- 55.87 + 56.04 + 497.96 + 4.78 = 614.65:  53%|█████▎    | 1091/2048 [13:49<11:03,  1.44it/s]
loss 2.19 accuracy 0.12 -- 56.33 + 56.34 + 497.50 + 4.78 = 614.96:  53%|█████▎    | 1091/2048 [13:50<11:03,  1.44it/s]
loss 2.19 accuracy 0.12 -- 56.33 + 56.34 + 497.50 + 4.78 = 614.96:  53%|█████▎    | 1092/2048 [13:50<11:18,  1.41it/s]
loss 1.39 accuracy 0.69 -- 55.77 + 57.11 + 494.26 + 4.78 = 611.92:  53%|█████▎    | 1092/2048 [13:51<11:18,  1.41it/s]
loss 1.39 accuracy 0.69 -- 55.77 + 57.11 + 494.26 + 4.78 = 611.92:  53%|█████▎    | 1093/2048 [13:51<10:57,  1.45it/s]
loss 1.99 accuracy 0.31 -- 157.57 + 56.76 + 488.87 + 4.77 = 707.98:  53%|█████▎    | 1093/2048 [13:51<10:57,  1.45it/s]
loss 1.99 accuracy 0.31 -- 157.57 + 56.76 + 488.87 + 4.77 = 707.98:  53%|█████▎    | 1094/2048 [13:51<11:10,  1.42it/s]
loss 1.92 accuracy 0.25 -- 55.51 + 166.02 + 502.20 + 4.77 = 728.50:  53%|█████▎    | 1094/2048 [13:52<11:10,  1.42it/s]
loss 1.92 accuracy 0.25 -- 55.51 + 166.02 + 502.20 + 4.77 = 728.50:  53%|█████▎    | 1095/2048 [13:52<11:24,  1.39it/s]
loss 2.26 accuracy 0.12 -- 56.54 + 56.36 + 497.54 + 4.78 = 615.22:  53%|█████▎    | 1095/2048 [13:53<11:24,  1.39it/s] 
loss 2.26 accuracy 0.12 -- 56.54 + 56.36 + 497.54 + 4.78 = 615.22:  54%|█████▎    | 1096/2048 [13:53<11:02,  1.44it/s]
loss 1.91 accuracy 0.31 -- 162.28 + 57.02 + 495.56 + 4.76 = 719.63:  54%|█████▎    | 1096/2048 [13:54<11:02,  1.44it/s]
loss 1.91 accuracy 0.31 -- 162.28 + 57.02 + 495.56 + 4.76 = 719.63:  54%|█████▎    | 1097/2048 [13:54<11:16,  1.41it/s]
loss 1.74 accuracy 0.44 -- 55.95 + 56.98 + 618.88 + 4.78 = 736.59:  54%|█████▎    | 1097/2048 [13:54<11:16,  1.41it/s] 
loss 1.74 accuracy 0.44 -- 55.95 + 56.98 + 618.88 + 4.78 = 736.59:  54%|█████▎    | 1098/2048 [13:54<11:30,  1.38it/s]
loss 1.70 accuracy 0.38 -- 56.60 + 56.55 + 504.09 + 4.78 = 622.02:  54%|█████▎    | 1098/2048 [13:55<11:30,  1.38it/s]
loss 1.70 accuracy 0.38 -- 56.60 + 56.55 + 504.09 + 4.78 = 622.02:  54%|█████▎    | 1099/2048 [13:55<11:07,  1.42it/s]
loss 1.53 accuracy 0.38 -- 56.19 + 57.18 + 616.87 + 4.82 = 735.06:  54%|█████▎    | 1099/2048 [13:56<11:07,  1.42it/s]
loss 1.53 accuracy 0.38 -- 56.19 + 57.18 + 616.87 + 4.82 = 735.06:  54%|█████▎    | 1100/2048 [13:56<11:23,  1.39it/s]
loss 1.90 accuracy 0.25 -- 56.89 + 56.62 + 504.03 + 4.81 = 622.35:  54%|█████▎    | 1100/2048 [13:56<11:23,  1.39it/s]
loss 1.90 accuracy 0.25 -- 56.89 + 56.62 + 504.03 + 4.81 = 622.35:  54%|█████▍    | 1101/2048 [13:56<11:02,  1.43it/s]
loss 1.47 accuracy 0.69 -- 56.70 + 166.81 + 502.77 + 4.81 = 731.09:  54%|█████▍    | 1101/2048 [13:57<11:02,  1.43it/s]
loss 1.47 accuracy 0.69 -- 56.70 + 166.81 + 502.77 + 4.81 = 731.09:  54%|█████▍    | 1102/2048 [13:57<11:18,  1.39it/s]
loss 1.75 accuracy 0.31 -- 56.03 + 56.36 + 500.21 + 4.78 = 617.38:  54%|█████▍    | 1102/2048 [13:58<11:18,  1.39it/s] 
loss 1.75 accuracy 0.31 -- 56.03 + 56.36 + 500.21 + 4.78 = 617.38:  54%|█████▍    | 1103/2048 [13:58<10:57,  1.44it/s]
loss 1.69 accuracy 0.38 -- 56.68 + 56.33 + 500.38 + 4.86 = 618.25:  54%|█████▍    | 1103/2048 [13:59<10:57,  1.44it/s]
loss 1.69 accuracy 0.38 -- 56.68 + 56.33 + 500.38 + 4.86 = 618.25:  54%|█████▍    | 1104/2048 [13:59<11:12,  1.40it/s]
loss 1.84 accuracy 0.19 -- 56.07 + 57.01 + 495.67 + 4.82 = 613.57:  54%|█████▍    | 1104/2048 [13:59<11:12,  1.40it/s]
loss 1.84 accuracy 0.19 -- 56.07 + 57.01 + 495.67 + 4.82 = 613.57:  54%|█████▍    | 1105/2048 [13:59<10:51,  1.45it/s]
loss 1.77 accuracy 0.31 -- 157.72 + 56.98 + 490.52 + 4.76 = 709.98:  54%|█████▍    | 1105/2048 [14:00<10:51,  1.45it/s]
loss 1.77 accuracy 0.31 -- 157.72 + 56.98 + 490.52 + 4.76 = 709.98:  54%|█████▍    | 1106/2048 [14:00<11:04,  1.42it/s]
loss 1.51 accuracy 0.31 -- 56.04 + 166.14 + 503.51 + 4.77 = 730.46:  54%|█████▍    | 1106/2048 [14:01<11:04,  1.42it/s]
loss 1.51 accuracy 0.31 -- 56.04 + 166.14 + 503.51 + 4.77 = 730.46:  54%|█████▍    | 1107/2048 [14:01<11:18,  1.39it/s]
loss 1.67 accuracy 0.50 -- 56.93 + 56.78 + 497.85 + 4.79 = 616.35:  54%|█████▍    | 1107/2048 [14:01<11:18,  1.39it/s] 
loss 1.67 accuracy 0.50 -- 56.93 + 56.78 + 497.85 + 4.79 = 616.35:  54%|█████▍    | 1108/2048 [14:01<10:55,  1.43it/s]
loss 1.80 accuracy 0.12 -- 162.51 + 56.91 + 496.40 + 4.79 = 720.60:  54%|█████▍    | 1108/2048 [14:02<10:55,  1.43it/s]
loss 1.80 accuracy 0.12 -- 162.51 + 56.91 + 496.40 + 4.79 = 720.60:  54%|█████▍    | 1109/2048 [14:02<11:09,  1.40it/s]
loss 2.10 accuracy 0.19 -- 56.20 + 57.30 + 622.64 + 4.78 = 740.92:  54%|█████▍    | 1109/2048 [14:03<11:09,  1.40it/s] 
loss 2.10 accuracy 0.19 -- 56.20 + 57.30 + 622.64 + 4.78 = 740.92:  54%|█████▍    | 1110/2048 [14:03<11:23,  1.37it/s]
loss 2.11 accuracy 0.44 -- 56.87 + 56.17 + 504.08 + 4.78 = 621.89:  54%|█████▍    | 1110/2048 [14:03<11:23,  1.37it/s]
loss 2.11 accuracy 0.44 -- 56.87 + 56.17 + 504.08 + 4.78 = 621.89:  54%|█████▍    | 1111/2048 [14:03<11:00,  1.42it/s]
loss 1.28 accuracy 0.50 -- 56.08 + 57.45 + 616.20 + 4.78 = 734.51:  54%|█████▍    | 1111/2048 [14:04<11:00,  1.42it/s]
loss 1.28 accuracy 0.50 -- 56.08 + 57.45 + 616.20 + 4.78 = 734.51:  54%|█████▍    | 1112/2048 [14:04<11:15,  1.38it/s]
loss 1.63 accuracy 0.31 -- 56.91 + 56.90 + 502.86 + 4.80 = 621.48:  54%|█████▍    | 1112/2048 [14:05<11:15,  1.38it/s]
loss 1.63 accuracy 0.31 -- 56.91 + 56.90 + 502.86 + 4.80 = 621.48:  54%|█████▍    | 1113/2048 [14:05<10:54,  1.43it/s]
loss 1.59 accuracy 0.44 -- 56.15 + 166.12 + 501.45 + 4.78 = 728.50:  54%|█████▍    | 1113/2048 [14:06<10:54,  1.43it/s]
loss 1.59 accuracy 0.44 -- 56.15 + 166.12 + 501.45 + 4.78 = 728.50:  54%|█████▍    | 1114/2048 [14:06<11:09,  1.40it/s]
loss 1.77 accuracy 0.31 -- 55.93 + 56.55 + 499.13 + 4.77 = 616.37:  54%|█████▍    | 1114/2048 [14:06<11:09,  1.40it/s] 
loss 1.77 accuracy 0.31 -- 55.93 + 56.55 + 499.13 + 4.77 = 616.37:  54%|█████▍    | 1115/2048 [14:06<10:48,  1.44it/s]
loss 1.78 accuracy 0.25 -- 56.66 + 56.21 + 498.24 + 4.77 = 615.89:  54%|█████▍    | 1115/2048 [14:07<10:48,  1.44it/s]
loss 1.78 accuracy 0.25 -- 56.66 + 56.21 + 498.24 + 4.77 = 615.89:  54%|█████▍    | 1116/2048 [14:07<11:02,  1.41it/s]
loss 1.69 accuracy 0.38 -- 55.72 + 57.58 + 494.81 + 4.75 = 612.86:  54%|█████▍    | 1116/2048 [14:08<11:02,  1.41it/s]
loss 1.69 accuracy 0.38 -- 55.72 + 57.58 + 494.81 + 4.75 = 612.86:  55%|█████▍    | 1117/2048 [14:08<10:42,  1.45it/s]
loss 1.89 accuracy 0.38 -- 157.12 + 57.22 + 490.86 + 4.80 = 710.00:  55%|█████▍    | 1117/2048 [14:08<10:42,  1.45it/s]
loss 1.89 accuracy 0.38 -- 157.12 + 57.22 + 490.86 + 4.80 = 710.00:  55%|█████▍    | 1118/2048 [14:08<10:54,  1.42it/s]
loss 1.80 accuracy 0.25 -- 55.90 + 166.54 + 501.19 + 4.78 = 728.40:  55%|█████▍    | 1118/2048 [14:09<10:54,  1.42it/s]
loss 1.80 accuracy 0.25 -- 55.90 + 166.54 + 501.19 + 4.78 = 728.40:  55%|█████▍    | 1119/2048 [14:09<11:08,  1.39it/s]
loss 1.79 accuracy 0.31 -- 56.44 + 56.17 + 497.25 + 4.78 = 614.64:  55%|█████▍    | 1119/2048 [14:10<11:08,  1.39it/s] 
loss 1.79 accuracy 0.31 -- 56.44 + 56.17 + 497.25 + 4.78 = 614.64:  55%|█████▍    | 1120/2048 [14:10<10:46,  1.44it/s]
loss 1.61 accuracy 0.62 -- 162.65 + 57.37 + 495.40 + 4.78 = 720.19:  55%|█████▍    | 1120/2048 [14:11<10:46,  1.44it/s]
loss 1.61 accuracy 0.62 -- 162.65 + 57.37 + 495.40 + 4.78 = 720.19:  55%|█████▍    | 1121/2048 [14:11<10:59,  1.41it/s]
loss 1.58 accuracy 0.44 -- 56.26 + 57.05 + 619.54 + 4.78 = 737.63:  55%|█████▍    | 1121/2048 [14:11<10:59,  1.41it/s] 
loss 1.58 accuracy 0.44 -- 56.26 + 57.05 + 619.54 + 4.78 = 737.63:  55%|█████▍    | 1122/2048 [14:11<11:13,  1.37it/s]
loss 2.62 accuracy 0.19 -- 56.63 + 56.60 + 504.88 + 4.78 = 622.90:  55%|█████▍    | 1122/2048 [14:12<11:13,  1.37it/s]
loss 2.62 accuracy 0.19 -- 56.63 + 56.60 + 504.88 + 4.78 = 622.90:  55%|█████▍    | 1123/2048 [14:12<10:51,  1.42it/s]
loss 1.75 accuracy 0.38 -- 56.16 + 57.38 + 616.45 + 4.78 = 734.76:  55%|█████▍    | 1123/2048 [14:13<10:51,  1.42it/s]
loss 1.75 accuracy 0.38 -- 56.16 + 57.38 + 616.45 + 4.78 = 734.76:  55%|█████▍    | 1124/2048 [14:13<11:06,  1.39it/s]
loss 2.24 accuracy 0.25 -- 56.60 + 56.66 + 501.85 + 4.80 = 619.91:  55%|█████▍    | 1124/2048 [14:13<11:06,  1.39it/s]
loss 2.24 accuracy 0.25 -- 56.60 + 56.66 + 501.85 + 4.80 = 619.91:  55%|█████▍    | 1125/2048 [14:13<10:45,  1.43it/s]
loss 1.70 accuracy 0.50 -- 56.46 + 167.03 + 503.91 + 4.80 = 732.20:  55%|█████▍    | 1125/2048 [14:14<10:45,  1.43it/s]
loss 1.70 accuracy 0.50 -- 56.46 + 167.03 + 503.91 + 4.80 = 732.20:  55%|█████▍    | 1126/2048 [14:14<11:01,  1.39it/s]
loss 1.56 accuracy 0.50 -- 56.47 + 56.80 + 501.24 + 4.77 = 619.28:  55%|█████▍    | 1126/2048 [14:15<11:01,  1.39it/s] 
loss 1.56 accuracy 0.50 -- 56.47 + 56.80 + 501.24 + 4.77 = 619.28:  55%|█████▌    | 1127/2048 [14:15<10:41,  1.44it/s]
loss 2.13 accuracy 0.31 -- 57.09 + 56.92 + 498.78 + 4.78 = 617.57:  55%|█████▌    | 1127/2048 [14:16<10:41,  1.44it/s]
loss 2.13 accuracy 0.31 -- 57.09 + 56.92 + 498.78 + 4.78 = 617.57:  55%|█████▌    | 1128/2048 [14:16<10:55,  1.40it/s]
loss 1.65 accuracy 0.38 -- 56.37 + 57.51 + 496.30 + 4.78 = 614.96:  55%|█████▌    | 1128/2048 [14:16<10:55,  1.40it/s]
loss 1.65 accuracy 0.38 -- 56.37 + 57.51 + 496.30 + 4.78 = 614.96:  55%|█████▌    | 1129/2048 [14:16<10:35,  1.45it/s]
loss 1.66 accuracy 0.38 -- 157.84 + 57.33 + 490.51 + 4.77 = 710.45:  55%|█████▌    | 1129/2048 [14:17<10:35,  1.45it/s]
loss 1.66 accuracy 0.38 -- 157.84 + 57.33 + 490.51 + 4.77 = 710.45:  55%|█████▌    | 1130/2048 [14:17<10:47,  1.42it/s]
loss 1.50 accuracy 0.38 -- 55.74 + 166.70 + 501.18 + 4.77 = 728.39:  55%|█████▌    | 1130/2048 [14:18<10:47,  1.42it/s]
loss 1.50 accuracy 0.38 -- 55.74 + 166.70 + 501.18 + 4.77 = 728.39:  55%|█████▌    | 1131/2048 [14:18<11:00,  1.39it/s]
loss 1.89 accuracy 0.19 -- 56.48 + 56.32 + 498.51 + 4.77 = 616.07:  55%|█████▌    | 1131/2048 [14:18<11:00,  1.39it/s] 
loss 1.89 accuracy 0.19 -- 56.48 + 56.32 + 498.51 + 4.77 = 616.07:  55%|█████▌    | 1132/2048 [14:18<10:38,  1.43it/s]
loss 2.22 accuracy 0.19 -- 162.80 + 57.04 + 496.69 + 4.77 = 721.30:  55%|█████▌    | 1132/2048 [14:19<10:38,  1.43it/s]
loss 2.22 accuracy 0.19 -- 162.80 + 57.04 + 496.69 + 4.77 = 721.30:  55%|█████▌    | 1133/2048 [14:19<10:52,  1.40it/s]
loss 1.70 accuracy 0.44 -- 56.01 + 57.19 + 619.39 + 4.80 = 737.39:  55%|█████▌    | 1133/2048 [14:20<10:52,  1.40it/s] 
loss 1.70 accuracy 0.44 -- 56.01 + 57.19 + 619.39 + 4.80 = 737.39:  55%|█████▌    | 1134/2048 [14:20<11:05,  1.37it/s]
loss 1.91 accuracy 0.44 -- 56.70 + 56.21 + 505.95 + 4.78 = 623.64:  55%|█████▌    | 1134/2048 [14:20<11:05,  1.37it/s]
loss 1.91 accuracy 0.44 -- 56.70 + 56.21 + 505.95 + 4.78 = 623.64:  55%|█████▌    | 1135/2048 [14:20<10:43,  1.42it/s]
loss 1.71 accuracy 0.50 -- 56.09 + 57.43 + 615.98 + 4.78 = 734.29:  55%|█████▌    | 1135/2048 [14:21<10:43,  1.42it/s]
loss 1.71 accuracy 0.50 -- 56.09 + 57.43 + 615.98 + 4.78 = 734.29:  55%|█████▌    | 1136/2048 [14:21<10:58,  1.38it/s]
loss 1.75 accuracy 0.44 -- 56.69 + 56.75 + 502.41 + 4.77 = 620.62:  55%|█████▌    | 1136/2048 [14:22<10:58,  1.38it/s]
loss 1.75 accuracy 0.44 -- 56.69 + 56.75 + 502.41 + 4.77 = 620.62:  56%|█████▌    | 1137/2048 [14:22<10:37,  1.43it/s]
loss 1.84 accuracy 0.38 -- 56.42 + 166.59 + 501.82 + 4.79 = 729.61:  56%|█████▌    | 1137/2048 [14:23<10:37,  1.43it/s]
loss 1.84 accuracy 0.38 -- 56.42 + 166.59 + 501.82 + 4.79 = 729.61:  56%|█████▌    | 1138/2048 [14:23<10:52,  1.39it/s]
loss 1.48 accuracy 0.25 -- 56.35 + 56.64 + 498.66 + 4.78 = 616.42:  56%|█████▌    | 1138/2048 [14:23<10:52,  1.39it/s] 
loss 1.48 accuracy 0.25 -- 56.35 + 56.64 + 498.66 + 4.78 = 616.42:  56%|█████▌    | 1139/2048 [14:23<10:31,  1.44it/s]
loss 1.53 accuracy 0.44 -- 56.73 + 56.86 + 499.27 + 4.77 = 617.63:  56%|█████▌    | 1139/2048 [14:24<10:31,  1.44it/s]
loss 1.53 accuracy 0.44 -- 56.73 + 56.86 + 499.27 + 4.77 = 617.63:  56%|█████▌    | 1140/2048 [14:24<10:46,  1.40it/s]
loss 1.69 accuracy 0.31 -- 56.11 + 57.70 + 496.10 + 4.77 = 614.67:  56%|█████▌    | 1140/2048 [14:25<10:46,  1.40it/s]
loss 1.69 accuracy 0.31 -- 56.11 + 57.70 + 496.10 + 4.77 = 614.67:  56%|█████▌    | 1141/2048 [14:25<10:26,  1.45it/s]
loss 1.87 accuracy 0.44 -- 157.29 + 57.03 + 488.66 + 4.78 = 707.75:  56%|█████▌    | 1141/2048 [14:25<10:26,  1.45it/s]
loss 1.87 accuracy 0.44 -- 157.29 + 57.03 + 488.66 + 4.78 = 707.75:  56%|█████▌    | 1142/2048 [14:25<10:37,  1.42it/s]
loss 1.75 accuracy 0.31 -- 55.65 + 166.98 + 501.56 + 4.77 = 728.95:  56%|█████▌    | 1142/2048 [14:26<10:37,  1.42it/s]
loss 1.75 accuracy 0.31 -- 55.65 + 166.98 + 501.56 + 4.77 = 728.95:  56%|█████▌    | 1143/2048 [14:26<10:51,  1.39it/s]
loss 1.61 accuracy 0.38 -- 56.69 + 56.23 + 496.81 + 4.79 = 614.52:  56%|█████▌    | 1143/2048 [14:27<10:51,  1.39it/s] 
loss 1.61 accuracy 0.38 -- 56.69 + 56.23 + 496.81 + 4.79 = 614.52:  56%|█████▌    | 1144/2048 [14:27<10:29,  1.44it/s]
loss 2.19 accuracy 0.31 -- 162.58 + 57.08 + 496.84 + 4.78 = 721.28:  56%|█████▌    | 1144/2048 [14:28<10:29,  1.44it/s]
loss 2.19 accuracy 0.31 -- 162.58 + 57.08 + 496.84 + 4.78 = 721.28:  56%|█████▌    | 1145/2048 [14:28<10:42,  1.40it/s]
loss 1.64 accuracy 0.25 -- 56.25 + 57.58 + 620.76 + 4.82 = 739.41:  56%|█████▌    | 1145/2048 [14:28<10:42,  1.40it/s] 
loss 1.64 accuracy 0.25 -- 56.25 + 57.58 + 620.76 + 4.82 = 739.41:  56%|█████▌    | 1146/2048 [14:28<10:56,  1.37it/s]
loss 2.08 accuracy 0.38 -- 56.92 + 56.87 + 505.42 + 4.77 = 623.98:  56%|█████▌    | 1146/2048 [14:29<10:56,  1.37it/s]
loss 2.08 accuracy 0.38 -- 56.92 + 56.87 + 505.42 + 4.77 = 623.98:  56%|█████▌    | 1147/2048 [14:29<10:35,  1.42it/s]
loss 1.83 accuracy 0.38 -- 56.06 + 57.62 + 616.36 + 4.79 = 734.84:  56%|█████▌    | 1147/2048 [14:30<10:35,  1.42it/s]
loss 1.83 accuracy 0.38 -- 56.06 + 57.62 + 616.36 + 4.79 = 734.84:  56%|█████▌    | 1148/2048 [14:30<10:50,  1.38it/s]
loss 1.77 accuracy 0.25 -- 56.72 + 56.55 + 502.33 + 4.77 = 620.37:  56%|█████▌    | 1148/2048 [14:30<10:50,  1.38it/s]
loss 1.77 accuracy 0.25 -- 56.72 + 56.55 + 502.33 + 4.77 = 620.37:  56%|█████▌    | 1149/2048 [14:30<10:29,  1.43it/s]
loss 1.63 accuracy 0.31 -- 55.99 + 166.46 + 502.01 + 4.79 = 729.24:  56%|█████▌    | 1149/2048 [14:31<10:29,  1.43it/s]
loss 1.63 accuracy 0.31 -- 55.99 + 166.46 + 502.01 + 4.79 = 729.24:  56%|█████▌    | 1150/2048 [14:31<10:43,  1.39it/s]
loss 1.67 accuracy 0.38 -- 55.95 + 56.48 + 499.08 + 4.77 = 616.28:  56%|█████▌    | 1150/2048 [14:32<10:43,  1.39it/s] 
loss 1.67 accuracy 0.38 -- 55.95 + 56.48 + 499.08 + 4.77 = 616.28:  56%|█████▌    | 1151/2048 [14:32<10:23,  1.44it/s]
loss 1.73 accuracy 0.44 -- 56.76 + 56.75 + 499.36 + 4.78 = 617.64:  56%|█████▌    | 1151/2048 [14:33<10:23,  1.44it/s]
loss 1.73 accuracy 0.44 -- 56.76 + 56.75 + 499.36 + 4.78 = 617.64:  56%|█████▋    | 1152/2048 [14:33<10:37,  1.40it/s]
loss 1.96 accuracy 0.19 -- 56.09 + 57.33 + 494.87 + 4.79 = 613.07:  56%|█████▋    | 1152/2048 [14:33<10:37,  1.40it/s]
loss 1.96 accuracy 0.19 -- 56.09 + 57.33 + 494.87 + 4.79 = 613.07:  56%|█████▋    | 1153/2048 [14:33<10:17,  1.45it/s]
loss 1.49 accuracy 0.31 -- 157.99 + 56.67 + 489.12 + 4.77 = 708.56:  56%|█████▋    | 1153/2048 [14:34<10:17,  1.45it/s]
loss 1.49 accuracy 0.31 -- 157.99 + 56.67 + 489.12 + 4.77 = 708.56:  56%|█████▋    | 1154/2048 [14:34<10:29,  1.42it/s]
loss 2.22 accuracy 0.44 -- 56.20 + 166.90 + 501.66 + 4.79 = 729.55:  56%|█████▋    | 1154/2048 [14:35<10:29,  1.42it/s]
loss 2.22 accuracy 0.44 -- 56.20 + 166.90 + 501.66 + 4.79 = 729.55:  56%|█████▋    | 1155/2048 [14:35<10:42,  1.39it/s]
loss 1.51 accuracy 0.50 -- 56.85 + 56.17 + 497.60 + 4.77 = 615.38:  56%|█████▋    | 1155/2048 [14:35<10:42,  1.39it/s] 
loss 1.51 accuracy 0.50 -- 56.85 + 56.17 + 497.60 + 4.77 = 615.38:  56%|█████▋    | 1156/2048 [14:35<10:21,  1.44it/s]
loss 2.21 accuracy 0.19 -- 162.59 + 57.20 + 497.95 + 4.77 = 722.51:  56%|█████▋    | 1156/2048 [14:36<10:21,  1.44it/s]
loss 2.21 accuracy 0.19 -- 162.59 + 57.20 + 497.95 + 4.77 = 722.51:  56%|█████▋    | 1157/2048 [14:36<10:34,  1.40it/s]
loss 1.57 accuracy 0.44 -- 56.19 + 57.74 + 622.09 + 4.79 = 740.81:  56%|█████▋    | 1157/2048 [14:37<10:34,  1.40it/s] 
loss 1.57 accuracy 0.44 -- 56.19 + 57.74 + 622.09 + 4.79 = 740.81:  57%|█████▋    | 1158/2048 [14:37<10:48,  1.37it/s]
loss 2.05 accuracy 0.25 -- 56.53 + 56.89 + 504.98 + 4.78 = 623.18:  57%|█████▋    | 1158/2048 [14:37<10:48,  1.37it/s]
loss 2.05 accuracy 0.25 -- 56.53 + 56.89 + 504.98 + 4.78 = 623.18:  57%|█████▋    | 1159/2048 [14:37<10:27,  1.42it/s]
loss 2.09 accuracy 0.31 -- 56.24 + 57.23 + 615.40 + 4.80 = 733.66:  57%|█████▋    | 1159/2048 [14:38<10:27,  1.42it/s]
loss 2.09 accuracy 0.31 -- 56.24 + 57.23 + 615.40 + 4.80 = 733.66:  57%|█████▋    | 1160/2048 [14:38<10:41,  1.38it/s]
loss 1.80 accuracy 0.38 -- 56.81 + 56.63 + 502.08 + 4.78 = 620.30:  57%|█████▋    | 1160/2048 [14:39<10:41,  1.38it/s]
loss 1.80 accuracy 0.38 -- 56.81 + 56.63 + 502.08 + 4.78 = 620.30:  57%|█████▋    | 1161/2048 [14:39<10:20,  1.43it/s]
loss 1.56 accuracy 0.31 -- 56.29 + 166.28 + 502.43 + 4.77 = 729.77:  57%|█████▋    | 1161/2048 [14:40<10:20,  1.43it/s]
loss 1.56 accuracy 0.31 -- 56.29 + 166.28 + 502.43 + 4.77 = 729.77:  57%|█████▋    | 1162/2048 [14:40<10:35,  1.40it/s]
loss 1.70 accuracy 0.50 -- 56.51 + 56.34 + 500.00 + 4.78 = 617.63:  57%|█████▋    | 1162/2048 [14:40<10:35,  1.40it/s] 
loss 1.70 accuracy 0.50 -- 56.51 + 56.34 + 500.00 + 4.78 = 617.63:  57%|█████▋    | 1163/2048 [14:40<10:15,  1.44it/s]
loss 1.39 accuracy 0.38 -- 57.14 + 56.62 + 498.37 + 4.79 = 616.92:  57%|█████▋    | 1163/2048 [14:41<10:15,  1.44it/s]
loss 1.39 accuracy 0.38 -- 57.14 + 56.62 + 498.37 + 4.79 = 616.92:  57%|█████▋    | 1164/2048 [14:41<10:29,  1.40it/s]
loss 2.52 accuracy 0.19 -- 55.81 + 57.27 + 493.36 + 4.77 = 611.22:  57%|█████▋    | 1164/2048 [14:42<10:29,  1.40it/s]
loss 2.52 accuracy 0.19 -- 55.81 + 57.27 + 493.36 + 4.77 = 611.22:  57%|█████▋    | 1165/2048 [14:42<10:09,  1.45it/s]
loss 1.75 accuracy 0.31 -- 157.26 + 57.24 + 490.47 + 4.77 = 709.74:  57%|█████▋    | 1165/2048 [14:42<10:09,  1.45it/s]
loss 1.75 accuracy 0.31 -- 157.26 + 57.24 + 490.47 + 4.77 = 709.74:  57%|█████▋    | 1166/2048 [14:42<10:20,  1.42it/s]
loss 1.60 accuracy 0.19 -- 55.79 + 166.71 + 502.06 + 4.78 = 729.34:  57%|█████▋    | 1166/2048 [14:43<10:20,  1.42it/s]
loss 1.60 accuracy 0.19 -- 55.79 + 166.71 + 502.06 + 4.78 = 729.34:  57%|█████▋    | 1167/2048 [14:43<10:34,  1.39it/s]
loss 2.02 accuracy 0.38 -- 56.81 + 56.62 + 497.89 + 4.78 = 616.10:  57%|█████▋    | 1167/2048 [14:44<10:34,  1.39it/s] 
loss 2.02 accuracy 0.38 -- 56.81 + 56.62 + 497.89 + 4.78 = 616.10:  57%|█████▋    | 1168/2048 [14:44<10:13,  1.44it/s]
loss 1.44 accuracy 0.50 -- 162.30 + 57.54 + 498.18 + 4.78 = 722.79:  57%|█████▋    | 1168/2048 [14:45<10:13,  1.44it/s]
loss 1.44 accuracy 0.50 -- 162.30 + 57.54 + 498.18 + 4.78 = 722.79:  57%|█████▋    | 1169/2048 [14:45<10:26,  1.40it/s]
loss 1.96 accuracy 0.19 -- 55.85 + 57.17 + 620.99 + 4.79 = 738.79:  57%|█████▋    | 1169/2048 [14:45<10:26,  1.40it/s] 
loss 1.96 accuracy 0.19 -- 55.85 + 57.17 + 620.99 + 4.79 = 738.79:  57%|█████▋    | 1170/2048 [14:45<10:39,  1.37it/s]
loss 1.88 accuracy 0.19 -- 56.72 + 56.30 + 504.45 + 4.77 = 622.24:  57%|█████▋    | 1170/2048 [14:46<10:39,  1.37it/s]
loss 1.88 accuracy 0.19 -- 56.72 + 56.30 + 504.45 + 4.77 = 622.24:  57%|█████▋    | 1171/2048 [14:46<10:18,  1.42it/s]
loss 1.76 accuracy 0.31 -- 56.57 + 57.17 + 616.78 + 4.79 = 735.31:  57%|█████▋    | 1171/2048 [14:47<10:18,  1.42it/s]
loss 1.76 accuracy 0.31 -- 56.57 + 57.17 + 616.78 + 4.79 = 735.31:  57%|█████▋    | 1172/2048 [14:47<10:32,  1.38it/s]
loss 1.72 accuracy 0.38 -- 56.75 + 56.68 + 501.55 + 4.79 = 619.77:  57%|█████▋    | 1172/2048 [14:47<10:32,  1.38it/s]
loss 1.72 accuracy 0.38 -- 56.75 + 56.68 + 501.55 + 4.79 = 619.77:  57%|█████▋    | 1173/2048 [14:47<10:12,  1.43it/s]
loss 1.78 accuracy 0.25 -- 56.17 + 166.15 + 502.08 + 4.79 = 729.19:  57%|█████▋    | 1173/2048 [14:48<10:12,  1.43it/s]
loss 1.78 accuracy 0.25 -- 56.17 + 166.15 + 502.08 + 4.79 = 729.19:  57%|█████▋    | 1174/2048 [14:48<10:26,  1.40it/s]
loss 1.51 accuracy 0.38 -- 56.39 + 56.77 + 498.90 + 4.76 = 616.81:  57%|█████▋    | 1174/2048 [14:49<10:26,  1.40it/s] 
loss 1.51 accuracy 0.38 -- 56.39 + 56.77 + 498.90 + 4.76 = 616.81:  57%|█████▋    | 1175/2048 [14:49<10:06,  1.44it/s]
loss 1.70 accuracy 0.44 -- 56.72 + 56.84 + 498.10 + 4.78 = 616.44:  57%|█████▋    | 1175/2048 [14:50<10:06,  1.44it/s]
loss 1.70 accuracy 0.44 -- 56.72 + 56.84 + 498.10 + 4.78 = 616.44:  57%|█████▋    | 1176/2048 [14:50<10:20,  1.41it/s]
loss 1.66 accuracy 0.44 -- 55.82 + 57.05 + 494.51 + 4.77 = 612.15:  57%|█████▋    | 1176/2048 [14:50<10:20,  1.41it/s]
loss 1.66 accuracy 0.44 -- 55.82 + 57.05 + 494.51 + 4.77 = 612.15:  57%|█████▋    | 1177/2048 [14:50<10:00,  1.45it/s]
loss 1.65 accuracy 0.56 -- 157.48 + 56.72 + 489.24 + 4.78 = 708.22:  57%|█████▋    | 1177/2048 [14:51<10:00,  1.45it/s]
loss 1.65 accuracy 0.56 -- 157.48 + 56.72 + 489.24 + 4.78 = 708.22:  58%|█████▊    | 1178/2048 [14:51<10:12,  1.42it/s]
loss 2.26 accuracy 0.25 -- 55.72 + 166.80 + 500.24 + 4.77 = 727.52:  58%|█████▊    | 1178/2048 [14:52<10:12,  1.42it/s]
loss 2.26 accuracy 0.25 -- 55.72 + 166.80 + 500.24 + 4.77 = 727.52:  58%|█████▊    | 1179/2048 [14:52<10:24,  1.39it/s]
loss 1.46 accuracy 0.50 -- 56.59 + 56.46 + 499.49 + 4.81 = 617.35:  58%|█████▊    | 1179/2048 [14:52<10:24,  1.39it/s] 
loss 1.46 accuracy 0.50 -- 56.59 + 56.46 + 499.49 + 4.81 = 617.35:  58%|█████▊    | 1180/2048 [14:52<10:04,  1.44it/s]
loss 2.42 accuracy 0.19 -- 162.66 + 57.23 + 497.73 + 4.78 = 722.40:  58%|█████▊    | 1180/2048 [14:53<10:04,  1.44it/s]
loss 2.42 accuracy 0.19 -- 162.66 + 57.23 + 497.73 + 4.78 = 722.40:  58%|█████▊    | 1181/2048 [14:53<10:17,  1.40it/s]
loss 2.08 accuracy 0.38 -- 56.20 + 57.51 + 622.35 + 4.79 = 740.85:  58%|█████▊    | 1181/2048 [14:54<10:17,  1.40it/s] 
loss 2.08 accuracy 0.38 -- 56.20 + 57.51 + 622.35 + 4.79 = 740.85:  58%|█████▊    | 1182/2048 [14:54<10:31,  1.37it/s]
loss 1.66 accuracy 0.31 -- 56.75 + 56.43 + 505.73 + 4.77 = 623.68:  58%|█████▊    | 1182/2048 [14:54<10:31,  1.37it/s]
loss 1.66 accuracy 0.31 -- 56.75 + 56.43 + 505.73 + 4.77 = 623.68:  58%|█████▊    | 1183/2048 [14:54<10:10,  1.42it/s]
loss 2.11 accuracy 0.31 -- 56.02 + 57.26 + 615.63 + 4.80 = 733.70:  58%|█████▊    | 1183/2048 [14:55<10:10,  1.42it/s]
loss 2.11 accuracy 0.31 -- 56.02 + 57.26 + 615.63 + 4.80 = 733.70:  58%|█████▊    | 1184/2048 [14:55<10:24,  1.38it/s]
loss 1.76 accuracy 0.56 -- 56.73 + 56.67 + 501.54 + 4.77 = 619.70:  58%|█████▊    | 1184/2048 [14:56<10:24,  1.38it/s]
loss 1.76 accuracy 0.56 -- 56.73 + 56.67 + 501.54 + 4.77 = 619.70:  58%|█████▊    | 1185/2048 [14:56<10:03,  1.43it/s]
loss 2.02 accuracy 0.25 -- 56.37 + 166.72 + 503.05 + 4.77 = 730.90:  58%|█████▊    | 1185/2048 [14:57<10:03,  1.43it/s]
loss 2.02 accuracy 0.25 -- 56.37 + 166.72 + 503.05 + 4.77 = 730.90:  58%|█████▊    | 1186/2048 [14:57<10:18,  1.39it/s]
loss 1.73 accuracy 0.38 -- 56.18 + 56.80 + 500.11 + 4.77 = 617.86:  58%|█████▊    | 1186/2048 [14:57<10:18,  1.39it/s] 
loss 1.73 accuracy 0.38 -- 56.18 + 56.80 + 500.11 + 4.77 = 617.86:  58%|█████▊    | 1187/2048 [14:57<09:58,  1.44it/s]
loss 2.42 accuracy 0.25 -- 56.80 + 56.35 + 498.32 + 4.79 = 616.26:  58%|█████▊    | 1187/2048 [14:58<09:58,  1.44it/s]
loss 2.42 accuracy 0.25 -- 56.80 + 56.35 + 498.32 + 4.79 = 616.26:  58%|█████▊    | 1188/2048 [14:58<10:12,  1.40it/s]
loss 2.19 accuracy 0.19 -- 56.35 + 57.54 + 495.94 + 4.77 = 614.59:  58%|█████▊    | 1188/2048 [14:59<10:12,  1.40it/s]
loss 2.19 accuracy 0.19 -- 56.35 + 57.54 + 495.94 + 4.77 = 614.59:  58%|█████▊    | 1189/2048 [14:59<09:53,  1.45it/s]
loss 1.48 accuracy 0.38 -- 157.02 + 56.92 + 491.58 + 4.78 = 710.30:  58%|█████▊    | 1189/2048 [14:59<09:53,  1.45it/s]
loss 1.48 accuracy 0.38 -- 157.02 + 56.92 + 491.58 + 4.78 = 710.30:  58%|█████▊    | 1190/2048 [14:59<10:04,  1.42it/s]
loss 1.55 accuracy 0.38 -- 55.78 + 166.47 + 501.96 + 4.77 = 728.98:  58%|█████▊    | 1190/2048 [15:00<10:04,  1.42it/s]
loss 1.55 accuracy 0.38 -- 55.78 + 166.47 + 501.96 + 4.77 = 728.98:  58%|█████▊    | 1191/2048 [15:00<10:17,  1.39it/s]
loss 1.62 accuracy 0.44 -- 57.01 + 56.83 + 498.19 + 4.78 = 616.81:  58%|█████▊    | 1191/2048 [15:01<10:17,  1.39it/s] 
loss 1.62 accuracy 0.44 -- 57.01 + 56.83 + 498.19 + 4.78 = 616.81:  58%|█████▊    | 1192/2048 [15:01<09:56,  1.43it/s]
loss 2.06 accuracy 0.38 -- 162.73 + 57.17 + 496.49 + 4.77 = 721.16:  58%|█████▊    | 1192/2048 [15:02<09:56,  1.43it/s]
loss 2.06 accuracy 0.38 -- 162.73 + 57.17 + 496.49 + 4.77 = 721.16:  58%|█████▊    | 1193/2048 [15:02<10:09,  1.40it/s]
loss 2.45 accuracy 0.12 -- 56.02 + 57.06 + 620.77 + 4.78 = 738.63:  58%|█████▊    | 1193/2048 [15:02<10:09,  1.40it/s] 
loss 2.45 accuracy 0.12 -- 56.02 + 57.06 + 620.77 + 4.78 = 738.63:  58%|█████▊    | 1194/2048 [15:02<10:22,  1.37it/s]
loss 1.63 accuracy 0.38 -- 56.66 + 56.52 + 504.31 + 4.76 = 622.25:  58%|█████▊    | 1194/2048 [15:03<10:22,  1.37it/s]
loss 1.63 accuracy 0.38 -- 56.66 + 56.52 + 504.31 + 4.76 = 622.25:  58%|█████▊    | 1195/2048 [15:03<10:01,  1.42it/s]
loss 1.68 accuracy 0.38 -- 56.11 + 57.28 + 614.84 + 4.78 = 733.01:  58%|█████▊    | 1195/2048 [15:04<10:01,  1.42it/s]
loss 1.68 accuracy 0.38 -- 56.11 + 57.28 + 614.84 + 4.78 = 733.01:  58%|█████▊    | 1196/2048 [15:04<10:14,  1.39it/s]
loss 1.75 accuracy 0.19 -- 56.75 + 56.41 + 503.25 + 4.78 = 621.18:  58%|█████▊    | 1196/2048 [15:04<10:14,  1.39it/s]
loss 1.75 accuracy 0.19 -- 56.75 + 56.41 + 503.25 + 4.78 = 621.18:  58%|█████▊    | 1197/2048 [15:04<09:55,  1.43it/s]
loss 1.90 accuracy 0.25 -- 56.41 + 166.50 + 501.15 + 4.77 = 728.83:  58%|█████▊    | 1197/2048 [15:05<09:55,  1.43it/s]
loss 1.90 accuracy 0.25 -- 56.41 + 166.50 + 501.15 + 4.77 = 728.83:  58%|█████▊    | 1198/2048 [15:05<10:09,  1.40it/s]
loss 3.22 accuracy 0.06 -- 56.11 + 56.54 + 499.24 + 4.82 = 616.71:  58%|█████▊    | 1198/2048 [15:06<10:09,  1.40it/s] 
loss 3.22 accuracy 0.06 -- 56.11 + 56.54 + 499.24 + 4.82 = 616.71:  59%|█████▊    | 1199/2048 [15:06<09:49,  1.44it/s]
loss 2.45 accuracy 0.06 -- 56.69 + 56.40 + 498.93 + 4.80 = 616.82:  59%|█████▊    | 1199/2048 [15:07<09:49,  1.44it/s]
loss 2.45 accuracy 0.06 -- 56.69 + 56.40 + 498.93 + 4.80 = 616.82:  59%|█████▊    | 1200/2048 [15:07<10:03,  1.41it/s]
loss 1.69 accuracy 0.44 -- 56.13 + 57.46 + 495.59 + 4.82 = 614.00:  59%|█████▊    | 1200/2048 [15:07<10:03,  1.41it/s]
loss 1.69 accuracy 0.44 -- 56.13 + 57.46 + 495.59 + 4.82 = 614.00:  59%|█████▊    | 1201/2048 [15:07<09:44,  1.45it/s]
loss 1.94 accuracy 0.44 -- 157.63 + 57.11 + 488.96 + 4.76 = 708.46:  59%|█████▊    | 1201/2048 [15:08<09:44,  1.45it/s]
loss 1.94 accuracy 0.44 -- 157.63 + 57.11 + 488.96 + 4.76 = 708.46:  59%|█████▊    | 1202/2048 [15:08<09:55,  1.42it/s]
loss 1.97 accuracy 0.25 -- 55.74 + 166.55 + 503.03 + 4.77 = 730.10:  59%|█████▊    | 1202/2048 [15:09<09:55,  1.42it/s]
loss 1.97 accuracy 0.25 -- 55.74 + 166.55 + 503.03 + 4.77 = 730.10:  59%|█████▊    | 1203/2048 [15:09<10:08,  1.39it/s]
loss 1.61 accuracy 0.38 -- 56.76 + 56.34 + 498.26 + 4.78 = 616.14:  59%|█████▊    | 1203/2048 [15:09<10:08,  1.39it/s] 
loss 1.61 accuracy 0.38 -- 56.76 + 56.34 + 498.26 + 4.78 = 616.14:  59%|█████▉    | 1204/2048 [15:09<09:48,  1.43it/s]
loss 2.34 accuracy 0.31 -- 162.84 + 57.15 + 497.75 + 4.79 = 722.53:  59%|█████▉    | 1204/2048 [15:10<09:48,  1.43it/s]
loss 2.34 accuracy 0.31 -- 162.84 + 57.15 + 497.75 + 4.79 = 722.53:  59%|█████▉    | 1205/2048 [15:10<10:00,  1.40it/s]
loss 1.86 accuracy 0.06 -- 56.23 + 57.30 + 620.28 + 4.78 = 738.59:  59%|█████▉    | 1205/2048 [15:11<10:00,  1.40it/s] 
loss 1.86 accuracy 0.06 -- 56.23 + 57.30 + 620.28 + 4.78 = 738.59:  59%|█████▉    | 1206/2048 [15:11<10:13,  1.37it/s]
loss 2.01 accuracy 0.25 -- 56.67 + 56.59 + 504.66 + 4.78 = 622.70:  59%|█████▉    | 1206/2048 [15:11<10:13,  1.37it/s]
loss 2.01 accuracy 0.25 -- 56.67 + 56.59 + 504.66 + 4.78 = 622.70:  59%|█████▉    | 1207/2048 [15:11<09:52,  1.42it/s]
loss 1.92 accuracy 0.44 -- 56.12 + 57.30 + 616.94 + 4.80 = 735.16:  59%|█████▉    | 1207/2048 [15:12<09:52,  1.42it/s]
loss 1.92 accuracy 0.44 -- 56.12 + 57.30 + 616.94 + 4.80 = 735.16:  59%|█████▉    | 1208/2048 [15:12<10:06,  1.38it/s]
loss 1.67 accuracy 0.44 -- 57.25 + 56.69 + 502.45 + 4.78 = 621.17:  59%|█████▉    | 1208/2048 [15:13<10:06,  1.38it/s]
loss 1.67 accuracy 0.44 -- 57.25 + 56.69 + 502.45 + 4.78 = 621.17:  59%|█████▉    | 1209/2048 [15:13<09:47,  1.43it/s]
loss 1.81 accuracy 0.50 -- 55.96 + 166.43 + 502.07 + 4.79 = 729.26:  59%|█████▉    | 1209/2048 [15:14<09:47,  1.43it/s]
loss 1.81 accuracy 0.50 -- 55.96 + 166.43 + 502.07 + 4.79 = 729.26:  59%|█████▉    | 1210/2048 [15:14<10:00,  1.39it/s]
loss 2.18 accuracy 0.31 -- 55.84 + 56.05 + 498.20 + 4.78 = 614.87:  59%|█████▉    | 1210/2048 [15:14<10:00,  1.39it/s] 
loss 2.18 accuracy 0.31 -- 55.84 + 56.05 + 498.20 + 4.78 = 614.87:  59%|█████▉    | 1211/2048 [15:14<09:41,  1.44it/s]
loss 1.89 accuracy 0.19 -- 56.68 + 56.55 + 498.03 + 4.77 = 616.03:  59%|█████▉    | 1211/2048 [15:15<09:41,  1.44it/s]
loss 1.89 accuracy 0.19 -- 56.68 + 56.55 + 498.03 + 4.77 = 616.03:  59%|█████▉    | 1212/2048 [15:15<09:54,  1.41it/s]
loss 2.36 accuracy 0.31 -- 55.91 + 57.35 + 496.49 + 4.79 = 614.54:  59%|█████▉    | 1212/2048 [15:16<09:54,  1.41it/s]
loss 2.36 accuracy 0.31 -- 55.91 + 57.35 + 496.49 + 4.79 = 614.54:  59%|█████▉    | 1213/2048 [15:16<09:36,  1.45it/s]
loss 2.00 accuracy 0.25 -- 157.53 + 56.92 + 491.55 + 4.77 = 710.77:  59%|█████▉    | 1213/2048 [15:16<09:36,  1.45it/s]
loss 2.00 accuracy 0.25 -- 157.53 + 56.92 + 491.55 + 4.77 = 710.77:  59%|█████▉    | 1214/2048 [15:16<09:47,  1.42it/s]
loss 1.96 accuracy 0.31 -- 56.04 + 166.55 + 501.55 + 4.79 = 728.93:  59%|█████▉    | 1214/2048 [15:17<09:47,  1.42it/s]
loss 1.96 accuracy 0.31 -- 56.04 + 166.55 + 501.55 + 4.79 = 728.93:  59%|█████▉    | 1215/2048 [15:17<10:01,  1.38it/s]
loss 1.99 accuracy 0.25 -- 56.80 + 56.51 + 497.80 + 4.78 = 615.88:  59%|█████▉    | 1215/2048 [15:18<10:01,  1.38it/s] 
loss 1.99 accuracy 0.25 -- 56.80 + 56.51 + 497.80 + 4.78 = 615.88:  59%|█████▉    | 1216/2048 [15:18<09:41,  1.43it/s]
loss 1.86 accuracy 0.25 -- 162.92 + 57.24 + 496.08 + 4.78 = 721.03:  59%|█████▉    | 1216/2048 [15:19<09:41,  1.43it/s]
loss 1.86 accuracy 0.25 -- 162.92 + 57.24 + 496.08 + 4.78 = 721.03:  59%|█████▉    | 1217/2048 [15:19<09:52,  1.40it/s]
loss 1.57 accuracy 0.44 -- 55.85 + 56.98 + 618.83 + 4.83 = 736.50:  59%|█████▉    | 1217/2048 [15:19<09:52,  1.40it/s] 
loss 1.57 accuracy 0.44 -- 55.85 + 56.98 + 618.83 + 4.83 = 736.50:  59%|█████▉    | 1218/2048 [15:19<10:04,  1.37it/s]
loss 1.65 accuracy 0.31 -- 56.60 + 56.20 + 504.84 + 4.78 = 622.42:  59%|█████▉    | 1218/2048 [15:20<10:04,  1.37it/s]
loss 1.65 accuracy 0.31 -- 56.60 + 56.20 + 504.84 + 4.78 = 622.42:  60%|█████▉    | 1219/2048 [15:20<09:44,  1.42it/s]
loss 2.32 accuracy 0.19 -- 56.34 + 57.57 + 617.80 + 4.80 = 736.50:  60%|█████▉    | 1219/2048 [15:21<09:44,  1.42it/s]
loss 2.32 accuracy 0.19 -- 56.34 + 57.57 + 617.80 + 4.80 = 736.50:  60%|█████▉    | 1220/2048 [15:21<09:58,  1.38it/s]
loss 2.22 accuracy 0.38 -- 56.72 + 56.38 + 502.04 + 4.78 = 619.93:  60%|█████▉    | 1220/2048 [15:21<09:58,  1.38it/s]
loss 2.22 accuracy 0.38 -- 56.72 + 56.38 + 502.04 + 4.78 = 619.93:  60%|█████▉    | 1221/2048 [15:21<09:38,  1.43it/s]
loss 2.30 accuracy 0.19 -- 56.13 + 166.14 + 502.50 + 4.79 = 729.56:  60%|█████▉    | 1221/2048 [15:22<09:38,  1.43it/s]
loss 2.30 accuracy 0.19 -- 56.13 + 166.14 + 502.50 + 4.79 = 729.56:  60%|█████▉    | 1222/2048 [15:22<09:52,  1.39it/s]
loss 2.07 accuracy 0.19 -- 56.35 + 56.43 + 498.82 + 4.77 = 616.37:  60%|█████▉    | 1222/2048 [15:23<09:52,  1.39it/s] 
loss 2.07 accuracy 0.19 -- 56.35 + 56.43 + 498.82 + 4.77 = 616.37:  60%|█████▉    | 1223/2048 [15:23<09:33,  1.44it/s]
loss 1.61 accuracy 0.38 -- 56.48 + 56.39 + 498.90 + 4.79 = 616.56:  60%|█████▉    | 1223/2048 [15:24<09:33,  1.44it/s]
loss 1.61 accuracy 0.38 -- 56.48 + 56.39 + 498.90 + 4.79 = 616.56:  60%|█████▉    | 1224/2048 [15:24<09:46,  1.41it/s]
loss 2.11 accuracy 0.25 -- 55.97 + 57.30 + 496.45 + 4.84 = 614.55:  60%|█████▉    | 1224/2048 [15:24<09:46,  1.41it/s]
loss 2.11 accuracy 0.25 -- 55.97 + 57.30 + 496.45 + 4.84 = 614.55:  60%|█████▉    | 1225/2048 [15:24<09:28,  1.45it/s]
loss 2.02 accuracy 0.19 -- 158.35 + 57.24 + 489.39 + 4.78 = 709.76:  60%|█████▉    | 1225/2048 [15:25<09:28,  1.45it/s]
loss 2.02 accuracy 0.19 -- 158.35 + 57.24 + 489.39 + 4.78 = 709.76:  60%|█████▉    | 1226/2048 [15:25<09:39,  1.42it/s]
loss 2.08 accuracy 0.44 -- 55.72 + 166.74 + 501.88 + 4.78 = 729.12:  60%|█████▉    | 1226/2048 [15:26<09:39,  1.42it/s]
loss 2.08 accuracy 0.44 -- 55.72 + 166.74 + 501.88 + 4.78 = 729.12:  60%|█████▉    | 1227/2048 [15:26<09:51,  1.39it/s]
loss 1.59 accuracy 0.31 -- 56.65 + 56.61 + 497.64 + 4.82 = 615.72:  60%|█████▉    | 1227/2048 [15:26<09:51,  1.39it/s] 
loss 1.59 accuracy 0.31 -- 56.65 + 56.61 + 497.64 + 4.82 = 615.72:  60%|█████▉    | 1228/2048 [15:26<09:31,  1.43it/s]
loss 2.06 accuracy 0.31 -- 162.90 + 57.14 + 496.44 + 4.80 = 721.28:  60%|█████▉    | 1228/2048 [15:27<09:31,  1.43it/s]
loss 2.06 accuracy 0.31 -- 162.90 + 57.14 + 496.44 + 4.80 = 721.28:  60%|██████    | 1229/2048 [15:27<09:43,  1.40it/s]
loss 1.85 accuracy 0.31 -- 55.90 + 57.01 + 619.95 + 4.79 = 737.65:  60%|██████    | 1229/2048 [15:28<09:43,  1.40it/s] 
loss 1.85 accuracy 0.31 -- 55.90 + 57.01 + 619.95 + 4.79 = 737.65:  60%|██████    | 1230/2048 [15:28<09:55,  1.37it/s]
loss 2.39 accuracy 0.12 -- 56.92 + 56.37 + 506.41 + 4.77 = 624.47:  60%|██████    | 1230/2048 [15:29<09:55,  1.37it/s]
loss 2.39 accuracy 0.12 -- 56.92 + 56.37 + 506.41 + 4.77 = 624.47:  60%|██████    | 1231/2048 [15:29<09:36,  1.42it/s]
loss 1.55 accuracy 0.56 -- 56.21 + 57.12 + 616.03 + 4.79 = 734.15:  60%|██████    | 1231/2048 [15:29<09:36,  1.42it/s]
loss 1.55 accuracy 0.56 -- 56.21 + 57.12 + 616.03 + 4.79 = 734.15:  60%|██████    | 1232/2048 [15:29<09:49,  1.38it/s]
loss 2.06 accuracy 0.25 -- 56.76 + 56.55 + 501.73 + 4.78 = 619.81:  60%|██████    | 1232/2048 [15:30<09:49,  1.38it/s]
loss 2.06 accuracy 0.25 -- 56.76 + 56.55 + 501.73 + 4.78 = 619.81:  60%|██████    | 1233/2048 [15:30<09:30,  1.43it/s]
loss 1.76 accuracy 0.19 -- 55.94 + 166.59 + 502.10 + 4.79 = 729.43:  60%|██████    | 1233/2048 [15:31<09:30,  1.43it/s]
loss 1.76 accuracy 0.19 -- 55.94 + 166.59 + 502.10 + 4.79 = 729.43:  60%|██████    | 1234/2048 [15:31<09:43,  1.40it/s]
loss 2.06 accuracy 0.31 -- 55.91 + 56.55 + 499.30 + 4.78 = 616.54:  60%|██████    | 1234/2048 [15:31<09:43,  1.40it/s] 
loss 2.06 accuracy 0.31 -- 55.91 + 56.55 + 499.30 + 4.78 = 616.54:  60%|██████    | 1235/2048 [15:31<09:24,  1.44it/s]
loss 1.76 accuracy 0.25 -- 56.57 + 56.46 + 498.32 + 4.79 = 616.14:  60%|██████    | 1235/2048 [15:32<09:24,  1.44it/s]
loss 1.76 accuracy 0.25 -- 56.57 + 56.46 + 498.32 + 4.79 = 616.14:  60%|██████    | 1236/2048 [15:32<09:37,  1.41it/s]
loss 2.11 accuracy 0.25 -- 56.25 + 57.68 + 496.73 + 4.78 = 615.43:  60%|██████    | 1236/2048 [15:33<09:37,  1.41it/s]
loss 2.11 accuracy 0.25 -- 56.25 + 57.68 + 496.73 + 4.78 = 615.43:  60%|██████    | 1237/2048 [15:33<09:20,  1.45it/s]
loss 2.04 accuracy 0.31 -- 157.62 + 56.82 + 489.20 + 4.77 = 708.42:  60%|██████    | 1237/2048 [15:33<09:20,  1.45it/s]
loss 2.04 accuracy 0.31 -- 157.62 + 56.82 + 489.20 + 4.77 = 708.42:  60%|██████    | 1238/2048 [15:33<09:30,  1.42it/s]
loss 2.11 accuracy 0.31 -- 55.78 + 166.48 + 500.87 + 4.77 = 727.90:  60%|██████    | 1238/2048 [15:34<09:30,  1.42it/s]
loss 2.11 accuracy 0.31 -- 55.78 + 166.48 + 500.87 + 4.77 = 727.90:  60%|██████    | 1239/2048 [15:34<09:42,  1.39it/s]
loss 1.92 accuracy 0.38 -- 56.45 + 56.61 + 498.66 + 4.78 = 616.50:  60%|██████    | 1239/2048 [15:35<09:42,  1.39it/s] 
loss 1.92 accuracy 0.38 -- 56.45 + 56.61 + 498.66 + 4.78 = 616.50:  61%|██████    | 1240/2048 [15:35<09:23,  1.44it/s]
loss 1.87 accuracy 0.25 -- 162.27 + 57.18 + 497.10 + 4.79 = 721.35:  61%|██████    | 1240/2048 [15:36<09:23,  1.44it/s]
loss 1.87 accuracy 0.25 -- 162.27 + 57.18 + 497.10 + 4.79 = 721.35:  61%|██████    | 1241/2048 [15:36<09:34,  1.40it/s]
loss 1.78 accuracy 0.38 -- 55.91 + 57.22 + 622.50 + 4.79 = 740.41:  61%|██████    | 1241/2048 [15:36<09:34,  1.40it/s] 
loss 1.78 accuracy 0.38 -- 55.91 + 57.22 + 622.50 + 4.79 = 740.41:  61%|██████    | 1242/2048 [15:36<09:47,  1.37it/s]
loss 1.88 accuracy 0.25 -- 57.04 + 56.84 + 504.16 + 4.78 = 622.82:  61%|██████    | 1242/2048 [15:37<09:47,  1.37it/s]
loss 1.88 accuracy 0.25 -- 57.04 + 56.84 + 504.16 + 4.78 = 622.82:  61%|██████    | 1243/2048 [15:37<09:27,  1.42it/s]
loss 1.99 accuracy 0.25 -- 56.10 + 57.08 + 615.96 + 4.78 = 733.92:  61%|██████    | 1243/2048 [15:38<09:27,  1.42it/s]
loss 1.99 accuracy 0.25 -- 56.10 + 57.08 + 615.96 + 4.78 = 733.92:  61%|██████    | 1244/2048 [15:38<09:40,  1.38it/s]
loss 1.86 accuracy 0.25 -- 56.62 + 56.69 + 502.68 + 4.77 = 620.76:  61%|██████    | 1244/2048 [15:38<09:40,  1.38it/s]
loss 1.86 accuracy 0.25 -- 56.62 + 56.69 + 502.68 + 4.77 = 620.76:  61%|██████    | 1245/2048 [15:38<09:21,  1.43it/s]
loss 1.88 accuracy 0.31 -- 56.33 + 165.97 + 501.80 + 4.79 = 728.88:  61%|██████    | 1245/2048 [15:39<09:21,  1.43it/s]
loss 1.88 accuracy 0.31 -- 56.33 + 165.97 + 501.80 + 4.79 = 728.88:  61%|██████    | 1246/2048 [15:39<09:34,  1.40it/s]
loss 1.84 accuracy 0.31 -- 56.46 + 56.59 + 499.66 + 4.77 = 617.49:  61%|██████    | 1246/2048 [15:40<09:34,  1.40it/s] 
loss 1.84 accuracy 0.31 -- 56.46 + 56.59 + 499.66 + 4.77 = 617.49:  61%|██████    | 1247/2048 [15:40<09:16,  1.44it/s]
loss 1.82 accuracy 0.31 -- 56.63 + 56.73 + 498.93 + 4.76 = 617.05:  61%|██████    | 1247/2048 [15:41<09:16,  1.44it/s]
loss 1.82 accuracy 0.31 -- 56.63 + 56.73 + 498.93 + 4.76 = 617.05:  61%|██████    | 1248/2048 [15:41<09:29,  1.40it/s]
loss 1.92 accuracy 0.31 -- 55.85 + 57.01 + 494.28 + 4.78 = 611.92:  61%|██████    | 1248/2048 [15:41<09:29,  1.40it/s]
loss 1.92 accuracy 0.31 -- 55.85 + 57.01 + 494.28 + 4.78 = 611.92:  61%|██████    | 1249/2048 [15:41<09:11,  1.45it/s]
loss 1.64 accuracy 0.44 -- 157.74 + 56.76 + 490.07 + 4.78 = 709.35:  61%|██████    | 1249/2048 [15:42<09:11,  1.45it/s]
loss 1.64 accuracy 0.44 -- 157.74 + 56.76 + 490.07 + 4.78 = 709.35:  61%|██████    | 1250/2048 [15:42<09:21,  1.42it/s]
loss 1.55 accuracy 0.44 -- 56.04 + 167.43 + 503.39 + 4.78 = 731.63:  61%|██████    | 1250/2048 [15:43<09:21,  1.42it/s]
loss 1.55 accuracy 0.44 -- 56.04 + 167.43 + 503.39 + 4.78 = 731.63:  61%|██████    | 1251/2048 [15:43<09:34,  1.39it/s]
loss 2.22 accuracy 0.31 -- 56.62 + 56.37 + 496.60 + 4.79 = 614.38:  61%|██████    | 1251/2048 [15:43<09:34,  1.39it/s] 
loss 2.22 accuracy 0.31 -- 56.62 + 56.37 + 496.60 + 4.79 = 614.38:  61%|██████    | 1252/2048 [15:43<09:14,  1.44it/s]
loss 1.82 accuracy 0.38 -- 162.69 + 56.92 + 496.45 + 4.77 = 720.82:  61%|██████    | 1252/2048 [15:44<09:14,  1.44it/s]
loss 1.82 accuracy 0.38 -- 162.69 + 56.92 + 496.45 + 4.77 = 720.82:  61%|██████    | 1253/2048 [15:44<09:26,  1.40it/s]
loss 1.98 accuracy 0.25 -- 56.28 + 57.63 + 621.04 + 4.81 = 739.75:  61%|██████    | 1253/2048 [15:45<09:26,  1.40it/s] 
loss 1.98 accuracy 0.25 -- 56.28 + 57.63 + 621.04 + 4.81 = 739.75:  61%|██████    | 1254/2048 [15:45<09:38,  1.37it/s]
loss 1.78 accuracy 0.31 -- 56.79 + 56.48 + 505.03 + 4.80 = 623.11:  61%|██████    | 1254/2048 [15:46<09:38,  1.37it/s]
loss 1.78 accuracy 0.31 -- 56.79 + 56.48 + 505.03 + 4.80 = 623.11:  61%|██████▏   | 1255/2048 [15:46<09:19,  1.42it/s]
loss 1.74 accuracy 0.38 -- 56.24 + 57.17 + 615.21 + 4.78 = 733.41:  61%|██████▏   | 1255/2048 [15:46<09:19,  1.42it/s]
loss 1.74 accuracy 0.38 -- 56.24 + 57.17 + 615.21 + 4.78 = 733.41:  61%|██████▏   | 1256/2048 [15:46<09:31,  1.39it/s]
loss 1.97 accuracy 0.12 -- 56.59 + 56.77 + 502.24 + 4.78 = 620.38:  61%|██████▏   | 1256/2048 [15:47<09:31,  1.39it/s]
loss 1.97 accuracy 0.12 -- 56.59 + 56.77 + 502.24 + 4.78 = 620.38:  61%|██████▏   | 1257/2048 [15:47<09:13,  1.43it/s]
loss 1.76 accuracy 0.38 -- 56.18 + 166.40 + 503.26 + 4.78 = 730.62:  61%|██████▏   | 1257/2048 [15:48<09:13,  1.43it/s]
loss 1.76 accuracy 0.38 -- 56.18 + 166.40 + 503.26 + 4.78 = 730.62:  61%|██████▏   | 1258/2048 [15:48<09:26,  1.39it/s]
loss 1.96 accuracy 0.38 -- 56.05 + 56.20 + 501.11 + 4.76 = 618.13:  61%|██████▏   | 1258/2048 [15:48<09:26,  1.39it/s] 
loss 1.96 accuracy 0.38 -- 56.05 + 56.20 + 501.11 + 4.76 = 618.13:  61%|██████▏   | 1259/2048 [15:48<09:08,  1.44it/s]
loss 1.88 accuracy 0.50 -- 57.31 + 56.83 + 498.38 + 4.79 = 617.31:  61%|██████▏   | 1259/2048 [15:49<09:08,  1.44it/s]
loss 1.88 accuracy 0.50 -- 57.31 + 56.83 + 498.38 + 4.79 = 617.31:  62%|██████▏   | 1260/2048 [15:49<09:21,  1.40it/s]
loss 2.08 accuracy 0.25 -- 56.37 + 57.16 + 495.38 + 4.78 = 613.69:  62%|██████▏   | 1260/2048 [15:50<09:21,  1.40it/s]
loss 2.08 accuracy 0.25 -- 56.37 + 57.16 + 495.38 + 4.78 = 613.69:  62%|██████▏   | 1261/2048 [15:50<09:03,  1.45it/s]
loss 2.16 accuracy 0.25 -- 157.34 + 56.91 + 488.94 + 4.76 = 707.95:  62%|██████▏   | 1261/2048 [15:50<09:03,  1.45it/s]
loss 2.16 accuracy 0.25 -- 157.34 + 56.91 + 488.94 + 4.76 = 707.95:  62%|██████▏   | 1262/2048 [15:50<09:13,  1.42it/s]
loss 1.64 accuracy 0.50 -- 55.92 + 166.51 + 501.39 + 4.78 = 728.60:  62%|██████▏   | 1262/2048 [15:51<09:13,  1.42it/s]
loss 1.64 accuracy 0.50 -- 55.92 + 166.51 + 501.39 + 4.78 = 728.60:  62%|██████▏   | 1263/2048 [15:51<09:24,  1.39it/s]
loss 2.11 accuracy 0.12 -- 56.55 + 56.71 + 497.58 + 4.78 = 615.62:  62%|██████▏   | 1263/2048 [15:52<09:24,  1.39it/s] 
loss 2.11 accuracy 0.12 -- 56.55 + 56.71 + 497.58 + 4.78 = 615.62:  62%|██████▏   | 1264/2048 [15:52<09:06,  1.44it/s]
loss 1.54 accuracy 0.44 -- 162.62 + 56.82 + 499.04 + 4.76 = 723.24:  62%|██████▏   | 1264/2048 [15:53<09:06,  1.44it/s]
loss 1.54 accuracy 0.44 -- 162.62 + 56.82 + 499.04 + 4.76 = 723.24:  62%|██████▏   | 1265/2048 [15:53<09:18,  1.40it/s]
loss 2.46 accuracy 0.12 -- 56.23 + 57.17 + 620.45 + 4.79 = 738.64:  62%|██████▏   | 1265/2048 [15:53<09:18,  1.40it/s] 
loss 2.46 accuracy 0.12 -- 56.23 + 57.17 + 620.45 + 4.79 = 738.64:  62%|██████▏   | 1266/2048 [15:53<09:29,  1.37it/s]
loss 2.24 accuracy 0.12 -- 57.06 + 56.65 + 506.71 + 4.77 = 625.19:  62%|██████▏   | 1266/2048 [15:54<09:29,  1.37it/s]
loss 2.24 accuracy 0.12 -- 57.06 + 56.65 + 506.71 + 4.77 = 625.19:  62%|██████▏   | 1267/2048 [15:54<09:11,  1.42it/s]
loss 2.23 accuracy 0.12 -- 56.39 + 57.19 + 617.11 + 4.79 = 735.47:  62%|██████▏   | 1267/2048 [15:55<09:11,  1.42it/s]
loss 2.23 accuracy 0.12 -- 56.39 + 57.19 + 617.11 + 4.79 = 735.47:  62%|██████▏   | 1268/2048 [15:55<09:23,  1.38it/s]
loss 1.53 accuracy 0.50 -- 56.93 + 56.76 + 503.22 + 4.78 = 621.69:  62%|██████▏   | 1268/2048 [15:55<09:23,  1.38it/s]
loss 1.53 accuracy 0.50 -- 56.93 + 56.76 + 503.22 + 4.78 = 621.69:  62%|██████▏   | 1269/2048 [15:55<09:05,  1.43it/s]
loss 1.48 accuracy 0.50 -- 56.27 + 166.30 + 502.13 + 4.80 = 729.50:  62%|██████▏   | 1269/2048 [15:56<09:05,  1.43it/s]
loss 1.48 accuracy 0.50 -- 56.27 + 166.30 + 502.13 + 4.80 = 729.50:  62%|██████▏   | 1270/2048 [15:56<09:18,  1.39it/s]
loss 1.61 accuracy 0.44 -- 56.89 + 57.08 + 499.78 + 4.78 = 618.53:  62%|██████▏   | 1270/2048 [15:57<09:18,  1.39it/s] 
loss 1.61 accuracy 0.44 -- 56.89 + 57.08 + 499.78 + 4.78 = 618.53:  62%|██████▏   | 1271/2048 [15:57<09:00,  1.44it/s]
loss 1.53 accuracy 0.50 -- 56.53 + 56.71 + 499.77 + 4.77 = 617.77:  62%|██████▏   | 1271/2048 [15:58<09:00,  1.44it/s]
loss 1.53 accuracy 0.50 -- 56.53 + 56.71 + 499.77 + 4.77 = 617.77:  62%|██████▏   | 1272/2048 [15:58<09:13,  1.40it/s]
loss 2.23 accuracy 0.19 -- 56.11 + 57.17 + 494.81 + 4.78 = 612.88:  62%|██████▏   | 1272/2048 [15:58<09:13,  1.40it/s]
loss 2.23 accuracy 0.19 -- 56.11 + 57.17 + 494.81 + 4.78 = 612.88:  62%|██████▏   | 1273/2048 [15:58<09:03,  1.43it/s]
loss 2.02 accuracy 0.38 -- 157.40 + 57.08 + 489.56 + 4.77 = 708.82:  62%|██████▏   | 1273/2048 [15:59<09:03,  1.43it/s]
loss 2.02 accuracy 0.38 -- 157.40 + 57.08 + 489.56 + 4.77 = 708.82:  62%|██████▏   | 1274/2048 [15:59<09:10,  1.40it/s]
loss 1.92 accuracy 0.31 -- 55.92 + 166.65 + 501.67 + 4.77 = 729.01:  62%|██████▏   | 1274/2048 [16:00<09:10,  1.40it/s]
loss 1.92 accuracy 0.31 -- 55.92 + 166.65 + 501.67 + 4.77 = 729.01:  62%|██████▏   | 1275/2048 [16:00<09:20,  1.38it/s]
loss 2.22 accuracy 0.25 -- 56.47 + 56.32 + 500.03 + 4.80 = 617.62:  62%|██████▏   | 1275/2048 [16:00<09:20,  1.38it/s] 
loss 2.22 accuracy 0.25 -- 56.47 + 56.32 + 500.03 + 4.80 = 617.62:  62%|██████▏   | 1276/2048 [16:00<09:01,  1.43it/s]
loss 1.95 accuracy 0.25 -- 163.36 + 56.90 + 496.25 + 4.81 = 721.32:  62%|██████▏   | 1276/2048 [16:01<09:01,  1.43it/s]
loss 1.95 accuracy 0.25 -- 163.36 + 56.90 + 496.25 + 4.81 = 721.32:  62%|██████▏   | 1277/2048 [16:01<09:11,  1.40it/s]
loss 1.71 accuracy 0.44 -- 56.04 + 57.40 + 619.89 + 4.77 = 738.10:  62%|██████▏   | 1277/2048 [16:02<09:11,  1.40it/s] 
loss 1.71 accuracy 0.44 -- 56.04 + 57.40 + 619.89 + 4.77 = 738.10:  62%|██████▏   | 1278/2048 [16:02<09:22,  1.37it/s]
loss 2.01 accuracy 0.44 -- 56.67 + 56.54 + 505.59 + 4.78 = 623.58:  62%|██████▏   | 1278/2048 [16:03<09:22,  1.37it/s]
loss 2.01 accuracy 0.44 -- 56.67 + 56.54 + 505.59 + 4.78 = 623.58:  62%|██████▏   | 1279/2048 [16:03<09:03,  1.42it/s]
loss 1.47 accuracy 0.25 -- 56.26 + 57.33 + 616.04 + 4.83 = 734.45:  62%|██████▏   | 1279/2048 [16:03<09:03,  1.42it/s]
loss 1.47 accuracy 0.25 -- 56.26 + 57.33 + 616.04 + 4.83 = 734.45:  62%|██████▎   | 1280/2048 [16:03<09:15,  1.38it/s]
loss 1.65 accuracy 0.56 -- 56.85 + 56.64 + 501.34 + 4.77 = 619.60:  62%|██████▎   | 1280/2048 [16:04<09:15,  1.38it/s]
loss 1.65 accuracy 0.56 -- 56.85 + 56.64 + 501.34 + 4.77 = 619.60:  63%|██████▎   | 1281/2048 [16:04<08:56,  1.43it/s]
loss 1.48 accuracy 0.44 -- 56.19 + 166.49 + 502.46 + 4.79 = 729.94:  63%|██████▎   | 1281/2048 [16:05<08:56,  1.43it/s]
loss 1.48 accuracy 0.44 -- 56.19 + 166.49 + 502.46 + 4.79 = 729.94:  63%|██████▎   | 1282/2048 [16:05<09:09,  1.39it/s]
loss 1.64 accuracy 0.38 -- 56.15 + 56.41 + 498.83 + 4.78 = 616.16:  63%|██████▎   | 1282/2048 [16:05<09:09,  1.39it/s] 
loss 1.64 accuracy 0.38 -- 56.15 + 56.41 + 498.83 + 4.78 = 616.16:  63%|██████▎   | 1283/2048 [16:05<08:51,  1.44it/s]
loss 1.69 accuracy 0.25 -- 56.53 + 56.39 + 497.71 + 4.78 = 615.41:  63%|██████▎   | 1283/2048 [16:06<08:51,  1.44it/s]
loss 1.69 accuracy 0.25 -- 56.53 + 56.39 + 497.71 + 4.78 = 615.41:  63%|██████▎   | 1284/2048 [16:06<09:03,  1.41it/s]
loss 2.55 accuracy 0.06 -- 55.86 + 57.31 + 494.39 + 4.78 = 612.33:  63%|██████▎   | 1284/2048 [16:07<09:03,  1.41it/s]
loss 2.55 accuracy 0.06 -- 55.86 + 57.31 + 494.39 + 4.78 = 612.33:  63%|██████▎   | 1285/2048 [16:07<08:46,  1.45it/s]
loss 1.74 accuracy 0.31 -- 157.40 + 56.83 + 489.29 + 4.84 = 708.37:  63%|██████▎   | 1285/2048 [16:07<08:46,  1.45it/s]
loss 1.74 accuracy 0.31 -- 157.40 + 56.83 + 489.29 + 4.84 = 708.37:  63%|██████▎   | 1286/2048 [16:07<08:56,  1.42it/s]
loss 1.52 accuracy 0.44 -- 55.79 + 166.30 + 502.22 + 4.80 = 729.11:  63%|██████▎   | 1286/2048 [16:08<08:56,  1.42it/s]
loss 1.52 accuracy 0.44 -- 55.79 + 166.30 + 502.22 + 4.80 = 729.11:  63%|██████▎   | 1287/2048 [16:08<09:07,  1.39it/s]
loss 2.08 accuracy 0.19 -- 57.09 + 56.84 + 497.13 + 4.77 = 615.82:  63%|██████▎   | 1287/2048 [16:09<09:07,  1.39it/s] 
loss 2.08 accuracy 0.19 -- 57.09 + 56.84 + 497.13 + 4.77 = 615.82:  63%|██████▎   | 1288/2048 [16:09<08:49,  1.44it/s]
loss 1.81 accuracy 0.31 -- 162.64 + 56.84 + 496.59 + 4.80 = 720.86:  63%|██████▎   | 1288/2048 [16:10<08:49,  1.44it/s]
loss 1.81 accuracy 0.31 -- 162.64 + 56.84 + 496.59 + 4.80 = 720.86:  63%|██████▎   | 1289/2048 [16:10<09:00,  1.40it/s]
loss 2.04 accuracy 0.25 -- 56.09 + 57.36 + 619.70 + 4.77 = 737.92:  63%|██████▎   | 1289/2048 [16:10<09:00,  1.40it/s] 
loss 2.04 accuracy 0.25 -- 56.09 + 57.36 + 619.70 + 4.77 = 737.92:  63%|██████▎   | 1290/2048 [16:10<09:11,  1.37it/s]
loss 1.77 accuracy 0.19 -- 56.66 + 56.35 + 503.85 + 4.77 = 621.62:  63%|██████▎   | 1290/2048 [16:11<09:11,  1.37it/s]
loss 1.77 accuracy 0.19 -- 56.66 + 56.35 + 503.85 + 4.77 = 621.62:  63%|██████▎   | 1291/2048 [16:11<08:53,  1.42it/s]
loss 2.08 accuracy 0.31 -- 56.15 + 57.18 + 615.18 + 4.79 = 733.30:  63%|██████▎   | 1291/2048 [16:12<08:53,  1.42it/s]
loss 2.08 accuracy 0.31 -- 56.15 + 57.18 + 615.18 + 4.79 = 733.30:  63%|██████▎   | 1292/2048 [16:12<09:05,  1.39it/s]
loss 1.94 accuracy 0.31 -- 56.65 + 56.46 + 504.89 + 4.80 = 622.80:  63%|██████▎   | 1292/2048 [16:12<09:05,  1.39it/s]
loss 1.94 accuracy 0.31 -- 56.65 + 56.46 + 504.89 + 4.80 = 622.80:  63%|██████▎   | 1293/2048 [16:12<08:48,  1.43it/s]
loss 2.05 accuracy 0.19 -- 56.57 + 166.65 + 502.56 + 4.79 = 730.56:  63%|██████▎   | 1293/2048 [16:13<08:48,  1.43it/s]
loss 2.05 accuracy 0.19 -- 56.57 + 166.65 + 502.56 + 4.79 = 730.56:  63%|██████▎   | 1294/2048 [16:13<09:00,  1.39it/s]
loss 1.68 accuracy 0.44 -- 56.51 + 56.39 + 500.72 + 4.77 = 618.39:  63%|██████▎   | 1294/2048 [16:14<09:00,  1.39it/s] 
loss 1.68 accuracy 0.44 -- 56.51 + 56.39 + 500.72 + 4.77 = 618.39:  63%|██████▎   | 1295/2048 [16:14<08:43,  1.44it/s]
loss 2.10 accuracy 0.38 -- 56.75 + 56.64 + 500.06 + 4.81 = 618.26:  63%|██████▎   | 1295/2048 [16:15<08:43,  1.44it/s]
loss 2.10 accuracy 0.38 -- 56.75 + 56.64 + 500.06 + 4.81 = 618.26:  63%|██████▎   | 1296/2048 [16:15<08:55,  1.40it/s]
loss 1.89 accuracy 0.19 -- 56.15 + 57.20 + 495.55 + 4.77 = 613.67:  63%|██████▎   | 1296/2048 [16:15<08:55,  1.40it/s]
loss 1.89 accuracy 0.19 -- 56.15 + 57.20 + 495.55 + 4.77 = 613.67:  63%|██████▎   | 1297/2048 [16:15<08:46,  1.43it/s]
loss 2.01 accuracy 0.19 -- 158.30 + 56.96 + 490.52 + 4.77 = 710.56:  63%|██████▎   | 1297/2048 [16:16<08:46,  1.43it/s]
loss 2.01 accuracy 0.19 -- 158.30 + 56.96 + 490.52 + 4.77 = 710.56:  63%|██████▎   | 1298/2048 [16:16<08:54,  1.40it/s]
loss 1.84 accuracy 0.25 -- 55.79 + 166.46 + 502.14 + 4.77 = 729.15:  63%|██████▎   | 1298/2048 [16:17<08:54,  1.40it/s]
loss 1.84 accuracy 0.25 -- 55.79 + 166.46 + 502.14 + 4.77 = 729.15:  63%|██████▎   | 1299/2048 [16:17<09:03,  1.38it/s]
loss 1.41 accuracy 0.38 -- 56.47 + 56.46 + 497.03 + 4.77 = 614.73:  63%|██████▎   | 1299/2048 [16:17<09:03,  1.38it/s] 
loss 1.41 accuracy 0.38 -- 56.47 + 56.46 + 497.03 + 4.77 = 614.73:  63%|██████▎   | 1300/2048 [16:17<08:43,  1.43it/s]
loss 1.97 accuracy 0.38 -- 162.94 + 57.14 + 496.63 + 4.78 = 721.50:  63%|██████▎   | 1300/2048 [16:18<08:43,  1.43it/s]
loss 1.97 accuracy 0.38 -- 162.94 + 57.14 + 496.63 + 4.78 = 721.50:  64%|██████▎   | 1301/2048 [16:18<08:54,  1.40it/s]
loss 1.56 accuracy 0.31 -- 56.02 + 57.19 + 621.41 + 4.78 = 739.40:  64%|██████▎   | 1301/2048 [16:19<08:54,  1.40it/s] 
loss 1.56 accuracy 0.31 -- 56.02 + 57.19 + 621.41 + 4.78 = 739.40:  64%|██████▎   | 1302/2048 [16:19<09:04,  1.37it/s]
loss 2.12 accuracy 0.38 -- 57.04 + 56.80 + 505.99 + 4.77 = 624.60:  64%|██████▎   | 1302/2048 [16:20<09:04,  1.37it/s]
loss 2.12 accuracy 0.38 -- 57.04 + 56.80 + 505.99 + 4.77 = 624.60:  64%|██████▎   | 1303/2048 [16:20<08:46,  1.41it/s]
loss 2.23 accuracy 0.38 -- 56.18 + 56.95 + 616.56 + 4.80 = 734.49:  64%|██████▎   | 1303/2048 [16:20<08:46,  1.41it/s]
loss 2.23 accuracy 0.38 -- 56.18 + 56.95 + 616.56 + 4.80 = 734.49:  64%|██████▎   | 1304/2048 [16:20<08:58,  1.38it/s]
loss 1.74 accuracy 0.38 -- 57.15 + 56.57 + 501.18 + 4.78 = 619.69:  64%|██████▎   | 1304/2048 [16:21<08:58,  1.38it/s]
loss 1.74 accuracy 0.38 -- 57.15 + 56.57 + 501.18 + 4.78 = 619.69:  64%|██████▎   | 1305/2048 [16:21<08:40,  1.43it/s]
loss 1.92 accuracy 0.38 -- 56.17 + 165.97 + 500.42 + 4.77 = 727.32:  64%|██████▎   | 1305/2048 [16:22<08:40,  1.43it/s]
loss 1.92 accuracy 0.38 -- 56.17 + 165.97 + 500.42 + 4.77 = 727.32:  64%|██████▍   | 1306/2048 [16:22<08:51,  1.40it/s]
loss 1.70 accuracy 0.38 -- 56.05 + 56.16 + 499.13 + 4.76 = 616.10:  64%|██████▍   | 1306/2048 [16:22<08:51,  1.40it/s] 
loss 1.70 accuracy 0.38 -- 56.05 + 56.16 + 499.13 + 4.76 = 616.10:  64%|██████▍   | 1307/2048 [16:22<08:34,  1.44it/s]
loss 1.82 accuracy 0.31 -- 56.56 + 56.43 + 497.95 + 4.78 = 615.72:  64%|██████▍   | 1307/2048 [16:23<08:34,  1.44it/s]
loss 1.82 accuracy 0.31 -- 56.56 + 56.43 + 497.95 + 4.78 = 615.72:  64%|██████▍   | 1308/2048 [16:23<08:46,  1.41it/s]
loss 1.09 accuracy 0.75 -- 56.32 + 57.18 + 495.20 + 4.78 = 613.48:  64%|██████▍   | 1308/2048 [16:24<08:46,  1.41it/s]
loss 1.09 accuracy 0.75 -- 56.32 + 57.18 + 495.20 + 4.78 = 613.48:  64%|██████▍   | 1309/2048 [16:24<08:29,  1.45it/s]
loss 1.92 accuracy 0.31 -- 158.14 + 56.72 + 491.00 + 4.78 = 710.64:  64%|██████▍   | 1309/2048 [16:25<08:29,  1.45it/s]
loss 1.92 accuracy 0.31 -- 158.14 + 56.72 + 491.00 + 4.78 = 710.64:  64%|██████▍   | 1310/2048 [16:25<08:39,  1.42it/s]
loss 1.62 accuracy 0.56 -- 56.14 + 166.94 + 500.98 + 4.77 = 728.83:  64%|██████▍   | 1310/2048 [16:25<08:39,  1.42it/s]
loss 1.62 accuracy 0.56 -- 56.14 + 166.94 + 500.98 + 4.77 = 728.83:  64%|██████▍   | 1311/2048 [16:25<08:50,  1.39it/s]
loss 1.82 accuracy 0.25 -- 56.37 + 56.51 + 498.22 + 4.79 = 615.89:  64%|██████▍   | 1311/2048 [16:26<08:50,  1.39it/s] 
loss 1.82 accuracy 0.25 -- 56.37 + 56.51 + 498.22 + 4.79 = 615.89:  64%|██████▍   | 1312/2048 [16:26<08:32,  1.44it/s]
loss 1.96 accuracy 0.25 -- 163.14 + 56.73 + 498.00 + 4.82 = 722.69:  64%|██████▍   | 1312/2048 [16:27<08:32,  1.44it/s]
loss 1.96 accuracy 0.25 -- 163.14 + 56.73 + 498.00 + 4.82 = 722.69:  64%|██████▍   | 1313/2048 [16:27<08:43,  1.40it/s]
loss 1.93 accuracy 0.31 -- 56.16 + 57.07 + 621.25 + 4.78 = 739.25:  64%|██████▍   | 1313/2048 [16:27<08:43,  1.40it/s] 
loss 1.93 accuracy 0.31 -- 56.16 + 57.07 + 621.25 + 4.78 = 739.25:  64%|██████▍   | 1314/2048 [16:27<08:54,  1.37it/s]
loss 1.53 accuracy 0.38 -- 56.89 + 56.70 + 506.25 + 4.79 = 624.63:  64%|██████▍   | 1314/2048 [16:28<08:54,  1.37it/s]
loss 1.53 accuracy 0.38 -- 56.89 + 56.70 + 506.25 + 4.79 = 624.63:  64%|██████▍   | 1315/2048 [16:28<08:37,  1.42it/s]
loss 2.15 accuracy 0.06 -- 56.26 + 57.56 + 616.91 + 4.80 = 735.53:  64%|██████▍   | 1315/2048 [16:29<08:37,  1.42it/s]
loss 2.15 accuracy 0.06 -- 56.26 + 57.56 + 616.91 + 4.80 = 735.53:  64%|██████▍   | 1316/2048 [16:29<08:49,  1.38it/s]
loss 2.51 accuracy 0.25 -- 57.07 + 56.45 + 502.72 + 4.77 = 621.01:  64%|██████▍   | 1316/2048 [16:30<08:49,  1.38it/s]
loss 2.51 accuracy 0.25 -- 57.07 + 56.45 + 502.72 + 4.77 = 621.01:  64%|██████▍   | 1317/2048 [16:30<08:31,  1.43it/s]
loss 2.04 accuracy 0.19 -- 55.92 + 166.59 + 501.35 + 4.76 = 728.62:  64%|██████▍   | 1317/2048 [16:30<08:31,  1.43it/s]
loss 2.04 accuracy 0.19 -- 55.92 + 166.59 + 501.35 + 4.76 = 728.62:  64%|██████▍   | 1318/2048 [16:30<08:43,  1.39it/s]
loss 2.36 accuracy 0.19 -- 56.01 + 56.69 + 498.72 + 4.78 = 616.20:  64%|██████▍   | 1318/2048 [16:31<08:43,  1.39it/s] 
loss 2.36 accuracy 0.19 -- 56.01 + 56.69 + 498.72 + 4.78 = 616.20:  64%|██████▍   | 1319/2048 [16:31<08:26,  1.44it/s]
loss 1.83 accuracy 0.31 -- 56.75 + 56.29 + 497.55 + 4.78 = 615.38:  64%|██████▍   | 1319/2048 [16:32<08:26,  1.44it/s]
loss 1.83 accuracy 0.31 -- 56.75 + 56.29 + 497.55 + 4.78 = 615.38:  64%|██████▍   | 1320/2048 [16:32<08:37,  1.41it/s]
loss 1.54 accuracy 0.31 -- 55.98 + 56.98 + 495.81 + 4.79 = 613.57:  64%|██████▍   | 1320/2048 [16:32<08:37,  1.41it/s]
loss 1.54 accuracy 0.31 -- 55.98 + 56.98 + 495.81 + 4.79 = 613.57:  65%|██████▍   | 1321/2048 [16:32<08:21,  1.45it/s]
loss 2.01 accuracy 0.38 -- 158.54 + 56.72 + 489.26 + 4.78 = 709.30:  65%|██████▍   | 1321/2048 [16:33<08:21,  1.45it/s]
loss 2.01 accuracy 0.38 -- 158.54 + 56.72 + 489.26 + 4.78 = 709.30:  65%|██████▍   | 1322/2048 [16:33<08:31,  1.42it/s]
loss 1.79 accuracy 0.44 -- 55.58 + 166.32 + 501.41 + 4.77 = 728.07:  65%|██████▍   | 1322/2048 [16:34<08:31,  1.42it/s]
loss 1.79 accuracy 0.44 -- 55.58 + 166.32 + 501.41 + 4.77 = 728.07:  65%|██████▍   | 1323/2048 [16:34<08:41,  1.39it/s]
loss 1.44 accuracy 0.56 -- 56.75 + 56.31 + 498.30 + 4.78 = 616.14:  65%|██████▍   | 1323/2048 [16:34<08:41,  1.39it/s] 
loss 1.44 accuracy 0.56 -- 56.75 + 56.31 + 498.30 + 4.78 = 616.14:  65%|██████▍   | 1324/2048 [16:34<08:24,  1.44it/s]
loss 1.55 accuracy 0.50 -- 162.58 + 56.97 + 496.36 + 4.80 = 720.71:  65%|██████▍   | 1324/2048 [16:35<08:24,  1.44it/s]
loss 1.55 accuracy 0.50 -- 162.58 + 56.97 + 496.36 + 4.80 = 720.71:  65%|██████▍   | 1325/2048 [16:35<08:34,  1.40it/s]
loss 1.67 accuracy 0.38 -- 56.03 + 57.36 + 619.81 + 4.78 = 737.98:  65%|██████▍   | 1325/2048 [16:36<08:34,  1.40it/s] 
loss 1.67 accuracy 0.38 -- 56.03 + 57.36 + 619.81 + 4.78 = 737.98:  65%|██████▍   | 1326/2048 [16:36<08:45,  1.37it/s]
loss 1.67 accuracy 0.31 -- 56.81 + 56.47 + 505.71 + 4.84 = 623.84:  65%|██████▍   | 1326/2048 [16:37<08:45,  1.37it/s]
loss 1.67 accuracy 0.31 -- 56.81 + 56.47 + 505.71 + 4.84 = 623.84:  65%|██████▍   | 1327/2048 [16:37<08:28,  1.42it/s]
loss 2.77 accuracy 0.25 -- 56.18 + 57.24 + 615.05 + 4.79 = 733.25:  65%|██████▍   | 1327/2048 [16:37<08:28,  1.42it/s]
loss 2.77 accuracy 0.25 -- 56.18 + 57.24 + 615.05 + 4.79 = 733.25:  65%|██████▍   | 1328/2048 [16:37<08:39,  1.39it/s]
loss 1.52 accuracy 0.38 -- 56.68 + 56.48 + 502.06 + 4.77 = 619.99:  65%|██████▍   | 1328/2048 [16:38<08:39,  1.39it/s]
loss 1.52 accuracy 0.38 -- 56.68 + 56.48 + 502.06 + 4.77 = 619.99:  65%|██████▍   | 1329/2048 [16:38<08:22,  1.43it/s]
loss 1.76 accuracy 0.31 -- 56.21 + 166.21 + 501.49 + 4.78 = 728.69:  65%|██████▍   | 1329/2048 [16:39<08:22,  1.43it/s]
loss 1.76 accuracy 0.31 -- 56.21 + 166.21 + 501.49 + 4.78 = 728.69:  65%|██████▍   | 1330/2048 [16:39<08:34,  1.40it/s]
loss 1.57 accuracy 0.56 -- 56.30 + 56.46 + 498.90 + 4.77 = 616.43:  65%|██████▍   | 1330/2048 [16:39<08:34,  1.40it/s] 
loss 1.57 accuracy 0.56 -- 56.30 + 56.46 + 498.90 + 4.77 = 616.43:  65%|██████▍   | 1331/2048 [16:39<08:17,  1.44it/s]
loss 2.98 accuracy 0.25 -- 56.76 + 56.59 + 499.08 + 4.80 = 617.23:  65%|██████▍   | 1331/2048 [16:40<08:17,  1.44it/s]
loss 2.98 accuracy 0.25 -- 56.76 + 56.59 + 499.08 + 4.80 = 617.23:  65%|██████▌   | 1332/2048 [16:40<08:29,  1.41it/s]
loss 1.74 accuracy 0.38 -- 56.31 + 57.82 + 497.52 + 4.77 = 616.42:  65%|██████▌   | 1332/2048 [16:41<08:29,  1.41it/s]
loss 1.74 accuracy 0.38 -- 56.31 + 57.82 + 497.52 + 4.77 = 616.42:  65%|██████▌   | 1333/2048 [16:41<08:14,  1.45it/s]
loss 2.14 accuracy 0.31 -- 157.73 + 56.92 + 489.59 + 4.78 = 709.01:  65%|██████▌   | 1333/2048 [16:42<08:14,  1.45it/s]
loss 2.14 accuracy 0.31 -- 157.73 + 56.92 + 489.59 + 4.78 = 709.01:  65%|██████▌   | 1334/2048 [16:42<08:23,  1.42it/s]
loss 2.10 accuracy 0.25 -- 55.81 + 166.95 + 501.31 + 4.77 = 728.83:  65%|██████▌   | 1334/2048 [16:42<08:23,  1.42it/s]
loss 2.10 accuracy 0.25 -- 55.81 + 166.95 + 501.31 + 4.77 = 728.83:  65%|██████▌   | 1335/2048 [16:42<08:33,  1.39it/s]
loss 1.58 accuracy 0.44 -- 56.56 + 56.20 + 498.27 + 4.76 = 615.80:  65%|██████▌   | 1335/2048 [16:43<08:33,  1.39it/s] 
loss 1.58 accuracy 0.44 -- 56.56 + 56.20 + 498.27 + 4.76 = 615.80:  65%|██████▌   | 1336/2048 [16:43<08:16,  1.43it/s]
loss 2.06 accuracy 0.12 -- 163.05 + 57.35 + 496.91 + 4.81 = 722.11:  65%|██████▌   | 1336/2048 [16:44<08:16,  1.43it/s]
loss 2.06 accuracy 0.12 -- 163.05 + 57.35 + 496.91 + 4.81 = 722.11:  65%|██████▌   | 1337/2048 [16:44<08:26,  1.40it/s]
loss 1.60 accuracy 0.38 -- 56.22 + 56.97 + 621.37 + 4.78 = 739.34:  65%|██████▌   | 1337/2048 [16:44<08:26,  1.40it/s] 
loss 1.60 accuracy 0.38 -- 56.22 + 56.97 + 621.37 + 4.78 = 739.34:  65%|██████▌   | 1338/2048 [16:44<08:37,  1.37it/s]
loss 1.89 accuracy 0.25 -- 57.05 + 56.18 + 505.27 + 4.78 = 623.28:  65%|██████▌   | 1338/2048 [16:45<08:37,  1.37it/s]
loss 1.89 accuracy 0.25 -- 57.05 + 56.18 + 505.27 + 4.78 = 623.28:  65%|██████▌   | 1339/2048 [16:45<08:20,  1.42it/s]
loss 2.00 accuracy 0.25 -- 56.01 + 57.23 + 615.06 + 4.78 = 733.08:  65%|██████▌   | 1339/2048 [16:46<08:20,  1.42it/s]
loss 2.00 accuracy 0.25 -- 56.01 + 57.23 + 615.06 + 4.78 = 733.08:  65%|██████▌   | 1340/2048 [16:46<08:31,  1.39it/s]
loss 1.87 accuracy 0.25 -- 56.30 + 56.35 + 501.25 + 4.77 = 618.67:  65%|██████▌   | 1340/2048 [16:47<08:31,  1.39it/s]
loss 1.87 accuracy 0.25 -- 56.30 + 56.35 + 501.25 + 4.77 = 618.67:  65%|██████▌   | 1341/2048 [16:47<08:14,  1.43it/s]
loss 1.80 accuracy 0.44 -- 56.23 + 166.49 + 502.23 + 4.78 = 729.74:  65%|██████▌   | 1341/2048 [16:47<08:14,  1.43it/s]
loss 1.80 accuracy 0.44 -- 56.23 + 166.49 + 502.23 + 4.78 = 729.74:  66%|██████▌   | 1342/2048 [16:47<08:25,  1.40it/s]
loss 1.99 accuracy 0.19 -- 56.30 + 56.25 + 499.53 + 4.78 = 616.86:  66%|██████▌   | 1342/2048 [16:48<08:25,  1.40it/s] 
loss 1.99 accuracy 0.19 -- 56.30 + 56.25 + 499.53 + 4.78 = 616.86:  66%|██████▌   | 1343/2048 [16:48<08:09,  1.44it/s]
loss 1.76 accuracy 0.19 -- 56.68 + 56.66 + 498.84 + 4.77 = 616.96:  66%|██████▌   | 1343/2048 [16:49<08:09,  1.44it/s]
loss 1.76 accuracy 0.19 -- 56.68 + 56.66 + 498.84 + 4.77 = 616.96:  66%|██████▌   | 1344/2048 [16:49<08:21,  1.41it/s]
loss 2.04 accuracy 0.25 -- 56.09 + 57.97 + 495.37 + 4.79 = 614.22:  66%|██████▌   | 1344/2048 [16:49<08:21,  1.41it/s]
loss 2.04 accuracy 0.25 -- 56.09 + 57.97 + 495.37 + 4.79 = 614.22:  66%|██████▌   | 1345/2048 [16:49<08:12,  1.43it/s]
loss 2.18 accuracy 0.25 -- 157.45 + 56.85 + 491.28 + 4.77 = 710.36:  66%|██████▌   | 1345/2048 [16:50<08:12,  1.43it/s]
loss 2.18 accuracy 0.25 -- 157.45 + 56.85 + 491.28 + 4.77 = 710.36:  66%|██████▌   | 1346/2048 [16:50<08:19,  1.40it/s]
loss 2.09 accuracy 0.31 -- 55.98 + 166.76 + 501.93 + 4.77 = 729.44:  66%|██████▌   | 1346/2048 [16:51<08:19,  1.40it/s]
loss 2.09 accuracy 0.31 -- 55.98 + 166.76 + 501.93 + 4.77 = 729.44:  66%|██████▌   | 1347/2048 [16:51<08:28,  1.38it/s]
loss 1.85 accuracy 0.25 -- 56.77 + 56.44 + 497.64 + 4.77 = 615.62:  66%|██████▌   | 1347/2048 [16:51<08:28,  1.38it/s] 
loss 1.85 accuracy 0.25 -- 56.77 + 56.44 + 497.64 + 4.77 = 615.62:  66%|██████▌   | 1348/2048 [16:51<08:10,  1.43it/s]
loss 1.42 accuracy 0.50 -- 162.87 + 57.14 + 500.13 + 4.80 = 724.93:  66%|██████▌   | 1348/2048 [16:52<08:10,  1.43it/s]
loss 1.42 accuracy 0.50 -- 162.87 + 57.14 + 500.13 + 4.80 = 724.93:  66%|██████▌   | 1349/2048 [16:52<08:20,  1.40it/s]
loss 2.26 accuracy 0.19 -- 56.30 + 57.62 + 622.68 + 4.78 = 741.38:  66%|██████▌   | 1349/2048 [16:53<08:20,  1.40it/s] 
loss 2.26 accuracy 0.19 -- 56.30 + 57.62 + 622.68 + 4.78 = 741.38:  66%|██████▌   | 1350/2048 [16:53<08:30,  1.37it/s]
loss 2.34 accuracy 0.31 -- 56.93 + 56.18 + 506.01 + 4.76 = 623.88:  66%|██████▌   | 1350/2048 [16:54<08:30,  1.37it/s]
loss 2.34 accuracy 0.31 -- 56.93 + 56.18 + 506.01 + 4.76 = 623.88:  66%|██████▌   | 1351/2048 [16:54<08:13,  1.41it/s]
loss 2.05 accuracy 0.38 -- 56.42 + 57.44 + 616.97 + 4.79 = 735.61:  66%|██████▌   | 1351/2048 [16:54<08:13,  1.41it/s]
loss 2.05 accuracy 0.38 -- 56.42 + 57.44 + 616.97 + 4.79 = 735.61:  66%|██████▌   | 1352/2048 [16:54<08:23,  1.38it/s]
loss 1.78 accuracy 0.31 -- 56.87 + 56.56 + 503.39 + 4.78 = 621.59:  66%|██████▌   | 1352/2048 [16:55<08:23,  1.38it/s]
loss 1.78 accuracy 0.31 -- 56.87 + 56.56 + 503.39 + 4.78 = 621.59:  66%|██████▌   | 1353/2048 [16:55<08:07,  1.43it/s]
loss 2.17 accuracy 0.25 -- 56.53 + 166.80 + 503.08 + 4.78 = 731.20:  66%|██████▌   | 1353/2048 [16:56<08:07,  1.43it/s]
loss 2.17 accuracy 0.25 -- 56.53 + 166.80 + 503.08 + 4.78 = 731.20:  66%|██████▌   | 1354/2048 [16:56<08:18,  1.39it/s]
loss 1.94 accuracy 0.50 -- 56.20 + 56.03 + 501.52 + 4.78 = 618.53:  66%|██████▌   | 1354/2048 [16:56<08:18,  1.39it/s] 
loss 1.94 accuracy 0.50 -- 56.20 + 56.03 + 501.52 + 4.78 = 618.53:  66%|██████▌   | 1355/2048 [16:56<08:02,  1.44it/s]
loss 1.85 accuracy 0.25 -- 56.79 + 56.46 + 498.95 + 4.77 = 616.97:  66%|██████▌   | 1355/2048 [16:57<08:02,  1.44it/s]
loss 1.85 accuracy 0.25 -- 56.79 + 56.46 + 498.95 + 4.77 = 616.97:  66%|██████▌   | 1356/2048 [16:57<08:13,  1.40it/s]
loss 1.67 accuracy 0.56 -- 55.98 + 57.25 + 494.95 + 4.78 = 612.95:  66%|██████▌   | 1356/2048 [16:58<08:13,  1.40it/s]
loss 1.67 accuracy 0.56 -- 55.98 + 57.25 + 494.95 + 4.78 = 612.95:  66%|██████▋   | 1357/2048 [16:58<07:57,  1.45it/s]
loss 1.35 accuracy 0.56 -- 157.67 + 57.00 + 489.45 + 4.77 = 708.90:  66%|██████▋   | 1357/2048 [16:59<07:57,  1.45it/s]
loss 1.35 accuracy 0.56 -- 157.67 + 57.00 + 489.45 + 4.77 = 708.90:  66%|██████▋   | 1358/2048 [16:59<08:06,  1.42it/s]
loss 2.05 accuracy 0.25 -- 55.57 + 166.02 + 501.32 + 4.78 = 727.68:  66%|██████▋   | 1358/2048 [16:59<08:06,  1.42it/s]
loss 2.05 accuracy 0.25 -- 55.57 + 166.02 + 501.32 + 4.78 = 727.68:  66%|██████▋   | 1359/2048 [16:59<08:15,  1.39it/s]
loss 1.76 accuracy 0.44 -- 56.39 + 56.36 + 498.25 + 4.77 = 615.76:  66%|██████▋   | 1359/2048 [17:00<08:15,  1.39it/s] 
loss 1.76 accuracy 0.44 -- 56.39 + 56.36 + 498.25 + 4.77 = 615.76:  66%|██████▋   | 1360/2048 [17:00<07:59,  1.44it/s]
loss 1.96 accuracy 0.19 -- 162.61 + 57.32 + 497.92 + 4.76 = 722.61:  66%|██████▋   | 1360/2048 [17:01<07:59,  1.44it/s]
loss 1.96 accuracy 0.19 -- 162.61 + 57.32 + 497.92 + 4.76 = 722.61:  66%|██████▋   | 1361/2048 [17:01<08:09,  1.40it/s]
loss 1.98 accuracy 0.31 -- 56.06 + 57.15 + 620.92 + 4.79 = 738.92:  66%|██████▋   | 1361/2048 [17:02<08:09,  1.40it/s] 
loss 1.98 accuracy 0.31 -- 56.06 + 57.15 + 620.92 + 4.79 = 738.92:  67%|██████▋   | 1362/2048 [17:02<08:19,  1.37it/s]
loss 1.78 accuracy 0.31 -- 56.80 + 56.30 + 505.53 + 4.76 = 623.39:  67%|██████▋   | 1362/2048 [17:02<08:19,  1.37it/s]
loss 1.78 accuracy 0.31 -- 56.80 + 56.30 + 505.53 + 4.76 = 623.39:  67%|██████▋   | 1363/2048 [17:02<08:03,  1.42it/s]
loss 1.97 accuracy 0.19 -- 55.92 + 56.99 + 616.19 + 4.78 = 733.88:  67%|██████▋   | 1363/2048 [17:03<08:03,  1.42it/s]
loss 1.97 accuracy 0.19 -- 55.92 + 56.99 + 616.19 + 4.78 = 733.88:  67%|██████▋   | 1364/2048 [17:03<08:13,  1.39it/s]
loss 2.25 accuracy 0.06 -- 56.74 + 56.34 + 503.07 + 4.77 = 620.92:  67%|██████▋   | 1364/2048 [17:04<08:13,  1.39it/s]
loss 2.25 accuracy 0.06 -- 56.74 + 56.34 + 503.07 + 4.77 = 620.92:  67%|██████▋   | 1365/2048 [17:04<07:57,  1.43it/s]
loss 1.46 accuracy 0.44 -- 56.38 + 166.73 + 504.01 + 4.83 = 731.96:  67%|██████▋   | 1365/2048 [17:04<07:57,  1.43it/s]
loss 1.46 accuracy 0.44 -- 56.38 + 166.73 + 504.01 + 4.83 = 731.96:  67%|██████▋   | 1366/2048 [17:04<08:09,  1.39it/s]
loss 1.65 accuracy 0.38 -- 56.34 + 56.63 + 499.71 + 4.76 = 617.43:  67%|██████▋   | 1366/2048 [17:05<08:09,  1.39it/s] 
loss 1.65 accuracy 0.38 -- 56.34 + 56.63 + 499.71 + 4.76 = 617.43:  67%|██████▋   | 1367/2048 [17:05<07:53,  1.44it/s]
loss 1.48 accuracy 0.50 -- 56.72 + 56.71 + 498.12 + 4.78 = 616.33:  67%|██████▋   | 1367/2048 [17:06<07:53,  1.44it/s]
loss 1.48 accuracy 0.50 -- 56.72 + 56.71 + 498.12 + 4.78 = 616.33:  67%|██████▋   | 1368/2048 [17:06<08:04,  1.40it/s]
loss 2.04 accuracy 0.50 -- 55.99 + 57.21 + 494.90 + 4.77 = 612.86:  67%|██████▋   | 1368/2048 [17:06<08:04,  1.40it/s]
loss 2.04 accuracy 0.50 -- 55.99 + 57.21 + 494.90 + 4.77 = 612.86:  67%|██████▋   | 1369/2048 [17:06<07:55,  1.43it/s]
loss 1.59 accuracy 0.44 -- 157.57 + 56.95 + 491.30 + 4.79 = 710.61:  67%|██████▋   | 1369/2048 [17:07<07:55,  1.43it/s]
loss 1.59 accuracy 0.44 -- 157.57 + 56.95 + 491.30 + 4.79 = 710.61:  67%|██████▋   | 1370/2048 [17:07<08:02,  1.40it/s]
loss 1.80 accuracy 0.31 -- 55.90 + 166.35 + 501.66 + 4.77 = 728.69:  67%|██████▋   | 1370/2048 [17:08<08:02,  1.40it/s]
loss 1.80 accuracy 0.31 -- 55.90 + 166.35 + 501.66 + 4.77 = 728.69:  67%|██████▋   | 1371/2048 [17:08<08:10,  1.38it/s]
loss 1.87 accuracy 0.31 -- 56.71 + 57.83 + 499.35 + 4.77 = 618.66:  67%|██████▋   | 1371/2048 [17:09<08:10,  1.38it/s] 
loss 1.87 accuracy 0.31 -- 56.71 + 57.83 + 499.35 + 4.77 = 618.66:  67%|██████▋   | 1372/2048 [17:09<07:54,  1.43it/s]
loss 2.07 accuracy 0.19 -- 162.60 + 56.95 + 497.27 + 4.78 = 721.60:  67%|██████▋   | 1372/2048 [17:09<07:54,  1.43it/s]
loss 2.07 accuracy 0.19 -- 162.60 + 56.95 + 497.27 + 4.78 = 721.60:  67%|██████▋   | 1373/2048 [17:09<08:02,  1.40it/s]
loss 1.89 accuracy 0.38 -- 56.23 + 57.41 + 621.06 + 4.78 = 739.48:  67%|██████▋   | 1373/2048 [17:10<08:02,  1.40it/s] 
loss 1.89 accuracy 0.38 -- 56.23 + 57.41 + 621.06 + 4.78 = 739.48:  67%|██████▋   | 1374/2048 [17:10<08:12,  1.37it/s]
loss 1.60 accuracy 0.38 -- 56.35 + 56.24 + 504.41 + 4.78 = 621.78:  67%|██████▋   | 1374/2048 [17:11<08:12,  1.37it/s]
loss 1.60 accuracy 0.38 -- 56.35 + 56.24 + 504.41 + 4.78 = 621.78:  67%|██████▋   | 1375/2048 [17:11<07:55,  1.42it/s]
loss 1.63 accuracy 0.50 -- 56.41 + 57.29 + 615.75 + 4.78 = 734.24:  67%|██████▋   | 1375/2048 [17:11<07:55,  1.42it/s]
loss 1.63 accuracy 0.50 -- 56.41 + 57.29 + 615.75 + 4.78 = 734.24:  67%|██████▋   | 1376/2048 [17:11<08:05,  1.38it/s]
loss 1.68 accuracy 0.31 -- 56.47 + 56.61 + 502.03 + 4.76 = 619.87:  67%|██████▋   | 1376/2048 [17:12<08:05,  1.38it/s]
loss 1.68 accuracy 0.31 -- 56.47 + 56.61 + 502.03 + 4.76 = 619.87:  67%|██████▋   | 1377/2048 [17:12<07:49,  1.43it/s]
loss 1.42 accuracy 0.44 -- 56.48 + 166.77 + 502.69 + 4.82 = 730.75:  67%|██████▋   | 1377/2048 [17:13<07:49,  1.43it/s]
loss 1.42 accuracy 0.44 -- 56.48 + 166.77 + 502.69 + 4.82 = 730.75:  67%|██████▋   | 1378/2048 [17:13<08:00,  1.39it/s]
loss 1.82 accuracy 0.19 -- 55.99 + 56.34 + 499.36 + 4.78 = 616.47:  67%|██████▋   | 1378/2048 [17:14<08:00,  1.39it/s] 
loss 1.82 accuracy 0.19 -- 55.99 + 56.34 + 499.36 + 4.78 = 616.47:  67%|██████▋   | 1379/2048 [17:14<07:45,  1.44it/s]
loss 1.55 accuracy 0.62 -- 56.74 + 56.67 + 499.33 + 4.78 = 617.53:  67%|██████▋   | 1379/2048 [17:14<07:45,  1.44it/s]
loss 1.55 accuracy 0.62 -- 56.74 + 56.67 + 499.33 + 4.78 = 617.53:  67%|██████▋   | 1380/2048 [17:14<07:55,  1.40it/s]
loss 1.76 accuracy 0.44 -- 56.09 + 57.35 + 495.20 + 4.80 = 613.44:  67%|██████▋   | 1380/2048 [17:15<07:55,  1.40it/s]
loss 1.76 accuracy 0.44 -- 56.09 + 57.35 + 495.20 + 4.80 = 613.44:  67%|██████▋   | 1381/2048 [17:15<07:40,  1.45it/s]
loss 1.33 accuracy 0.50 -- 156.95 + 56.96 + 489.38 + 4.78 = 708.07:  67%|██████▋   | 1381/2048 [17:16<07:40,  1.45it/s]
loss 1.33 accuracy 0.50 -- 156.95 + 56.96 + 489.38 + 4.78 = 708.07:  67%|██████▋   | 1382/2048 [17:16<07:49,  1.42it/s]
loss 1.87 accuracy 0.19 -- 55.75 + 166.77 + 502.82 + 4.78 = 730.13:  67%|██████▋   | 1382/2048 [17:16<07:49,  1.42it/s]
loss 1.87 accuracy 0.19 -- 55.75 + 166.77 + 502.82 + 4.78 = 730.13:  68%|██████▊   | 1383/2048 [17:16<07:58,  1.39it/s]
loss 1.66 accuracy 0.25 -- 57.04 + 56.76 + 498.40 + 4.76 = 616.96:  68%|██████▊   | 1383/2048 [17:17<07:58,  1.39it/s] 
loss 1.66 accuracy 0.25 -- 57.04 + 56.76 + 498.40 + 4.76 = 616.96:  68%|██████▊   | 1384/2048 [17:17<07:43,  1.43it/s]
loss 1.73 accuracy 0.50 -- 162.93 + 56.98 + 497.03 + 4.78 = 721.72:  68%|██████▊   | 1384/2048 [17:18<07:43,  1.43it/s]
loss 1.73 accuracy 0.50 -- 162.93 + 56.98 + 497.03 + 4.78 = 721.72:  68%|██████▊   | 1385/2048 [17:18<07:52,  1.40it/s]
loss 2.07 accuracy 0.25 -- 55.93 + 57.56 + 620.56 + 4.78 = 738.83:  68%|██████▊   | 1385/2048 [17:19<07:52,  1.40it/s] 
loss 2.07 accuracy 0.25 -- 55.93 + 57.56 + 620.56 + 4.78 = 738.83:  68%|██████▊   | 1386/2048 [17:19<08:02,  1.37it/s]
loss 1.62 accuracy 0.44 -- 57.05 + 56.35 + 504.98 + 4.78 = 623.17:  68%|██████▊   | 1386/2048 [17:19<08:02,  1.37it/s]
loss 1.62 accuracy 0.44 -- 57.05 + 56.35 + 504.98 + 4.78 = 623.17:  68%|██████▊   | 1387/2048 [17:19<07:46,  1.42it/s]
loss 1.58 accuracy 0.38 -- 56.24 + 57.18 + 616.62 + 4.80 = 734.84:  68%|██████▊   | 1387/2048 [17:20<07:46,  1.42it/s]
loss 1.58 accuracy 0.38 -- 56.24 + 57.18 + 616.62 + 4.80 = 734.84:  68%|██████▊   | 1388/2048 [17:20<07:56,  1.38it/s]
loss 1.78 accuracy 0.25 -- 56.71 + 56.50 + 503.80 + 4.78 = 621.79:  68%|██████▊   | 1388/2048 [17:21<07:56,  1.38it/s]
loss 1.78 accuracy 0.25 -- 56.71 + 56.50 + 503.80 + 4.78 = 621.79:  68%|██████▊   | 1389/2048 [17:21<07:41,  1.43it/s]
loss 1.99 accuracy 0.19 -- 56.27 + 166.72 + 502.43 + 4.79 = 730.20:  68%|██████▊   | 1389/2048 [17:21<07:41,  1.43it/s]
loss 1.99 accuracy 0.19 -- 56.27 + 166.72 + 502.43 + 4.79 = 730.20:  68%|██████▊   | 1390/2048 [17:21<07:52,  1.39it/s]
loss 1.78 accuracy 0.31 -- 56.13 + 56.25 + 499.09 + 4.78 = 616.24:  68%|██████▊   | 1390/2048 [17:22<07:52,  1.39it/s] 
loss 1.78 accuracy 0.31 -- 56.13 + 56.25 + 499.09 + 4.78 = 616.24:  68%|██████▊   | 1391/2048 [17:22<07:36,  1.44it/s]
loss 1.78 accuracy 0.25 -- 56.67 + 56.62 + 498.09 + 4.79 = 616.17:  68%|██████▊   | 1391/2048 [17:23<07:36,  1.44it/s]
loss 1.78 accuracy 0.25 -- 56.67 + 56.62 + 498.09 + 4.79 = 616.17:  68%|██████▊   | 1392/2048 [17:23<07:46,  1.40it/s]
loss 2.47 accuracy 0.19 -- 56.02 + 57.09 + 494.90 + 4.77 = 612.79:  68%|██████▊   | 1392/2048 [17:23<07:46,  1.40it/s]
loss 2.47 accuracy 0.19 -- 56.02 + 57.09 + 494.90 + 4.77 = 612.79:  68%|██████▊   | 1393/2048 [17:23<07:39,  1.43it/s]
loss 1.67 accuracy 0.19 -- 158.16 + 56.44 + 490.33 + 4.78 = 709.70:  68%|██████▊   | 1393/2048 [17:24<07:39,  1.43it/s]
loss 1.67 accuracy 0.19 -- 158.16 + 56.44 + 490.33 + 4.78 = 709.70:  68%|██████▊   | 1394/2048 [17:24<07:45,  1.41it/s]
loss 1.78 accuracy 0.38 -- 55.99 + 167.25 + 502.37 + 4.79 = 730.40:  68%|██████▊   | 1394/2048 [17:25<07:45,  1.41it/s]
loss 1.78 accuracy 0.38 -- 55.99 + 167.25 + 502.37 + 4.79 = 730.40:  68%|██████▊   | 1395/2048 [17:25<07:53,  1.38it/s]
loss 2.17 accuracy 0.19 -- 56.58 + 56.26 + 497.82 + 4.76 = 615.42:  68%|██████▊   | 1395/2048 [17:26<07:53,  1.38it/s] 
loss 2.17 accuracy 0.19 -- 56.58 + 56.26 + 497.82 + 4.76 = 615.42:  68%|██████▊   | 1396/2048 [17:26<07:36,  1.43it/s]
loss 1.66 accuracy 0.19 -- 162.87 + 57.17 + 497.32 + 4.79 = 722.15:  68%|██████▊   | 1396/2048 [17:26<07:36,  1.43it/s]
loss 1.66 accuracy 0.19 -- 162.87 + 57.17 + 497.32 + 4.79 = 722.15:  68%|██████▊   | 1397/2048 [17:26<07:45,  1.40it/s]
loss 1.54 accuracy 0.50 -- 56.10 + 57.33 + 621.19 + 4.80 = 739.42:  68%|██████▊   | 1397/2048 [17:27<07:45,  1.40it/s] 
loss 1.54 accuracy 0.50 -- 56.10 + 57.33 + 621.19 + 4.80 = 739.42:  68%|██████▊   | 1398/2048 [17:27<07:54,  1.37it/s]
loss 1.79 accuracy 0.31 -- 56.74 + 56.52 + 504.80 + 4.78 = 622.83:  68%|██████▊   | 1398/2048 [17:28<07:54,  1.37it/s]
loss 1.79 accuracy 0.31 -- 56.74 + 56.52 + 504.80 + 4.78 = 622.83:  68%|██████▊   | 1399/2048 [17:28<07:38,  1.42it/s]
loss 1.83 accuracy 0.44 -- 56.21 + 57.09 + 617.55 + 4.78 = 735.64:  68%|██████▊   | 1399/2048 [17:29<07:38,  1.42it/s]
loss 1.83 accuracy 0.44 -- 56.21 + 57.09 + 617.55 + 4.78 = 735.64:  68%|██████▊   | 1400/2048 [17:29<07:48,  1.38it/s]
loss 2.09 accuracy 0.19 -- 56.80 + 56.49 + 501.94 + 4.77 = 620.00:  68%|██████▊   | 1400/2048 [17:29<07:48,  1.38it/s]
loss 2.09 accuracy 0.19 -- 56.80 + 56.49 + 501.94 + 4.77 = 620.00:  68%|██████▊   | 1401/2048 [17:29<07:40,  1.41it/s]
loss 1.82 accuracy 0.38 -- 56.30 + 166.11 + 501.72 + 4.79 = 728.92:  68%|██████▊   | 1401/2048 [17:30<07:40,  1.41it/s]
loss 1.82 accuracy 0.38 -- 56.30 + 166.11 + 501.72 + 4.79 = 728.92:  68%|██████▊   | 1402/2048 [17:30<07:48,  1.38it/s]
loss 1.56 accuracy 0.38 -- 56.36 + 56.22 + 498.65 + 4.79 = 616.02:  68%|██████▊   | 1402/2048 [17:31<07:48,  1.38it/s] 
loss 1.56 accuracy 0.38 -- 56.36 + 56.22 + 498.65 + 4.79 = 616.02:  69%|██████▊   | 1403/2048 [17:31<07:31,  1.43it/s]
loss 1.94 accuracy 0.19 -- 56.60 + 56.42 + 498.41 + 4.77 = 616.20:  69%|██████▊   | 1403/2048 [17:31<07:31,  1.43it/s]
loss 1.94 accuracy 0.19 -- 56.60 + 56.42 + 498.41 + 4.77 = 616.20:  69%|██████▊   | 1404/2048 [17:31<07:40,  1.40it/s]
loss 1.55 accuracy 0.25 -- 56.11 + 57.39 + 495.71 + 4.76 = 613.97:  69%|██████▊   | 1404/2048 [17:32<07:40,  1.40it/s]
loss 1.55 accuracy 0.25 -- 56.11 + 57.39 + 495.71 + 4.76 = 613.97:  69%|██████▊   | 1405/2048 [17:32<07:25,  1.44it/s]
loss 1.69 accuracy 0.38 -- 157.63 + 57.48 + 491.33 + 4.77 = 711.21:  69%|██████▊   | 1405/2048 [17:33<07:25,  1.44it/s]
loss 1.69 accuracy 0.38 -- 157.63 + 57.48 + 491.33 + 4.77 = 711.21:  69%|██████▊   | 1406/2048 [17:33<07:33,  1.42it/s]
loss 1.96 accuracy 0.44 -- 56.20 + 167.01 + 503.48 + 4.77 = 731.46:  69%|██████▊   | 1406/2048 [17:33<07:33,  1.42it/s]
loss 1.96 accuracy 0.44 -- 56.20 + 167.01 + 503.48 + 4.77 = 731.46:  69%|██████▊   | 1407/2048 [17:33<07:42,  1.38it/s]
loss 1.65 accuracy 0.44 -- 56.71 + 56.27 + 499.46 + 4.77 = 617.21:  69%|██████▊   | 1407/2048 [17:34<07:42,  1.38it/s] 
loss 1.65 accuracy 0.44 -- 56.71 + 56.27 + 499.46 + 4.77 = 617.21:  69%|██████▉   | 1408/2048 [17:34<07:27,  1.43it/s]
loss 2.61 accuracy 0.12 -- 163.18 + 57.15 + 497.24 + 4.83 = 722.40:  69%|██████▉   | 1408/2048 [17:35<07:27,  1.43it/s]
loss 2.61 accuracy 0.12 -- 163.18 + 57.15 + 497.24 + 4.83 = 722.40:  69%|██████▉   | 1409/2048 [17:35<07:36,  1.40it/s]
loss 2.18 accuracy 0.19 -- 56.08 + 57.48 + 619.61 + 4.78 = 737.95:  69%|██████▉   | 1409/2048 [17:36<07:36,  1.40it/s] 
loss 2.18 accuracy 0.19 -- 56.08 + 57.48 + 619.61 + 4.78 = 737.95:  69%|██████▉   | 1410/2048 [17:36<07:45,  1.37it/s]
loss 2.30 accuracy 0.12 -- 56.70 + 56.46 + 504.96 + 4.81 = 622.93:  69%|██████▉   | 1410/2048 [17:36<07:45,  1.37it/s]
loss 2.30 accuracy 0.12 -- 56.70 + 56.46 + 504.96 + 4.81 = 622.93:  69%|██████▉   | 1411/2048 [17:36<07:29,  1.42it/s]
loss 1.94 accuracy 0.25 -- 56.46 + 57.49 + 615.97 + 4.77 = 734.69:  69%|██████▉   | 1411/2048 [17:37<07:29,  1.42it/s]
loss 1.94 accuracy 0.25 -- 56.46 + 57.49 + 615.97 + 4.77 = 734.69:  69%|██████▉   | 1412/2048 [17:37<07:39,  1.38it/s]
loss 1.61 accuracy 0.56 -- 56.44 + 56.31 + 503.96 + 4.81 = 621.52:  69%|██████▉   | 1412/2048 [17:38<07:39,  1.38it/s]
loss 1.61 accuracy 0.56 -- 56.44 + 56.31 + 503.96 + 4.81 = 621.52:  69%|██████▉   | 1413/2048 [17:38<07:24,  1.43it/s]
loss 1.46 accuracy 0.44 -- 56.05 + 165.96 + 501.42 + 4.78 = 728.21:  69%|██████▉   | 1413/2048 [17:38<07:24,  1.43it/s]
loss 1.46 accuracy 0.44 -- 56.05 + 165.96 + 501.42 + 4.78 = 728.21:  69%|██████▉   | 1414/2048 [17:38<07:34,  1.39it/s]
loss 1.98 accuracy 0.31 -- 56.09 + 56.25 + 499.52 + 4.78 = 616.65:  69%|██████▉   | 1414/2048 [17:39<07:34,  1.39it/s] 
loss 1.98 accuracy 0.31 -- 56.09 + 56.25 + 499.52 + 4.78 = 616.65:  69%|██████▉   | 1415/2048 [17:39<07:20,  1.44it/s]
loss 2.28 accuracy 0.31 -- 56.48 + 56.42 + 497.24 + 4.78 = 614.92:  69%|██████▉   | 1415/2048 [17:40<07:20,  1.44it/s]
loss 2.28 accuracy 0.31 -- 56.48 + 56.42 + 497.24 + 4.78 = 614.92:  69%|██████▉   | 1416/2048 [17:40<07:29,  1.41it/s]
loss 1.71 accuracy 0.44 -- 56.34 + 57.25 + 496.41 + 4.80 = 614.80:  69%|██████▉   | 1416/2048 [17:41<07:29,  1.41it/s]
loss 1.71 accuracy 0.44 -- 56.34 + 57.25 + 496.41 + 4.80 = 614.80:  69%|██████▉   | 1417/2048 [17:41<07:22,  1.43it/s]
loss 1.65 accuracy 0.50 -- 157.81 + 56.83 + 489.57 + 4.77 = 708.98:  69%|██████▉   | 1417/2048 [17:41<07:22,  1.43it/s]
loss 1.65 accuracy 0.50 -- 157.81 + 56.83 + 489.57 + 4.77 = 708.98:  69%|██████▉   | 1418/2048 [17:41<07:28,  1.41it/s]
loss 2.15 accuracy 0.25 -- 55.91 + 166.34 + 501.36 + 4.78 = 728.39:  69%|██████▉   | 1418/2048 [17:42<07:28,  1.41it/s]
loss 2.15 accuracy 0.25 -- 55.91 + 166.34 + 501.36 + 4.78 = 728.39:  69%|██████▉   | 1419/2048 [17:42<07:35,  1.38it/s]
loss 2.25 accuracy 0.12 -- 56.60 + 56.53 + 497.87 + 4.78 = 615.78:  69%|██████▉   | 1419/2048 [17:43<07:35,  1.38it/s] 
loss 2.25 accuracy 0.12 -- 56.60 + 56.53 + 497.87 + 4.78 = 615.78:  69%|██████▉   | 1420/2048 [17:43<07:19,  1.43it/s]
loss 2.10 accuracy 0.19 -- 162.64 + 57.21 + 497.81 + 4.77 = 722.43:  69%|██████▉   | 1420/2048 [17:43<07:19,  1.43it/s]
loss 2.10 accuracy 0.19 -- 162.64 + 57.21 + 497.81 + 4.77 = 722.43:  69%|██████▉   | 1421/2048 [17:43<07:28,  1.40it/s]
loss 1.43 accuracy 0.50 -- 56.15 + 57.36 + 621.38 + 4.79 = 739.67:  69%|██████▉   | 1421/2048 [17:44<07:28,  1.40it/s] 
loss 1.43 accuracy 0.50 -- 56.15 + 57.36 + 621.38 + 4.79 = 739.67:  69%|██████▉   | 1422/2048 [17:44<07:37,  1.37it/s]
loss 1.96 accuracy 0.25 -- 56.99 + 56.61 + 505.62 + 4.77 = 623.99:  69%|██████▉   | 1422/2048 [17:45<07:37,  1.37it/s]
loss 1.96 accuracy 0.25 -- 56.99 + 56.61 + 505.62 + 4.77 = 623.99:  69%|██████▉   | 1423/2048 [17:45<07:21,  1.42it/s]
loss 1.65 accuracy 0.38 -- 56.40 + 57.20 + 616.52 + 4.79 = 734.91:  69%|██████▉   | 1423/2048 [17:46<07:21,  1.42it/s]
loss 1.65 accuracy 0.38 -- 56.40 + 57.20 + 616.52 + 4.79 = 734.91:  70%|██████▉   | 1424/2048 [17:46<07:31,  1.38it/s]
loss 1.40 accuracy 0.56 -- 57.05 + 56.54 + 502.32 + 4.77 = 620.67:  70%|██████▉   | 1424/2048 [17:46<07:31,  1.38it/s]
loss 1.40 accuracy 0.56 -- 57.05 + 56.54 + 502.32 + 4.77 = 620.67:  70%|██████▉   | 1425/2048 [17:46<07:22,  1.41it/s]
loss 1.76 accuracy 0.25 -- 55.94 + 165.84 + 502.74 + 4.79 = 729.31:  70%|██████▉   | 1425/2048 [17:47<07:22,  1.41it/s]
loss 1.76 accuracy 0.25 -- 55.94 + 165.84 + 502.74 + 4.79 = 729.31:  70%|██████▉   | 1426/2048 [17:47<07:30,  1.38it/s]
loss 1.58 accuracy 0.25 -- 56.00 + 56.35 + 499.86 + 4.77 = 616.98:  70%|██████▉   | 1426/2048 [17:48<07:30,  1.38it/s] 
loss 1.58 accuracy 0.25 -- 56.00 + 56.35 + 499.86 + 4.77 = 616.98:  70%|██████▉   | 1427/2048 [17:48<07:14,  1.43it/s]
loss 2.22 accuracy 0.38 -- 56.80 + 56.51 + 500.32 + 4.78 = 618.41:  70%|██████▉   | 1427/2048 [17:48<07:14,  1.43it/s]
loss 2.22 accuracy 0.38 -- 56.80 + 56.51 + 500.32 + 4.78 = 618.41:  70%|██████▉   | 1428/2048 [17:48<07:23,  1.40it/s]
loss 1.82 accuracy 0.31 -- 56.32 + 57.34 + 495.77 + 4.78 = 614.22:  70%|██████▉   | 1428/2048 [17:49<07:23,  1.40it/s]
loss 1.82 accuracy 0.31 -- 56.32 + 57.34 + 495.77 + 4.78 = 614.22:  70%|██████▉   | 1429/2048 [17:49<07:09,  1.44it/s]
loss 1.57 accuracy 0.50 -- 158.07 + 57.12 + 489.84 + 4.79 = 709.82:  70%|██████▉   | 1429/2048 [17:50<07:09,  1.44it/s]
loss 1.57 accuracy 0.50 -- 158.07 + 57.12 + 489.84 + 4.79 = 709.82:  70%|██████▉   | 1430/2048 [17:50<07:16,  1.42it/s]
loss 2.43 accuracy 0.25 -- 55.70 + 166.19 + 501.54 + 4.78 = 728.21:  70%|██████▉   | 1430/2048 [17:51<07:16,  1.42it/s]
loss 2.43 accuracy 0.25 -- 55.70 + 166.19 + 501.54 + 4.78 = 728.21:  70%|██████▉   | 1431/2048 [17:51<07:25,  1.39it/s]
loss 2.03 accuracy 0.25 -- 56.52 + 56.58 + 497.10 + 4.77 = 614.97:  70%|██████▉   | 1431/2048 [17:51<07:25,  1.39it/s] 
loss 2.03 accuracy 0.25 -- 56.52 + 56.58 + 497.10 + 4.77 = 614.97:  70%|██████▉   | 1432/2048 [17:51<07:09,  1.43it/s]
loss 2.17 accuracy 0.19 -- 162.22 + 57.17 + 496.76 + 4.77 = 720.92:  70%|██████▉   | 1432/2048 [17:52<07:09,  1.43it/s]
loss 2.17 accuracy 0.19 -- 162.22 + 57.17 + 496.76 + 4.77 = 720.92:  70%|██████▉   | 1433/2048 [17:52<07:18,  1.40it/s]
loss 2.74 accuracy 0.19 -- 56.06 + 57.42 + 621.27 + 4.77 = 739.52:  70%|██████▉   | 1433/2048 [17:53<07:18,  1.40it/s] 
loss 2.74 accuracy 0.19 -- 56.06 + 57.42 + 621.27 + 4.77 = 739.52:  70%|███████   | 1434/2048 [17:53<07:27,  1.37it/s]
loss 1.78 accuracy 0.25 -- 57.02 + 56.41 + 504.41 + 4.78 = 622.63:  70%|███████   | 1434/2048 [17:53<07:27,  1.37it/s]
loss 1.78 accuracy 0.25 -- 57.02 + 56.41 + 504.41 + 4.78 = 622.63:  70%|███████   | 1435/2048 [17:53<07:12,  1.42it/s]
loss 2.06 accuracy 0.44 -- 56.29 + 57.06 + 615.78 + 4.79 = 733.92:  70%|███████   | 1435/2048 [17:54<07:12,  1.42it/s]
loss 2.06 accuracy 0.44 -- 56.29 + 57.06 + 615.78 + 4.79 = 733.92:  70%|███████   | 1436/2048 [17:54<07:21,  1.39it/s]
loss 1.74 accuracy 0.31 -- 56.76 + 56.39 + 501.08 + 4.77 = 618.99:  70%|███████   | 1436/2048 [17:55<07:21,  1.39it/s]
loss 1.74 accuracy 0.31 -- 56.76 + 56.39 + 501.08 + 4.77 = 618.99:  70%|███████   | 1437/2048 [17:55<07:07,  1.43it/s]
loss 1.93 accuracy 0.19 -- 56.18 + 166.42 + 502.03 + 4.80 = 729.42:  70%|███████   | 1437/2048 [17:56<07:07,  1.43it/s]
loss 1.93 accuracy 0.19 -- 56.18 + 166.42 + 502.03 + 4.80 = 729.42:  70%|███████   | 1438/2048 [17:56<07:16,  1.40it/s]
loss 1.78 accuracy 0.38 -- 56.13 + 56.48 + 499.95 + 4.77 = 617.33:  70%|███████   | 1438/2048 [17:56<07:16,  1.40it/s] 
loss 1.78 accuracy 0.38 -- 56.13 + 56.48 + 499.95 + 4.77 = 617.33:  70%|███████   | 1439/2048 [17:56<07:02,  1.44it/s]
loss 1.70 accuracy 0.25 -- 56.70 + 56.77 + 497.65 + 4.78 = 615.90:  70%|███████   | 1439/2048 [17:57<07:02,  1.44it/s]
loss 1.70 accuracy 0.25 -- 56.70 + 56.77 + 497.65 + 4.78 = 615.90:  70%|███████   | 1440/2048 [17:57<07:12,  1.41it/s]
loss 2.03 accuracy 0.19 -- 55.93 + 57.35 + 495.11 + 4.79 = 613.17:  70%|███████   | 1440/2048 [17:58<07:12,  1.41it/s]
loss 2.03 accuracy 0.19 -- 55.93 + 57.35 + 495.11 + 4.79 = 613.17:  70%|███████   | 1441/2048 [17:58<06:58,  1.45it/s]
loss 1.74 accuracy 0.31 -- 157.87 + 57.03 + 490.31 + 4.78 = 709.99:  70%|███████   | 1441/2048 [17:58<06:58,  1.45it/s]
loss 1.74 accuracy 0.31 -- 157.87 + 57.03 + 490.31 + 4.78 = 709.99:  70%|███████   | 1442/2048 [17:58<07:06,  1.42it/s]
loss 2.06 accuracy 0.19 -- 55.78 + 166.88 + 501.72 + 4.77 = 729.15:  70%|███████   | 1442/2048 [17:59<07:06,  1.42it/s]
loss 2.06 accuracy 0.19 -- 55.78 + 166.88 + 501.72 + 4.77 = 729.15:  70%|███████   | 1443/2048 [17:59<07:15,  1.39it/s]
loss 1.97 accuracy 0.31 -- 56.81 + 56.73 + 498.45 + 4.77 = 616.76:  70%|███████   | 1443/2048 [18:00<07:15,  1.39it/s] 
loss 1.97 accuracy 0.31 -- 56.81 + 56.73 + 498.45 + 4.77 = 616.76:  71%|███████   | 1444/2048 [18:00<07:01,  1.43it/s]
loss 1.82 accuracy 0.38 -- 162.92 + 57.25 + 498.41 + 4.84 = 723.41:  71%|███████   | 1444/2048 [18:00<07:01,  1.43it/s]
loss 1.82 accuracy 0.38 -- 162.92 + 57.25 + 498.41 + 4.84 = 723.41:  71%|███████   | 1445/2048 [18:00<07:09,  1.40it/s]
loss 2.08 accuracy 0.56 -- 55.97 + 57.45 + 621.90 + 4.80 = 740.12:  71%|███████   | 1445/2048 [18:01<07:09,  1.40it/s] 
loss 2.08 accuracy 0.56 -- 55.97 + 57.45 + 621.90 + 4.80 = 740.12:  71%|███████   | 1446/2048 [18:01<07:18,  1.37it/s]
loss 1.59 accuracy 0.50 -- 57.20 + 56.77 + 506.18 + 4.77 = 624.91:  71%|███████   | 1446/2048 [18:02<07:18,  1.37it/s]
loss 1.59 accuracy 0.50 -- 57.20 + 56.77 + 506.18 + 4.77 = 624.91:  71%|███████   | 1447/2048 [18:02<07:04,  1.42it/s]
loss 2.02 accuracy 0.19 -- 56.57 + 57.62 + 617.39 + 4.79 = 736.37:  71%|███████   | 1447/2048 [18:03<07:04,  1.42it/s]
loss 2.02 accuracy 0.19 -- 56.57 + 57.62 + 617.39 + 4.79 = 736.37:  71%|███████   | 1448/2048 [18:03<07:13,  1.38it/s]
loss 1.84 accuracy 0.12 -- 56.85 + 56.23 + 501.94 + 4.78 = 619.80:  71%|███████   | 1448/2048 [18:03<07:13,  1.38it/s]
loss 1.84 accuracy 0.12 -- 56.85 + 56.23 + 501.94 + 4.78 = 619.80:  71%|███████   | 1449/2048 [18:03<07:05,  1.41it/s]
loss 1.77 accuracy 0.50 -- 56.32 + 166.28 + 501.54 + 4.78 = 728.92:  71%|███████   | 1449/2048 [18:04<07:05,  1.41it/s]
loss 1.77 accuracy 0.50 -- 56.32 + 166.28 + 501.54 + 4.78 = 728.92:  71%|███████   | 1450/2048 [18:04<07:13,  1.38it/s]
loss 2.00 accuracy 0.50 -- 56.15 + 56.80 + 499.43 + 4.77 = 617.15:  71%|███████   | 1450/2048 [18:05<07:13,  1.38it/s] 
loss 2.00 accuracy 0.50 -- 56.15 + 56.80 + 499.43 + 4.77 = 617.15:  71%|███████   | 1451/2048 [18:05<06:58,  1.43it/s]
loss 2.06 accuracy 0.25 -- 56.69 + 56.28 + 498.45 + 4.80 = 616.21:  71%|███████   | 1451/2048 [18:05<06:58,  1.43it/s]
loss 2.06 accuracy 0.25 -- 56.69 + 56.28 + 498.45 + 4.80 = 616.21:  71%|███████   | 1452/2048 [18:05<07:06,  1.40it/s]
loss 1.42 accuracy 0.50 -- 56.24 + 57.04 + 495.13 + 4.81 = 613.22:  71%|███████   | 1452/2048 [18:06<07:06,  1.40it/s]
loss 1.42 accuracy 0.50 -- 56.24 + 57.04 + 495.13 + 4.81 = 613.22:  71%|███████   | 1453/2048 [18:06<06:52,  1.44it/s]
loss 1.62 accuracy 0.44 -- 157.53 + 56.98 + 489.24 + 4.76 = 708.51:  71%|███████   | 1453/2048 [18:07<06:52,  1.44it/s]
loss 1.62 accuracy 0.44 -- 157.53 + 56.98 + 489.24 + 4.76 = 708.51:  71%|███████   | 1454/2048 [18:07<06:59,  1.42it/s]
loss 1.86 accuracy 0.31 -- 55.53 + 166.88 + 501.57 + 4.78 = 728.76:  71%|███████   | 1454/2048 [18:08<06:59,  1.42it/s]
loss 1.86 accuracy 0.31 -- 55.53 + 166.88 + 501.57 + 4.78 = 728.76:  71%|███████   | 1455/2048 [18:08<07:07,  1.39it/s]
loss 1.64 accuracy 0.31 -- 56.43 + 56.72 + 498.55 + 4.82 = 616.52:  71%|███████   | 1455/2048 [18:08<07:07,  1.39it/s] 
loss 1.64 accuracy 0.31 -- 56.43 + 56.72 + 498.55 + 4.82 = 616.52:  71%|███████   | 1456/2048 [18:08<06:52,  1.43it/s]
loss 1.91 accuracy 0.25 -- 163.55 + 57.38 + 497.09 + 4.76 = 722.78:  71%|███████   | 1456/2048 [18:09<06:52,  1.43it/s]
loss 1.91 accuracy 0.25 -- 163.55 + 57.38 + 497.09 + 4.76 = 722.78:  71%|███████   | 1457/2048 [18:09<07:01,  1.40it/s]
loss 1.71 accuracy 0.38 -- 56.03 + 57.05 + 620.54 + 4.80 = 738.43:  71%|███████   | 1457/2048 [18:10<07:01,  1.40it/s] 
loss 1.71 accuracy 0.38 -- 56.03 + 57.05 + 620.54 + 4.80 = 738.43:  71%|███████   | 1458/2048 [18:10<07:09,  1.37it/s]
loss 2.09 accuracy 0.25 -- 57.02 + 56.78 + 504.87 + 4.81 = 623.48:  71%|███████   | 1458/2048 [18:10<07:09,  1.37it/s]
loss 2.09 accuracy 0.25 -- 57.02 + 56.78 + 504.87 + 4.81 = 623.48:  71%|███████   | 1459/2048 [18:10<06:55,  1.42it/s]
loss 1.96 accuracy 0.06 -- 55.92 + 57.38 + 615.63 + 4.78 = 733.71:  71%|███████   | 1459/2048 [18:11<06:55,  1.42it/s]
loss 1.96 accuracy 0.06 -- 55.92 + 57.38 + 615.63 + 4.78 = 733.71:  71%|███████▏  | 1460/2048 [18:11<07:04,  1.39it/s]
loss 2.01 accuracy 0.31 -- 56.69 + 56.87 + 501.03 + 4.78 = 619.38:  71%|███████▏  | 1460/2048 [18:12<07:04,  1.39it/s]
loss 2.01 accuracy 0.31 -- 56.69 + 56.87 + 501.03 + 4.78 = 619.38:  71%|███████▏  | 1461/2048 [18:12<06:50,  1.43it/s]
loss 2.11 accuracy 0.38 -- 56.00 + 166.38 + 504.07 + 4.81 = 731.26:  71%|███████▏  | 1461/2048 [18:13<06:50,  1.43it/s]
loss 2.11 accuracy 0.38 -- 56.00 + 166.38 + 504.07 + 4.81 = 731.26:  71%|███████▏  | 1462/2048 [18:13<07:00,  1.39it/s]
loss 2.24 accuracy 0.38 -- 56.08 + 56.33 + 498.62 + 4.76 = 615.80:  71%|███████▏  | 1462/2048 [18:13<07:00,  1.39it/s] 
loss 2.24 accuracy 0.38 -- 56.08 + 56.33 + 498.62 + 4.76 = 615.80:  71%|███████▏  | 1463/2048 [18:13<06:46,  1.44it/s]
loss 1.82 accuracy 0.25 -- 56.49 + 56.23 + 497.41 + 4.78 = 614.91:  71%|███████▏  | 1463/2048 [18:14<06:46,  1.44it/s]
loss 1.82 accuracy 0.25 -- 56.49 + 56.23 + 497.41 + 4.78 = 614.91:  71%|███████▏  | 1464/2048 [18:14<06:55,  1.41it/s]
loss 1.49 accuracy 0.44 -- 56.15 + 57.57 + 494.96 + 4.77 = 613.45:  71%|███████▏  | 1464/2048 [18:15<06:55,  1.41it/s]
loss 1.49 accuracy 0.44 -- 56.15 + 57.57 + 494.96 + 4.77 = 613.45:  72%|███████▏  | 1465/2048 [18:15<06:48,  1.43it/s]
loss 1.85 accuracy 0.25 -- 158.14 + 56.63 + 489.61 + 4.77 = 709.15:  72%|███████▏  | 1465/2048 [18:15<06:48,  1.43it/s]
loss 1.85 accuracy 0.25 -- 158.14 + 56.63 + 489.61 + 4.77 = 709.15:  72%|███████▏  | 1466/2048 [18:15<06:53,  1.41it/s]
loss 1.79 accuracy 0.19 -- 55.67 + 165.99 + 499.95 + 4.77 = 726.38:  72%|███████▏  | 1466/2048 [18:16<06:53,  1.41it/s]
loss 1.79 accuracy 0.19 -- 55.67 + 165.99 + 499.95 + 4.77 = 726.38:  72%|███████▏  | 1467/2048 [18:16<07:00,  1.38it/s]
loss 1.41 accuracy 0.50 -- 56.90 + 57.11 + 498.53 + 4.77 = 617.31:  72%|███████▏  | 1467/2048 [18:17<07:00,  1.38it/s] 
loss 1.41 accuracy 0.50 -- 56.90 + 57.11 + 498.53 + 4.77 = 617.31:  72%|███████▏  | 1468/2048 [18:17<06:46,  1.43it/s]
loss 1.96 accuracy 0.12 -- 162.26 + 56.85 + 496.85 + 4.77 = 720.74:  72%|███████▏  | 1468/2048 [18:18<06:46,  1.43it/s]
loss 1.96 accuracy 0.12 -- 162.26 + 56.85 + 496.85 + 4.77 = 720.74:  72%|███████▏  | 1469/2048 [18:18<06:53,  1.40it/s]
loss 1.73 accuracy 0.38 -- 56.00 + 57.32 + 620.24 + 4.79 = 738.35:  72%|███████▏  | 1469/2048 [18:18<06:53,  1.40it/s] 
loss 1.73 accuracy 0.38 -- 56.00 + 57.32 + 620.24 + 4.79 = 738.35:  72%|███████▏  | 1470/2048 [18:18<07:01,  1.37it/s]
loss 1.87 accuracy 0.31 -- 56.76 + 56.54 + 504.27 + 4.82 = 622.38:  72%|███████▏  | 1470/2048 [18:19<07:01,  1.37it/s]
loss 1.87 accuracy 0.31 -- 56.76 + 56.54 + 504.27 + 4.82 = 622.38:  72%|███████▏  | 1471/2048 [18:19<06:47,  1.42it/s]
loss 1.71 accuracy 0.44 -- 56.16 + 57.25 + 616.81 + 4.78 = 735.01:  72%|███████▏  | 1471/2048 [18:20<06:47,  1.42it/s]
loss 1.71 accuracy 0.44 -- 56.16 + 57.25 + 616.81 + 4.78 = 735.01:  72%|███████▏  | 1472/2048 [18:20<06:56,  1.38it/s]
loss 1.61 accuracy 0.44 -- 56.72 + 56.77 + 504.25 + 4.79 = 622.53:  72%|███████▏  | 1472/2048 [18:20<06:56,  1.38it/s]
loss 1.61 accuracy 0.44 -- 56.72 + 56.77 + 504.25 + 4.79 = 622.53:  72%|███████▏  | 1473/2048 [18:20<06:48,  1.41it/s]
loss 1.57 accuracy 0.38 -- 56.53 + 167.05 + 501.91 + 4.79 = 730.28:  72%|███████▏  | 1473/2048 [18:21<06:48,  1.41it/s]
loss 1.57 accuracy 0.38 -- 56.53 + 167.05 + 501.91 + 4.79 = 730.28:  72%|███████▏  | 1474/2048 [18:21<06:56,  1.38it/s]
loss 1.85 accuracy 0.25 -- 56.14 + 56.63 + 499.52 + 4.76 = 617.05:  72%|███████▏  | 1474/2048 [18:22<06:56,  1.38it/s] 
loss 1.85 accuracy 0.25 -- 56.14 + 56.63 + 499.52 + 4.76 = 617.05:  72%|███████▏  | 1475/2048 [18:22<06:41,  1.43it/s]
loss 1.85 accuracy 0.19 -- 56.97 + 56.42 + 498.90 + 4.78 = 617.07:  72%|███████▏  | 1475/2048 [18:23<06:41,  1.43it/s]
loss 1.85 accuracy 0.19 -- 56.97 + 56.42 + 498.90 + 4.78 = 617.07:  72%|███████▏  | 1476/2048 [18:23<06:49,  1.40it/s]
loss 1.80 accuracy 0.38 -- 56.09 + 57.51 + 495.44 + 4.77 = 613.81:  72%|███████▏  | 1476/2048 [18:23<06:49,  1.40it/s]
loss 1.80 accuracy 0.38 -- 56.09 + 57.51 + 495.44 + 4.77 = 613.81:  72%|███████▏  | 1477/2048 [18:23<06:35,  1.44it/s]
loss 1.53 accuracy 0.44 -- 158.27 + 57.14 + 489.68 + 4.77 = 709.85:  72%|███████▏  | 1477/2048 [18:24<06:35,  1.44it/s]
loss 1.53 accuracy 0.44 -- 158.27 + 57.14 + 489.68 + 4.77 = 709.85:  72%|███████▏  | 1478/2048 [18:24<06:42,  1.42it/s]
loss 1.92 accuracy 0.38 -- 56.01 + 166.73 + 503.56 + 4.77 = 731.08:  72%|███████▏  | 1478/2048 [18:25<06:42,  1.42it/s]
loss 1.92 accuracy 0.38 -- 56.01 + 166.73 + 503.56 + 4.77 = 731.08:  72%|███████▏  | 1479/2048 [18:25<06:50,  1.39it/s]
loss 1.81 accuracy 0.25 -- 56.42 + 56.47 + 497.54 + 4.77 = 615.21:  72%|███████▏  | 1479/2048 [18:25<06:50,  1.39it/s] 
loss 1.81 accuracy 0.25 -- 56.42 + 56.47 + 497.54 + 4.77 = 615.21:  72%|███████▏  | 1480/2048 [18:25<06:36,  1.43it/s]
loss 1.47 accuracy 0.56 -- 162.87 + 57.09 + 496.74 + 4.77 = 721.48:  72%|███████▏  | 1480/2048 [18:26<06:36,  1.43it/s]
loss 1.47 accuracy 0.56 -- 162.87 + 57.09 + 496.74 + 4.77 = 721.48:  72%|███████▏  | 1481/2048 [18:26<06:44,  1.40it/s]
loss 1.61 accuracy 0.25 -- 55.99 + 57.21 + 620.59 + 4.78 = 738.56:  72%|███████▏  | 1481/2048 [18:27<06:44,  1.40it/s] 
loss 1.61 accuracy 0.25 -- 55.99 + 57.21 + 620.59 + 4.78 = 738.56:  72%|███████▏  | 1482/2048 [18:27<06:52,  1.37it/s]
loss 1.55 accuracy 0.50 -- 57.13 + 56.77 + 506.70 + 4.78 = 625.38:  72%|███████▏  | 1482/2048 [18:27<06:52,  1.37it/s]
loss 1.55 accuracy 0.50 -- 57.13 + 56.77 + 506.70 + 4.78 = 625.38:  72%|███████▏  | 1483/2048 [18:27<06:38,  1.42it/s]
loss 2.01 accuracy 0.31 -- 56.13 + 57.72 + 617.59 + 4.79 = 736.23:  72%|███████▏  | 1483/2048 [18:28<06:38,  1.42it/s]
loss 2.01 accuracy 0.31 -- 56.13 + 57.72 + 617.59 + 4.79 = 736.23:  72%|███████▏  | 1484/2048 [18:28<06:47,  1.38it/s]
loss 1.89 accuracy 0.44 -- 57.22 + 57.30 + 503.17 + 4.78 = 622.47:  72%|███████▏  | 1484/2048 [18:29<06:47,  1.38it/s]
loss 1.89 accuracy 0.44 -- 57.22 + 57.30 + 503.17 + 4.78 = 622.47:  73%|███████▎  | 1485/2048 [18:29<06:34,  1.43it/s]
loss 2.18 accuracy 0.12 -- 56.33 + 166.48 + 503.02 + 4.78 = 730.60:  73%|███████▎  | 1485/2048 [18:30<06:34,  1.43it/s]
loss 2.18 accuracy 0.12 -- 56.33 + 166.48 + 503.02 + 4.78 = 730.60:  73%|███████▎  | 1486/2048 [18:30<06:43,  1.39it/s]
loss 1.49 accuracy 0.50 -- 56.24 + 56.31 + 500.04 + 4.78 = 617.37:  73%|███████▎  | 1486/2048 [18:30<06:43,  1.39it/s] 
loss 1.49 accuracy 0.50 -- 56.24 + 56.31 + 500.04 + 4.78 = 617.37:  73%|███████▎  | 1487/2048 [18:30<06:30,  1.44it/s]
loss 2.22 accuracy 0.12 -- 56.83 + 56.49 + 497.87 + 4.79 = 615.97:  73%|███████▎  | 1487/2048 [18:31<06:30,  1.44it/s]
loss 2.22 accuracy 0.12 -- 56.83 + 56.49 + 497.87 + 4.79 = 615.97:  73%|███████▎  | 1488/2048 [18:31<06:38,  1.40it/s]
loss 1.81 accuracy 0.50 -- 56.02 + 57.27 + 494.49 + 4.83 = 612.60:  73%|███████▎  | 1488/2048 [18:32<06:38,  1.40it/s]
loss 1.81 accuracy 0.50 -- 56.02 + 57.27 + 494.49 + 4.83 = 612.60:  73%|███████▎  | 1489/2048 [18:32<06:26,  1.45it/s]
loss 1.67 accuracy 0.56 -- 157.26 + 56.92 + 492.08 + 4.78 = 711.03:  73%|███████▎  | 1489/2048 [18:32<06:26,  1.45it/s]
loss 1.67 accuracy 0.56 -- 157.26 + 56.92 + 492.08 + 4.78 = 711.03:  73%|███████▎  | 1490/2048 [18:32<06:33,  1.42it/s]
loss 2.06 accuracy 0.44 -- 55.87 + 166.66 + 501.10 + 4.79 = 728.43:  73%|███████▎  | 1490/2048 [18:33<06:33,  1.42it/s]
loss 2.06 accuracy 0.44 -- 55.87 + 166.66 + 501.10 + 4.79 = 728.43:  73%|███████▎  | 1491/2048 [18:33<06:41,  1.39it/s]
loss 2.49 accuracy 0.19 -- 56.66 + 56.41 + 498.10 + 4.76 = 615.92:  73%|███████▎  | 1491/2048 [18:34<06:41,  1.39it/s] 
loss 2.49 accuracy 0.19 -- 56.66 + 56.41 + 498.10 + 4.76 = 615.92:  73%|███████▎  | 1492/2048 [18:34<06:27,  1.43it/s]
loss 1.88 accuracy 0.25 -- 163.00 + 57.11 + 497.13 + 4.78 = 722.01:  73%|███████▎  | 1492/2048 [18:35<06:27,  1.43it/s]
loss 1.88 accuracy 0.25 -- 163.00 + 57.11 + 497.13 + 4.78 = 722.01:  73%|███████▎  | 1493/2048 [18:35<06:35,  1.40it/s]
loss 1.58 accuracy 0.44 -- 56.13 + 57.22 + 619.77 + 4.80 = 737.93:  73%|███████▎  | 1493/2048 [18:35<06:35,  1.40it/s] 
loss 1.58 accuracy 0.44 -- 56.13 + 57.22 + 619.77 + 4.80 = 737.93:  73%|███████▎  | 1494/2048 [18:35<06:43,  1.37it/s]
loss 1.30 accuracy 0.56 -- 56.98 + 56.54 + 505.20 + 4.79 = 623.52:  73%|███████▎  | 1494/2048 [18:36<06:43,  1.37it/s]
loss 1.30 accuracy 0.56 -- 56.98 + 56.54 + 505.20 + 4.79 = 623.52:  73%|███████▎  | 1495/2048 [18:36<06:29,  1.42it/s]
loss 1.61 accuracy 0.50 -- 56.27 + 57.31 + 617.13 + 4.78 = 735.50:  73%|███████▎  | 1495/2048 [18:37<06:29,  1.42it/s]
loss 1.61 accuracy 0.50 -- 56.27 + 57.31 + 617.13 + 4.78 = 735.50:  73%|███████▎  | 1496/2048 [18:37<06:38,  1.38it/s]
loss 1.66 accuracy 0.31 -- 57.02 + 56.99 + 504.50 + 4.78 = 623.28:  73%|███████▎  | 1496/2048 [18:37<06:38,  1.38it/s]
loss 1.66 accuracy 0.31 -- 57.02 + 56.99 + 504.50 + 4.78 = 623.28:  73%|███████▎  | 1497/2048 [18:37<06:31,  1.41it/s]
loss 1.58 accuracy 0.50 -- 56.22 + 166.58 + 501.88 + 4.77 = 729.45:  73%|███████▎  | 1497/2048 [18:38<06:31,  1.41it/s]
loss 1.58 accuracy 0.50 -- 56.22 + 166.58 + 501.88 + 4.77 = 729.45:  73%|███████▎  | 1498/2048 [18:38<06:38,  1.38it/s]
loss 2.05 accuracy 0.25 -- 56.23 + 56.63 + 499.90 + 4.76 = 617.52:  73%|███████▎  | 1498/2048 [18:39<06:38,  1.38it/s] 
loss 2.05 accuracy 0.25 -- 56.23 + 56.63 + 499.90 + 4.76 = 617.52:  73%|███████▎  | 1499/2048 [18:39<06:24,  1.43it/s]
loss 1.83 accuracy 0.19 -- 56.86 + 56.29 + 498.22 + 4.78 = 616.14:  73%|███████▎  | 1499/2048 [18:40<06:24,  1.43it/s]
loss 1.83 accuracy 0.19 -- 56.86 + 56.29 + 498.22 + 4.78 = 616.14:  73%|███████▎  | 1500/2048 [18:40<06:32,  1.40it/s]
loss 1.86 accuracy 0.56 -- 55.86 + 57.21 + 495.40 + 4.80 = 613.27:  73%|███████▎  | 1500/2048 [18:40<06:32,  1.40it/s]
loss 1.86 accuracy 0.56 -- 55.86 + 57.21 + 495.40 + 4.80 = 613.27:  73%|███████▎  | 1501/2048 [18:40<06:19,  1.44it/s]
loss 2.50 accuracy 0.25 -- 157.96 + 57.17 + 488.15 + 4.78 = 708.05:  73%|███████▎  | 1501/2048 [18:41<06:19,  1.44it/s]
loss 2.50 accuracy 0.25 -- 157.96 + 57.17 + 488.15 + 4.78 = 708.05:  73%|███████▎  | 1502/2048 [18:41<06:25,  1.42it/s]
loss 1.70 accuracy 0.19 -- 55.68 + 166.04 + 501.59 + 4.78 = 728.09:  73%|███████▎  | 1502/2048 [18:42<06:25,  1.42it/s]
loss 1.70 accuracy 0.19 -- 55.68 + 166.04 + 501.59 + 4.78 = 728.09:  73%|███████▎  | 1503/2048 [18:42<06:32,  1.39it/s]
loss 2.46 accuracy 0.12 -- 56.48 + 56.46 + 496.68 + 4.80 = 614.41:  73%|███████▎  | 1503/2048 [18:42<06:32,  1.39it/s] 
loss 2.46 accuracy 0.12 -- 56.48 + 56.46 + 496.68 + 4.80 = 614.41:  73%|███████▎  | 1504/2048 [18:42<06:18,  1.44it/s]
loss 2.05 accuracy 0.38 -- 163.05 + 56.88 + 496.50 + 4.78 = 721.21:  73%|███████▎  | 1504/2048 [18:43<06:18,  1.44it/s]
loss 2.05 accuracy 0.38 -- 163.05 + 56.88 + 496.50 + 4.78 = 721.21:  73%|███████▎  | 1505/2048 [18:43<06:26,  1.40it/s]
loss 1.77 accuracy 0.31 -- 56.05 + 57.37 + 620.68 + 4.80 = 738.90:  73%|███████▎  | 1505/2048 [18:44<06:26,  1.40it/s] 
loss 1.77 accuracy 0.31 -- 56.05 + 57.37 + 620.68 + 4.80 = 738.90:  74%|███████▎  | 1506/2048 [18:44<06:34,  1.37it/s]
loss 1.57 accuracy 0.38 -- 56.65 + 56.44 + 507.47 + 4.78 = 625.33:  74%|███████▎  | 1506/2048 [18:45<06:34,  1.37it/s]
loss 1.57 accuracy 0.38 -- 56.65 + 56.44 + 507.47 + 4.78 = 625.33:  74%|███████▎  | 1507/2048 [18:45<06:21,  1.42it/s]
loss 1.76 accuracy 0.19 -- 56.18 + 58.43 + 617.34 + 4.80 = 736.75:  74%|███████▎  | 1507/2048 [18:45<06:21,  1.42it/s]
loss 1.76 accuracy 0.19 -- 56.18 + 58.43 + 617.34 + 4.80 = 736.75:  74%|███████▎  | 1508/2048 [18:45<06:30,  1.38it/s]
loss 1.83 accuracy 0.31 -- 56.69 + 56.58 + 502.28 + 4.79 = 620.34:  74%|███████▎  | 1508/2048 [18:46<06:30,  1.38it/s]
loss 1.83 accuracy 0.31 -- 56.69 + 56.58 + 502.28 + 4.79 = 620.34:  74%|███████▎  | 1509/2048 [18:46<06:17,  1.43it/s]
loss 1.23 accuracy 0.56 -- 56.00 + 167.02 + 501.65 + 4.78 = 729.45:  74%|███████▎  | 1509/2048 [18:47<06:17,  1.43it/s]
loss 1.23 accuracy 0.56 -- 56.00 + 167.02 + 501.65 + 4.78 = 729.45:  74%|███████▎  | 1510/2048 [18:47<06:25,  1.39it/s]
loss 1.94 accuracy 0.31 -- 56.10 + 56.53 + 499.61 + 4.79 = 617.03:  74%|███████▎  | 1510/2048 [18:47<06:25,  1.39it/s] 
loss 1.94 accuracy 0.31 -- 56.10 + 56.53 + 499.61 + 4.79 = 617.03:  74%|███████▍  | 1511/2048 [18:47<06:13,  1.44it/s]
loss 1.48 accuracy 0.44 -- 56.84 + 56.27 + 498.74 + 4.78 = 616.63:  74%|███████▍  | 1511/2048 [18:48<06:13,  1.44it/s]
loss 1.48 accuracy 0.44 -- 56.84 + 56.27 + 498.74 + 4.78 = 616.63:  74%|███████▍  | 1512/2048 [18:48<06:21,  1.40it/s]
loss 2.18 accuracy 0.25 -- 56.16 + 57.38 + 496.34 + 4.82 = 614.69:  74%|███████▍  | 1512/2048 [18:49<06:21,  1.40it/s]
loss 2.18 accuracy 0.25 -- 56.16 + 57.38 + 496.34 + 4.82 = 614.69:  74%|███████▍  | 1513/2048 [18:49<06:09,  1.45it/s]
loss 1.80 accuracy 0.38 -- 157.41 + 56.95 + 489.89 + 4.77 = 709.03:  74%|███████▍  | 1513/2048 [18:49<06:09,  1.45it/s]
loss 1.80 accuracy 0.38 -- 157.41 + 56.95 + 489.89 + 4.77 = 709.03:  74%|███████▍  | 1514/2048 [18:49<06:16,  1.42it/s]
loss 1.79 accuracy 0.31 -- 55.96 + 166.42 + 501.70 + 4.79 = 728.88:  74%|███████▍  | 1514/2048 [18:50<06:16,  1.42it/s]
loss 1.79 accuracy 0.31 -- 55.96 + 166.42 + 501.70 + 4.79 = 728.88:  74%|███████▍  | 1515/2048 [18:50<06:23,  1.39it/s]
loss 2.34 accuracy 0.12 -- 56.71 + 56.50 + 496.97 + 4.78 = 614.96:  74%|███████▍  | 1515/2048 [18:51<06:23,  1.39it/s] 
loss 2.34 accuracy 0.12 -- 56.71 + 56.50 + 496.97 + 4.78 = 614.96:  74%|███████▍  | 1516/2048 [18:51<06:10,  1.44it/s]
loss 2.30 accuracy 0.06 -- 162.62 + 56.78 + 496.92 + 4.77 = 721.10:  74%|███████▍  | 1516/2048 [18:52<06:10,  1.44it/s]
loss 2.30 accuracy 0.06 -- 162.62 + 56.78 + 496.92 + 4.77 = 721.10:  74%|███████▍  | 1517/2048 [18:52<06:18,  1.40it/s]
loss 1.86 accuracy 0.44 -- 56.19 + 57.05 + 623.85 + 4.87 = 741.97:  74%|███████▍  | 1517/2048 [18:52<06:18,  1.40it/s] 
loss 1.86 accuracy 0.44 -- 56.19 + 57.05 + 623.85 + 4.87 = 741.97:  74%|███████▍  | 1518/2048 [18:52<06:26,  1.37it/s]
loss 1.93 accuracy 0.31 -- 57.44 + 56.81 + 505.77 + 4.78 = 624.80:  74%|███████▍  | 1518/2048 [18:53<06:26,  1.37it/s]
loss 1.93 accuracy 0.31 -- 57.44 + 56.81 + 505.77 + 4.78 = 624.80:  74%|███████▍  | 1519/2048 [18:53<06:13,  1.42it/s]
loss 1.99 accuracy 0.56 -- 56.20 + 57.24 + 616.11 + 4.77 = 734.32:  74%|███████▍  | 1519/2048 [18:54<06:13,  1.42it/s]
loss 1.99 accuracy 0.56 -- 56.20 + 57.24 + 616.11 + 4.77 = 734.32:  74%|███████▍  | 1520/2048 [18:54<06:21,  1.38it/s]
loss 1.64 accuracy 0.31 -- 56.94 + 56.76 + 504.02 + 4.78 = 622.50:  74%|███████▍  | 1520/2048 [18:54<06:21,  1.38it/s]
loss 1.64 accuracy 0.31 -- 56.94 + 56.76 + 504.02 + 4.78 = 622.50:  74%|███████▍  | 1521/2048 [18:54<06:09,  1.43it/s]
loss 1.76 accuracy 0.31 -- 56.47 + 166.21 + 503.64 + 4.78 = 731.10:  74%|███████▍  | 1521/2048 [18:55<06:09,  1.43it/s]
loss 1.76 accuracy 0.31 -- 56.47 + 166.21 + 503.64 + 4.78 = 731.10:  74%|███████▍  | 1522/2048 [18:55<06:17,  1.39it/s]
loss 1.49 accuracy 0.56 -- 56.33 + 56.34 + 498.96 + 4.77 = 616.39:  74%|███████▍  | 1522/2048 [18:56<06:17,  1.39it/s] 
loss 1.49 accuracy 0.56 -- 56.33 + 56.34 + 498.96 + 4.77 = 616.39:  74%|███████▍  | 1523/2048 [18:56<06:05,  1.44it/s]
loss 2.63 accuracy 0.19 -- 56.68 + 56.55 + 499.75 + 4.78 = 617.76:  74%|███████▍  | 1523/2048 [18:57<06:05,  1.44it/s]
loss 2.63 accuracy 0.19 -- 56.68 + 56.55 + 499.75 + 4.78 = 617.76:  74%|███████▍  | 1524/2048 [18:57<06:13,  1.40it/s]
loss 1.77 accuracy 0.25 -- 55.80 + 57.22 + 494.14 + 4.78 = 611.94:  74%|███████▍  | 1524/2048 [18:57<06:13,  1.40it/s]
loss 1.77 accuracy 0.25 -- 55.80 + 57.22 + 494.14 + 4.78 = 611.94:  74%|███████▍  | 1525/2048 [18:57<06:01,  1.45it/s]
loss 1.64 accuracy 0.38 -- 157.43 + 56.73 + 488.55 + 4.78 = 707.49:  74%|███████▍  | 1525/2048 [18:58<06:01,  1.45it/s]
loss 1.64 accuracy 0.38 -- 157.43 + 56.73 + 488.55 + 4.78 = 707.49:  75%|███████▍  | 1526/2048 [18:58<06:07,  1.42it/s]
loss 2.20 accuracy 0.19 -- 55.84 + 166.38 + 501.13 + 4.77 = 728.12:  75%|███████▍  | 1526/2048 [18:59<06:07,  1.42it/s]
loss 2.20 accuracy 0.19 -- 55.84 + 166.38 + 501.13 + 4.77 = 728.12:  75%|███████▍  | 1527/2048 [18:59<06:14,  1.39it/s]
loss 1.83 accuracy 0.31 -- 56.61 + 56.41 + 498.22 + 4.79 = 616.04:  75%|███████▍  | 1527/2048 [18:59<06:14,  1.39it/s] 
loss 1.83 accuracy 0.31 -- 56.61 + 56.41 + 498.22 + 4.79 = 616.04:  75%|███████▍  | 1528/2048 [18:59<06:02,  1.44it/s]
loss 1.47 accuracy 0.50 -- 163.17 + 57.04 + 496.56 + 4.76 = 721.53:  75%|███████▍  | 1528/2048 [19:00<06:02,  1.44it/s]
loss 1.47 accuracy 0.50 -- 163.17 + 57.04 + 496.56 + 4.76 = 721.53:  75%|███████▍  | 1529/2048 [19:00<06:09,  1.40it/s]
loss 1.56 accuracy 0.50 -- 56.48 + 57.57 + 621.19 + 4.79 = 740.03:  75%|███████▍  | 1529/2048 [19:01<06:09,  1.40it/s] 
loss 1.56 accuracy 0.50 -- 56.48 + 57.57 + 621.19 + 4.79 = 740.03:  75%|███████▍  | 1530/2048 [19:01<06:17,  1.37it/s]
loss 1.33 accuracy 0.50 -- 56.80 + 56.36 + 504.84 + 4.80 = 622.80:  75%|███████▍  | 1530/2048 [19:02<06:17,  1.37it/s]
loss 1.33 accuracy 0.50 -- 56.80 + 56.36 + 504.84 + 4.80 = 622.80:  75%|███████▍  | 1531/2048 [19:02<06:04,  1.42it/s]
loss 1.89 accuracy 0.25 -- 56.44 + 57.55 + 618.01 + 4.81 = 736.81:  75%|███████▍  | 1531/2048 [19:02<06:04,  1.42it/s]
loss 1.89 accuracy 0.25 -- 56.44 + 57.55 + 618.01 + 4.81 = 736.81:  75%|███████▍  | 1532/2048 [19:02<06:12,  1.38it/s]
loss 1.53 accuracy 0.62 -- 57.02 + 56.71 + 504.95 + 4.86 = 623.54:  75%|███████▍  | 1532/2048 [19:03<06:12,  1.38it/s]
loss 1.53 accuracy 0.62 -- 57.02 + 56.71 + 504.95 + 4.86 = 623.54:  75%|███████▍  | 1533/2048 [19:03<06:00,  1.43it/s]
loss 2.03 accuracy 0.44 -- 56.48 + 166.88 + 501.89 + 4.79 = 730.04:  75%|███████▍  | 1533/2048 [19:04<06:00,  1.43it/s]
loss 2.03 accuracy 0.44 -- 56.48 + 166.88 + 501.89 + 4.79 = 730.04:  75%|███████▍  | 1534/2048 [19:04<06:08,  1.39it/s]
loss 2.01 accuracy 0.25 -- 56.15 + 56.41 + 503.02 + 4.77 = 620.34:  75%|███████▍  | 1534/2048 [19:04<06:08,  1.39it/s] 
loss 2.01 accuracy 0.25 -- 56.15 + 56.41 + 503.02 + 4.77 = 620.34:  75%|███████▍  | 1535/2048 [19:04<05:57,  1.44it/s]
loss 1.89 accuracy 0.31 -- 56.85 + 56.46 + 498.40 + 4.78 = 616.49:  75%|███████▍  | 1535/2048 [19:05<05:57,  1.44it/s]
loss 1.89 accuracy 0.31 -- 56.85 + 56.46 + 498.40 + 4.78 = 616.49:  75%|███████▌  | 1536/2048 [19:05<06:05,  1.40it/s]
loss 1.68 accuracy 0.25 -- 56.04 + 57.27 + 500.10 + 4.78 = 618.19:  75%|███████▌  | 1536/2048 [19:06<06:05,  1.40it/s]
loss 1.68 accuracy 0.25 -- 56.04 + 57.27 + 500.10 + 4.78 = 618.19:  75%|███████▌  | 1537/2048 [19:06<05:54,  1.44it/s]
loss 1.78 accuracy 0.44 -- 157.58 + 56.97 + 491.09 + 4.77 = 710.41:  75%|███████▌  | 1537/2048 [19:06<05:54,  1.44it/s]
loss 1.78 accuracy 0.44 -- 157.58 + 56.97 + 491.09 + 4.77 = 710.41:  75%|███████▌  | 1538/2048 [19:06<06:00,  1.42it/s]
loss 2.10 accuracy 0.38 -- 55.86 + 166.23 + 501.18 + 4.77 = 728.04:  75%|███████▌  | 1538/2048 [19:07<06:00,  1.42it/s]
loss 2.10 accuracy 0.38 -- 55.86 + 166.23 + 501.18 + 4.77 = 728.04:  75%|███████▌  | 1539/2048 [19:07<06:07,  1.39it/s]
loss 2.30 accuracy 0.19 -- 56.46 + 56.48 + 496.79 + 4.79 = 614.53:  75%|███████▌  | 1539/2048 [19:08<06:07,  1.39it/s] 
loss 2.30 accuracy 0.19 -- 56.46 + 56.48 + 496.79 + 4.79 = 614.53:  75%|███████▌  | 1540/2048 [19:08<05:54,  1.43it/s]
loss 1.92 accuracy 0.31 -- 162.12 + 56.94 + 497.96 + 4.77 = 721.80:  75%|███████▌  | 1540/2048 [19:09<05:54,  1.43it/s]
loss 1.92 accuracy 0.31 -- 162.12 + 56.94 + 497.96 + 4.77 = 721.80:  75%|███████▌  | 1541/2048 [19:09<06:01,  1.40it/s]
loss 2.12 accuracy 0.19 -- 56.16 + 57.28 + 619.77 + 4.78 = 737.99:  75%|███████▌  | 1541/2048 [19:09<06:01,  1.40it/s] 
loss 2.12 accuracy 0.19 -- 56.16 + 57.28 + 619.77 + 4.78 = 737.99:  75%|███████▌  | 1542/2048 [19:09<06:08,  1.37it/s]
loss 1.83 accuracy 0.44 -- 56.64 + 56.41 + 504.67 + 4.82 = 622.54:  75%|███████▌  | 1542/2048 [19:10<06:08,  1.37it/s]
loss 1.83 accuracy 0.44 -- 56.64 + 56.41 + 504.67 + 4.82 = 622.54:  75%|███████▌  | 1543/2048 [19:10<05:55,  1.42it/s]
loss 1.48 accuracy 0.44 -- 56.28 + 57.50 + 616.67 + 4.78 = 735.23:  75%|███████▌  | 1543/2048 [19:11<05:55,  1.42it/s]
loss 1.48 accuracy 0.44 -- 56.28 + 57.50 + 616.67 + 4.78 = 735.23:  75%|███████▌  | 1544/2048 [19:11<06:03,  1.39it/s]
loss 1.71 accuracy 0.50 -- 56.56 + 56.49 + 501.78 + 4.76 = 619.58:  75%|███████▌  | 1544/2048 [19:11<06:03,  1.39it/s]
loss 1.71 accuracy 0.50 -- 56.56 + 56.49 + 501.78 + 4.76 = 619.58:  75%|███████▌  | 1545/2048 [19:11<05:51,  1.43it/s]
loss 1.72 accuracy 0.56 -- 56.22 + 166.31 + 502.26 + 4.80 = 729.58:  75%|███████▌  | 1545/2048 [19:12<05:51,  1.43it/s]
loss 1.72 accuracy 0.56 -- 56.22 + 166.31 + 502.26 + 4.80 = 729.58:  75%|███████▌  | 1546/2048 [19:12<05:59,  1.40it/s]
loss 1.47 accuracy 0.56 -- 56.40 + 56.79 + 500.48 + 4.77 = 618.43:  75%|███████▌  | 1546/2048 [19:13<05:59,  1.40it/s] 
loss 1.47 accuracy 0.56 -- 56.40 + 56.79 + 500.48 + 4.77 = 618.43:  76%|███████▌  | 1547/2048 [19:13<05:48,  1.44it/s]
loss 1.88 accuracy 0.19 -- 56.73 + 56.39 + 498.32 + 4.78 = 616.22:  76%|███████▌  | 1547/2048 [19:14<05:48,  1.44it/s]
loss 1.88 accuracy 0.19 -- 56.73 + 56.39 + 498.32 + 4.78 = 616.22:  76%|███████▌  | 1548/2048 [19:14<05:55,  1.41it/s]
loss 1.80 accuracy 0.38 -- 56.14 + 57.17 + 496.50 + 4.78 = 614.59:  76%|███████▌  | 1548/2048 [19:14<05:55,  1.41it/s]
loss 1.80 accuracy 0.38 -- 56.14 + 57.17 + 496.50 + 4.78 = 614.59:  76%|███████▌  | 1549/2048 [19:14<05:44,  1.45it/s]
loss 1.86 accuracy 0.38 -- 157.46 + 56.99 + 489.75 + 4.77 = 708.97:  76%|███████▌  | 1549/2048 [19:15<05:44,  1.45it/s]
loss 1.86 accuracy 0.38 -- 157.46 + 56.99 + 489.75 + 4.77 = 708.97:  76%|███████▌  | 1550/2048 [19:15<05:50,  1.42it/s]
loss 1.71 accuracy 0.44 -- 55.82 + 166.37 + 502.64 + 4.77 = 729.60:  76%|███████▌  | 1550/2048 [19:16<05:50,  1.42it/s]
loss 1.71 accuracy 0.44 -- 55.82 + 166.37 + 502.64 + 4.77 = 729.60:  76%|███████▌  | 1551/2048 [19:16<05:57,  1.39it/s]
loss 1.63 accuracy 0.25 -- 56.68 + 56.26 + 499.53 + 4.78 = 617.24:  76%|███████▌  | 1551/2048 [19:16<05:57,  1.39it/s] 
loss 1.63 accuracy 0.25 -- 56.68 + 56.26 + 499.53 + 4.78 = 617.24:  76%|███████▌  | 1552/2048 [19:16<05:45,  1.43it/s]
loss 1.65 accuracy 0.31 -- 162.87 + 56.93 + 498.05 + 4.82 = 722.67:  76%|███████▌  | 1552/2048 [19:17<05:45,  1.43it/s]
loss 1.65 accuracy 0.31 -- 162.87 + 56.93 + 498.05 + 4.82 = 722.67:  76%|███████▌  | 1553/2048 [19:17<05:53,  1.40it/s]
loss 1.90 accuracy 0.25 -- 56.16 + 57.48 + 621.48 + 4.81 = 739.93:  76%|███████▌  | 1553/2048 [19:18<05:53,  1.40it/s] 
loss 1.90 accuracy 0.25 -- 56.16 + 57.48 + 621.48 + 4.81 = 739.93:  76%|███████▌  | 1554/2048 [19:18<06:00,  1.37it/s]
loss 1.92 accuracy 0.19 -- 56.79 + 56.95 + 506.86 + 4.80 = 625.40:  76%|███████▌  | 1554/2048 [19:19<06:00,  1.37it/s]
loss 1.92 accuracy 0.19 -- 56.79 + 56.95 + 506.86 + 4.80 = 625.40:  76%|███████▌  | 1555/2048 [19:19<05:48,  1.42it/s]
loss 2.55 accuracy 0.38 -- 56.38 + 57.50 + 616.09 + 4.79 = 734.75:  76%|███████▌  | 1555/2048 [19:19<05:48,  1.42it/s]
loss 2.55 accuracy 0.38 -- 56.38 + 57.50 + 616.09 + 4.79 = 734.75:  76%|███████▌  | 1556/2048 [19:19<05:55,  1.38it/s]
loss 1.91 accuracy 0.38 -- 56.68 + 56.57 + 501.45 + 4.78 = 619.48:  76%|███████▌  | 1556/2048 [19:20<05:55,  1.38it/s]
loss 1.91 accuracy 0.38 -- 56.68 + 56.57 + 501.45 + 4.78 = 619.48:  76%|███████▌  | 1557/2048 [19:20<05:43,  1.43it/s]
loss 1.79 accuracy 0.38 -- 55.98 + 166.51 + 502.93 + 4.78 = 730.19:  76%|███████▌  | 1557/2048 [19:21<05:43,  1.43it/s]
loss 1.79 accuracy 0.38 -- 55.98 + 166.51 + 502.93 + 4.78 = 730.19:  76%|███████▌  | 1558/2048 [19:21<05:51,  1.39it/s]
loss 1.70 accuracy 0.31 -- 56.14 + 56.16 + 498.61 + 4.77 = 615.67:  76%|███████▌  | 1558/2048 [19:21<05:51,  1.39it/s] 
loss 1.70 accuracy 0.31 -- 56.14 + 56.16 + 498.61 + 4.77 = 615.67:  76%|███████▌  | 1559/2048 [19:21<05:39,  1.44it/s]
loss 1.58 accuracy 0.25 -- 56.53 + 56.36 + 497.77 + 4.79 = 615.45:  76%|███████▌  | 1559/2048 [19:22<05:39,  1.44it/s]
loss 1.58 accuracy 0.25 -- 56.53 + 56.36 + 497.77 + 4.79 = 615.45:  76%|███████▌  | 1560/2048 [19:22<05:46,  1.41it/s]
loss 2.09 accuracy 0.25 -- 56.15 + 57.06 + 496.49 + 4.78 = 614.48:  76%|███████▌  | 1560/2048 [19:23<05:46,  1.41it/s]
loss 2.09 accuracy 0.25 -- 56.15 + 57.06 + 496.49 + 4.78 = 614.48:  76%|███████▌  | 1561/2048 [19:23<05:36,  1.45it/s]
loss 1.81 accuracy 0.31 -- 157.15 + 56.88 + 488.44 + 4.80 = 707.27:  76%|███████▌  | 1561/2048 [19:23<05:36,  1.45it/s]
loss 1.81 accuracy 0.31 -- 157.15 + 56.88 + 488.44 + 4.80 = 707.27:  76%|███████▋  | 1562/2048 [19:23<05:41,  1.42it/s]
loss 1.62 accuracy 0.50 -- 56.04 + 166.19 + 502.49 + 4.79 = 729.51:  76%|███████▋  | 1562/2048 [19:24<05:41,  1.42it/s]
loss 1.62 accuracy 0.50 -- 56.04 + 166.19 + 502.49 + 4.79 = 729.51:  76%|███████▋  | 1563/2048 [19:24<05:48,  1.39it/s]
loss 1.56 accuracy 0.38 -- 56.82 + 56.92 + 499.35 + 4.78 = 617.88:  76%|███████▋  | 1563/2048 [19:25<05:48,  1.39it/s] 
loss 1.56 accuracy 0.38 -- 56.82 + 56.92 + 499.35 + 4.78 = 617.88:  76%|███████▋  | 1564/2048 [19:25<05:37,  1.43it/s]
loss 2.05 accuracy 0.31 -- 162.41 + 57.45 + 496.37 + 4.92 = 721.16:  76%|███████▋  | 1564/2048 [19:26<05:37,  1.43it/s]
loss 2.05 accuracy 0.31 -- 162.41 + 57.45 + 496.37 + 4.92 = 721.16:  76%|███████▋  | 1565/2048 [19:26<05:44,  1.40it/s]
loss 1.73 accuracy 0.50 -- 56.31 + 57.09 + 621.36 + 4.78 = 739.54:  76%|███████▋  | 1565/2048 [19:26<05:44,  1.40it/s] 
loss 1.73 accuracy 0.50 -- 56.31 + 57.09 + 621.36 + 4.78 = 739.54:  76%|███████▋  | 1566/2048 [19:26<05:51,  1.37it/s]
loss 2.24 accuracy 0.19 -- 56.68 + 56.47 + 504.54 + 4.77 = 622.47:  76%|███████▋  | 1566/2048 [19:27<05:51,  1.37it/s]
loss 2.24 accuracy 0.19 -- 56.68 + 56.47 + 504.54 + 4.77 = 622.47:  77%|███████▋  | 1567/2048 [19:27<05:39,  1.42it/s]
loss 1.42 accuracy 0.44 -- 55.97 + 57.23 + 615.03 + 4.78 = 733.01:  77%|███████▋  | 1567/2048 [19:28<05:39,  1.42it/s]
loss 1.42 accuracy 0.44 -- 55.97 + 57.23 + 615.03 + 4.78 = 733.01:  77%|███████▋  | 1568/2048 [19:28<05:46,  1.39it/s]
loss 1.92 accuracy 0.44 -- 56.47 + 56.36 + 502.83 + 4.77 = 620.43:  77%|███████▋  | 1568/2048 [19:28<05:46,  1.39it/s]
loss 1.92 accuracy 0.44 -- 56.47 + 56.36 + 502.83 + 4.77 = 620.43:  77%|███████▋  | 1569/2048 [19:28<05:34,  1.43it/s]
loss 2.27 accuracy 0.38 -- 56.19 + 165.75 + 502.30 + 4.78 = 729.02:  77%|███████▋  | 1569/2048 [19:29<05:34,  1.43it/s]
loss 2.27 accuracy 0.38 -- 56.19 + 165.75 + 502.30 + 4.78 = 729.02:  77%|███████▋  | 1570/2048 [19:29<05:42,  1.40it/s]
loss 1.91 accuracy 0.12 -- 56.32 + 56.45 + 499.35 + 4.76 = 616.89:  77%|███████▋  | 1570/2048 [19:30<05:42,  1.40it/s] 
loss 1.91 accuracy 0.12 -- 56.32 + 56.45 + 499.35 + 4.76 = 616.89:  77%|███████▋  | 1571/2048 [19:30<05:31,  1.44it/s]
loss 2.22 accuracy 0.12 -- 56.44 + 56.41 + 497.82 + 4.76 = 615.44:  77%|███████▋  | 1571/2048 [19:31<05:31,  1.44it/s]
loss 2.22 accuracy 0.12 -- 56.44 + 56.41 + 497.82 + 4.76 = 615.44:  77%|███████▋  | 1572/2048 [19:31<05:38,  1.41it/s]
loss 1.99 accuracy 0.25 -- 55.89 + 57.46 + 495.08 + 4.77 = 613.21:  77%|███████▋  | 1572/2048 [19:31<05:38,  1.41it/s]
loss 1.99 accuracy 0.25 -- 55.89 + 57.46 + 495.08 + 4.77 = 613.21:  77%|███████▋  | 1573/2048 [19:31<05:27,  1.45it/s]
loss 1.57 accuracy 0.25 -- 156.76 + 57.11 + 488.99 + 4.76 = 707.62:  77%|███████▋  | 1573/2048 [19:32<05:27,  1.45it/s]
loss 1.57 accuracy 0.25 -- 156.76 + 57.11 + 488.99 + 4.76 = 707.62:  77%|███████▋  | 1574/2048 [19:32<05:33,  1.42it/s]
loss 1.76 accuracy 0.38 -- 55.63 + 166.53 + 502.25 + 4.79 = 729.20:  77%|███████▋  | 1574/2048 [19:33<05:33,  1.42it/s]
loss 1.76 accuracy 0.38 -- 55.63 + 166.53 + 502.25 + 4.79 = 729.20:  77%|███████▋  | 1575/2048 [19:33<05:40,  1.39it/s]
loss 1.70 accuracy 0.25 -- 56.45 + 56.81 + 496.75 + 4.76 = 614.77:  77%|███████▋  | 1575/2048 [19:33<05:40,  1.39it/s] 
loss 1.70 accuracy 0.25 -- 56.45 + 56.81 + 496.75 + 4.76 = 614.77:  77%|███████▋  | 1576/2048 [19:33<05:28,  1.44it/s]
loss 1.93 accuracy 0.19 -- 162.46 + 56.71 + 496.66 + 4.77 = 720.60:  77%|███████▋  | 1576/2048 [19:34<05:28,  1.44it/s]
loss 1.93 accuracy 0.19 -- 162.46 + 56.71 + 496.66 + 4.77 = 720.60:  77%|███████▋  | 1577/2048 [19:34<05:35,  1.41it/s]
loss 1.53 accuracy 0.56 -- 56.05 + 57.30 + 620.57 + 4.78 = 738.70:  77%|███████▋  | 1577/2048 [19:35<05:35,  1.41it/s] 
loss 1.53 accuracy 0.56 -- 56.05 + 57.30 + 620.57 + 4.78 = 738.70:  77%|███████▋  | 1578/2048 [19:35<05:42,  1.37it/s]
loss 1.83 accuracy 0.38 -- 56.62 + 56.51 + 504.72 + 4.78 = 622.63:  77%|███████▋  | 1578/2048 [19:36<05:42,  1.37it/s]
loss 1.83 accuracy 0.38 -- 56.62 + 56.51 + 504.72 + 4.78 = 622.63:  77%|███████▋  | 1579/2048 [19:36<05:30,  1.42it/s]
loss 1.68 accuracy 0.25 -- 55.97 + 57.18 + 616.35 + 4.79 = 734.29:  77%|███████▋  | 1579/2048 [19:36<05:30,  1.42it/s]
loss 1.68 accuracy 0.25 -- 55.97 + 57.18 + 616.35 + 4.79 = 734.29:  77%|███████▋  | 1580/2048 [19:36<05:37,  1.39it/s]
loss 1.60 accuracy 0.31 -- 56.91 + 56.93 + 502.29 + 4.79 = 620.92:  77%|███████▋  | 1580/2048 [19:37<05:37,  1.39it/s]
loss 1.60 accuracy 0.31 -- 56.91 + 56.93 + 502.29 + 4.79 = 620.92:  77%|███████▋  | 1581/2048 [19:37<05:26,  1.43it/s]
loss 1.93 accuracy 0.44 -- 56.36 + 166.22 + 502.40 + 4.78 = 729.75:  77%|███████▋  | 1581/2048 [19:38<05:26,  1.43it/s]
loss 1.93 accuracy 0.44 -- 56.36 + 166.22 + 502.40 + 4.78 = 729.75:  77%|███████▋  | 1582/2048 [19:38<05:33,  1.40it/s]
loss 1.85 accuracy 0.19 -- 56.15 + 56.46 + 499.43 + 4.80 = 616.84:  77%|███████▋  | 1582/2048 [19:38<05:33,  1.40it/s] 
loss 1.85 accuracy 0.19 -- 56.15 + 56.46 + 499.43 + 4.80 = 616.84:  77%|███████▋  | 1583/2048 [19:38<05:23,  1.44it/s]
loss 2.07 accuracy 0.31 -- 56.63 + 56.86 + 500.14 + 4.79 = 618.43:  77%|███████▋  | 1583/2048 [19:39<05:23,  1.44it/s]
loss 2.07 accuracy 0.31 -- 56.63 + 56.86 + 500.14 + 4.79 = 618.43:  77%|███████▋  | 1584/2048 [19:39<05:30,  1.40it/s]
loss 1.45 accuracy 0.50 -- 55.92 + 57.08 + 496.01 + 4.80 = 613.81:  77%|███████▋  | 1584/2048 [19:40<05:30,  1.40it/s]
loss 1.45 accuracy 0.50 -- 55.92 + 57.08 + 496.01 + 4.80 = 613.81:  77%|███████▋  | 1585/2048 [19:40<05:24,  1.43it/s]
loss 1.70 accuracy 0.44 -- 157.17 + 56.87 + 490.97 + 4.75 = 709.77:  77%|███████▋  | 1585/2048 [19:41<05:24,  1.43it/s]
loss 1.70 accuracy 0.44 -- 157.17 + 56.87 + 490.97 + 4.75 = 709.77:  77%|███████▋  | 1586/2048 [19:41<05:28,  1.40it/s]
loss 2.05 accuracy 0.31 -- 56.06 + 166.51 + 501.51 + 4.78 = 728.85:  77%|███████▋  | 1586/2048 [19:41<05:28,  1.40it/s]
loss 2.05 accuracy 0.31 -- 56.06 + 166.51 + 501.51 + 4.78 = 728.85:  77%|███████▋  | 1587/2048 [19:41<05:34,  1.38it/s]
loss 1.58 accuracy 0.50 -- 56.60 + 56.37 + 499.02 + 4.78 = 616.77:  77%|███████▋  | 1587/2048 [19:42<05:34,  1.38it/s] 
loss 1.58 accuracy 0.50 -- 56.60 + 56.37 + 499.02 + 4.78 = 616.77:  78%|███████▊  | 1588/2048 [19:42<05:22,  1.43it/s]
loss 1.90 accuracy 0.12 -- 162.81 + 57.03 + 497.35 + 4.80 = 721.99:  78%|███████▊  | 1588/2048 [19:43<05:22,  1.43it/s]
loss 1.90 accuracy 0.12 -- 162.81 + 57.03 + 497.35 + 4.80 = 721.99:  78%|███████▊  | 1589/2048 [19:43<05:28,  1.40it/s]
loss 1.62 accuracy 0.31 -- 56.22 + 57.34 + 619.69 + 4.77 = 738.01:  78%|███████▊  | 1589/2048 [19:43<05:28,  1.40it/s] 
loss 1.62 accuracy 0.31 -- 56.22 + 57.34 + 619.69 + 4.77 = 738.01:  78%|███████▊  | 1590/2048 [19:43<05:34,  1.37it/s]
loss 1.94 accuracy 0.25 -- 56.70 + 56.68 + 505.75 + 4.78 = 623.90:  78%|███████▊  | 1590/2048 [19:44<05:34,  1.37it/s]
loss 1.94 accuracy 0.25 -- 56.70 + 56.68 + 505.75 + 4.78 = 623.90:  78%|███████▊  | 1591/2048 [19:44<05:22,  1.42it/s]
loss 1.60 accuracy 0.38 -- 56.47 + 57.40 + 615.43 + 4.84 = 734.14:  78%|███████▊  | 1591/2048 [19:45<05:22,  1.42it/s]
loss 1.60 accuracy 0.38 -- 56.47 + 57.40 + 615.43 + 4.84 = 734.14:  78%|███████▊  | 1592/2048 [19:45<05:29,  1.38it/s]
loss 1.79 accuracy 0.44 -- 56.66 + 56.79 + 503.91 + 4.78 = 622.14:  78%|███████▊  | 1592/2048 [19:46<05:29,  1.38it/s]
loss 1.79 accuracy 0.44 -- 56.66 + 56.79 + 503.91 + 4.78 = 622.14:  78%|███████▊  | 1593/2048 [19:46<05:23,  1.41it/s]
loss 2.14 accuracy 0.12 -- 56.48 + 166.17 + 501.80 + 4.78 = 729.24:  78%|███████▊  | 1593/2048 [19:46<05:23,  1.41it/s]
loss 2.14 accuracy 0.12 -- 56.48 + 166.17 + 501.80 + 4.78 = 729.24:  78%|███████▊  | 1594/2048 [19:46<05:29,  1.38it/s]
loss 2.12 accuracy 0.25 -- 56.10 + 56.34 + 498.63 + 4.77 = 615.85:  78%|███████▊  | 1594/2048 [19:47<05:29,  1.38it/s] 
loss 2.12 accuracy 0.25 -- 56.10 + 56.34 + 498.63 + 4.77 = 615.85:  78%|███████▊  | 1595/2048 [19:47<05:17,  1.43it/s]
loss 1.78 accuracy 0.31 -- 56.65 + 56.31 + 499.04 + 4.79 = 616.79:  78%|███████▊  | 1595/2048 [19:48<05:17,  1.43it/s]
loss 1.78 accuracy 0.31 -- 56.65 + 56.31 + 499.04 + 4.79 = 616.79:  78%|███████▊  | 1596/2048 [19:48<05:23,  1.40it/s]
loss 1.98 accuracy 0.38 -- 56.04 + 57.62 + 496.33 + 4.78 = 614.78:  78%|███████▊  | 1596/2048 [19:48<05:23,  1.40it/s]
loss 1.98 accuracy 0.38 -- 56.04 + 57.62 + 496.33 + 4.78 = 614.78:  78%|███████▊  | 1597/2048 [19:48<05:12,  1.44it/s]
loss 2.13 accuracy 0.31 -- 157.08 + 56.91 + 488.65 + 4.77 = 707.40:  78%|███████▊  | 1597/2048 [19:49<05:12,  1.44it/s]
loss 2.13 accuracy 0.31 -- 157.08 + 56.91 + 488.65 + 4.77 = 707.40:  78%|███████▊  | 1598/2048 [19:49<05:17,  1.42it/s]
loss 1.96 accuracy 0.25 -- 55.78 + 166.65 + 501.76 + 4.79 = 728.98:  78%|███████▊  | 1598/2048 [19:50<05:17,  1.42it/s]
loss 1.96 accuracy 0.25 -- 55.78 + 166.65 + 501.76 + 4.79 = 728.98:  78%|███████▊  | 1599/2048 [19:50<05:23,  1.39it/s]
loss 1.54 accuracy 0.44 -- 56.82 + 56.90 + 497.97 + 4.79 = 616.47:  78%|███████▊  | 1599/2048 [19:50<05:23,  1.39it/s] 
loss 1.54 accuracy 0.44 -- 56.82 + 56.90 + 497.97 + 4.79 = 616.47:  78%|███████▊  | 1600/2048 [19:50<05:12,  1.43it/s]
loss 1.84 accuracy 0.31 -- 162.13 + 57.11 + 496.49 + 4.77 = 720.49:  78%|███████▊  | 1600/2048 [19:51<05:12,  1.43it/s]
loss 1.84 accuracy 0.31 -- 162.13 + 57.11 + 496.49 + 4.77 = 720.49:  78%|███████▊  | 1601/2048 [19:51<05:23,  1.38it/s]
loss 1.86 accuracy 0.25 -- 55.99 + 57.60 + 621.08 + 4.77 = 739.44:  78%|███████▊  | 1601/2048 [19:52<05:23,  1.38it/s] 
loss 1.86 accuracy 0.25 -- 55.99 + 57.60 + 621.08 + 4.77 = 739.44:  78%|███████▊  | 1602/2048 [19:52<05:28,  1.36it/s]
loss 2.26 accuracy 0.38 -- 56.68 + 56.69 + 506.93 + 4.80 = 625.09:  78%|███████▊  | 1602/2048 [19:53<05:28,  1.36it/s]
loss 2.26 accuracy 0.38 -- 56.68 + 56.69 + 506.93 + 4.80 = 625.09:  78%|███████▊  | 1603/2048 [19:53<05:16,  1.41it/s]
loss 1.28 accuracy 0.69 -- 56.03 + 57.42 + 615.63 + 4.77 = 733.86:  78%|███████▊  | 1603/2048 [19:53<05:16,  1.41it/s]
loss 1.28 accuracy 0.69 -- 56.03 + 57.42 + 615.63 + 4.77 = 733.86:  78%|███████▊  | 1604/2048 [19:53<05:22,  1.38it/s]
loss 1.95 accuracy 0.25 -- 56.74 + 56.68 + 502.69 + 4.75 = 620.87:  78%|███████▊  | 1604/2048 [19:54<05:22,  1.38it/s]
loss 1.95 accuracy 0.25 -- 56.74 + 56.68 + 502.69 + 4.75 = 620.87:  78%|███████▊  | 1605/2048 [19:54<05:11,  1.42it/s]
loss 1.99 accuracy 0.19 -- 56.44 + 165.92 + 500.30 + 4.78 = 727.43:  78%|███████▊  | 1605/2048 [19:55<05:11,  1.42it/s]
loss 1.99 accuracy 0.19 -- 56.44 + 165.92 + 500.30 + 4.78 = 727.43:  78%|███████▊  | 1606/2048 [19:55<05:17,  1.39it/s]
loss 1.80 accuracy 0.38 -- 56.00 + 56.35 + 498.68 + 4.78 = 615.81:  78%|███████▊  | 1606/2048 [19:55<05:17,  1.39it/s] 
loss 1.80 accuracy 0.38 -- 56.00 + 56.35 + 498.68 + 4.78 = 615.81:  78%|███████▊  | 1607/2048 [19:55<05:06,  1.44it/s]
loss 1.72 accuracy 0.56 -- 56.50 + 56.47 + 498.02 + 4.78 = 615.77:  78%|███████▊  | 1607/2048 [19:56<05:06,  1.44it/s]
loss 1.72 accuracy 0.56 -- 56.50 + 56.47 + 498.02 + 4.78 = 615.77:  79%|███████▊  | 1608/2048 [19:56<05:17,  1.38it/s]
loss 1.49 accuracy 0.56 -- 56.30 + 57.81 + 495.62 + 4.77 = 614.51:  79%|███████▊  | 1608/2048 [19:57<05:17,  1.38it/s]
loss 1.49 accuracy 0.56 -- 56.30 + 57.81 + 495.62 + 4.77 = 614.51:  79%|███████▊  | 1609/2048 [19:57<05:06,  1.43it/s]
loss 1.75 accuracy 0.25 -- 157.02 + 56.76 + 488.55 + 4.81 = 707.15:  79%|███████▊  | 1609/2048 [19:58<05:06,  1.43it/s]
loss 1.75 accuracy 0.25 -- 157.02 + 56.76 + 488.55 + 4.81 = 707.15:  79%|███████▊  | 1610/2048 [19:58<05:10,  1.41it/s]
loss 1.58 accuracy 0.38 -- 55.73 + 166.16 + 501.80 + 4.77 = 728.45:  79%|███████▊  | 1610/2048 [19:58<05:10,  1.41it/s]
loss 1.58 accuracy 0.38 -- 55.73 + 166.16 + 501.80 + 4.77 = 728.45:  79%|███████▊  | 1611/2048 [19:58<05:15,  1.38it/s]
loss 1.55 accuracy 0.25 -- 56.68 + 56.34 + 498.76 + 4.80 = 616.59:  79%|███████▊  | 1611/2048 [19:59<05:15,  1.38it/s] 
loss 1.55 accuracy 0.25 -- 56.68 + 56.34 + 498.76 + 4.80 = 616.59:  79%|███████▊  | 1612/2048 [19:59<05:04,  1.43it/s]
loss 1.81 accuracy 0.25 -- 162.06 + 57.42 + 496.43 + 4.77 = 720.68:  79%|███████▊  | 1612/2048 [20:00<05:04,  1.43it/s]
loss 1.81 accuracy 0.25 -- 162.06 + 57.42 + 496.43 + 4.77 = 720.68:  79%|███████▉  | 1613/2048 [20:00<05:10,  1.40it/s]
loss 2.45 accuracy 0.19 -- 56.18 + 57.45 + 622.21 + 4.78 = 740.62:  79%|███████▉  | 1613/2048 [20:01<05:10,  1.40it/s] 
loss 2.45 accuracy 0.19 -- 56.18 + 57.45 + 622.21 + 4.78 = 740.62:  79%|███████▉  | 1614/2048 [20:01<05:16,  1.37it/s]
loss 1.71 accuracy 0.25 -- 56.90 + 56.58 + 506.51 + 4.79 = 624.76:  79%|███████▉  | 1614/2048 [20:01<05:16,  1.37it/s]
loss 1.71 accuracy 0.25 -- 56.90 + 56.58 + 506.51 + 4.79 = 624.76:  79%|███████▉  | 1615/2048 [20:01<05:10,  1.39it/s]
loss 1.60 accuracy 0.50 -- 56.02 + 57.52 + 616.04 + 4.78 = 734.36:  79%|███████▉  | 1615/2048 [20:02<05:10,  1.39it/s]
loss 1.60 accuracy 0.50 -- 56.02 + 57.52 + 616.04 + 4.78 = 734.36:  79%|███████▉  | 1616/2048 [20:02<05:15,  1.37it/s]
loss 1.96 accuracy 0.25 -- 56.51 + 56.54 + 502.37 + 4.78 = 620.21:  79%|███████▉  | 1616/2048 [20:03<05:15,  1.37it/s]
loss 1.96 accuracy 0.25 -- 56.51 + 56.54 + 502.37 + 4.78 = 620.21:  79%|███████▉  | 1617/2048 [20:03<05:04,  1.42it/s]
loss 1.71 accuracy 0.19 -- 55.99 + 165.98 + 500.94 + 4.77 = 727.68:  79%|███████▉  | 1617/2048 [20:03<05:04,  1.42it/s]
loss 1.71 accuracy 0.19 -- 55.99 + 165.98 + 500.94 + 4.77 = 727.68:  79%|███████▉  | 1618/2048 [20:03<05:09,  1.39it/s]
loss 1.55 accuracy 0.38 -- 55.92 + 56.44 + 498.24 + 4.78 = 615.38:  79%|███████▉  | 1618/2048 [20:04<05:09,  1.39it/s] 
loss 1.55 accuracy 0.38 -- 55.92 + 56.44 + 498.24 + 4.78 = 615.38:  79%|███████▉  | 1619/2048 [20:04<04:58,  1.43it/s]
loss 1.73 accuracy 0.38 -- 56.64 + 56.31 + 498.47 + 4.78 = 616.20:  79%|███████▉  | 1619/2048 [20:05<04:58,  1.43it/s]
loss 1.73 accuracy 0.38 -- 56.64 + 56.31 + 498.47 + 4.78 = 616.20:  79%|███████▉  | 1620/2048 [20:05<05:05,  1.40it/s]
loss 1.27 accuracy 0.50 -- 56.02 + 57.63 + 494.58 + 4.78 = 613.02:  79%|███████▉  | 1620/2048 [20:05<05:05,  1.40it/s]
loss 1.27 accuracy 0.50 -- 56.02 + 57.63 + 494.58 + 4.78 = 613.02:  79%|███████▉  | 1621/2048 [20:05<04:55,  1.45it/s]
loss 1.54 accuracy 0.44 -- 156.57 + 56.68 + 488.88 + 4.76 = 706.89:  79%|███████▉  | 1621/2048 [20:06<04:55,  1.45it/s]
loss 1.54 accuracy 0.44 -- 156.57 + 56.68 + 488.88 + 4.76 = 706.89:  79%|███████▉  | 1622/2048 [20:06<04:59,  1.42it/s]
loss 1.86 accuracy 0.19 -- 55.88 + 166.46 + 502.34 + 4.79 = 729.48:  79%|███████▉  | 1622/2048 [20:07<04:59,  1.42it/s]
loss 1.86 accuracy 0.19 -- 55.88 + 166.46 + 502.34 + 4.79 = 729.48:  79%|███████▉  | 1623/2048 [20:07<05:05,  1.39it/s]
loss 1.64 accuracy 0.50 -- 56.66 + 56.44 + 497.66 + 4.78 = 615.54:  79%|███████▉  | 1623/2048 [20:08<05:05,  1.39it/s] 
loss 1.64 accuracy 0.50 -- 56.66 + 56.44 + 497.66 + 4.78 = 615.54:  79%|███████▉  | 1624/2048 [20:08<04:55,  1.44it/s]
loss 1.69 accuracy 0.19 -- 162.70 + 57.08 + 497.73 + 4.80 = 722.33:  79%|███████▉  | 1624/2048 [20:08<04:55,  1.44it/s]
loss 1.69 accuracy 0.19 -- 162.70 + 57.08 + 497.73 + 4.80 = 722.33:  79%|███████▉  | 1625/2048 [20:08<05:01,  1.40it/s]
loss 1.59 accuracy 0.50 -- 56.23 + 57.87 + 619.65 + 4.79 = 738.54:  79%|███████▉  | 1625/2048 [20:09<05:01,  1.40it/s] 
loss 1.59 accuracy 0.50 -- 56.23 + 57.87 + 619.65 + 4.79 = 738.54:  79%|███████▉  | 1626/2048 [20:09<05:07,  1.37it/s]
loss 1.84 accuracy 0.31 -- 56.67 + 56.92 + 504.03 + 4.78 = 622.40:  79%|███████▉  | 1626/2048 [20:10<05:07,  1.37it/s]
loss 1.84 accuracy 0.31 -- 56.67 + 56.92 + 504.03 + 4.78 = 622.40:  79%|███████▉  | 1627/2048 [20:10<04:56,  1.42it/s]
loss 1.55 accuracy 0.50 -- 56.16 + 57.11 + 616.03 + 4.80 = 734.10:  79%|███████▉  | 1627/2048 [20:10<04:56,  1.42it/s]
loss 1.55 accuracy 0.50 -- 56.16 + 57.11 + 616.03 + 4.80 = 734.10:  79%|███████▉  | 1628/2048 [20:10<05:03,  1.39it/s]
loss 1.49 accuracy 0.25 -- 56.87 + 56.49 + 501.89 + 4.78 = 620.03:  79%|███████▉  | 1628/2048 [20:11<05:03,  1.39it/s]
loss 1.49 accuracy 0.25 -- 56.87 + 56.49 + 501.89 + 4.78 = 620.03:  80%|███████▉  | 1629/2048 [20:11<04:53,  1.43it/s]
loss 2.06 accuracy 0.31 -- 56.08 + 167.37 + 501.28 + 4.78 = 729.51:  80%|███████▉  | 1629/2048 [20:12<04:53,  1.43it/s]
loss 2.06 accuracy 0.31 -- 56.08 + 167.37 + 501.28 + 4.78 = 729.51:  80%|███████▉  | 1630/2048 [20:12<04:59,  1.40it/s]
loss 1.83 accuracy 0.44 -- 56.05 + 56.42 + 499.94 + 4.79 = 617.19:  80%|███████▉  | 1630/2048 [20:13<04:59,  1.40it/s] 
loss 1.83 accuracy 0.44 -- 56.05 + 56.42 + 499.94 + 4.79 = 617.19:  80%|███████▉  | 1631/2048 [20:13<04:49,  1.44it/s]
loss 2.01 accuracy 0.31 -- 56.54 + 56.47 + 497.81 + 4.77 = 615.59:  80%|███████▉  | 1631/2048 [20:13<04:49,  1.44it/s]
loss 2.01 accuracy 0.31 -- 56.54 + 56.47 + 497.81 + 4.77 = 615.59:  80%|███████▉  | 1632/2048 [20:13<04:55,  1.41it/s]
loss 2.11 accuracy 0.31 -- 55.86 + 57.27 + 494.79 + 4.79 = 612.72:  80%|███████▉  | 1632/2048 [20:14<04:55,  1.41it/s]
loss 2.11 accuracy 0.31 -- 55.86 + 57.27 + 494.79 + 4.79 = 612.72:  80%|███████▉  | 1633/2048 [20:14<04:46,  1.45it/s]
loss 1.66 accuracy 0.31 -- 157.53 + 56.70 + 488.54 + 4.80 = 707.58:  80%|███████▉  | 1633/2048 [20:15<04:46,  1.45it/s]
loss 1.66 accuracy 0.31 -- 157.53 + 56.70 + 488.54 + 4.80 = 707.58:  80%|███████▉  | 1634/2048 [20:15<04:51,  1.42it/s]
loss 1.70 accuracy 0.31 -- 55.59 + 165.84 + 499.97 + 4.78 = 726.18:  80%|███████▉  | 1634/2048 [20:15<04:51,  1.42it/s]
loss 1.70 accuracy 0.31 -- 55.59 + 165.84 + 499.97 + 4.78 = 726.18:  80%|███████▉  | 1635/2048 [20:15<04:56,  1.39it/s]
loss 1.27 accuracy 0.69 -- 56.43 + 56.56 + 497.48 + 4.79 = 615.26:  80%|███████▉  | 1635/2048 [20:16<04:56,  1.39it/s] 
loss 1.27 accuracy 0.69 -- 56.43 + 56.56 + 497.48 + 4.79 = 615.26:  80%|███████▉  | 1636/2048 [20:16<04:46,  1.44it/s]
loss 1.83 accuracy 0.50 -- 162.07 + 57.72 + 497.80 + 4.77 = 722.36:  80%|███████▉  | 1636/2048 [20:17<04:46,  1.44it/s]
loss 1.83 accuracy 0.50 -- 162.07 + 57.72 + 497.80 + 4.77 = 722.36:  80%|███████▉  | 1637/2048 [20:17<04:52,  1.41it/s]
loss 1.43 accuracy 0.50 -- 56.03 + 57.06 + 619.95 + 4.83 = 737.86:  80%|███████▉  | 1637/2048 [20:18<04:52,  1.41it/s] 
loss 1.43 accuracy 0.50 -- 56.03 + 57.06 + 619.95 + 4.83 = 737.86:  80%|███████▉  | 1638/2048 [20:18<04:58,  1.37it/s]
loss 1.72 accuracy 0.38 -- 56.75 + 56.24 + 505.05 + 4.77 = 622.81:  80%|███████▉  | 1638/2048 [20:18<04:58,  1.37it/s]
loss 1.72 accuracy 0.38 -- 56.75 + 56.24 + 505.05 + 4.77 = 622.81:  80%|████████  | 1639/2048 [20:18<04:52,  1.40it/s]
loss 2.23 accuracy 0.12 -- 56.11 + 57.29 + 615.28 + 4.78 = 733.45:  80%|████████  | 1639/2048 [20:19<04:52,  1.40it/s]
loss 2.23 accuracy 0.12 -- 56.11 + 57.29 + 615.28 + 4.78 = 733.45:  80%|████████  | 1640/2048 [20:19<04:57,  1.37it/s]
loss 1.89 accuracy 0.50 -- 56.75 + 56.24 + 502.17 + 4.78 = 619.94:  80%|████████  | 1640/2048 [20:20<04:57,  1.37it/s]
loss 1.89 accuracy 0.50 -- 56.75 + 56.24 + 502.17 + 4.78 = 619.94:  80%|████████  | 1641/2048 [20:20<04:46,  1.42it/s]
loss 2.06 accuracy 0.31 -- 56.18 + 166.01 + 502.95 + 4.78 = 729.92:  80%|████████  | 1641/2048 [20:20<04:46,  1.42it/s]
loss 2.06 accuracy 0.31 -- 56.18 + 166.01 + 502.95 + 4.78 = 729.92:  80%|████████  | 1642/2048 [20:20<04:52,  1.39it/s]
loss 2.20 accuracy 0.31 -- 56.37 + 56.41 + 499.39 + 4.78 = 616.95:  80%|████████  | 1642/2048 [20:21<04:52,  1.39it/s] 
loss 2.20 accuracy 0.31 -- 56.37 + 56.41 + 499.39 + 4.78 = 616.95:  80%|████████  | 1643/2048 [20:21<04:42,  1.43it/s]
loss 2.14 accuracy 0.12 -- 56.78 + 56.48 + 498.18 + 4.79 = 616.23:  80%|████████  | 1643/2048 [20:22<04:42,  1.43it/s]
loss 2.14 accuracy 0.12 -- 56.78 + 56.48 + 498.18 + 4.79 = 616.23:  80%|████████  | 1644/2048 [20:22<04:48,  1.40it/s]
loss 2.09 accuracy 0.19 -- 55.96 + 57.04 + 495.33 + 4.77 = 613.09:  80%|████████  | 1644/2048 [20:22<04:48,  1.40it/s]
loss 2.09 accuracy 0.19 -- 55.96 + 57.04 + 495.33 + 4.77 = 613.09:  80%|████████  | 1645/2048 [20:22<04:38,  1.45it/s]
loss 1.74 accuracy 0.50 -- 157.65 + 56.86 + 489.11 + 4.78 = 708.39:  80%|████████  | 1645/2048 [20:23<04:38,  1.45it/s]
loss 1.74 accuracy 0.50 -- 157.65 + 56.86 + 489.11 + 4.78 = 708.39:  80%|████████  | 1646/2048 [20:23<04:43,  1.42it/s]
loss 1.89 accuracy 0.38 -- 55.83 + 166.38 + 501.34 + 4.83 = 728.37:  80%|████████  | 1646/2048 [20:24<04:43,  1.42it/s]
loss 1.89 accuracy 0.38 -- 55.83 + 166.38 + 501.34 + 4.83 = 728.37:  80%|████████  | 1647/2048 [20:24<04:52,  1.37it/s]
loss 1.62 accuracy 0.50 -- 56.58 + 56.21 + 499.44 + 4.78 = 617.02:  80%|████████  | 1647/2048 [20:25<04:52,  1.37it/s] 
loss 1.62 accuracy 0.50 -- 56.58 + 56.21 + 499.44 + 4.78 = 617.02:  80%|████████  | 1648/2048 [20:25<04:41,  1.42it/s]
loss 1.67 accuracy 0.44 -- 162.08 + 57.19 + 497.17 + 4.77 = 721.21:  80%|████████  | 1648/2048 [20:25<04:41,  1.42it/s]
loss 1.67 accuracy 0.44 -- 162.08 + 57.19 + 497.17 + 4.77 = 721.21:  81%|████████  | 1649/2048 [20:25<04:46,  1.39it/s]
loss 1.61 accuracy 0.44 -- 56.37 + 57.22 + 620.64 + 4.78 = 739.01:  81%|████████  | 1649/2048 [20:26<04:46,  1.39it/s] 
loss 1.61 accuracy 0.44 -- 56.37 + 57.22 + 620.64 + 4.78 = 739.01:  81%|████████  | 1650/2048 [20:26<04:51,  1.37it/s]
loss 1.84 accuracy 0.31 -- 56.91 + 56.30 + 504.28 + 4.76 = 622.25:  81%|████████  | 1650/2048 [20:27<04:51,  1.37it/s]
loss 1.84 accuracy 0.31 -- 56.91 + 56.30 + 504.28 + 4.76 = 622.25:  81%|████████  | 1651/2048 [20:27<04:40,  1.41it/s]
loss 1.51 accuracy 0.44 -- 56.07 + 57.69 + 615.21 + 4.77 = 733.73:  81%|████████  | 1651/2048 [20:28<04:40,  1.41it/s]
loss 1.51 accuracy 0.44 -- 56.07 + 57.69 + 615.21 + 4.77 = 733.73:  81%|████████  | 1652/2048 [20:28<04:46,  1.38it/s]
loss 1.56 accuracy 0.44 -- 56.69 + 56.32 + 503.10 + 4.79 = 620.91:  81%|████████  | 1652/2048 [20:28<04:46,  1.38it/s]
loss 1.56 accuracy 0.44 -- 56.69 + 56.32 + 503.10 + 4.79 = 620.91:  81%|████████  | 1653/2048 [20:28<04:36,  1.43it/s]
loss 1.87 accuracy 0.12 -- 56.61 + 166.75 + 501.04 + 4.76 = 729.15:  81%|████████  | 1653/2048 [20:29<04:36,  1.43it/s]
loss 1.87 accuracy 0.12 -- 56.61 + 166.75 + 501.04 + 4.76 = 729.15:  81%|████████  | 1654/2048 [20:29<04:46,  1.37it/s]
loss 1.59 accuracy 0.44 -- 56.26 + 56.88 + 499.06 + 4.78 = 616.98:  81%|████████  | 1654/2048 [20:30<04:46,  1.37it/s] 
loss 1.59 accuracy 0.44 -- 56.26 + 56.88 + 499.06 + 4.78 = 616.98:  81%|████████  | 1655/2048 [20:30<04:36,  1.42it/s]
loss 1.35 accuracy 0.44 -- 56.68 + 56.15 + 498.51 + 4.76 = 616.12:  81%|████████  | 1655/2048 [20:30<04:36,  1.42it/s]
loss 1.35 accuracy 0.44 -- 56.68 + 56.15 + 498.51 + 4.76 = 616.12:  81%|████████  | 1656/2048 [20:30<04:41,  1.39it/s]
loss 1.90 accuracy 0.25 -- 56.09 + 57.27 + 495.18 + 4.84 = 613.38:  81%|████████  | 1656/2048 [20:31<04:41,  1.39it/s]
loss 1.90 accuracy 0.25 -- 56.09 + 57.27 + 495.18 + 4.84 = 613.38:  81%|████████  | 1657/2048 [20:31<04:31,  1.44it/s]
loss 1.28 accuracy 0.50 -- 157.88 + 56.67 + 490.38 + 4.77 = 709.71:  81%|████████  | 1657/2048 [20:32<04:31,  1.44it/s]
loss 1.28 accuracy 0.50 -- 157.88 + 56.67 + 490.38 + 4.77 = 709.71:  81%|████████  | 1658/2048 [20:32<04:35,  1.41it/s]
loss 1.90 accuracy 0.19 -- 55.79 + 166.29 + 504.09 + 4.78 = 730.96:  81%|████████  | 1658/2048 [20:33<04:35,  1.41it/s]
loss 1.90 accuracy 0.19 -- 55.79 + 166.29 + 504.09 + 4.78 = 730.96:  81%|████████  | 1659/2048 [20:33<04:40,  1.38it/s]
loss 2.00 accuracy 0.31 -- 56.69 + 56.61 + 498.23 + 4.77 = 616.31:  81%|████████  | 1659/2048 [20:33<04:40,  1.38it/s] 
loss 2.00 accuracy 0.31 -- 56.69 + 56.61 + 498.23 + 4.77 = 616.31:  81%|████████  | 1660/2048 [20:33<04:31,  1.43it/s]
loss 1.41 accuracy 0.50 -- 162.12 + 57.72 + 497.00 + 4.77 = 721.60:  81%|████████  | 1660/2048 [20:34<04:31,  1.43it/s]
loss 1.41 accuracy 0.50 -- 162.12 + 57.72 + 497.00 + 4.77 = 721.60:  81%|████████  | 1661/2048 [20:34<04:36,  1.40it/s]
loss 2.63 accuracy 0.25 -- 56.12 + 57.75 + 619.83 + 4.78 = 738.48:  81%|████████  | 1661/2048 [20:35<04:36,  1.40it/s] 
loss 2.63 accuracy 0.25 -- 56.12 + 57.75 + 619.83 + 4.78 = 738.48:  81%|████████  | 1662/2048 [20:35<04:45,  1.35it/s]
loss 1.65 accuracy 0.44 -- 56.67 + 56.33 + 505.09 + 4.77 = 622.86:  81%|████████  | 1662/2048 [20:35<04:45,  1.35it/s]
loss 1.65 accuracy 0.44 -- 56.67 + 56.33 + 505.09 + 4.77 = 622.86:  81%|████████  | 1663/2048 [20:35<04:34,  1.40it/s]
loss 1.96 accuracy 0.44 -- 56.17 + 57.25 + 614.80 + 4.78 = 733.00:  81%|████████  | 1663/2048 [20:36<04:34,  1.40it/s]
loss 1.96 accuracy 0.44 -- 56.17 + 57.25 + 614.80 + 4.78 = 733.00:  81%|████████▏ | 1664/2048 [20:36<04:39,  1.38it/s]
loss 1.50 accuracy 0.38 -- 57.11 + 57.06 + 502.72 + 4.76 = 621.65:  81%|████████▏ | 1664/2048 [20:37<04:39,  1.38it/s]
loss 1.50 accuracy 0.38 -- 57.11 + 57.06 + 502.72 + 4.76 = 621.65:  81%|████████▏ | 1665/2048 [20:37<04:29,  1.42it/s]
loss 1.85 accuracy 0.31 -- 56.40 + 166.17 + 502.35 + 4.79 = 729.71:  81%|████████▏ | 1665/2048 [20:38<04:29,  1.42it/s]
loss 1.85 accuracy 0.31 -- 56.40 + 166.17 + 502.35 + 4.79 = 729.71:  81%|████████▏ | 1666/2048 [20:38<04:34,  1.39it/s]
loss 2.34 accuracy 0.25 -- 56.06 + 56.37 + 499.75 + 4.77 = 616.96:  81%|████████▏ | 1666/2048 [20:38<04:34,  1.39it/s] 
loss 2.34 accuracy 0.25 -- 56.06 + 56.37 + 499.75 + 4.77 = 616.96:  81%|████████▏ | 1667/2048 [20:38<04:25,  1.44it/s]
loss 1.54 accuracy 0.44 -- 56.63 + 56.49 + 497.81 + 4.77 = 615.70:  81%|████████▏ | 1667/2048 [20:39<04:25,  1.44it/s]
loss 1.54 accuracy 0.44 -- 56.63 + 56.49 + 497.81 + 4.77 = 615.70:  81%|████████▏ | 1668/2048 [20:39<04:30,  1.40it/s]
loss 1.67 accuracy 0.31 -- 55.99 + 57.48 + 494.44 + 4.77 = 612.69:  81%|████████▏ | 1668/2048 [20:40<04:30,  1.40it/s]
loss 1.67 accuracy 0.31 -- 55.99 + 57.48 + 494.44 + 4.77 = 612.69:  81%|████████▏ | 1669/2048 [20:40<04:25,  1.43it/s]
loss 2.53 accuracy 0.31 -- 157.38 + 56.86 + 489.40 + 4.78 = 708.43:  81%|████████▏ | 1669/2048 [20:40<04:25,  1.43it/s]
loss 2.53 accuracy 0.31 -- 157.38 + 56.86 + 489.40 + 4.78 = 708.43:  82%|████████▏ | 1670/2048 [20:40<04:28,  1.41it/s]
loss 2.16 accuracy 0.31 -- 55.85 + 166.09 + 499.90 + 4.79 = 726.63:  82%|████████▏ | 1670/2048 [20:41<04:28,  1.41it/s]
loss 2.16 accuracy 0.31 -- 55.85 + 166.09 + 499.90 + 4.79 = 726.63:  82%|████████▏ | 1671/2048 [20:41<04:32,  1.38it/s]
loss 1.87 accuracy 0.25 -- 56.76 + 56.52 + 496.67 + 4.79 = 614.74:  82%|████████▏ | 1671/2048 [20:42<04:32,  1.38it/s] 
loss 1.87 accuracy 0.25 -- 56.76 + 56.52 + 496.67 + 4.79 = 614.74:  82%|████████▏ | 1672/2048 [20:42<04:22,  1.43it/s]
loss 1.43 accuracy 0.69 -- 162.36 + 57.05 + 496.15 + 4.77 = 720.34:  82%|████████▏ | 1672/2048 [20:42<04:22,  1.43it/s]
loss 1.43 accuracy 0.69 -- 162.36 + 57.05 + 496.15 + 4.77 = 720.34:  82%|████████▏ | 1673/2048 [20:42<04:27,  1.40it/s]
loss 1.78 accuracy 0.31 -- 56.14 + 57.07 + 619.95 + 4.78 = 737.94:  82%|████████▏ | 1673/2048 [20:43<04:27,  1.40it/s] 
loss 1.78 accuracy 0.31 -- 56.14 + 57.07 + 619.95 + 4.78 = 737.94:  82%|████████▏ | 1674/2048 [20:43<04:32,  1.37it/s]
loss 1.90 accuracy 0.44 -- 56.71 + 56.50 + 504.29 + 4.77 = 622.27:  82%|████████▏ | 1674/2048 [20:44<04:32,  1.37it/s]
loss 1.90 accuracy 0.44 -- 56.71 + 56.50 + 504.29 + 4.77 = 622.27:  82%|████████▏ | 1675/2048 [20:44<04:23,  1.42it/s]
loss 1.62 accuracy 0.62 -- 55.94 + 57.19 + 615.51 + 4.76 = 733.40:  82%|████████▏ | 1675/2048 [20:45<04:23,  1.42it/s]
loss 1.62 accuracy 0.62 -- 55.94 + 57.19 + 615.51 + 4.76 = 733.40:  82%|████████▏ | 1676/2048 [20:45<04:28,  1.39it/s]
loss 1.69 accuracy 0.19 -- 56.56 + 56.49 + 502.05 + 4.77 = 619.87:  82%|████████▏ | 1676/2048 [20:45<04:28,  1.39it/s]
loss 1.69 accuracy 0.19 -- 56.56 + 56.49 + 502.05 + 4.77 = 619.87:  82%|████████▏ | 1677/2048 [20:45<04:23,  1.41it/s]
loss 1.47 accuracy 0.44 -- 56.33 + 166.24 + 501.76 + 4.78 = 729.11:  82%|████████▏ | 1677/2048 [20:46<04:23,  1.41it/s]
loss 1.47 accuracy 0.44 -- 56.33 + 166.24 + 501.76 + 4.78 = 729.11:  82%|████████▏ | 1678/2048 [20:46<04:27,  1.38it/s]
loss 1.60 accuracy 0.38 -- 56.14 + 56.54 + 499.03 + 4.78 = 616.49:  82%|████████▏ | 1678/2048 [20:47<04:27,  1.38it/s] 
loss 1.60 accuracy 0.38 -- 56.14 + 56.54 + 499.03 + 4.78 = 616.49:  82%|████████▏ | 1679/2048 [20:47<04:18,  1.43it/s]
loss 1.68 accuracy 0.50 -- 56.71 + 56.66 + 498.86 + 4.77 = 617.01:  82%|████████▏ | 1679/2048 [20:47<04:18,  1.43it/s]
loss 1.68 accuracy 0.50 -- 56.71 + 56.66 + 498.86 + 4.77 = 617.01:  82%|████████▏ | 1680/2048 [20:47<04:23,  1.40it/s]
loss 1.70 accuracy 0.38 -- 55.93 + 57.27 + 494.98 + 4.76 = 612.94:  82%|████████▏ | 1680/2048 [20:48<04:23,  1.40it/s]
loss 1.70 accuracy 0.38 -- 55.93 + 57.27 + 494.98 + 4.76 = 612.94:  82%|████████▏ | 1681/2048 [20:48<04:14,  1.44it/s]
loss 1.63 accuracy 0.38 -- 158.07 + 57.16 + 489.60 + 4.77 = 709.60:  82%|████████▏ | 1681/2048 [20:49<04:14,  1.44it/s]
loss 1.63 accuracy 0.38 -- 158.07 + 57.16 + 489.60 + 4.77 = 709.60:  82%|████████▏ | 1682/2048 [20:49<04:18,  1.42it/s]
loss 1.81 accuracy 0.25 -- 55.60 + 166.02 + 501.86 + 4.77 = 728.25:  82%|████████▏ | 1682/2048 [20:50<04:18,  1.42it/s]
loss 1.81 accuracy 0.25 -- 55.60 + 166.02 + 501.86 + 4.77 = 728.25:  82%|████████▏ | 1683/2048 [20:50<04:23,  1.39it/s]
loss 1.52 accuracy 0.50 -- 57.02 + 56.57 + 498.86 + 4.79 = 617.23:  82%|████████▏ | 1683/2048 [20:50<04:23,  1.39it/s] 
loss 1.52 accuracy 0.50 -- 57.02 + 56.57 + 498.86 + 4.79 = 617.23:  82%|████████▏ | 1684/2048 [20:50<04:13,  1.43it/s]
loss 1.96 accuracy 0.25 -- 162.31 + 57.15 + 496.60 + 4.79 = 720.86:  82%|████████▏ | 1684/2048 [20:51<04:13,  1.43it/s]
loss 1.96 accuracy 0.25 -- 162.31 + 57.15 + 496.60 + 4.79 = 720.86:  82%|████████▏ | 1685/2048 [20:51<04:18,  1.40it/s]
loss 1.84 accuracy 0.25 -- 56.21 + 57.46 + 618.48 + 4.77 = 736.93:  82%|████████▏ | 1685/2048 [20:52<04:18,  1.40it/s] 
loss 1.84 accuracy 0.25 -- 56.21 + 57.46 + 618.48 + 4.77 = 736.93:  82%|████████▏ | 1686/2048 [20:52<04:23,  1.37it/s]
loss 1.84 accuracy 0.38 -- 56.68 + 56.25 + 505.59 + 4.80 = 623.33:  82%|████████▏ | 1686/2048 [20:52<04:23,  1.37it/s]
loss 1.84 accuracy 0.38 -- 56.68 + 56.25 + 505.59 + 4.80 = 623.33:  82%|████████▏ | 1687/2048 [20:52<04:14,  1.42it/s]
loss 1.18 accuracy 0.62 -- 56.60 + 57.65 + 617.13 + 4.78 = 736.16:  82%|████████▏ | 1687/2048 [20:53<04:14,  1.42it/s]
loss 1.18 accuracy 0.62 -- 56.60 + 57.65 + 617.13 + 4.78 = 736.16:  82%|████████▏ | 1688/2048 [20:53<04:20,  1.38it/s]
loss 1.81 accuracy 0.44 -- 56.69 + 56.33 + 500.99 + 4.78 = 618.80:  82%|████████▏ | 1688/2048 [20:54<04:20,  1.38it/s]
loss 1.81 accuracy 0.44 -- 56.69 + 56.33 + 500.99 + 4.78 = 618.80:  82%|████████▏ | 1689/2048 [20:54<04:11,  1.43it/s]
loss 1.88 accuracy 0.31 -- 56.33 + 166.18 + 500.63 + 4.79 = 727.92:  82%|████████▏ | 1689/2048 [20:55<04:11,  1.43it/s]
loss 1.88 accuracy 0.31 -- 56.33 + 166.18 + 500.63 + 4.79 = 727.92:  83%|████████▎ | 1690/2048 [20:55<04:16,  1.40it/s]
loss 2.12 accuracy 0.31 -- 55.99 + 56.74 + 499.08 + 4.79 = 616.60:  83%|████████▎ | 1690/2048 [20:55<04:16,  1.40it/s] 
loss 2.12 accuracy 0.31 -- 55.99 + 56.74 + 499.08 + 4.79 = 616.60:  83%|████████▎ | 1691/2048 [20:55<04:07,  1.44it/s]
loss 2.04 accuracy 0.25 -- 56.69 + 56.45 + 497.90 + 4.80 = 615.84:  83%|████████▎ | 1691/2048 [20:56<04:07,  1.44it/s]
loss 2.04 accuracy 0.25 -- 56.69 + 56.45 + 497.90 + 4.80 = 615.84:  83%|████████▎ | 1692/2048 [20:56<04:16,  1.39it/s]
loss 1.82 accuracy 0.38 -- 56.07 + 57.70 + 496.92 + 4.76 = 615.44:  83%|████████▎ | 1692/2048 [20:57<04:16,  1.39it/s]
loss 1.82 accuracy 0.38 -- 56.07 + 57.70 + 496.92 + 4.76 = 615.44:  83%|████████▎ | 1693/2048 [20:57<04:07,  1.43it/s]
loss 1.92 accuracy 0.38 -- 157.39 + 56.82 + 489.23 + 4.76 = 708.20:  83%|████████▎ | 1693/2048 [20:57<04:07,  1.43it/s]
loss 1.92 accuracy 0.38 -- 157.39 + 56.82 + 489.23 + 4.76 = 708.20:  83%|████████▎ | 1694/2048 [20:57<04:10,  1.41it/s]
loss 1.55 accuracy 0.50 -- 55.87 + 166.41 + 501.82 + 4.78 = 728.88:  83%|████████▎ | 1694/2048 [20:58<04:10,  1.41it/s]
loss 1.55 accuracy 0.50 -- 55.87 + 166.41 + 501.82 + 4.78 = 728.88:  83%|████████▎ | 1695/2048 [20:58<04:15,  1.38it/s]
loss 2.16 accuracy 0.19 -- 56.73 + 56.57 + 497.65 + 4.77 = 615.72:  83%|████████▎ | 1695/2048 [20:59<04:15,  1.38it/s] 
loss 2.16 accuracy 0.19 -- 56.73 + 56.57 + 497.65 + 4.77 = 615.72:  83%|████████▎ | 1696/2048 [20:59<04:05,  1.43it/s]
loss 1.74 accuracy 0.50 -- 162.44 + 57.01 + 497.21 + 4.77 = 721.43:  83%|████████▎ | 1696/2048 [21:00<04:05,  1.43it/s]
loss 1.74 accuracy 0.50 -- 162.44 + 57.01 + 497.21 + 4.77 = 721.43:  83%|████████▎ | 1697/2048 [21:00<04:10,  1.40it/s]
loss 1.65 accuracy 0.56 -- 56.00 + 57.03 + 621.60 + 4.79 = 739.42:  83%|████████▎ | 1697/2048 [21:00<04:10,  1.40it/s] 
loss 1.65 accuracy 0.56 -- 56.00 + 57.03 + 621.60 + 4.79 = 739.42:  83%|████████▎ | 1698/2048 [21:00<04:15,  1.37it/s]
loss 1.42 accuracy 0.56 -- 57.16 + 57.53 + 507.31 + 4.82 = 626.82:  83%|████████▎ | 1698/2048 [21:01<04:15,  1.37it/s]
loss 1.42 accuracy 0.56 -- 57.16 + 57.53 + 507.31 + 4.82 = 626.82:  83%|████████▎ | 1699/2048 [21:01<04:06,  1.41it/s]
loss 2.24 accuracy 0.38 -- 56.45 + 57.70 + 615.81 + 4.78 = 734.74:  83%|████████▎ | 1699/2048 [21:02<04:06,  1.41it/s]
loss 2.24 accuracy 0.38 -- 56.45 + 57.70 + 615.81 + 4.78 = 734.74:  83%|████████▎ | 1700/2048 [21:02<04:11,  1.38it/s]
loss 1.80 accuracy 0.31 -- 56.80 + 56.25 + 503.12 + 4.81 = 620.99:  83%|████████▎ | 1700/2048 [21:02<04:11,  1.38it/s]
loss 1.80 accuracy 0.31 -- 56.80 + 56.25 + 503.12 + 4.81 = 620.99:  83%|████████▎ | 1701/2048 [21:02<04:06,  1.41it/s]
loss 1.78 accuracy 0.38 -- 56.03 + 166.56 + 502.28 + 4.79 = 729.65:  83%|████████▎ | 1701/2048 [21:03<04:06,  1.41it/s]
loss 1.78 accuracy 0.38 -- 56.03 + 166.56 + 502.28 + 4.79 = 729.65:  83%|████████▎ | 1702/2048 [21:03<04:10,  1.38it/s]
loss 1.58 accuracy 0.44 -- 56.31 + 56.54 + 499.13 + 4.77 = 616.74:  83%|████████▎ | 1702/2048 [21:04<04:10,  1.38it/s] 
loss 1.58 accuracy 0.44 -- 56.31 + 56.54 + 499.13 + 4.77 = 616.74:  83%|████████▎ | 1703/2048 [21:04<04:01,  1.43it/s]
loss 1.47 accuracy 0.25 -- 56.85 + 56.50 + 499.93 + 4.78 = 618.06:  83%|████████▎ | 1703/2048 [21:05<04:01,  1.43it/s]
loss 1.47 accuracy 0.25 -- 56.85 + 56.50 + 499.93 + 4.78 = 618.06:  83%|████████▎ | 1704/2048 [21:05<04:06,  1.40it/s]
loss 1.45 accuracy 0.31 -- 56.08 + 57.12 + 495.78 + 4.77 = 613.74:  83%|████████▎ | 1704/2048 [21:05<04:06,  1.40it/s]
loss 1.45 accuracy 0.31 -- 56.08 + 57.12 + 495.78 + 4.77 = 613.74:  83%|████████▎ | 1705/2048 [21:05<03:57,  1.44it/s]
loss 1.65 accuracy 0.31 -- 157.54 + 56.97 + 488.78 + 4.77 = 708.06:  83%|████████▎ | 1705/2048 [21:06<03:57,  1.44it/s]
loss 1.65 accuracy 0.31 -- 157.54 + 56.97 + 488.78 + 4.77 = 708.06:  83%|████████▎ | 1706/2048 [21:06<04:01,  1.42it/s]
loss 1.74 accuracy 0.25 -- 55.78 + 166.09 + 500.25 + 4.76 = 726.88:  83%|████████▎ | 1706/2048 [21:07<04:01,  1.42it/s]
loss 1.74 accuracy 0.25 -- 55.78 + 166.09 + 500.25 + 4.76 = 726.88:  83%|████████▎ | 1707/2048 [21:07<04:05,  1.39it/s]
loss 1.64 accuracy 0.31 -- 56.36 + 56.33 + 497.17 + 4.78 = 614.64:  83%|████████▎ | 1707/2048 [21:07<04:05,  1.39it/s] 
loss 1.64 accuracy 0.31 -- 56.36 + 56.33 + 497.17 + 4.78 = 614.64:  83%|████████▎ | 1708/2048 [21:07<03:56,  1.44it/s]
loss 2.26 accuracy 0.25 -- 162.40 + 57.09 + 495.53 + 4.76 = 719.77:  83%|████████▎ | 1708/2048 [21:08<03:56,  1.44it/s]
loss 2.26 accuracy 0.25 -- 162.40 + 57.09 + 495.53 + 4.76 = 719.77:  83%|████████▎ | 1709/2048 [21:08<04:01,  1.40it/s]
loss 1.72 accuracy 0.44 -- 56.06 + 57.79 + 621.54 + 4.79 = 740.17:  83%|████████▎ | 1709/2048 [21:09<04:01,  1.40it/s] 
loss 1.72 accuracy 0.44 -- 56.06 + 57.79 + 621.54 + 4.79 = 740.17:  83%|████████▎ | 1710/2048 [21:09<04:06,  1.37it/s]
loss 1.99 accuracy 0.38 -- 56.92 + 56.42 + 506.27 + 4.78 = 624.39:  83%|████████▎ | 1710/2048 [21:09<04:06,  1.37it/s]
loss 1.99 accuracy 0.38 -- 56.92 + 56.42 + 506.27 + 4.78 = 624.39:  84%|████████▎ | 1711/2048 [21:09<03:57,  1.42it/s]
loss 1.80 accuracy 0.38 -- 56.42 + 57.30 + 615.65 + 4.78 = 734.15:  84%|████████▎ | 1711/2048 [21:10<03:57,  1.42it/s]
loss 1.80 accuracy 0.38 -- 56.42 + 57.30 + 615.65 + 4.78 = 734.15:  84%|████████▎ | 1712/2048 [21:10<04:02,  1.39it/s]
loss 1.51 accuracy 0.44 -- 56.88 + 56.35 + 502.47 + 4.78 = 620.48:  84%|████████▎ | 1712/2048 [21:11<04:02,  1.39it/s]
loss 1.51 accuracy 0.44 -- 56.88 + 56.35 + 502.47 + 4.78 = 620.48:  84%|████████▎ | 1713/2048 [21:11<03:54,  1.43it/s]
loss 2.39 accuracy 0.25 -- 56.14 + 165.70 + 501.58 + 4.77 = 728.19:  84%|████████▎ | 1713/2048 [21:12<03:54,  1.43it/s]
loss 2.39 accuracy 0.25 -- 56.14 + 165.70 + 501.58 + 4.77 = 728.19:  84%|████████▎ | 1714/2048 [21:12<03:59,  1.40it/s]
loss 1.76 accuracy 0.38 -- 56.15 + 56.55 + 499.44 + 4.81 = 616.96:  84%|████████▎ | 1714/2048 [21:12<03:59,  1.40it/s] 
loss 1.76 accuracy 0.38 -- 56.15 + 56.55 + 499.44 + 4.81 = 616.96:  84%|████████▎ | 1715/2048 [21:12<03:51,  1.44it/s]
loss 1.53 accuracy 0.44 -- 56.99 + 56.55 + 498.75 + 4.77 = 617.05:  84%|████████▎ | 1715/2048 [21:13<03:51,  1.44it/s]
loss 1.53 accuracy 0.44 -- 56.99 + 56.55 + 498.75 + 4.77 = 617.05:  84%|████████▍ | 1716/2048 [21:13<03:56,  1.41it/s]
loss 2.31 accuracy 0.38 -- 56.31 + 57.14 + 495.76 + 4.77 = 613.98:  84%|████████▍ | 1716/2048 [21:14<03:56,  1.41it/s]
loss 2.31 accuracy 0.38 -- 56.31 + 57.14 + 495.76 + 4.77 = 613.98:  84%|████████▍ | 1717/2048 [21:14<03:52,  1.43it/s]
loss 1.67 accuracy 0.38 -- 157.71 + 56.78 + 488.99 + 4.77 = 708.26:  84%|████████▍ | 1717/2048 [21:14<03:52,  1.43it/s]
loss 1.67 accuracy 0.38 -- 157.71 + 56.78 + 488.99 + 4.77 = 708.26:  84%|████████▍ | 1718/2048 [21:14<03:54,  1.41it/s]
loss 1.68 accuracy 0.38 -- 55.78 + 166.39 + 501.19 + 4.76 = 728.12:  84%|████████▍ | 1718/2048 [21:15<03:54,  1.41it/s]
loss 1.68 accuracy 0.38 -- 55.78 + 166.39 + 501.19 + 4.76 = 728.12:  84%|████████▍ | 1719/2048 [21:15<03:58,  1.38it/s]
loss 2.04 accuracy 0.38 -- 56.82 + 56.77 + 496.93 + 4.78 = 615.30:  84%|████████▍ | 1719/2048 [21:16<03:58,  1.38it/s] 
loss 2.04 accuracy 0.38 -- 56.82 + 56.77 + 496.93 + 4.78 = 615.30:  84%|████████▍ | 1720/2048 [21:16<03:49,  1.43it/s]
loss 1.64 accuracy 0.56 -- 162.34 + 57.11 + 497.96 + 4.78 = 722.20:  84%|████████▍ | 1720/2048 [21:17<03:49,  1.43it/s]
loss 1.64 accuracy 0.56 -- 162.34 + 57.11 + 497.96 + 4.78 = 722.20:  84%|████████▍ | 1721/2048 [21:17<03:53,  1.40it/s]
loss 1.69 accuracy 0.25 -- 55.97 + 56.90 + 620.84 + 4.78 = 738.48:  84%|████████▍ | 1721/2048 [21:17<03:53,  1.40it/s] 
loss 1.69 accuracy 0.25 -- 55.97 + 56.90 + 620.84 + 4.78 = 738.48:  84%|████████▍ | 1722/2048 [21:17<03:57,  1.37it/s]
loss 1.77 accuracy 0.50 -- 56.74 + 56.48 + 506.19 + 4.76 = 624.18:  84%|████████▍ | 1722/2048 [21:18<03:57,  1.37it/s]
loss 1.77 accuracy 0.50 -- 56.74 + 56.48 + 506.19 + 4.76 = 624.18:  84%|████████▍ | 1723/2048 [21:18<03:49,  1.42it/s]
loss 1.74 accuracy 0.38 -- 56.14 + 56.88 + 614.59 + 4.78 = 732.39:  84%|████████▍ | 1723/2048 [21:19<03:49,  1.42it/s]
loss 1.74 accuracy 0.38 -- 56.14 + 56.88 + 614.59 + 4.78 = 732.39:  84%|████████▍ | 1724/2048 [21:19<03:53,  1.38it/s]
loss 1.48 accuracy 0.44 -- 56.99 + 56.51 + 502.36 + 4.77 = 620.64:  84%|████████▍ | 1724/2048 [21:19<03:53,  1.38it/s]
loss 1.48 accuracy 0.44 -- 56.99 + 56.51 + 502.36 + 4.77 = 620.64:  84%|████████▍ | 1725/2048 [21:19<03:49,  1.41it/s]
loss 1.28 accuracy 0.44 -- 56.43 + 165.84 + 500.94 + 4.83 = 728.04:  84%|████████▍ | 1725/2048 [21:20<03:49,  1.41it/s]
loss 1.28 accuracy 0.44 -- 56.43 + 165.84 + 500.94 + 4.83 = 728.04:  84%|████████▍ | 1726/2048 [21:20<03:53,  1.38it/s]
loss 1.83 accuracy 0.38 -- 56.46 + 56.64 + 498.92 + 4.76 = 616.79:  84%|████████▍ | 1726/2048 [21:21<03:53,  1.38it/s] 
loss 1.83 accuracy 0.38 -- 56.46 + 56.64 + 498.92 + 4.76 = 616.79:  84%|████████▍ | 1727/2048 [21:21<03:44,  1.43it/s]
loss 1.39 accuracy 0.31 -- 56.66 + 56.41 + 497.09 + 4.76 = 614.92:  84%|████████▍ | 1727/2048 [21:22<03:44,  1.43it/s]
loss 1.39 accuracy 0.31 -- 56.66 + 56.41 + 497.09 + 4.76 = 614.92:  84%|████████▍ | 1728/2048 [21:22<03:48,  1.40it/s]
loss 1.67 accuracy 0.38 -- 55.71 + 57.13 + 494.83 + 4.78 = 612.45:  84%|████████▍ | 1728/2048 [21:22<03:48,  1.40it/s]
loss 1.67 accuracy 0.38 -- 55.71 + 57.13 + 494.83 + 4.78 = 612.45:  84%|████████▍ | 1729/2048 [21:22<03:40,  1.45it/s]
loss 1.87 accuracy 0.19 -- 157.57 + 56.85 + 488.65 + 4.77 = 707.85:  84%|████████▍ | 1729/2048 [21:23<03:40,  1.45it/s]
loss 1.87 accuracy 0.19 -- 157.57 + 56.85 + 488.65 + 4.77 = 707.85:  84%|████████▍ | 1730/2048 [21:23<03:44,  1.42it/s]
loss 1.80 accuracy 0.38 -- 55.73 + 166.30 + 500.98 + 4.77 = 727.78:  84%|████████▍ | 1730/2048 [21:24<03:44,  1.42it/s]
loss 1.80 accuracy 0.38 -- 55.73 + 166.30 + 500.98 + 4.77 = 727.78:  85%|████████▍ | 1731/2048 [21:24<03:48,  1.39it/s]
loss 1.56 accuracy 0.38 -- 56.66 + 56.50 + 498.53 + 4.77 = 616.46:  85%|████████▍ | 1731/2048 [21:24<03:48,  1.39it/s] 
loss 1.56 accuracy 0.38 -- 56.66 + 56.50 + 498.53 + 4.77 = 616.46:  85%|████████▍ | 1732/2048 [21:24<03:40,  1.43it/s]
loss 1.37 accuracy 0.50 -- 162.44 + 56.84 + 495.60 + 4.78 = 719.65:  85%|████████▍ | 1732/2048 [21:25<03:40,  1.43it/s]
loss 1.37 accuracy 0.50 -- 162.44 + 56.84 + 495.60 + 4.78 = 719.65:  85%|████████▍ | 1733/2048 [21:25<03:47,  1.38it/s]
loss 1.28 accuracy 0.62 -- 55.94 + 56.98 + 620.00 + 4.77 = 737.69:  85%|████████▍ | 1733/2048 [21:26<03:47,  1.38it/s] 
loss 1.28 accuracy 0.62 -- 55.94 + 56.98 + 620.00 + 4.77 = 737.69:  85%|████████▍ | 1734/2048 [21:26<03:50,  1.36it/s]
loss 1.32 accuracy 0.50 -- 56.56 + 56.54 + 505.89 + 4.77 = 623.76:  85%|████████▍ | 1734/2048 [21:27<03:50,  1.36it/s]
loss 1.32 accuracy 0.50 -- 56.56 + 56.54 + 505.89 + 4.77 = 623.76:  85%|████████▍ | 1735/2048 [21:27<03:42,  1.41it/s]
loss 2.11 accuracy 0.31 -- 56.23 + 57.14 + 615.42 + 4.84 = 733.63:  85%|████████▍ | 1735/2048 [21:27<03:42,  1.41it/s]
loss 2.11 accuracy 0.31 -- 56.23 + 57.14 + 615.42 + 4.84 = 733.63:  85%|████████▍ | 1736/2048 [21:27<03:46,  1.38it/s]
loss 1.91 accuracy 0.12 -- 56.76 + 56.48 + 501.98 + 4.80 = 620.01:  85%|████████▍ | 1736/2048 [21:28<03:46,  1.38it/s]
loss 1.91 accuracy 0.12 -- 56.76 + 56.48 + 501.98 + 4.80 = 620.01:  85%|████████▍ | 1737/2048 [21:28<03:38,  1.43it/s]
loss 1.43 accuracy 0.50 -- 56.32 + 166.48 + 503.45 + 4.78 = 731.04:  85%|████████▍ | 1737/2048 [21:29<03:38,  1.43it/s]
loss 1.43 accuracy 0.50 -- 56.32 + 166.48 + 503.45 + 4.78 = 731.04:  85%|████████▍ | 1738/2048 [21:29<03:42,  1.39it/s]
loss 1.85 accuracy 0.31 -- 56.16 + 56.40 + 499.39 + 4.80 = 616.74:  85%|████████▍ | 1738/2048 [21:29<03:42,  1.39it/s] 
loss 1.85 accuracy 0.31 -- 56.16 + 56.40 + 499.39 + 4.80 = 616.74:  85%|████████▍ | 1739/2048 [21:29<03:35,  1.44it/s]
loss 2.12 accuracy 0.25 -- 56.77 + 56.44 + 497.55 + 4.78 = 615.53:  85%|████████▍ | 1739/2048 [21:30<03:35,  1.44it/s]
loss 2.12 accuracy 0.25 -- 56.77 + 56.44 + 497.55 + 4.78 = 615.53:  85%|████████▍ | 1740/2048 [21:30<03:39,  1.40it/s]
loss 1.74 accuracy 0.31 -- 55.86 + 57.27 + 495.41 + 4.77 = 613.31:  85%|████████▍ | 1740/2048 [21:31<03:39,  1.40it/s]
loss 1.74 accuracy 0.31 -- 55.86 + 57.27 + 495.41 + 4.77 = 613.31:  85%|████████▌ | 1741/2048 [21:31<03:32,  1.45it/s]
loss 1.93 accuracy 0.44 -- 157.65 + 56.88 + 489.27 + 4.78 = 708.58:  85%|████████▌ | 1741/2048 [21:32<03:32,  1.45it/s]
loss 1.93 accuracy 0.44 -- 157.65 + 56.88 + 489.27 + 4.78 = 708.58:  85%|████████▌ | 1742/2048 [21:32<03:35,  1.42it/s]
loss 2.33 accuracy 0.25 -- 56.05 + 166.49 + 501.01 + 4.77 = 728.32:  85%|████████▌ | 1742/2048 [21:32<03:35,  1.42it/s]
loss 2.33 accuracy 0.25 -- 56.05 + 166.49 + 501.01 + 4.77 = 728.32:  85%|████████▌ | 1743/2048 [21:32<03:39,  1.39it/s]
loss 2.04 accuracy 0.19 -- 56.75 + 56.83 + 498.28 + 4.78 = 616.64:  85%|████████▌ | 1743/2048 [21:33<03:39,  1.39it/s] 
loss 2.04 accuracy 0.19 -- 56.75 + 56.83 + 498.28 + 4.78 = 616.64:  85%|████████▌ | 1744/2048 [21:33<03:31,  1.44it/s]
loss 1.99 accuracy 0.25 -- 162.88 + 57.01 + 496.24 + 4.78 = 720.91:  85%|████████▌ | 1744/2048 [21:34<03:31,  1.44it/s]
loss 1.99 accuracy 0.25 -- 162.88 + 57.01 + 496.24 + 4.78 = 720.91:  85%|████████▌ | 1745/2048 [21:34<03:35,  1.40it/s]
loss 1.86 accuracy 0.25 -- 56.09 + 57.31 + 619.76 + 4.80 = 737.97:  85%|████████▌ | 1745/2048 [21:34<03:35,  1.40it/s] 
loss 1.86 accuracy 0.25 -- 56.09 + 57.31 + 619.76 + 4.80 = 737.97:  85%|████████▌ | 1746/2048 [21:34<03:39,  1.37it/s]
loss 1.47 accuracy 0.44 -- 56.79 + 56.53 + 503.49 + 4.78 = 621.59:  85%|████████▌ | 1746/2048 [21:35<03:39,  1.37it/s]
loss 1.47 accuracy 0.44 -- 56.79 + 56.53 + 503.49 + 4.78 = 621.59:  85%|████████▌ | 1747/2048 [21:35<03:31,  1.42it/s]
loss 1.43 accuracy 0.38 -- 55.78 + 57.14 + 615.07 + 4.79 = 732.77:  85%|████████▌ | 1747/2048 [21:36<03:31,  1.42it/s]
loss 1.43 accuracy 0.38 -- 55.78 + 57.14 + 615.07 + 4.79 = 732.77:  85%|████████▌ | 1748/2048 [21:36<03:36,  1.39it/s]
loss 2.18 accuracy 0.31 -- 56.80 + 56.35 + 503.25 + 4.77 = 621.17:  85%|████████▌ | 1748/2048 [21:36<03:36,  1.39it/s]
loss 2.18 accuracy 0.31 -- 56.80 + 56.35 + 503.25 + 4.77 = 621.17:  85%|████████▌ | 1749/2048 [21:36<03:29,  1.43it/s]
loss 2.23 accuracy 0.31 -- 56.03 + 165.88 + 501.06 + 4.79 = 727.75:  85%|████████▌ | 1749/2048 [21:37<03:29,  1.43it/s]
loss 2.23 accuracy 0.31 -- 56.03 + 165.88 + 501.06 + 4.79 = 727.75:  85%|████████▌ | 1750/2048 [21:37<03:33,  1.40it/s]
loss 1.77 accuracy 0.25 -- 56.07 + 56.45 + 498.33 + 4.77 = 615.63:  85%|████████▌ | 1750/2048 [21:38<03:33,  1.40it/s] 
loss 1.77 accuracy 0.25 -- 56.07 + 56.45 + 498.33 + 4.77 = 615.63:  85%|████████▌ | 1751/2048 [21:38<03:26,  1.44it/s]
loss 2.52 accuracy 0.38 -- 56.89 + 56.51 + 498.68 + 4.77 = 616.85:  85%|████████▌ | 1751/2048 [21:39<03:26,  1.44it/s]
loss 2.52 accuracy 0.38 -- 56.89 + 56.51 + 498.68 + 4.77 = 616.85:  86%|████████▌ | 1752/2048 [21:39<03:30,  1.41it/s]
loss 1.58 accuracy 0.38 -- 55.84 + 57.02 + 495.04 + 4.77 = 612.68:  86%|████████▌ | 1752/2048 [21:39<03:30,  1.41it/s]
loss 1.58 accuracy 0.38 -- 55.84 + 57.02 + 495.04 + 4.77 = 612.68:  86%|████████▌ | 1753/2048 [21:39<03:23,  1.45it/s]
loss 1.76 accuracy 0.25 -- 157.70 + 56.99 + 489.21 + 4.77 = 708.67:  86%|████████▌ | 1753/2048 [21:40<03:23,  1.45it/s]
loss 1.76 accuracy 0.25 -- 157.70 + 56.99 + 489.21 + 4.77 = 708.67:  86%|████████▌ | 1754/2048 [21:40<03:26,  1.42it/s]
loss 1.59 accuracy 0.38 -- 55.96 + 166.35 + 502.01 + 4.78 = 729.10:  86%|████████▌ | 1754/2048 [21:41<03:26,  1.42it/s]
loss 1.59 accuracy 0.38 -- 55.96 + 166.35 + 502.01 + 4.78 = 729.10:  86%|████████▌ | 1755/2048 [21:41<03:30,  1.39it/s]
loss 1.63 accuracy 0.38 -- 56.50 + 56.50 + 496.74 + 4.80 = 614.55:  86%|████████▌ | 1755/2048 [21:41<03:30,  1.39it/s] 
loss 1.63 accuracy 0.38 -- 56.50 + 56.50 + 496.74 + 4.80 = 614.55:  86%|████████▌ | 1756/2048 [21:41<03:23,  1.44it/s]
loss 1.90 accuracy 0.12 -- 162.65 + 57.22 + 496.02 + 4.77 = 720.66:  86%|████████▌ | 1756/2048 [21:42<03:23,  1.44it/s]
loss 1.90 accuracy 0.12 -- 162.65 + 57.22 + 496.02 + 4.77 = 720.66:  86%|████████▌ | 1757/2048 [21:42<03:27,  1.41it/s]
loss 2.66 accuracy 0.25 -- 55.71 + 56.83 + 619.25 + 4.78 = 736.57:  86%|████████▌ | 1757/2048 [21:43<03:27,  1.41it/s] 
loss 2.66 accuracy 0.25 -- 55.71 + 56.83 + 619.25 + 4.78 = 736.57:  86%|████████▌ | 1758/2048 [21:43<03:30,  1.38it/s]
loss 1.98 accuracy 0.38 -- 56.87 + 56.49 + 505.05 + 4.77 = 623.18:  86%|████████▌ | 1758/2048 [21:44<03:30,  1.38it/s]
loss 1.98 accuracy 0.38 -- 56.87 + 56.49 + 505.05 + 4.77 = 623.18:  86%|████████▌ | 1759/2048 [21:44<03:23,  1.42it/s]
loss 1.70 accuracy 0.44 -- 55.94 + 57.23 + 615.31 + 4.78 = 733.25:  86%|████████▌ | 1759/2048 [21:44<03:23,  1.42it/s]
loss 1.70 accuracy 0.44 -- 55.94 + 57.23 + 615.31 + 4.78 = 733.25:  86%|████████▌ | 1760/2048 [21:44<03:27,  1.39it/s]
loss 1.60 accuracy 0.50 -- 57.12 + 56.75 + 501.41 + 4.77 = 620.06:  86%|████████▌ | 1760/2048 [21:45<03:27,  1.39it/s]
loss 1.60 accuracy 0.50 -- 57.12 + 56.75 + 501.41 + 4.77 = 620.06:  86%|████████▌ | 1761/2048 [21:45<03:20,  1.43it/s]
loss 1.77 accuracy 0.31 -- 56.17 + 166.09 + 501.96 + 4.77 = 729.00:  86%|████████▌ | 1761/2048 [21:46<03:20,  1.43it/s]
loss 1.77 accuracy 0.31 -- 56.17 + 166.09 + 501.96 + 4.77 = 729.00:  86%|████████▌ | 1762/2048 [21:46<03:24,  1.40it/s]
loss 2.09 accuracy 0.25 -- 56.09 + 55.95 + 498.50 + 4.76 = 615.31:  86%|████████▌ | 1762/2048 [21:46<03:24,  1.40it/s] 
loss 2.09 accuracy 0.25 -- 56.09 + 55.95 + 498.50 + 4.76 = 615.31:  86%|████████▌ | 1763/2048 [21:46<03:17,  1.44it/s]
loss 1.53 accuracy 0.38 -- 56.50 + 56.44 + 496.17 + 4.77 = 613.87:  86%|████████▌ | 1763/2048 [21:47<03:17,  1.44it/s]
loss 1.53 accuracy 0.38 -- 56.50 + 56.44 + 496.17 + 4.77 = 613.87:  86%|████████▌ | 1764/2048 [21:47<03:21,  1.41it/s]
loss 1.52 accuracy 0.50 -- 56.02 + 57.41 + 494.65 + 4.77 = 612.85:  86%|████████▌ | 1764/2048 [21:48<03:21,  1.41it/s]
loss 1.52 accuracy 0.50 -- 56.02 + 57.41 + 494.65 + 4.77 = 612.85:  86%|████████▌ | 1765/2048 [21:48<03:14,  1.45it/s]
loss 1.97 accuracy 0.19 -- 157.19 + 56.67 + 490.84 + 4.77 = 709.47:  86%|████████▌ | 1765/2048 [21:48<03:14,  1.45it/s]
loss 1.97 accuracy 0.19 -- 157.19 + 56.67 + 490.84 + 4.77 = 709.47:  86%|████████▌ | 1766/2048 [21:48<03:18,  1.42it/s]
loss 1.67 accuracy 0.56 -- 55.98 + 166.36 + 500.33 + 4.77 = 727.44:  86%|████████▌ | 1766/2048 [21:49<03:18,  1.42it/s]
loss 1.67 accuracy 0.56 -- 55.98 + 166.36 + 500.33 + 4.77 = 727.44:  86%|████████▋ | 1767/2048 [21:49<03:21,  1.39it/s]
loss 1.28 accuracy 0.62 -- 56.44 + 56.59 + 497.35 + 4.77 = 615.16:  86%|████████▋ | 1767/2048 [21:50<03:21,  1.39it/s] 
loss 1.28 accuracy 0.62 -- 56.44 + 56.59 + 497.35 + 4.77 = 615.16:  86%|████████▋ | 1768/2048 [21:50<03:14,  1.44it/s]
loss 1.53 accuracy 0.38 -- 162.07 + 56.99 + 496.86 + 4.76 = 720.68:  86%|████████▋ | 1768/2048 [21:51<03:14,  1.44it/s]
loss 1.53 accuracy 0.38 -- 162.07 + 56.99 + 496.86 + 4.76 = 720.68:  86%|████████▋ | 1769/2048 [21:51<03:18,  1.41it/s]
loss 1.74 accuracy 0.31 -- 56.03 + 57.10 + 619.37 + 4.78 = 737.29:  86%|████████▋ | 1769/2048 [21:51<03:18,  1.41it/s] 
loss 1.74 accuracy 0.31 -- 56.03 + 57.10 + 619.37 + 4.78 = 737.29:  86%|████████▋ | 1770/2048 [21:51<03:25,  1.36it/s]
loss 1.41 accuracy 0.38 -- 56.70 + 56.59 + 503.63 + 4.78 = 621.70:  86%|████████▋ | 1770/2048 [21:52<03:25,  1.36it/s]
loss 1.41 accuracy 0.38 -- 56.70 + 56.59 + 503.63 + 4.78 = 621.70:  86%|████████▋ | 1771/2048 [21:52<03:16,  1.41it/s]
loss 2.15 accuracy 0.38 -- 56.34 + 57.75 + 616.42 + 4.78 = 735.29:  86%|████████▋ | 1771/2048 [21:53<03:16,  1.41it/s]
loss 2.15 accuracy 0.38 -- 56.34 + 57.75 + 616.42 + 4.78 = 735.29:  87%|████████▋ | 1772/2048 [21:53<03:20,  1.38it/s]
loss 2.24 accuracy 0.06 -- 56.61 + 56.63 + 502.19 + 4.78 = 620.21:  87%|████████▋ | 1772/2048 [21:53<03:20,  1.38it/s]
loss 2.24 accuracy 0.06 -- 56.61 + 56.63 + 502.19 + 4.78 = 620.21:  87%|████████▋ | 1773/2048 [21:53<03:13,  1.42it/s]
loss 2.85 accuracy 0.12 -- 56.25 + 166.58 + 501.29 + 4.77 = 728.89:  87%|████████▋ | 1773/2048 [21:54<03:13,  1.42it/s]
loss 2.85 accuracy 0.12 -- 56.25 + 166.58 + 501.29 + 4.77 = 728.89:  87%|████████▋ | 1774/2048 [21:54<03:16,  1.39it/s]
loss 1.66 accuracy 0.38 -- 55.78 + 56.16 + 499.36 + 4.76 = 616.06:  87%|████████▋ | 1774/2048 [21:55<03:16,  1.39it/s] 
loss 1.66 accuracy 0.38 -- 55.78 + 56.16 + 499.36 + 4.76 = 616.06:  87%|████████▋ | 1775/2048 [21:55<03:09,  1.44it/s]
loss 1.74 accuracy 0.19 -- 56.53 + 56.47 + 497.09 + 4.82 = 614.91:  87%|████████▋ | 1775/2048 [21:56<03:09,  1.44it/s]
loss 1.74 accuracy 0.19 -- 56.53 + 56.47 + 497.09 + 4.82 = 614.91:  87%|████████▋ | 1776/2048 [21:56<03:13,  1.40it/s]
loss 2.12 accuracy 0.25 -- 56.04 + 57.28 + 495.75 + 4.77 = 613.84:  87%|████████▋ | 1776/2048 [21:56<03:13,  1.40it/s]
loss 2.12 accuracy 0.25 -- 56.04 + 57.28 + 495.75 + 4.77 = 613.84:  87%|████████▋ | 1777/2048 [21:56<03:07,  1.45it/s]
loss 2.00 accuracy 0.19 -- 158.10 + 56.80 + 489.14 + 4.76 = 708.80:  87%|████████▋ | 1777/2048 [21:57<03:07,  1.45it/s]
loss 2.00 accuracy 0.19 -- 158.10 + 56.80 + 489.14 + 4.76 = 708.80:  87%|████████▋ | 1778/2048 [21:57<03:10,  1.42it/s]
loss 1.85 accuracy 0.25 -- 55.64 + 166.59 + 502.03 + 4.78 = 729.03:  87%|████████▋ | 1778/2048 [21:58<03:10,  1.42it/s]
loss 1.85 accuracy 0.25 -- 55.64 + 166.59 + 502.03 + 4.78 = 729.03:  87%|████████▋ | 1779/2048 [21:58<03:13,  1.39it/s]
loss 1.43 accuracy 0.31 -- 56.89 + 56.85 + 498.64 + 4.77 = 617.15:  87%|████████▋ | 1779/2048 [21:58<03:13,  1.39it/s] 
loss 1.43 accuracy 0.31 -- 56.89 + 56.85 + 498.64 + 4.77 = 617.15:  87%|████████▋ | 1780/2048 [21:58<03:06,  1.43it/s]
loss 1.88 accuracy 0.31 -- 162.49 + 57.18 + 498.37 + 4.77 = 722.81:  87%|████████▋ | 1780/2048 [21:59<03:06,  1.43it/s]
loss 1.88 accuracy 0.31 -- 162.49 + 57.18 + 498.37 + 4.77 = 722.81:  87%|████████▋ | 1781/2048 [21:59<03:10,  1.40it/s]
loss 1.76 accuracy 0.38 -- 56.11 + 57.11 + 619.71 + 4.78 = 737.71:  87%|████████▋ | 1781/2048 [22:00<03:10,  1.40it/s] 
loss 1.76 accuracy 0.38 -- 56.11 + 57.11 + 619.71 + 4.78 = 737.71:  87%|████████▋ | 1782/2048 [22:00<03:13,  1.37it/s]
loss 2.45 accuracy 0.31 -- 56.79 + 56.65 + 506.34 + 4.77 = 624.55:  87%|████████▋ | 1782/2048 [22:01<03:13,  1.37it/s]
loss 2.45 accuracy 0.31 -- 56.79 + 56.65 + 506.34 + 4.77 = 624.55:  87%|████████▋ | 1783/2048 [22:01<03:06,  1.42it/s]
loss 1.21 accuracy 0.62 -- 56.09 + 57.18 + 614.21 + 4.78 = 732.26:  87%|████████▋ | 1783/2048 [22:01<03:06,  1.42it/s]
loss 1.21 accuracy 0.62 -- 56.09 + 57.18 + 614.21 + 4.78 = 732.26:  87%|████████▋ | 1784/2048 [22:01<03:10,  1.39it/s]
loss 2.00 accuracy 0.19 -- 56.68 + 56.74 + 501.06 + 4.76 = 619.23:  87%|████████▋ | 1784/2048 [22:02<03:10,  1.39it/s]
loss 2.00 accuracy 0.19 -- 56.68 + 56.74 + 501.06 + 4.76 = 619.23:  87%|████████▋ | 1785/2048 [22:02<03:03,  1.43it/s]
loss 1.68 accuracy 0.44 -- 56.31 + 166.41 + 502.55 + 4.78 = 730.05:  87%|████████▋ | 1785/2048 [22:03<03:03,  1.43it/s]
loss 1.68 accuracy 0.44 -- 56.31 + 166.41 + 502.55 + 4.78 = 730.05:  87%|████████▋ | 1786/2048 [22:03<03:07,  1.40it/s]
loss 1.86 accuracy 0.25 -- 56.30 + 56.48 + 499.19 + 4.77 = 616.74:  87%|████████▋ | 1786/2048 [22:03<03:07,  1.40it/s] 
loss 1.86 accuracy 0.25 -- 56.30 + 56.48 + 499.19 + 4.77 = 616.74:  87%|████████▋ | 1787/2048 [22:03<03:01,  1.44it/s]
loss 2.10 accuracy 0.31 -- 56.77 + 56.67 + 498.14 + 4.77 = 616.34:  87%|████████▋ | 1787/2048 [22:04<03:01,  1.44it/s]
loss 2.10 accuracy 0.31 -- 56.77 + 56.67 + 498.14 + 4.77 = 616.34:  87%|████████▋ | 1788/2048 [22:04<03:04,  1.41it/s]
loss 1.49 accuracy 0.50 -- 56.11 + 57.50 + 495.19 + 4.77 = 613.57:  87%|████████▋ | 1788/2048 [22:05<03:04,  1.41it/s]
loss 1.49 accuracy 0.50 -- 56.11 + 57.50 + 495.19 + 4.77 = 613.57:  87%|████████▋ | 1789/2048 [22:05<02:58,  1.45it/s]
loss 1.96 accuracy 0.31 -- 157.09 + 57.03 + 489.67 + 4.77 = 708.56:  87%|████████▋ | 1789/2048 [22:06<02:58,  1.45it/s]
loss 1.96 accuracy 0.31 -- 157.09 + 57.03 + 489.67 + 4.77 = 708.56:  87%|████████▋ | 1790/2048 [22:06<03:01,  1.42it/s]
loss 1.79 accuracy 0.25 -- 55.92 + 166.38 + 499.93 + 4.79 = 727.02:  87%|████████▋ | 1790/2048 [22:06<03:01,  1.42it/s]
loss 1.79 accuracy 0.25 -- 55.92 + 166.38 + 499.93 + 4.79 = 727.02:  87%|████████▋ | 1791/2048 [22:06<03:04,  1.39it/s]
loss 1.30 accuracy 0.62 -- 56.42 + 56.88 + 497.04 + 4.78 = 615.13:  87%|████████▋ | 1791/2048 [22:07<03:04,  1.39it/s] 
loss 1.30 accuracy 0.62 -- 56.42 + 56.88 + 497.04 + 4.78 = 615.13:  88%|████████▊ | 1792/2048 [22:07<02:58,  1.44it/s]
loss 1.81 accuracy 0.38 -- 162.22 + 56.86 + 496.88 + 4.76 = 720.72:  88%|████████▊ | 1792/2048 [22:08<02:58,  1.44it/s]
loss 1.81 accuracy 0.38 -- 162.22 + 56.86 + 496.88 + 4.76 = 720.72:  88%|████████▊ | 1793/2048 [22:08<03:04,  1.39it/s]
loss 1.93 accuracy 0.38 -- 55.84 + 57.33 + 621.74 + 4.78 = 739.68:  88%|████████▊ | 1793/2048 [22:08<03:04,  1.39it/s] 
loss 1.93 accuracy 0.38 -- 55.84 + 57.33 + 621.74 + 4.78 = 739.68:  88%|████████▊ | 1794/2048 [22:08<03:06,  1.36it/s]
loss 1.85 accuracy 0.31 -- 57.00 + 56.48 + 504.83 + 4.78 = 623.09:  88%|████████▊ | 1794/2048 [22:09<03:06,  1.36it/s]
loss 1.85 accuracy 0.31 -- 57.00 + 56.48 + 504.83 + 4.78 = 623.09:  88%|████████▊ | 1795/2048 [22:09<02:59,  1.41it/s]
loss 1.69 accuracy 0.38 -- 56.02 + 57.29 + 614.72 + 4.77 = 732.80:  88%|████████▊ | 1795/2048 [22:10<02:59,  1.41it/s]
loss 1.69 accuracy 0.38 -- 56.02 + 57.29 + 614.72 + 4.77 = 732.80:  88%|████████▊ | 1796/2048 [22:10<03:02,  1.38it/s]
loss 1.78 accuracy 0.50 -- 56.82 + 56.50 + 501.24 + 4.77 = 619.33:  88%|████████▊ | 1796/2048 [22:11<03:02,  1.38it/s]
loss 1.78 accuracy 0.50 -- 56.82 + 56.50 + 501.24 + 4.77 = 619.33:  88%|████████▊ | 1797/2048 [22:11<02:56,  1.43it/s]
loss 1.64 accuracy 0.31 -- 56.01 + 166.14 + 501.02 + 4.77 = 727.95:  88%|████████▊ | 1797/2048 [22:11<02:56,  1.43it/s]
loss 1.64 accuracy 0.31 -- 56.01 + 166.14 + 501.02 + 4.77 = 727.95:  88%|████████▊ | 1798/2048 [22:11<02:59,  1.39it/s]
loss 1.76 accuracy 0.44 -- 56.04 + 56.32 + 499.59 + 4.77 = 616.72:  88%|████████▊ | 1798/2048 [22:12<02:59,  1.39it/s] 
loss 1.76 accuracy 0.44 -- 56.04 + 56.32 + 499.59 + 4.77 = 616.72:  88%|████████▊ | 1799/2048 [22:12<02:53,  1.44it/s]
loss 2.25 accuracy 0.38 -- 56.60 + 56.95 + 498.98 + 4.79 = 617.32:  88%|████████▊ | 1799/2048 [22:13<02:53,  1.44it/s]
loss 2.25 accuracy 0.38 -- 56.60 + 56.95 + 498.98 + 4.79 = 617.32:  88%|████████▊ | 1800/2048 [22:13<02:59,  1.38it/s]
loss 1.96 accuracy 0.31 -- 55.96 + 57.42 + 494.79 + 4.76 = 612.92:  88%|████████▊ | 1800/2048 [22:13<02:59,  1.38it/s]
loss 1.96 accuracy 0.31 -- 55.96 + 57.42 + 494.79 + 4.76 = 612.92:  88%|████████▊ | 1801/2048 [22:13<02:52,  1.43it/s]
loss 1.53 accuracy 0.38 -- 157.60 + 57.07 + 489.13 + 4.76 = 708.56:  88%|████████▊ | 1801/2048 [22:14<02:52,  1.43it/s]
loss 1.53 accuracy 0.38 -- 157.60 + 57.07 + 489.13 + 4.76 = 708.56:  88%|████████▊ | 1802/2048 [22:14<02:54,  1.41it/s]
loss 1.54 accuracy 0.38 -- 55.69 + 166.92 + 500.72 + 4.79 = 728.12:  88%|████████▊ | 1802/2048 [22:15<02:54,  1.41it/s]
loss 1.54 accuracy 0.38 -- 55.69 + 166.92 + 500.72 + 4.79 = 728.12:  88%|████████▊ | 1803/2048 [22:15<02:57,  1.38it/s]
loss 2.07 accuracy 0.31 -- 56.79 + 56.31 + 496.73 + 4.77 = 614.60:  88%|████████▊ | 1803/2048 [22:15<02:57,  1.38it/s] 
loss 2.07 accuracy 0.31 -- 56.79 + 56.31 + 496.73 + 4.77 = 614.60:  88%|████████▊ | 1804/2048 [22:15<02:50,  1.43it/s]
loss 1.87 accuracy 0.38 -- 162.83 + 56.83 + 497.14 + 4.78 = 721.58:  88%|████████▊ | 1804/2048 [22:16<02:50,  1.43it/s]
loss 1.87 accuracy 0.38 -- 162.83 + 56.83 + 497.14 + 4.78 = 721.58:  88%|████████▊ | 1805/2048 [22:16<02:53,  1.40it/s]
loss 1.94 accuracy 0.12 -- 56.33 + 57.56 + 620.88 + 4.79 = 739.56:  88%|████████▊ | 1805/2048 [22:17<02:53,  1.40it/s] 
loss 1.94 accuracy 0.12 -- 56.33 + 57.56 + 620.88 + 4.79 = 739.56:  88%|████████▊ | 1806/2048 [22:17<02:56,  1.37it/s]
loss 1.89 accuracy 0.31 -- 56.83 + 56.49 + 504.31 + 4.76 = 622.39:  88%|████████▊ | 1806/2048 [22:18<02:56,  1.37it/s]
loss 1.89 accuracy 0.31 -- 56.83 + 56.49 + 504.31 + 4.76 = 622.39:  88%|████████▊ | 1807/2048 [22:18<02:52,  1.40it/s]
loss 1.70 accuracy 0.25 -- 56.52 + 57.22 + 614.54 + 4.79 = 733.06:  88%|████████▊ | 1807/2048 [22:18<02:52,  1.40it/s]
loss 1.70 accuracy 0.25 -- 56.52 + 57.22 + 614.54 + 4.79 = 733.06:  88%|████████▊ | 1808/2048 [22:18<02:54,  1.37it/s]
loss 1.51 accuracy 0.31 -- 57.04 + 56.86 + 503.60 + 4.79 = 622.30:  88%|████████▊ | 1808/2048 [22:19<02:54,  1.37it/s]
loss 1.51 accuracy 0.31 -- 57.04 + 56.86 + 503.60 + 4.79 = 622.30:  88%|████████▊ | 1809/2048 [22:19<02:48,  1.42it/s]
loss 2.17 accuracy 0.12 -- 56.26 + 166.12 + 502.58 + 4.80 = 729.76:  88%|████████▊ | 1809/2048 [22:20<02:48,  1.42it/s]
loss 2.17 accuracy 0.12 -- 56.26 + 166.12 + 502.58 + 4.80 = 729.76:  88%|████████▊ | 1810/2048 [22:20<02:51,  1.39it/s]
loss 1.82 accuracy 0.25 -- 56.05 + 56.16 + 500.96 + 4.77 = 617.94:  88%|████████▊ | 1810/2048 [22:20<02:51,  1.39it/s] 
loss 1.82 accuracy 0.25 -- 56.05 + 56.16 + 500.96 + 4.77 = 617.94:  88%|████████▊ | 1811/2048 [22:20<02:45,  1.43it/s]
loss 2.29 accuracy 0.31 -- 56.72 + 56.62 + 499.25 + 4.76 = 617.36:  88%|████████▊ | 1811/2048 [22:21<02:45,  1.43it/s]
loss 2.29 accuracy 0.31 -- 56.72 + 56.62 + 499.25 + 4.76 = 617.36:  88%|████████▊ | 1812/2048 [22:21<02:48,  1.40it/s]
loss 1.66 accuracy 0.31 -- 55.97 + 57.30 + 494.98 + 4.76 = 613.01:  88%|████████▊ | 1812/2048 [22:22<02:48,  1.40it/s]
loss 1.66 accuracy 0.31 -- 55.97 + 57.30 + 494.98 + 4.76 = 613.01:  89%|████████▊ | 1813/2048 [22:22<02:42,  1.45it/s]
loss 1.86 accuracy 0.19 -- 157.29 + 56.81 + 489.77 + 4.77 = 708.64:  89%|████████▊ | 1813/2048 [22:23<02:42,  1.45it/s]
loss 1.86 accuracy 0.19 -- 157.29 + 56.81 + 489.77 + 4.77 = 708.64:  89%|████████▊ | 1814/2048 [22:23<02:44,  1.42it/s]
loss 1.45 accuracy 0.50 -- 55.81 + 166.16 + 501.19 + 4.77 = 727.93:  89%|████████▊ | 1814/2048 [22:23<02:44,  1.42it/s]
loss 1.45 accuracy 0.50 -- 55.81 + 166.16 + 501.19 + 4.77 = 727.93:  89%|████████▊ | 1815/2048 [22:23<02:50,  1.37it/s]
loss 1.54 accuracy 0.44 -- 56.68 + 56.39 + 497.53 + 4.79 = 615.39:  89%|████████▊ | 1815/2048 [22:24<02:50,  1.37it/s] 
loss 1.54 accuracy 0.44 -- 56.68 + 56.39 + 497.53 + 4.79 = 615.39:  89%|████████▊ | 1816/2048 [22:24<02:43,  1.42it/s]
loss 2.04 accuracy 0.31 -- 162.28 + 57.25 + 496.72 + 4.76 = 721.01:  89%|████████▊ | 1816/2048 [22:25<02:43,  1.42it/s]
loss 2.04 accuracy 0.31 -- 162.28 + 57.25 + 496.72 + 4.76 = 721.01:  89%|████████▊ | 1817/2048 [22:25<02:45,  1.39it/s]
loss 1.38 accuracy 0.50 -- 56.33 + 57.10 + 620.27 + 4.77 = 738.47:  89%|████████▊ | 1817/2048 [22:26<02:45,  1.39it/s] 
loss 1.38 accuracy 0.50 -- 56.33 + 57.10 + 620.27 + 4.77 = 738.47:  89%|████████▉ | 1818/2048 [22:26<02:48,  1.37it/s]
loss 1.91 accuracy 0.38 -- 56.55 + 56.33 + 505.28 + 4.76 = 622.92:  89%|████████▉ | 1818/2048 [22:26<02:48,  1.37it/s]
loss 1.91 accuracy 0.38 -- 56.55 + 56.33 + 505.28 + 4.76 = 622.92:  89%|████████▉ | 1819/2048 [22:26<02:41,  1.41it/s]
loss 2.27 accuracy 0.19 -- 56.30 + 57.24 + 615.49 + 4.78 = 733.81:  89%|████████▉ | 1819/2048 [22:27<02:41,  1.41it/s]
loss 2.27 accuracy 0.19 -- 56.30 + 57.24 + 615.49 + 4.78 = 733.81:  89%|████████▉ | 1820/2048 [22:27<02:44,  1.38it/s]
loss 2.42 accuracy 0.12 -- 56.89 + 56.45 + 500.49 + 4.78 = 618.60:  89%|████████▉ | 1820/2048 [22:28<02:44,  1.38it/s]
loss 2.42 accuracy 0.12 -- 56.89 + 56.45 + 500.49 + 4.78 = 618.60:  89%|████████▉ | 1821/2048 [22:28<02:38,  1.43it/s]
loss 2.04 accuracy 0.38 -- 56.47 + 166.00 + 502.46 + 4.79 = 729.71:  89%|████████▉ | 1821/2048 [22:28<02:38,  1.43it/s]
loss 2.04 accuracy 0.38 -- 56.47 + 166.00 + 502.46 + 4.79 = 729.71:  89%|████████▉ | 1822/2048 [22:28<02:44,  1.37it/s]
loss 1.73 accuracy 0.44 -- 56.20 + 56.43 + 498.42 + 4.78 = 615.82:  89%|████████▉ | 1822/2048 [22:29<02:44,  1.37it/s] 
loss 1.73 accuracy 0.44 -- 56.20 + 56.43 + 498.42 + 4.78 = 615.82:  89%|████████▉ | 1823/2048 [22:29<02:37,  1.42it/s]
loss 1.76 accuracy 0.38 -- 56.76 + 56.49 + 498.49 + 4.79 = 616.54:  89%|████████▉ | 1823/2048 [22:30<02:37,  1.42it/s]
loss 1.76 accuracy 0.38 -- 56.76 + 56.49 + 498.49 + 4.79 = 616.54:  89%|████████▉ | 1824/2048 [22:30<02:40,  1.40it/s]
loss 1.88 accuracy 0.25 -- 56.00 + 57.31 + 495.47 + 4.78 = 613.57:  89%|████████▉ | 1824/2048 [22:30<02:40,  1.40it/s]
loss 1.88 accuracy 0.25 -- 56.00 + 57.31 + 495.47 + 4.78 = 613.57:  89%|████████▉ | 1825/2048 [22:30<02:34,  1.44it/s]
loss 1.60 accuracy 0.38 -- 157.83 + 56.80 + 490.37 + 4.76 = 709.76:  89%|████████▉ | 1825/2048 [22:31<02:34,  1.44it/s]
loss 1.60 accuracy 0.38 -- 157.83 + 56.80 + 490.37 + 4.76 = 709.76:  89%|████████▉ | 1826/2048 [22:31<02:36,  1.42it/s]
loss 1.79 accuracy 0.31 -- 55.65 + 166.52 + 502.12 + 4.76 = 729.06:  89%|████████▉ | 1826/2048 [22:32<02:36,  1.42it/s]
loss 1.79 accuracy 0.31 -- 55.65 + 166.52 + 502.12 + 4.76 = 729.06:  89%|████████▉ | 1827/2048 [22:32<02:39,  1.39it/s]
loss 2.11 accuracy 0.31 -- 56.32 + 56.35 + 499.37 + 4.77 = 616.81:  89%|████████▉ | 1827/2048 [22:33<02:39,  1.39it/s] 
loss 2.11 accuracy 0.31 -- 56.32 + 56.35 + 499.37 + 4.77 = 616.81:  89%|████████▉ | 1828/2048 [22:33<02:33,  1.43it/s]
loss 1.56 accuracy 0.56 -- 162.75 + 57.07 + 496.36 + 4.78 = 720.95:  89%|████████▉ | 1828/2048 [22:33<02:33,  1.43it/s]
loss 1.56 accuracy 0.56 -- 162.75 + 57.07 + 496.36 + 4.78 = 720.95:  89%|████████▉ | 1829/2048 [22:33<02:36,  1.40it/s]
loss 1.74 accuracy 0.38 -- 55.99 + 57.33 + 620.28 + 4.78 = 738.38:  89%|████████▉ | 1829/2048 [22:34<02:36,  1.40it/s] 
loss 1.74 accuracy 0.38 -- 55.99 + 57.33 + 620.28 + 4.78 = 738.38:  89%|████████▉ | 1830/2048 [22:34<02:38,  1.37it/s]
loss 1.67 accuracy 0.38 -- 56.92 + 56.48 + 504.03 + 4.78 = 622.21:  89%|████████▉ | 1830/2048 [22:35<02:38,  1.37it/s]
loss 1.67 accuracy 0.38 -- 56.92 + 56.48 + 504.03 + 4.78 = 622.21:  89%|████████▉ | 1831/2048 [22:35<02:32,  1.42it/s]
loss 1.85 accuracy 0.31 -- 55.99 + 57.08 + 615.74 + 4.78 = 733.59:  89%|████████▉ | 1831/2048 [22:35<02:32,  1.42it/s]
loss 1.85 accuracy 0.31 -- 55.99 + 57.08 + 615.74 + 4.78 = 733.59:  89%|████████▉ | 1832/2048 [22:35<02:35,  1.39it/s]
loss 1.81 accuracy 0.31 -- 56.69 + 56.60 + 502.39 + 4.76 = 620.44:  89%|████████▉ | 1832/2048 [22:36<02:35,  1.39it/s]
loss 1.81 accuracy 0.31 -- 56.69 + 56.60 + 502.39 + 4.76 = 620.44:  90%|████████▉ | 1833/2048 [22:36<02:30,  1.43it/s]
loss 1.66 accuracy 0.50 -- 56.54 + 166.51 + 501.35 + 4.78 = 729.18:  90%|████████▉ | 1833/2048 [22:37<02:30,  1.43it/s]
loss 1.66 accuracy 0.50 -- 56.54 + 166.51 + 501.35 + 4.78 = 729.18:  90%|████████▉ | 1834/2048 [22:37<02:33,  1.40it/s]
loss 2.19 accuracy 0.25 -- 56.02 + 56.53 + 498.61 + 4.77 = 615.94:  90%|████████▉ | 1834/2048 [22:38<02:33,  1.40it/s] 
loss 2.19 accuracy 0.25 -- 56.02 + 56.53 + 498.61 + 4.77 = 615.94:  90%|████████▉ | 1835/2048 [22:38<02:27,  1.44it/s]
loss 2.09 accuracy 0.38 -- 56.85 + 56.55 + 497.13 + 4.78 = 615.30:  90%|████████▉ | 1835/2048 [22:38<02:27,  1.44it/s]
loss 2.09 accuracy 0.38 -- 56.85 + 56.55 + 497.13 + 4.78 = 615.30:  90%|████████▉ | 1836/2048 [22:38<02:30,  1.41it/s]
loss 1.56 accuracy 0.56 -- 56.04 + 57.49 + 495.43 + 4.80 = 613.77:  90%|████████▉ | 1836/2048 [22:39<02:30,  1.41it/s]
loss 1.56 accuracy 0.56 -- 56.04 + 57.49 + 495.43 + 4.80 = 613.77:  90%|████████▉ | 1837/2048 [22:39<02:25,  1.45it/s]
loss 1.93 accuracy 0.38 -- 157.49 + 56.69 + 488.67 + 4.77 = 707.63:  90%|████████▉ | 1837/2048 [22:40<02:25,  1.45it/s]
loss 1.93 accuracy 0.38 -- 157.49 + 56.69 + 488.67 + 4.77 = 707.63:  90%|████████▉ | 1838/2048 [22:40<02:29,  1.40it/s]
loss 1.65 accuracy 0.31 -- 55.92 + 166.38 + 502.10 + 4.85 = 729.25:  90%|████████▉ | 1838/2048 [22:40<02:29,  1.40it/s]
loss 1.65 accuracy 0.31 -- 55.92 + 166.38 + 502.10 + 4.85 = 729.25:  90%|████████▉ | 1839/2048 [22:40<02:31,  1.38it/s]
loss 1.48 accuracy 0.56 -- 56.76 + 56.24 + 497.60 + 4.77 = 615.37:  90%|████████▉ | 1839/2048 [22:41<02:31,  1.38it/s] 
loss 1.48 accuracy 0.56 -- 56.76 + 56.24 + 497.60 + 4.77 = 615.37:  90%|████████▉ | 1840/2048 [22:41<02:25,  1.43it/s]
loss 2.01 accuracy 0.31 -- 162.24 + 57.29 + 497.24 + 4.80 = 721.57:  90%|████████▉ | 1840/2048 [22:42<02:25,  1.43it/s]
loss 2.01 accuracy 0.31 -- 162.24 + 57.29 + 497.24 + 4.80 = 721.57:  90%|████████▉ | 1841/2048 [22:42<02:28,  1.40it/s]
loss 2.13 accuracy 0.31 -- 56.05 + 57.12 + 619.42 + 4.79 = 737.39:  90%|████████▉ | 1841/2048 [22:43<02:28,  1.40it/s] 
loss 2.13 accuracy 0.31 -- 56.05 + 57.12 + 619.42 + 4.79 = 737.39:  90%|████████▉ | 1842/2048 [22:43<02:30,  1.37it/s]
loss 2.11 accuracy 0.38 -- 56.74 + 56.61 + 505.54 + 4.77 = 623.67:  90%|████████▉ | 1842/2048 [22:43<02:30,  1.37it/s]
loss 2.11 accuracy 0.38 -- 56.74 + 56.61 + 505.54 + 4.77 = 623.67:  90%|████████▉ | 1843/2048 [22:43<02:24,  1.42it/s]
loss 1.58 accuracy 0.38 -- 56.33 + 57.43 + 615.01 + 4.78 = 733.55:  90%|████████▉ | 1843/2048 [22:44<02:24,  1.42it/s]
loss 1.58 accuracy 0.38 -- 56.33 + 57.43 + 615.01 + 4.78 = 733.55:  90%|█████████ | 1844/2048 [22:44<02:27,  1.38it/s]
loss 1.83 accuracy 0.31 -- 56.66 + 56.74 + 502.44 + 4.79 = 620.62:  90%|█████████ | 1844/2048 [22:45<02:27,  1.38it/s]
loss 1.83 accuracy 0.31 -- 56.66 + 56.74 + 502.44 + 4.79 = 620.62:  90%|█████████ | 1845/2048 [22:45<02:24,  1.41it/s]
loss 1.27 accuracy 0.56 -- 56.32 + 166.21 + 500.79 + 4.78 = 728.09:  90%|█████████ | 1845/2048 [22:45<02:24,  1.41it/s]
loss 1.27 accuracy 0.56 -- 56.32 + 166.21 + 500.79 + 4.78 = 728.09:  90%|█████████ | 1846/2048 [22:45<02:26,  1.38it/s]
loss 1.91 accuracy 0.19 -- 56.25 + 56.76 + 498.54 + 4.78 = 616.32:  90%|█████████ | 1846/2048 [22:46<02:26,  1.38it/s] 
loss 1.91 accuracy 0.19 -- 56.25 + 56.76 + 498.54 + 4.78 = 616.32:  90%|█████████ | 1847/2048 [22:46<02:20,  1.43it/s]
loss 1.50 accuracy 0.38 -- 56.67 + 56.47 + 499.78 + 4.77 = 617.69:  90%|█████████ | 1847/2048 [22:47<02:20,  1.43it/s]
loss 1.50 accuracy 0.38 -- 56.67 + 56.47 + 499.78 + 4.77 = 617.69:  90%|█████████ | 1848/2048 [22:47<02:23,  1.40it/s]
loss 1.53 accuracy 0.50 -- 56.13 + 57.06 + 494.63 + 4.76 = 612.59:  90%|█████████ | 1848/2048 [22:47<02:23,  1.40it/s]
loss 1.53 accuracy 0.50 -- 56.13 + 57.06 + 494.63 + 4.76 = 612.59:  90%|█████████ | 1849/2048 [22:47<02:17,  1.44it/s]
loss 1.66 accuracy 0.25 -- 156.93 + 57.20 + 490.14 + 4.78 = 709.04:  90%|█████████ | 1849/2048 [22:48<02:17,  1.44it/s]
loss 1.66 accuracy 0.25 -- 156.93 + 57.20 + 490.14 + 4.78 = 709.04:  90%|█████████ | 1850/2048 [22:48<02:19,  1.42it/s]
loss 1.98 accuracy 0.25 -- 56.15 + 166.76 + 501.81 + 4.78 = 729.49:  90%|█████████ | 1850/2048 [22:49<02:19,  1.42it/s]
loss 1.98 accuracy 0.25 -- 56.15 + 166.76 + 501.81 + 4.78 = 729.49:  90%|█████████ | 1851/2048 [22:49<02:21,  1.39it/s]
loss 1.85 accuracy 0.25 -- 56.61 + 56.58 + 497.13 + 4.77 = 615.09:  90%|█████████ | 1851/2048 [22:50<02:21,  1.39it/s] 
loss 1.85 accuracy 0.25 -- 56.61 + 56.58 + 497.13 + 4.77 = 615.09:  90%|█████████ | 1852/2048 [22:50<02:16,  1.43it/s]
loss 1.80 accuracy 0.50 -- 162.26 + 57.29 + 496.68 + 4.78 = 721.01:  90%|█████████ | 1852/2048 [22:50<02:16,  1.43it/s]
loss 1.80 accuracy 0.50 -- 162.26 + 57.29 + 496.68 + 4.78 = 721.01:  90%|█████████ | 1853/2048 [22:50<02:18,  1.40it/s]
loss 2.31 accuracy 0.25 -- 56.49 + 57.19 + 620.70 + 4.77 = 739.15:  90%|█████████ | 1853/2048 [22:51<02:18,  1.40it/s] 
loss 2.31 accuracy 0.25 -- 56.49 + 57.19 + 620.70 + 4.77 = 739.15:  91%|█████████ | 1854/2048 [22:51<02:21,  1.37it/s]
loss 1.67 accuracy 0.44 -- 56.34 + 56.14 + 504.24 + 4.77 = 621.49:  91%|█████████ | 1854/2048 [22:52<02:21,  1.37it/s]
loss 1.67 accuracy 0.44 -- 56.34 + 56.14 + 504.24 + 4.77 = 621.49:  91%|█████████ | 1855/2048 [22:52<02:15,  1.42it/s]
loss 1.64 accuracy 0.25 -- 56.65 + 57.22 + 616.54 + 4.77 = 735.19:  91%|█████████ | 1855/2048 [22:53<02:15,  1.42it/s]
loss 1.64 accuracy 0.25 -- 56.65 + 57.22 + 616.54 + 4.77 = 735.19:  91%|█████████ | 1856/2048 [22:53<02:18,  1.39it/s]
loss 1.59 accuracy 0.44 -- 56.68 + 56.47 + 501.91 + 4.78 = 619.83:  91%|█████████ | 1856/2048 [22:53<02:18,  1.39it/s]
loss 1.59 accuracy 0.44 -- 56.68 + 56.47 + 501.91 + 4.78 = 619.83:  91%|█████████ | 1857/2048 [22:53<02:13,  1.43it/s]
loss 1.33 accuracy 0.50 -- 56.29 + 166.30 + 502.02 + 4.79 = 729.40:  91%|█████████ | 1857/2048 [22:54<02:13,  1.43it/s]
loss 1.33 accuracy 0.50 -- 56.29 + 166.30 + 502.02 + 4.79 = 729.40:  91%|█████████ | 1858/2048 [22:54<02:16,  1.40it/s]
loss 2.36 accuracy 0.31 -- 55.87 + 56.26 + 498.44 + 4.75 = 615.32:  91%|█████████ | 1858/2048 [22:55<02:16,  1.40it/s] 
loss 2.36 accuracy 0.31 -- 55.87 + 56.26 + 498.44 + 4.75 = 615.32:  91%|█████████ | 1859/2048 [22:55<02:11,  1.44it/s]
loss 1.46 accuracy 0.44 -- 56.74 + 56.32 + 499.38 + 4.78 = 617.22:  91%|█████████ | 1859/2048 [22:55<02:11,  1.44it/s]
loss 1.46 accuracy 0.44 -- 56.74 + 56.32 + 499.38 + 4.78 = 617.22:  91%|█████████ | 1860/2048 [22:55<02:13,  1.41it/s]
loss 2.09 accuracy 0.31 -- 56.07 + 57.36 + 494.31 + 4.78 = 612.51:  91%|█████████ | 1860/2048 [22:56<02:13,  1.41it/s]
loss 2.09 accuracy 0.31 -- 56.07 + 57.36 + 494.31 + 4.78 = 612.51:  91%|█████████ | 1861/2048 [22:56<02:08,  1.45it/s]
loss 1.65 accuracy 0.38 -- 157.49 + 57.00 + 490.85 + 4.77 = 710.10:  91%|█████████ | 1861/2048 [22:57<02:08,  1.45it/s]
loss 1.65 accuracy 0.38 -- 157.49 + 57.00 + 490.85 + 4.77 = 710.10:  91%|█████████ | 1862/2048 [22:57<02:10,  1.42it/s]
loss 1.29 accuracy 0.44 -- 55.97 + 166.28 + 501.80 + 4.79 = 728.84:  91%|█████████ | 1862/2048 [22:57<02:10,  1.42it/s]
loss 1.29 accuracy 0.44 -- 55.97 + 166.28 + 501.80 + 4.79 = 728.84:  91%|█████████ | 1863/2048 [22:57<02:13,  1.39it/s]
loss 2.09 accuracy 0.19 -- 56.75 + 56.74 + 498.15 + 4.78 = 616.42:  91%|█████████ | 1863/2048 [22:58<02:13,  1.39it/s] 
loss 2.09 accuracy 0.19 -- 56.75 + 56.74 + 498.15 + 4.78 = 616.42:  91%|█████████ | 1864/2048 [22:58<02:08,  1.44it/s]
loss 1.54 accuracy 0.38 -- 162.01 + 57.28 + 497.18 + 4.79 = 721.26:  91%|█████████ | 1864/2048 [22:59<02:08,  1.44it/s]
loss 1.54 accuracy 0.38 -- 162.01 + 57.28 + 497.18 + 4.79 = 721.26:  91%|█████████ | 1865/2048 [22:59<02:10,  1.40it/s]
loss 1.48 accuracy 0.44 -- 56.26 + 57.27 + 620.08 + 4.78 = 738.39:  91%|█████████ | 1865/2048 [23:00<02:10,  1.40it/s] 
loss 1.48 accuracy 0.44 -- 56.26 + 57.27 + 620.08 + 4.78 = 738.39:  91%|█████████ | 1866/2048 [23:00<02:12,  1.37it/s]
loss 1.87 accuracy 0.12 -- 56.71 + 56.80 + 505.46 + 4.78 = 623.75:  91%|█████████ | 1866/2048 [23:00<02:12,  1.37it/s]
loss 1.87 accuracy 0.12 -- 56.71 + 56.80 + 505.46 + 4.78 = 623.75:  91%|█████████ | 1867/2048 [23:00<02:07,  1.42it/s]
loss 1.66 accuracy 0.38 -- 56.59 + 57.62 + 617.13 + 4.78 = 736.12:  91%|█████████ | 1867/2048 [23:01<02:07,  1.42it/s]
loss 1.66 accuracy 0.38 -- 56.59 + 57.62 + 617.13 + 4.78 = 736.12:  91%|█████████ | 1868/2048 [23:01<02:10,  1.38it/s]
loss 1.53 accuracy 0.44 -- 56.65 + 56.70 + 501.35 + 4.78 = 619.47:  91%|█████████ | 1868/2048 [23:02<02:10,  1.38it/s]
loss 1.53 accuracy 0.44 -- 56.65 + 56.70 + 501.35 + 4.78 = 619.47:  91%|█████████▏| 1869/2048 [23:02<02:05,  1.43it/s]
loss 2.12 accuracy 0.19 -- 56.09 + 166.29 + 501.16 + 4.82 = 728.36:  91%|█████████▏| 1869/2048 [23:02<02:05,  1.43it/s]
loss 2.12 accuracy 0.19 -- 56.09 + 166.29 + 501.16 + 4.82 = 728.36:  91%|█████████▏| 1870/2048 [23:02<02:07,  1.40it/s]
loss 1.87 accuracy 0.19 -- 56.38 + 56.56 + 499.10 + 4.76 = 616.80:  91%|█████████▏| 1870/2048 [23:03<02:07,  1.40it/s] 
loss 1.87 accuracy 0.19 -- 56.38 + 56.56 + 499.10 + 4.76 = 616.80:  91%|█████████▏| 1871/2048 [23:03<02:02,  1.44it/s]
loss 1.38 accuracy 0.44 -- 56.50 + 56.26 + 498.25 + 4.77 = 615.77:  91%|█████████▏| 1871/2048 [23:04<02:02,  1.44it/s]
loss 1.38 accuracy 0.44 -- 56.50 + 56.26 + 498.25 + 4.77 = 615.77:  91%|█████████▏| 1872/2048 [23:04<02:05,  1.41it/s]
loss 2.05 accuracy 0.31 -- 55.99 + 57.08 + 496.43 + 4.79 = 614.28:  91%|█████████▏| 1872/2048 [23:04<02:05,  1.41it/s]
loss 2.05 accuracy 0.31 -- 55.99 + 57.08 + 496.43 + 4.79 = 614.28:  91%|█████████▏| 1873/2048 [23:04<02:00,  1.45it/s]
loss 1.59 accuracy 0.50 -- 157.46 + 56.96 + 489.00 + 4.76 = 708.18:  91%|█████████▏| 1873/2048 [23:05<02:00,  1.45it/s]
loss 1.59 accuracy 0.50 -- 157.46 + 56.96 + 489.00 + 4.76 = 708.18:  92%|█████████▏| 1874/2048 [23:05<02:02,  1.42it/s]
loss 1.86 accuracy 0.19 -- 55.65 + 166.39 + 500.72 + 4.76 = 727.53:  92%|█████████▏| 1874/2048 [23:06<02:02,  1.42it/s]
loss 1.86 accuracy 0.19 -- 55.65 + 166.39 + 500.72 + 4.76 = 727.53:  92%|█████████▏| 1875/2048 [23:06<02:04,  1.39it/s]
loss 1.55 accuracy 0.38 -- 56.44 + 56.23 + 496.90 + 4.86 = 614.42:  92%|█████████▏| 1875/2048 [23:07<02:04,  1.39it/s] 
loss 1.55 accuracy 0.38 -- 56.44 + 56.23 + 496.90 + 4.86 = 614.42:  92%|█████████▏| 1876/2048 [23:07<01:59,  1.44it/s]
loss 1.68 accuracy 0.44 -- 162.16 + 56.86 + 498.17 + 4.78 = 721.96:  92%|█████████▏| 1876/2048 [23:07<01:59,  1.44it/s]
loss 1.68 accuracy 0.44 -- 162.16 + 56.86 + 498.17 + 4.78 = 721.96:  92%|█████████▏| 1877/2048 [23:07<02:01,  1.40it/s]
loss 1.65 accuracy 0.50 -- 56.16 + 57.12 + 620.32 + 4.77 = 738.36:  92%|█████████▏| 1877/2048 [23:08<02:01,  1.40it/s] 
loss 1.65 accuracy 0.50 -- 56.16 + 57.12 + 620.32 + 4.77 = 738.36:  92%|█████████▏| 1878/2048 [23:08<02:03,  1.37it/s]
loss 2.03 accuracy 0.19 -- 56.96 + 56.64 + 505.58 + 4.78 = 623.96:  92%|█████████▏| 1878/2048 [23:09<02:03,  1.37it/s]
loss 2.03 accuracy 0.19 -- 56.96 + 56.64 + 505.58 + 4.78 = 623.96:  92%|█████████▏| 1879/2048 [23:09<01:59,  1.42it/s]
loss 1.64 accuracy 0.31 -- 56.09 + 56.98 + 615.48 + 4.78 = 733.33:  92%|█████████▏| 1879/2048 [23:10<01:59,  1.42it/s]
loss 1.64 accuracy 0.31 -- 56.09 + 56.98 + 615.48 + 4.78 = 733.33:  92%|█████████▏| 1880/2048 [23:10<02:01,  1.39it/s]
loss 1.35 accuracy 0.44 -- 56.80 + 56.57 + 502.02 + 4.76 = 620.15:  92%|█████████▏| 1880/2048 [23:10<02:01,  1.39it/s]
loss 1.35 accuracy 0.44 -- 56.80 + 56.57 + 502.02 + 4.76 = 620.15:  92%|█████████▏| 1881/2048 [23:10<01:56,  1.43it/s]
loss 1.52 accuracy 0.50 -- 55.91 + 165.85 + 501.72 + 4.80 = 728.29:  92%|█████████▏| 1881/2048 [23:11<01:56,  1.43it/s]
loss 1.52 accuracy 0.50 -- 55.91 + 165.85 + 501.72 + 4.80 = 728.29:  92%|█████████▏| 1882/2048 [23:11<01:58,  1.40it/s]
loss 1.67 accuracy 0.38 -- 56.39 + 56.32 + 499.78 + 4.76 = 617.25:  92%|█████████▏| 1882/2048 [23:12<01:58,  1.40it/s] 
loss 1.67 accuracy 0.38 -- 56.39 + 56.32 + 499.78 + 4.76 = 617.25:  92%|█████████▏| 1883/2048 [23:12<01:54,  1.44it/s]
loss 2.12 accuracy 0.19 -- 56.90 + 56.74 + 499.23 + 4.77 = 617.63:  92%|█████████▏| 1883/2048 [23:12<01:54,  1.44it/s]
loss 2.12 accuracy 0.19 -- 56.90 + 56.74 + 499.23 + 4.77 = 617.63:  92%|█████████▏| 1884/2048 [23:12<01:56,  1.41it/s]
loss 1.39 accuracy 0.50 -- 56.40 + 57.76 + 494.85 + 4.77 = 613.78:  92%|█████████▏| 1884/2048 [23:13<01:56,  1.41it/s]
loss 1.39 accuracy 0.50 -- 56.40 + 57.76 + 494.85 + 4.77 = 613.78:  92%|█████████▏| 1885/2048 [23:13<01:52,  1.45it/s]
loss 1.69 accuracy 0.31 -- 157.18 + 56.86 + 490.16 + 4.76 = 708.97:  92%|█████████▏| 1885/2048 [23:14<01:52,  1.45it/s]
loss 1.69 accuracy 0.31 -- 157.18 + 56.86 + 490.16 + 4.76 = 708.97:  92%|█████████▏| 1886/2048 [23:14<01:54,  1.42it/s]
loss 1.70 accuracy 0.56 -- 56.01 + 166.90 + 500.62 + 4.79 = 728.31:  92%|█████████▏| 1886/2048 [23:14<01:54,  1.42it/s]
loss 1.70 accuracy 0.56 -- 56.01 + 166.90 + 500.62 + 4.79 = 728.31:  92%|█████████▏| 1887/2048 [23:14<01:55,  1.39it/s]
loss 1.83 accuracy 0.38 -- 56.49 + 56.67 + 497.44 + 4.77 = 615.37:  92%|█████████▏| 1887/2048 [23:15<01:55,  1.39it/s] 
loss 1.83 accuracy 0.38 -- 56.49 + 56.67 + 497.44 + 4.77 = 615.37:  92%|█████████▏| 1888/2048 [23:15<01:51,  1.44it/s]
loss 1.42 accuracy 0.31 -- 162.19 + 56.99 + 496.43 + 4.79 = 720.40:  92%|█████████▏| 1888/2048 [23:16<01:51,  1.44it/s]
loss 1.42 accuracy 0.31 -- 162.19 + 56.99 + 496.43 + 4.79 = 720.40:  92%|█████████▏| 1889/2048 [23:16<01:53,  1.41it/s]
loss 1.71 accuracy 0.25 -- 56.23 + 57.53 + 622.44 + 4.78 = 740.99:  92%|█████████▏| 1889/2048 [23:17<01:53,  1.41it/s] 
loss 1.71 accuracy 0.25 -- 56.23 + 57.53 + 622.44 + 4.78 = 740.99:  92%|█████████▏| 1890/2048 [23:17<01:55,  1.37it/s]
loss 1.53 accuracy 0.31 -- 56.73 + 56.21 + 504.62 + 4.76 = 622.31:  92%|█████████▏| 1890/2048 [23:17<01:55,  1.37it/s]
loss 1.53 accuracy 0.31 -- 56.73 + 56.21 + 504.62 + 4.76 = 622.31:  92%|█████████▏| 1891/2048 [23:17<01:50,  1.42it/s]
loss 2.01 accuracy 0.19 -- 56.03 + 57.48 + 615.69 + 4.77 = 733.99:  92%|█████████▏| 1891/2048 [23:18<01:50,  1.42it/s]
loss 2.01 accuracy 0.19 -- 56.03 + 57.48 + 615.69 + 4.77 = 733.99:  92%|█████████▏| 1892/2048 [23:18<01:52,  1.39it/s]
loss 1.32 accuracy 0.50 -- 56.63 + 56.57 + 500.82 + 4.78 = 618.80:  92%|█████████▏| 1892/2048 [23:19<01:52,  1.39it/s]
loss 1.32 accuracy 0.50 -- 56.63 + 56.57 + 500.82 + 4.78 = 618.80:  92%|█████████▏| 1893/2048 [23:19<01:49,  1.41it/s]
loss 1.45 accuracy 0.50 -- 56.19 + 166.06 + 501.25 + 4.78 = 728.28:  92%|█████████▏| 1893/2048 [23:19<01:49,  1.41it/s]
loss 1.45 accuracy 0.50 -- 56.19 + 166.06 + 501.25 + 4.78 = 728.28:  92%|█████████▏| 1894/2048 [23:19<01:51,  1.38it/s]
loss 1.99 accuracy 0.56 -- 56.46 + 56.44 + 498.93 + 4.76 = 616.58:  92%|█████████▏| 1894/2048 [23:20<01:51,  1.38it/s] 
loss 1.99 accuracy 0.56 -- 56.46 + 56.44 + 498.93 + 4.76 = 616.58:  93%|█████████▎| 1895/2048 [23:20<01:46,  1.43it/s]
loss 1.50 accuracy 0.31 -- 56.75 + 56.74 + 498.27 + 4.77 = 616.54:  93%|█████████▎| 1895/2048 [23:21<01:46,  1.43it/s]
loss 1.50 accuracy 0.31 -- 56.75 + 56.74 + 498.27 + 4.77 = 616.54:  93%|█████████▎| 1896/2048 [23:21<01:48,  1.40it/s]
loss 1.73 accuracy 0.25 -- 56.11 + 57.32 + 495.11 + 4.76 = 613.30:  93%|█████████▎| 1896/2048 [23:22<01:48,  1.40it/s]
loss 1.73 accuracy 0.25 -- 56.11 + 57.32 + 495.11 + 4.76 = 613.30:  93%|█████████▎| 1897/2048 [23:22<01:44,  1.44it/s]
loss 1.47 accuracy 0.31 -- 157.14 + 56.82 + 489.79 + 4.77 = 708.52:  93%|█████████▎| 1897/2048 [23:22<01:44,  1.44it/s]
loss 1.47 accuracy 0.31 -- 157.14 + 56.82 + 489.79 + 4.77 = 708.52:  93%|█████████▎| 1898/2048 [23:22<01:45,  1.42it/s]
loss 1.63 accuracy 0.44 -- 55.80 + 166.87 + 500.35 + 4.78 = 727.80:  93%|█████████▎| 1898/2048 [23:23<01:45,  1.42it/s]
loss 1.63 accuracy 0.44 -- 55.80 + 166.87 + 500.35 + 4.78 = 727.80:  93%|█████████▎| 1899/2048 [23:23<01:47,  1.39it/s]
loss 1.58 accuracy 0.38 -- 56.86 + 56.44 + 497.50 + 4.81 = 615.61:  93%|█████████▎| 1899/2048 [23:24<01:47,  1.39it/s] 
loss 1.58 accuracy 0.38 -- 56.86 + 56.44 + 497.50 + 4.81 = 615.61:  93%|█████████▎| 1900/2048 [23:24<01:43,  1.44it/s]
loss 1.66 accuracy 0.38 -- 162.35 + 57.17 + 496.87 + 4.79 = 721.19:  93%|█████████▎| 1900/2048 [23:24<01:43,  1.44it/s]
loss 1.66 accuracy 0.38 -- 162.35 + 57.17 + 496.87 + 4.79 = 721.19:  93%|█████████▎| 1901/2048 [23:24<01:44,  1.40it/s]
loss 2.17 accuracy 0.19 -- 56.39 + 56.98 + 618.91 + 4.80 = 737.07:  93%|█████████▎| 1901/2048 [23:25<01:44,  1.40it/s] 
loss 2.17 accuracy 0.19 -- 56.39 + 56.98 + 618.91 + 4.80 = 737.07:  93%|█████████▎| 1902/2048 [23:25<01:46,  1.37it/s]
loss 1.29 accuracy 0.50 -- 56.80 + 56.28 + 505.15 + 4.76 = 622.98:  93%|█████████▎| 1902/2048 [23:26<01:46,  1.37it/s]
loss 1.29 accuracy 0.50 -- 56.80 + 56.28 + 505.15 + 4.76 = 622.98:  93%|█████████▎| 1903/2048 [23:26<01:42,  1.42it/s]
loss 1.71 accuracy 0.56 -- 56.02 + 57.12 + 616.93 + 4.79 = 734.85:  93%|█████████▎| 1903/2048 [23:27<01:42,  1.42it/s]
loss 1.71 accuracy 0.56 -- 56.02 + 57.12 + 616.93 + 4.79 = 734.85:  93%|█████████▎| 1904/2048 [23:27<01:43,  1.39it/s]
loss 1.78 accuracy 0.38 -- 56.83 + 56.80 + 502.72 + 4.78 = 621.13:  93%|█████████▎| 1904/2048 [23:27<01:43,  1.39it/s]
loss 1.78 accuracy 0.38 -- 56.83 + 56.80 + 502.72 + 4.78 = 621.13:  93%|█████████▎| 1905/2048 [23:27<01:40,  1.43it/s]
loss 1.79 accuracy 0.31 -- 56.20 + 166.05 + 501.29 + 4.79 = 728.33:  93%|█████████▎| 1905/2048 [23:28<01:40,  1.43it/s]
loss 1.79 accuracy 0.31 -- 56.20 + 166.05 + 501.29 + 4.79 = 728.33:  93%|█████████▎| 1906/2048 [23:28<01:41,  1.40it/s]
loss 1.68 accuracy 0.19 -- 56.49 + 56.46 + 500.27 + 4.77 = 617.99:  93%|█████████▎| 1906/2048 [23:29<01:41,  1.40it/s] 
loss 1.68 accuracy 0.19 -- 56.49 + 56.46 + 500.27 + 4.77 = 617.99:  93%|█████████▎| 1907/2048 [23:29<01:37,  1.44it/s]
loss 1.49 accuracy 0.44 -- 56.48 + 56.42 + 497.29 + 4.79 = 614.98:  93%|█████████▎| 1907/2048 [23:29<01:37,  1.44it/s]
loss 1.49 accuracy 0.44 -- 56.48 + 56.42 + 497.29 + 4.79 = 614.98:  93%|█████████▎| 1908/2048 [23:29<01:39,  1.41it/s]
loss 1.56 accuracy 0.38 -- 55.79 + 57.45 + 494.80 + 4.77 = 612.80:  93%|█████████▎| 1908/2048 [23:30<01:39,  1.41it/s]
loss 1.56 accuracy 0.38 -- 55.79 + 57.45 + 494.80 + 4.77 = 612.80:  93%|█████████▎| 1909/2048 [23:30<01:37,  1.43it/s]
loss 1.55 accuracy 0.44 -- 157.60 + 56.73 + 488.66 + 4.76 = 707.74:  93%|█████████▎| 1909/2048 [23:31<01:37,  1.43it/s]
loss 1.55 accuracy 0.44 -- 157.60 + 56.73 + 488.66 + 4.76 = 707.74:  93%|█████████▎| 1910/2048 [23:31<01:38,  1.41it/s]
loss 1.45 accuracy 0.50 -- 55.50 + 166.04 + 500.31 + 4.76 = 726.61:  93%|█████████▎| 1910/2048 [23:32<01:38,  1.41it/s]
loss 1.45 accuracy 0.50 -- 55.50 + 166.04 + 500.31 + 4.76 = 726.61:  93%|█████████▎| 1911/2048 [23:32<01:39,  1.38it/s]
loss 1.80 accuracy 0.31 -- 56.17 + 56.42 + 497.51 + 4.81 = 614.91:  93%|█████████▎| 1911/2048 [23:32<01:39,  1.38it/s] 
loss 1.80 accuracy 0.31 -- 56.17 + 56.42 + 497.51 + 4.81 = 614.91:  93%|█████████▎| 1912/2048 [23:32<01:35,  1.43it/s]
loss 1.70 accuracy 0.31 -- 163.43 + 57.26 + 496.59 + 4.78 = 722.06:  93%|█████████▎| 1912/2048 [23:33<01:35,  1.43it/s]
loss 1.70 accuracy 0.31 -- 163.43 + 57.26 + 496.59 + 4.78 = 722.06:  93%|█████████▎| 1913/2048 [23:33<01:36,  1.40it/s]
loss 1.98 accuracy 0.12 -- 56.12 + 57.38 + 619.92 + 4.79 = 738.22:  93%|█████████▎| 1913/2048 [23:34<01:36,  1.40it/s] 
loss 1.98 accuracy 0.12 -- 56.12 + 57.38 + 619.92 + 4.79 = 738.22:  93%|█████████▎| 1914/2048 [23:34<01:37,  1.37it/s]
loss 1.71 accuracy 0.38 -- 56.87 + 56.50 + 505.39 + 4.82 = 623.57:  93%|█████████▎| 1914/2048 [23:34<01:37,  1.37it/s]
loss 1.71 accuracy 0.38 -- 56.87 + 56.50 + 505.39 + 4.82 = 623.57:  94%|█████████▎| 1915/2048 [23:34<01:33,  1.42it/s]
loss 2.05 accuracy 0.12 -- 55.98 + 56.93 + 615.62 + 4.78 = 733.31:  94%|█████████▎| 1915/2048 [23:35<01:33,  1.42it/s]
loss 2.05 accuracy 0.12 -- 55.98 + 56.93 + 615.62 + 4.78 = 733.31:  94%|█████████▎| 1916/2048 [23:35<01:35,  1.39it/s]
loss 1.55 accuracy 0.44 -- 56.71 + 56.78 + 501.05 + 4.77 = 619.32:  94%|█████████▎| 1916/2048 [23:36<01:35,  1.39it/s]
loss 1.55 accuracy 0.44 -- 56.71 + 56.78 + 501.05 + 4.77 = 619.32:  94%|█████████▎| 1917/2048 [23:36<01:31,  1.43it/s]
loss 1.91 accuracy 0.19 -- 56.01 + 165.96 + 503.64 + 4.78 = 730.40:  94%|█████████▎| 1917/2048 [23:36<01:31,  1.43it/s]
loss 1.91 accuracy 0.19 -- 56.01 + 165.96 + 503.64 + 4.78 = 730.40:  94%|█████████▎| 1918/2048 [23:36<01:33,  1.40it/s]
loss 1.40 accuracy 0.38 -- 56.21 + 56.31 + 499.72 + 4.76 = 617.00:  94%|█████████▎| 1918/2048 [23:37<01:33,  1.40it/s] 
loss 1.40 accuracy 0.38 -- 56.21 + 56.31 + 499.72 + 4.76 = 617.00:  94%|█████████▎| 1919/2048 [23:37<01:29,  1.44it/s]
loss 1.57 accuracy 0.38 -- 56.83 + 56.74 + 498.58 + 4.78 = 616.93:  94%|█████████▎| 1919/2048 [23:38<01:29,  1.44it/s]
loss 1.57 accuracy 0.38 -- 56.83 + 56.74 + 498.58 + 4.78 = 616.93:  94%|█████████▍| 1920/2048 [23:38<01:31,  1.41it/s]
loss 1.94 accuracy 0.38 -- 56.03 + 57.11 + 495.75 + 4.80 = 613.70:  94%|█████████▍| 1920/2048 [23:39<01:31,  1.41it/s]
loss 1.94 accuracy 0.38 -- 56.03 + 57.11 + 495.75 + 4.80 = 613.70:  94%|█████████▍| 1921/2048 [23:39<01:27,  1.45it/s]
loss 1.62 accuracy 0.25 -- 156.96 + 57.24 + 489.00 + 4.77 = 707.97:  94%|█████████▍| 1921/2048 [23:39<01:27,  1.45it/s]
loss 1.62 accuracy 0.25 -- 156.96 + 57.24 + 489.00 + 4.77 = 707.97:  94%|█████████▍| 1922/2048 [23:39<01:28,  1.42it/s]
loss 1.72 accuracy 0.25 -- 55.93 + 166.35 + 501.53 + 4.81 = 728.61:  94%|█████████▍| 1922/2048 [23:40<01:28,  1.42it/s]
loss 1.72 accuracy 0.25 -- 55.93 + 166.35 + 501.53 + 4.81 = 728.61:  94%|█████████▍| 1923/2048 [23:40<01:29,  1.39it/s]
loss 1.86 accuracy 0.44 -- 56.61 + 57.00 + 497.90 + 4.78 = 616.29:  94%|█████████▍| 1923/2048 [23:41<01:29,  1.39it/s] 
loss 1.86 accuracy 0.44 -- 56.61 + 57.00 + 497.90 + 4.78 = 616.29:  94%|█████████▍| 1924/2048 [23:41<01:26,  1.44it/s]
loss 1.81 accuracy 0.25 -- 162.77 + 56.91 + 497.24 + 4.77 = 721.69:  94%|█████████▍| 1924/2048 [23:41<01:26,  1.44it/s]
loss 1.81 accuracy 0.25 -- 162.77 + 56.91 + 497.24 + 4.77 = 721.69:  94%|█████████▍| 1925/2048 [23:41<01:27,  1.40it/s]
loss 1.51 accuracy 0.19 -- 56.20 + 57.18 + 620.94 + 4.78 = 739.09:  94%|█████████▍| 1925/2048 [23:42<01:27,  1.40it/s] 
loss 1.51 accuracy 0.19 -- 56.20 + 57.18 + 620.94 + 4.78 = 739.09:  94%|█████████▍| 1926/2048 [23:42<01:28,  1.37it/s]
loss 1.61 accuracy 0.44 -- 56.59 + 56.40 + 505.06 + 4.77 = 622.82:  94%|█████████▍| 1926/2048 [23:43<01:28,  1.37it/s]
loss 1.61 accuracy 0.44 -- 56.59 + 56.40 + 505.06 + 4.77 = 622.82:  94%|█████████▍| 1927/2048 [23:43<01:25,  1.42it/s]
loss 1.59 accuracy 0.31 -- 55.97 + 57.22 + 616.41 + 4.79 = 734.38:  94%|█████████▍| 1927/2048 [23:44<01:25,  1.42it/s]
loss 1.59 accuracy 0.31 -- 55.97 + 57.22 + 616.41 + 4.79 = 734.38:  94%|█████████▍| 1928/2048 [23:44<01:26,  1.39it/s]
loss 2.30 accuracy 0.31 -- 56.51 + 56.37 + 502.28 + 4.79 = 619.95:  94%|█████████▍| 1928/2048 [23:44<01:26,  1.39it/s]
loss 2.30 accuracy 0.31 -- 56.51 + 56.37 + 502.28 + 4.79 = 619.95:  94%|█████████▍| 1929/2048 [23:44<01:23,  1.43it/s]
loss 1.54 accuracy 0.50 -- 56.46 + 166.28 + 500.17 + 4.77 = 727.68:  94%|█████████▍| 1929/2048 [23:45<01:23,  1.43it/s]
loss 1.54 accuracy 0.50 -- 56.46 + 166.28 + 500.17 + 4.77 = 727.68:  94%|█████████▍| 1930/2048 [23:45<01:24,  1.40it/s]
loss 1.56 accuracy 0.44 -- 56.05 + 56.74 + 498.90 + 4.78 = 616.47:  94%|█████████▍| 1930/2048 [23:46<01:24,  1.40it/s] 
loss 1.56 accuracy 0.44 -- 56.05 + 56.74 + 498.90 + 4.78 = 616.47:  94%|█████████▍| 1931/2048 [23:46<01:21,  1.44it/s]
loss 1.85 accuracy 0.44 -- 56.64 + 56.35 + 498.60 + 4.77 = 616.36:  94%|█████████▍| 1931/2048 [23:46<01:21,  1.44it/s]
loss 1.85 accuracy 0.44 -- 56.64 + 56.35 + 498.60 + 4.77 = 616.36:  94%|█████████▍| 1932/2048 [23:46<01:22,  1.41it/s]
loss 2.34 accuracy 0.25 -- 56.16 + 57.23 + 494.89 + 4.79 = 613.06:  94%|█████████▍| 1932/2048 [23:47<01:22,  1.41it/s]
loss 2.34 accuracy 0.25 -- 56.16 + 57.23 + 494.89 + 4.79 = 613.06:  94%|█████████▍| 1933/2048 [23:47<01:20,  1.43it/s]
loss 1.97 accuracy 0.12 -- 158.05 + 56.83 + 489.65 + 4.77 = 709.30:  94%|█████████▍| 1933/2048 [23:48<01:20,  1.43it/s]
loss 1.97 accuracy 0.12 -- 158.05 + 56.83 + 489.65 + 4.77 = 709.30:  94%|█████████▍| 1934/2048 [23:48<01:21,  1.41it/s]
loss 1.72 accuracy 0.44 -- 56.01 + 166.59 + 502.26 + 4.76 = 729.63:  94%|█████████▍| 1934/2048 [23:49<01:21,  1.41it/s]
loss 1.72 accuracy 0.44 -- 56.01 + 166.59 + 502.26 + 4.76 = 729.63:  94%|█████████▍| 1935/2048 [23:49<01:21,  1.38it/s]
loss 1.51 accuracy 0.44 -- 56.53 + 56.30 + 497.17 + 4.78 = 614.78:  94%|█████████▍| 1935/2048 [23:49<01:21,  1.38it/s] 
loss 1.51 accuracy 0.44 -- 56.53 + 56.30 + 497.17 + 4.78 = 614.78:  95%|█████████▍| 1936/2048 [23:49<01:18,  1.43it/s]
loss 1.56 accuracy 0.31 -- 162.40 + 57.20 + 497.17 + 4.77 = 721.54:  95%|█████████▍| 1936/2048 [23:50<01:18,  1.43it/s]
loss 1.56 accuracy 0.31 -- 162.40 + 57.20 + 497.17 + 4.77 = 721.54:  95%|█████████▍| 1937/2048 [23:50<01:19,  1.40it/s]
loss 1.93 accuracy 0.31 -- 56.17 + 57.19 + 620.67 + 4.77 = 738.80:  95%|█████████▍| 1937/2048 [23:51<01:19,  1.40it/s] 
loss 1.93 accuracy 0.31 -- 56.17 + 57.19 + 620.67 + 4.77 = 738.80:  95%|█████████▍| 1938/2048 [23:51<01:20,  1.37it/s]
loss 1.58 accuracy 0.38 -- 57.07 + 56.68 + 506.53 + 4.77 = 625.05:  95%|█████████▍| 1938/2048 [23:51<01:20,  1.37it/s]
loss 1.58 accuracy 0.38 -- 57.07 + 56.68 + 506.53 + 4.77 = 625.05:  95%|█████████▍| 1939/2048 [23:51<01:16,  1.42it/s]
loss 1.86 accuracy 0.25 -- 56.22 + 57.30 + 616.67 + 4.78 = 734.98:  95%|█████████▍| 1939/2048 [23:52<01:16,  1.42it/s]
loss 1.86 accuracy 0.25 -- 56.22 + 57.30 + 616.67 + 4.78 = 734.98:  95%|█████████▍| 1940/2048 [23:52<01:18,  1.38it/s]
loss 2.07 accuracy 0.31 -- 57.45 + 57.01 + 502.78 + 4.85 = 622.09:  95%|█████████▍| 1940/2048 [23:53<01:18,  1.38it/s]
loss 2.07 accuracy 0.31 -- 57.45 + 57.01 + 502.78 + 4.85 = 622.09:  95%|█████████▍| 1941/2048 [23:53<01:14,  1.43it/s]
loss 1.61 accuracy 0.38 -- 56.24 + 166.46 + 502.37 + 4.78 = 729.85:  95%|█████████▍| 1941/2048 [23:54<01:14,  1.43it/s]
loss 1.61 accuracy 0.38 -- 56.24 + 166.46 + 502.37 + 4.78 = 729.85:  95%|█████████▍| 1942/2048 [23:54<01:16,  1.39it/s]
loss 1.71 accuracy 0.38 -- 56.17 + 56.28 + 498.94 + 4.79 = 616.18:  95%|█████████▍| 1942/2048 [23:54<01:16,  1.39it/s] 
loss 1.71 accuracy 0.38 -- 56.17 + 56.28 + 498.94 + 4.79 = 616.18:  95%|█████████▍| 1943/2048 [23:54<01:12,  1.44it/s]
loss 1.77 accuracy 0.25 -- 56.81 + 56.48 + 497.99 + 4.80 = 616.08:  95%|█████████▍| 1943/2048 [23:55<01:12,  1.44it/s]
loss 1.77 accuracy 0.25 -- 56.81 + 56.48 + 497.99 + 4.80 = 616.08:  95%|█████████▍| 1944/2048 [23:55<01:14,  1.41it/s]
loss 1.95 accuracy 0.50 -- 55.88 + 57.15 + 495.12 + 4.81 = 612.95:  95%|█████████▍| 1944/2048 [23:56<01:14,  1.41it/s]
loss 1.95 accuracy 0.50 -- 55.88 + 57.15 + 495.12 + 4.81 = 612.95:  95%|█████████▍| 1945/2048 [23:56<01:11,  1.45it/s]
loss 1.50 accuracy 0.31 -- 157.94 + 56.70 + 490.52 + 4.80 = 709.96:  95%|█████████▍| 1945/2048 [23:56<01:11,  1.45it/s]
loss 1.50 accuracy 0.31 -- 157.94 + 56.70 + 490.52 + 4.80 = 709.96:  95%|█████████▌| 1946/2048 [23:56<01:11,  1.42it/s]
loss 1.57 accuracy 0.44 -- 56.14 + 167.10 + 502.77 + 4.78 = 730.78:  95%|█████████▌| 1946/2048 [23:57<01:11,  1.42it/s]
loss 1.57 accuracy 0.44 -- 56.14 + 167.10 + 502.77 + 4.78 = 730.78:  95%|█████████▌| 1947/2048 [23:57<01:12,  1.39it/s]
loss 1.76 accuracy 0.31 -- 56.53 + 56.62 + 498.15 + 4.78 = 616.08:  95%|█████████▌| 1947/2048 [23:58<01:12,  1.39it/s] 
loss 1.76 accuracy 0.31 -- 56.53 + 56.62 + 498.15 + 4.78 = 616.08:  95%|█████████▌| 1948/2048 [23:58<01:09,  1.43it/s]
loss 2.18 accuracy 0.25 -- 162.31 + 56.88 + 497.33 + 4.76 = 721.28:  95%|█████████▌| 1948/2048 [23:58<01:09,  1.43it/s]
loss 2.18 accuracy 0.25 -- 162.31 + 56.88 + 497.33 + 4.76 = 721.28:  95%|█████████▌| 1949/2048 [23:58<01:10,  1.40it/s]
loss 1.63 accuracy 0.25 -- 55.85 + 57.18 + 618.98 + 4.78 = 736.79:  95%|█████████▌| 1949/2048 [23:59<01:10,  1.40it/s] 
loss 1.63 accuracy 0.25 -- 55.85 + 57.18 + 618.98 + 4.78 = 736.79:  95%|█████████▌| 1950/2048 [23:59<01:11,  1.37it/s]
loss 1.88 accuracy 0.38 -- 56.76 + 56.71 + 504.03 + 4.80 = 622.30:  95%|█████████▌| 1950/2048 [24:00<01:11,  1.37it/s]
loss 1.88 accuracy 0.38 -- 56.76 + 56.71 + 504.03 + 4.80 = 622.30:  95%|█████████▌| 1951/2048 [24:00<01:08,  1.42it/s]
loss 1.56 accuracy 0.38 -- 55.99 + 57.17 + 617.36 + 4.77 = 735.29:  95%|█████████▌| 1951/2048 [24:01<01:08,  1.42it/s]
loss 1.56 accuracy 0.38 -- 55.99 + 57.17 + 617.36 + 4.77 = 735.29:  95%|█████████▌| 1952/2048 [24:01<01:09,  1.39it/s]
loss 2.12 accuracy 0.25 -- 56.39 + 56.25 + 500.92 + 4.76 = 618.32:  95%|█████████▌| 1952/2048 [24:01<01:09,  1.39it/s]
loss 2.12 accuracy 0.25 -- 56.39 + 56.25 + 500.92 + 4.76 = 618.32:  95%|█████████▌| 1953/2048 [24:01<01:06,  1.43it/s]
loss 1.71 accuracy 0.38 -- 56.11 + 166.02 + 500.83 + 4.82 = 727.78:  95%|█████████▌| 1953/2048 [24:02<01:06,  1.43it/s]
loss 1.71 accuracy 0.38 -- 56.11 + 166.02 + 500.83 + 4.82 = 727.78:  95%|█████████▌| 1954/2048 [24:02<01:07,  1.40it/s]
loss 1.58 accuracy 0.38 -- 56.08 + 56.39 + 498.51 + 4.77 = 615.75:  95%|█████████▌| 1954/2048 [24:03<01:07,  1.40it/s] 
loss 1.58 accuracy 0.38 -- 56.08 + 56.39 + 498.51 + 4.77 = 615.75:  95%|█████████▌| 1955/2048 [24:03<01:04,  1.44it/s]
loss 1.59 accuracy 0.44 -- 56.56 + 56.49 + 497.88 + 4.78 = 615.71:  95%|█████████▌| 1955/2048 [24:03<01:04,  1.44it/s]
loss 1.59 accuracy 0.44 -- 56.56 + 56.49 + 497.88 + 4.78 = 615.71:  96%|█████████▌| 1956/2048 [24:03<01:05,  1.41it/s]
loss 1.88 accuracy 0.44 -- 56.08 + 57.50 + 495.64 + 4.81 = 614.04:  96%|█████████▌| 1956/2048 [24:04<01:05,  1.41it/s]
loss 1.88 accuracy 0.44 -- 56.08 + 57.50 + 495.64 + 4.81 = 614.04:  96%|█████████▌| 1957/2048 [24:04<01:03,  1.43it/s]
loss 1.77 accuracy 0.19 -- 157.37 + 57.17 + 490.24 + 4.76 = 709.54:  96%|█████████▌| 1957/2048 [24:05<01:03,  1.43it/s]
loss 1.77 accuracy 0.19 -- 157.37 + 57.17 + 490.24 + 4.76 = 709.54:  96%|█████████▌| 1958/2048 [24:05<01:03,  1.41it/s]
loss 1.67 accuracy 0.50 -- 55.89 + 166.48 + 501.74 + 4.78 = 728.89:  96%|█████████▌| 1958/2048 [24:06<01:03,  1.41it/s]
loss 1.67 accuracy 0.50 -- 55.89 + 166.48 + 501.74 + 4.78 = 728.89:  96%|█████████▌| 1959/2048 [24:06<01:04,  1.38it/s]
loss 1.67 accuracy 0.44 -- 56.54 + 56.08 + 498.17 + 4.77 = 615.55:  96%|█████████▌| 1959/2048 [24:06<01:04,  1.38it/s] 
loss 1.67 accuracy 0.44 -- 56.54 + 56.08 + 498.17 + 4.77 = 615.55:  96%|█████████▌| 1960/2048 [24:06<01:01,  1.43it/s]
loss 1.77 accuracy 0.38 -- 162.83 + 56.77 + 496.27 + 4.81 = 720.68:  96%|█████████▌| 1960/2048 [24:07<01:01,  1.43it/s]
loss 1.77 accuracy 0.38 -- 162.83 + 56.77 + 496.27 + 4.81 = 720.68:  96%|█████████▌| 1961/2048 [24:07<01:02,  1.40it/s]
loss 1.73 accuracy 0.50 -- 56.25 + 57.34 + 619.59 + 4.77 = 737.95:  96%|█████████▌| 1961/2048 [24:08<01:02,  1.40it/s] 
loss 1.73 accuracy 0.50 -- 56.25 + 57.34 + 619.59 + 4.77 = 737.95:  96%|█████████▌| 1962/2048 [24:08<01:02,  1.37it/s]
loss 2.39 accuracy 0.12 -- 56.56 + 56.53 + 506.81 + 4.76 = 624.65:  96%|█████████▌| 1962/2048 [24:08<01:02,  1.37it/s]
loss 2.39 accuracy 0.12 -- 56.56 + 56.53 + 506.81 + 4.76 = 624.65:  96%|█████████▌| 1963/2048 [24:08<01:00,  1.42it/s]
loss 1.77 accuracy 0.38 -- 56.38 + 57.03 + 615.52 + 4.78 = 733.70:  96%|█████████▌| 1963/2048 [24:09<01:00,  1.42it/s]
loss 1.77 accuracy 0.38 -- 56.38 + 57.03 + 615.52 + 4.78 = 733.70:  96%|█████████▌| 1964/2048 [24:09<01:00,  1.38it/s]
loss 1.94 accuracy 0.31 -- 56.63 + 56.61 + 501.49 + 4.79 = 619.52:  96%|█████████▌| 1964/2048 [24:10<01:00,  1.38it/s]
loss 1.94 accuracy 0.31 -- 56.63 + 56.61 + 501.49 + 4.79 = 619.52:  96%|█████████▌| 1965/2048 [24:10<00:58,  1.41it/s]
loss 1.76 accuracy 0.38 -- 56.04 + 166.16 + 501.56 + 4.78 = 728.53:  96%|█████████▌| 1965/2048 [24:11<00:58,  1.41it/s]
loss 1.76 accuracy 0.38 -- 56.04 + 166.16 + 501.56 + 4.78 = 728.53:  96%|█████████▌| 1966/2048 [24:11<00:59,  1.38it/s]
loss 1.94 accuracy 0.12 -- 56.03 + 56.52 + 499.38 + 4.79 = 616.72:  96%|█████████▌| 1966/2048 [24:11<00:59,  1.38it/s] 
loss 1.94 accuracy 0.12 -- 56.03 + 56.52 + 499.38 + 4.79 = 616.72:  96%|█████████▌| 1967/2048 [24:11<00:56,  1.43it/s]
loss 1.29 accuracy 0.69 -- 56.83 + 56.39 + 498.90 + 4.78 = 616.90:  96%|█████████▌| 1967/2048 [24:12<00:56,  1.43it/s]
loss 1.29 accuracy 0.69 -- 56.83 + 56.39 + 498.90 + 4.78 = 616.90:  96%|█████████▌| 1968/2048 [24:12<00:57,  1.40it/s]
loss 1.70 accuracy 0.31 -- 56.16 + 57.22 + 498.71 + 4.78 = 616.86:  96%|█████████▌| 1968/2048 [24:13<00:57,  1.40it/s]
loss 1.70 accuracy 0.31 -- 56.16 + 57.22 + 498.71 + 4.78 = 616.86:  96%|█████████▌| 1969/2048 [24:13<00:54,  1.44it/s]
loss 2.07 accuracy 0.19 -- 158.12 + 57.03 + 491.39 + 4.78 = 711.31:  96%|█████████▌| 1969/2048 [24:13<00:54,  1.44it/s]
loss 2.07 accuracy 0.19 -- 158.12 + 57.03 + 491.39 + 4.78 = 711.31:  96%|█████████▌| 1970/2048 [24:13<00:55,  1.41it/s]
loss 2.17 accuracy 0.12 -- 55.92 + 166.04 + 502.01 + 4.77 = 728.75:  96%|█████████▌| 1970/2048 [24:14<00:55,  1.41it/s]
loss 2.17 accuracy 0.12 -- 55.92 + 166.04 + 502.01 + 4.77 = 728.75:  96%|█████████▌| 1971/2048 [24:14<00:55,  1.39it/s]
loss 1.28 accuracy 0.69 -- 56.70 + 56.50 + 497.82 + 4.76 = 615.79:  96%|█████████▌| 1971/2048 [24:15<00:55,  1.39it/s] 
loss 1.28 accuracy 0.69 -- 56.70 + 56.50 + 497.82 + 4.76 = 615.79:  96%|█████████▋| 1972/2048 [24:15<00:53,  1.43it/s]
loss 1.77 accuracy 0.25 -- 162.25 + 57.01 + 496.86 + 4.77 = 720.88:  96%|█████████▋| 1972/2048 [24:16<00:53,  1.43it/s]
loss 1.77 accuracy 0.25 -- 162.25 + 57.01 + 496.86 + 4.77 = 720.88:  96%|█████████▋| 1973/2048 [24:16<00:54,  1.38it/s]
loss 1.80 accuracy 0.31 -- 55.89 + 57.05 + 620.75 + 4.80 = 738.50:  96%|█████████▋| 1973/2048 [24:16<00:54,  1.38it/s] 
loss 1.80 accuracy 0.31 -- 55.89 + 57.05 + 620.75 + 4.80 = 738.50:  96%|█████████▋| 1974/2048 [24:16<00:54,  1.36it/s]
loss 2.15 accuracy 0.25 -- 57.07 + 56.90 + 505.86 + 4.77 = 624.60:  96%|█████████▋| 1974/2048 [24:17<00:54,  1.36it/s]
loss 2.15 accuracy 0.25 -- 57.07 + 56.90 + 505.86 + 4.77 = 624.60:  96%|█████████▋| 1975/2048 [24:17<00:51,  1.41it/s]
loss 1.86 accuracy 0.25 -- 56.19 + 57.25 + 616.42 + 4.78 = 734.64:  96%|█████████▋| 1975/2048 [24:18<00:51,  1.41it/s]
loss 1.86 accuracy 0.25 -- 56.19 + 57.25 + 616.42 + 4.78 = 734.64:  96%|█████████▋| 1976/2048 [24:18<00:52,  1.38it/s]
loss 1.58 accuracy 0.31 -- 56.71 + 56.60 + 502.06 + 4.76 = 620.14:  96%|█████████▋| 1976/2048 [24:18<00:52,  1.38it/s]
loss 1.58 accuracy 0.31 -- 56.71 + 56.60 + 502.06 + 4.76 = 620.14:  97%|█████████▋| 1977/2048 [24:18<00:49,  1.42it/s]
loss 1.43 accuracy 0.38 -- 56.05 + 165.99 + 501.88 + 4.78 = 728.70:  97%|█████████▋| 1977/2048 [24:19<00:49,  1.42it/s]
loss 1.43 accuracy 0.38 -- 56.05 + 165.99 + 501.88 + 4.78 = 728.70:  97%|█████████▋| 1978/2048 [24:19<00:50,  1.39it/s]
loss 1.52 accuracy 0.56 -- 56.18 + 56.74 + 499.49 + 4.76 = 617.17:  97%|█████████▋| 1978/2048 [24:20<00:50,  1.39it/s] 
loss 1.52 accuracy 0.56 -- 56.18 + 56.74 + 499.49 + 4.76 = 617.17:  97%|█████████▋| 1979/2048 [24:20<00:48,  1.44it/s]
loss 1.64 accuracy 0.31 -- 56.68 + 56.69 + 499.48 + 4.78 = 617.62:  97%|█████████▋| 1979/2048 [24:21<00:48,  1.44it/s]
loss 1.64 accuracy 0.31 -- 56.68 + 56.69 + 499.48 + 4.78 = 617.62:  97%|█████████▋| 1980/2048 [24:21<00:48,  1.40it/s]
loss 1.33 accuracy 0.56 -- 55.91 + 57.13 + 495.66 + 4.78 = 613.48:  97%|█████████▋| 1980/2048 [24:21<00:48,  1.40it/s]
loss 1.33 accuracy 0.56 -- 55.91 + 57.13 + 495.66 + 4.78 = 613.48:  97%|█████████▋| 1981/2048 [24:21<00:46,  1.45it/s]
loss 1.84 accuracy 0.38 -- 158.02 + 56.95 + 488.92 + 4.77 = 708.66:  97%|█████████▋| 1981/2048 [24:22<00:46,  1.45it/s]
loss 1.84 accuracy 0.38 -- 158.02 + 56.95 + 488.92 + 4.77 = 708.66:  97%|█████████▋| 1982/2048 [24:22<00:46,  1.42it/s]
loss 2.22 accuracy 0.38 -- 55.91 + 166.38 + 502.58 + 4.78 = 729.65:  97%|█████████▋| 1982/2048 [24:23<00:46,  1.42it/s]
loss 2.22 accuracy 0.38 -- 55.91 + 166.38 + 502.58 + 4.78 = 729.65:  97%|█████████▋| 1983/2048 [24:23<00:46,  1.39it/s]
loss 1.66 accuracy 0.44 -- 56.52 + 56.29 + 498.61 + 4.77 = 616.19:  97%|█████████▋| 1983/2048 [24:23<00:46,  1.39it/s] 
loss 1.66 accuracy 0.44 -- 56.52 + 56.29 + 498.61 + 4.77 = 616.19:  97%|█████████▋| 1984/2048 [24:23<00:44,  1.43it/s]
loss 2.49 accuracy 0.19 -- 162.95 + 56.96 + 496.99 + 4.79 = 721.69:  97%|█████████▋| 1984/2048 [24:24<00:44,  1.43it/s]
loss 2.49 accuracy 0.19 -- 162.95 + 56.96 + 496.99 + 4.79 = 721.69:  97%|█████████▋| 1985/2048 [24:24<00:44,  1.40it/s]
loss 1.57 accuracy 0.19 -- 56.20 + 57.63 + 621.01 + 4.80 = 739.63:  97%|█████████▋| 1985/2048 [24:25<00:44,  1.40it/s] 
loss 1.57 accuracy 0.19 -- 56.20 + 57.63 + 621.01 + 4.80 = 739.63:  97%|█████████▋| 1986/2048 [24:25<00:45,  1.37it/s]
loss 1.46 accuracy 0.31 -- 56.74 + 56.50 + 504.57 + 4.78 = 622.59:  97%|█████████▋| 1986/2048 [24:25<00:45,  1.37it/s]
loss 1.46 accuracy 0.31 -- 56.74 + 56.50 + 504.57 + 4.78 = 622.59:  97%|█████████▋| 1987/2048 [24:25<00:43,  1.42it/s]
loss 1.56 accuracy 0.56 -- 56.20 + 57.09 + 615.49 + 4.78 = 733.55:  97%|█████████▋| 1987/2048 [24:26<00:43,  1.42it/s]
loss 1.56 accuracy 0.56 -- 56.20 + 57.09 + 615.49 + 4.78 = 733.55:  97%|█████████▋| 1988/2048 [24:26<00:43,  1.39it/s]
loss 1.72 accuracy 0.31 -- 56.64 + 56.42 + 501.84 + 4.78 = 619.68:  97%|█████████▋| 1988/2048 [24:27<00:43,  1.39it/s]
loss 1.72 accuracy 0.31 -- 56.64 + 56.42 + 501.84 + 4.78 = 619.68:  97%|█████████▋| 1989/2048 [24:27<00:41,  1.43it/s]
loss 2.08 accuracy 0.19 -- 56.17 + 166.04 + 501.76 + 4.78 = 728.74:  97%|█████████▋| 1989/2048 [24:28<00:41,  1.43it/s]
loss 2.08 accuracy 0.19 -- 56.17 + 166.04 + 501.76 + 4.78 = 728.74:  97%|█████████▋| 1990/2048 [24:28<00:41,  1.40it/s]
loss 1.61 accuracy 0.31 -- 56.26 + 56.41 + 498.92 + 4.79 = 616.38:  97%|█████████▋| 1990/2048 [24:28<00:41,  1.40it/s] 
loss 1.61 accuracy 0.31 -- 56.26 + 56.41 + 498.92 + 4.79 = 616.38:  97%|█████████▋| 1991/2048 [24:28<00:39,  1.44it/s]
loss 1.66 accuracy 0.38 -- 57.30 + 56.50 + 498.46 + 4.79 = 617.04:  97%|█████████▋| 1991/2048 [24:29<00:39,  1.44it/s]
loss 1.66 accuracy 0.38 -- 57.30 + 56.50 + 498.46 + 4.79 = 617.04:  97%|█████████▋| 1992/2048 [24:29<00:39,  1.41it/s]
loss 2.12 accuracy 0.25 -- 56.29 + 57.77 + 496.10 + 4.79 = 614.95:  97%|█████████▋| 1992/2048 [24:30<00:39,  1.41it/s]
loss 2.12 accuracy 0.25 -- 56.29 + 57.77 + 496.10 + 4.79 = 614.95:  97%|█████████▋| 1993/2048 [24:30<00:37,  1.45it/s]
loss 1.92 accuracy 0.50 -- 157.71 + 57.14 + 489.33 + 4.76 = 708.94:  97%|█████████▋| 1993/2048 [24:30<00:37,  1.45it/s]
loss 1.92 accuracy 0.50 -- 157.71 + 57.14 + 489.33 + 4.76 = 708.94:  97%|█████████▋| 1994/2048 [24:30<00:38,  1.42it/s]
loss 1.68 accuracy 0.75 -- 55.79 + 166.28 + 501.62 + 4.77 = 728.46:  97%|█████████▋| 1994/2048 [24:31<00:38,  1.42it/s]
loss 1.68 accuracy 0.75 -- 55.79 + 166.28 + 501.62 + 4.77 = 728.46:  97%|█████████▋| 1995/2048 [24:31<00:38,  1.39it/s]
loss 1.84 accuracy 0.31 -- 56.59 + 56.41 + 498.12 + 4.76 = 615.89:  97%|█████████▋| 1995/2048 [24:32<00:38,  1.39it/s] 
loss 1.84 accuracy 0.31 -- 56.59 + 56.41 + 498.12 + 4.76 = 615.89:  97%|█████████▋| 1996/2048 [24:32<00:36,  1.44it/s]
loss 1.52 accuracy 0.44 -- 162.06 + 57.05 + 498.47 + 4.78 = 722.35:  97%|█████████▋| 1996/2048 [24:33<00:36,  1.44it/s]
loss 1.52 accuracy 0.44 -- 162.06 + 57.05 + 498.47 + 4.78 = 722.35:  98%|█████████▊| 1997/2048 [24:33<00:36,  1.40it/s]
loss 1.56 accuracy 0.31 -- 56.01 + 57.14 + 620.52 + 4.78 = 738.45:  98%|█████████▊| 1997/2048 [24:33<00:36,  1.40it/s] 
loss 1.56 accuracy 0.31 -- 56.01 + 57.14 + 620.52 + 4.78 = 738.45:  98%|█████████▊| 1998/2048 [24:33<00:36,  1.37it/s]
loss 1.65 accuracy 0.38 -- 56.83 + 56.93 + 505.05 + 4.79 = 623.60:  98%|█████████▊| 1998/2048 [24:34<00:36,  1.37it/s]
loss 1.65 accuracy 0.38 -- 56.83 + 56.93 + 505.05 + 4.79 = 623.60:  98%|█████████▊| 1999/2048 [24:34<00:34,  1.42it/s]
loss 2.13 accuracy 0.25 -- 56.23 + 57.24 + 615.72 + 4.77 = 733.97:  98%|█████████▊| 1999/2048 [24:35<00:34,  1.42it/s]
loss 2.13 accuracy 0.25 -- 56.23 + 57.24 + 615.72 + 4.77 = 733.97:  98%|█████████▊| 2000/2048 [24:35<00:34,  1.39it/s]
loss 2.11 accuracy 0.44 -- 56.69 + 56.44 + 501.61 + 4.82 = 619.55:  98%|█████████▊| 2000/2048 [24:35<00:34,  1.39it/s]
loss 2.11 accuracy 0.44 -- 56.69 + 56.44 + 501.61 + 4.82 = 619.55:  98%|█████████▊| 2001/2048 [24:35<00:32,  1.43it/s]
loss 1.81 accuracy 0.56 -- 56.73 + 166.69 + 501.31 + 4.83 = 729.57:  98%|█████████▊| 2001/2048 [24:36<00:32,  1.43it/s]
loss 1.81 accuracy 0.56 -- 56.73 + 166.69 + 501.31 + 4.83 = 729.57:  98%|█████████▊| 2002/2048 [24:36<00:32,  1.40it/s]
loss 1.61 accuracy 0.25 -- 56.59 + 56.98 + 500.52 + 4.76 = 618.86:  98%|█████████▊| 2002/2048 [24:37<00:32,  1.40it/s] 
loss 1.61 accuracy 0.25 -- 56.59 + 56.98 + 500.52 + 4.76 = 618.86:  98%|█████████▊| 2003/2048 [24:37<00:31,  1.44it/s]
loss 1.57 accuracy 0.38 -- 56.47 + 56.36 + 498.06 + 4.79 = 615.67:  98%|█████████▊| 2003/2048 [24:38<00:31,  1.44it/s]
loss 1.57 accuracy 0.38 -- 56.47 + 56.36 + 498.06 + 4.79 = 615.67:  98%|█████████▊| 2004/2048 [24:38<00:31,  1.41it/s]
loss 2.23 accuracy 0.31 -- 55.92 + 57.22 + 494.63 + 4.81 = 612.57:  98%|█████████▊| 2004/2048 [24:38<00:31,  1.41it/s]
loss 2.23 accuracy 0.31 -- 55.92 + 57.22 + 494.63 + 4.81 = 612.57:  98%|█████████▊| 2005/2048 [24:38<00:29,  1.45it/s]
loss 1.38 accuracy 0.44 -- 157.40 + 56.73 + 488.64 + 4.77 = 707.53:  98%|█████████▊| 2005/2048 [24:39<00:29,  1.45it/s]
loss 1.38 accuracy 0.44 -- 157.40 + 56.73 + 488.64 + 4.77 = 707.53:  98%|█████████▊| 2006/2048 [24:39<00:29,  1.42it/s]
loss 1.78 accuracy 0.38 -- 55.61 + 166.31 + 500.78 + 4.76 = 727.46:  98%|█████████▊| 2006/2048 [24:40<00:29,  1.42it/s]
loss 1.78 accuracy 0.38 -- 55.61 + 166.31 + 500.78 + 4.76 = 727.46:  98%|█████████▊| 2007/2048 [24:40<00:29,  1.39it/s]
loss 1.99 accuracy 0.25 -- 56.40 + 56.53 + 497.99 + 4.76 = 615.68:  98%|█████████▊| 2007/2048 [24:40<00:29,  1.39it/s] 
loss 1.99 accuracy 0.25 -- 56.40 + 56.53 + 497.99 + 4.76 = 615.68:  98%|█████████▊| 2008/2048 [24:40<00:27,  1.44it/s]
loss 1.82 accuracy 0.31 -- 162.53 + 57.26 + 496.41 + 4.77 = 720.97:  98%|█████████▊| 2008/2048 [24:41<00:27,  1.44it/s]
loss 1.82 accuracy 0.31 -- 162.53 + 57.26 + 496.41 + 4.77 = 720.97:  98%|█████████▊| 2009/2048 [24:41<00:27,  1.41it/s]
loss 2.01 accuracy 0.44 -- 56.10 + 57.40 + 620.57 + 4.80 = 738.87:  98%|█████████▊| 2009/2048 [24:42<00:27,  1.41it/s] 
loss 2.01 accuracy 0.44 -- 56.10 + 57.40 + 620.57 + 4.80 = 738.87:  98%|█████████▊| 2010/2048 [24:42<00:28,  1.35it/s]
loss 1.76 accuracy 0.31 -- 56.70 + 56.79 + 504.95 + 4.78 = 623.22:  98%|█████████▊| 2010/2048 [24:43<00:28,  1.35it/s]
loss 1.76 accuracy 0.31 -- 56.70 + 56.79 + 504.95 + 4.78 = 623.22:  98%|█████████▊| 2011/2048 [24:43<00:26,  1.40it/s]
loss 1.58 accuracy 0.25 -- 56.41 + 57.08 + 616.61 + 4.78 = 734.87:  98%|█████████▊| 2011/2048 [24:43<00:26,  1.40it/s]
loss 1.58 accuracy 0.25 -- 56.41 + 57.08 + 616.61 + 4.78 = 734.87:  98%|█████████▊| 2012/2048 [24:43<00:26,  1.38it/s]
loss 1.85 accuracy 0.31 -- 56.68 + 56.54 + 502.52 + 4.77 = 620.51:  98%|█████████▊| 2012/2048 [24:44<00:26,  1.38it/s]
loss 1.85 accuracy 0.31 -- 56.68 + 56.54 + 502.52 + 4.77 = 620.51:  98%|█████████▊| 2013/2048 [24:44<00:24,  1.42it/s]
loss 1.47 accuracy 0.44 -- 56.18 + 166.61 + 503.71 + 4.78 = 731.29:  98%|█████████▊| 2013/2048 [24:45<00:24,  1.42it/s]
loss 1.47 accuracy 0.44 -- 56.18 + 166.61 + 503.71 + 4.78 = 731.29:  98%|█████████▊| 2014/2048 [24:45<00:24,  1.39it/s]
loss 1.98 accuracy 0.31 -- 56.20 + 56.43 + 498.72 + 4.78 = 616.13:  98%|█████████▊| 2014/2048 [24:45<00:24,  1.39it/s] 
loss 1.98 accuracy 0.31 -- 56.20 + 56.43 + 498.72 + 4.78 = 616.13:  98%|█████████▊| 2015/2048 [24:45<00:22,  1.44it/s]
loss 1.71 accuracy 0.38 -- 56.45 + 56.62 + 497.17 + 4.78 = 615.03:  98%|█████████▊| 2015/2048 [24:46<00:22,  1.44it/s]
loss 1.71 accuracy 0.38 -- 56.45 + 56.62 + 497.17 + 4.78 = 615.03:  98%|█████████▊| 2016/2048 [24:46<00:22,  1.40it/s]
loss 1.51 accuracy 0.50 -- 56.00 + 57.15 + 494.89 + 4.77 = 612.80:  98%|█████████▊| 2016/2048 [24:47<00:22,  1.40it/s]
loss 1.51 accuracy 0.50 -- 56.00 + 57.15 + 494.89 + 4.77 = 612.80:  98%|█████████▊| 2017/2048 [24:47<00:21,  1.43it/s]
loss 1.81 accuracy 0.38 -- 157.20 + 56.76 + 488.64 + 4.76 = 707.37:  98%|█████████▊| 2017/2048 [24:47<00:21,  1.43it/s]
loss 1.81 accuracy 0.38 -- 157.20 + 56.76 + 488.64 + 4.76 = 707.37:  99%|█████████▊| 2018/2048 [24:47<00:21,  1.41it/s]
loss 2.18 accuracy 0.31 -- 55.84 + 166.45 + 500.72 + 4.78 = 727.79:  99%|█████████▊| 2018/2048 [24:48<00:21,  1.41it/s]
loss 2.18 accuracy 0.31 -- 55.84 + 166.45 + 500.72 + 4.78 = 727.79:  99%|█████████▊| 2019/2048 [24:48<00:21,  1.38it/s]
loss 2.55 accuracy 0.31 -- 56.78 + 56.80 + 498.30 + 4.79 = 616.66:  99%|█████████▊| 2019/2048 [24:49<00:21,  1.38it/s] 
loss 2.55 accuracy 0.31 -- 56.78 + 56.80 + 498.30 + 4.79 = 616.66:  99%|█████████▊| 2020/2048 [24:49<00:19,  1.43it/s]
loss 1.77 accuracy 0.25 -- 161.82 + 57.20 + 496.47 + 4.78 = 720.28:  99%|█████████▊| 2020/2048 [24:50<00:19,  1.43it/s]
loss 1.77 accuracy 0.25 -- 161.82 + 57.20 + 496.47 + 4.78 = 720.28:  99%|█████████▊| 2021/2048 [24:50<00:19,  1.40it/s]
loss 1.74 accuracy 0.25 -- 55.85 + 57.30 + 619.05 + 4.77 = 736.97:  99%|█████████▊| 2021/2048 [24:50<00:19,  1.40it/s] 
loss 1.74 accuracy 0.25 -- 55.85 + 57.30 + 619.05 + 4.77 = 736.97:  99%|█████████▊| 2022/2048 [24:50<00:18,  1.37it/s]
loss 2.06 accuracy 0.12 -- 56.61 + 56.79 + 504.93 + 4.79 = 623.12:  99%|█████████▊| 2022/2048 [24:51<00:18,  1.37it/s]
loss 2.06 accuracy 0.12 -- 56.61 + 56.79 + 504.93 + 4.79 = 623.12:  99%|█████████▉| 2023/2048 [24:51<00:17,  1.42it/s]
loss 1.67 accuracy 0.38 -- 56.29 + 57.44 + 615.80 + 4.78 = 734.31:  99%|█████████▉| 2023/2048 [24:52<00:17,  1.42it/s]
loss 1.67 accuracy 0.38 -- 56.29 + 57.44 + 615.80 + 4.78 = 734.31:  99%|█████████▉| 2024/2048 [24:52<00:17,  1.38it/s]
loss 1.61 accuracy 0.44 -- 56.66 + 56.68 + 503.46 + 4.79 = 621.60:  99%|█████████▉| 2024/2048 [24:52<00:17,  1.38it/s]
loss 1.61 accuracy 0.44 -- 56.66 + 56.68 + 503.46 + 4.79 = 621.60:  99%|█████████▉| 2025/2048 [24:52<00:16,  1.41it/s]
loss 1.72 accuracy 0.25 -- 56.36 + 166.03 + 501.16 + 4.76 = 728.31:  99%|█████████▉| 2025/2048 [24:53<00:16,  1.41it/s]
loss 1.72 accuracy 0.25 -- 56.36 + 166.03 + 501.16 + 4.76 = 728.31:  99%|█████████▉| 2026/2048 [24:53<00:15,  1.38it/s]
loss 1.74 accuracy 0.31 -- 55.79 + 56.33 + 499.28 + 4.77 = 616.17:  99%|█████████▉| 2026/2048 [24:54<00:15,  1.38it/s] 
loss 1.74 accuracy 0.31 -- 55.79 + 56.33 + 499.28 + 4.77 = 616.17:  99%|█████████▉| 2027/2048 [24:54<00:14,  1.43it/s]
loss 1.73 accuracy 0.25 -- 56.88 + 56.44 + 499.05 + 4.78 = 617.15:  99%|█████████▉| 2027/2048 [24:55<00:14,  1.43it/s]
loss 1.73 accuracy 0.25 -- 56.88 + 56.44 + 499.05 + 4.78 = 617.15:  99%|█████████▉| 2028/2048 [24:55<00:14,  1.40it/s]
loss 1.61 accuracy 0.38 -- 55.88 + 56.99 + 495.97 + 4.77 = 613.62:  99%|█████████▉| 2028/2048 [24:55<00:14,  1.40it/s]
loss 1.61 accuracy 0.38 -- 55.88 + 56.99 + 495.97 + 4.77 = 613.62:  99%|█████████▉| 2029/2048 [24:55<00:13,  1.44it/s]
loss 1.88 accuracy 0.06 -- 157.32 + 57.33 + 491.26 + 4.77 = 710.68:  99%|█████████▉| 2029/2048 [24:56<00:13,  1.44it/s]
loss 1.88 accuracy 0.06 -- 157.32 + 57.33 + 491.26 + 4.77 = 710.68:  99%|█████████▉| 2030/2048 [24:56<00:12,  1.42it/s]
loss 1.67 accuracy 0.31 -- 55.85 + 166.31 + 502.82 + 4.77 = 729.74:  99%|█████████▉| 2030/2048 [24:57<00:12,  1.42it/s]
loss 1.67 accuracy 0.31 -- 55.85 + 166.31 + 502.82 + 4.77 = 729.74:  99%|█████████▉| 2031/2048 [24:57<00:12,  1.39it/s]
loss 1.96 accuracy 0.31 -- 56.92 + 56.50 + 498.44 + 4.77 = 616.63:  99%|█████████▉| 2031/2048 [24:57<00:12,  1.39it/s] 
loss 1.96 accuracy 0.31 -- 56.92 + 56.50 + 498.44 + 4.77 = 616.63:  99%|█████████▉| 2032/2048 [24:57<00:11,  1.43it/s]
loss 2.01 accuracy 0.31 -- 162.34 + 56.89 + 497.02 + 4.77 = 721.02:  99%|█████████▉| 2032/2048 [24:58<00:11,  1.43it/s]
loss 2.01 accuracy 0.31 -- 162.34 + 56.89 + 497.02 + 4.77 = 721.02:  99%|█████████▉| 2033/2048 [24:58<00:10,  1.38it/s]
loss 1.79 accuracy 0.31 -- 56.19 + 57.56 + 620.20 + 4.77 = 738.72:  99%|█████████▉| 2033/2048 [24:59<00:10,  1.38it/s] 
loss 1.79 accuracy 0.31 -- 56.19 + 57.56 + 620.20 + 4.77 = 738.72:  99%|█████████▉| 2034/2048 [24:59<00:10,  1.36it/s]
loss 1.76 accuracy 0.31 -- 56.60 + 56.60 + 505.16 + 4.77 = 623.12:  99%|█████████▉| 2034/2048 [25:00<00:10,  1.36it/s]
loss 1.76 accuracy 0.31 -- 56.60 + 56.60 + 505.16 + 4.77 = 623.12:  99%|█████████▉| 2035/2048 [25:00<00:09,  1.41it/s]
loss 1.64 accuracy 0.31 -- 56.19 + 57.43 + 617.11 + 4.78 = 735.52:  99%|█████████▉| 2035/2048 [25:00<00:09,  1.41it/s]
loss 1.64 accuracy 0.31 -- 56.19 + 57.43 + 617.11 + 4.78 = 735.52:  99%|█████████▉| 2036/2048 [25:00<00:08,  1.38it/s]
loss 1.40 accuracy 0.56 -- 57.21 + 56.55 + 502.67 + 4.77 = 621.20:  99%|█████████▉| 2036/2048 [25:01<00:08,  1.38it/s]
loss 1.40 accuracy 0.56 -- 57.21 + 56.55 + 502.67 + 4.77 = 621.20:  99%|█████████▉| 2037/2048 [25:01<00:07,  1.42it/s]
loss 1.91 accuracy 0.31 -- 56.08 + 165.77 + 500.97 + 4.77 = 727.58:  99%|█████████▉| 2037/2048 [25:02<00:07,  1.42it/s]
loss 1.91 accuracy 0.31 -- 56.08 + 165.77 + 500.97 + 4.77 = 727.58: 100%|█████████▉| 2038/2048 [25:02<00:07,  1.39it/s]
loss 1.33 accuracy 0.38 -- 56.22 + 56.21 + 499.04 + 4.78 = 616.25: 100%|█████████▉| 2038/2048 [25:02<00:07,  1.39it/s] 
loss 1.33 accuracy 0.38 -- 56.22 + 56.21 + 499.04 + 4.78 = 616.25: 100%|█████████▉| 2039/2048 [25:02<00:06,  1.44it/s]
loss 1.88 accuracy 0.31 -- 57.04 + 56.43 + 498.63 + 4.78 = 616.87: 100%|█████████▉| 2039/2048 [25:03<00:06,  1.44it/s]
loss 1.88 accuracy 0.31 -- 57.04 + 56.43 + 498.63 + 4.78 = 616.87: 100%|█████████▉| 2040/2048 [25:03<00:05,  1.38it/s]
loss 1.63 accuracy 0.50 -- 56.24 + 57.27 + 496.35 + 4.79 = 614.65: 100%|█████████▉| 2040/2048 [25:04<00:05,  1.38it/s]
loss 1.63 accuracy 0.50 -- 56.24 + 57.27 + 496.35 + 4.79 = 614.65: 100%|█████████▉| 2041/2048 [25:04<00:04,  1.43it/s]
loss 1.51 accuracy 0.44 -- 157.50 + 57.04 + 491.51 + 4.76 = 710.81: 100%|█████████▉| 2041/2048 [25:05<00:04,  1.43it/s]
loss 1.51 accuracy 0.44 -- 157.50 + 57.04 + 491.51 + 4.76 = 710.81: 100%|█████████▉| 2042/2048 [25:05<00:04,  1.41it/s]
loss 1.77 accuracy 0.25 -- 56.06 + 166.03 + 501.05 + 4.79 = 727.93: 100%|█████████▉| 2042/2048 [25:05<00:04,  1.41it/s]
loss 1.77 accuracy 0.25 -- 56.06 + 166.03 + 501.05 + 4.79 = 727.93: 100%|█████████▉| 2043/2048 [25:05<00:03,  1.38it/s]
loss 1.90 accuracy 0.38 -- 56.68 + 56.67 + 497.25 + 4.80 = 615.41: 100%|█████████▉| 2043/2048 [25:06<00:03,  1.38it/s] 
loss 1.90 accuracy 0.38 -- 56.68 + 56.67 + 497.25 + 4.80 = 615.41: 100%|█████████▉| 2044/2048 [25:06<00:02,  1.43it/s]
loss 1.90 accuracy 0.50 -- 162.28 + 57.17 + 496.46 + 4.78 = 720.68: 100%|█████████▉| 2044/2048 [25:07<00:02,  1.43it/s]
loss 1.90 accuracy 0.50 -- 162.28 + 57.17 + 496.46 + 4.78 = 720.68: 100%|█████████▉| 2045/2048 [25:07<00:02,  1.40it/s]
loss 1.73 accuracy 0.44 -- 55.81 + 57.19 + 619.30 + 4.77 = 737.06: 100%|█████████▉| 2045/2048 [25:07<00:02,  1.40it/s] 
loss 1.73 accuracy 0.44 -- 55.81 + 57.19 + 619.30 + 4.77 = 737.06: 100%|█████████▉| 2046/2048 [25:07<00:01,  1.37it/s]
loss 2.41 accuracy 0.12 -- 56.75 + 56.62 + 505.55 + 4.77 = 623.70: 100%|█████████▉| 2046/2048 [25:08<00:01,  1.37it/s]
loss 2.41 accuracy 0.12 -- 56.75 + 56.62 + 505.55 + 4.77 = 623.70: 100%|█████████▉| 2047/2048 [25:08<00:00,  1.40it/s]
loss 1.83 accuracy 0.38 -- 56.69 + 57.95 + 617.37 + 4.77 = 736.78: 100%|█████████▉| 2047/2048 [25:09<00:00,  1.40it/s]
loss 1.83 accuracy 0.38 -- 56.69 + 57.95 + 617.37 + 4.77 = 736.78: 100%|██████████| 2048/2048 [25:09<00:00,  1.37it/s]
loss 1.83 accuracy 0.38 -- 56.69 + 57.95 + 617.37 + 4.77 = 736.78: 100%|██████████| 2048/2048 [25:09<00:00,  1.36it/s]
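
Each status line in the run above has the shape "loss L accuracy A -- a + b + c + d = total", followed by the progress bar. The output does not say what the four components are. Going by the roughly 1.4 it/s rate, they look like per-step timings in milliseconds, but that is an assumption. A minimal sketch for pulling the numbers out of such a line and checking that the components add up to the printed total (within rounding) could look like the following. The names `LINE_RE`, `parse_line` and `is_consistent` are hypothetical helpers, not part of tinygrad or tqdm.

```python
import re

# Matches the description part of one status line, e.g.
# "loss 1.83 accuracy 0.38 -- 56.69 + 57.95 + 617.37 + 4.77 = 736.78: 100%|..."
LINE_RE = re.compile(
    r"loss (?P<loss>\d+\.\d+) accuracy (?P<acc>\d+\.\d+) -- "
    r"(?P<parts>\d+\.\d+(?: \+ \d+\.\d+)*) = (?P<total>\d+\.\d+)"
)

def parse_line(line):
    """Return a dict of the numeric fields, or None if the line has another shape."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    return {
        "loss": float(m.group("loss")),
        "accuracy": float(m.group("acc")),
        # Unnamed components; their meaning is not stated in the log (assumption: timings).
        "parts": [float(p) for p in m.group("parts").split(" + ")],
        "total": float(m.group("total")),
    }

def is_consistent(rec, tol=0.05):
    """Every value is printed rounded to two decimals, so allow a small tolerance."""
    return abs(sum(rec["parts"]) - rec["total"]) <= tol

sample = ("loss 1.83 accuracy 0.38 -- 56.69 + 57.95 + 617.37 + 4.77 = 736.78: "
          "100%|██████████| 2048/2048 [25:09<00:00,  1.36it/s]")
record = parse_line(sample)
print(record, is_consistent(record))
```

The tolerance matters because the printed values are rounded independently: the line reporting 56.16 + 57.23 + 494.89 + 4.79 = 613.06 is off by 0.01 if the rounded components are summed directly.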

train_resnet.py

  0%|          | 0/100 [00:00<?, ?it/s]
loss 2.36 accuracy 0.03:   0%|          | 0/100 [00:19<?, ?it/s]
loss 2.36 accuracy 0.03:   1%|          | 1/100 [00:19<32:43, 19.83s/it]
loss 2.33 accuracy 0.22:   1%|          | 1/100 [00:27<32:43, 19.83s/it]
loss 2.33 accuracy 0.22:   2%|▏         | 2/100 [00:27<21:06, 12.93s/it]
loss 2.16 accuracy 0.28:   2%|▏         | 2/100 [00:27<21:06, 12.93s/it]
loss 1.81 accuracy 0.47:   2%|▏         | 2/100 [00:27<21:06, 12.93s/it]
loss 1.85 accuracy 0.41:   2%|▏         | 2/100 [00:27<21:06, 12.93s/it]
loss 1.82 accuracy 0.44:   2%|▏         | 2/100 [00:27<21:06, 12.93s/it]
loss 1.41 accuracy 0.59:   2%|▏         | 2/100 [00:27<21:06, 12.93s/it]
loss 1.37 accuracy 0.59:   2%|▏         | 2/100 [00:28<21:06, 12.93s/it]
loss 1.34 accuracy 0.59:   2%|▏         | 2/100 [00:28<21:06, 12.93s/it]
loss 0.76 accuracy 0.91:   2%|▏         | 2/100 [00:28<21:06, 12.93s/it]
loss 0.76 accuracy 0.91:  10%|█         | 10/100 [00:28<02:31,  1.69s/it]
loss 0.70 accuracy 0.88:  10%|█         | 10/100 [00:28<02:31,  1.69s/it]
loss 0.65 accuracy 0.78:  10%|█         | 10/100 [00:28<02:31,  1.69s/it]
loss 0.90 accuracy 0.75:  10%|█         | 10/100 [00:28<02:31,  1.69s/it]
loss 0.73 accuracy 0.78:  10%|█         | 10/100 [00:28<02:31,  1.69s/it]
loss 0.49 accuracy 0.88:  10%|█         | 10/100 [00:28<02:31,  1.69s/it]
loss 0.45 accuracy 0.81:  10%|█         | 10/100 [00:28<02:31,  1.69s/it]
loss 0.44 accuracy 0.88:  10%|█         | 10/100 [00:28<02:31,  1.69s/it]
loss 0.43 accuracy 0.84:  10%|█         | 10/100 [00:28<02:31,  1.69s/it]
loss 0.43 accuracy 0.84:  18%|█▊        | 18/100 [00:28<01:02,  1.32it/s]
loss 0.97 accuracy 0.75:  18%|█▊        | 18/100 [00:28<01:02,  1.32it/s]
loss 0.38 accuracy 0.91:  18%|█▊        | 18/100 [00:28<01:02,  1.32it/s]
loss 0.44 accuracy 0.81:  18%|█▊        | 18/100 [00:28<01:02,  1.32it/s]
loss 0.32 accuracy 0.94:  18%|█▊        | 18/100 [00:28<01:02,  1.32it/s]
loss 0.33 accuracy 0.88:  18%|█▊        | 18/100 [00:28<01:02,  1.32it/s]
loss 0.44 accuracy 0.81:  18%|█▊        | 18/100 [00:28<01:02,  1.32it/s]
loss 0.43 accuracy 0.91:  18%|█▊        | 18/100 [00:28<01:02,  1.32it/s]
loss 0.58 accuracy 0.91:  18%|█▊        | 18/100 [00:28<01:02,  1.32it/s]
loss 0.58 accuracy 0.91:  26%|██▌       | 26/100 [00:28<00:31,  2.33it/s]
loss 0.26 accuracy 0.88:  26%|██▌       | 26/100 [00:28<00:31,  2.33it/s]
loss 0.41 accuracy 0.91:  26%|██▌       | 26/100 [00:28<00:31,  2.33it/s]
loss 0.44 accuracy 0.88:  26%|██▌       | 26/100 [00:28<00:31,  2.33it/s]
loss 0.27 accuracy 0.91:  26%|██▌       | 26/100 [00:28<00:31,  2.33it/s]
loss 0.49 accuracy 0.81:  26%|██▌       | 26/100 [00:28<00:31,  2.33it/s]
loss 0.91 accuracy 0.72:  26%|██▌       | 26/100 [00:28<00:31,  2.33it/s]
loss 0.27 accuracy 0.94:  26%|██▌       | 26/100 [00:28<00:31,  2.33it/s]
loss 0.24 accuracy 0.91:  26%|██▌       | 26/100 [00:28<00:31,  2.33it/s]
loss 0.24 accuracy 0.91:  34%|███▍      | 34/100 [00:28<00:17,  3.73it/s]
loss 0.26 accuracy 0.94:  34%|███▍      | 34/100 [00:28<00:17,  3.73it/s]
loss 0.45 accuracy 0.88:  34%|███▍      | 34/100 [00:28<00:17,  3.73it/s]
loss 0.33 accuracy 0.91:  34%|███▍      | 34/100 [00:28<00:17,  3.73it/s]
loss 0.23 accuracy 0.94:  34%|███▍      | 34/100 [00:28<00:17,  3.73it/s]
loss 0.25 accuracy 0.91:  34%|███▍      | 34/100 [00:28<00:17,  3.73it/s]
loss 0.22 accuracy 0.94:  34%|███▍      | 34/100 [00:28<00:17,  3.73it/s]
loss 0.22 accuracy 0.94:  34%|███▍      | 34/100 [00:28<00:17,  3.73it/s]
loss 0.36 accuracy 0.91:  34%|███▍      | 34/100 [00:28<00:17,  3.73it/s]
loss 0.36 accuracy 0.91:  42%|████▏     | 42/100 [00:28<00:10,  5.64it/s]
loss 0.28 accuracy 0.91:  42%|████▏     | 42/100 [00:28<00:10,  5.64it/s]
loss 0.61 accuracy 0.81:  42%|████▏     | 42/100 [00:28<00:10,  5.64it/s]
loss 0.15 accuracy 0.97:  42%|████▏     | 42/100 [00:28<00:10,  5.64it/s]
loss 0.17 accuracy 0.97:  42%|████▏     | 42/100 [00:28<00:10,  5.64it/s]
loss 0.28 accuracy 0.91:  42%|████▏     | 42/100 [00:28<00:10,  5.64it/s]
loss 0.62 accuracy 0.81:  42%|████▏     | 42/100 [00:28<00:10,  5.64it/s]
loss 0.85 accuracy 0.78:  42%|████▏     | 42/100 [00:28<00:10,  5.64it/s]
loss 1.09 accuracy 0.75:  42%|████▏     | 42/100 [00:28<00:10,  5.64it/s]
loss 1.09 accuracy 0.75:  50%|█████     | 50/100 [00:28<00:06,  8.21it/s]
loss 0.36 accuracy 0.88:  50%|█████     | 50/100 [00:28<00:06,  8.21it/s]
loss 0.16 accuracy 0.97:  50%|█████     | 50/100 [00:28<00:06,  8.21it/s]
loss 0.22 accuracy 0.91:  50%|█████     | 50/100 [00:28<00:06,  8.21it/s]
loss 0.20 accuracy 0.97:  50%|█████     | 50/100 [00:28<00:06,  8.21it/s]
loss 0.43 accuracy 0.88:  50%|█████     | 50/100 [00:28<00:06,  8.21it/s]
loss 0.64 accuracy 0.75:  50%|█████     | 50/100 [00:28<00:06,  8.21it/s]
loss 0.38 accuracy 0.91:  50%|█████     | 50/100 [00:28<00:06,  8.21it/s]
loss 0.47 accuracy 0.91:  50%|█████     | 50/100 [00:28<00:06,  8.21it/s]
loss 0.47 accuracy 0.91:  58%|█████▊    | 58/100 [00:28<00:03, 11.57it/s]
loss 0.33 accuracy 0.91:  58%|█████▊    | 58/100 [00:28<00:03, 11.57it/s]
loss 0.42 accuracy 0.88:  58%|█████▊    | 58/100 [00:28<00:03, 11.57it/s]
loss 0.20 accuracy 0.91:  58%|█████▊    | 58/100 [00:28<00:03, 11.57it/s]
loss 0.33 accuracy 0.88:  58%|█████▊    | 58/100 [00:28<00:03, 11.57it/s]
loss 0.24 accuracy 0.88:  58%|█████▊    | 58/100 [00:28<00:03, 11.57it/s]
loss 0.11 accuracy 0.94:  58%|█████▊    | 58/100 [00:28<00:03, 11.57it/s]
loss 0.24 accuracy 0.88:  58%|█████▊    | 58/100 [00:28<00:03, 11.57it/s]
loss 0.52 accuracy 0.88:  58%|█████▊    | 58/100 [00:28<00:03, 11.57it/s]
loss 0.52 accuracy 0.88:  66%|██████▌   | 66/100 [00:28<00:02, 15.85it/s]
loss 0.14 accuracy 0.97:  66%|██████▌   | 66/100 [00:28<00:02, 15.85it/s]
loss 0.29 accuracy 0.97:  66%|██████▌   | 66/100 [00:28<00:02, 15.85it/s]
loss 0.39 accuracy 0.84:  66%|██████▌   | 66/100 [00:28<00:02, 15.85it/s]
loss 0.17 accuracy 0.97:  66%|██████▌   | 66/100 [00:28<00:02, 15.85it/s]
loss 0.18 accuracy 0.94:  66%|██████▌   | 66/100 [00:28<00:02, 15.85it/s]
loss 0.12 accuracy 0.97:  66%|██████▌   | 66/100 [00:28<00:02, 15.85it/s]
loss 0.26 accuracy 0.94:  66%|██████▌   | 66/100 [00:28<00:02, 15.85it/s]
loss 0.61 accuracy 0.81:  66%|██████▌   | 66/100 [00:28<00:02, 15.85it/s]
loss 0.61 accuracy 0.81:  74%|███████▍  | 74/100 [00:28<00:01, 21.07it/s]
loss 0.41 accuracy 0.88:  74%|███████▍  | 74/100 [00:28<00:01, 21.07it/s]
loss 0.02 accuracy 1.00:  74%|███████▍  | 74/100 [00:28<00:01, 21.07it/s]
loss 0.13 accuracy 0.94:  74%|███████▍  | 74/100 [00:28<00:01, 21.07it/s]
loss 0.03 accuracy 1.00:  74%|███████▍  | 74/100 [00:28<00:01, 21.07it/s]
loss 0.13 accuracy 0.97:  74%|███████▍  | 74/100 [00:28<00:01, 21.07it/s]
loss 0.31 accuracy 0.94:  74%|███████▍  | 74/100 [00:28<00:01, 21.07it/s]
loss 0.31 accuracy 0.84:  74%|███████▍  | 74/100 [00:28<00:01, 21.07it/s]
loss 0.08 accuracy 0.97:  74%|███████▍  | 74/100 [00:28<00:01, 21.07it/s]
loss 0.08 accuracy 0.97:  82%|████████▏ | 82/100 [00:28<00:00, 27.17it/s]
loss 0.30 accuracy 0.94:  82%|████████▏ | 82/100 [00:28<00:00, 27.17it/s]
loss 0.43 accuracy 0.81:  82%|████████▏ | 82/100 [00:28<00:00, 27.17it/s]
loss 0.05 accuracy 1.00:  82%|████████▏ | 82/100 [00:29<00:00, 27.17it/s]
loss 0.07 accuracy 0.97:  82%|████████▏ | 82/100 [00:29<00:00, 27.17it/s]
loss 0.29 accuracy 0.94:  82%|████████▏ | 82/100 [00:29<00:00, 27.17it/s]
loss 0.31 accuracy 0.91:  82%|████████▏ | 82/100 [00:29<00:00, 27.17it/s]
loss 0.31 accuracy 0.91:  82%|████████▏ | 82/100 [00:29<00:00, 27.17it/s]
loss 0.18 accuracy 0.94:  82%|████████▏ | 82/100 [00:29<00:00, 27.17it/s]
loss 0.18 accuracy 0.94:  90%|█████████ | 90/100 [00:29<00:00, 33.88it/s]
loss 0.14 accuracy 0.97:  90%|█████████ | 90/100 [00:29<00:00, 33.88it/s]
loss 0.08 accuracy 0.97:  90%|█████████ | 90/100 [00:29<00:00, 33.88it/s]
loss 0.06 accuracy 0.97:  90%|█████████ | 90/100 [00:29<00:00, 33.88it/s]
loss 0.03 accuracy 1.00:  90%|█████████ | 90/100 [00:29<00:00, 33.88it/s]
loss 0.03 accuracy 1.00:  90%|█████████ | 90/100 [00:29<00:00, 33.88it/s]
loss 0.05 accuracy 0.97:  90%|█████████ | 90/100 [00:29<00:00, 33.88it/s]
loss 0.03 accuracy 1.00:  90%|█████████ | 90/100 [00:29<00:00, 33.88it/s]
loss 0.24 accuracy 0.91:  90%|█████████ | 90/100 [00:29<00:00, 33.88it/s]
loss 0.24 accuracy 0.91:  98%|█████████▊| 98/100 [00:29<00:00, 40.87it/s]
loss 0.13 accuracy 0.97:  98%|█████████▊| 98/100 [00:29<00:00, 40.87it/s]
loss 0.27 accuracy 0.91:  98%|█████████▊| 98/100 [00:29<00:00, 40.87it/s]
loss 0.27 accuracy 0.91: 100%|██████████| 100/100 [00:29<00:00,  3.43it/s]

  0%|          | 0/79 [00:00<?, ?it/s]
  1%|▏         | 1/79 [00:02<03:51,  2.97s/it]
  4%|▍         | 3/79 [00:03<01:02,  1.22it/s]
  9%|▉         | 7/79 [00:03<00:20,  3.50it/s]
 14%|█▍        | 11/79 [00:03<00:10,  6.22it/s]
 19%|█▉        | 15/79 [00:03<00:06,  9.30it/s]
 24%|██▍       | 19/79 [00:03<00:04, 12.58it/s]
 28%|██▊       | 22/79 [00:03<00:04, 13.88it/s]
 33%|███▎      | 26/79 [00:03<00:03, 17.25it/s]
 38%|███▊      | 30/79 [00:04<00:02, 20.26it/s]
 43%|████▎     | 34/79 [00:04<00:01, 22.82it/s]
 48%|████▊     | 38/79 [00:04<00:01, 24.95it/s]
 52%|█████▏    | 41/79 [00:04<00:01, 23.05it/s]
 57%|█████▋    | 45/79 [00:04<00:01, 25.18it/s]
 62%|██████▏   | 49/79 [00:04<00:01, 26.77it/s]
 67%|██████▋   | 53/79 [00:04<00:00, 28.01it/s]
 72%|███████▏  | 57/79 [00:04<00:00, 28.91it/s]
 77%|███████▋  | 61/79 [00:05<00:00, 25.89it/s]
 82%|████████▏ | 65/79 [00:05<00:00, 27.29it/s]
 87%|████████▋ | 69/79 [00:05<00:00, 28.30it/s]
 92%|█████████▏| 73/79 [00:05<00:00, 29.08it/s]
 97%|█████████▋| 77/79 [00:05<00:00, 29.72it/s]
100%|██████████| 79/79 [00:08<00:00,  8.94it/s]
test set accuracy is 0.938300
reducing lr to 0.0041667

  0%|          | 0/100 [00:00<?, ?it/s]
loss 0.07 accuracy 1.00:   0%|          | 0/100 [00:00<?, ?it/s]
loss 0.07 accuracy 1.00:   1%|          | 1/100 [00:00<00:13,  7.33it/s]
loss 0.25 accuracy 0.94:   1%|          | 1/100 [00:00<00:13,  7.33it/s]
loss 0.25 accuracy 0.94:   2%|▏         | 2/100 [00:00<00:14,  6.59it/s]
loss 0.26 accuracy 0.94:   2%|▏         | 2/100 [00:00<00:14,  6.59it/s]
loss 0.14 accuracy 0.97:   2%|▏         | 2/100 [00:00<00:14,  6.59it/s]
loss 0.13 accuracy 0.94:   2%|▏         | 2/100 [00:00<00:14,  6.59it/s]
loss 0.19 accuracy 0.94:   2%|▏         | 2/100 [00:00<00:14,  6.59it/s]
loss 0.16 accuracy 0.94:   2%|▏         | 2/100 [00:00<00:14,  6.59it/s]
loss 0.10 accuracy 0.94:   2%|▏         | 2/100 [00:00<00:14,  6.59it/s]
loss 0.10 accuracy 0.97:   2%|▏         | 2/100 [00:00<00:14,  6.59it/s]
loss 0.37 accuracy 0.88:   2%|▏         | 2/100 [00:00<00:14,  6.59it/s]
loss 0.37 accuracy 0.88:  10%|█         | 10/100 [00:00<00:02, 32.49it/s]
loss 0.19 accuracy 0.94:  10%|█         | 10/100 [00:00<00:02, 32.49it/s]
loss 0.10 accuracy 0.94:  10%|█         | 10/100 [00:00<00:02, 32.49it/s]
loss 0.15 accuracy 0.94:  10%|█         | 10/100 [00:00<00:02, 32.49it/s]
loss 0.53 accuracy 0.91:  10%|█         | 10/100 [00:00<00:02, 32.49it/s]
loss 0.02 accuracy 1.00:  10%|█         | 10/100 [00:00<00:02, 32.49it/s]
loss 0.11 accuracy 0.94:  10%|█         | 10/100 [00:00<00:02, 32.49it/s]
loss 0.12 accuracy 0.97:  10%|█         | 10/100 [00:00<00:02, 32.49it/s]
loss 0.03 accuracy 1.00:  10%|█         | 10/100 [00:00<00:02, 32.49it/s]
loss 0.03 accuracy 1.00:  18%|█▊        | 18/100 [00:00<00:01, 48.32it/s]
loss 0.19 accuracy 0.94:  18%|█▊        | 18/100 [00:00<00:01, 48.32it/s]
loss 0.19 accuracy 0.94:  18%|█▊        | 18/100 [00:00<00:01, 48.32it/s]
loss 0.10 accuracy 0.97:  18%|█▊        | 18/100 [00:00<00:01, 48.32it/s]
loss 0.05 accuracy 0.97:  18%|█▊        | 18/100 [00:00<00:01, 48.32it/s]
loss 0.14 accuracy 0.97:  18%|█▊        | 18/100 [00:00<00:01, 48.32it/s]
loss 0.06 accuracy 1.00:  18%|█▊        | 18/100 [00:00<00:01, 48.32it/s]
loss 0.16 accuracy 0.97:  18%|█▊        | 18/100 [00:00<00:01, 48.32it/s]
loss 0.15 accuracy 0.97:  18%|█▊        | 18/100 [00:00<00:01, 48.32it/s]
loss 0.15 accuracy 0.97:  26%|██▌       | 26/100 [00:00<00:01, 58.40it/s]
loss 0.44 accuracy 0.84:  26%|██▌       | 26/100 [00:00<00:01, 58.40it/s]
loss 0.06 accuracy 1.00:  26%|██▌       | 26/100 [00:00<00:01, 58.40it/s]
loss 0.24 accuracy 0.94:  26%|██▌       | 26/100 [00:00<00:01, 58.40it/s]
loss 0.08 accuracy 0.97:  26%|██▌       | 26/100 [00:00<00:01, 58.40it/s]
loss 0.04 accuracy 1.00:  26%|██▌       | 26/100 [00:00<00:01, 58.40it/s]
loss 0.18 accuracy 0.94:  26%|██▌       | 26/100 [00:00<00:01, 58.40it/s]
loss 0.02 accuracy 1.00:  26%|██▌       | 26/100 [00:00<00:01, 58.40it/s]
loss 0.10 accuracy 0.97:  26%|██▌       | 26/100 [00:00<00:01, 58.40it/s]
loss 0.10 accuracy 0.97:  34%|███▍      | 34/100 [00:00<00:01, 65.11it/s]
loss 0.21 accuracy 0.91:  34%|███▍      | 34/100 [00:00<00:01, 65.11it/s]
loss 0.16 accuracy 0.97:  34%|███▍      | 34/100 [00:00<00:01, 65.11it/s]
loss 0.17 accuracy 0.94:  34%|███▍      | 34/100 [00:00<00:01, 65.11it/s]
loss 0.02 accuracy 1.00:  34%|███▍      | 34/100 [00:00<00:01, 65.11it/s]
loss 0.03 accuracy 1.00:  34%|███▍      | 34/100 [00:00<00:01, 65.11it/s]
loss 0.10 accuracy 0.97:  34%|███▍      | 34/100 [00:00<00:01, 65.11it/s]
loss 0.19 accuracy 0.94:  34%|███▍      | 34/100 [00:00<00:01, 65.11it/s]
loss 0.12 accuracy 0.94:  34%|███▍      | 34/100 [00:00<00:01, 65.11it/s]
loss 0.12 accuracy 0.94:  42%|████▏     | 42/100 [00:00<00:00, 69.56it/s]
loss 0.09 accuracy 0.94:  42%|████▏     | 42/100 [00:00<00:00, 69.56it/s]
loss 0.08 accuracy 0.97:  42%|████▏     | 42/100 [00:00<00:00, 69.56it/s]
loss 0.26 accuracy 0.97:  42%|████▏     | 42/100 [00:00<00:00, 69.56it/s]
loss 0.04 accuracy 1.00:  42%|████▏     | 42/100 [00:00<00:00, 69.56it/s]
loss 0.18 accuracy 0.94:  42%|████▏     | 42/100 [00:00<00:00, 69.56it/s]
loss 0.05 accuracy 1.00:  42%|████▏     | 42/100 [00:00<00:00, 69.56it/s]
loss 0.26 accuracy 0.91:  42%|████▏     | 42/100 [00:00<00:00, 69.56it/s]
loss 0.05 accuracy 1.00:  42%|████▏     | 42/100 [00:00<00:00, 69.56it/s]
loss 0.05 accuracy 1.00:  50%|█████     | 50/100 [00:00<00:00, 72.58it/s]
loss 0.02 accuracy 1.00:  50%|█████     | 50/100 [00:00<00:00, 72.58it/s]
loss 0.14 accuracy 0.94:  50%|█████     | 50/100 [00:00<00:00, 72.58it/s]
loss 0.08 accuracy 0.97:  50%|█████     | 50/100 [00:00<00:00, 72.58it/s]
loss 0.25 accuracy 0.91:  50%|█████     | 50/100 [00:00<00:00, 72.58it/s]
loss 0.31 accuracy 0.94:  50%|█████     | 50/100 [00:00<00:00, 72.58it/s]
loss 0.16 accuracy 0.94:  50%|█████     | 50/100 [00:00<00:00, 72.58it/s]
loss 0.29 accuracy 0.88:  50%|█████     | 50/100 [00:00<00:00, 72.58it/s]
loss 0.21 accuracy 0.97:  50%|█████     | 50/100 [00:01<00:00, 72.58it/s]
loss 0.21 accuracy 0.97:  58%|█████▊    | 58/100 [00:01<00:00, 74.59it/s]
loss 0.11 accuracy 0.97:  58%|█████▊    | 58/100 [00:01<00:00, 74.59it/s]
loss 0.21 accuracy 0.94:  58%|█████▊    | 58/100 [00:01<00:00, 74.59it/s]
loss 0.08 accuracy 0.97:  58%|█████▊    | 58/100 [00:01<00:00, 74.59it/s]
loss 0.38 accuracy 0.91:  58%|█████▊    | 58/100 [00:01<00:00, 74.59it/s]
loss 0.04 accuracy 1.00:  58%|█████▊    | 58/100 [00:01<00:00, 74.59it/s]
loss 0.15 accuracy 0.97:  58%|█████▊    | 58/100 [00:01<00:00, 74.59it/s]
loss 0.26 accuracy 0.94:  58%|█████▊    | 58/100 [00:01<00:00, 74.59it/s]
loss 0.05 accuracy 1.00:  58%|█████▊    | 58/100 [00:01<00:00, 74.59it/s]
loss 0.05 accuracy 1.00:  66%|██████▌   | 66/100 [00:01<00:00, 76.01it/s]
loss 0.03 accuracy 1.00:  66%|██████▌   | 66/100 [00:01<00:00, 76.01it/s]
loss 0.27 accuracy 0.94:  66%|██████▌   | 66/100 [00:01<00:00, 76.01it/s]
loss 0.17 accuracy 0.94:  66%|██████▌   | 66/100 [00:01<00:00, 76.01it/s]
loss 0.04 accuracy 1.00:  66%|██████▌   | 66/100 [00:01<00:00, 76.01it/s]
loss 0.04 accuracy 1.00:  66%|██████▌   | 66/100 [00:01<00:00, 76.01it/s]
loss 0.04 accuracy 1.00:  66%|██████▌   | 66/100 [00:01<00:00, 76.01it/s]
loss 0.12 accuracy 0.94:  66%|██████▌   | 66/100 [00:01<00:00, 76.01it/s]
loss 0.38 accuracy 0.78:  66%|██████▌   | 66/100 [00:01<00:00, 76.01it/s]
loss 0.38 accuracy 0.78:  74%|███████▍  | 74/100 [00:01<00:00, 77.05it/s]
loss 0.09 accuracy 0.94:  74%|███████▍  | 74/100 [00:01<00:00, 77.05it/s]
loss 0.07 accuracy 0.97:  74%|███████▍  | 74/100 [00:01<00:00, 77.05it/s]
loss 0.02 accuracy 1.00:  74%|███████▍  | 74/100 [00:01<00:00, 77.05it/s]
loss 0.02 accuracy 1.00:  74%|███████▍  | 74/100 [00:01<00:00, 77.05it/s]
loss 0.08 accuracy 0.97:  74%|███████▍  | 74/100 [00:01<00:00, 77.05it/s]
loss 0.34 accuracy 0.91:  74%|███████▍  | 74/100 [00:01<00:00, 77.05it/s]
loss 0.03 accuracy 1.00:  74%|███████▍  | 74/100 [00:01<00:00, 77.05it/s]
loss 0.05 accuracy 0.97:  74%|███████▍  | 74/100 [00:01<00:00, 77.05it/s]
loss 0.05 accuracy 0.97:  82%|████████▏ | 82/100 [00:01<00:00, 77.68it/s]
loss 0.07 accuracy 0.97:  82%|████████▏ | 82/100 [00:01<00:00, 77.68it/s]
loss 0.08 accuracy 0.94:  82%|████████▏ | 82/100 [00:01<00:00, 77.68it/s]
loss 0.04 accuracy 1.00:  82%|████████▏ | 82/100 [00:01<00:00, 77.68it/s]
loss 0.16 accuracy 0.94:  82%|████████▏ | 82/100 [00:01<00:00, 77.68it/s]
loss 0.45 accuracy 0.91:  82%|████████▏ | 82/100 [00:01<00:00, 77.68it/s]
loss 0.37 accuracy 0.91:  82%|████████▏ | 82/100 [00:01<00:00, 77.68it/s]
loss 0.07 accuracy 0.97:  82%|████████▏ | 82/100 [00:01<00:00, 77.68it/s]
loss 0.03 accuracy 1.00:  82%|████████▏ | 82/100 [00:01<00:00, 77.68it/s]
loss 0.03 accuracy 1.00:  90%|█████████ | 90/100 [00:01<00:00, 78.14it/s]
loss 0.08 accuracy 0.97:  90%|█████████ | 90/100 [00:01<00:00, 78.14it/s]
loss 0.05 accuracy 1.00:  90%|█████████ | 90/100 [00:01<00:00, 78.14it/s]
loss 0.16 accuracy 0.97:  90%|█████████ | 90/100 [00:01<00:00, 78.14it/s]
loss 0.09 accuracy 0.94:  90%|█████████ | 90/100 [00:01<00:00, 78.14it/s]
loss 0.24 accuracy 0.94:  90%|█████████ | 90/100 [00:01<00:00, 78.14it/s]
loss 0.03 accuracy 1.00:  90%|█████████ | 90/100 [00:01<00:00, 78.14it/s]
loss 0.07 accuracy 0.97:  90%|█████████ | 90/100 [00:01<00:00, 78.14it/s]
loss 0.15 accuracy 0.91:  90%|█████████ | 90/100 [00:01<00:00, 78.14it/s]
loss 0.15 accuracy 0.91:  98%|█████████▊| 98/100 [00:01<00:00, 78.40it/s]
loss 0.13 accuracy 0.91:  98%|█████████▊| 98/100 [00:01<00:00, 78.40it/s]
loss 0.17 accuracy 0.97:  98%|█████████▊| 98/100 [00:01<00:00, 78.40it/s]
loss 0.17 accuracy 0.97: 100%|██████████| 100/100 [00:01<00:00, 65.08it/s]

  0%|          | 0/79 [00:00<?, ?it/s]
  5%|▌         | 4/79 [00:00<00:02, 30.67it/s]
 10%|█         | 8/79 [00:00<00:03, 23.22it/s]
 15%|█▌        | 12/79 [00:00<00:02, 26.17it/s]
 20%|██        | 16/79 [00:00<00:02, 27.85it/s]
 25%|██▌       | 20/79 [00:00<00:02, 28.96it/s]
 30%|███       | 24/79 [00:00<00:01, 29.58it/s]
 35%|███▌      | 28/79 [00:01<00:01, 25.65it/s]
 41%|████      | 32/79 [00:01<00:01, 27.14it/s]
 46%|████▌     | 36/79 [00:01<00:01, 28.18it/s]
 51%|█████     | 40/79 [00:01<00:01, 28.94it/s]
 54%|█████▍    | 43/79 [00:01<00:01, 25.03it/s]
 59%|█████▉    | 47/79 [00:01<00:01, 26.64it/s]
 65%|██████▍   | 51/79 [00:01<00:01, 27.83it/s]
 70%|██████▉   | 55/79 [00:01<00:00, 28.67it/s]
 75%|███████▍  | 59/79 [00:02<00:00, 29.31it/s]
 78%|███████▊  | 62/79 [00:02<00:00, 25.27it/s]
 84%|████████▎ | 66/79 [00:02<00:00, 26.83it/s]
 89%|████████▊ | 70/79 [00:02<00:00, 28.02it/s]
 94%|█████████▎| 74/79 [00:02<00:00, 28.80it/s]
 99%|█████████▊| 78/79 [00:02<00:00, 29.40it/s]
100%|██████████| 79/79 [00:02<00:00, 27.87it/s]
test set accuracy is 0.960200
reducing lr to 0.0034722

  0%|          | 0/100 [00:00<?, ?it/s]
loss 0.33 accuracy 0.97:   0%|          | 0/100 [00:00<?, ?it/s]
loss 0.33 accuracy 0.97:   1%|          | 1/100 [00:00<00:20,  4.88it/s]
loss 0.10 accuracy 0.97:   1%|          | 1/100 [00:00<00:20,  4.88it/s]
loss 0.10 accuracy 0.97:   2%|▏         | 2/100 [00:00<00:17,  5.56it/s]
loss 0.08 accuracy 0.94:   2%|▏         | 2/100 [00:00<00:17,  5.56it/s]
loss 0.13 accuracy 0.94:   2%|▏         | 2/100 [00:00<00:17,  5.56it/s]
loss 0.19 accuracy 0.97:   2%|▏         | 2/100 [00:00<00:17,  5.56it/s]
loss 0.09 accuracy 0.97:   2%|▏         | 2/100 [00:00<00:17,  5.56it/s]
loss 0.01 accuracy 1.00:   2%|▏         | 2/100 [00:00<00:17,  5.56it/s]
loss 0.22 accuracy 0.88:   2%|▏         | 2/100 [00:00<00:17,  5.56it/s]
loss 0.05 accuracy 1.00:   2%|▏         | 2/100 [00:00<00:17,  5.56it/s]
loss 0.16 accuracy 0.91:   2%|▏         | 2/100 [00:00<00:17,  5.56it/s]
loss 0.16 accuracy 0.91:  10%|█         | 10/100 [00:00<00:03, 29.21it/s]
loss 0.03 accuracy 0.97:  10%|█         | 10/100 [00:00<00:03, 29.21it/s]
loss 0.12 accuracy 0.94:  10%|█         | 10/100 [00:00<00:03, 29.21it/s]
loss 0.02 accuracy 1.00:  10%|█         | 10/100 [00:00<00:03, 29.21it/s]
loss 0.04 accuracy 0.97:  10%|█         | 10/100 [00:00<00:03, 29.21it/s]
loss 0.08 accuracy 0.97:  10%|█         | 10/100 [00:00<00:03, 29.21it/s]
loss 0.06 accuracy 0.97:  10%|█         | 10/100 [00:00<00:03, 29.21it/s]
loss 0.08 accuracy 0.97:  10%|█         | 10/100 [00:00<00:03, 29.21it/s]
loss 0.00 accuracy 1.00:  10%|█         | 10/100 [00:00<00:03, 29.21it/s]
loss 0.14 accuracy 0.91:  10%|█         | 10/100 [00:00<00:03, 29.21it/s]
loss 0.14 accuracy 0.91:  19%|█▉        | 19/100 [00:00<00:01, 46.48it/s]
loss 0.08 accuracy 0.97:  19%|█▉        | 19/100 [00:00<00:01, 46.48it/s]
loss 0.02 accuracy 1.00:  19%|█▉        | 19/100 [00:00<00:01, 46.48it/s]
loss 0.07 accuracy 0.97:  19%|█▉        | 19/100 [00:00<00:01, 46.48it/s]
loss 0.21 accuracy 0.94:  19%|█▉        | 19/100 [00:00<00:01, 46.48it/s]
loss 0.07 accuracy 0.94:  19%|█▉        | 19/100 [00:00<00:01, 46.48it/s]
loss 0.03 accuracy 1.00:  19%|█▉        | 19/100 [00:00<00:01, 46.48it/s]
loss 0.01 accuracy 1.00:  19%|█▉        | 19/100 [00:00<00:01, 46.48it/s]
loss 0.24 accuracy 0.94:  19%|█▉        | 19/100 [00:00<00:01, 46.48it/s]
loss 0.29 accuracy 0.94:  19%|█▉        | 19/100 [00:00<00:01, 46.48it/s]
loss 0.29 accuracy 0.94:  28%|██▊       | 28/100 [00:00<00:01, 57.46it/s]
loss 0.16 accuracy 0.94:  28%|██▊       | 28/100 [00:00<00:01, 57.46it/s]
loss 0.01 accuracy 1.00:  28%|██▊       | 28/100 [00:00<00:01, 57.46it/s]
loss 0.21 accuracy 0.91:  28%|██▊       | 28/100 [00:00<00:01, 57.46it/s]
loss 0.12 accuracy 0.97:  28%|██▊       | 28/100 [00:00<00:01, 57.46it/s]
loss 0.21 accuracy 0.94:  28%|██▊       | 28/100 [00:00<00:01, 57.46it/s]
loss 0.05 accuracy 0.97:  28%|██▊       | 28/100 [00:00<00:01, 57.46it/s]
loss 0.01 accuracy 1.00:  28%|██▊       | 28/100 [00:00<00:01, 57.46it/s]
loss 0.04 accuracy 1.00:  28%|██▊       | 28/100 [00:00<00:01, 57.46it/s]
loss 0.02 accuracy 1.00:  28%|██▊       | 28/100 [00:00<00:01, 57.46it/s]
loss 0.02 accuracy 1.00:  37%|███▋      | 37/100 [00:00<00:00, 64.70it/s]
loss 0.31 accuracy 0.94:  37%|███▋      | 37/100 [00:00<00:00, 64.70it/s]
loss 0.15 accuracy 0.97:  37%|███▋      | 37/100 [00:00<00:00, 64.70it/s]
loss 0.04 accuracy 1.00:  37%|███▋      | 37/100 [00:00<00:00, 64.70it/s]
loss 0.01 accuracy 1.00:  37%|███▋      | 37/100 [00:00<00:00, 64.70it/s]
loss 0.01 accuracy 1.00:  37%|███▋      | 37/100 [00:00<00:00, 64.70it/s]
loss 0.10 accuracy 0.97:  37%|███▋      | 37/100 [00:00<00:00, 64.70it/s]
loss 0.07 accuracy 0.94:  37%|███▋      | 37/100 [00:00<00:00, 64.70it/s]
loss 0.21 accuracy 0.97:  37%|███▋      | 37/100 [00:00<00:00, 64.70it/s]
loss 0.12 accuracy 0.97:  37%|███▋      | 37/100 [00:00<00:00, 64.70it/s]
loss 0.12 accuracy 0.97:  46%|████▌     | 46/100 [00:00<00:00, 69.56it/s]
loss 0.12 accuracy 0.97:  46%|████▌     | 46/100 [00:00<00:00, 69.56it/s]
loss 0.08 accuracy 0.97:  46%|████▌     | 46/100 [00:00<00:00, 69.56it/s]
loss 0.11 accuracy 0.97:  46%|████▌     | 46/100 [00:00<00:00, 69.56it/s]
loss 0.05 accuracy 1.00:  46%|████▌     | 46/100 [00:00<00:00, 69.56it/s]
loss 0.02 accuracy 1.00:  46%|████▌     | 46/100 [00:00<00:00, 69.56it/s]
loss 0.01 accuracy 1.00:  46%|████▌     | 46/100 [00:00<00:00, 69.56it/s]
loss 0.01 accuracy 1.00:  46%|████▌     | 46/100 [00:01<00:00, 69.56it/s]
loss 0.14 accuracy 0.94:  46%|████▌     | 46/100 [00:01<00:00, 69.56it/s]
loss 0.04 accuracy 1.00:  46%|████▌     | 46/100 [00:01<00:00, 69.56it/s]
loss 0.04 accuracy 1.00:  55%|█████▌    | 55/100 [00:01<00:00, 72.91it/s]
loss 0.17 accuracy 0.97:  55%|█████▌    | 55/100 [00:01<00:00, 72.91it/s]
loss 0.01 accuracy 1.00:  55%|█████▌    | 55/100 [00:01<00:00, 72.91it/s]
loss 0.30 accuracy 0.88:  55%|█████▌    | 55/100 [00:01<00:00, 72.91it/s]
loss 0.25 accuracy 0.94:  55%|█████▌    | 55/100 [00:01<00:00, 72.91it/s]
loss 0.03 accuracy 1.00:  55%|█████▌    | 55/100 [00:01<00:00, 72.91it/s]
loss 0.08 accuracy 0.97:  55%|█████▌    | 55/100 [00:01<00:00, 72.91it/s]
loss 0.25 accuracy 0.94:  55%|█████▌    | 55/100 [00:01<00:00, 72.91it/s]
loss 0.16 accuracy 0.94:  55%|█████▌    | 55/100 [00:01<00:00, 72.91it/s]
loss 0.11 accuracy 0.97:  55%|█████▌    | 55/100 [00:01<00:00, 72.91it/s]
loss 0.11 accuracy 0.97:  64%|██████▍   | 64/100 [00:01<00:00, 75.23it/s]
loss 0.14 accuracy 0.94:  64%|██████▍   | 64/100 [00:01<00:00, 75.23it/s]
loss 0.14 accuracy 0.94:  64%|██████▍   | 64/100 [00:01<00:00, 75.23it/s]
loss 0.07 accuracy 0.97:  64%|██████▍   | 64/100 [00:01<00:00, 75.23it/s]
loss 0.08 accuracy 0.97:  64%|██████▍   | 64/100 [00:01<00:00, 75.23it/s]
loss 0.15 accuracy 0.94:  64%|██████▍   | 64/100 [00:01<00:00, 75.23it/s]
loss 0.03 accuracy 1.00:  64%|██████▍   | 64/100 [00:01<00:00, 75.23it/s]
loss 0.22 accuracy 0.88:  64%|██████▍   | 64/100 [00:01<00:00, 75.23it/s]
loss 0.13 accuracy 0.94:  64%|██████▍   | 64/100 [00:01<00:00, 75.23it/s]
loss 0.09 accuracy 0.94:  64%|██████▍   | 64/100 [00:01<00:00, 75.23it/s]
loss 0.09 accuracy 0.94:  73%|███████▎  | 73/100 [00:01<00:00, 76.81it/s]
loss 0.16 accuracy 0.97:  73%|███████▎  | 73/100 [00:01<00:00, 76.81it/s]
loss 0.30 accuracy 0.94:  73%|███████▎  | 73/100 [00:01<00:00, 76.81it/s]
loss 0.16 accuracy 0.97:  73%|███████▎  | 73/100 [00:01<00:00, 76.81it/s]
loss 0.05 accuracy 0.97:  73%|███████▎  | 73/100 [00:01<00:00, 76.81it/s]
loss 0.05 accuracy 0.97:  73%|███████▎  | 73/100 [00:01<00:00, 76.81it/s]
loss 0.03 accuracy 1.00:  73%|███████▎  | 73/100 [00:01<00:00, 76.81it/s]
loss 0.16 accuracy 0.97:  73%|███████▎  | 73/100 [00:01<00:00, 76.81it/s]
loss 0.18 accuracy 0.94:  73%|███████▎  | 73/100 [00:01<00:00, 76.81it/s]
loss 0.08 accuracy 0.97:  73%|███████▎  | 73/100 [00:01<00:00, 76.81it/s]
loss 0.08 accuracy 0.97:  82%|████████▏ | 82/100 [00:01<00:00, 77.88it/s]
loss 0.02 accuracy 1.00:  82%|████████▏ | 82/100 [00:01<00:00, 77.88it/s]
loss 0.00 accuracy 1.00:  82%|████████▏ | 82/100 [00:01<00:00, 77.88it/s]
loss 0.33 accuracy 0.97:  82%|████████▏ | 82/100 [00:01<00:00, 77.88it/s]
loss 0.03 accuracy 1.00:  82%|████████▏ | 82/100 [00:01<00:00, 77.88it/s]
loss 0.03 accuracy 1.00:  82%|████████▏ | 82/100 [00:01<00:00, 77.88it/s]
loss 0.03 accuracy 1.00:  82%|████████▏ | 82/100 [00:01<00:00, 77.88it/s]
loss 0.16 accuracy 0.97:  82%|████████▏ | 82/100 [00:01<00:00, 77.88it/s]
loss 0.04 accuracy 1.00:  82%|████████▏ | 82/100 [00:01<00:00, 77.88it/s]
loss 0.07 accuracy 0.97:  82%|████████▏ | 82/100 [00:01<00:00, 77.88it/s]
loss 0.07 accuracy 0.97:  91%|█████████ | 91/100 [00:01<00:00, 78.61it/s]
loss 0.02 accuracy 1.00:  91%|█████████ | 91/100 [00:01<00:00, 78.61it/s]
loss 0.06 accuracy 1.00:  91%|█████████ | 91/100 [00:01<00:00, 78.61it/s]
loss 0.10 accuracy 0.97:  91%|█████████ | 91/100 [00:01<00:00, 78.61it/s]
loss 0.16 accuracy 0.97:  91%|█████████ | 91/100 [00:01<00:00, 78.61it/s]
loss 0.07 accuracy 0.97:  91%|█████████ | 91/100 [00:01<00:00, 78.61it/s]
loss 0.09 accuracy 0.97:  91%|█████████ | 91/100 [00:01<00:00, 78.61it/s]
loss 0.08 accuracy 0.97:  91%|█████████ | 91/100 [00:01<00:00, 78.61it/s]
loss 0.16 accuracy 0.97:  91%|█████████ | 91/100 [00:01<00:00, 78.61it/s]
loss 0.16 accuracy 0.94:  91%|█████████ | 91/100 [00:01<00:00, 78.61it/s]
loss 0.16 accuracy 0.94: 100%|██████████| 100/100 [00:01<00:00, 79.15it/s]
loss 0.16 accuracy 0.94: 100%|██████████| 100/100 [00:01<00:00, 62.98it/s]

  0%|          | 0/79 [00:00<?, ?it/s]
  5%|▌         | 4/79 [00:00<00:02, 31.09it/s]
 10%|█         | 8/79 [00:00<00:03, 23.47it/s]
 15%|█▌        | 12/79 [00:00<00:02, 26.45it/s]
 20%|██        | 16/79 [00:00<00:02, 28.05it/s]
 25%|██▌       | 20/79 [00:00<00:02, 29.07it/s]
 30%|███       | 24/79 [00:00<00:01, 29.75it/s]
 35%|███▌      | 28/79 [00:01<00:01, 25.65it/s]
 41%|████      | 32/79 [00:01<00:01, 27.20it/s]
 46%|████▌     | 36/79 [00:01<00:01, 28.32it/s]
 51%|█████     | 40/79 [00:01<00:01, 29.09it/s]
 56%|█████▌    | 44/79 [00:01<00:01, 29.66it/s]
 61%|██████    | 48/79 [00:01<00:01, 25.95it/s]
 66%|██████▌   | 52/79 [00:01<00:00, 27.28it/s]
 71%|███████   | 56/79 [00:02<00:00, 28.32it/s]
 76%|███████▌  | 60/79 [00:02<00:00, 29.08it/s]
 81%|████████  | 64/79 [00:02<00:00, 25.51it/s]
 86%|████████▌ | 68/79 [00:02<00:00, 26.90it/s]
 91%|█████████ | 72/79 [00:02<00:00, 28.04it/s]
 96%|█████████▌| 76/79 [00:02<00:00, 28.85it/s]
100%|██████████| 79/79 [00:02<00:00, 28.03it/s]
test set accuracy is 0.971000
reducing lr to 0.0028935

  0%|          | 0/100 [00:00<?, ?it/s]
loss 0.07 accuracy 0.97:   0%|          | 0/100 [00:00<?, ?it/s]
loss 0.07 accuracy 0.97:   1%|          | 1/100 [00:00<00:20,  4.81it/s]
loss 0.16 accuracy 0.97:   1%|          | 1/100 [00:00<00:20,  4.81it/s]
loss 0.16 accuracy 0.97:   2%|▏         | 2/100 [00:00<00:17,  5.51it/s]
loss 0.02 accuracy 1.00:   2%|▏         | 2/100 [00:00<00:17,  5.51it/s]
loss 0.03 accuracy 1.00:   2%|▏         | 2/100 [00:00<00:17,  5.51it/s]
loss 0.01 accuracy 1.00:   2%|▏         | 2/100 [00:00<00:17,  5.51it/s]
loss 0.04 accuracy 1.00:   2%|▏         | 2/100 [00:00<00:17,  5.51it/s]
loss 0.01 accuracy 1.00:   2%|▏         | 2/100 [00:00<00:17,  5.51it/s]
loss 0.04 accuracy 1.00:   2%|▏         | 2/100 [00:00<00:17,  5.51it/s]
loss 0.21 accuracy 0.91:   2%|▏         | 2/100 [00:00<00:17,  5.51it/s]
loss 0.03 accuracy 1.00:   2%|▏         | 2/100 [00:00<00:17,  5.51it/s]
loss 0.03 accuracy 1.00:  10%|█         | 10/100 [00:00<00:03, 28.97it/s]
loss 0.07 accuracy 0.97:  10%|█         | 10/100 [00:00<00:03, 28.97it/s]
loss 0.12 accuracy 0.97:  10%|█         | 10/100 [00:00<00:03, 28.97it/s]
loss 0.25 accuracy 0.94:  10%|█         | 10/100 [00:00<00:03, 28.97it/s]
loss 0.01 accuracy 1.00:  10%|█         | 10/100 [00:00<00:03, 28.97it/s]
loss 0.46 accuracy 0.94:  10%|█         | 10/100 [00:00<00:03, 28.97it/s]
loss 0.08 accuracy 0.97:  10%|█         | 10/100 [00:00<00:03, 28.97it/s]
loss 0.08 accuracy 0.97:  10%|█         | 10/100 [00:00<00:03, 28.97it/s]
loss 0.11 accuracy 0.97:  10%|█         | 10/100 [00:00<00:03, 28.97it/s]
loss 0.11 accuracy 0.97:  18%|█▊        | 18/100 [00:00<00:01, 44.81it/s]
loss 0.02 accuracy 1.00:  18%|█▊        | 18/100 [00:00<00:01, 44.81it/s]
loss 0.01 accuracy 1.00:  18%|█▊        | 18/100 [00:00<00:01, 44.81it/s]
loss 0.21 accuracy 0.94:  18%|█▊        | 18/100 [00:00<00:01, 44.81it/s]
loss 0.03 accuracy 1.00:  18%|█▊        | 18/100 [00:00<00:01, 44.81it/s]
loss 0.05 accuracy 0.97:  18%|█▊        | 18/100 [00:00<00:01, 44.81it/s]
loss 0.02 accuracy 1.00:  18%|█▊        | 18/100 [00:00<00:01, 44.81it/s]
loss 0.02 accuracy 1.00:  18%|█▊        | 18/100 [00:00<00:01, 44.81it/s]
loss 0.02 accuracy 1.00:  18%|█▊        | 18/100 [00:00<00:01, 44.81it/s]
loss 0.02 accuracy 1.00:  26%|██▌       | 26/100 [00:00<00:01, 55.59it/s]
loss 0.11 accuracy 0.97:  26%|██▌       | 26/100 [00:00<00:01, 55.59it/s]
loss 0.17 accuracy 0.94:  26%|██▌       | 26/100 [00:00<00:01, 55.59it/s]
loss 0.03 accuracy 1.00:  26%|██▌       | 26/100 [00:00<00:01, 55.59it/s]
loss 0.04 accuracy 1.00:  26%|██▌       | 26/100 [00:00<00:01, 55.59it/s]
loss 0.06 accuracy 0.97:  26%|██▌       | 26/100 [00:00<00:01, 55.59it/s]
loss 0.05 accuracy 0.97:  26%|██▌       | 26/100 [00:00<00:01, 55.59it/s]
loss 0.03 accuracy 1.00:  26%|██▌       | 26/100 [00:00<00:01, 55.59it/s]
loss 0.02 accuracy 1.00:  26%|██▌       | 26/100 [00:00<00:01, 55.59it/s]
loss 0.02 accuracy 1.00:  34%|███▍      | 34/100 [00:00<00:01, 63.01it/s]
loss 0.07 accuracy 0.97:  34%|███▍      | 34/100 [00:00<00:01, 63.01it/s]
loss 0.12 accuracy 0.97:  34%|███▍      | 34/100 [00:00<00:01, 63.01it/s]
loss 0.19 accuracy 0.97:  34%|███▍      | 34/100 [00:00<00:01, 63.01it/s]
loss 0.09 accuracy 0.97:  34%|███▍      | 34/100 [00:00<00:01, 63.01it/s]
loss 0.01 accuracy 1.00:  34%|███▍      | 34/100 [00:00<00:01, 63.01it/s]
loss 0.01 accuracy 1.00:  34%|███▍      | 34/100 [00:00<00:01, 63.01it/s]
loss 0.13 accuracy 0.97:  34%|███▍      | 34/100 [00:00<00:01, 63.01it/s]
loss 0.09 accuracy 0.97:  34%|███▍      | 34/100 [00:00<00:01, 63.01it/s]
loss 0.09 accuracy 0.97:  42%|████▏     | 42/100 [00:00<00:00, 68.10it/s]
loss 0.02 accuracy 1.00:  42%|████▏     | 42/100 [00:00<00:00, 68.10it/s]
loss 0.03 accuracy 1.00:  42%|████▏     | 42/100 [00:00<00:00, 68.10it/s]
loss 0.06 accuracy 0.97:  42%|████▏     | 42/100 [00:00<00:00, 68.10it/s]
loss 0.09 accuracy 0.97:  42%|████▏     | 42/100 [00:00<00:00, 68.10it/s]
loss 0.09 accuracy 0.94:  42%|████▏     | 42/100 [00:00<00:00, 68.10it/s]
loss 0.06 accuracy 0.97:  42%|████▏     | 42/100 [00:00<00:00, 68.10it/s]
loss 0.04 accuracy 0.97:  42%|████▏     | 42/100 [00:00<00:00, 68.10it/s]
loss 0.00 accuracy 1.00:  42%|████▏     | 42/100 [00:00<00:00, 68.10it/s]
loss 0.15 accuracy 0.97:  42%|████▏     | 42/100 [00:00<00:00, 68.10it/s]
loss 0.15 accuracy 0.97:  51%|█████     | 51/100 [00:00<00:00, 72.04it/s]
loss 0.14 accuracy 0.97:  51%|█████     | 51/100 [00:00<00:00, 72.04it/s]
loss 0.02 accuracy 1.00:  51%|█████     | 51/100 [00:01<00:00, 72.04it/s]
loss 0.03 accuracy 0.97:  51%|█████     | 51/100 [00:01<00:00, 72.04it/s]
loss 0.08 accuracy 0.97:  51%|█████     | 51/100 [00:01<00:00, 72.04it/s]
loss 0.06 accuracy 0.97:  51%|█████     | 51/100 [00:01<00:00, 72.04it/s]
loss 0.04 accuracy 1.00:  51%|█████     | 51/100 [00:01<00:00, 72.04it/s]
loss 0.01 accuracy 1.00:  51%|█████     | 51/100 [00:01<00:00, 72.04it/s]
loss 0.30 accuracy 0.94:  51%|█████     | 51/100 [00:01<00:00, 72.04it/s]
loss 0.30 accuracy 0.94:  59%|█████▉    | 59/100 [00:01<00:00, 74.26it/s]
loss 0.02 accuracy 1.00:  59%|█████▉    | 59/100 [00:01<00:00, 74.26it/s]
loss 0.05 accuracy 0.97:  59%|█████▉    | 59/100 [00:01<00:00, 74.26it/s]
loss 0.05 accuracy 1.00:  59%|█████▉    | 59/100 [00:01<00:00, 74.26it/s]
loss 0.20 accuracy 0.94:  59%|█████▉    | 59/100 [00:01<00:00, 74.26it/s]
loss 0.02 accuracy 1.00:  59%|█████▉    | 59/100 [00:01<00:00, 74.26it/s]
loss 0.13 accuracy 0.94:  59%|█████▉    | 59/100 [00:01<00:00, 74.26it/s]
loss 0.04 accuracy 0.97:  59%|█████▉    | 59/100 [00:01<00:00, 74.26it/s]
loss 0.24 accuracy 0.88:  59%|█████▉    | 59/100 [00:01<00:00, 74.26it/s]
loss 0.24 accuracy 0.88:  67%|██████▋   | 67/100 [00:01<00:00, 75.93it/s]
loss 0.02 accuracy 1.00:  67%|██████▋   | 67/100 [00:01<00:00, 75.93it/s]
loss 0.21 accuracy 0.97:  67%|██████▋   | 67/100 [00:01<00:00, 75.93it/s]
loss 0.02 accuracy 1.00:  67%|██████▋   | 67/100 [00:01<00:00, 75.93it/s]
loss 0.08 accuracy 1.00:  67%|██████▋   | 67/100 [00:01<00:00, 75.93it/s]
loss 0.06 accuracy 0.97:  67%|██████▋   | 67/100 [00:01<00:00, 75.93it/s]
loss 0.22 accuracy 0.91:  67%|██████▋   | 67/100 [00:01<00:00, 75.93it/s]
loss 0.12 accuracy 0.94:  67%|██████▋   | 67/100 [00:01<00:00, 75.93it/s]
loss 0.05 accuracy 1.00:  67%|██████▋   | 67/100 [00:01<00:00, 75.93it/s]
loss 0.05 accuracy 1.00:  75%|███████▌  | 75/100 [00:01<00:00, 72.21it/s]
loss 0.09 accuracy 0.94:  75%|███████▌  | 75/100 [00:01<00:00, 72.21it/s]
loss 0.10 accuracy 0.97:  75%|███████▌  | 75/100 [00:01<00:00, 72.21it/s]
loss 0.16 accuracy 0.97:  75%|███████▌  | 75/100 [00:01<00:00, 72.21it/s]
loss 0.11 accuracy 0.97:  75%|███████▌  | 75/100 [00:01<00:00, 72.21it/s]
loss 0.06 accuracy 0.97:  75%|███████▌  | 75/100 [00:01<00:00, 72.21it/s]
loss 0.27 accuracy 0.94:  75%|███████▌  | 75/100 [00:01<00:00, 72.21it/s]
loss 0.09 accuracy 0.97:  75%|███████▌  | 75/100 [00:01<00:00, 72.21it/s]
loss 0.04 accuracy 0.97:  75%|███████▌  | 75/100 [00:01<00:00, 72.21it/s]
loss 0.04 accuracy 0.97:  83%|████████▎ | 83/100 [00:01<00:00, 74.34it/s]
loss 0.03 accuracy 1.00:  83%|████████▎ | 83/100 [00:01<00:00, 74.34it/s]
loss 0.05 accuracy 0.97:  83%|████████▎ | 83/100 [00:01<00:00, 74.34it/s]
loss 0.05 accuracy 1.00:  83%|████████▎ | 83/100 [00:01<00:00, 74.34it/s]
loss 0.03 accuracy 1.00:  83%|████████▎ | 83/100 [00:01<00:00, 74.34it/s]
loss 0.03 accuracy 1.00:  83%|████████▎ | 83/100 [00:01<00:00, 74.34it/s]
loss 0.01 accuracy 1.00:  83%|████████▎ | 83/100 [00:01<00:00, 74.34it/s]
loss 0.06 accuracy 0.97:  83%|████████▎ | 83/100 [00:01<00:00, 74.34it/s]
loss 0.00 accuracy 1.00:  83%|████████▎ | 83/100 [00:01<00:00, 74.34it/s]
loss 0.00 accuracy 1.00:  91%|█████████ | 91/100 [00:01<00:00, 75.95it/s]
loss 0.03 accuracy 1.00:  91%|█████████ | 91/100 [00:01<00:00, 75.95it/s]
loss 0.11 accuracy 0.97:  91%|█████████ | 91/100 [00:01<00:00, 75.95it/s]
loss 0.14 accuracy 0.94:  91%|█████████ | 91/100 [00:01<00:00, 75.95it/s]
loss 0.18 accuracy 0.94:  91%|█████████ | 91/100 [00:01<00:00, 75.95it/s]
loss 0.12 accuracy 0.94:  91%|█████████ | 91/100 [00:01<00:00, 75.95it/s]
loss 0.02 accuracy 1.00:  91%|█████████ | 91/100 [00:01<00:00, 75.95it/s]
loss 0.01 accuracy 1.00:  91%|█████████ | 91/100 [00:01<00:00, 75.95it/s]
loss 0.05 accuracy 0.97:  91%|█████████ | 91/100 [00:01<00:00, 75.95it/s]
loss 0.19 accuracy 0.97:  91%|█████████ | 91/100 [00:01<00:00, 75.95it/s]
loss 0.19 accuracy 0.97: 100%|██████████| 100/100 [00:01<00:00, 77.27it/s]
loss 0.19 accuracy 0.97: 100%|██████████| 100/100 [00:01<00:00, 61.66it/s]

  0%|          | 0/79 [00:00<?, ?it/s]
  5%|▌         | 4/79 [00:00<00:02, 31.03it/s]
 10%|█         | 8/79 [00:00<00:02, 30.96it/s]
 15%|█▌        | 12/79 [00:00<00:02, 24.77it/s]
 20%|██        | 16/79 [00:00<00:02, 26.86it/s]
 25%|██▌       | 20/79 [00:00<00:02, 28.15it/s]
 30%|███       | 24/79 [00:00<00:01, 29.01it/s]
 35%|███▌      | 28/79 [00:00<00:01, 29.59it/s]
 39%|███▉      | 31/79 [00:01<00:01, 25.24it/s]
 44%|████▍     | 35/79 [00:01<00:01, 26.92it/s]
 49%|████▉     | 39/79 [00:01<00:01, 28.03it/s]
 54%|█████▍    | 43/79 [00:01<00:01, 28.90it/s]
 59%|█████▉    | 47/79 [00:01<00:01, 29.47it/s]
 63%|██████▎   | 50/79 [00:01<00:01, 25.34it/s]
 68%|██████▊   | 54/79 [00:01<00:00, 26.80it/s]
 73%|███████▎  | 58/79 [00:02<00:00, 27.93it/s]
 78%|███████▊  | 62/79 [00:02<00:00, 28.80it/s]
 84%|████████▎ | 66/79 [00:02<00:00, 29.41it/s]
 87%|████████▋ | 69/79 [00:02<00:00, 25.41it/s]
 92%|█████████▏| 73/79 [00:02<00:00, 26.92it/s]
 97%|█████████▋| 77/79 [00:02<00:00, 28.07it/s]
100%|██████████| 79/79 [00:02<00:00, 27.88it/s]
test set accuracy is 0.978200
reducing lr to 0.0024113

  0%|          | 0/100 [00:00<?, ?it/s]
loss 0.01 accuracy 1.00:   0%|          | 0/100 [00:00<?, ?it/s]
loss 0.01 accuracy 1.00:   1%|          | 1/100 [00:00<00:20,  4.82it/s]
loss 0.01 accuracy 1.00:   1%|          | 1/100 [00:00<00:20,  4.82it/s]
loss 0.01 accuracy 1.00:   2%|▏         | 2/100 [00:00<00:17,  5.52it/s]
loss 0.02 accuracy 1.00:   2%|▏         | 2/100 [00:00<00:17,  5.52it/s]
loss 0.21 accuracy 0.94:   2%|▏         | 2/100 [00:00<00:17,  5.52it/s]
loss 0.01 accuracy 1.00:   2%|▏         | 2/100 [00:00<00:17,  5.52it/s]
loss 0.02 accuracy 1.00:   2%|▏         | 2/100 [00:00<00:17,  5.52it/s]
loss 0.11 accuracy 0.97:   2%|▏         | 2/100 [00:00<00:17,  5.52it/s]
loss 0.03 accuracy 1.00:   2%|▏         | 2/100 [00:00<00:17,  5.52it/s]
loss 0.16 accuracy 0.94:   2%|▏         | 2/100 [00:00<00:17,  5.52it/s]
loss 0.01 accuracy 1.00:   2%|▏         | 2/100 [00:00<00:17,  5.52it/s]
loss 0.01 accuracy 1.00:  10%|█         | 10/100 [00:00<00:03, 28.87it/s]
loss 0.03 accuracy 1.00:  10%|█         | 10/100 [00:00<00:03, 28.87it/s]
loss 0.11 accuracy 0.97:  10%|█         | 10/100 [00:00<00:03, 28.87it/s]
loss 0.00 accuracy 1.00:  10%|█         | 10/100 [00:00<00:03, 28.87it/s]
loss 0.08 accuracy 0.97:  10%|█         | 10/100 [00:00<00:03, 28.87it/s]
loss 0.10 accuracy 0.97:  10%|█         | 10/100 [00:00<00:03, 28.87it/s]
loss 0.26 accuracy 0.94:  10%|█         | 10/100 [00:00<00:03, 28.87it/s]
loss 0.01 accuracy 1.00:  10%|█         | 10/100 [00:00<00:03, 28.87it/s]
loss 0.01 accuracy 1.00:  10%|█         | 10/100 [00:00<00:03, 28.87it/s]
loss 0.01 accuracy 1.00:  18%|█▊        | 18/100 [00:00<00:01, 44.48it/s]
loss 0.03 accuracy 1.00:  18%|█▊        | 18/100 [00:00<00:01, 44.48it/s]
loss 0.07 accuracy 0.94:  18%|█▊        | 18/100 [00:00<00:01, 44.48it/s]
loss 0.02 accuracy 1.00:  18%|█▊        | 18/100 [00:00<00:01, 44.48it/s]
loss 0.02 accuracy 1.00:  18%|█▊        | 18/100 [00:00<00:01, 44.48it/s]
loss 0.01 accuracy 1.00:  18%|█▊        | 18/100 [00:00<00:01, 44.48it/s]
loss 0.04 accuracy 1.00:  18%|█▊        | 18/100 [00:00<00:01, 44.48it/s]
loss 0.12 accuracy 0.91:  18%|█▊        | 18/100 [00:00<00:01, 44.48it/s]
loss 0.24 accuracy 0.94:  18%|█▊        | 18/100 [00:00<00:01, 44.48it/s]
loss 0.24 accuracy 0.94:  26%|██▌       | 26/100 [00:00<00:01, 55.00it/s]
loss 0.02 accuracy 1.00:  26%|██▌       | 26/100 [00:00<00:01, 55.00it/s]
loss 0.20 accuracy 0.94:  26%|██▌       | 26/100 [00:00<00:01, 55.00it/s]
loss 0.00 accuracy 1.00:  26%|██▌       | 26/100 [00:00<00:01, 55.00it/s]
loss 0.01 accuracy 1.00:  26%|██▌       | 26/100 [00:00<00:01, 55.00it/s]
loss 0.21 accuracy 0.94:  26%|██▌       | 26/100 [00:00<00:01, 55.00it/s]
loss 0.04 accuracy 1.00:  26%|██▌       | 26/100 [00:00<00:01, 55.00it/s]
loss 0.09 accuracy 0.97:  26%|██▌       | 26/100 [00:00<00:01, 55.00it/s]
loss 0.04 accuracy 1.00:  26%|██▌       | 26/100 [00:00<00:01, 55.00it/s]
loss 0.04 accuracy 1.00:  34%|███▍      | 34/100 [00:00<00:01, 62.23it/s]
loss 0.04 accuracy 0.97:  34%|███▍      | 34/100 [00:00<00:01, 62.23it/s]
loss 0.14 accuracy 0.94:  34%|███▍      | 34/100 [00:00<00:01, 62.23it/s]
loss 0.01 accuracy 1.00:  34%|███▍      | 34/100 [00:00<00:01, 62.23it/s]
loss 0.05 accuracy 1.00:  34%|███▍      | 34/100 [00:00<00:01, 62.23it/s]
loss 0.20 accuracy 0.97:  34%|███▍      | 34/100 [00:00<00:01, 62.23it/s]
loss 0.01 accuracy 1.00:  34%|███▍      | 34/100 [00:00<00:01, 62.23it/s]
loss 0.17 accuracy 0.97:  34%|███▍      | 34/100 [00:00<00:01, 62.23it/s]
loss 0.00 accuracy 1.00:  34%|███▍      | 34/100 [00:00<00:01, 62.23it/s]
loss 0.00 accuracy 1.00:  42%|████▏     | 42/100 [00:00<00:00, 67.12it/s]
loss 0.02 accuracy 1.00:  42%|████▏     | 42/100 [00:00<00:00, 67.12it/s]
loss 0.02 accuracy 1.00:  42%|████▏     | 42/100 [00:00<00:00, 67.12it/s]
loss 0.03 accuracy 1.00:  42%|████▏     | 42/100 [00:00<00:00, 67.12it/s]
loss 0.07 accuracy 0.94:  42%|████▏     | 42/100 [00:00<00:00, 67.12it/s]
loss 0.01 accuracy 1.00:  42%|████▏     | 42/100 [00:00<00:00, 67.12it/s]
loss 0.02 accuracy 1.00:  42%|████▏     | 42/100 [00:00<00:00, 67.12it/s]
loss 0.03 accuracy 1.00:  42%|████▏     | 42/100 [00:00<00:00, 67.12it/s]
loss 0.03 accuracy 0.97:  42%|████▏     | 42/100 [00:00<00:00, 67.12it/s]
loss 0.03 accuracy 0.97:  50%|█████     | 50/100 [00:00<00:00, 70.50it/s]
loss 0.20 accuracy 0.97:  50%|█████     | 50/100 [00:00<00:00, 70.50it/s]
loss 0.03 accuracy 1.00:  50%|█████     | 50/100 [00:01<00:00, 70.50it/s]
loss 0.04 accuracy 1.00:  50%|█████     | 50/100 [00:01<00:00, 70.50it/s]
loss 0.05 accuracy 0.97:  50%|█████     | 50/100 [00:01<00:00, 70.50it/s]
loss 0.10 accuracy 0.94:  50%|█████     | 50/100 [00:01<00:00, 70.50it/s]
loss 0.04 accuracy 1.00:  50%|█████     | 50/100 [00:01<00:00, 70.50it/s]
loss 0.04 accuracy 0.97:  50%|█████     | 50/100 [00:01<00:00, 70.50it/s]
loss 0.09 accuracy 0.94:  50%|█████     | 50/100 [00:01<00:00, 70.50it/s]
loss 0.09 accuracy 0.94:  58%|█████▊    | 58/100 [00:01<00:00, 72.76it/s]
loss 0.11 accuracy 0.97:  58%|█████▊    | 58/100 [00:01<00:00, 72.76it/s]
loss 0.01 accuracy 1.00:  58%|█████▊    | 58/100 [00:01<00:00, 72.76it/s]
loss 0.27 accuracy 0.94:  58%|█████▊    | 58/100 [00:01<00:00, 72.76it/s]
loss 0.10 accuracy 0.97:  58%|█████▊    | 58/100 [00:01<00:00, 72.76it/s]
loss 0.16 accuracy 0.97:  58%|█████▊    | 58/100 [00:01<00:00, 72.76it/s]
loss 0.08 accuracy 0.94:  58%|█████▊    | 58/100 [00:01<00:00, 72.76it/s]
loss 0.00 accuracy 1.00:  58%|█████▊    | 58/100 [00:01<00:00, 72.76it/s]
loss 0.34 accuracy 0.94:  58%|█████▊    | 58/100 [00:01<00:00, 72.76it/s]
loss 0.34 accuracy 0.94:  66%|██████▌   | 66/100 [00:01<00:00, 74.26it/s]
loss 0.11 accuracy 0.97:  66%|██████▌   | 66/100 [00:01<00:00, 74.26it/s]
loss 0.03 accuracy 1.00:  66%|██████▌   | 66/100 [00:01<00:00, 74.26it/s]
loss 0.01 accuracy 1.00:  66%|██████▌   | 66/100 [00:01<00:00, 74.26it/s]
loss 0.09 accuracy 1.00:  66%|██████▌   | 66/100 [00:01<00:00, 74.26it/s]
loss 0.07 accuracy 0.94:  66%|██████▌   | 66/100 [00:01<00:00, 74.26it/s]
loss 0.01 accuracy 1.00:  66%|██████▌   | 66/100 [00:01<00:00, 74.26it/s]
loss 0.00 accuracy 1.00:  66%|██████▌   | 66/100 [00:01<00:00, 74.26it/s]
loss 0.00 accuracy 1.00:  66%|██████▌   | 66/100 [00:01<00:00, 74.26it/s]
loss 0.00 accuracy 1.00:  74%|███████▍  | 74/100 [00:01<00:00, 75.31it/s]
loss 0.01 accuracy 1.00:  74%|███████▍  | 74/100 [00:01<00:00, 75.31it/s]
loss 0.06 accuracy 0.97:  74%|███████▍  | 74/100 [00:01<00:00, 75.31it/s]
loss 0.07 accuracy 0.97:  74%|███████▍  | 74/100 [00:01<00:00, 75.31it/s]
loss 0.01 accuracy 1.00:  74%|███████▍  | 74/100 [00:01<00:00, 75.31it/s]
loss 0.07 accuracy 0.97:  74%|███████▍  | 74/100 [00:01<00:00, 75.31it/s]
loss 0.15 accuracy 0.97:  74%|███████▍  | 74/100 [00:01<00:00, 75.31it/s]
loss 0.02 accuracy 1.00:  74%|███████▍  | 74/100 [00:01<00:00, 75.31it/s]
loss 0.05 accuracy 0.97:  74%|███████▍  | 74/100 [00:01<00:00, 75.31it/s]
loss 0.05 accuracy 0.97:  82%|████████▏ | 82/100 [00:01<00:00, 76.06it/s]
loss 0.08 accuracy 0.97:  82%|████████▏ | 82/100 [00:01<00:00, 76.06it/s]
loss 0.01 accuracy 1.00:  82%|████████▏ | 82/100 [00:01<00:00, 76.06it/s]
loss 0.03 accuracy 1.00:  82%|████████▏ | 82/100 [00:01<00:00, 76.06it/s]
loss 0.05 accuracy 0.97:  82%|████████▏ | 82/100 [00:01<00:00, 76.06it/s]
loss 0.02 accuracy 1.00:  82%|████████▏ | 82/100 [00:01<00:00, 76.06it/s]
loss 0.03 accuracy 1.00:  82%|████████▏ | 82/100 [00:01<00:00, 76.06it/s]
loss 0.02 accuracy 1.00:  82%|████████▏ | 82/100 [00:01<00:00, 76.06it/s]
loss 0.11 accuracy 0.97:  82%|████████▏ | 82/100 [00:01<00:00, 76.06it/s]
loss 0.11 accuracy 0.97:  90%|█████████ | 90/100 [00:01<00:00, 76.59it/s]
loss 0.01 accuracy 1.00:  90%|█████████ | 90/100 [00:01<00:00, 76.59it/s]
loss 0.04 accuracy 1.00:  90%|█████████ | 90/100 [00:01<00:00, 76.59it/s]
loss 0.09 accuracy 0.97:  90%|█████████ | 90/100 [00:01<00:00, 76.59it/s]
loss 0.02 accuracy 1.00:  90%|█████████ | 90/100 [00:01<00:00, 76.59it/s]
loss 0.05 accuracy 0.97:  90%|█████████ | 90/100 [00:01<00:00, 76.59it/s]
loss 0.03 accuracy 0.97:  90%|█████████ | 90/100 [00:01<00:00, 76.59it/s]
loss 0.02 accuracy 1.00:  90%|█████████ | 90/100 [00:01<00:00, 76.59it/s]
loss 0.02 accuracy 1.00:  90%|█████████ | 90/100 [00:01<00:00, 76.59it/s]
loss 0.02 accuracy 1.00:  98%|█████████▊| 98/100 [00:01<00:00, 77.01it/s]
loss 0.00 accuracy 1.00:  98%|█████████▊| 98/100 [00:01<00:00, 77.01it/s]
loss 0.01 accuracy 1.00:  98%|█████████▊| 98/100 [00:01<00:00, 77.01it/s]
loss 0.01 accuracy 1.00: 100%|██████████| 100/100 [00:01<00:00, 61.53it/s]

  0%|          | 0/79 [00:00<?, ?it/s]
  5%|▌         | 4/79 [00:00<00:02, 30.90it/s]
 10%|█         | 8/79 [00:00<00:02, 30.91it/s]
 15%|█▌        | 12/79 [00:00<00:02, 24.69it/s]
 20%|██        | 16/79 [00:00<00:02, 26.87it/s]
 25%|██▌       | 20/79 [00:00<00:02, 28.18it/s]
 30%|███       | 24/79 [00:00<00:01, 29.09it/s]
 35%|███▌      | 28/79 [00:00<00:01, 29.69it/s]
 41%|████      | 32/79 [00:01<00:01, 25.74it/s]
 46%|████▌     | 36/79 [00:01<00:01, 27.18it/s]
 51%|█████     | 40/79 [00:01<00:01, 28.12it/s]
 56%|█████▌    | 44/79 [00:01<00:01, 28.88it/s]
 61%|██████    | 48/79 [00:01<00:01, 29.47it/s]
 65%|██████▍   | 51/79 [00:01<00:01, 25.34it/s]
 70%|██████▉   | 55/79 [00:01<00:00, 26.91it/s]
 75%|███████▍  | 59/79 [00:02<00:00, 28.00it/s]
 80%|███████▉  | 63/79 [00:02<00:00, 28.86it/s]
 85%|████████▍ | 67/79 [00:02<00:00, 29.49it/s]
 89%|████████▊ | 70/79 [00:02<00:00, 25.34it/s]
 94%|█████████▎| 74/79 [00:02<00:00, 26.94it/s]
 99%|█████████▊| 78/79 [00:02<00:00, 28.03it/s]
100%|██████████| 79/79 [00:02<00:00, 27.90it/s]
test set accuracy is 0.981400
reducing lr to 0.0020094

transformer.py

  0%|          | 0/50 [00:00<?, ?it/s]
loss 2.35 accuracy 0.07:   0%|          | 0/50 [00:04<?, ?it/s]
loss 2.35 accuracy 0.07:   2%|▏         | 1/50 [00:04<03:58,  4.86s/it]
loss 2.23 accuracy 0.23:   2%|▏         | 1/50 [00:06<03:58,  4.86s/it]
loss 2.23 accuracy 0.23:   4%|▍         | 2/50 [00:06<02:15,  2.83s/it]
loss 2.20 accuracy 0.19:   4%|▍         | 2/50 [00:06<02:15,  2.83s/it]
loss 2.15 accuracy 0.24:   4%|▍         | 2/50 [00:06<02:15,  2.83s/it]
loss 2.12 accuracy 0.24:   4%|▍         | 2/50 [00:06<02:15,  2.83s/it]
loss 2.09 accuracy 0.20:   4%|▍         | 2/50 [00:06<02:15,  2.83s/it]
loss 2.03 accuracy 0.29:   4%|▍         | 2/50 [00:06<02:15,  2.83s/it]
loss 1.96 accuracy 0.31:   4%|▍         | 2/50 [00:06<02:15,  2.83s/it]
loss 1.92 accuracy 0.33:   4%|▍         | 2/50 [00:06<02:15,  2.83s/it]
loss 1.88 accuracy 0.34:   4%|▍         | 2/50 [00:06<02:15,  2.83s/it]
loss 1.79 accuracy 0.39:   4%|▍         | 2/50 [00:06<02:15,  2.83s/it]
loss 1.76 accuracy 0.41:   4%|▍         | 2/50 [00:06<02:15,  2.83s/it]
loss 1.65 accuracy 0.47:   4%|▍         | 2/50 [00:06<02:15,  2.83s/it]
loss 1.64 accuracy 0.46:   4%|▍         | 2/50 [00:06<02:15,  2.83s/it]
loss 1.43 accuracy 0.65:   4%|▍         | 2/50 [00:06<02:15,  2.83s/it]
loss 1.38 accuracy 0.72:   4%|▍         | 2/50 [00:06<02:15,  2.83s/it]
loss 1.38 accuracy 0.72:  32%|███▏      | 16/50 [00:06<00:07,  4.37it/s]
loss 1.26 accuracy 0.72:  32%|███▏      | 16/50 [00:06<00:07,  4.37it/s]
loss 1.13 accuracy 0.81:  32%|███▏      | 16/50 [00:06<00:07,  4.37it/s]
loss 1.03 accuracy 0.83:  32%|███▏      | 16/50 [00:06<00:07,  4.37it/s]
loss 0.89 accuracy 0.85:  32%|███▏      | 16/50 [00:06<00:07,  4.37it/s]
loss 0.78 accuracy 0.87:  32%|███▏      | 16/50 [00:06<00:07,  4.37it/s]
loss 0.82 accuracy 0.85:  32%|███▏      | 16/50 [00:06<00:07,  4.37it/s]
loss 0.70 accuracy 0.87:  32%|███▏      | 16/50 [00:06<00:07,  4.37it/s]
loss 0.65 accuracy 0.86:  32%|███▏      | 16/50 [00:06<00:07,  4.37it/s]
loss 0.66 accuracy 0.85:  32%|███▏      | 16/50 [00:06<00:07,  4.37it/s]
loss 0.60 accuracy 0.86:  32%|███▏      | 16/50 [00:06<00:07,  4.37it/s]
loss 0.57 accuracy 0.85:  32%|███▏      | 16/50 [00:06<00:07,  4.37it/s]
loss 0.60 accuracy 0.84:  32%|███▏      | 16/50 [00:06<00:07,  4.37it/s]
loss 0.54 accuracy 0.85:  32%|███▏      | 16/50 [00:06<00:07,  4.37it/s]
loss 0.52 accuracy 0.86:  32%|███▏      | 16/50 [00:06<00:07,  4.37it/s]
loss 0.52 accuracy 0.86:  60%|██████    | 30/50 [00:06<00:02,  9.73it/s]
loss 0.50 accuracy 0.87:  60%|██████    | 30/50 [00:06<00:02,  9.73it/s]
loss 0.49 accuracy 0.87:  60%|██████    | 30/50 [00:06<00:02,  9.73it/s]
loss 0.58 accuracy 0.85:  60%|██████    | 30/50 [00:06<00:02,  9.73it/s]
loss 0.49 accuracy 0.87:  60%|██████    | 30/50 [00:06<00:02,  9.73it/s]
loss 0.45 accuracy 0.89:  60%|██████    | 30/50 [00:06<00:02,  9.73it/s]
loss 0.51 accuracy 0.86:  60%|██████    | 30/50 [00:06<00:02,  9.73it/s]
loss 0.46 accuracy 0.86:  60%|██████    | 30/50 [00:06<00:02,  9.73it/s]
loss 0.48 accuracy 0.85:  60%|██████    | 30/50 [00:06<00:02,  9.73it/s]
loss 0.45 accuracy 0.87:  60%|██████    | 30/50 [00:06<00:02,  9.73it/s]
loss 0.44 accuracy 0.86:  60%|██████    | 30/50 [00:06<00:02,  9.73it/s]
loss 0.41 accuracy 0.87:  60%|██████    | 30/50 [00:06<00:02,  9.73it/s]
loss 0.45 accuracy 0.87:  60%|██████    | 30/50 [00:06<00:02,  9.73it/s]
loss 0.42 accuracy 0.87:  60%|██████    | 30/50 [00:06<00:02,  9.73it/s]
loss 0.42 accuracy 0.86:  60%|██████    | 30/50 [00:06<00:02,  9.73it/s]
loss 0.42 accuracy 0.86:  88%|████████▊ | 44/50 [00:06<00:00, 16.68it/s]
loss 0.43 accuracy 0.85:  88%|████████▊ | 44/50 [00:06<00:00, 16.68it/s]
loss 0.39 accuracy 0.87:  88%|████████▊ | 44/50 [00:06<00:00, 16.68it/s]
loss 0.38 accuracy 0.88:  88%|████████▊ | 44/50 [00:06<00:00, 16.68it/s]
loss 0.39 accuracy 0.86:  88%|████████▊ | 44/50 [00:06<00:00, 16.68it/s]
loss 0.35 accuracy 0.89:  88%|████████▊ | 44/50 [00:06<00:00, 16.68it/s]
loss 0.38 accuracy 0.87:  88%|████████▊ | 44/50 [00:06<00:00, 16.68it/s]
loss 0.38 accuracy 0.87: 100%|██████████| 50/50 [00:06<00:00,  7.55it/s]

  0%|          | 0/16 [00:00<?, ?it/s]
  6%|▋         | 1/16 [00:01<00:17,  1.15s/it]
 62%|██████▎   | 10/16 [00:01<00:00, 10.61it/s]
100%|██████████| 16/16 [00:02<00:00,  6.73it/s]
100%|██████████| 16/16 [00:02<00:00,  6.35it/s]
test set accuracy is 0.867750
reducing lr to 0.0025

  0%|          | 0/50 [00:00<?, ?it/s]
loss 0.36 accuracy 0.88:   0%|          | 0/50 [00:00<?, ?it/s]
loss 0.39 accuracy 0.87:   0%|          | 0/50 [00:00<?, ?it/s]
loss 0.39 accuracy 0.87:   4%|▍         | 2/50 [00:00<00:04, 10.36it/s]
loss 0.36 accuracy 0.88:   4%|▍         | 2/50 [00:00<00:04, 10.36it/s]
loss 0.43 accuracy 0.86:   4%|▍         | 2/50 [00:00<00:04, 10.36it/s]
loss 0.40 accuracy 0.87:   4%|▍         | 2/50 [00:00<00:04, 10.36it/s]
loss 0.40 accuracy 0.87:   4%|▍         | 2/50 [00:00<00:04, 10.36it/s]
loss 0.38 accuracy 0.87:   4%|▍         | 2/50 [00:00<00:04, 10.36it/s]
loss 0.38 accuracy 0.86:   4%|▍         | 2/50 [00:00<00:04, 10.36it/s]
loss 0.39 accuracy 0.88:   4%|▍         | 2/50 [00:00<00:04, 10.36it/s]
loss 0.40 accuracy 0.85:   4%|▍         | 2/50 [00:00<00:04, 10.36it/s]
loss 0.38 accuracy 0.86:   4%|▍         | 2/50 [00:00<00:04, 10.36it/s]
loss 0.38 accuracy 0.87:   4%|▍         | 2/50 [00:00<00:04, 10.36it/s]
loss 0.39 accuracy 0.86:   4%|▍         | 2/50 [00:00<00:04, 10.36it/s]
loss 0.36 accuracy 0.88:   4%|▍         | 2/50 [00:00<00:04, 10.36it/s]
loss 0.34 accuracy 0.88:   4%|▍         | 2/50 [00:00<00:04, 10.36it/s]
loss 0.34 accuracy 0.88:   4%|▍         | 2/50 [00:00<00:04, 10.36it/s]
loss 0.34 accuracy 0.88:  32%|███▏      | 16/50 [00:00<00:00, 64.82it/s]
loss 0.34 accuracy 0.88:  32%|███▏      | 16/50 [00:00<00:00, 64.82it/s]
loss 0.38 accuracy 0.85:  32%|███▏      | 16/50 [00:00<00:00, 64.82it/s]
loss 0.34 accuracy 0.88:  32%|███▏      | 16/50 [00:00<00:00, 64.82it/s]
loss 0.38 accuracy 0.85:  32%|███▏      | 16/50 [00:00<00:00, 64.82it/s]
loss 0.33 accuracy 0.88:  32%|███▏      | 16/50 [00:00<00:00, 64.82it/s]
loss 0.40 accuracy 0.86:  32%|███▏      | 16/50 [00:00<00:00, 64.82it/s]
loss 0.33 accuracy 0.88:  32%|███▏      | 16/50 [00:00<00:00, 64.82it/s]
loss 0.36 accuracy 0.86:  32%|███▏      | 16/50 [00:00<00:00, 64.82it/s]
loss 0.36 accuracy 0.87:  32%|███▏      | 16/50 [00:00<00:00, 64.82it/s]
loss 0.34 accuracy 0.87:  32%|███▏      | 16/50 [00:00<00:00, 64.82it/s]
loss 0.33 accuracy 0.89:  32%|███▏      | 16/50 [00:00<00:00, 64.82it/s]
loss 0.36 accuracy 0.86:  32%|███▏      | 16/50 [00:00<00:00, 64.82it/s]
loss 0.34 accuracy 0.88:  32%|███▏      | 16/50 [00:00<00:00, 64.82it/s]
loss 0.33 accuracy 0.89:  32%|███▏      | 16/50 [00:00<00:00, 64.82it/s]
loss 0.33 accuracy 0.89:  60%|██████    | 30/50 [00:00<00:00, 92.64it/s]
loss 0.29 accuracy 0.90:  60%|██████    | 30/50 [00:00<00:00, 92.64it/s]
loss 0.35 accuracy 0.87:  60%|██████    | 30/50 [00:00<00:00, 92.64it/s]
loss 0.32 accuracy 0.88:  60%|██████    | 30/50 [00:00<00:00, 92.64it/s]
loss 0.35 accuracy 0.87:  60%|██████    | 30/50 [00:00<00:00, 92.64it/s]
loss 0.33 accuracy 0.88:  60%|██████    | 30/50 [00:00<00:00, 92.64it/s]
loss 0.30 accuracy 0.89:  60%|██████    | 30/50 [00:00<00:00, 92.64it/s]
loss 0.32 accuracy 0.89:  60%|██████    | 30/50 [00:00<00:00, 92.64it/s]
loss 0.32 accuracy 0.89:  60%|██████    | 30/50 [00:00<00:00, 92.64it/s]
loss 0.29 accuracy 0.90:  60%|██████    | 30/50 [00:00<00:00, 92.64it/s]
loss 0.30 accuracy 0.89:  60%|██████    | 30/50 [00:00<00:00, 92.64it/s]
loss 0.29 accuracy 0.90:  60%|██████    | 30/50 [00:00<00:00, 92.64it/s]
loss 0.30 accuracy 0.89:  60%|██████    | 30/50 [00:00<00:00, 92.64it/s]
loss 0.31 accuracy 0.88:  60%|██████    | 30/50 [00:00<00:00, 92.64it/s]
loss 0.30 accuracy 0.89:  60%|██████    | 30/50 [00:00<00:00, 92.64it/s]
loss 0.30 accuracy 0.89:  88%|████████▊ | 44/50 [00:00<00:00, 108.77it/s]
loss 0.27 accuracy 0.90:  88%|████████▊ | 44/50 [00:00<00:00, 108.77it/s]
loss 0.31 accuracy 0.88:  88%|████████▊ | 44/50 [00:00<00:00, 108.77it/s]
loss 0.28 accuracy 0.90:  88%|████████▊ | 44/50 [00:00<00:00, 108.77it/s]
loss 0.28 accuracy 0.89:  88%|████████▊ | 44/50 [00:00<00:00, 108.77it/s]
loss 0.27 accuracy 0.90:  88%|████████▊ | 44/50 [00:00<00:00, 108.77it/s]
loss 0.25 accuracy 0.91:  88%|████████▊ | 44/50 [00:00<00:00, 108.77it/s]
loss 0.25 accuracy 0.91: 100%|██████████| 50/50 [00:00<00:00, 92.45it/s] 

  0%|          | 0/16 [00:00<?, ?it/s]
 38%|███▊      | 6/16 [00:00<00:00, 56.07it/s]
 94%|█████████▍| 15/16 [00:00<00:00, 74.99it/s]
100%|██████████| 16/16 [00:00<00:00, 72.89it/s]
test set accuracy is 0.898583
reducing lr to 0.0021

  0%|          | 0/50 [00:00<?, ?it/s]
loss 0.32 accuracy 0.87:   0%|          | 0/50 [00:00<?, ?it/s]
loss 0.35 accuracy 0.87:   0%|          | 0/50 [00:00<?, ?it/s]
loss 0.35 accuracy 0.87:   4%|▍         | 2/50 [00:00<00:04, 10.42it/s]
loss 0.31 accuracy 0.88:   4%|▍         | 2/50 [00:00<00:04, 10.42it/s]
loss 0.26 accuracy 0.91:   4%|▍         | 2/50 [00:00<00:04, 10.42it/s]
loss 0.28 accuracy 0.90:   4%|▍         | 2/50 [00:00<00:04, 10.42it/s]
loss 0.25 accuracy 0.91:   4%|▍         | 2/50 [00:00<00:04, 10.42it/s]
loss 0.28 accuracy 0.90:   4%|▍         | 2/50 [00:00<00:04, 10.42it/s]
loss 0.28 accuracy 0.89:   4%|▍         | 2/50 [00:00<00:04, 10.42it/s]
loss 0.29 accuracy 0.89:   4%|▍         | 2/50 [00:00<00:04, 10.42it/s]
loss 0.29 accuracy 0.89:   4%|▍         | 2/50 [00:00<00:04, 10.42it/s]
loss 0.27 accuracy 0.91:   4%|▍         | 2/50 [00:00<00:04, 10.42it/s]
loss 0.28 accuracy 0.89:   4%|▍         | 2/50 [00:00<00:04, 10.42it/s]
loss 0.29 accuracy 0.89:   4%|▍         | 2/50 [00:00<00:04, 10.42it/s]
loss 0.26 accuracy 0.91:   4%|▍         | 2/50 [00:00<00:04, 10.42it/s]
loss 0.30 accuracy 0.90:   4%|▍         | 2/50 [00:00<00:04, 10.42it/s]
loss 0.30 accuracy 0.89:   4%|▍         | 2/50 [00:00<00:04, 10.42it/s]
loss 0.30 accuracy 0.89:  32%|███▏      | 16/50 [00:00<00:00, 65.34it/s]
loss 0.25 accuracy 0.90:  32%|███▏      | 16/50 [00:00<00:00, 65.34it/s]
loss 0.25 accuracy 0.91:  32%|███▏      | 16/50 [00:00<00:00, 65.34it/s]
loss 0.28 accuracy 0.90:  32%|███▏      | 16/50 [00:00<00:00, 65.34it/s]
loss 0.23 accuracy 0.90:  32%|███▏      | 16/50 [00:00<00:00, 65.34it/s]
loss 0.26 accuracy 0.89:  32%|███▏      | 16/50 [00:00<00:00, 65.34it/s]
loss 0.25 accuracy 0.90:  32%|███▏      | 16/50 [00:00<00:00, 65.34it/s]
loss 0.26 accuracy 0.90:  32%|███▏      | 16/50 [00:00<00:00, 65.34it/s]
loss 0.23 accuracy 0.91:  32%|███▏      | 16/50 [00:00<00:00, 65.34it/s]
loss 0.24 accuracy 0.91:  32%|███▏      | 16/50 [00:00<00:00, 65.34it/s]
loss 0.22 accuracy 0.93:  32%|███▏      | 16/50 [00:00<00:00, 65.34it/s]
loss 0.24 accuracy 0.90:  32%|███▏      | 16/50 [00:00<00:00, 65.34it/s]
loss 0.25 accuracy 0.90:  32%|███▏      | 16/50 [00:00<00:00, 65.34it/s]
loss 0.24 accuracy 0.90:  32%|███▏      | 16/50 [00:00<00:00, 65.34it/s]
loss 0.23 accuracy 0.91:  32%|███▏      | 16/50 [00:00<00:00, 65.34it/s]
loss 0.19 accuracy 0.93:  32%|███▏      | 16/50 [00:00<00:00, 65.34it/s]
loss 0.19 accuracy 0.93:  62%|██████▏   | 31/50 [00:00<00:00, 95.09it/s]
loss 0.19 accuracy 0.93:  62%|██████▏   | 31/50 [00:00<00:00, 95.09it/s]
loss 0.22 accuracy 0.92:  62%|██████▏   | 31/50 [00:00<00:00, 95.09it/s]
loss 0.20 accuracy 0.93:  62%|██████▏   | 31/50 [00:00<00:00, 95.09it/s]
loss 0.20 accuracy 0.91:  62%|██████▏   | 31/50 [00:00<00:00, 95.09it/s]
loss 0.22 accuracy 0.91:  62%|██████▏   | 31/50 [00:00<00:00, 95.09it/s]
loss 0.24 accuracy 0.92:  62%|██████▏   | 31/50 [00:00<00:00, 95.09it/s]
loss 0.22 accuracy 0.90:  62%|██████▏   | 31/50 [00:00<00:00, 95.09it/s]
loss 0.21 accuracy 0.91:  62%|██████▏   | 31/50 [00:00<00:00, 95.09it/s]
loss 0.19 accuracy 0.92:  62%|██████▏   | 31/50 [00:00<00:00, 95.09it/s]
loss 0.22 accuracy 0.91:  62%|██████▏   | 31/50 [00:00<00:00, 95.09it/s]
loss 0.16 accuracy 0.93:  62%|██████▏   | 31/50 [00:00<00:00, 95.09it/s]
loss 0.20 accuracy 0.94:  62%|██████▏   | 31/50 [00:00<00:00, 95.09it/s]
loss 0.22 accuracy 0.91:  62%|██████▏   | 31/50 [00:00<00:00, 95.09it/s]
loss 0.21 accuracy 0.92:  62%|██████▏   | 31/50 [00:00<00:00, 95.09it/s]
loss 0.21 accuracy 0.92:  62%|██████▏   | 31/50 [00:00<00:00, 95.09it/s]
loss 0.21 accuracy 0.92:  92%|█████████▏| 46/50 [00:00<00:00, 111.79it/s]
loss 0.19 accuracy 0.92:  92%|█████████▏| 46/50 [00:00<00:00, 111.79it/s]
loss 0.19 accuracy 0.92:  92%|█████████▏| 46/50 [00:00<00:00, 111.79it/s]
loss 0.21 accuracy 0.92:  92%|█████████▏| 46/50 [00:00<00:00, 111.79it/s]
loss 0.22 accuracy 0.92:  92%|█████████▏| 46/50 [00:00<00:00, 111.79it/s]
loss 0.22 accuracy 0.92: 100%|██████████| 50/50 [00:00<00:00, 93.71it/s] 

  0%|          | 0/16 [00:00<?, ?it/s]
 56%|█████▋    | 9/16 [00:00<00:00, 86.95it/s]
100%|██████████| 16/16 [00:00<00:00, 73.27it/s]
test set accuracy is 0.929500
reducing lr to 0.0017

  0%|          | 0/50 [00:00<?, ?it/s]
loss 0.19 accuracy 0.92:   0%|          | 0/50 [00:00<?, ?it/s]
loss 0.21 accuracy 0.92:   0%|          | 0/50 [00:00<?, ?it/s]
loss 0.21 accuracy 0.92:   4%|▍         | 2/50 [00:00<00:04, 10.44it/s]
loss 0.15 accuracy 0.96:   4%|▍         | 2/50 [00:00<00:04, 10.44it/s]
loss 0.21 accuracy 0.91:   4%|▍         | 2/50 [00:00<00:04, 10.44it/s]
loss 0.19 accuracy 0.94:   4%|▍         | 2/50 [00:00<00:04, 10.44it/s]
loss 0.20 accuracy 0.93:   4%|▍         | 2/50 [00:00<00:04, 10.44it/s]
loss 0.19 accuracy 0.92:   4%|▍         | 2/50 [00:00<00:04, 10.44it/s]
loss 0.19 accuracy 0.92:   4%|▍         | 2/50 [00:00<00:04, 10.44it/s]
loss 0.24 accuracy 0.92:   4%|▍         | 2/50 [00:00<00:04, 10.44it/s]
loss 0.15 accuracy 0.94:   4%|▍         | 2/50 [00:00<00:04, 10.44it/s]
loss 0.20 accuracy 0.92:   4%|▍         | 2/50 [00:00<00:04, 10.44it/s]
loss 0.18 accuracy 0.93:   4%|▍         | 2/50 [00:00<00:04, 10.44it/s]
loss 0.15 accuracy 0.95:   4%|▍         | 2/50 [00:00<00:04, 10.44it/s]
loss 0.18 accuracy 0.93:   4%|▍         | 2/50 [00:00<00:04, 10.44it/s]
loss 0.18 accuracy 0.92:   4%|▍         | 2/50 [00:00<00:04, 10.44it/s]
loss 0.19 accuracy 0.92:   4%|▍         | 2/50 [00:00<00:04, 10.44it/s]
loss 0.19 accuracy 0.92:  32%|███▏      | 16/50 [00:00<00:00, 65.19it/s]
loss 0.17 accuracy 0.94:  32%|███▏      | 16/50 [00:00<00:00, 65.19it/s]
loss 0.18 accuracy 0.93:  32%|███▏      | 16/50 [00:00<00:00, 65.19it/s]
loss 0.15 accuracy 0.94:  32%|███▏      | 16/50 [00:00<00:00, 65.19it/s]
loss 0.16 accuracy 0.94:  32%|███▏      | 16/50 [00:00<00:00, 65.19it/s]
loss 0.17 accuracy 0.94:  32%|███▏      | 16/50 [00:00<00:00, 65.19it/s]
loss 0.17 accuracy 0.92:  32%|███▏      | 16/50 [00:00<00:00, 65.19it/s]
loss 0.18 accuracy 0.93:  32%|███▏      | 16/50 [00:00<00:00, 65.19it/s]
loss 0.16 accuracy 0.94:  32%|███▏      | 16/50 [00:00<00:00, 65.19it/s]
loss 0.12 accuracy 0.95:  32%|███▏      | 16/50 [00:00<00:00, 65.19it/s]
loss 0.14 accuracy 0.94:  32%|███▏      | 16/50 [00:00<00:00, 65.19it/s]
loss 0.14 accuracy 0.95:  32%|███▏      | 16/50 [00:00<00:00, 65.19it/s]
loss 0.12 accuracy 0.96:  32%|███▏      | 16/50 [00:00<00:00, 65.19it/s]
loss 0.11 accuracy 0.96:  32%|███▏      | 16/50 [00:00<00:00, 65.19it/s]
loss 0.12 accuracy 0.95:  32%|███▏      | 16/50 [00:00<00:00, 65.19it/s]
loss 0.12 accuracy 0.95:  60%|██████    | 30/50 [00:00<00:00, 92.90it/s]
loss 0.14 accuracy 0.93:  60%|██████    | 30/50 [00:00<00:00, 92.90it/s]
loss 0.13 accuracy 0.95:  60%|██████    | 30/50 [00:00<00:00, 92.90it/s]
loss 0.12 accuracy 0.95:  60%|██████    | 30/50 [00:00<00:00, 92.90it/s]
loss 0.11 accuracy 0.95:  60%|██████    | 30/50 [00:00<00:00, 92.90it/s]
loss 0.13 accuracy 0.95:  60%|██████    | 30/50 [00:00<00:00, 92.90it/s]
loss 0.12 accuracy 0.96:  60%|██████    | 30/50 [00:00<00:00, 92.90it/s]
loss 0.15 accuracy 0.93:  60%|██████    | 30/50 [00:00<00:00, 92.90it/s]
loss 0.08 accuracy 0.98:  60%|██████    | 30/50 [00:00<00:00, 92.90it/s]
loss 0.08 accuracy 0.98:  60%|██████    | 30/50 [00:00<00:00, 92.90it/s]
loss 0.10 accuracy 0.96:  60%|██████    | 30/50 [00:00<00:00, 92.90it/s]
loss 0.13 accuracy 0.95:  60%|██████    | 30/50 [00:00<00:00, 92.90it/s]
loss 0.11 accuracy 0.97:  60%|██████    | 30/50 [00:00<00:00, 92.90it/s]
loss 0.11 accuracy 0.96:  60%|██████    | 30/50 [00:00<00:00, 92.90it/s]
loss 0.09 accuracy 0.98:  60%|██████    | 30/50 [00:00<00:00, 92.90it/s]
loss 0.08 accuracy 0.97:  60%|██████    | 30/50 [00:00<00:00, 92.90it/s]
loss 0.08 accuracy 0.97:  90%|█████████ | 45/50 [00:00<00:00, 110.13it/s]
loss 0.10 accuracy 0.96:  90%|█████████ | 45/50 [00:00<00:00, 110.13it/s]
loss 0.08 accuracy 0.98:  90%|█████████ | 45/50 [00:00<00:00, 110.13it/s]
loss 0.09 accuracy 0.97:  90%|█████████ | 45/50 [00:00<00:00, 110.13it/s]
loss 0.08 accuracy 0.98:  90%|█████████ | 45/50 [00:00<00:00, 110.13it/s]
loss 0.07 accuracy 0.98:  90%|█████████ | 45/50 [00:00<00:00, 110.13it/s]
loss 0.07 accuracy 0.98: 100%|██████████| 50/50 [00:00<00:00, 93.05it/s] 

  0%|          | 0/16 [00:00<?, ?it/s]
 56%|█████▋    | 9/16 [00:00<00:00, 87.34it/s]
100%|██████████| 16/16 [00:00<00:00, 73.32it/s]
test set accuracy is 0.979917
reducing lr to 0.0014

  0%|          | 0/50 [00:00<?, ?it/s]
loss 0.09 accuracy 0.97:   0%|          | 0/50 [00:00<?, ?it/s]
loss 0.22 accuracy 0.93:   0%|          | 0/50 [00:00<?, ?it/s]
loss 0.22 accuracy 0.93:   4%|▍         | 2/50 [00:00<00:04, 10.35it/s]
loss 0.13 accuracy 0.95:   4%|▍         | 2/50 [00:00<00:04, 10.35it/s]
loss 0.19 accuracy 0.93:   4%|▍         | 2/50 [00:00<00:04, 10.35it/s]
loss 0.17 accuracy 0.93:   4%|▍         | 2/50 [00:00<00:04, 10.35it/s]
loss 0.09 accuracy 0.97:   4%|▍         | 2/50 [00:00<00:04, 10.35it/s]
loss 0.17 accuracy 0.95:   4%|▍         | 2/50 [00:00<00:04, 10.35it/s]
loss 0.10 accuracy 0.97:   4%|▍         | 2/50 [00:00<00:04, 10.35it/s]
loss 0.12 accuracy 0.96:   4%|▍         | 2/50 [00:00<00:04, 10.35it/s]
loss 0.11 accuracy 0.96:   4%|▍         | 2/50 [00:00<00:04, 10.35it/s]
loss 0.15 accuracy 0.95:   4%|▍         | 2/50 [00:00<00:04, 10.35it/s]
loss 0.16 accuracy 0.93:   4%|▍         | 2/50 [00:00<00:04, 10.35it/s]
loss 0.08 accuracy 0.98:   4%|▍         | 2/50 [00:00<00:04, 10.35it/s]
loss 0.11 accuracy 0.95:   4%|▍         | 2/50 [00:00<00:04, 10.35it/s]
loss 0.10 accuracy 0.97:   4%|▍         | 2/50 [00:00<00:04, 10.35it/s]
loss 0.09 accuracy 0.97:   4%|▍         | 2/50 [00:00<00:04, 10.35it/s]
loss 0.09 accuracy 0.97:  32%|███▏      | 16/50 [00:00<00:00, 65.06it/s]
loss 0.12 accuracy 0.95:  32%|███▏      | 16/50 [00:00<00:00, 65.06it/s]
loss 0.08 accuracy 0.97:  32%|███▏      | 16/50 [00:00<00:00, 65.06it/s]
loss 0.10 accuracy 0.97:  32%|███▏      | 16/50 [00:00<00:00, 65.06it/s]
loss 0.10 accuracy 0.96:  32%|███▏      | 16/50 [00:00<00:00, 65.06it/s]
loss 0.09 accuracy 0.96:  32%|███▏      | 16/50 [00:00<00:00, 65.06it/s]
loss 0.08 accuracy 0.97:  32%|███▏      | 16/50 [00:00<00:00, 65.06it/s]
loss 0.10 accuracy 0.97:  32%|███▏      | 16/50 [00:00<00:00, 65.06it/s]
loss 0.08 accuracy 0.98:  32%|███▏      | 16/50 [00:00<00:00, 65.06it/s]
loss 0.10 accuracy 0.97:  32%|███▏      | 16/50 [00:00<00:00, 65.06it/s]
loss 0.10 accuracy 0.97:  32%|███▏      | 16/50 [00:00<00:00, 65.06it/s]
loss 0.08 accuracy 0.98:  32%|███▏      | 16/50 [00:00<00:00, 65.06it/s]
loss 0.09 accuracy 0.97:  32%|███▏      | 16/50 [00:00<00:00, 65.06it/s]
loss 0.09 accuracy 0.97:  32%|███▏      | 16/50 [00:00<00:00, 65.06it/s]
loss 0.08 accuracy 0.97:  32%|███▏      | 16/50 [00:00<00:00, 65.06it/s]
loss 0.08 accuracy 0.97:  60%|██████    | 30/50 [00:00<00:00, 92.63it/s]
loss 0.10 accuracy 0.98:  60%|██████    | 30/50 [00:00<00:00, 92.63it/s]
loss 0.06 accuracy 0.98:  60%|██████    | 30/50 [00:00<00:00, 92.63it/s]
loss 0.08 accuracy 0.97:  60%|██████    | 30/50 [00:00<00:00, 92.63it/s]
loss 0.08 accuracy 0.96:  60%|██████    | 30/50 [00:00<00:00, 92.63it/s]
loss 0.08 accuracy 0.97:  60%|██████    | 30/50 [00:00<00:00, 92.63it/s]
loss 0.06 accuracy 0.98:  60%|██████    | 30/50 [00:00<00:00, 92.63it/s]
loss 0.08 accuracy 0.98:  60%|██████    | 30/50 [00:00<00:00, 92.63it/s]
loss 0.07 accuracy 0.98:  60%|██████    | 30/50 [00:00<00:00, 92.63it/s]
loss 0.06 accuracy 0.99:  60%|██████    | 30/50 [00:00<00:00, 92.63it/s]
loss 0.08 accuracy 0.98:  60%|██████    | 30/50 [00:00<00:00, 92.63it/s]
loss 0.06 accuracy 0.98:  60%|██████    | 30/50 [00:00<00:00, 92.63it/s]
loss 0.05 accuracy 0.99:  60%|██████    | 30/50 [00:00<00:00, 92.63it/s]
loss 0.06 accuracy 0.99:  60%|██████    | 30/50 [00:00<00:00, 92.63it/s]
loss 0.10 accuracy 0.97:  60%|██████    | 30/50 [00:00<00:00, 92.63it/s]
loss 0.06 accuracy 0.99:  60%|██████    | 30/50 [00:00<00:00, 92.63it/s]
loss 0.06 accuracy 0.99:  90%|█████████ | 45/50 [00:00<00:00, 109.96it/s]
loss 0.05 accuracy 0.99:  90%|█████████ | 45/50 [00:00<00:00, 109.96it/s]
loss 0.06 accuracy 0.99:  90%|█████████ | 45/50 [00:00<00:00, 109.96it/s]
loss 0.09 accuracy 0.98:  90%|█████████ | 45/50 [00:00<00:00, 109.96it/s]
loss 0.06 accuracy 0.98:  90%|█████████ | 45/50 [00:00<00:00, 109.96it/s]
loss 0.06 accuracy 0.98:  90%|█████████ | 45/50 [00:00<00:00, 109.96it/s]
loss 0.06 accuracy 0.98: 100%|██████████| 50/50 [00:00<00:00, 92.77it/s] 

  0%|          | 0/16 [00:00<?, ?it/s]
 56%|█████▋    | 9/16 [00:00<00:00, 86.92it/s]
100%|██████████| 16/16 [00:00<00:00, 87.32it/s]
test set accuracy is 0.988917
reducing lr to 0.0012

  0%|          | 0/50 [00:00<?, ?it/s]
loss 0.05 accuracy 0.99:   0%|          | 0/50 [00:00<?, ?it/s]
loss 0.05 accuracy 0.99:   2%|▏         | 1/50 [00:00<00:06,  8.05it/s]
loss 0.13 accuracy 0.95:   2%|▏         | 1/50 [00:00<00:06,  8.05it/s]
loss 0.13 accuracy 0.95:   4%|▍         | 2/50 [00:00<00:05,  8.86it/s]
loss 0.07 accuracy 0.98:   4%|▍         | 2/50 [00:00<00:05,  8.86it/s]
loss 0.08 accuracy 0.97:   4%|▍         | 2/50 [00:00<00:05,  8.86it/s]
loss 0.11 accuracy 0.96:   4%|▍         | 2/50 [00:00<00:05,  8.86it/s]
loss 0.09 accuracy 0.97:   4%|▍         | 2/50 [00:00<00:05,  8.86it/s]
loss 0.09 accuracy 0.97:   4%|▍         | 2/50 [00:00<00:05,  8.86it/s]
loss 0.06 accuracy 0.98:   4%|▍         | 2/50 [00:00<00:05,  8.86it/s]
loss 0.07 accuracy 0.98:   4%|▍         | 2/50 [00:00<00:05,  8.86it/s]
loss 0.04 accuracy 0.99:   4%|▍         | 2/50 [00:00<00:05,  8.86it/s]
loss 0.07 accuracy 0.98:   4%|▍         | 2/50 [00:00<00:05,  8.86it/s]
loss 0.06 accuracy 0.98:   4%|▍         | 2/50 [00:00<00:05,  8.86it/s]
loss 0.05 accuracy 0.99:   4%|▍         | 2/50 [00:00<00:05,  8.86it/s]
loss 0.05 accuracy 0.98:   4%|▍         | 2/50 [00:00<00:05,  8.86it/s]
loss 0.08 accuracy 0.97:   4%|▍         | 2/50 [00:00<00:05,  8.86it/s]
loss 0.05 accuracy 0.99:   4%|▍         | 2/50 [00:00<00:05,  8.86it/s]
loss 0.05 accuracy 0.99:  32%|███▏      | 16/50 [00:00<00:00, 64.18it/s]
loss 0.05 accuracy 0.98:  32%|███▏      | 16/50 [00:00<00:00, 64.18it/s]
loss 0.04 accuracy 0.99:  32%|███▏      | 16/50 [00:00<00:00, 64.18it/s]
loss 0.06 accuracy 0.98:  32%|███▏      | 16/50 [00:00<00:00, 64.18it/s]
loss 0.05 accuracy 0.99:  32%|███▏      | 16/50 [00:00<00:00, 64.18it/s]
loss 0.06 accuracy 0.98:  32%|███▏      | 16/50 [00:00<00:00, 64.18it/s]
loss 0.04 accuracy 0.99:  32%|███▏      | 16/50 [00:00<00:00, 64.18it/s]
loss 0.05 accuracy 0.99:  32%|███▏      | 16/50 [00:00<00:00, 64.18it/s]
loss 0.04 accuracy 0.98:  32%|███▏      | 16/50 [00:00<00:00, 64.18it/s]
loss 0.05 accuracy 0.99:  32%|███▏      | 16/50 [00:00<00:00, 64.18it/s]
loss 0.05 accuracy 0.98:  32%|███▏      | 16/50 [00:00<00:00, 64.18it/s]
loss 0.05 accuracy 0.98:  32%|███▏      | 16/50 [00:00<00:00, 64.18it/s]
loss 0.04 accuracy 0.99:  32%|███▏      | 16/50 [00:00<00:00, 64.18it/s]
loss 0.03 accuracy 1.00:  32%|███▏      | 16/50 [00:00<00:00, 64.18it/s]
loss 0.04 accuracy 0.99:  32%|███▏      | 16/50 [00:00<00:00, 64.18it/s]
loss 0.04 accuracy 0.99:  60%|██████    | 30/50 [00:00<00:00, 92.25it/s]
loss 0.06 accuracy 0.99:  60%|██████    | 30/50 [00:00<00:00, 92.25it/s]
loss 0.04 accuracy 0.99:  60%|██████    | 30/50 [00:00<00:00, 92.25it/s]
loss 0.03 accuracy 0.99:  60%|██████    | 30/50 [00:00<00:00, 92.25it/s]
loss 0.04 accuracy 0.99:  60%|██████    | 30/50 [00:00<00:00, 92.25it/s]
loss 0.05 accuracy 0.99:  60%|██████    | 30/50 [00:00<00:00, 92.25it/s]
loss 0.03 accuracy 0.99:  60%|██████    | 30/50 [00:00<00:00, 92.25it/s]
loss 0.05 accuracy 0.98:  60%|██████    | 30/50 [00:00<00:00, 92.25it/s]
loss 0.03 accuracy 0.99:  60%|██████    | 30/50 [00:00<00:00, 92.25it/s]
loss 0.04 accuracy 0.98:  60%|██████    | 30/50 [00:00<00:00, 92.25it/s]
loss 0.03 accuracy 0.99:  60%|██████    | 30/50 [00:00<00:00, 92.25it/s]
loss 0.07 accuracy 0.98:  60%|██████    | 30/50 [00:00<00:00, 92.25it/s]
loss 0.03 accuracy 0.99:  60%|██████    | 30/50 [00:00<00:00, 92.25it/s]
loss 0.03 accuracy 0.99:  60%|██████    | 30/50 [00:00<00:00, 92.25it/s]
loss 0.03 accuracy 0.99:  60%|██████    | 30/50 [00:00<00:00, 92.25it/s]
loss 0.03 accuracy 0.99:  88%|████████▊ | 44/50 [00:00<00:00, 108.73it/s]
loss 0.02 accuracy 1.00:  88%|████████▊ | 44/50 [00:00<00:00, 108.73it/s]
loss 0.06 accuracy 0.99:  88%|████████▊ | 44/50 [00:00<00:00, 108.73it/s]
loss 0.03 accuracy 1.00:  88%|████████▊ | 44/50 [00:00<00:00, 108.73it/s]
loss 0.03 accuracy 0.99:  88%|████████▊ | 44/50 [00:00<00:00, 108.73it/s]
loss 0.03 accuracy 0.99:  88%|████████▊ | 44/50 [00:00<00:00, 108.73it/s]
loss 0.04 accuracy 0.99:  88%|████████▊ | 44/50 [00:00<00:00, 108.73it/s]
loss 0.04 accuracy 0.99: 100%|██████████| 50/50 [00:00<00:00, 86.77it/s] 

  0%|          | 0/16 [00:00<?, ?it/s]
 56%|█████▋    | 9/16 [00:00<00:00, 87.70it/s]
100%|██████████| 16/16 [00:00<00:00, 87.74it/s]
test set accuracy is 0.995000
reducing lr to 0.0010

  0%|          | 0/50 [00:00<?, ?it/s]
loss 0.04 accuracy 1.00:   0%|          | 0/50 [00:00<?, ?it/s]
loss 0.04 accuracy 1.00:   2%|▏         | 1/50 [00:00<00:06,  7.91it/s]
loss 0.03 accuracy 0.99:   2%|▏         | 1/50 [00:00<00:06,  7.91it/s]
loss 0.03 accuracy 0.99:   4%|▍         | 2/50 [00:00<00:05,  8.81it/s]
loss 0.04 accuracy 0.99:   4%|▍         | 2/50 [00:00<00:05,  8.81it/s]
loss 0.05 accuracy 0.98:   4%|▍         | 2/50 [00:00<00:05,  8.81it/s]
loss 0.04 accuracy 0.99:   4%|▍         | 2/50 [00:00<00:05,  8.81it/s]
loss 0.06 accuracy 0.98:   4%|▍         | 2/50 [00:00<00:05,  8.81it/s]
loss 0.03 accuracy 1.00:   4%|▍         | 2/50 [00:00<00:05,  8.81it/s]
loss 0.02 accuracy 1.00:   4%|▍         | 2/50 [00:00<00:05,  8.81it/s]
loss 0.03 accuracy 0.99:   4%|▍         | 2/50 [00:00<00:05,  8.81it/s]
loss 0.04 accuracy 0.99:   4%|▍         | 2/50 [00:00<00:05,  8.81it/s]
loss 0.07 accuracy 0.99:   4%|▍         | 2/50 [00:00<00:05,  8.81it/s]
loss 0.05 accuracy 0.99:   4%|▍         | 2/50 [00:00<00:05,  8.81it/s]
loss 0.03 accuracy 0.99:   4%|▍         | 2/50 [00:00<00:05,  8.81it/s]
loss 0.03 accuracy 0.99:   4%|▍         | 2/50 [00:00<00:05,  8.81it/s]
loss 0.03 accuracy 0.99:   4%|▍         | 2/50 [00:00<00:05,  8.81it/s]
loss 0.03 accuracy 1.00:   4%|▍         | 2/50 [00:00<00:05,  8.81it/s]
loss 0.03 accuracy 1.00:  32%|███▏      | 16/50 [00:00<00:00, 64.03it/s]
loss 0.03 accuracy 0.99:  32%|███▏      | 16/50 [00:00<00:00, 64.03it/s]
loss 0.03 accuracy 0.99:  32%|███▏      | 16/50 [00:00<00:00, 64.03it/s]
loss 0.02 accuracy 1.00:  32%|███▏      | 16/50 [00:00<00:00, 64.03it/s]
loss 0.02 accuracy 1.00:  32%|███▏      | 16/50 [00:00<00:00, 64.03it/s]
loss 0.03 accuracy 1.00:  32%|███▏      | 16/50 [00:00<00:00, 64.03it/s]
loss 0.03 accuracy 0.99:  32%|███▏      | 16/50 [00:00<00:00, 64.03it/s]
loss 0.02 accuracy 0.99:  32%|███▏      | 16/50 [00:00<00:00, 64.03it/s]
loss 0.02 accuracy 1.00:  32%|███▏      | 16/50 [00:00<00:00, 64.03it/s]
loss 0.03 accuracy 0.99:  32%|███▏      | 16/50 [00:00<00:00, 64.03it/s]
loss 0.02 accuracy 1.00:  32%|███▏      | 16/50 [00:00<00:00, 64.03it/s]
loss 0.03 accuracy 0.99:  32%|███▏      | 16/50 [00:00<00:00, 64.03it/s]
loss 0.03 accuracy 1.00:  32%|███▏      | 16/50 [00:00<00:00, 64.03it/s]
loss 0.02 accuracy 0.99:  32%|███▏      | 16/50 [00:00<00:00, 64.03it/s]
loss 0.02 accuracy 1.00:  32%|███▏      | 16/50 [00:00<00:00, 64.03it/s]
loss 0.02 accuracy 1.00:  60%|██████    | 30/50 [00:00<00:00, 91.97it/s]
loss 0.01 accuracy 1.00:  60%|██████    | 30/50 [00:00<00:00, 91.97it/s]
loss 0.03 accuracy 0.99:  60%|██████    | 30/50 [00:00<00:00, 91.97it/s]
loss 0.02 accuracy 0.99:  60%|██████    | 30/50 [00:00<00:00, 91.97it/s]
loss 0.01 accuracy 1.00:  60%|██████    | 30/50 [00:00<00:00, 91.97it/s]
loss 0.01 accuracy 1.00:  60%|██████    | 30/50 [00:00<00:00, 91.97it/s]
loss 0.01 accuracy 1.00:  60%|██████    | 30/50 [00:00<00:00, 91.97it/s]
loss 0.02 accuracy 1.00:  60%|██████    | 30/50 [00:00<00:00, 91.97it/s]
loss 0.01 accuracy 1.00:  60%|██████    | 30/50 [00:00<00:00, 91.97it/s]
loss 0.02 accuracy 1.00:  60%|██████    | 30/50 [00:00<00:00, 91.97it/s]
loss 0.01 accuracy 1.00:  60%|██████    | 30/50 [00:00<00:00, 91.97it/s]
loss 0.02 accuracy 1.00:  60%|██████    | 30/50 [00:00<00:00, 91.97it/s]
loss 0.01 accuracy 1.00:  60%|██████    | 30/50 [00:00<00:00, 91.97it/s]
loss 0.01 accuracy 1.00:  60%|██████    | 30/50 [00:00<00:00, 91.97it/s]
loss 0.01 accuracy 1.00:  60%|██████    | 30/50 [00:00<00:00, 91.97it/s]
loss 0.01 accuracy 1.00:  88%|████████▊ | 44/50 [00:00<00:00, 108.20it/s]
loss 0.02 accuracy 1.00:  88%|████████▊ | 44/50 [00:00<00:00, 108.20it/s]
loss 0.03 accuracy 0.99:  88%|████████▊ | 44/50 [00:00<00:00, 108.20it/s]
loss 0.01 accuracy 1.00:  88%|████████▊ | 44/50 [00:00<00:00, 108.20it/s]
loss 0.02 accuracy 1.00:  88%|████████▊ | 44/50 [00:00<00:00, 108.20it/s]
loss 0.02 accuracy 1.00:  88%|████████▊ | 44/50 [00:00<00:00, 108.20it/s]
loss 0.01 accuracy 1.00:  88%|████████▊ | 44/50 [00:00<00:00, 108.20it/s]
loss 0.01 accuracy 1.00: 100%|██████████| 50/50 [00:00<00:00, 86.26it/s] 

  0%|          | 0/16 [00:00<?, ?it/s]
 56%|█████▋    | 9/16 [00:00<00:00, 86.56it/s]
100%|██████████| 16/16 [00:00<00:00, 87.15it/s]
test set accuracy is 0.999583
reducing lr to 0.0008

  0%|          | 0/50 [00:00<?, ?it/s]
loss 0.01 accuracy 1.00:   0%|          | 0/50 [00:00<?, ?it/s]
loss 0.02 accuracy 0.99:   0%|          | 0/50 [00:00<?, ?it/s]
loss 0.02 accuracy 0.99:   4%|▍         | 2/50 [00:00<00:05,  8.69it/s]
loss 0.01 accuracy 1.00:   4%|▍         | 2/50 [00:00<00:05,  8.69it/s]
loss 0.03 accuracy 1.00:   4%|▍         | 2/50 [00:00<00:05,  8.69it/s]
loss 0.02 accuracy 1.00:   4%|▍         | 2/50 [00:00<00:05,  8.69it/s]
loss 0.02 accuracy 1.00:   4%|▍         | 2/50 [00:00<00:05,  8.69it/s]
loss 0.02 accuracy 0.99:   4%|▍         | 2/50 [00:00<00:05,  8.69it/s]
loss 0.02 accuracy 1.00:   4%|▍         | 2/50 [00:00<00:05,  8.69it/s]
loss 0.02 accuracy 0.99:   4%|▍         | 2/50 [00:00<00:05,  8.69it/s]
loss 0.01 accuracy 1.00:   4%|▍         | 2/50 [00:00<00:05,  8.69it/s]
loss 0.01 accuracy 1.00:   4%|▍         | 2/50 [00:00<00:05,  8.69it/s]
loss 0.01 accuracy 1.00:   4%|▍         | 2/50 [00:00<00:05,  8.69it/s]
loss 0.01 accuracy 1.00:   4%|▍         | 2/50 [00:00<00:05,  8.69it/s]
loss 0.02 accuracy 0.99:   4%|▍         | 2/50 [00:00<00:05,  8.69it/s]
loss 0.03 accuracy 0.99:   4%|▍         | 2/50 [00:00<00:05,  8.69it/s]
loss 0.02 accuracy 1.00:   4%|▍         | 2/50 [00:00<00:05,  8.69it/s]
loss 0.02 accuracy 1.00:  32%|███▏      | 16/50 [00:00<00:00, 58.53it/s]
loss 0.01 accuracy 1.00:  32%|███▏      | 16/50 [00:00<00:00, 58.53it/s]
loss 0.01 accuracy 1.00:  32%|███▏      | 16/50 [00:00<00:00, 58.53it/s]
loss 0.01 accuracy 1.00:  32%|███▏      | 16/50 [00:00<00:00, 58.53it/s]
loss 0.01 accuracy 1.00:  32%|███▏      | 16/50 [00:00<00:00, 58.53it/s]
loss 0.02 accuracy 0.99:  32%|███▏      | 16/50 [00:00<00:00, 58.53it/s]
loss 0.01 accuracy 1.00:  32%|███▏      | 16/50 [00:00<00:00, 58.53it/s]
loss 0.02 accuracy 0.99:  32%|███▏      | 16/50 [00:00<00:00, 58.53it/s]
loss 0.03 accuracy 1.00:  32%|███▏      | 16/50 [00:00<00:00, 58.53it/s]
loss 0.03 accuracy 1.00:  32%|███▏      | 16/50 [00:00<00:00, 58.53it/s]
loss 0.01 accuracy 1.00:  32%|███▏      | 16/50 [00:00<00:00, 58.53it/s]
loss 0.02 accuracy 0.99:  32%|███▏      | 16/50 [00:00<00:00, 58.53it/s]
loss 0.02 accuracy 1.00:  32%|███▏      | 16/50 [00:00<00:00, 58.53it/s]
loss 0.02 accuracy 1.00:  56%|█████▌    | 28/50 [00:00<00:00, 76.19it/s]
loss 0.02 accuracy 0.99:  56%|█████▌    | 28/50 [00:00<00:00, 76.19it/s]
loss 0.01 accuracy 1.00:  56%|█████▌    | 28/50 [00:00<00:00, 76.19it/s]
loss 0.01 accuracy 1.00:  56%|█████▌    | 28/50 [00:00<00:00, 76.19it/s]
loss 0.01 accuracy 0.99:  56%|█████▌    | 28/50 [00:00<00:00, 76.19it/s]
loss 0.01 accuracy 1.00:  56%|█████▌    | 28/50 [00:00<00:00, 76.19it/s]
loss 0.01 accuracy 0.99:  56%|█████▌    | 28/50 [00:00<00:00, 76.19it/s]
loss 0.01 accuracy 1.00:  56%|█████▌    | 28/50 [00:00<00:00, 76.19it/s]
loss 0.01 accuracy 1.00:  56%|█████▌    | 28/50 [00:00<00:00, 76.19it/s]
loss 0.01 accuracy 1.00:  56%|█████▌    | 28/50 [00:00<00:00, 76.19it/s]
loss 0.01 accuracy 1.00:  56%|█████▌    | 28/50 [00:00<00:00, 76.19it/s]
loss 0.01 accuracy 1.00:  56%|█████▌    | 28/50 [00:00<00:00, 76.19it/s]
loss 0.02 accuracy 0.99:  56%|█████▌    | 28/50 [00:00<00:00, 76.19it/s]
loss 0.01 accuracy 0.99:  56%|█████▌    | 28/50 [00:00<00:00, 76.19it/s]
loss 0.01 accuracy 1.00:  56%|█████▌    | 28/50 [00:00<00:00, 76.19it/s]
loss 0.01 accuracy 1.00:  84%|████████▍ | 42/50 [00:00<00:00, 96.43it/s]
loss 0.01 accuracy 1.00:  84%|████████▍ | 42/50 [00:00<00:00, 96.43it/s]
loss 0.01 accuracy 1.00:  84%|████████▍ | 42/50 [00:00<00:00, 96.43it/s]
loss 0.01 accuracy 1.00:  84%|████████▍ | 42/50 [00:00<00:00, 96.43it/s]
loss 0.02 accuracy 1.00:  84%|████████▍ | 42/50 [00:00<00:00, 96.43it/s]
loss 0.01 accuracy 1.00:  84%|████████▍ | 42/50 [00:00<00:00, 96.43it/s]
loss 0.02 accuracy 0.99:  84%|████████▍ | 42/50 [00:00<00:00, 96.43it/s]
loss 0.06 accuracy 0.99:  84%|████████▍ | 42/50 [00:00<00:00, 96.43it/s]
loss 0.01 accuracy 0.99:  84%|████████▍ | 42/50 [00:00<00:00, 96.43it/s]
loss 0.01 accuracy 0.99: 100%|██████████| 50/50 [00:00<00:00, 82.55it/s]

  0%|          | 0/16 [00:00<?, ?it/s]
 56%|█████▋    | 9/16 [00:00<00:00, 86.23it/s]
100%|██████████| 16/16 [00:00<00:00, 87.04it/s]
test set accuracy is 1.000000
reducing lr to 0.0007

  0%|          | 0/50 [00:00<?, ?it/s]
loss 0.01 accuracy 1.00:   0%|          | 0/50 [00:00<?, ?it/s]
loss 0.02 accuracy 0.99:   0%|          | 0/50 [00:00<?, ?it/s]
loss 0.02 accuracy 0.99:   4%|▍         | 2/50 [00:00<00:05,  8.63it/s]
loss 0.01 accuracy 1.00:   4%|▍         | 2/50 [00:00<00:05,  8.63it/s]
loss 0.01 accuracy 1.00:   4%|▍         | 2/50 [00:00<00:05,  8.63it/s]
loss 0.01 accuracy 1.00:   4%|▍         | 2/50 [00:00<00:05,  8.63it/s]
loss 0.01 accuracy 0.99:   4%|▍         | 2/50 [00:00<00:05,  8.63it/s]
loss 0.01 accuracy 1.00:   4%|▍         | 2/50 [00:00<00:05,  8.63it/s]
loss 0.01 accuracy 1.00:   4%|▍         | 2/50 [00:00<00:05,  8.63it/s]
loss 0.01 accuracy 1.00:   4%|▍         | 2/50 [00:00<00:05,  8.63it/s]
loss 0.01 accuracy 1.00:   4%|▍         | 2/50 [00:00<00:05,  8.63it/s]
loss 0.02 accuracy 1.00:   4%|▍         | 2/50 [00:00<00:05,  8.63it/s]
loss 0.01 accuracy 1.00:   4%|▍         | 2/50 [00:00<00:05,  8.63it/s]
loss 0.01 accuracy 1.00:   4%|▍         | 2/50 [00:00<00:05,  8.63it/s]
loss 0.00 accuracy 1.00:   4%|▍         | 2/50 [00:00<00:05,  8.63it/s]
loss 0.01 accuracy 0.99:   4%|▍         | 2/50 [00:00<00:05,  8.63it/s]
loss 0.03 accuracy 0.99:   4%|▍         | 2/50 [00:00<00:05,  8.63it/s]
loss 0.03 accuracy 0.99:  32%|███▏      | 16/50 [00:00<00:00, 58.20it/s]
loss 0.01 accuracy 1.00:  32%|███▏      | 16/50 [00:00<00:00, 58.20it/s]
loss 0.01 accuracy 1.00:  32%|███▏      | 16/50 [00:00<00:00, 58.20it/s]
loss 0.01 accuracy 1.00:  32%|███▏      | 16/50 [00:00<00:00, 58.20it/s]
loss 0.02 accuracy 0.99:  32%|███▏      | 16/50 [00:00<00:00, 58.20it/s]
loss 0.02 accuracy 0.99:  32%|███▏      | 16/50 [00:00<00:00, 58.20it/s]
loss 0.01 accuracy 1.00:  32%|███▏      | 16/50 [00:00<00:00, 58.20it/s]
loss 0.01 accuracy 1.00:  32%|███▏      | 16/50 [00:00<00:00, 58.20it/s]
loss 0.01 accuracy 1.00:  32%|███▏      | 16/50 [00:00<00:00, 58.20it/s]
loss 0.01 accuracy 1.00:  32%|███▏      | 16/50 [00:00<00:00, 58.20it/s]
loss 0.01 accuracy 1.00:  32%|███▏      | 16/50 [00:00<00:00, 58.20it/s]
loss 0.01 accuracy 1.00:  32%|███▏      | 16/50 [00:00<00:00, 58.20it/s]
loss 0.01 accuracy 0.99:  32%|███▏      | 16/50 [00:00<00:00, 58.20it/s]
loss 0.01 accuracy 0.99:  32%|███▏      | 16/50 [00:00<00:00, 58.20it/s]
loss 0.01 accuracy 0.99:  32%|███▏      | 16/50 [00:00<00:00, 58.20it/s]
loss 0.01 accuracy 0.99:  60%|██████    | 30/50 [00:00<00:00, 86.38it/s]
loss 0.01 accuracy 1.00:  60%|██████    | 30/50 [00:00<00:00, 86.38it/s]
loss 0.01 accuracy 1.00:  60%|██████    | 30/50 [00:00<00:00, 86.38it/s]
loss 0.00 accuracy 1.00:  60%|██████    | 30/50 [00:00<00:00, 86.38it/s]
loss 0.00 accuracy 1.00:  60%|██████    | 30/50 [00:00<00:00, 86.38it/s]
loss 0.00 accuracy 1.00:  60%|██████    | 30/50 [00:00<00:00, 86.38it/s]
loss 0.00 accuracy 1.00:  60%|██████    | 30/50 [00:00<00:00, 86.38it/s]
loss 0.01 accuracy 1.00:  60%|██████    | 30/50 [00:00<00:00, 86.38it/s]
loss 0.00 accuracy 1.00:  60%|██████    | 30/50 [00:00<00:00, 86.38it/s]
loss 0.01 accuracy 1.00:  60%|██████    | 30/50 [00:00<00:00, 86.38it/s]
loss 0.00 accuracy 1.00:  60%|██████    | 30/50 [00:00<00:00, 86.38it/s]
loss 0.02 accuracy 0.99:  60%|██████    | 30/50 [00:00<00:00, 86.38it/s]
loss 0.00 accuracy 1.00:  60%|██████    | 30/50 [00:00<00:00, 86.38it/s]
loss 0.01 accuracy 1.00:  60%|██████    | 30/50 [00:00<00:00, 86.38it/s]
loss 0.00 accuracy 1.00:  60%|██████    | 30/50 [00:00<00:00, 86.38it/s]
loss 0.00 accuracy 1.00:  88%|████████▊ | 44/50 [00:00<00:00, 103.89it/s]
loss 0.02 accuracy 1.00:  88%|████████▊ | 44/50 [00:00<00:00, 103.89it/s]
loss 0.00 accuracy 1.00:  88%|████████▊ | 44/50 [00:00<00:00, 103.89it/s]
loss 0.00 accuracy 1.00:  88%|████████▊ | 44/50 [00:00<00:00, 103.89it/s]
loss 0.01 accuracy 1.00:  88%|████████▊ | 44/50 [00:00<00:00, 103.89it/s]
loss 0.00 accuracy 1.00:  88%|████████▊ | 44/50 [00:00<00:00, 103.89it/s]
loss 0.00 accuracy 1.00:  88%|████████▊ | 44/50 [00:00<00:00, 103.89it/s]
loss 0.00 accuracy 1.00: 100%|██████████| 50/50 [00:00<00:00, 86.30it/s] 

  0%|          | 0/16 [00:00<?, ?it/s]
 56%|█████▋    | 9/16 [00:00<00:00, 87.82it/s]
100%|██████████| 16/16 [00:00<00:00, 88.10it/s]
test set accuracy is 1.000000
reducing lr to 0.0006

  0%|          | 0/50 [00:00<?, ?it/s]
loss 0.00 accuracy 1.00:   0%|          | 0/50 [00:00<?, ?it/s]
loss 0.01 accuracy 1.00:   0%|          | 0/50 [00:00<?, ?it/s]
loss 0.01 accuracy 1.00:   4%|▍         | 2/50 [00:00<00:05,  8.75it/s]
loss 0.00 accuracy 1.00:   4%|▍         | 2/50 [00:00<00:05,  8.75it/s]
loss 0.00 accuracy 1.00:   4%|▍         | 2/50 [00:00<00:05,  8.75it/s]
loss 0.01 accuracy 0.99:   4%|▍         | 2/50 [00:00<00:05,  8.75it/s]
loss 0.01 accuracy 0.99:   4%|▍         | 2/50 [00:00<00:05,  8.75it/s]
loss 0.00 accuracy 1.00:   4%|▍         | 2/50 [00:00<00:05,  8.75it/s]
loss 0.00 accuracy 1.00:   4%|▍         | 2/50 [00:00<00:05,  8.75it/s]
loss 0.00 accuracy 1.00:   4%|▍         | 2/50 [00:00<00:05,  8.75it/s]
loss 0.01 accuracy 1.00:   4%|▍         | 2/50 [00:00<00:05,  8.75it/s]
loss 0.01 accuracy 1.00:   4%|▍         | 2/50 [00:00<00:05,  8.75it/s]
loss 0.00 accuracy 1.00:   4%|▍         | 2/50 [00:00<00:05,  8.75it/s]
loss 0.01 accuracy 1.00:   4%|▍         | 2/50 [00:00<00:05,  8.75it/s]
loss 0.00 accuracy 1.00:   4%|▍         | 2/50 [00:00<00:05,  8.75it/s]
loss 0.02 accuracy 1.00:   4%|▍         | 2/50 [00:00<00:05,  8.75it/s]
loss 0.01 accuracy 1.00:   4%|▍         | 2/50 [00:00<00:05,  8.75it/s]
loss 0.01 accuracy 1.00:  32%|███▏      | 16/50 [00:00<00:00, 58.80it/s]
loss 0.00 accuracy 1.00:  32%|███▏      | 16/50 [00:00<00:00, 58.80it/s]
loss 0.01 accuracy 1.00:  32%|███▏      | 16/50 [00:00<00:00, 58.80it/s]
loss 0.01 accuracy 1.00:  32%|███▏      | 16/50 [00:00<00:00, 58.80it/s]
loss 0.01 accuracy 1.00:  32%|███▏      | 16/50 [00:00<00:00, 58.80it/s]
loss 0.01 accuracy 1.00:  32%|███▏      | 16/50 [00:00<00:00, 58.80it/s]
loss 0.00 accuracy 1.00:  32%|███▏      | 16/50 [00:00<00:00, 58.80it/s]
loss 0.00 accuracy 1.00:  32%|███▏      | 16/50 [00:00<00:00, 58.80it/s]
loss 0.00 accuracy 1.00:  32%|███▏      | 16/50 [00:00<00:00, 58.80it/s]
loss 0.00 accuracy 1.00:  32%|███▏      | 16/50 [00:00<00:00, 58.80it/s]
loss 0.00 accuracy 1.00:  32%|███▏      | 16/50 [00:00<00:00, 58.80it/s]
loss 0.00 accuracy 1.00:  32%|███▏      | 16/50 [00:00<00:00, 58.80it/s]
loss 0.00 accuracy 1.00:  32%|███▏      | 16/50 [00:00<00:00, 58.80it/s]
loss 0.01 accuracy 1.00:  32%|███▏      | 16/50 [00:00<00:00, 58.80it/s]
loss 0.00 accuracy 1.00:  32%|███▏      | 16/50 [00:00<00:00, 58.80it/s]
loss 0.00 accuracy 1.00:  60%|██████    | 30/50 [00:00<00:00, 86.77it/s]
loss 0.01 accuracy 1.00:  60%|██████    | 30/50 [00:00<00:00, 86.77it/s]
loss 0.00 accuracy 1.00:  60%|██████    | 30/50 [00:00<00:00, 86.77it/s]
loss 0.01 accuracy 1.00:  60%|██████    | 30/50 [00:00<00:00, 86.77it/s]
loss 0.00 accuracy 1.00:  60%|██████    | 30/50 [00:00<00:00, 86.77it/s]
loss 0.00 accuracy 1.00:  60%|██████    | 30/50 [00:00<00:00, 86.77it/s]
loss 0.02 accuracy 1.00:  60%|██████    | 30/50 [00:00<00:00, 86.77it/s]
loss 0.00 accuracy 1.00:  60%|██████    | 30/50 [00:00<00:00, 86.77it/s]
loss 0.00 accuracy 1.00:  60%|██████    | 30/50 [00:00<00:00, 86.77it/s]
loss 0.00 accuracy 1.00:  60%|██████    | 30/50 [00:00<00:00, 86.77it/s]
loss 0.00 accuracy 1.00:  60%|██████    | 30/50 [00:00<00:00, 86.77it/s]
loss 0.00 accuracy 1.00:  60%|██████    | 30/50 [00:00<00:00, 86.77it/s]
loss 0.00 accuracy 1.00:  60%|██████    | 30/50 [00:00<00:00, 86.77it/s]
loss 0.00 accuracy 1.00:  60%|██████    | 30/50 [00:00<00:00, 86.77it/s]
loss 0.00 accuracy 1.00:  60%|██████    | 30/50 [00:00<00:00, 86.77it/s]
loss 0.00 accuracy 1.00:  88%|████████▊ | 44/50 [00:00<00:00, 104.25it/s]
loss 0.00 accuracy 1.00:  88%|████████▊ | 44/50 [00:00<00:00, 104.25it/s]
loss 0.00 accuracy 1.00:  88%|████████▊ | 44/50 [00:00<00:00, 104.25it/s]
loss 0.00 accuracy 1.00:  88%|████████▊ | 44/50 [00:00<00:00, 104.25it/s]
loss 0.00 accuracy 1.00:  88%|████████▊ | 44/50 [00:00<00:00, 104.25it/s]
loss 0.00 accuracy 1.00:  88%|████████▊ | 44/50 [00:00<00:00, 104.25it/s]
loss 0.00 accuracy 1.00:  88%|████████▊ | 44/50 [00:00<00:00, 104.25it/s]
loss 0.00 accuracy 1.00: 100%|██████████| 50/50 [00:00<00:00, 86.80it/s] 

  0%|          | 0/16 [00:00<?, ?it/s]
 56%|█████▋    | 9/16 [00:00<00:00, 87.44it/s]
100%|██████████| 16/16 [00:00<00:00, 87.77it/s]
test set accuracy is 1.000000
reducing lr to 0.0005
Wrong predictions: 0, acc = 1.0000
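
Note, the "reducing lr to" values printed after each round above look like a geometric decay. The sketch below reproduces the last six printed values, assuming a starting learning rate of 0.003 divided by 1.2 after every round. The start value, the divisor, the round count, and the helper name `decayed_lrs` are all assumptions picked to fit the log; they were not read from the script.

```python
def decayed_lrs(start: float, divisor: float, rounds: int) -> list[str]:
    """Return the learning rate after each round, formatted to 4 decimals like the log."""
    out, lr = [], start
    for _ in range(rounds):
        lr /= divisor              # assumed decay rule: divide once per round
        out.append(f"{lr:.4f}")
    return out

# Assumed values (not taken from the script): start=0.003, divisor=1.2, 10 rounds.
schedule = decayed_lrs(0.003, 1.2, 10)
print(schedule[-6:])
```

If the assumptions hold, the printed list matches the six "reducing lr to" lines in this part of the log.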

vgg7.py

python3 -m examples.vgg7 import MODELJSON MODEL
 imports a waifu2x JSON vgg_7 model, i.e. waifu2x/models/vgg_7/art/scale2.0x_model.json
 into a safetensors file
 weight tensors are ordered in tinygrad/ncnn form, as so: (outC,inC,H,W)
 *this format is used by most other commands in this program*
python3 -m examples.vgg7 import_kinne MODEL_KINNE MODEL_SAFETENSORS
 imports a model in 'KINNE' format (raw floats: used by older versions of this example) into safetensors
python3 -m examples.vgg7 execute MODEL IMG_IN IMG_OUT
 given an already-nearest-neighbour-scaled image, runs vgg7 on it
 output image has 7 pixels removed on all edges
 do not run on large images, will have *hilarious* RAM use
python3 -m examples.vgg7 execute_full MODEL IMG_IN IMG_OUT
 does the 'whole thing' (padding, tiling)
 safe for large images, etc.
python3 -m examples.vgg7 new MODEL
 creates a new model (experimental)
python3 -m examples.vgg7 train MODEL SAMPLES_DIR ROUNDS ROUNDS_SAVE
 trains a model (experimental)
 (how experimental? well, every time I tried it, it flooded w/ NaNs)
 note: ROUNDS < 0 means 'forever'. ROUNDS_SAVE <= 0 is not a good idea.
 expects roughly execute's input as SAMPLES_DIR/IDXa.png
 expects roughly execute's output as SAMPLES_DIR/IDXb.png
 (i.e. my_samples/0a.png is the first pre-nearest-scaled image,
       my_samples/0b.png is the first original image)
 in addition, SAMPLES_DIR/samples_count.txt indicates sample count
 won't pad or tile, so keep image sizes sane
python3 -m examples.vgg7 samplify IMG_A IMG_B SAMPLES_DIR SIZE
 creates overlapping micropatches (SIZExSIZE w/ 7-pixel border) for training
 maintains/creates samples_count.txt automatically
 unlike training, IMG_A must be exactly half the size of IMG_B
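
The 7-pixel border in the usage text above is what one would expect from seven stacked 3x3 convolutions run without padding, since each such layer trims one pixel from every edge. The sketch below only does that arithmetic. The layer count and kernel size are assumptions based on the "vgg7" name, and the helper names are hypothetical; none of this is taken from examples/vgg7.py.

```python
def border_removed(layers: int = 7, kernel: int = 3) -> int:
    """Pixels lost on each edge after `layers` unpadded convolutions."""
    return layers * (kernel - 1) // 2

def output_size(size: int, layers: int = 7, kernel: int = 3) -> int:
    """Length of one image axis after `layers` unpadded convolutions."""
    return size - layers * (kernel - 1)

def padded_size(size: int, layers: int = 7, kernel: int = 3) -> int:
    """Input length needed so the output comes back at `size`."""
    return size + layers * (kernel - 1)

# Made-up dimensions for an already nearest-neighbour-scaled input.
height, width = 200, 300
print(border_removed())                          # pixels trimmed per edge
print(output_size(height), output_size(width))   # size after 'execute'
print(padded_size(height), padded_size(width))   # size to pad to beforehand
```

Under these assumptions a 200x300 input comes out as 186x286, which is why a command that pads and tiles first (like execute_full) can return an image at the full scaled size.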

vit.py

(1, 1000) 208 16.274183 Labrador retriever

vits.py

INFO:root:Model has 109 speakers
INFO:root:You selected speaker 6 (name: ?)
Traceback (most recent call last):
  File "/home/jebba/devel/tinygrad/tinygrad/examples/vits.py", line 723, in <module>
    net_g = load_model(text_mapper.symbols, hps, model_config)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jebba/devel/tinygrad/tinygrad/examples/vits.py", line 535, in load_model
    _ = load_checkpoint(fetch(model[1]), net_g, None)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jebba/devel/tinygrad/tinygrad/examples/vits.py", line 540, in load_checkpoint
    checkpoint_dict = torch_load(checkpoint_path)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jebba/devel/tinygrad/tinygrad/tinygrad/nn/state.py", line 145, in torch_load
    _, _, _, rwd, _, ids, base_offset = pkl.load(), pkl.load(), pkl.load(), f.tell(), pkl.load(), pkl.load(), f.tell()
                                        ^^^^^^^^^^
_pickle.UnpicklingError: invalid load key, '<'.
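
Note, "invalid load key, '<'" means the unpickler hit the byte '<' where it expected a pickle opcode. A common cause is that the fetched file is an HTML or XML page (for example an error page) saved in place of the checkpoint; whether that is what happened in this run is an assumption. A minimal sketch for checking a downloaded file before loading it, with hypothetical helper names and a throwaway demo file:

```python
import os
import tempfile

def first_bytes(path: str, n: int = 16) -> bytes:
    """Read the first n bytes of a file without loading the rest."""
    with open(path, "rb") as f:
        return f.read(n)

def looks_like_markup(path: str) -> bool:
    """True if the file starts with '<' (ignoring leading whitespace)."""
    return first_bytes(path).lstrip().startswith(b"<")

# Demo on a temporary file standing in for a bad download.
with tempfile.NamedTemporaryFile(delete=False, suffix=".pth") as tmp:
    tmp.write(b"<!DOCTYPE html><html>not a checkpoint</html>")
demo_path = tmp.name
is_markup = looks_like_markup(demo_path)
os.remove(demo_path)
print(is_markup)
```

If the check returns True for the cached checkpoint, deleting that file and downloading it again from a working URL is the usual next step.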

whisper.py

  0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.00 GB, encoder.conv1.weight                              :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.00 GB, encoder.conv1.bias                                :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.00 GB, encoder.conv2.weight                              :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.00 GB, encoder.conv2.bias                                :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.00 GB, encoder.blocks.0.attn.query.weight                :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.00 GB, encoder.blocks.0.attn.query.bias                  :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.00 GB, encoder.blocks.0.attn.key.weight                  :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.00 GB, encoder.blocks.0.attn.value.weight                :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.00 GB, encoder.blocks.0.attn.value.bias                  :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.00 GB, encoder.blocks.0.attn.out.weight                  :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.00 GB, encoder.blocks.0.attn.out.bias                    :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.00 GB, encoder.blocks.0.attn_ln.weight                   :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.00 GB, encoder.blocks.0.attn_ln.bias                     :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.00 GB, encoder.blocks.0.mlp.0.weight                     :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.01 GB, encoder.blocks.0.mlp.0.bias                       :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.01 GB, encoder.blocks.0.mlp.2.weight                     :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.01 GB, encoder.blocks.0.mlp.2.bias                       :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.01 GB, encoder.blocks.0.mlp_ln.weight                    :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.01 GB, encoder.blocks.0.mlp_ln.bias                      :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.01 GB, encoder.blocks.1.attn.query.weight                :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.01 GB, encoder.blocks.1.attn.query.bias                  :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.01 GB, encoder.blocks.1.attn.key.weight                  :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.01 GB, encoder.blocks.1.attn.value.weight                :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.01 GB, encoder.blocks.1.attn.value.bias                  :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.01 GB, encoder.blocks.1.attn.out.weight                  :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.01 GB, encoder.blocks.1.attn.out.bias                    :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.01 GB, encoder.blocks.1.attn_ln.weight                   :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.01 GB, encoder.blocks.1.attn_ln.bias                     :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.01 GB, encoder.blocks.1.mlp.0.weight                     :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.01 GB, encoder.blocks.1.mlp.0.bias                       :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.01 GB, encoder.blocks.1.mlp.2.weight                     :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.1.mlp.2.bias                       :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.1.mlp_ln.weight                    :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.1.mlp_ln.bias                      :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.2.attn.query.weight                :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.2.attn.query.bias                  :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.2.attn.key.weight                  :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.2.attn.value.weight                :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.2.attn.value.bias                  :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.2.attn.out.weight                  :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.2.attn.out.bias                    :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.2.attn_ln.weight                   :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.2.attn_ln.bias                     :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.2.mlp.0.weight                     :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.2.mlp.0.bias                       :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.2.mlp.2.weight                     :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.2.mlp.2.bias                       :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.2.mlp_ln.weight                    :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.2.mlp_ln.bias                      :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.3.attn.query.weight                :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.3.attn.query.bias                  :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.3.attn.key.weight                  :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.3.attn.value.weight                :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.3.attn.value.bias                  :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.3.attn.out.weight                  :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.03 GB, encoder.blocks.3.attn.out.bias                    :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.03 GB, encoder.blocks.3.attn_ln.weight                   :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.03 GB, encoder.blocks.3.attn_ln.bias                     :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.03 GB, encoder.blocks.3.mlp.0.weight                     :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.03 GB, encoder.blocks.3.mlp.0.bias                       :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.03 GB, encoder.blocks.3.mlp.2.weight                     :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.03 GB, encoder.blocks.3.mlp.2.bias                       :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.03 GB, encoder.blocks.3.mlp_ln.weight                    :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.03 GB, encoder.blocks.3.mlp_ln.bias                      :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.03 GB, encoder.ln_post.weight                            :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.03 GB, encoder.ln_post.bias                              :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.03 GB, encoder.positional_embedding                      :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.03 GB, decoder.token_embedding.weight                    :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.03 GB, decoder.token_embedding.weight                    :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.07 GB, decoder.positional_embedding                      :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.07 GB, decoder.blocks.0.attn.query.weight                :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.07 GB, decoder.blocks.0.attn.query.bias                  :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.07 GB, decoder.blocks.0.attn.key.weight                  :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.07 GB, decoder.blocks.0.attn.value.weight                :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.07 GB, decoder.blocks.0.attn.value.bias                  :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.07 GB, decoder.blocks.0.attn.out.weight                  :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.07 GB, decoder.blocks.0.attn.out.bias                    :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.07 GB, decoder.blocks.0.attn_ln.weight                   :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.07 GB, decoder.blocks.0.attn_ln.bias                     :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.07 GB, decoder.blocks.0.cross_attn.query.weight          :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.07 GB, decoder.blocks.0.cross_attn.query.bias            :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.07 GB, decoder.blocks.0.cross_attn.key.weight            :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.08 GB, decoder.blocks.0.cross_attn.value.weight          :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.08 GB, decoder.blocks.0.cross_attn.value.bias            :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.08 GB, decoder.blocks.0.cross_attn.out.weight            :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.08 GB, decoder.blocks.0.cross_attn.out.bias              :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.08 GB, decoder.blocks.0.cross_attn_ln.weight             :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.08 GB, decoder.blocks.0.cross_attn_ln.bias               :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.08 GB, decoder.blocks.0.mlp.0.weight                     :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.08 GB, decoder.blocks.0.mlp.0.bias                       :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.08 GB, decoder.blocks.0.mlp.2.weight                     :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.08 GB, decoder.blocks.0.mlp.2.bias                       :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.08 GB, decoder.blocks.0.mlp_ln.weight                    :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.08 GB, decoder.blocks.0.mlp_ln.bias                      :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.08 GB, decoder.blocks.1.attn.query.weight                :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.08 GB, decoder.blocks.1.attn.query.bias                  :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.08 GB, decoder.blocks.1.attn.key.weight                  :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.08 GB, decoder.blocks.1.attn.value.weight                :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.08 GB, decoder.blocks.1.attn.value.bias                  :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.08 GB, decoder.blocks.1.attn.out.weight                  :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.08 GB, decoder.blocks.1.attn.out.bias                    :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.08 GB, decoder.blocks.1.attn_ln.weight                   :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.08 GB, decoder.blocks.1.attn_ln.bias                     :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.08 GB, decoder.blocks.1.cross_attn.query.weight          :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.08 GB, decoder.blocks.1.cross_attn.query.bias            :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.08 GB, decoder.blocks.1.cross_attn.key.weight            :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.08 GB, decoder.blocks.1.cross_attn.value.weight          :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.09 GB, decoder.blocks.1.cross_attn.value.bias            :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.09 GB, decoder.blocks.1.cross_attn.out.weight            :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.09 GB, decoder.blocks.1.cross_attn.out.bias              :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.09 GB, decoder.blocks.1.cross_attn_ln.weight             :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.09 GB, decoder.blocks.1.cross_attn_ln.bias               :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.09 GB, decoder.blocks.1.mlp.0.weight                     :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.09 GB, decoder.blocks.1.mlp.0.bias                       :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.09 GB, decoder.blocks.1.mlp.2.weight                     :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.09 GB, decoder.blocks.1.mlp.2.bias                       :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.09 GB, decoder.blocks.1.mlp_ln.weight                    :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.09 GB, decoder.blocks.1.mlp_ln.bias                      :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.09 GB, decoder.blocks.2.attn.query.weight                :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.09 GB, decoder.blocks.2.attn.query.bias                  :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.09 GB, decoder.blocks.2.attn.key.weight                  :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.09 GB, decoder.blocks.2.attn.value.weight                :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.09 GB, decoder.blocks.2.attn.value.bias                  :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.09 GB, decoder.blocks.2.attn.out.weight                  :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.09 GB, decoder.blocks.2.attn.out.bias                    :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.09 GB, decoder.blocks.2.attn_ln.weight                   :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.09 GB, decoder.blocks.2.attn_ln.bias                     :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.09 GB, decoder.blocks.2.cross_attn.query.weight          :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.09 GB, decoder.blocks.2.cross_attn.query.bias            :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.09 GB, decoder.blocks.2.cross_attn.key.weight            :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.09 GB, decoder.blocks.2.cross_attn.value.weight          :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.09 GB, decoder.blocks.2.cross_attn.value.bias            :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.09 GB, decoder.blocks.2.cross_attn.out.weight            :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.10 GB, decoder.blocks.2.cross_attn.out.bias              :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.10 GB, decoder.blocks.2.cross_attn_ln.weight             :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.10 GB, decoder.blocks.2.cross_attn_ln.bias               :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.10 GB, decoder.blocks.2.mlp.0.weight                     :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.10 GB, decoder.blocks.2.mlp.0.bias                       :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.10 GB, decoder.blocks.2.mlp.2.weight                     :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.10 GB, decoder.blocks.2.mlp.2.bias                       :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.10 GB, decoder.blocks.2.mlp_ln.weight                    :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.10 GB, decoder.blocks.2.mlp_ln.bias                      :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.10 GB, decoder.blocks.3.attn.query.weight                :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.10 GB, decoder.blocks.3.attn.query.bias                  :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.10 GB, decoder.blocks.3.attn.key.weight                  :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.10 GB, decoder.blocks.3.attn.value.weight                :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.10 GB, decoder.blocks.3.attn.value.bias                  :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.10 GB, decoder.blocks.3.attn.out.weight                  :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.10 GB, decoder.blocks.3.attn.out.bias                    :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.10 GB, decoder.blocks.3.attn_ln.weight                   :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.10 GB, decoder.blocks.3.attn_ln.bias                     :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.10 GB, decoder.blocks.3.cross_attn.query.weight          :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.10 GB, decoder.blocks.3.cross_attn.query.bias            :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.10 GB, decoder.blocks.3.cross_attn.key.weight            :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.10 GB, decoder.blocks.3.cross_attn.value.weight          :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.10 GB, decoder.blocks.3.cross_attn.value.bias            :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.10 GB, decoder.blocks.3.cross_attn.out.weight            :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.10 GB, decoder.blocks.3.cross_attn.out.bias              :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.10 GB, decoder.blocks.3.cross_attn_ln.weight             :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.10 GB, decoder.blocks.3.cross_attn_ln.bias               :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.10 GB, decoder.blocks.3.mlp.0.weight                     :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.11 GB, decoder.blocks.3.mlp.0.bias                       :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.11 GB, decoder.blocks.3.mlp.2.weight                     :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.11 GB, decoder.blocks.3.mlp.2.bias                       :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.11 GB, decoder.blocks.3.mlp_ln.weight                    :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.11 GB, decoder.blocks.3.mlp_ln.bias                      :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.11 GB, decoder.ln.weight                                 :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.11 GB, decoder.ln.bias                                   :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.11 GB, decoder.mask                                      :  40%|████      | 68/168 [00:00<00:00, 604.47it/s]
ram used:  0.11 GB, decoder.mask                                      : 100%|██████████| 168/168 [00:00<00:00, 967.70it/s]
loaded weights in 175.99 ms, 0.11 GB loaded at 0.62 GB/s
ALSA lib confmisc.c:855:(parse_card) cannot find card '0'
ALSA lib conf.c:5180:(_snd_config_evaluate) function snd_func_card_inum returned error: No such file or directory
ALSA lib confmisc.c:422:(snd_func_concat) error evaluating strings
ALSA lib conf.c:5180:(_snd_config_evaluate) function snd_func_concat returned error: No such file or directory
ALSA lib confmisc.c:1334:(snd_func_refer) error evaluating name
ALSA lib conf.c:5180:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:5703:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2666:(snd_pcm_open_noupdate) Unknown PCM sysdefault
ALSA lib confmisc.c:855:(parse_card) cannot find card '0'
ALSA lib conf.c:5180:(_snd_config_evaluate) function snd_func_card_inum returned error: No such file or directory
ALSA lib confmisc.c:422:(snd_func_concat) error evaluating strings
ALSA lib conf.c:5180:(_snd_config_evaluate) function snd_func_concat returned error: No such file or directory
ALSA lib confmisc.c:1334:(snd_func_refer) error evaluating name
ALSA lib conf.c:5180:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:5703:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2666:(snd_pcm_open_noupdate) Unknown PCM sysdefault
ALSA lib pcm.c:2666:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.front
ALSA lib pcm.c:2666:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.rear
ALSA lib pcm.c:2666:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.center_lfe
ALSA lib pcm.c:2666:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.side
ALSA lib pcm.c:2666:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround21
ALSA lib pcm.c:2666:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround21
ALSA lib pcm.c:2666:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround40
ALSA lib pcm.c:2666:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround41
ALSA lib pcm.c:2666:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround50
ALSA lib pcm.c:2666:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround51
ALSA lib pcm.c:2666:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround71
ALSA lib pcm.c:2666:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.iec958
ALSA lib pcm.c:2666:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.iec958
ALSA lib pcm.c:2666:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.iec958
ALSA lib pcm.c:2666:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.hdmi
ALSA lib pcm.c:2666:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.hdmi
ALSA lib pcm.c:2666:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.modem
ALSA lib pcm.c:2666:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.modem
ALSA lib pcm.c:2666:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.phoneline
ALSA lib pcm.c:2666:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.phoneline
ALSA lib confmisc.c:855:(parse_card) cannot find card '0'
ALSA lib conf.c:5180:(_snd_config_evaluate) function snd_func_card_inum returned error: No such file or directory
ALSA lib confmisc.c:422:(snd_func_concat) error evaluating strings
ALSA lib conf.c:5180:(_snd_config_evaluate) function snd_func_concat returned error: No such file or directory
ALSA lib confmisc.c:1334:(snd_func_refer) error evaluating name
ALSA lib conf.c:5180:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:5703:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2666:(snd_pcm_open_noupdate) Unknown PCM default
ALSA lib confmisc.c:855:(parse_card) cannot find card '0'
ALSA lib conf.c:5180:(_snd_config_evaluate) function snd_func_card_inum returned error: No such file or directory
ALSA lib confmisc.c:422:(snd_func_concat) error evaluating strings
ALSA lib conf.c:5180:(_snd_config_evaluate) function snd_func_concat returned error: No such file or directory
ALSA lib confmisc.c:1334:(snd_func_refer) error evaluating name
ALSA lib conf.c:5180:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:5703:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2666:(snd_pcm_open_noupdate) Unknown PCM default
ALSA lib confmisc.c:855:(parse_card) cannot find card '0'
ALSA lib conf.c:5180:(_snd_config_evaluate) function snd_func_card_id returned error: No such file or directory
ALSA lib confmisc.c:422:(snd_func_concat) error evaluating strings
ALSA lib conf.c:5180:(_snd_config_evaluate) function snd_func_concat returned error: No such file or directory
ALSA lib confmisc.c:1334:(snd_func_refer) error evaluating name
ALSA lib conf.c:5180:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:5703:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2666:(snd_pcm_open_noupdate) Unknown PCM dmix
Cannot connect to server socket err = No such file or directory
Cannot connect to server request channel
jack server is not running or cannot be started
JackShmReadWritePtr::~JackShmReadWritePtr - Init not done for -1, skipping unlock
JackShmReadWritePtr::~JackShmReadWritePtr - Init not done for -1, skipping unlock
Process Process-1:
Traceback (most recent call last):
  File "/usr/lib/python3.11/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/usr/lib/python3.11/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/jebba/devel/tinygrad/tinygrad/examples/whisper.py", line 313, in listener
    stream = p.open(format=pyaudio.paInt16, channels=1, rate=RATE, input=True, frames_per_buffer=CHUNK)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jebba/devel/tinygrad/tinygrad/venv/lib/python3.11/site-packages/pyaudio/__init__.py", line 639, in open
    stream = PyAudio.Stream(self, *args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jebba/devel/tinygrad/tinygrad/venv/lib/python3.11/site-packages/pyaudio/__init__.py", line 441, in __init__
    self._stream = pa.open(**arguments)
                   ^^^^^^^^^^^^^^^^^^^^
OSError: [Errno -9996] Invalid input device (no default output device)
Traceback (most recent call last):
  File "/home/jebba/devel/tinygrad/tinygrad/examples/whisper.py", line 338, in <module>
    waveform = q.get()
               ^^^^^^^
  File "/usr/lib/python3.11/multiprocessing/queues.py", line 103, in get
    res = self._recv_bytes()
          ^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/multiprocessing/connection.py", line 215, in recv_bytes
    buf = self._recv_bytes(maxlength)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/multiprocessing/connection.py", line 413, in _recv_bytes
    buf = self._recv(4)
          ^^^^^^^^^^^^^
  File "/usr/lib/python3.11/multiprocessing/connection.py", line 378, in _recv
    chunk = read(handle, remaining)
            ^^^^^^^^^^^^^^^^^^^^^^^
KeyboardInterrupt
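
In the whisper.py output above, the listener process fails in `p.open(..., input=True, ...)` with `OSError: [Errno -9996] Invalid input device` on a machine where ALSA finds no sound card, and the main process then waits on `q.get()` until interrupted. Below is a minimal sketch, not taken from whisper.py, of checking for an input-capable device before opening a stream. `pick_input_device` and `list_devices` are hypothetical helper names; the only PyAudio calls assumed are `get_device_count`, `get_device_info_by_index`, and `terminate`.

```python
def pick_input_device(devices):
    """Return the index of the first device with an input channel, else None.

    `devices` is a list of dicts shaped like the ones PyAudio's
    get_device_info_by_index returns (keys 'index' and 'maxInputChannels').
    """
    for info in devices:
        if info.get("maxInputChannels", 0) > 0:
            return info["index"]
    return None


def list_devices():
    """Collect device info dicts from PyAudio (hypothetical helper)."""
    import pyaudio  # third-party; imported lazily so the check above stays testable

    p = pyaudio.PyAudio()
    try:
        return [p.get_device_info_by_index(i) for i in range(p.get_device_count())]
    finally:
        p.terminate()


if __name__ == "__main__":
    index = pick_input_device(list_devices())
    if index is None:
        # Exit early instead of letting the listener process die and
        # leaving the parent blocked on an empty queue.
        raise SystemExit("no audio input device found; skipping microphone capture")
    print("using input device", index)
```

On a headless build machine like the one in this run, such a check would be expected to find no input device and exit before the stream is opened.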

yolov3.py

Modules length: 107
Loading weights file (237MB). This might take a while…
running inference…
did inference in 3.367573s
Detected bicycle 98.95
Detected truck 99.00
Detected dog 94.23

yolov8-onnx.py
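
The Conv and MaxPool lines in the trace below carry enough information to check each output shape by hand. A small sketch, assuming the usual ONNX rule for explicit padding with `ceil_mode` 0, where each spatial size becomes `floor((in + pad_begin + pad_end - dilation * (kernel - 1) - 1) / stride) + 1`. `conv_out_hw` is a hypothetical helper name, not part of tinygrad.

```python
def conv_out_hw(in_hw, kernel_shape, pads, strides, dilations=(1, 1)):
    """Output (H, W) of a 2-D Conv/MaxPool with explicit ONNX-style pads.

    pads is (top, left, bottom, right), matching the 'pads' field in the trace.
    """
    out = []
    for axis in range(2):
        pad_total = pads[axis] + pads[axis + 2]
        effective_kernel = dilations[axis] * (kernel_shape[axis] - 1) + 1
        out.append((in_hw[axis] + pad_total - effective_kernel) // strides[axis] + 1)
    return tuple(out)


# Op 0: 3x3 Conv, pads 1, stride 2 on a 480x640 input
print(conv_out_hw((480, 640), (3, 3), (1, 1, 1, 1), (2, 2)))  # (240, 320)
# Op 96: 5x5 MaxPool, pads 2, stride 1 keeps 15x20 unchanged
print(conv_out_hw((15, 20), (5, 5), (2, 2, 2, 2), (1, 1)))    # (15, 20)
```

These results agree with the input shapes of the ops that follow in the trace: op 1 receives `(1, 16, 240, 320)` and op 97 receives `(1, 128, 15, 20)`.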

{'images': (1, 3, 480, 640)}
0: op Conv shape [(1, 3, 480, 640), (16, 3, 3, 3), (16,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (2, 2)}
1: op Sigmoid shape [(1, 16, 240, 320)] opt {}
2: op Mul shape [(1, 16, 240, 320), (1, 16, 240, 320)] opt {}
3: op Conv shape [(1, 16, 240, 320), (32, 16, 3, 3), (32,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (2, 2)}
4: op Sigmoid shape [(1, 32, 120, 160)] opt {}
5: op Mul shape [(1, 32, 120, 160), (1, 32, 120, 160)] opt {}
6: op Conv shape [(1, 32, 120, 160), (32, 32, 1, 1), (32,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (1, 1), 'pads': (0, 0, 0, 0), 'strides': (1, 1)}
7: op Sigmoid shape [(1, 32, 120, 160)] opt {}
8: op Mul shape [(1, 32, 120, 160), (1, 32, 120, 160)] opt {}
9: op Constant shape [] opt {'value': <Tensor <LB HIP (2,) contig:True (<LoadOps.COPY: 3>, None)> on HIP with grad None>}
10: op Split shape [(1, 32, 120, 160), (2,)] opt {'axis': 1}
11: op Conv shape [(1, 16, 120, 160), (16, 16, 3, 3), (16,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (1, 1)}
12: op Sigmoid shape [(1, 16, 120, 160)] opt {}
13: op Mul shape [(1, 16, 120, 160), (1, 16, 120, 160)] opt {}
14: op Conv shape [(1, 16, 120, 160), (16, 16, 3, 3), (16,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (1, 1)}
15: op Sigmoid shape [(1, 16, 120, 160)] opt {}
16: op Mul shape [(1, 16, 120, 160), (1, 16, 120, 160)] opt {}
17: op Add shape [(1, 16, 120, 160), (1, 16, 120, 160)] opt {}
18: op Concat shape [(1, 16, 120, 160), (1, 16, 120, 160), (1, 16, 120, 160)] opt {'axis': 1}
19: op Conv shape [(1, 48, 120, 160), (32, 48, 1, 1), (32,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (1, 1), 'pads': (0, 0, 0, 0), 'strides': (1, 1)}
20: op Sigmoid shape [(1, 32, 120, 160)] opt {}
21: op Mul shape [(1, 32, 120, 160), (1, 32, 120, 160)] opt {}
22: op Conv shape [(1, 32, 120, 160), (64, 32, 3, 3), (64,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (2, 2)}
23: op Sigmoid shape [(1, 64, 60, 80)] opt {}
24: op Mul shape [(1, 64, 60, 80), (1, 64, 60, 80)] opt {}
25: op Conv shape [(1, 64, 60, 80), (64, 64, 1, 1), (64,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (1, 1), 'pads': (0, 0, 0, 0), 'strides': (1, 1)}
26: op Sigmoid shape [(1, 64, 60, 80)] opt {}
27: op Mul shape [(1, 64, 60, 80), (1, 64, 60, 80)] opt {}
28: op Constant shape [] opt {'value': <Tensor <LB HIP (2,) contig:True (<LoadOps.COPY: 3>, None)> on HIP with grad None>}
29: op Split shape [(1, 64, 60, 80), (2,)] opt {'axis': 1}
30: op Conv shape [(1, 32, 60, 80), (32, 32, 3, 3), (32,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (1, 1)}
31: op Sigmoid shape [(1, 32, 60, 80)] opt {}
32: op Mul shape [(1, 32, 60, 80), (1, 32, 60, 80)] opt {}
33: op Conv shape [(1, 32, 60, 80), (32, 32, 3, 3), (32,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (1, 1)}
34: op Sigmoid shape [(1, 32, 60, 80)] opt {}
35: op Mul shape [(1, 32, 60, 80), (1, 32, 60, 80)] opt {}
36: op Add shape [(1, 32, 60, 80), (1, 32, 60, 80)] opt {}
37: op Conv shape [(1, 32, 60, 80), (32, 32, 3, 3), (32,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (1, 1)}
38: op Sigmoid shape [(1, 32, 60, 80)] opt {}
39: op Mul shape [(1, 32, 60, 80), (1, 32, 60, 80)] opt {}
40: op Conv shape [(1, 32, 60, 80), (32, 32, 3, 3), (32,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (1, 1)}
41: op Sigmoid shape [(1, 32, 60, 80)] opt {}
42: op Mul shape [(1, 32, 60, 80), (1, 32, 60, 80)] opt {}
43: op Add shape [(1, 32, 60, 80), (1, 32, 60, 80)] opt {}
44: op Concat shape [(1, 32, 60, 80), (1, 32, 60, 80), (1, 32, 60, 80), (1, 32, 60, 80)] opt {'axis': 1}
45: op Conv shape [(1, 128, 60, 80), (64, 128, 1, 1), (64,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (1, 1), 'pads': (0, 0, 0, 0), 'strides': (1, 1)}
46: op Sigmoid shape [(1, 64, 60, 80)] opt {}
47: op Mul shape [(1, 64, 60, 80), (1, 64, 60, 80)] opt {}
48: op Conv shape [(1, 64, 60, 80), (128, 64, 3, 3), (128,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (2, 2)}
49: op Sigmoid shape [(1, 128, 30, 40)] opt {}
50: op Mul shape [(1, 128, 30, 40), (1, 128, 30, 40)] opt {}
51: op Conv shape [(1, 128, 30, 40), (128, 128, 1, 1), (128,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (1, 1), 'pads': (0, 0, 0, 0), 'strides': (1, 1)}
52: op Sigmoid shape [(1, 128, 30, 40)] opt {}
53: op Mul shape [(1, 128, 30, 40), (1, 128, 30, 40)] opt {}
54: op Constant shape [] opt {'value': <Tensor <LB HIP (2,) contig:True (<LoadOps.COPY: 3>, None)> on HIP with grad None>}
55: op Split shape [(1, 128, 30, 40), (2,)] opt {'axis': 1}
56: op Conv shape [(1, 64, 30, 40), (64, 64, 3, 3), (64,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (1, 1)}
57: op Sigmoid shape [(1, 64, 30, 40)] opt {}
58: op Mul shape [(1, 64, 30, 40), (1, 64, 30, 40)] opt {}
59: op Conv shape [(1, 64, 30, 40), (64, 64, 3, 3), (64,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (1, 1)}
60: op Sigmoid shape [(1, 64, 30, 40)] opt {}
61: op Mul shape [(1, 64, 30, 40), (1, 64, 30, 40)] opt {}
62: op Add shape [(1, 64, 30, 40), (1, 64, 30, 40)] opt {}
63: op Conv shape [(1, 64, 30, 40), (64, 64, 3, 3), (64,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (1, 1)}
64: op Sigmoid shape [(1, 64, 30, 40)] opt {}
65: op Mul shape [(1, 64, 30, 40), (1, 64, 30, 40)] opt {}
66: op Conv shape [(1, 64, 30, 40), (64, 64, 3, 3), (64,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (1, 1)}
67: op Sigmoid shape [(1, 64, 30, 40)] opt {}
68: op Mul shape [(1, 64, 30, 40), (1, 64, 30, 40)] opt {}
69: op Add shape [(1, 64, 30, 40), (1, 64, 30, 40)] opt {}
70: op Concat shape [(1, 64, 30, 40), (1, 64, 30, 40), (1, 64, 30, 40), (1, 64, 30, 40)] opt {'axis': 1}
71: op Conv shape [(1, 256, 30, 40), (128, 256, 1, 1), (128,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (1, 1), 'pads': (0, 0, 0, 0), 'strides': (1, 1)}
72: op Sigmoid shape [(1, 128, 30, 40)] opt {}
73: op Mul shape [(1, 128, 30, 40), (1, 128, 30, 40)] opt {}
74: op Conv shape [(1, 128, 30, 40), (256, 128, 3, 3), (256,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (2, 2)}
75: op Sigmoid shape [(1, 256, 15, 20)] opt {}
76: op Mul shape [(1, 256, 15, 20), (1, 256, 15, 20)] opt {}
77: op Conv shape [(1, 256, 15, 20), (256, 256, 1, 1), (256,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (1, 1), 'pads': (0, 0, 0, 0), 'strides': (1, 1)}
78: op Sigmoid shape [(1, 256, 15, 20)] opt {}
79: op Mul shape [(1, 256, 15, 20), (1, 256, 15, 20)] opt {}
80: op Constant shape [] opt {'value': <Tensor <LB HIP (2,) contig:True (<LoadOps.COPY: 3>, None)> on HIP with grad None>}
81: op Split shape [(1, 256, 15, 20), (2,)] opt {'axis': 1}
82: op Conv shape [(1, 128, 15, 20), (128, 128, 3, 3), (128,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (1, 1)}
83: op Sigmoid shape [(1, 128, 15, 20)] opt {}
84: op Mul shape [(1, 128, 15, 20), (1, 128, 15, 20)] opt {}
85: op Conv shape [(1, 128, 15, 20), (128, 128, 3, 3), (128,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (1, 1)}
86: op Sigmoid shape [(1, 128, 15, 20)] opt {}
87: op Mul shape [(1, 128, 15, 20), (1, 128, 15, 20)] opt {}
88: op Add shape [(1, 128, 15, 20), (1, 128, 15, 20)] opt {}
89: op Concat shape [(1, 128, 15, 20), (1, 128, 15, 20), (1, 128, 15, 20)] opt {'axis': 1}
90: op Conv shape [(1, 384, 15, 20), (256, 384, 1, 1), (256,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (1, 1), 'pads': (0, 0, 0, 0), 'strides': (1, 1)}
91: op Sigmoid shape [(1, 256, 15, 20)] opt {}
92: op Mul shape [(1, 256, 15, 20), (1, 256, 15, 20)] opt {}
93: op Conv shape [(1, 256, 15, 20), (128, 256, 1, 1), (128,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (1, 1), 'pads': (0, 0, 0, 0), 'strides': (1, 1)}
94: op Sigmoid shape [(1, 128, 15, 20)] opt {}
95: op Mul shape [(1, 128, 15, 20), (1, 128, 15, 20)] opt {}
96: op MaxPool shape [(1, 128, 15, 20)] opt {'ceil_mode': 0, 'dilations': (1, 1), 'kernel_shape': (5, 5), 'pads': (2, 2, 2, 2), 'strides': (1, 1)}
97: op MaxPool shape [(1, 128, 15, 20)] opt {'ceil_mode': 0, 'dilations': (1, 1), 'kernel_shape': (5, 5), 'pads': (2, 2, 2, 2), 'strides': (1, 1)}
98: op MaxPool shape [(1, 128, 15, 20)] opt {'ceil_mode': 0, 'dilations': (1, 1), 'kernel_shape': (5, 5), 'pads': (2, 2, 2, 2), 'strides': (1, 1)}
99: op Concat shape [(1, 128, 15, 20), (1, 128, 15, 20), (1, 128, 15, 20), (1, 128, 15, 20)] opt {'axis': 1}
100: op Conv shape [(1, 512, 15, 20), (256, 512, 1, 1), (256,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (1, 1), 'pads': (0, 0, 0, 0), 'strides': (1, 1)}
101: op Sigmoid shape [(1, 256, 15, 20)] opt {}
102: op Mul shape [(1, 256, 15, 20), (1, 256, 15, 20)] opt {}
103: op Constant shape [] opt {'value': <Tensor <LB HIP (4,) contig:True (<LoadOps.COPY: 3>, None)> on HIP with grad None>}
104: op Resize shape [(1, 256, 15, 20), None, (4,)] opt {'coordinate_transformation_mode': 'asymmetric', 'cubic_coeff_a': -0.75, 'mode': 'nearest', 'nearest_mode': 'floor'}
105: op Concat shape [(1, 256, 30, 40), (1, 128, 30, 40)] opt {'axis': 1}
106: op Conv shape [(1, 384, 30, 40), (128, 384, 1, 1), (128,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (1, 1), 'pads': (0, 0, 0, 0), 'strides': (1, 1)}
107: op Sigmoid shape [(1, 128, 30, 40)] opt {}
108: op Mul shape [(1, 128, 30, 40), (1, 128, 30, 40)] opt {}
109: op Split shape [(1, 128, 30, 40), (2,)] opt {'axis': 1}
110: op Conv shape [(1, 64, 30, 40), (64, 64, 3, 3), (64,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (1, 1)}
111: op Sigmoid shape [(1, 64, 30, 40)] opt {}
112: op Mul shape [(1, 64, 30, 40), (1, 64, 30, 40)] opt {}
113: op Conv shape [(1, 64, 30, 40), (64, 64, 3, 3), (64,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (1, 1)}
114: op Sigmoid shape [(1, 64, 30, 40)] opt {}
115: op Mul shape [(1, 64, 30, 40), (1, 64, 30, 40)] opt {}
116: op Concat shape [(1, 64, 30, 40), (1, 64, 30, 40), (1, 64, 30, 40)] opt {'axis': 1}
117: op Conv shape [(1, 192, 30, 40), (128, 192, 1, 1), (128,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (1, 1), 'pads': (0, 0, 0, 0), 'strides': (1, 1)}
118: op Sigmoid shape [(1, 128, 30, 40)] opt {}
119: op Mul shape [(1, 128, 30, 40), (1, 128, 30, 40)] opt {}
120: op Constant shape [] opt {'value': <Tensor <LB HIP (4,) contig:True (<LoadOps.COPY: 3>, None)> on HIP with grad None>}
121: op Resize shape [(1, 128, 30, 40), None, (4,)] opt {'coordinate_transformation_mode': 'asymmetric', 'cubic_coeff_a': -0.75, 'mode': 'nearest', 'nearest_mode': 'floor'}
122: op Concat shape [(1, 128, 60, 80), (1, 64, 60, 80)] opt {'axis': 1}
123: op Conv shape [(1, 192, 60, 80), (64, 192, 1, 1), (64,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (1, 1), 'pads': (0, 0, 0, 0), 'strides': (1, 1)}
124: op Sigmoid shape [(1, 64, 60, 80)] opt {}
125: op Mul shape [(1, 64, 60, 80), (1, 64, 60, 80)] opt {}
126: op Split shape [(1, 64, 60, 80), (2,)] opt {'axis': 1}
127: op Conv shape [(1, 32, 60, 80), (32, 32, 3, 3), (32,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (1, 1)}
128: op Sigmoid shape [(1, 32, 60, 80)] opt {}
129: op Mul shape [(1, 32, 60, 80), (1, 32, 60, 80)] opt {}
130: op Conv shape [(1, 32, 60, 80), (32, 32, 3, 3), (32,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (1, 1)}
131: op Sigmoid shape [(1, 32, 60, 80)] opt {}
132: op Mul shape [(1, 32, 60, 80), (1, 32, 60, 80)] opt {}
133: op Concat shape [(1, 32, 60, 80), (1, 32, 60, 80), (1, 32, 60, 80)] opt {'axis': 1}
134: op Conv shape [(1, 96, 60, 80), (64, 96, 1, 1), (64,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (1, 1), 'pads': (0, 0, 0, 0), 'strides': (1, 1)}
135: op Sigmoid shape [(1, 64, 60, 80)] opt {}
136: op Mul shape [(1, 64, 60, 80), (1, 64, 60, 80)] opt {}
137: op Conv shape [(1, 64, 60, 80), (64, 64, 3, 3), (64,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (2, 2)}
138: op Sigmoid shape [(1, 64, 30, 40)] opt {}
139: op Mul shape [(1, 64, 30, 40), (1, 64, 30, 40)] opt {}
140: op Concat shape [(1, 64, 30, 40), (1, 128, 30, 40)] opt {'axis': 1}
141: op Conv shape [(1, 192, 30, 40), (128, 192, 1, 1), (128,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (1, 1), 'pads': (0, 0, 0, 0), 'strides': (1, 1)}
142: op Sigmoid shape [(1, 128, 30, 40)] opt {}
143: op Mul shape [(1, 128, 30, 40), (1, 128, 30, 40)] opt {}
144: op Split shape [(1, 128, 30, 40), (2,)] opt {'axis': 1}
145: op Conv shape [(1, 64, 30, 40), (64, 64, 3, 3), (64,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (1, 1)}
146: op Sigmoid shape [(1, 64, 30, 40)] opt {}
147: op Mul shape [(1, 64, 30, 40), (1, 64, 30, 40)] opt {}
148: op Conv shape [(1, 64, 30, 40), (64, 64, 3, 3), (64,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (1, 1)}
149: op Sigmoid shape [(1, 64, 30, 40)] opt {}
150: op Mul shape [(1, 64, 30, 40), (1, 64, 30, 40)] opt {}
151: op Concat shape [(1, 64, 30, 40), (1, 64, 30, 40), (1, 64, 30, 40)] opt {'axis': 1}
152: op Conv shape [(1, 192, 30, 40), (128, 192, 1, 1), (128,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (1, 1), 'pads': (0, 0, 0, 0), 'strides': (1, 1)}
153: op Sigmoid shape [(1, 128, 30, 40)] opt {}
154: op Mul shape [(1, 128, 30, 40), (1, 128, 30, 40)] opt {}
155: op Conv shape [(1, 128, 30, 40), (128, 128, 3, 3), (128,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (2, 2)}
156: op Sigmoid shape [(1, 128, 15, 20)] opt {}
157: op Mul shape [(1, 128, 15, 20), (1, 128, 15, 20)] opt {}
158: op Concat shape [(1, 128, 15, 20), (1, 256, 15, 20)] opt {'axis': 1}
159: op Conv shape [(1, 384, 15, 20), (256, 384, 1, 1), (256,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (1, 1), 'pads': (0, 0, 0, 0), 'strides': (1, 1)}
160: op Sigmoid shape [(1, 256, 15, 20)] opt {}
161: op Mul shape [(1, 256, 15, 20), (1, 256, 15, 20)] opt {}
162: op Split shape [(1, 256, 15, 20), (2,)] opt {'axis': 1}
163: op Conv shape [(1, 128, 15, 20), (128, 128, 3, 3), (128,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (1, 1)}
164: op Sigmoid shape [(1, 128, 15, 20)] opt {}
165: op Mul shape [(1, 128, 15, 20), (1, 128, 15, 20)] opt {}
166: op Conv shape [(1, 128, 15, 20), (128, 128, 3, 3), (128,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (1, 1)}
167: op Sigmoid shape [(1, 128, 15, 20)] opt {}
168: op Mul shape [(1, 128, 15, 20), (1, 128, 15, 20)] opt {}
169: op Concat shape [(1, 128, 15, 20), (1, 128, 15, 20), (1, 128, 15, 20)] opt {'axis': 1}
170: op Conv shape [(1, 384, 15, 20), (256, 384, 1, 1), (256,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (1, 1), 'pads': (0, 0, 0, 0), 'strides': (1, 1)}
171: op Sigmoid shape [(1, 256, 15, 20)] opt {}
172: op Mul shape [(1, 256, 15, 20), (1, 256, 15, 20)] opt {}
173: op Conv shape [(1, 64, 60, 80), (64, 64, 3, 3), (64,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (1, 1)}
174: op Sigmoid shape [(1, 64, 60, 80)] opt {}
175: op Mul shape [(1, 64, 60, 80), (1, 64, 60, 80)] opt {}
176: op ConvTranspose shape [(1, 64, 60, 80), (64, 64, 2, 2), (64,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (2, 2), 'pads': (0, 0, 0, 0), 'strides': (2, 2)}
177: op Conv shape [(1, 64, 120, 160), (64, 64, 3, 3), (64,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (1, 1)}
178: op Sigmoid shape [(1, 64, 120, 160)] opt {}
179: op Mul shape [(1, 64, 120, 160), (1, 64, 120, 160)] opt {}
180: op Conv shape [(1, 64, 120, 160), (32, 64, 1, 1), (32,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (1, 1), 'pads': (0, 0, 0, 0), 'strides': (1, 1)}
181: op Sigmoid shape [(1, 32, 120, 160)] opt {}
182: op Mul shape [(1, 32, 120, 160), (1, 32, 120, 160)] opt {}
183: op Conv shape [(1, 64, 60, 80), (32, 64, 3, 3), (32,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (1, 1)}
184: op Sigmoid shape [(1, 32, 60, 80)] opt {}
185: op Mul shape [(1, 32, 60, 80), (1, 32, 60, 80)] opt {}
186: op Conv shape [(1, 32, 60, 80), (32, 32, 3, 3), (32,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (1, 1)}
187: op Sigmoid shape [(1, 32, 60, 80)] opt {}
188: op Mul shape [(1, 32, 60, 80), (1, 32, 60, 80)] opt {}
189: op Conv shape [(1, 32, 60, 80), (32, 32, 1, 1), (32,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (1, 1), 'pads': (0, 0, 0, 0), 'strides': (1, 1)}
190: op Constant shape [] opt {'value': <Tensor <LB HIP (3,) contig:True (<LoadOps.COPY: 3>, None)> on HIP with grad None>}
191: op Constant shape [] opt {'value': <Tensor <LB HIP (3,) contig:True (<LoadOps.COPY: 3>, None)> on HIP with grad None>}
192: op Constant shape [] opt {'value': <Tensor <LB HIP (3,) contig:True (<LoadOps.COPY: 3>, None)> on HIP with grad None>}
193: op Reshape shape [(1, 32, 60, 80), (3,)] opt {'allowzero': 0}
194: op Conv shape [(1, 128, 30, 40), (32, 128, 3, 3), (32,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (1, 1)}
195: op Sigmoid shape [(1, 32, 30, 40)] opt {}
196: op Mul shape [(1, 32, 30, 40), (1, 32, 30, 40)] opt {}
197: op Conv shape [(1, 32, 30, 40), (32, 32, 3, 3), (32,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (1, 1)}
198: op Sigmoid shape [(1, 32, 30, 40)] opt {}
199: op Mul shape [(1, 32, 30, 40), (1, 32, 30, 40)] opt {}
200: op Conv shape [(1, 32, 30, 40), (32, 32, 1, 1), (32,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (1, 1), 'pads': (0, 0, 0, 0), 'strides': (1, 1)}
201: op Reshape shape [(1, 32, 30, 40), (3,)] opt {'allowzero': 0}
202: op Conv shape [(1, 256, 15, 20), (32, 256, 3, 3), (32,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (1, 1)}
203: op Sigmoid shape [(1, 32, 15, 20)] opt {}
204: op Mul shape [(1, 32, 15, 20), (1, 32, 15, 20)] opt {}
205: op Conv shape [(1, 32, 15, 20), (32, 32, 3, 3), (32,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (1, 1)}
206: op Sigmoid shape [(1, 32, 15, 20)] opt {}
207: op Mul shape [(1, 32, 15, 20), (1, 32, 15, 20)] opt {}
208: op Conv shape [(1, 32, 15, 20), (32, 32, 1, 1), (32,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (1, 1), 'pads': (0, 0, 0, 0), 'strides': (1, 1)}
209: op Reshape shape [(1, 32, 15, 20), (3,)] opt {'allowzero': 0}
210: op Concat shape [(1, 32, 4800), (1, 32, 1200), (1, 32, 300)] opt {'axis': 2}
211: op Conv shape [(1, 64, 60, 80), (64, 64, 3, 3), (64,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (1, 1)}
212: op Sigmoid shape [(1, 64, 60, 80)] opt {}
213: op Mul shape [(1, 64, 60, 80), (1, 64, 60, 80)] opt {}
214: op Conv shape [(1, 64, 60, 80), (64, 64, 3, 3), (64,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (1, 1)}
215: op Sigmoid shape [(1, 64, 60, 80)] opt {}
216: op Mul shape [(1, 64, 60, 80), (1, 64, 60, 80)] opt {}
217: op Conv shape [(1, 64, 60, 80), (64, 64, 1, 1), (64,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (1, 1), 'pads': (0, 0, 0, 0), 'strides': (1, 1)}
218: op Conv shape [(1, 64, 60, 80), (80, 64, 3, 3), (80,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (1, 1)}
219: op Sigmoid shape [(1, 80, 60, 80)] opt {}
220: op Mul shape [(1, 80, 60, 80), (1, 80, 60, 80)] opt {}
221: op Conv shape [(1, 80, 60, 80), (80, 80, 3, 3), (80,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (1, 1)}
222: op Sigmoid shape [(1, 80, 60, 80)] opt {}
223: op Mul shape [(1, 80, 60, 80), (1, 80, 60, 80)] opt {}
224: op Conv shape [(1, 80, 60, 80), (80, 80, 1, 1), (80,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (1, 1), 'pads': (0, 0, 0, 0), 'strides': (1, 1)}
225: op Concat shape [(1, 64, 60, 80), (1, 80, 60, 80)] opt {'axis': 1}
226: op Conv shape [(1, 128, 30, 40), (64, 128, 3, 3), (64,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (1, 1)}
227: op Sigmoid shape [(1, 64, 30, 40)] opt {}
228: op Mul shape [(1, 64, 30, 40), (1, 64, 30, 40)] opt {}
229: op Conv shape [(1, 64, 30, 40), (64, 64, 3, 3), (64,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (1, 1)}
230: op Sigmoid shape [(1, 64, 30, 40)] opt {}
231: op Mul shape [(1, 64, 30, 40), (1, 64, 30, 40)] opt {}
232: op Conv shape [(1, 64, 30, 40), (64, 64, 1, 1), (64,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (1, 1), 'pads': (0, 0, 0, 0), 'strides': (1, 1)}
233: op Conv shape [(1, 128, 30, 40), (80, 128, 3, 3), (80,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (1, 1)}
234: op Sigmoid shape [(1, 80, 30, 40)] opt {}
235: op Mul shape [(1, 80, 30, 40), (1, 80, 30, 40)] opt {}
236: op Conv shape [(1, 80, 30, 40), (80, 80, 3, 3), (80,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (1, 1)}
237: op Sigmoid shape [(1, 80, 30, 40)] opt {}
238: op Mul shape [(1, 80, 30, 40), (1, 80, 30, 40)] opt {}
239: op Conv shape [(1, 80, 30, 40), (80, 80, 1, 1), (80,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (1, 1), 'pads': (0, 0, 0, 0), 'strides': (1, 1)}
240: op Concat shape [(1, 64, 30, 40), (1, 80, 30, 40)] opt {'axis': 1}
241: op Conv shape [(1, 256, 15, 20), (64, 256, 3, 3), (64,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (1, 1)}
242: op Sigmoid shape [(1, 64, 15, 20)] opt {}
243: op Mul shape [(1, 64, 15, 20), (1, 64, 15, 20)] opt {}
244: op Conv shape [(1, 64, 15, 20), (64, 64, 3, 3), (64,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (1, 1)}
245: op Sigmoid shape [(1, 64, 15, 20)] opt {}
246: op Mul shape [(1, 64, 15, 20), (1, 64, 15, 20)] opt {}
247: op Conv shape [(1, 64, 15, 20), (64, 64, 1, 1), (64,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (1, 1), 'pads': (0, 0, 0, 0), 'strides': (1, 1)}
248: op Conv shape [(1, 256, 15, 20), (80, 256, 3, 3), (80,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (1, 1)}
249: op Sigmoid shape [(1, 80, 15, 20)] opt {}
250: op Mul shape [(1, 80, 15, 20), (1, 80, 15, 20)] opt {}
251: op Conv shape [(1, 80, 15, 20), (80, 80, 3, 3), (80,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (1, 1)}
252: op Sigmoid shape [(1, 80, 15, 20)] opt {}
253: op Mul shape [(1, 80, 15, 20), (1, 80, 15, 20)] opt {}
254: op Conv shape [(1, 80, 15, 20), (80, 80, 1, 1), (80,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (1, 1), 'pads': (0, 0, 0, 0), 'strides': (1, 1)}
255: op Concat shape [(1, 64, 15, 20), (1, 80, 15, 20)] opt {'axis': 1}
256: op Constant shape [] opt {'value': <Tensor <LB HIP (3,) contig:True (<LoadOps.COPY: 3>, None)> on HIP with grad None>}
257: op Constant shape [] opt {'value': <Tensor <LB HIP (3,) contig:True (<LoadOps.COPY: 3>, None)> on HIP with grad None>}
258: op Constant shape [] opt {'value': <Tensor <LB HIP (3,) contig:True (<LoadOps.COPY: 3>, None)> on HIP with grad None>}
259: op Reshape shape [(1, 144, 60, 80), (3,)] opt {'allowzero': 0}
260: op Reshape shape [(1, 144, 30, 40), (3,)] opt {'allowzero': 0}
261: op Reshape shape [(1, 144, 15, 20), (3,)] opt {'allowzero': 0}
262: op Concat shape [(1, 144, 4800), (1, 144, 1200), (1, 144, 300)] opt {'axis': 2}
263: op Constant shape [] opt {'value': <Tensor <LB HIP (2,) contig:True (<LoadOps.COPY: 3>, None)> on HIP with grad None>}
264: op Split shape [(1, 144, 6300), (2,)] opt {'axis': 1}
265: op Constant shape [] opt {'value': <Tensor <LB HIP (4,) contig:True (<LoadOps.COPY: 3>, None)> on HIP with grad None>}
266: op Reshape shape [(1, 64, 6300), (4,)] opt {'allowzero': 0}
267: op Transpose shape [(1, 4, 16, 6300)] opt {'perm': (0, 2, 1, 3)}
268: op Softmax shape [(1, 16, 4, 6300)] opt {'axis': 1}
269: op Conv shape [(1, 16, 4, 6300), (1, 16, 1, 1)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (1, 1), 'pads': (0, 0, 0, 0), 'strides': (1, 1)}
270: op Constant shape [] opt {'value': <Tensor <LB HIP (3,) contig:True (<LoadOps.COPY: 3>, None)> on HIP with grad None>}
271: op Reshape shape [(1, 1, 4, 6300), (3,)] opt {'allowzero': 0}
272: op Shape shape [(1, 4, 6300)] opt {}
273: op Constant shape [] opt {'value': <Tensor <LB HIP (1,) contig:True (<LoadOps.COPY: 3>, None)> on HIP with grad None>}
274: op Gather shape [(3,), (1,)] opt {'axis': 0}
275: op Constant shape [] opt {'value': <Tensor <LB HIP (1,) contig:True (<LoadOps.COPY: 3>, None)> on HIP with grad None>}
276: op Constant shape [] opt {'value': <Tensor <LB HIP (1,) contig:True (<LoadOps.COPY: 3>, None)> on HIP with grad None>}
277: op Add shape [(1,), (1,)] opt {}
278: op Constant shape [] opt {'value': <Tensor <LB HIP (1,) contig:True (<LoadOps.COPY: 3>, None)> on HIP with grad None>}
279: op Div shape [(1,), (1,)] opt {}
280: op Constant shape [] opt {'value': <Tensor <LB HIP (1,) contig:True (<LoadOps.COPY: 3>, None)> on HIP with grad None>}
281: op Mul shape [(1,), (1,)] opt {}
282: op Slice shape [(1, 4, 6300), (1,), (1,), (1,)] opt {}
283: op Constant shape [] opt {'value': <Tensor <LB HIP (1,) contig:True (<LoadOps.COPY: 3>, None)> on HIP with grad None>}
284: op Mul shape [(1,), (1,)] opt {}
285: op Slice shape [(1, 4, 6300), (1,), (1,), (1,)] opt {}
286: op Constant shape [] opt {'value': <Tensor <LB HIP (1, 2, 6300) contig:True (<LoadOps.COPY: 3>, None)> on HIP with grad None>}
287: op Sub shape [(1, 2, 6300), (1, 3, 6300)] opt {}
Traceback (most recent call last):
  File "/home/jebba/devel/tinygrad/tinygrad/examples/yolov8-onnx.py", line 18, in <module>
    run_onnx({"images": Tensor.zeros(1,3,480,640)}, debug=True)
  File "/home/jebba/devel/tinygrad/tinygrad/extra/onnx.py", line 211, in run_onnx
    ret = real_fxn(*inp, **opt)
          ^^^^^^^^^^^^^^^^^^^^^
  File "/home/jebba/devel/tinygrad/tinygrad/extra/onnx_ops.py", line 18, in Sub
    def Sub(x: Union[Tensor, Any], other: Tensor): return x - other # some test has input as int
                                                          ~~^~~~~~~
  File "/home/jebba/devel/tinygrad/tinygrad/tinygrad/tensor.py", line 858, in __sub__
    def __sub__(self, x) -> Tensor: return self.sub(x)
                                           ^^^^^^^^^^^
  File "/home/jebba/devel/tinygrad/tinygrad/tinygrad/tensor.py", line 812, in sub
    return mlops.Sub.apply(*self._broadcasted(x, reverse)) if x.__class__ is Tensor or x else (-self if reverse else self)
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jebba/devel/tinygrad/tinygrad/tinygrad/tensor.py", line 800, in _broadcasted
    return x.expand(broadcasted_shape), y.expand(broadcasted_shape)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jebba/devel/tinygrad/tinygrad/tinygrad/tensor.py", line 309, in expand
    return mlops.Expand.apply(self, shape=new_shape) if new_shape != self.shape else self
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jebba/devel/tinygrad/tinygrad/tinygrad/tensor.py", line 34, in apply
    ret.lazydata, ret.requires_grad, ret.grad = ctx.forward(*[t.lazydata for t in x], **kwargs), ctx.requires_grad, None
                                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jebba/devel/tinygrad/tinygrad/tinygrad/mlops.py", line 168, in forward
    return x.expand(shape)
           ^^^^^^^^^^^^^^^
  File "/home/jebba/devel/tinygrad/tinygrad/tinygrad/lazy.py", line 147, in expand
    def expand(self, arg:Tuple[sint, ...]): return self._view(self.st.expand(arg))
                                                              ^^^^^^^^^^^^^^^^^^^
  File "/home/jebba/devel/tinygrad/tinygrad/tinygrad/shape/shapetracker.py", line 180, in expand
    def expand(self, new_shape: Tuple[sint, ...]) -> ShapeTracker: return ShapeTracker(self.views[0:-1] + (self.views[-1].expand(new_shape), ))
                                                                                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jebba/devel/tinygrad/tinygrad/tinygrad/shape/view.py", line 156, in expand
    assert all((s == x or (s == 1 and st == 0)) for s,x,st in zip(self.shape, new_shape, self.strides)), f"can't expand {self.shape} into {new_shape}"
AssertionError: can't expand (1, 2, 6300) into (1, 3, 6300)
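
The assertion above fails at op 287 (Sub) because the two operands have shapes (1, 2, 6300) and (1, 3, 6300): axis 1 is 2 on one side and 3 on the other, and a dimension can only be expanded when it has size 1. The following is a minimal sketch of that check, copied from the condition in the assertion message. The helper names can_expand and row_major_strides are hypothetical and not part of tinygrad, and the stride convention (stride 0 for size-1 dimensions) is an assumption made for illustration.

```python
from typing import Tuple

def row_major_strides(shape: Tuple[int, ...]) -> Tuple[int, ...]:
    # Hypothetical helper: row-major strides, with stride 0 for size-1 dims
    # (assumed convention, chosen so size-1 dims count as expandable).
    strides = []
    acc = 1
    for dim in reversed(shape):
        strides.append(0 if dim == 1 else acc)
        acc *= dim
    return tuple(reversed(strides))

def can_expand(shape: Tuple[int, ...], new_shape: Tuple[int, ...]) -> bool:
    # Same per-axis condition as the assertion in the traceback:
    # an axis must already match, or be a size-1 axis with stride 0.
    strides = row_major_strides(shape)
    return len(shape) == len(new_shape) and all(
        s == x or (s == 1 and st == 0)
        for s, x, st in zip(shape, new_shape, strides)
    )

if __name__ == "__main__":
    # The failing case from the log: axis 1 is 2, target is 3.
    print(can_expand((1, 2, 6300), (1, 3, 6300)))
    # A case that would be accepted: axis 1 is 1, so it can broadcast to 3.
    print(can_expand((1, 1, 6300), (1, 3, 6300)))
```

Under these assumptions the first call reports that the expansion is not allowed and the second reports that it is. The sketch only reproduces the check that raised the error; it does not show why the graph fed a 3-channel tensor into the Sub.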

yolov8.py

Error: Image URL or path not provided.

mlperf/model_spec.py

testing resnet
8.32 GOPS, 0.00 ms
testing retinanet
23.98 GOPS, 0.00 ms
testing unet3d
1068.83 GOPS, 0.00 ms
testing rnnt
47.32 GOPS, 0.00 ms
testing bert
273.53 GOPS, 0.00 ms
testing mrcnn
89.61 GOPS, 2046.28 ms