tinygrad Examples

Output from running the Python scripts in test/.

Note: while these examples were running, builds were also running on the same machine, saturating its 128 processors, so the timings here are not benchmarks.

All runs so far used just one GPU for the tinygrad examples.

beautiful_cartpole.py

sz:    16 steps/s:    4.53 action_loss:  -8.405 entropy_loss:  -0.677 critic_loss:   83.913 reward:  16.00:   2%|▎         | 1/40 [00:03<02:18,  3.54s/it]
sz:    42 steps/s:    9.89 action_loss:  -9.755 entropy_loss:  -0.656 critic_loss:  130.257 reward:  26.00:   5%|▌         | 2/40 [00:04<01:11,  1.88s/it]
sz:    62 steps/s:   12.84 action_loss:  -9.323 entropy_loss:  -0.632 critic_loss:  120.510 reward:  20.00:   8%|▊         | 3/40 [00:04<00:47,  1.28s/it]
sz:   125 steps/s:   22.83 action_loss: -16.834 entropy_loss:  -0.612 critic_loss:  461.137 reward:  63.00:  10%|█         | 4/40 [00:05<00:37,  1.03s/it]
sz:   140 steps/s:   22.87 action_loss: -14.485 entropy_loss:  -0.604 critic_loss:  378.145 reward:  15.00:  12%|█▎        | 5/40 [00:06<00:31,  1.12it/s]
sz:   158 steps/s:   23.51 action_loss: -12.541 entropy_loss:  -0.609 critic_loss:  317.384 reward:  18.00:  15%|█▌        | 6/40 [00:06<00:26,  1.26it/s]
sz:   173 steps/s:   23.68 action_loss: -10.413 entropy_loss:  -0.618 critic_loss:  270.751 reward:  15.00:  18%|█▊        | 7/40 [00:07<00:23,  1.38it/s]
sz:   197 steps/s:   24.93 action_loss:  -9.082 entropy_loss:  -0.623 critic_loss:  249.494 reward:  24.00:  20%|██        | 8/40 [00:07<00:21,  1.46it/s]
sz:   234 steps/s:   27.33 action_loss:  -8.351 entropy_loss:  -0.608 critic_loss:  237.388 reward:  37.00:  22%|██▎       | 9/40 [00:08<00:20,  1.48it/s]
sz:   284 steps/s:   30.63 action_loss:  -8.378 entropy_loss:  -0.566 critic_loss:  238.783 reward:  50.00:  25%|██▌       | 10/40 [00:09<00:20,  1.46it/s]
sz:   327 steps/s:   32.95 action_loss:  -7.069 entropy_loss:  -0.520 critic_loss:  217.788 reward:  43.00:  28%|██▊       | 11/40 [00:09<00:19,  1.48it/s]
sz:   356 steps/s:   33.55 action_loss:  -3.947 entropy_loss:  -0.461 critic_loss:  144.527 reward:  29.00:  30%|███       | 12/40 [00:10<00:19,  1.47it/s]
sz:   479 steps/s:   42.10 action_loss:  -8.053 entropy_loss:  -0.448 critic_loss:  421.731 reward: 123.00:  32%|███▎      | 13/40 [00:11<00:19,  1.42it/s]
sz:   645 steps/s:   52.32 action_loss: -15.570 entropy_loss:  -0.464 critic_loss:  810.073 reward: 166.00:  35%|███▌      | 14/40 [00:12<00:20,  1.28it/s]
sz:   792 steps/s:   60.34 action_loss: -15.626 entropy_loss:  -0.414 critic_loss:  871.600 reward: 147.00:  38%|███▊      | 15/40 [00:13<00:19,  1.27it/s]
sz:  1203 steps/s:   83.40 action_loss: -28.025 entropy_loss:  -0.473 critic_loss: 1730.603 reward: 411.00:  40%|████      | 16/40 [00:14<00:22,  1.06it/s]
sz:  1435 steps/s:   93.00 action_loss: -25.648 entropy_loss:  -0.458 critic_loss: 1608.802 reward: 232.00:  42%|████▎     | 17/40 [00:15<00:22,  1.04it/s]
sz:  1592 steps/s:   96.96 action_loss: -16.278 entropy_loss:  -0.413 critic_loss: 1147.564 reward: 157.00:  45%|████▌     | 18/40 [00:16<00:21,  1.03it/s]
sz:  1789 steps/s:  103.18 action_loss: -19.750 entropy_loss:  -0.398 critic_loss: 1422.291 reward: 197.00:  48%|████▊     | 19/40 [00:17<00:20,  1.05it/s]
sz:  1949 steps/s:  106.81 action_loss: -10.565 entropy_loss:  -0.397 critic_loss:  895.935 reward: 160.00:  50%|█████     | 20/40 [00:18<00:18,  1.06it/s]
sz:  2000 steps/s:  110.75 action_loss:  -9.772 entropy_loss:  -0.385 critic_loss:  871.258 reward: 182.00:  52%|█████▎    | 21/40 [00:19<00:18,  1.05it/s]
sz:  2000 steps/s:  121.57 action_loss: -11.975 entropy_loss:  -0.382 critic_loss:  808.180 reward: 261.00:  55%|█████▌    | 22/40 [00:19<00:14,  1.25it/s]
sz:  2000 steps/s:  133.60 action_loss: -10.460 entropy_loss:  -0.392 critic_loss:  741.236 reward: 296.00:  57%|█████▊    | 23/40 [00:20<00:11,  1.44it/s]
sz:  2000 steps/s:  153.39 action_loss:  -9.818 entropy_loss:  -0.397 critic_loss:  624.722 reward: 500.00:  60%|██████    | 24/40 [00:20<00:10,  1.46it/s]
sz:  2000 steps/s:  171.97 action_loss:  -8.829 entropy_loss:  -0.434 critic_loss:  686.303 reward: 500.00:  62%|██████▎   | 25/40 [00:21<00:10,  1.48it/s]
sz:  2000 steps/s:  189.42 action_loss: -12.795 entropy_loss:  -0.473 critic_loss:  664.423 reward: 500.00:  65%|██████▌   | 26/40 [00:22<00:09,  1.49it/s]
sz:  2000 steps/s:  205.86 action_loss:  -7.701 entropy_loss:  -0.469 critic_loss:  597.681 reward: 500.00:  68%|██████▊   | 27/40 [00:22<00:08,  1.49it/s]
sz:  2000 steps/s:  221.37 action_loss:  -2.992 entropy_loss:  -0.482 critic_loss:  482.757 reward: 500.00:  70%|███████   | 28/40 [00:23<00:08,  1.50it/s]
sz:  2000 steps/s:  236.03 action_loss:   1.380 entropy_loss:  -0.453 critic_loss:  538.772 reward: 500.00:  72%|███████▎  | 29/40 [00:24<00:07,  1.50it/s]
sz:  2000 steps/s:  249.51 action_loss:   3.675 entropy_loss:  -0.430 critic_loss:  589.797 reward: 500.00:  75%|███████▌  | 30/40 [00:24<00:06,  1.48it/s]
sz:  2000 steps/s:  262.02 action_loss:   1.766 entropy_loss:  -0.456 critic_loss:  630.521 reward: 500.00:  78%|███████▊  | 31/40 [00:25<00:06,  1.45it/s]
sz:  2000 steps/s:  274.49 action_loss:   2.491 entropy_loss:  -0.429 critic_loss:  615.586 reward: 500.00:  80%|████████  | 32/40 [00:26<00:05,  1.47it/s]
sz:  2000 steps/s:  286.35 action_loss:  -0.218 entropy_loss:  -0.476 critic_loss:  532.887 reward: 500.00:  82%|████████▎ | 33/40 [00:26<00:04,  1.48it/s]
sz:  2000 steps/s:  297.63 action_loss:  -0.294 entropy_loss:  -0.486 critic_loss:  535.985 reward: 500.00:  85%|████████▌ | 34/40 [00:27<00:04,  1.49it/s]
sz:  2000 steps/s:  308.38 action_loss:  -0.592 entropy_loss:  -0.502 critic_loss:  491.457 reward: 500.00:  88%|████████▊ | 35/40 [00:28<00:03,  1.49it/s]
sz:  2000 steps/s:  318.63 action_loss:   2.020 entropy_loss:  -0.532 critic_loss:  754.644 reward: 500.00:  90%|█████████ | 36/40 [00:28<00:02,  1.50it/s]
sz:  2000 steps/s:  328.42 action_loss:  -0.480 entropy_loss:  -0.542 critic_loss:  522.170 reward: 500.00:  92%|█████████▎| 37/40 [00:29<00:01,  1.50it/s]
sz:  2000 steps/s:  337.78 action_loss:  -1.788 entropy_loss:  -0.560 critic_loss:  586.351 reward: 500.00:  95%|█████████▌| 38/40 [00:30<00:01,  1.49it/s]
sz:  2000 steps/s:  346.58 action_loss:   1.829 entropy_loss:  -0.512 critic_loss:  649.962 reward: 500.00:  98%|█████████▊| 39/40 [00:30<00:00,  1.50it/s]
sz:  2000 steps/s:  355.17 action_loss:  -1.843 entropy_loss:  -0.548 critic_loss:  540.821 reward: 500.00: 100%|██████████| 40/40 [00:31<00:00,  1.27it/s]
test reward: 500.0
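The action_loss, entropy_loss, and critic_loss columns above are the usual actor-critic training terms: a policy loss, an entropy bonus that keeps exploration alive, and a value-function loss. A hedged sketch of how such terms are commonly combined into one objective (the function name and coefficient values are illustrative, not taken from beautiful_cartpole.py):

```python
def total_loss(action_loss: float, entropy_loss: float, critic_loss: float,
               entropy_coef: float = 0.01, critic_coef: float = 0.5) -> float:
    # Typical PPO-style objective: minimize the policy (action) loss, add a
    # scaled value-function (critic) loss, and add the scaled entropy term.
    # In logs like the one above entropy_loss is already negative (it is the
    # negated entropy), so adding it rewards higher-entropy policies.
    return action_loss + critic_coef * critic_loss + entropy_coef * entropy_loss
```

With the first logged step's values, `total_loss(-8.405, -0.677, 83.913)` combines the three columns into the single scalar that would be backpropagated.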

beautiful_mnist.py

loss:   2.85 test_accuracy:   nan%:   1%|▏         | 1/70 [00:05<06:52,  5.98s/it]
loss:   1.75 test_accuracy:   nan%:   3%|▎         | 2/70 [00:07<03:51,  3.41s/it]
loss:   1.33 test_accuracy:   nan%:   3%|▎         | 2/70 [00:07<03:51,  3.41s/it]
loss:   1.00 test_accuracy:   nan%:   3%|▎         | 2/70 [00:07<03:51,  3.41s/it]
loss:   0.83 test_accuracy:   nan%:   3%|▎         | 2/70 [00:07<03:51,  3.41s/it]
loss:   0.68 test_accuracy:   nan%:   9%|▊         | 6/70 [00:07<00:51,  1.24it/s]
loss:   0.55 test_accuracy:   nan%:   9%|▊         | 6/70 [00:07<00:51,  1.24it/s]
loss:   0.50 test_accuracy:   nan%:   9%|▊         | 6/70 [00:07<00:51,  1.24it/s]
loss:   0.44 test_accuracy:   nan%:   9%|▊         | 6/70 [00:07<00:51,  1.24it/s]
loss:   0.41 test_accuracy: 85.98%:  14%|█▍        | 10/70 [00:08<00:28,  2.09it/s]
loss:   0.38 test_accuracy: 85.98%:  14%|█▍        | 10/70 [00:08<00:28,  2.09it/s]
loss:   0.33 test_accuracy: 85.98%:  14%|█▍        | 10/70 [00:08<00:28,  2.09it/s]
loss:   0.30 test_accuracy: 85.98%:  14%|█▍        | 10/70 [00:08<00:28,  2.09it/s]
loss:   0.27 test_accuracy: 85.98%:  20%|██        | 14/70 [00:08<00:16,  3.49it/s]
loss:   0.25 test_accuracy: 85.98%:  20%|██        | 14/70 [00:08<00:16,  3.49it/s]
loss:   0.26 test_accuracy: 85.98%:  20%|██        | 14/70 [00:08<00:16,  3.49it/s]
loss:   0.25 test_accuracy: 85.98%:  20%|██        | 14/70 [00:08<00:16,  3.49it/s]
loss:   0.23 test_accuracy: 85.98%:  26%|██▌       | 18/70 [00:08<00:09,  5.30it/s]
loss:   0.19 test_accuracy: 85.98%:  26%|██▌       | 18/70 [00:08<00:09,  5.30it/s]
loss:   0.19 test_accuracy: 94.39%:  26%|██▌       | 18/70 [00:08<00:09,  5.30it/s]
loss:   0.20 test_accuracy: 94.39%:  30%|███       | 21/70 [00:08<00:07,  6.93it/s]
loss:   0.19 test_accuracy: 94.39%:  30%|███       | 21/70 [00:08<00:07,  6.93it/s]
loss:   0.14 test_accuracy: 94.39%:  30%|███       | 21/70 [00:08<00:07,  6.93it/s]
loss:   0.15 test_accuracy: 94.39%:  30%|███       | 21/70 [00:08<00:07,  6.93it/s]
loss:   0.14 test_accuracy: 94.39%:  36%|███▌      | 25/70 [00:08<00:04,  9.69it/s]
loss:   0.14 test_accuracy: 94.39%:  36%|███▌      | 25/70 [00:08<00:04,  9.69it/s]
loss:   0.15 test_accuracy: 94.39%:  36%|███▌      | 25/70 [00:08<00:04,  9.69it/s]
loss:   0.14 test_accuracy: 94.39%:  36%|███▌      | 25/70 [00:08<00:04,  9.69it/s]
loss:   0.11 test_accuracy: 94.39%:  41%|████▏     | 29/70 [00:09<00:03, 12.82it/s]
loss:   0.12 test_accuracy: 96.40%:  41%|████▏     | 29/70 [00:09<00:03, 12.82it/s]
loss:   0.15 test_accuracy: 96.40%:  41%|████▏     | 29/70 [00:09<00:03, 12.82it/s]
loss:   0.11 test_accuracy: 96.40%:  41%|████▏     | 29/70 [00:09<00:03, 12.82it/s]
loss:   0.10 test_accuracy: 96.40%:  47%|████▋     | 33/70 [00:09<00:02, 15.72it/s]
loss:   0.08 test_accuracy: 96.40%:  47%|████▋     | 33/70 [00:09<00:02, 15.72it/s]
loss:   0.08 test_accuracy: 96.40%:  47%|████▋     | 33/70 [00:09<00:02, 15.72it/s]
loss:   0.11 test_accuracy: 96.40%:  47%|████▋     | 33/70 [00:09<00:02, 15.72it/s]
loss:   0.11 test_accuracy: 96.40%:  53%|█████▎    | 37/70 [00:09<00:01, 19.05it/s]
loss:   0.09 test_accuracy: 96.40%:  53%|█████▎    | 37/70 [00:09<00:01, 19.05it/s]
loss:   0.10 test_accuracy: 96.40%:  53%|█████▎    | 37/70 [00:09<00:01, 19.05it/s]
loss:   0.09 test_accuracy: 97.22%:  53%|█████▎    | 37/70 [00:09<00:01, 19.05it/s]
loss:   0.07 test_accuracy: 97.22%:  59%|█████▊    | 41/70 [00:09<00:01, 21.47it/s]
loss:   0.10 test_accuracy: 97.22%:  59%|█████▊    | 41/70 [00:09<00:01, 21.47it/s]
loss:   0.09 test_accuracy: 97.22%:  59%|█████▊    | 41/70 [00:09<00:01, 21.47it/s]
loss:   0.08 test_accuracy: 97.22%:  59%|█████▊    | 41/70 [00:09<00:01, 21.47it/s]
loss:   0.09 test_accuracy: 97.22%:  64%|██████▍   | 45/70 [00:09<00:01, 24.40it/s]
loss:   0.10 test_accuracy: 97.22%:  64%|██████▍   | 45/70 [00:09<00:01, 24.40it/s]
loss:   0.09 test_accuracy: 97.22%:  64%|██████▍   | 45/70 [00:09<00:01, 24.40it/s]
loss:   0.07 test_accuracy: 97.22%:  64%|██████▍   | 45/70 [00:09<00:01, 24.40it/s]
loss:   0.09 test_accuracy: 97.22%:  70%|███████   | 49/70 [00:09<00:00, 26.90it/s]
loss:   0.10 test_accuracy: 97.70%:  70%|███████   | 49/70 [00:09<00:00, 26.90it/s]
loss:   0.06 test_accuracy: 97.70%:  70%|███████   | 49/70 [00:09<00:00, 26.90it/s]
loss:   0.06 test_accuracy: 97.70%:  70%|███████   | 49/70 [00:09<00:00, 26.90it/s]
loss:   0.10 test_accuracy: 97.70%:  76%|███████▌  | 53/70 [00:09<00:00, 27.73it/s]
loss:   0.08 test_accuracy: 97.70%:  76%|███████▌  | 53/70 [00:09<00:00, 27.73it/s]
loss:   0.08 test_accuracy: 97.70%:  76%|███████▌  | 53/70 [00:09<00:00, 27.73it/s]
loss:   0.06 test_accuracy: 97.70%:  76%|███████▌  | 53/70 [00:09<00:00, 27.73it/s]
loss:   0.06 test_accuracy: 97.70%:  81%|████████▏ | 57/70 [00:09<00:00, 29.62it/s]
loss:   0.08 test_accuracy: 97.70%:  81%|████████▏ | 57/70 [00:09<00:00, 29.62it/s]
loss:   0.08 test_accuracy: 97.70%:  81%|████████▏ | 57/70 [00:09<00:00, 29.62it/s]
loss:   0.07 test_accuracy: 98.14%:  81%|████████▏ | 57/70 [00:09<00:00, 29.62it/s]
loss:   0.05 test_accuracy: 98.14%:  87%|████████▋ | 61/70 [00:09<00:00, 29.67it/s]
loss:   0.07 test_accuracy: 98.14%:  87%|████████▋ | 61/70 [00:10<00:00, 29.67it/s]
loss:   0.06 test_accuracy: 98.14%:  87%|████████▋ | 61/70 [00:10<00:00, 29.67it/s]
loss:   0.09 test_accuracy: 98.14%:  87%|████████▋ | 61/70 [00:10<00:00, 29.67it/s]
loss:   0.07 test_accuracy: 98.14%:  93%|█████████▎| 65/70 [00:10<00:00, 31.13it/s]
loss:   0.06 test_accuracy: 98.14%:  93%|█████████▎| 65/70 [00:10<00:00, 31.13it/s]
loss:   0.06 test_accuracy: 98.14%:  93%|█████████▎| 65/70 [00:10<00:00, 31.13it/s]
loss:   0.06 test_accuracy: 98.14%:  93%|█████████▎| 65/70 [00:10<00:00, 31.13it/s]
loss:   0.08 test_accuracy: 98.14%:  99%|█████████▊| 69/70 [00:10<00:00, 32.21it/s]
loss:   0.06 test_accuracy: 98.42%: 100%|██████████| 70/70 [00:10<00:00,  6.81it/s]
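test_accuracy reads nan% until the first evaluation: in the run above it first updates around step 10 and then roughly every 10 steps thereafter, which suggests the test set is only evaluated periodically to keep the training loop fast. A minimal sketch of that pattern (the interval, names, and placeholder values are inferred from the log, not taken from the script):

```python
import math

def evaluate() -> float:
    # Stand-in for a real pass over the test set.
    return 98.0

def train(steps: int = 70, eval_every: int = 10):
    test_accuracy = math.nan          # displayed as "nan%" until first eval
    history = []
    for step in range(1, steps + 1):
        loss = 1.0 / step             # stand-in for a real training step
        if step % eval_every == 0:    # periodic, not per-step, evaluation
            test_accuracy = evaluate()
        history.append((loss, test_accuracy))
    return history
```

Evaluating only every `eval_every` steps is why the displayed accuracy stays stale between updates while the loss changes every step.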

benchmark_train_efficientnet.py

NUM:2 BS:8 CNT:10

 10%|█         | 1/10 [00:09<01:26,  9.67s/it]
 20%|██        | 2/10 [00:09<00:32,  4.06s/it]
 30%|███       | 3/10 [00:09<00:15,  2.28s/it]
 40%|████      | 4/10 [00:10<00:08,  1.43s/it]
 50%|█████     | 5/10 [00:10<00:04,  1.04it/s]
 60%|██████    | 6/10 [00:10<00:02,  1.47it/s]
 70%|███████   | 7/10 [00:10<00:01,  1.94it/s]
 80%|████████  | 8/10 [00:10<00:00,  2.54it/s]
 90%|█████████ | 9/10 [00:10<00:00,  3.20it/s]
100%|██████████| 10/10 [00:10<00:00,  1.09s/it]
 175.27 ms cpy,  9470.44 ms run,   62.31 ms build, 9347.02 ms realize,   61.11 ms CL,    0.06 loss,  421 tensors, 0.04 GB used,      1.22 GFLOPS
  12.54 ms cpy,   103.18 ms run,   55.31 ms build,   45.06 ms realize,    2.82 ms CL,   -0.02 loss,  421 tensors, 0.04 GB used,    111.65 GFLOPS
  11.01 ms cpy,   142.94 ms run,   53.91 ms build,   86.17 ms realize,    2.85 ms CL,    0.07 loss,  421 tensors, 0.04 GB used,     80.60 GFLOPS
  11.05 ms cpy,   102.45 ms run,   53.68 ms build,   45.98 ms realize,    2.79 ms CL,    0.03 loss,  421 tensors, 0.04 GB used,    112.45 GFLOPS
  11.07 ms cpy,   102.35 ms run,   53.75 ms build,   45.86 ms realize,    2.74 ms CL,    0.07 loss,  421 tensors, 0.04 GB used,    112.56 GFLOPS
  11.14 ms cpy,   101.95 ms run,   53.89 ms build,   45.27 ms realize,    2.78 ms CL,   -0.00 loss,  421 tensors, 0.04 GB used,    113.01 GFLOPS
  11.14 ms cpy,   143.39 ms run,   54.09 ms build,   86.44 ms realize,    2.86 ms CL,    0.03 loss,  421 tensors, 0.04 GB used,     80.34 GFLOPS
  11.97 ms cpy,   103.14 ms run,   54.24 ms build,   46.12 ms realize,    2.78 ms CL,   -0.04 loss,  421 tensors, 0.04 GB used,    111.70 GFLOPS
  11.29 ms cpy,   102.81 ms run,   54.46 ms build,   45.58 ms realize,    2.77 ms CL,    0.04 loss,  421 tensors, 0.04 GB used,    112.06 GFLOPS
  11.15 ms cpy,   103.26 ms run,   54.59 ms build,   45.89 ms realize,    2.77 ms CL,   -0.05 loss,  421 tensors, 0.04 GB used,    111.57 GFLOPS
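The NUM:2 BS:8 CNT:10 header above reflects the script's tunables, which in tinygrad are typically read from environment variables via a small getenv helper. A hedged sketch of that pattern (the helper and comments here are illustrative, not tinygrad's actual implementation):

```python
import os

def getenv(key: str, default=0):
    # tinygrad-style knob: read from the environment, coerce to the type
    # of the default, fall back to the default when unset (illustrative).
    return type(default)(os.environ.get(key, default))

NUM = getenv("NUM", 2)   # model size selector (e.g. which EfficientNet variant)
BS  = getenv("BS", 8)    # batch size
CNT = getenv("CNT", 10)  # number of timed training steps
print(f"NUM:{NUM} BS:{BS} CNT:{CNT}")
```

With nothing set in the environment, the defaults reproduce the NUM:2 BS:8 CNT:10 header seen above; running with, say, `BS=16` would override just the batch size.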

coder.py

create model: 155.25 ms
download weights:  24.86 ms

ram used:  0.00 GB, layers.0.attention.wq.weight                      :   0%|          | 1/292 [00:00<01:35,  3.04it/s]
ram used:  0.03 GB, layers.0.attention.wk.weight                      :   0%|          | 1/292 [00:00<01:35,  3.04it/s]
ram used:  0.04 GB, layers.0.attention.wv.weight                      :   0%|          | 1/292 [00:00<01:35,  3.04it/s]
ram used:  0.05 GB, layers.0.attention.wo.weight                      :   1%|▏         | 4/292 [00:00<00:28, 10.07it/s]
ram used:  0.08 GB, layers.0.feed_forward.w1.weight                   :   1%|▏         | 4/292 [00:00<00:28, 10.07it/s]
ram used:  0.20 GB, layers.0.feed_forward.w2.weight                   :   2%|▏         | 6/292 [00:00<00:26, 10.89it/s]
ram used:  0.32 GB, layers.0.feed_forward.w3.weight                   :   2%|▏         | 6/292 [00:00<00:26, 10.89it/s]
ram used:  0.44 GB, layers.0.attention_norm.weight                    :   2%|▏         | 6/292 [00:00<00:26, 10.89it/s]
ram used:  0.44 GB, layers.0.ffn_norm.weight                          :   2%|▏         | 6/292 [00:00<00:26, 10.89it/s]
ram used:  0.44 GB, layers.1.attention.wq.weight                      :   2%|▏         | 6/292 [00:00<00:26, 10.89it/s]
ram used:  0.47 GB, layers.1.attention.wk.weight                      :   2%|▏         | 6/292 [00:00<00:26, 10.89it/s]
ram used:  0.48 GB, layers.1.attention.wv.weight                      :   4%|▍         | 12/292 [00:00<00:12, 23.09it/s]
ram used:  0.49 GB, layers.1.attention.wo.weight                      :   4%|▍         | 12/292 [00:00<00:12, 23.09it/s]
ram used:  0.52 GB, layers.1.feed_forward.w1.weight                   :   4%|▍         | 12/292 [00:00<00:12, 23.09it/s]
ram used:  0.64 GB, layers.1.feed_forward.w2.weight                   :   4%|▍         | 12/292 [00:00<00:12, 23.09it/s]
ram used:  0.75 GB, layers.1.feed_forward.w3.weight                   :   5%|▌         | 16/292 [00:00<00:11, 24.79it/s]
ram used:  0.87 GB, layers.1.attention_norm.weight                    :   5%|▌         | 16/292 [00:00<00:11, 24.79it/s]
ram used:  0.87 GB, layers.1.ffn_norm.weight                          :   5%|▌         | 16/292 [00:00<00:11, 24.79it/s]
ram used:  0.87 GB, layers.2.attention.wq.weight                      :   5%|▌         | 16/292 [00:00<00:11, 24.79it/s]
ram used:  0.91 GB, layers.2.attention.wk.weight                      :   5%|▌         | 16/292 [00:00<00:11, 24.79it/s]
ram used:  0.91 GB, layers.2.attention.wv.weight                      :   5%|▌         | 16/292 [00:00<00:11, 24.79it/s]
ram used:  0.92 GB, layers.2.attention.wo.weight                      :   5%|▌         | 16/292 [00:00<00:11, 24.79it/s]
ram used:  0.96 GB, layers.2.feed_forward.w1.weight                   :   5%|▌         | 16/292 [00:00<00:11, 24.79it/s]
ram used:  1.07 GB, layers.2.feed_forward.w2.weight                   :   8%|▊         | 24/292 [00:01<00:07, 36.49it/s]
ram used:  1.19 GB, layers.2.feed_forward.w3.weight                   :   8%|▊         | 24/292 [00:01<00:07, 36.49it/s]
ram used:  1.31 GB, layers.2.attention_norm.weight                    :   8%|▊         | 24/292 [00:01<00:07, 36.49it/s]
ram used:  1.31 GB, layers.2.ffn_norm.weight                          :   8%|▊         | 24/292 [00:01<00:07, 36.49it/s]
ram used:  1.31 GB, layers.3.attention.wq.weight                      :   8%|▊         | 24/292 [00:01<00:07, 36.49it/s]
ram used:  1.34 GB, layers.3.attention.wk.weight                      :   8%|▊         | 24/292 [00:01<00:07, 36.49it/s]
ram used:  1.35 GB, layers.3.attention.wv.weight                      :   8%|▊         | 24/292 [00:01<00:07, 36.49it/s]
ram used:  1.36 GB, layers.3.attention.wo.weight                      :   8%|▊         | 24/292 [00:01<00:07, 36.49it/s]
ram used:  1.39 GB, layers.3.feed_forward.w1.weight                   :  11%|█         | 32/292 [00:01<00:05, 44.77it/s]
ram used:  1.51 GB, layers.3.feed_forward.w2.weight                   :  11%|█         | 32/292 [00:01<00:05, 44.77it/s]
ram used:  1.63 GB, layers.3.feed_forward.w3.weight                   :  11%|█         | 32/292 [00:01<00:05, 44.77it/s]
ram used:  1.74 GB, layers.3.attention_norm.weight                    :  11%|█         | 32/292 [00:01<00:05, 44.77it/s]
ram used:  1.74 GB, layers.3.ffn_norm.weight                          :  11%|█         | 32/292 [00:01<00:05, 44.77it/s]
ram used:  1.74 GB, layers.4.attention.wq.weight                      :  13%|█▎        | 37/292 [00:01<00:05, 45.67it/s]
ram used:  1.78 GB, layers.4.attention.wk.weight                      :  13%|█▎        | 37/292 [00:01<00:05, 45.67it/s]
ram used:  1.79 GB, layers.4.attention.wv.weight                      :  13%|█▎        | 37/292 [00:01<00:05, 45.67it/s]
ram used:  1.80 GB, layers.4.attention.wo.weight                      :  13%|█▎        | 37/292 [00:01<00:05, 45.67it/s]
ram used:  1.83 GB, layers.4.feed_forward.w1.weight                   :  13%|█▎        | 37/292 [00:01<00:05, 45.67it/s]
ram used:  1.95 GB, layers.4.feed_forward.w2.weight                   :  14%|█▍        | 42/292 [00:01<00:05, 46.03it/s]
ram used:  2.06 GB, layers.4.feed_forward.w3.weight                   :  14%|█▍        | 42/292 [00:01<00:05, 46.03it/s]
ram used:  2.18 GB, layers.4.attention_norm.weight                    :  14%|█▍        | 42/292 [00:01<00:05, 46.03it/s]
ram used:  2.18 GB, layers.4.ffn_norm.weight                          :  14%|█▍        | 42/292 [00:01<00:05, 46.03it/s]
ram used:  2.18 GB, layers.5.attention.wq.weight                      :  14%|█▍        | 42/292 [00:01<00:05, 46.03it/s]
ram used:  2.21 GB, layers.5.attention.wk.weight                      :  14%|█▍        | 42/292 [00:01<00:05, 46.03it/s]
ram used:  2.22 GB, layers.5.attention.wv.weight                      :  14%|█▍        | 42/292 [00:01<00:05, 46.03it/s]
ram used:  2.23 GB, layers.5.attention.wo.weight                      :  14%|█▍        | 42/292 [00:01<00:05, 46.03it/s]
ram used:  2.27 GB, layers.5.feed_forward.w1.weight                   :  17%|█▋        | 50/292 [00:01<00:04, 51.86it/s]
ram used:  2.38 GB, layers.5.feed_forward.w2.weight                   :  17%|█▋        | 50/292 [00:01<00:04, 51.86it/s]
ram used:  2.50 GB, layers.5.feed_forward.w3.weight                   :  17%|█▋        | 50/292 [00:01<00:04, 51.86it/s]
ram used:  2.62 GB, layers.5.attention_norm.weight                    :  17%|█▋        | 50/292 [00:01<00:04, 51.86it/s]
ram used:  2.62 GB, layers.5.ffn_norm.weight                          :  17%|█▋        | 50/292 [00:01<00:04, 51.86it/s]
ram used:  2.62 GB, layers.6.attention.wq.weight                      :  17%|█▋        | 50/292 [00:01<00:04, 51.86it/s]
ram used:  2.65 GB, layers.6.attention.wk.weight                      :  19%|█▉        | 56/292 [00:01<00:04, 53.00it/s]
ram used:  2.66 GB, layers.6.attention.wv.weight                      :  19%|█▉        | 56/292 [00:01<00:04, 53.00it/s]
ram used:  2.67 GB, layers.6.attention.wo.weight                      :  19%|█▉        | 56/292 [00:01<00:04, 53.00it/s]
ram used:  2.70 GB, layers.6.feed_forward.w1.weight                   :  19%|█▉        | 56/292 [00:01<00:04, 53.00it/s]
ram used:  2.82 GB, layers.6.feed_forward.w2.weight                   :  19%|█▉        | 56/292 [00:01<00:04, 53.00it/s]
ram used:  2.94 GB, layers.6.feed_forward.w3.weight                   :  19%|█▉        | 56/292 [00:01<00:04, 53.00it/s]
ram used:  3.05 GB, layers.6.attention_norm.weight                    :  21%|██        | 62/292 [00:01<00:04, 48.91it/s]
ram used:  3.05 GB, layers.6.ffn_norm.weight                          :  21%|██        | 62/292 [00:01<00:04, 48.91it/s]
ram used:  3.05 GB, layers.7.attention.wq.weight                      :  21%|██        | 62/292 [00:01<00:04, 48.91it/s]
ram used:  3.09 GB, layers.7.attention.wk.weight                      :  21%|██        | 62/292 [00:01<00:04, 48.91it/s]
ram used:  3.10 GB, layers.7.attention.wv.weight                      :  21%|██        | 62/292 [00:01<00:04, 48.91it/s]
ram used:  3.10 GB, layers.7.attention.wo.weight                      :  21%|██        | 62/292 [00:01<00:04, 48.91it/s]
ram used:  3.14 GB, layers.7.feed_forward.w1.weight                   :  21%|██        | 62/292 [00:01<00:04, 48.91it/s]
ram used:  3.25 GB, layers.7.feed_forward.w2.weight                   :  24%|██▎       | 69/292 [00:01<00:04, 51.20it/s]
ram used:  3.37 GB, layers.7.feed_forward.w3.weight                   :  24%|██▎       | 69/292 [00:01<00:04, 51.20it/s]
ram used:  3.49 GB, layers.7.attention_norm.weight                    :  24%|██▎       | 69/292 [00:01<00:04, 51.20it/s]
ram used:  3.49 GB, layers.7.ffn_norm.weight                          :  24%|██▎       | 69/292 [00:01<00:04, 51.20it/s]
ram used:  3.49 GB, layers.8.attention.wq.weight                      :  24%|██▎       | 69/292 [00:01<00:04, 51.20it/s]
ram used:  3.52 GB, layers.8.attention.wk.weight                      :  24%|██▎       | 69/292 [00:01<00:04, 51.20it/s]
ram used:  3.53 GB, layers.8.attention.wv.weight                      :  24%|██▎       | 69/292 [00:01<00:04, 51.20it/s]
ram used:  3.54 GB, layers.8.attention.wo.weight                      :  24%|██▎       | 69/292 [00:01<00:04, 51.20it/s]
ram used:  3.57 GB, layers.8.feed_forward.w1.weight                   :  26%|██▋       | 77/292 [00:01<00:03, 55.18it/s]
ram used:  3.69 GB, layers.8.feed_forward.w2.weight                   :  26%|██▋       | 77/292 [00:01<00:03, 55.18it/s]
ram used:  3.81 GB, layers.8.feed_forward.w3.weight                   :  26%|██▋       | 77/292 [00:02<00:03, 55.18it/s]
ram used:  3.93 GB, layers.8.attention_norm.weight                    :  26%|██▋       | 77/292 [00:02<00:03, 55.18it/s]
ram used:  3.93 GB, layers.8.ffn_norm.weight                          :  26%|██▋       | 77/292 [00:02<00:03, 55.18it/s]
ram used:  3.93 GB, layers.9.attention.wq.weight                      :  26%|██▋       | 77/292 [00:02<00:03, 55.18it/s]
ram used:  3.96 GB, layers.9.attention.wk.weight                      :  28%|██▊       | 83/292 [00:02<00:03, 55.42it/s]
ram used:  3.97 GB, layers.9.attention.wv.weight                      :  28%|██▊       | 83/292 [00:02<00:03, 55.42it/s]
ram used:  3.98 GB, layers.9.attention.wo.weight                      :  28%|██▊       | 83/292 [00:02<00:03, 55.42it/s]
ram used:  4.01 GB, layers.9.feed_forward.w1.weight                   :  28%|██▊       | 83/292 [00:02<00:03, 55.42it/s]
ram used:  4.13 GB, layers.9.feed_forward.w2.weight                   :  28%|██▊       | 83/292 [00:02<00:03, 55.42it/s]
ram used:  4.24 GB, layers.9.feed_forward.w3.weight                   :  28%|██▊       | 83/292 [00:02<00:03, 55.42it/s]
ram used:  4.36 GB, layers.9.attention_norm.weight                    :  30%|███       | 89/292 [00:02<00:04, 50.54it/s]
ram used:  4.36 GB, layers.9.ffn_norm.weight                          :  30%|███       | 89/292 [00:02<00:04, 50.54it/s]
ram used:  4.36 GB, layers.10.attention.wq.weight                     :  30%|███       | 89/292 [00:02<00:04, 50.54it/s]
ram used:  4.40 GB, layers.10.attention.wk.weight                     :  30%|███       | 89/292 [00:02<00:04, 50.54it/s]
ram used:  4.40 GB, layers.10.attention.wv.weight                     :  30%|███       | 89/292 [00:02<00:04, 50.54it/s]
ram used:  4.41 GB, layers.10.attention.wo.weight                     :  30%|███       | 89/292 [00:02<00:04, 50.54it/s]
ram used:  4.45 GB, layers.10.feed_forward.w1.weight                  :  30%|███       | 89/292 [00:02<00:04, 50.54it/s]
ram used:  4.56 GB, layers.10.feed_forward.w2.weight                  :  33%|███▎      | 96/292 [00:02<00:03, 52.40it/s]
ram used:  4.68 GB, layers.10.feed_forward.w3.weight                  :  33%|███▎      | 96/292 [00:02<00:03, 52.40it/s]
ram used:  4.80 GB, layers.10.attention_norm.weight                   :  33%|███▎      | 96/292 [00:02<00:03, 52.40it/s]
ram used:  4.80 GB, layers.10.ffn_norm.weight                         :  33%|███▎      | 96/292 [00:02<00:03, 52.40it/s]
ram used:  4.80 GB, layers.11.attention.wq.weight                     :  33%|███▎      | 96/292 [00:02<00:03, 52.40it/s]
ram used:  4.83 GB, layers.11.attention.wk.weight                     :  33%|███▎      | 96/292 [00:02<00:03, 52.40it/s]
ram used:  4.84 GB, layers.11.attention.wv.weight                     :  33%|███▎      | 96/292 [00:02<00:03, 52.40it/s]
ram used:  4.85 GB, layers.11.attention.wo.weight                     :  33%|███▎      | 96/292 [00:02<00:03, 52.40it/s]
ram used:  4.88 GB, layers.11.feed_forward.w1.weight                  :  36%|███▌      | 104/292 [00:02<00:03, 55.97it/s]
ram used:  5.00 GB, layers.11.feed_forward.w2.weight                  :  36%|███▌      | 104/292 [00:02<00:03, 55.97it/s]
ram used:  5.12 GB, layers.11.feed_forward.w3.weight                  :  36%|███▌      | 104/292 [00:02<00:03, 55.97it/s]
ram used:  5.23 GB, layers.11.attention_norm.weight                   :  36%|███▌      | 104/292 [00:02<00:03, 55.97it/s]
ram used:  5.23 GB, layers.11.ffn_norm.weight                         :  36%|███▌      | 104/292 [00:02<00:03, 55.97it/s]
ram used:  5.23 GB, layers.12.attention.wq.weight                     :  36%|███▌      | 104/292 [00:02<00:03, 55.97it/s]
ram used:  5.27 GB, layers.12.attention.wk.weight                     :  38%|███▊      | 110/292 [00:02<00:03, 56.03it/s]
ram used:  5.28 GB, layers.12.attention.wv.weight                     :  38%|███▊      | 110/292 [00:02<00:03, 56.03it/s]
ram used:  5.29 GB, layers.12.attention.wo.weight                     :  38%|███▊      | 110/292 [00:02<00:03, 56.03it/s]
ram used:  5.32 GB, layers.12.feed_forward.w1.weight                  :  38%|███▊      | 110/292 [00:02<00:03, 56.03it/s]
ram used:  5.44 GB, layers.12.feed_forward.w2.weight                  :  38%|███▊      | 110/292 [00:02<00:03, 56.03it/s]
ram used:  5.55 GB, layers.12.feed_forward.w3.weight                  :  38%|███▊      | 110/292 [00:02<00:03, 56.03it/s]
ram used:  5.67 GB, layers.12.attention_norm.weight                   :  40%|███▉      | 116/292 [00:02<00:03, 51.03it/s]
ram used:  5.67 GB, layers.12.ffn_norm.weight                         :  40%|███▉      | 116/292 [00:02<00:03, 51.03it/s]
ram used:  5.67 GB, layers.13.attention.wq.weight                     :  40%|███▉      | 116/292 [00:02<00:03, 51.03it/s]
ram used:  5.70 GB, layers.13.attention.wk.weight                     :  40%|███▉      | 116/292 [00:02<00:03, 51.03it/s]
ram used:  5.71 GB, layers.13.attention.wv.weight                     :  40%|███▉      | 116/292 [00:02<00:03, 51.03it/s]
ram used:  5.72 GB, layers.13.attention.wo.weight                     :  40%|███▉      | 116/292 [00:02<00:03, 51.03it/s]
ram used:  5.75 GB, layers.13.feed_forward.w1.weight                  :  40%|███▉      | 116/292 [00:02<00:03, 51.03it/s]
ram used:  5.87 GB, layers.13.feed_forward.w2.weight                  :  42%|████▏     | 123/292 [00:02<00:03, 52.60it/s]
ram used:  5.99 GB, layers.13.feed_forward.w3.weight                  :  42%|████▏     | 123/292 [00:02<00:03, 52.60it/s]
ram used:  6.11 GB, layers.13.attention_norm.weight                   :  42%|████▏     | 123/292 [00:02<00:03, 52.60it/s]
ram used:  6.11 GB, layers.13.ffn_norm.weight                         :  42%|████▏     | 123/292 [00:02<00:03, 52.60it/s]
ram used:  6.11 GB, layers.14.attention.wq.weight                     :  42%|████▏     | 123/292 [00:02<00:03, 52.60it/s]
ram used:  6.14 GB, layers.14.attention.wk.weight                     :  42%|████▏     | 123/292 [00:02<00:03, 52.60it/s]
ram used:  6.15 GB, layers.14.attention.wv.weight                     :  42%|████▏     | 123/292 [00:02<00:03, 52.60it/s]
ram used:  6.16 GB, layers.14.attention.wo.weight                     :  42%|████▏     | 123/292 [00:02<00:03, 52.60it/s]
ram used:  6.19 GB, layers.14.feed_forward.w1.weight                  :  45%|████▍     | 131/292 [00:02<00:02, 55.36it/s]
ram used:  6.31 GB, layers.14.feed_forward.w2.weight                  :  45%|████▍     | 131/292 [00:02<00:02, 55.36it/s]
ram used:  6.43 GB, layers.14.feed_forward.w3.weight                  :  45%|████▍     | 131/292 [00:03<00:02, 55.36it/s]
ram used:  6.54 GB, layers.14.attention_norm.weight                   :  45%|████▍     | 131/292 [00:03<00:02, 55.36it/s]
ram used:  6.54 GB, layers.14.ffn_norm.weight                         :  45%|████▍     | 131/292 [00:03<00:02, 55.36it/s]
ram used:  6.54 GB, layers.15.attention.wq.weight                     :  45%|████▍     | 131/292 [00:03<00:02, 55.36it/s]
ram used:  6.58 GB, layers.15.attention.wk.weight                     :  47%|████▋     | 137/292 [00:03<00:02, 55.27it/s]
ram used:  6.59 GB, layers.15.attention.wv.weight                     :  47%|████▋     | 137/292 [00:03<00:02, 55.27it/s]
ram used:  6.59 GB, layers.15.attention.wo.weight                     :  47%|████▋     | 137/292 [00:03<00:02, 55.27it/s]
ram used:  6.63 GB, layers.15.feed_forward.w1.weight                  :  47%|████▋     | 137/292 [00:03<00:02, 55.27it/s]
ram used:  6.74 GB, layers.15.feed_forward.w2.weight                  :  47%|████▋     | 137/292 [00:03<00:02, 55.27it/s]
ram used:  6.86 GB, layers.15.feed_forward.w3.weight                  :  47%|████▋     | 137/292 [00:03<00:02, 55.27it/s]
ram used:  6.98 GB, layers.15.attention_norm.weight                   :  49%|████▉     | 143/292 [00:03<00:02, 50.26it/s]
ram used:  6.98 GB, layers.15.ffn_norm.weight                         :  49%|████▉     | 143/292 [00:03<00:02, 50.26it/s]
ram used:  6.98 GB, layers.16.attention.wq.weight                     :  49%|████▉     | 143/292 [00:03<00:02, 50.26it/s]
ram used:  7.01 GB, layers.16.attention.wk.weight                     :  49%|████▉     | 143/292 [00:03<00:02, 50.26it/s]
ram used:  7.02 GB, layers.16.attention.wv.weight                     :  49%|████▉     | 143/292 [00:03<00:02, 50.26it/s]
ram used:  7.03 GB, layers.16.attention.wo.weight                     :  49%|████▉     | 143/292 [00:03<00:02, 50.26it/s]
ram used:  7.06 GB, layers.16.feed_forward.w1.weight                  :  49%|████▉     | 143/292 [00:03<00:02, 50.26it/s]
ram used:  7.18 GB, layers.16.feed_forward.w2.weight                  :  51%|█████▏    | 150/292 [00:03<00:02, 51.02it/s]
ram used:  7.30 GB, layers.16.feed_forward.w3.weight                  :  51%|█████▏    | 150/292 [00:03<00:02, 51.02it/s]
ram used:  7.42 GB, layers.16.attention_norm.weight                   :  51%|█████▏    | 150/292 [00:03<00:02, 51.02it/s]
ram used:  7.42 GB, layers.16.ffn_norm.weight                         :  51%|█████▏    | 150/292 [00:03<00:02, 51.02it/s]
ram used:  7.42 GB, layers.17.attention.wq.weight                     :  51%|█████▏    | 150/292 [00:03<00:02, 51.02it/s]
ram used:  7.45 GB, layers.17.attention.wk.weight                     :  51%|█████▏    | 150/292 [00:03<00:02, 51.02it/s]
ram used:  7.46 GB, layers.17.attention.wv.weight                     :  51%|█████▏    | 150/292 [00:03<00:02, 51.02it/s]
ram used:  7.47 GB, layers.17.attention.wo.weight                     :  51%|█████▏    | 150/292 [00:03<00:02, 51.02it/s]
ram used:  7.50 GB, layers.17.feed_forward.w1.weight                  :  54%|█████▍    | 158/292 [00:03<00:02, 53.90it/s]
ram used:  7.62 GB, layers.17.feed_forward.w2.weight                  :  54%|█████▍    | 158/292 [00:03<00:02, 53.90it/s]
ram used:  7.73 GB, layers.17.feed_forward.w3.weight                  :  54%|█████▍    | 158/292 [00:03<00:02, 53.90it/s]
ram used:  7.85 GB, layers.17.attention_norm.weight                   :  54%|█████▍    | 158/292 [00:03<00:02, 53.90it/s]
ram used:  7.85 GB, layers.17.ffn_norm.weight                         :  54%|█████▍    | 158/292 [00:03<00:02, 53.90it/s]
ram used:  7.85 GB, layers.18.attention.wq.weight                     :  54%|█████▍    | 158/292 [00:03<00:02, 53.90it/s]
ram used:  7.89 GB, layers.18.attention.wk.weight                     :  56%|█████▌    | 164/292 [00:03<00:02, 54.35it/s]
ram used:  7.89 GB, layers.18.attention.wv.weight                     :  56%|█████▌    | 164/292 [00:03<00:02, 54.35it/s]
ram used:  7.90 GB, layers.18.attention.wo.weight                     :  56%|█████▌    | 164/292 [00:03<00:02, 54.35it/s]
ram used:  7.94 GB, layers.18.feed_forward.w1.weight                  :  56%|█████▌    | 164/292 [00:03<00:02, 54.35it/s]
ram used:  8.05 GB, layers.18.feed_forward.w2.weight                  :  56%|█████▌    | 164/292 [00:03<00:02, 54.35it/s]
ram used:  8.17 GB, layers.18.feed_forward.w3.weight                  :  56%|█████▌    | 164/292 [00:03<00:02, 54.35it/s]
ram used:  8.29 GB, layers.18.attention_norm.weight                   :  58%|█████▊    | 170/292 [00:03<00:02, 49.50it/s]
ram used:  8.29 GB, layers.18.ffn_norm.weight                         :  58%|█████▊    | 170/292 [00:03<00:02, 49.50it/s]
ram used:  8.29 GB, layers.19.attention.wq.weight                     :  58%|█████▊    | 170/292 [00:03<00:02, 49.50it/s]
ram used:  8.32 GB, layers.19.attention.wk.weight                     :  58%|█████▊    | 170/292 [00:03<00:02, 49.50it/s]
ram used:  8.33 GB, layers.19.attention.wv.weight                     :  58%|█████▊    | 170/292 [00:03<00:02, 49.50it/s]
ram used:  8.34 GB, layers.19.attention.wo.weight                     :  58%|█████▊    | 170/292 [00:03<00:02, 49.50it/s]
ram used:  8.37 GB, layers.19.feed_forward.w1.weight                  :  58%|█████▊    | 170/292 [00:03<00:02, 49.50it/s]
ram used:  8.49 GB, layers.19.feed_forward.w2.weight                  :  61%|██████    | 177/292 [00:03<00:02, 51.18it/s]
ram used:  8.61 GB, layers.19.feed_forward.w3.weight                  :  61%|██████    | 177/292 [00:03<00:02, 51.18it/s]
ram used:  8.72 GB, layers.19.attention_norm.weight                   :  61%|██████    | 177/292 [00:03<00:02, 51.18it/s]
ram used:  8.72 GB, layers.19.ffn_norm.weight                         :  61%|██████    | 177/292 [00:03<00:02, 51.18it/s]
ram used:  8.72 GB, layers.20.attention.wq.weight                     :  61%|██████    | 177/292 [00:03<00:02, 51.18it/s]
ram used:  8.76 GB, layers.20.attention.wk.weight                     :  61%|██████    | 177/292 [00:03<00:02, 51.18it/s]
ram used:  8.77 GB, layers.20.attention.wv.weight                     :  61%|██████    | 177/292 [00:03<00:02, 51.18it/s]
ram used:  8.77 GB, layers.20.attention.wo.weight                     :  61%|██████    | 177/292 [00:03<00:02, 51.18it/s]
ram used:  8.81 GB, layers.20.feed_forward.w1.weight                  :  63%|██████▎   | 185/292 [00:04<00:01, 54.40it/s]
ram used:  8.93 GB, layers.20.feed_forward.w2.weight                  :  63%|██████▎   | 185/292 [00:04<00:01, 54.40it/s]
ram used:  9.04 GB, layers.20.feed_forward.w3.weight                  :  63%|██████▎   | 185/292 [00:04<00:01, 54.40it/s]
ram used:  9.16 GB, layers.20.attention_norm.weight                   :  63%|██████▎   | 185/292 [00:04<00:01, 54.40it/s]
ram used:  9.16 GB, layers.20.ffn_norm.weight                         :  63%|██████▎   | 185/292 [00:04<00:01, 54.40it/s]
ram used:  9.16 GB, layers.21.attention.wq.weight                     :  63%|██████▎   | 185/292 [00:04<00:01, 54.40it/s]
ram used:  9.19 GB, layers.21.attention.wk.weight                     :  65%|██████▌   | 191/292 [00:04<00:01, 54.59it/s]
ram used:  9.20 GB, layers.21.attention.wv.weight                     :  65%|██████▌   | 191/292 [00:04<00:01, 54.59it/s]
ram used:  9.21 GB, layers.21.attention.wo.weight                     :  65%|██████▌   | 191/292 [00:04<00:01, 54.59it/s]
ram used:  9.24 GB, layers.21.feed_forward.w1.weight                  :  65%|██████▌   | 191/292 [00:04<00:01, 54.59it/s]
ram used:  9.36 GB, layers.21.feed_forward.w2.weight                  :  65%|██████▌   | 191/292 [00:04<00:01, 54.59it/s]
ram used:  9.48 GB, layers.21.feed_forward.w3.weight                  :  65%|██████▌   | 191/292 [00:04<00:01, 54.59it/s]
ram used:  9.60 GB, layers.21.attention_norm.weight                   :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.60 GB, layers.21.ffn_norm.weight                         :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.60 GB, layers.22.attention.wq.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.63 GB, layers.22.attention.wk.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.64 GB, layers.22.attention.wv.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.65 GB, layers.22.attention.wo.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.22.feed_forward.w1.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.22.feed_forward.w2.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.22.feed_forward.w3.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.22.attention_norm.weight                   :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.22.ffn_norm.weight                         :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.23.attention.wq.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.23.attention.wk.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.23.attention.wv.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.23.attention.wo.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.23.feed_forward.w1.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.23.feed_forward.w2.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.23.feed_forward.w3.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.23.attention_norm.weight                   :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.23.ffn_norm.weight                         :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.24.attention.wq.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.24.attention.wk.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.24.attention.wv.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.24.attention.wo.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.24.feed_forward.w1.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.24.feed_forward.w2.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.24.feed_forward.w3.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.24.attention_norm.weight                   :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.24.ffn_norm.weight                         :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.25.attention.wq.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.25.attention.wk.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.25.attention.wv.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.25.attention.wo.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.25.feed_forward.w1.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.25.feed_forward.w2.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.25.feed_forward.w3.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.25.attention_norm.weight                   :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.25.ffn_norm.weight                         :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.26.attention.wq.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.26.attention.wk.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.26.attention.wv.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.26.attention.wo.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.26.feed_forward.w1.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.26.feed_forward.w2.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.26.feed_forward.w3.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.26.attention_norm.weight                   :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.26.ffn_norm.weight                         :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.27.attention.wq.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.27.attention.wk.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.27.attention.wv.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.27.attention.wo.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.27.feed_forward.w1.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.27.feed_forward.w2.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.27.feed_forward.w3.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.27.attention_norm.weight                   :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.27.ffn_norm.weight                         :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.28.attention.wq.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.28.attention.wk.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.28.attention.wv.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.28.attention.wo.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.28.feed_forward.w1.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.28.feed_forward.w2.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.28.feed_forward.w3.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.28.attention_norm.weight                   :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.28.ffn_norm.weight                         :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.29.attention.wq.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.29.attention.wk.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.29.attention.wv.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.29.attention.wo.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.29.feed_forward.w1.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.29.feed_forward.w2.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.29.feed_forward.w3.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.29.attention_norm.weight                   :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.29.ffn_norm.weight                         :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.30.attention.wq.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.30.attention.wk.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.30.attention.wv.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.30.attention.wo.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.30.feed_forward.w1.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.30.feed_forward.w2.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.30.feed_forward.w3.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.30.attention_norm.weight                   :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.30.ffn_norm.weight                         :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.31.attention.wq.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.31.attention.wk.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.31.attention.wv.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.31.attention.wo.weight                     :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.31.feed_forward.w1.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.31.feed_forward.w2.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.31.feed_forward.w3.weight                  :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.31.attention_norm.weight                   :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, layers.31.ffn_norm.weight                         :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, norm.weight                                       :  67%|██████▋   | 197/292 [00:04<00:01, 49.45it/s]
ram used:  9.68 GB, tok_embeddings.weight                             :  99%|█████████▉| 290/292 [00:04<00:00, 222.51it/s]
ram used:  9.94 GB, output.weight                                     :  99%|█████████▉| 290/292 [00:04<00:00, 222.51it/s]
ram used:  9.94 GB, freqs_cis                                         : 100%|██████████| 292/292 [00:04<00:00, 65.78it/s] 
loaded weights in 4442.99 ms, 9.94 GB loaded at 2.24 GB/s

  0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.0.attention.wq.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.0.attention.wk.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.0.attention.wv.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.0.attention.wo.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.0.feed_forward.w1.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.0.feed_forward.w2.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.0.feed_forward.w3.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.0.attention_norm.weight                    :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.0.ffn_norm.weight                          :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.1.attention.wq.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.1.attention.wk.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.1.attention.wv.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.1.attention.wo.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.1.feed_forward.w1.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.1.feed_forward.w2.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.1.feed_forward.w3.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.1.attention_norm.weight                    :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.1.ffn_norm.weight                          :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.2.attention.wq.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.2.attention.wk.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.2.attention.wv.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.2.attention.wo.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.2.feed_forward.w1.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.2.feed_forward.w2.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.2.feed_forward.w3.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.2.attention_norm.weight                    :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.2.ffn_norm.weight                          :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.3.attention.wq.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.3.attention.wk.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.3.attention.wv.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.3.attention.wo.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.3.feed_forward.w1.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.3.feed_forward.w2.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.3.feed_forward.w3.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.3.attention_norm.weight                    :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.3.ffn_norm.weight                          :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.4.attention.wq.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.4.attention.wk.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.4.attention.wv.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.4.attention.wo.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.4.feed_forward.w1.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.4.feed_forward.w2.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.4.feed_forward.w3.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.4.attention_norm.weight                    :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.4.ffn_norm.weight                          :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.5.attention.wq.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.5.attention.wk.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.5.attention.wv.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.5.attention.wo.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.5.feed_forward.w1.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.5.feed_forward.w2.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.5.feed_forward.w3.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.5.attention_norm.weight                    :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.5.ffn_norm.weight                          :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.6.attention.wq.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.6.attention.wk.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.6.attention.wv.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.6.attention.wo.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.6.feed_forward.w1.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.6.feed_forward.w2.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.6.feed_forward.w3.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.6.attention_norm.weight                    :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.6.ffn_norm.weight                          :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.7.attention.wq.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.7.attention.wk.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.7.attention.wv.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.7.attention.wo.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.7.feed_forward.w1.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.7.feed_forward.w2.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.7.feed_forward.w3.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.7.attention_norm.weight                    :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.7.ffn_norm.weight                          :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.8.attention.wq.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.8.attention.wk.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.8.attention.wv.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.8.attention.wo.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.8.feed_forward.w1.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.8.feed_forward.w2.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.8.feed_forward.w3.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.8.attention_norm.weight                    :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.8.ffn_norm.weight                          :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.9.attention.wq.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.9.attention.wk.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.9.attention.wv.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.9.attention.wo.weight                      :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.9.feed_forward.w1.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.9.feed_forward.w2.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.9.feed_forward.w3.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.9.attention_norm.weight                    :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.9.ffn_norm.weight                          :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.10.attention.wq.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.10.attention.wk.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.10.attention.wv.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.10.attention.wo.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.10.feed_forward.w1.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.10.feed_forward.w2.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.10.feed_forward.w3.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.10.attention_norm.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.10.ffn_norm.weight                         :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.11.attention.wq.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.11.attention.wk.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.11.attention.wv.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.11.attention.wo.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.11.feed_forward.w1.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.11.feed_forward.w2.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.11.feed_forward.w3.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.11.attention_norm.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.11.ffn_norm.weight                         :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.12.attention.wq.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.12.attention.wk.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.12.attention.wv.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.12.attention.wo.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.12.feed_forward.w1.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.12.feed_forward.w2.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.12.feed_forward.w3.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.12.attention_norm.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.12.ffn_norm.weight                         :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.13.attention.wq.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.13.attention.wk.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.13.attention.wv.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.13.attention.wo.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.13.feed_forward.w1.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.13.feed_forward.w2.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.13.feed_forward.w3.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.13.attention_norm.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.13.ffn_norm.weight                         :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.14.attention.wq.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.14.attention.wk.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.14.attention.wv.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.14.attention.wo.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.14.feed_forward.w1.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.14.feed_forward.w2.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.14.feed_forward.w3.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.14.attention_norm.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.14.ffn_norm.weight                         :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.15.attention.wq.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.15.attention.wk.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.15.attention.wv.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.15.attention.wo.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.15.feed_forward.w1.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.15.feed_forward.w2.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.15.feed_forward.w3.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.15.attention_norm.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.15.ffn_norm.weight                         :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.16.attention.wq.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.16.attention.wk.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.16.attention.wv.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.16.attention.wo.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.16.feed_forward.w1.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.16.feed_forward.w2.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.16.feed_forward.w3.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.16.attention_norm.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.16.ffn_norm.weight                         :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.17.attention.wq.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.17.attention.wk.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.17.attention.wv.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.17.attention.wo.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.17.feed_forward.w1.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.17.feed_forward.w2.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.17.feed_forward.w3.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.17.attention_norm.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.17.ffn_norm.weight                         :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.18.attention.wq.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.18.attention.wk.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.18.attention.wv.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.18.attention.wo.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.18.feed_forward.w1.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.18.feed_forward.w2.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.18.feed_forward.w3.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.18.attention_norm.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.18.ffn_norm.weight                         :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.19.attention.wq.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.19.attention.wk.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.19.attention.wv.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.19.attention.wo.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.19.feed_forward.w1.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.19.feed_forward.w2.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.19.feed_forward.w3.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.19.attention_norm.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.19.ffn_norm.weight                         :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.20.attention.wq.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.20.attention.wk.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.20.attention.wv.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.20.attention.wo.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.20.feed_forward.w1.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.20.feed_forward.w2.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.20.feed_forward.w3.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.20.attention_norm.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.20.ffn_norm.weight                         :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.21.attention.wq.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.21.attention.wk.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.21.attention.wv.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.21.attention.wo.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.21.feed_forward.w1.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.21.feed_forward.w2.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.21.feed_forward.w3.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.21.attention_norm.weight                   :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.21.ffn_norm.weight                         :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.22.attention.wq.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.22.attention.wk.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.22.attention.wv.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.22.attention.wo.weight                     :   0%|          | 0/292 [00:00<?, ?it/s]
ram used:  9.94 GB, layers.22.feed_forward.w1.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used: 10.06 GB, layers.22.feed_forward.w2.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used: 10.18 GB, layers.22.feed_forward.w3.weight                  :   0%|          | 0/292 [00:00<?, ?it/s]
ram used: 10.18 GB, layers.22.feed_forward.w3.weight                  :  70%|███████   | 205/292 [00:00<00:00, 1533.22it/s]
ram used: 10.30 GB, layers.22.attention_norm.weight                   :  70%|███████   | 205/292 [00:00<00:00, 1533.22it/s]
ram used: 10.30 GB, layers.22.ffn_norm.weight                         :  70%|███████   | 205/292 [00:00<00:00, 1533.22it/s]
ram used: 10.30 GB, layers.23.attention.wq.weight                     :  70%|███████   | 205/292 [00:00<00:00, 1533.22it/s]
ram used: 10.33 GB, layers.23.attention.wk.weight                     :  70%|███████   | 205/292 [00:00<00:00, 1533.22it/s]
ram used: 10.34 GB, layers.23.attention.wv.weight                     :  70%|███████   | 205/292 [00:00<00:00, 1533.22it/s]
ram used: 10.35 GB, layers.23.attention.wo.weight                     :  70%|███████   | 205/292 [00:00<00:00, 1533.22it/s]
ram used: 10.38 GB, layers.23.feed_forward.w1.weight                  :  70%|███████   | 205/292 [00:00<00:00, 1533.22it/s]
ram used: 10.50 GB, layers.23.feed_forward.w2.weight                  :  70%|███████   | 205/292 [00:00<00:00, 1533.22it/s]
ram used: 10.61 GB, layers.23.feed_forward.w3.weight                  :  70%|███████   | 205/292 [00:00<00:00, 1533.22it/s]
ram used: 10.73 GB, layers.23.attention_norm.weight                   :  70%|███████   | 205/292 [00:00<00:00, 1533.22it/s]
ram used: 10.73 GB, layers.23.ffn_norm.weight                         :  70%|███████   | 205/292 [00:00<00:00, 1533.22it/s]
ram used: 10.73 GB, layers.24.attention.wq.weight                     :  70%|███████   | 205/292 [00:00<00:00, 1533.22it/s]
ram used: 10.77 GB, layers.24.attention.wk.weight                     :  70%|███████   | 205/292 [00:00<00:00, 1533.22it/s]
ram used: 10.77 GB, layers.24.attention.wv.weight                     :  70%|███████   | 205/292 [00:00<00:00, 1533.22it/s]
ram used: 10.78 GB, layers.24.attention.wo.weight                     :  70%|███████   | 205/292 [00:00<00:00, 1533.22it/s]
ram used: 10.82 GB, layers.24.feed_forward.w1.weight                  :  70%|███████   | 205/292 [00:00<00:00, 1533.22it/s]
ram used: 10.93 GB, layers.24.feed_forward.w2.weight                  :  70%|███████   | 205/292 [00:00<00:00, 1533.22it/s]
ram used: 11.05 GB, layers.24.feed_forward.w3.weight                  :  70%|███████   | 205/292 [00:00<00:00, 1533.22it/s]
ram used: 11.17 GB, layers.24.attention_norm.weight                   :  70%|███████   | 205/292 [00:00<00:00, 1533.22it/s]
ram used: 11.17 GB, layers.24.ffn_norm.weight                         :  70%|███████   | 205/292 [00:00<00:00, 1533.22it/s]
ram used: 11.17 GB, layers.25.attention.wq.weight                     :  70%|███████   | 205/292 [00:00<00:00, 1533.22it/s]
ram used: 11.20 GB, layers.25.attention.wk.weight                     :  70%|███████   | 205/292 [00:00<00:00, 1533.22it/s]
ram used: 11.21 GB, layers.25.attention.wv.weight                     :  70%|███████   | 205/292 [00:00<00:00, 1533.22it/s]
ram used: 11.22 GB, layers.25.attention.wo.weight                     :  70%|███████   | 205/292 [00:00<00:00, 1533.22it/s]
ram used: 11.25 GB, layers.25.feed_forward.w1.weight                  :  70%|███████   | 205/292 [00:00<00:00, 1533.22it/s]
ram used: 11.37 GB, layers.25.feed_forward.w2.weight                  :  70%|███████   | 205/292 [00:01<00:00, 1533.22it/s]
ram used: 11.49 GB, layers.25.feed_forward.w3.weight                  :  70%|███████   | 205/292 [00:01<00:00, 1533.22it/s]
ram used: 11.60 GB, layers.25.attention_norm.weight                   :  70%|███████   | 205/292 [00:01<00:00, 1533.22it/s]
ram used: 11.60 GB, layers.25.ffn_norm.weight                         :  70%|███████   | 205/292 [00:01<00:00, 1533.22it/s]
ram used: 11.60 GB, layers.26.attention.wq.weight                     :  70%|███████   | 205/292 [00:01<00:00, 1533.22it/s]
ram used: 11.64 GB, layers.26.attention.wk.weight                     :  70%|███████   | 205/292 [00:01<00:00, 1533.22it/s]
ram used: 11.65 GB, layers.26.attention.wv.weight                     :  70%|███████   | 205/292 [00:01<00:00, 1533.22it/s]
ram used: 11.65 GB, layers.26.attention.wo.weight                     :  70%|███████   | 205/292 [00:01<00:00, 1533.22it/s]
ram used: 11.69 GB, layers.26.feed_forward.w1.weight                  :  70%|███████   | 205/292 [00:01<00:00, 1533.22it/s]
ram used: 11.81 GB, layers.26.feed_forward.w2.weight                  :  70%|███████   | 205/292 [00:01<00:00, 1533.22it/s]
ram used: 11.92 GB, layers.26.feed_forward.w3.weight                  :  70%|███████   | 205/292 [00:01<00:00, 1533.22it/s]
ram used: 12.04 GB, layers.26.attention_norm.weight                   :  70%|███████   | 205/292 [00:02<00:00, 1533.22it/s]
ram used: 12.04 GB, layers.26.ffn_norm.weight                         :  70%|███████   | 205/292 [00:02<00:00, 1533.22it/s]
ram used: 12.04 GB, layers.27.attention.wq.weight                     :  70%|███████   | 205/292 [00:02<00:00, 1533.22it/s]
ram used: 12.07 GB, layers.27.attention.wk.weight                     :  70%|███████   | 205/292 [00:02<00:00, 1533.22it/s]
ram used: 12.08 GB, layers.27.attention.wv.weight                     :  70%|███████   | 205/292 [00:02<00:00, 1533.22it/s]
ram used: 12.09 GB, layers.27.attention.wo.weight                     :  70%|███████   | 205/292 [00:02<00:00, 1533.22it/s]
ram used: 12.12 GB, layers.27.feed_forward.w1.weight                  :  70%|███████   | 205/292 [00:02<00:00, 1533.22it/s]
ram used: 12.24 GB, layers.27.feed_forward.w2.weight                  :  70%|███████   | 205/292 [00:03<00:00, 1533.22it/s]
ram used: 12.36 GB, layers.27.feed_forward.w3.weight                  :  70%|███████   | 205/292 [00:10<00:00, 1533.22it/s]
ram used: 12.48 GB, layers.27.attention_norm.weight                   :  70%|███████   | 205/292 [00:11<00:00, 1533.22it/s]
ram used: 12.48 GB, layers.27.ffn_norm.weight                         :  70%|███████   | 205/292 [00:13<00:00, 1533.22it/s]
ram used: 12.48 GB, layers.28.attention.wq.weight                     :  70%|███████   | 205/292 [00:13<00:00, 1533.22it/s]
ram used: 12.51 GB, layers.28.attention.wk.weight                     :  70%|███████   | 205/292 [00:13<00:00, 1533.22it/s]
ram used: 12.52 GB, layers.28.attention.wv.weight                     :  70%|███████   | 205/292 [00:13<00:00, 1533.22it/s]
ram used: 12.53 GB, layers.28.attention.wo.weight                     :  70%|███████   | 205/292 [00:13<00:00, 1533.22it/s]
ram used: 12.56 GB, layers.28.feed_forward.w1.weight                  :  70%|███████   | 205/292 [00:13<00:00, 1533.22it/s]
ram used: 12.68 GB, layers.28.feed_forward.w2.weight                  :  70%|███████   | 205/292 [00:13<00:00, 1533.22it/s]
ram used: 12.80 GB, layers.28.feed_forward.w3.weight                  :  70%|███████   | 205/292 [00:13<00:00, 1533.22it/s]
ram used: 12.91 GB, layers.28.attention_norm.weight                   :  70%|███████   | 205/292 [00:13<00:00, 1533.22it/s]
ram used: 12.91 GB, layers.28.ffn_norm.weight                         :  70%|███████   | 205/292 [00:13<00:00, 1533.22it/s]
ram used: 12.91 GB, layers.29.attention.wq.weight                     :  70%|███████   | 205/292 [00:13<00:00, 1533.22it/s]
ram used: 12.95 GB, layers.29.attention.wk.weight                     :  70%|███████   | 205/292 [00:13<00:00, 1533.22it/s]
ram used: 12.95 GB, layers.29.attention.wv.weight                     :  70%|███████   | 205/292 [00:13<00:00, 1533.22it/s]
ram used: 12.96 GB, layers.29.attention.wo.weight                     :  70%|███████   | 205/292 [00:13<00:00, 1533.22it/s]
ram used: 13.00 GB, layers.29.feed_forward.w1.weight                  :  70%|███████   | 205/292 [00:13<00:00, 1533.22it/s]
ram used: 13.11 GB, layers.29.feed_forward.w2.weight                  :  70%|███████   | 205/292 [00:13<00:00, 1533.22it/s]
ram used: 13.23 GB, layers.29.feed_forward.w3.weight                  :  70%|███████   | 205/292 [00:13<00:00, 1533.22it/s]
ram used: 13.35 GB, layers.29.attention_norm.weight                   :  70%|███████   | 205/292 [00:13<00:00, 1533.22it/s]
ram used: 13.35 GB, layers.29.ffn_norm.weight                         :  70%|███████   | 205/292 [00:13<00:00, 1533.22it/s]
ram used: 13.35 GB, layers.30.attention.wq.weight                     :  70%|███████   | 205/292 [00:13<00:00, 1533.22it/s]
ram used: 13.38 GB, layers.30.attention.wk.weight                     :  70%|███████   | 205/292 [00:14<00:00, 1533.22it/s]
ram used: 13.39 GB, layers.30.attention.wv.weight                     :  70%|███████   | 205/292 [00:14<00:00, 1533.22it/s]
ram used: 13.40 GB, layers.30.attention.wo.weight                     :  70%|███████   | 205/292 [00:14<00:00, 1533.22it/s]
ram used: 13.43 GB, layers.30.feed_forward.w1.weight                  :  70%|███████   | 205/292 [00:14<00:00, 1533.22it/s]
ram used: 13.55 GB, layers.30.feed_forward.w2.weight                  :  70%|███████   | 205/292 [00:14<00:00, 1533.22it/s]
ram used: 13.67 GB, layers.30.feed_forward.w3.weight                  :  70%|███████   | 205/292 [00:14<00:00, 1533.22it/s]
ram used: 13.79 GB, layers.30.attention_norm.weight                   :  70%|███████   | 205/292 [00:14<00:00, 1533.22it/s]
ram used: 13.79 GB, layers.30.ffn_norm.weight                         :  70%|███████   | 205/292 [00:14<00:00, 1533.22it/s]
ram used: 13.79 GB, layers.31.attention.wq.weight                     :  70%|███████   | 205/292 [00:14<00:00, 1533.22it/s]
ram used: 13.82 GB, layers.31.attention.wk.weight                     :  70%|███████   | 205/292 [00:14<00:00, 1533.22it/s]
ram used: 13.83 GB, layers.31.attention.wv.weight                     :  70%|███████   | 205/292 [00:14<00:00, 1533.22it/s]
ram used: 13.84 GB, layers.31.attention.wo.weight                     :  70%|███████   | 205/292 [00:14<00:00, 1533.22it/s]
ram used: 13.87 GB, layers.31.feed_forward.w1.weight                  :  70%|███████   | 205/292 [00:14<00:00, 1533.22it/s]
ram used: 13.99 GB, layers.31.feed_forward.w2.weight                  :  70%|███████   | 205/292 [00:14<00:00, 1533.22it/s]
ram used: 14.10 GB, layers.31.feed_forward.w3.weight                  :  70%|███████   | 205/292 [00:14<00:00, 1533.22it/s]
ram used: 14.22 GB, layers.31.attention_norm.weight                   :  70%|███████   | 205/292 [00:14<00:00, 1533.22it/s]
ram used: 14.22 GB, layers.31.ffn_norm.weight                         :  70%|███████   | 205/292 [00:14<00:00, 1533.22it/s]
ram used: 14.22 GB, norm.weight                                       :  70%|███████   | 205/292 [00:14<00:00, 1533.22it/s]
ram used: 14.22 GB, tok_embeddings.weight                             :  70%|███████   | 205/292 [00:14<00:00, 1533.22it/s]
ram used: 14.22 GB, output.weight                                     :  70%|███████   | 205/292 [00:14<00:00, 1533.22it/s]
ram used: 14.48 GB, freqs_cis                                         :  70%|███████   | 205/292 [00:14<00:00, 1533.22it/s]
ram used: 14.48 GB, freqs_cis                                         : 100%|██████████| 292/292 [00:14<00:00, 20.23it/s]  
loaded weights in 14431.85 ms, 4.54 GB loaded at 0.31 GB/s
weights -> model: 18909.50 ms
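The loader lines above report 4.54 GB in 14431.85 ms at 0.31 GB/s; the rate is just bytes over wall time. A quick sanity check of that arithmetic:

```python
# Reproduce the reported load rate from the two numbers the loader prints.
gb, ms = 4.54, 14431.85
rate = gb / (ms / 1000.0)
print(f"{rate:.2f} GB/s")  # 0.31 GB/s, matching the log line above
```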
<|im_start|> system
You are Quentin. Quentin is a useful assistant who writes Python code to answer questions. He keeps the code as short as possible and doesn't read from user input<|im_end|> 
Q: 

compile_efficientnet.py

Traceback (most recent call last):
  File "/home/jebba/devel/tinygrad/tinygrad/examples/compile_efficientnet.py", line 13, in <module>
    prg, inp_sizes, out_sizes, state = export_model(model, mode, Tensor.randn(1,3,224,224))
                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jebba/devel/tinygrad/tinygrad/extra/export_model.py", line 313, in export_model
    assert Device.DEFAULT in EXPORT_SUPPORTED_DEVICE, "only WEBGPU, WEBGL, CLANG, CUDA, GPU, METAL are supported"
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: only WEBGPU, WEBGL, CLANG, CUDA, GPU, METAL are supported
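The assertion fires because these runs use the HIP backend, which `export_model` does not accept. tinygrad picks its backend from environment variables (e.g. `CLANG=1 python examples/compile_efficientnet.py`), so forcing a supported one before launching should get past the guard. A minimal sketch of that selection logic — an assumption about the convention, not tinygrad's actual `Device` code:

```python
# Backends accepted by export_model, taken from the assertion above.
EXPORT_SUPPORTED_DEVICE = {"WEBGPU", "WEBGL", "CLANG", "CUDA", "GPU", "METAL"}

def pick_export_device(env, default="CLANG"):
    # Sketch of env-var backend selection: the first supported backend
    # set to "1" wins; HIP is never exportable, so it falls through.
    for dev in sorted(EXPORT_SUPPORTED_DEVICE):
        if env.get(dev) == "1":
            return dev
    return default

print(pick_export_device({"HIP": "1"}))   # HIP not exportable -> CLANG
print(pick_export_device({"CUDA": "1"}))  # CUDA
```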

compile_tensorflow.py

2024-02-06 13:09:00.382257: I external/local_tsl/tsl/cuda/cudart_stub.cc:31] Could not find cuda drivers on your machine, GPU will not be used.
2024-02-06 13:09:00.420457: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-02-06 13:09:00.420503: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-02-06 13:09:00.421410: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-02-06 13:09:00.426991: I external/local_tsl/tsl/cuda/cudart_stub.cc:31] Could not find cuda drivers on your machine, GPU will not be used.
2024-02-06 13:09:00.427163: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-02-06 13:09:01.185053: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-02-06 13:09:02.593911: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 0
2024-02-06 13:09:02.594059: I tensorflow/core/grappler/clusters/single_machine.cc:361] Starting new session
2024-02-06 13:09:02.691872: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 0
2024-02-06 13:09:02.692025: I tensorflow/core/grappler/clusters/single_machine.cc:361] Starting new session
tinygrad: [0.29635584354400635, 0.5070338845252991, 0.6352834105491638, 0.15874029695987701]
compiled: [0.296356, 0.507034, 0.635283, 0.15874]
keras:    [0.29635587 0.5070339  0.6352834  0.15874033]
#include <string.h>
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#define max(x,y) ((x>y)?x:y)
#define int64 long
#define half __fp16
#define uchar unsigned char
#include <stdbool.h>

float buf_0[64];
float input0[128];
float buf_1[2048];
float buf_2[64];
float buf_3[128];
float buf_4[2048];
float output0[16];
float buf_5[512];
void r_16_32(float* restrict data0, const float* restrict data1, const float* restrict data2, const float* restrict data3) {
  float val0 = data1[0];
  float val1 = data1[1];
  float val2 = data1[2];
  float val3 = data1[3];
  float val4 = data1[4];
  float val5 = data1[5];
  float val6 = data1[6];
  float val7 = data1[7];
  float val8 = data1[8];
  float val9 = data1[9];
  float val10 = data1[10];
  float val11 = data1[11];
  float val12 = data1[12];
  float val13 = data1[13];
  float val14 = data1[14];
  float val15 = data1[15];
  float val16 = data1[16];
  float val17 = data1[17];
  float val18 = data1[18];
  float val19 = data1[19];
  float val20 = data1[20];
  float val21 = data1[21];
  float val22 = data1[22];
  float val23 = data1[23];
  float val24 = data1[24];
  float val25 = data1[25];
  float val26 = data1[26];
  float val27 = data1[27];
  float val28 = data1[28];
  float val29 = data1[29];
  float val30 = data1[30];
  float val31 = data1[31];
  for (int ridx0 = 0; ridx0 < 16; ridx0++) {
    float acc0 = 0.0f;
    float val32 = data2[ridx0];
    float val33 = data2[ridx0+16];
    float val34 = data2[ridx0+32];
    float val35 = data2[ridx0+48];
    float val36 = data2[ridx0+64];
    float val37 = data2[ridx0+80];
    float val38 = data2[ridx0+96];
    float val39 = data2[ridx0+112];
    float val40 = data2[ridx0+128];
    float val41 = data2[ridx0+144];
    float val42 = data2[ridx0+160];
    float val43 = data2[ridx0+176];
    float val44 = data2[ridx0+192];
    float val45 = data2[ridx0+208];
    float val46 = data2[ridx0+224];
    float val47 = data2[ridx0+240];
    float val48 = data2[ridx0+256];
    float val49 = data2[ridx0+272];
    float val50 = data2[ridx0+288];
    float val51 = data2[ridx0+304];
    float val52 = data2[ridx0+320];
    float val53 = data2[ridx0+336];
    float val54 = data2[ridx0+352];
    float val55 = data2[ridx0+368];
    float val56 = data2[ridx0+384];
    float val57 = data2[ridx0+400];
    float val58 = data2[ridx0+416];
    float val59 = data2[ridx0+432];
    float val60 = data2[ridx0+448];
    float val61 = data2[ridx0+464];
    float val62 = data2[ridx0+480];
    float val63 = data2[ridx0+496];
    float val64 = data3[ridx0];
    float alu0 = max(((val31*val63)+((val30*val62)+((val29*val61)+((val28*val60)+((val27*val59)+((val26*val58)+((val25*val57)+((val24*val56)+((val23*val55)+((val22*val54)+((val21*val53)+((val20*val52)+((val19*val51)+((val18*val50)+((val17*val49)+((val16*val48)+((val15*val47)+((val14*val46)+((val13*val45)+((val12*val44)+((val11*val43)+((val10*val42)+((val9*val41)+((val8*val40)+((val7*val39)+((val6*val38)+((val5*val37)+((val4*val36)+((val3*val35)+((val2*val34)+((val1*val33)+((val0*val32)+acc0)))))))))))))))))))))))))))))))),0.0f);
    data0[ridx0] = (alu0*val64);
  }
}
void r_32_16(float* restrict data0, const float* restrict data1, const float* restrict data2) {
  float val0 = data1[0];
  float val1 = data1[1];
  float val2 = data1[2];
  float val3 = data1[3];
  float val4 = data1[4];
  float val5 = data1[5];
  float val6 = data1[6];
  float val7 = data1[7];
  float val8 = data1[8];
  float val9 = data1[9];
  float val10 = data1[10];
  float val11 = data1[11];
  float val12 = data1[12];
  float val13 = data1[13];
  float val14 = data1[14];
  float val15 = data1[15];
  for (int ridx0 = 0; ridx0 < 32; ridx0++) {
    float acc0 = 0.0f;
    float val16 = data2[ridx0];
    float val17 = data2[ridx0+32];
    float val18 = data2[ridx0+64];
    float val19 = data2[ridx0+96];
    float val20 = data2[ridx0+128];
    float val21 = data2[ridx0+160];
    float val22 = data2[ridx0+192];
    float val23 = data2[ridx0+224];
    float val24 = data2[ridx0+256];
    float val25 = data2[ridx0+288];
    float val26 = data2[ridx0+320];
    float val27 = data2[ridx0+352];
    float val28 = data2[ridx0+384];
    float val29 = data2[ridx0+416];
    float val30 = data2[ridx0+448];
    float val31 = data2[ridx0+480];
    float alu0 = max(((val15*val31)+((val14*val30)+((val13*val29)+((val12*val28)+((val11*val27)+((val10*val26)+((val9*val25)+((val8*val24)+((val7*val23)+((val6*val22)+((val5*val21)+((val4*val20)+((val3*val19)+((val2*val18)+((val1*val17)+((val0*val16)+acc0)))))))))))))))),0.0f);
    data0[ridx0] = alu0;
  }
}
void r_4_32(float* restrict data0, const float* restrict data1, const float* restrict data2) {
  float val0 = data1[0];
  float val1 = data1[1];
  float val2 = data1[2];
  float val3 = data1[3];
  float val4 = data1[4];
  float val5 = data1[5];
  float val6 = data1[6];
  float val7 = data1[7];
  float val8 = data1[8];
  float val9 = data1[9];
  float val10 = data1[10];
  float val11 = data1[11];
  float val12 = data1[12];
  float val13 = data1[13];
  float val14 = data1[14];
  float val15 = data1[15];
  float val16 = data1[16];
  float val17 = data1[17];
  float val18 = data1[18];
  float val19 = data1[19];
  float val20 = data1[20];
  float val21 = data1[21];
  float val22 = data1[22];
  float val23 = data1[23];
  float val24 = data1[24];
  float val25 = data1[25];
  float val26 = data1[26];
  float val27 = data1[27];
  float val28 = data1[28];
  float val29 = data1[29];
  float val30 = data1[30];
  float val31 = data1[31];
  for (int ridx0 = 0; ridx0 < 4; ridx0++) {
    float acc0 = 0.0f;
    float val32 = data2[ridx0];
    float val33 = data2[ridx0+4];
    float val34 = data2[ridx0+8];
    float val35 = data2[ridx0+12];
    float val36 = data2[ridx0+16];
    float val37 = data2[ridx0+20];
    float val38 = data2[ridx0+24];
    float val39 = data2[ridx0+28];
    float val40 = data2[ridx0+32];
    float val41 = data2[ridx0+36];
    float val42 = data2[ridx0+40];
    float val43 = data2[ridx0+44];
    float val44 = data2[ridx0+48];
    float val45 = data2[ridx0+52];
    float val46 = data2[ridx0+56];
    float val47 = data2[ridx0+60];
    float val48 = data2[ridx0+64];
    float val49 = data2[ridx0+68];
    float val50 = data2[ridx0+72];
    float val51 = data2[ridx0+76];
    float val52 = data2[ridx0+80];
    float val53 = data2[ridx0+84];
    float val54 = data2[ridx0+88];
    float val55 = data2[ridx0+92];
    float val56 = data2[ridx0+96];
    float val57 = data2[ridx0+100];
    float val58 = data2[ridx0+104];
    float val59 = data2[ridx0+108];
    float val60 = data2[ridx0+112];
    float val61 = data2[ridx0+116];
    float val62 = data2[ridx0+120];
    float val63 = data2[ridx0+124];
    data0[ridx0] = (1.0f/(1.0f+exp2((((val31*val63)+((val30*val62)+((val29*val61)+((val28*val60)+((val27*val59)+((val26*val58)+((val25*val57)+((val24*val56)+((val23*val55)+((val22*val54)+((val21*val53)+((val20*val52)+((val19*val51)+((val18*val50)+((val17*val49)+((val16*val48)+((val15*val47)+((val14*val46)+((val13*val45)+((val12*val44)+((val11*val43)+((val10*val42)+((val9*val41)+((val8*val40)+((val7*val39)+((val6*val38)+((val5*val37)+((val4*val36)+((val3*val35)+((val2*val34)+((val1*val33)+((val0*val32)+acc0))))))))))))))))))))))))))))))))*(-1.4426950408889634f)))));
  }
}
void net(float* input0, float* output0) {
r_16_32(buf_0, input0, buf_1, buf_2);
r_32_16(buf_3, buf_0, buf_4);
r_4_32(output0, buf_3, buf_5);
}
void initialize(float *weights) {
memcpy(buf_1, weights + 0, 8192);
memcpy(buf_2, weights + 512, 256);
memcpy(buf_4, weights + 528, 8192);
memcpy(buf_5, weights + 1040, 2048);
}
int main(int argc, char *argv[]) {
    // read in the weights from disk
    FILE *f = fopen("/tmp/tf_weights", "rb");
    float *weights = (float *)malloc(4672);
    fread(weights, 1, 4672, f);
    fclose(f);

    // init the net
    initialize(weights);

    // test run
    float input[32];
    float outputs[4];
    for (int i = 0; i < 32; i++) scanf("%f", &input[i]);
    net(input, outputs);
    printf("%f %f %f %f\n", outputs[0], outputs[1], outputs[2], outputs[3]);
  }
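The three result vectors printed before the generated C (tinygrad, compiled, keras) agree to within float precision; the compiled path prints with `%f` (6 decimal places), so it can only match to about 1e-6. A pure-Python check of that claim — the tolerances are my choice, not the script's:

```python
tinygrad_out = [0.29635584354400635, 0.5070338845252991, 0.6352834105491638, 0.15874029695987701]
compiled_out = [0.296356, 0.507034, 0.635283, 0.15874]
keras_out    = [0.29635587, 0.5070339, 0.6352834, 0.15874033]

# compiled is rounded to 6 decimals by printf; keras and tinygrad both
# run in float32, so they agree to float32 epsilon.
assert all(abs(a - b) < 1e-5 for a, b in zip(tinygrad_out, compiled_out))
assert all(abs(a - b) < 1e-6 for a, b in zip(tinygrad_out, keras_out))
print("all three outputs agree")
```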

conversation.py

[nltk_data] Downloading package punkt to /home/jebba/nltk_data...
[nltk_data]   Package punkt is already up-to-date!

  0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.00 GB, encoder.conv1.weight                              :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.00 GB, encoder.conv1.bias                                :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.00 GB, encoder.conv2.weight                              :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.00 GB, encoder.conv2.bias                                :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.00 GB, encoder.blocks.0.attn.query.weight                :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.00 GB, encoder.blocks.0.attn.query.bias                  :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.00 GB, encoder.blocks.0.attn.key.weight                  :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.00 GB, encoder.blocks.0.attn.value.weight                :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.00 GB, encoder.blocks.0.attn.value.bias                  :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.00 GB, encoder.blocks.0.attn.out.weight                  :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.00 GB, encoder.blocks.0.attn.out.bias                    :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.00 GB, encoder.blocks.0.attn_ln.weight                   :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.00 GB, encoder.blocks.0.attn_ln.bias                     :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.00 GB, encoder.blocks.0.mlp.0.weight                     :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.01 GB, encoder.blocks.0.mlp.0.bias                       :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.01 GB, encoder.blocks.0.mlp.2.weight                     :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.01 GB, encoder.blocks.0.mlp.2.bias                       :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.01 GB, encoder.blocks.0.mlp_ln.weight                    :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.01 GB, encoder.blocks.0.mlp_ln.bias                      :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.01 GB, encoder.blocks.1.attn.query.weight                :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.01 GB, encoder.blocks.1.attn.query.bias                  :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.01 GB, encoder.blocks.1.attn.key.weight                  :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.01 GB, encoder.blocks.1.attn.value.weight                :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.01 GB, encoder.blocks.1.attn.value.bias                  :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.01 GB, encoder.blocks.1.attn.out.weight                  :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.01 GB, encoder.blocks.1.attn.out.bias                    :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.01 GB, encoder.blocks.1.attn_ln.weight                   :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.01 GB, encoder.blocks.1.attn_ln.bias                     :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.01 GB, encoder.blocks.1.mlp.0.weight                     :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.01 GB, encoder.blocks.1.mlp.0.bias                       :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.01 GB, encoder.blocks.1.mlp.2.weight                     :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.1.mlp.2.bias                       :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.1.mlp_ln.weight                    :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.1.mlp_ln.bias                      :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.2.attn.query.weight                :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.2.attn.query.bias                  :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.2.attn.key.weight                  :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.2.attn.value.weight                :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.2.attn.value.bias                  :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.2.attn.out.weight                  :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.2.attn.out.bias                    :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.2.attn_ln.weight                   :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.2.attn_ln.bias                     :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.2.mlp.0.weight                     :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.2.mlp.0.bias                       :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.2.mlp.2.weight                     :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.2.mlp.2.bias                       :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.2.mlp_ln.weight                    :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.2.mlp_ln.bias                      :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.3.attn.query.weight                :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.3.attn.query.bias                  :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.3.attn.key.weight                  :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.3.attn.value.weight                :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.3.attn.value.bias                  :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.02 GB, encoder.blocks.3.attn.out.weight                  :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.03 GB, encoder.blocks.3.attn.out.bias                    :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.03 GB, encoder.blocks.3.attn_ln.weight                   :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.03 GB, encoder.blocks.3.attn_ln.bias                     :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.03 GB, encoder.blocks.3.mlp.0.weight                     :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.03 GB, encoder.blocks.3.mlp.0.bias                       :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.03 GB, encoder.blocks.3.mlp.2.weight                     :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.03 GB, encoder.blocks.3.mlp.2.bias                       :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.03 GB, encoder.blocks.3.mlp_ln.weight                    :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.03 GB, encoder.blocks.3.mlp_ln.bias                      :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.03 GB, encoder.ln_post.weight                            :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.03 GB, encoder.ln_post.bias                              :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.03 GB, encoder.positional_embedding                      :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.03 GB, decoder.token_embedding.weight                    :   0%|          | 0/168 [00:00<?, ?it/s]
ram used:  0.03 GB, decoder.token_embedding.weight                    :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.07 GB, decoder.positional_embedding                      :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.07 GB, decoder.blocks.0.attn.query.weight                :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.07 GB, decoder.blocks.0.attn.query.bias                  :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.07 GB, decoder.blocks.0.attn.key.weight                  :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.07 GB, decoder.blocks.0.attn.value.weight                :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.07 GB, decoder.blocks.0.attn.value.bias                  :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.07 GB, decoder.blocks.0.attn.out.weight                  :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.07 GB, decoder.blocks.0.attn.out.bias                    :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.07 GB, decoder.blocks.0.attn_ln.weight                   :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.07 GB, decoder.blocks.0.attn_ln.bias                     :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.07 GB, decoder.blocks.0.cross_attn.query.weight          :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.07 GB, decoder.blocks.0.cross_attn.query.bias            :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.07 GB, decoder.blocks.0.cross_attn.key.weight            :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.08 GB, decoder.blocks.0.cross_attn.value.weight          :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.08 GB, decoder.blocks.0.cross_attn.value.bias            :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.08 GB, decoder.blocks.0.cross_attn.out.weight            :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.08 GB, decoder.blocks.0.cross_attn.out.bias              :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.08 GB, decoder.blocks.0.cross_attn_ln.weight             :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.08 GB, decoder.blocks.0.cross_attn_ln.bias               :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.08 GB, decoder.blocks.0.mlp.0.weight                     :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.08 GB, decoder.blocks.0.mlp.0.bias                       :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.08 GB, decoder.blocks.0.mlp.2.weight                     :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.08 GB, decoder.blocks.0.mlp.2.bias                       :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.08 GB, decoder.blocks.0.mlp_ln.weight                    :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.08 GB, decoder.blocks.0.mlp_ln.bias                      :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.08 GB, decoder.blocks.1.attn.query.weight                :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.08 GB, decoder.blocks.1.attn.query.bias                  :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.08 GB, decoder.blocks.1.attn.key.weight                  :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.08 GB, decoder.blocks.1.attn.value.weight                :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.08 GB, decoder.blocks.1.attn.value.bias                  :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.08 GB, decoder.blocks.1.attn.out.weight                  :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.08 GB, decoder.blocks.1.attn.out.bias                    :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.08 GB, decoder.blocks.1.attn_ln.weight                   :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.08 GB, decoder.blocks.1.attn_ln.bias                     :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.08 GB, decoder.blocks.1.cross_attn.query.weight          :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.08 GB, decoder.blocks.1.cross_attn.query.bias            :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.08 GB, decoder.blocks.1.cross_attn.key.weight            :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.08 GB, decoder.blocks.1.cross_attn.value.weight          :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.09 GB, decoder.blocks.1.cross_attn.value.bias            :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.09 GB, decoder.blocks.1.cross_attn.out.weight            :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.09 GB, decoder.blocks.1.cross_attn.out.bias              :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.09 GB, decoder.blocks.1.cross_attn_ln.weight             :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.09 GB, decoder.blocks.1.cross_attn_ln.bias               :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.09 GB, decoder.blocks.1.mlp.0.weight                     :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.09 GB, decoder.blocks.1.mlp.0.bias                       :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.09 GB, decoder.blocks.1.mlp.2.weight                     :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.09 GB, decoder.blocks.1.mlp.2.bias                       :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.09 GB, decoder.blocks.1.mlp_ln.weight                    :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.09 GB, decoder.blocks.1.mlp_ln.bias                      :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.09 GB, decoder.blocks.2.attn.query.weight                :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.09 GB, decoder.blocks.2.attn.query.bias                  :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.09 GB, decoder.blocks.2.attn.key.weight                  :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.09 GB, decoder.blocks.2.attn.value.weight                :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.09 GB, decoder.blocks.2.attn.value.bias                  :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.09 GB, decoder.blocks.2.attn.out.weight                  :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.09 GB, decoder.blocks.2.attn.out.bias                    :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.09 GB, decoder.blocks.2.attn_ln.weight                   :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.09 GB, decoder.blocks.2.attn_ln.bias                     :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.09 GB, decoder.blocks.2.cross_attn.query.weight          :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.09 GB, decoder.blocks.2.cross_attn.query.bias            :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.09 GB, decoder.blocks.2.cross_attn.key.weight            :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.09 GB, decoder.blocks.2.cross_attn.value.weight          :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.09 GB, decoder.blocks.2.cross_attn.value.bias            :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.09 GB, decoder.blocks.2.cross_attn.out.weight            :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.2.cross_attn.out.bias              :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.2.cross_attn_ln.weight             :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.2.cross_attn_ln.bias               :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.2.mlp.0.weight                     :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.2.mlp.0.bias                       :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.2.mlp.2.weight                     :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.2.mlp.2.bias                       :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.2.mlp_ln.weight                    :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.2.mlp_ln.bias                      :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.3.attn.query.weight                :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.3.attn.query.bias                  :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.3.attn.key.weight                  :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.3.attn.value.weight                :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.3.attn.value.bias                  :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.3.attn.out.weight                  :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.3.attn.out.bias                    :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.3.attn_ln.weight                   :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.3.attn_ln.bias                     :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.3.cross_attn.query.weight          :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.3.cross_attn.query.bias            :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.3.cross_attn.key.weight            :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.3.cross_attn.value.weight          :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.3.cross_attn.value.bias            :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.3.cross_attn.out.weight            :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.3.cross_attn.out.bias              :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.3.cross_attn_ln.weight             :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.3.cross_attn_ln.bias               :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.10 GB, decoder.blocks.3.mlp.0.weight                     :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.11 GB, decoder.blocks.3.mlp.0.bias                       :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.11 GB, decoder.blocks.3.mlp.2.weight                     :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.11 GB, decoder.blocks.3.mlp.2.bias                       :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.11 GB, decoder.blocks.3.mlp_ln.weight                    :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.11 GB, decoder.blocks.3.mlp_ln.bias                      :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.11 GB, decoder.ln.weight                                 :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.11 GB, decoder.ln.bias                                   :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.11 GB, decoder.mask                                      :  40%|████      | 68/168 [00:00<00:00, 609.31it/s]
ram used:  0.11 GB, decoder.mask                                      : 100%|██████████| 168/168 [00:00<00:00, 976.08it/s]
loaded weights in 173.76 ms, 0.11 GB loaded at 0.62 GB/s
Traceback (most recent call last):
  File "/home/jebba/devel/tinygrad/tinygrad/examples/conversation.py", line 261, in <module>
    synth, emotion_embedding, text_mapper, hps, model_has_multiple_speakers = init_vits(args.vits_model_to_use, args.vits_emotion_path, args.vits_speaker_id, args.vits_seed)
                                                                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jebba/devel/tinygrad/tinygrad/examples/conversation.py", line 166, in init_vits
    net_g = load_model(text_mapper.symbols, hps, model_config)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jebba/devel/tinygrad/tinygrad/examples/vits.py", line 535, in load_model
    _ = load_checkpoint(fetch(model[1]), net_g, None)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jebba/devel/tinygrad/tinygrad/examples/vits.py", line 540, in load_checkpoint
    checkpoint_dict = torch_load(checkpoint_path)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jebba/devel/tinygrad/tinygrad/tinygrad/nn/state.py", line 145, in torch_load
    _, _, _, rwd, _, ids, base_offset = pkl.load(), pkl.load(), pkl.load(), f.tell(), pkl.load(), pkl.load(), f.tell()
                                        ^^^^^^^^^^
_pickle.UnpicklingError: invalid load key, '<'.

efficientnet.py

281 8.961816 tabby, tabby cat
did inference in 5905.02 ms

f16_w_uint32.py

[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0.]
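The script name suggests float16 values packed into uint32 words, and an all-zero buffer decodes to the 100 zeros printed above. A minimal NumPy sketch of that reinterpretation, assuming two float16 values per uint32 word (illustrative, not the script's actual code):

```python
import numpy as np

# 50 uint32 words hold 100 packed float16 values (2 per 4-byte word).
packed = np.zeros(50, dtype=np.uint32)
halves = packed.view(np.float16)  # reinterpret the raw bytes, no copy
assert halves.shape == (100,)     # matches the 100 zeros printed above
```

`ndarray.view` only relabels the underlying bytes, so a zero uint32 buffer is guaranteed to read back as float16 zeros.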

gpt2.py

using HIP backend
using gpt2-medium

  0%|          | 0/293 [00:00<?, ?it/s]
ram used:  0.00 GB, wte.weight                                        :   0%|          | 0/293 [00:00<?, ?it/s]
ram used:  0.00 GB, wte.weight                                        :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.21 GB, wpe.weight                                        :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.21 GB, h.0.attn.c_attn.weight                            :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.22 GB, h.0.attn.c_attn.bias                              :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.22 GB, h.0.attn.c_proj.weight                            :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.23 GB, h.0.attn.c_proj.bias                              :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.23 GB, h.0.mlp.c_fc.weight                               :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.24 GB, h.0.mlp.c_fc.bias                                 :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.24 GB, h.0.mlp.c_proj.weight                             :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.26 GB, h.0.mlp.c_proj.bias                               :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.26 GB, h.0.ln_1.weight                                   :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.26 GB, h.0.ln_1.bias                                     :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.26 GB, h.0.ln_2.weight                                   :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.26 GB, h.0.ln_2.bias                                     :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.26 GB, h.1.attn.c_attn.weight                            :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.27 GB, h.1.attn.c_attn.bias                              :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.27 GB, h.1.attn.c_proj.weight                            :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.28 GB, h.1.attn.c_proj.bias                              :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.28 GB, h.1.mlp.c_fc.weight                               :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.29 GB, h.1.mlp.c_fc.bias                                 :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.29 GB, h.1.mlp.c_proj.weight                             :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.31 GB, h.1.mlp.c_proj.bias                               :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.31 GB, h.1.ln_1.weight                                   :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.31 GB, h.1.ln_1.bias                                     :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.31 GB, h.1.ln_2.weight                                   :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.31 GB, h.1.ln_2.bias                                     :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.31 GB, h.2.attn.c_attn.weight                            :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.32 GB, h.2.attn.c_attn.bias                              :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.32 GB, h.2.attn.c_proj.weight                            :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.33 GB, h.2.attn.c_proj.bias                              :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.33 GB, h.2.mlp.c_fc.weight                               :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.34 GB, h.2.mlp.c_fc.bias                                 :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.34 GB, h.2.mlp.c_proj.weight                             :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.36 GB, h.2.mlp.c_proj.bias                               :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.36 GB, h.2.ln_1.weight                                   :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.36 GB, h.2.ln_1.bias                                     :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.36 GB, h.2.ln_2.weight                                   :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.36 GB, h.2.ln_2.bias                                     :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.36 GB, h.3.attn.c_attn.weight                            :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.37 GB, h.3.attn.c_attn.bias                              :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.37 GB, h.3.attn.c_proj.weight                            :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.38 GB, h.3.attn.c_proj.bias                              :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.38 GB, h.3.mlp.c_fc.weight                               :   0%|          | 1/293 [00:00<01:37,  3.01it/s]
ram used:  0.38 GB, h.3.mlp.c_fc.weight                               :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.39 GB, h.3.mlp.c_fc.bias                                 :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.39 GB, h.3.mlp.c_proj.weight                             :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.41 GB, h.3.mlp.c_proj.bias                               :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.41 GB, h.3.ln_1.weight                                   :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.41 GB, h.3.ln_1.bias                                     :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.41 GB, h.3.ln_2.weight                                   :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.41 GB, h.3.ln_2.bias                                     :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.41 GB, h.4.attn.c_attn.weight                            :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.42 GB, h.4.attn.c_attn.bias                              :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.42 GB, h.4.attn.c_proj.weight                            :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.43 GB, h.4.attn.c_proj.bias                              :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.43 GB, h.4.mlp.c_fc.weight                               :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.45 GB, h.4.mlp.c_fc.bias                                 :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.45 GB, h.4.mlp.c_proj.weight                             :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.46 GB, h.4.mlp.c_proj.bias                               :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.46 GB, h.4.ln_1.weight                                   :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.46 GB, h.4.ln_1.bias                                     :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.46 GB, h.4.ln_2.weight                                   :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.46 GB, h.4.ln_2.bias                                     :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.46 GB, h.5.attn.c_attn.weight                            :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.47 GB, h.5.attn.c_attn.bias                              :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.47 GB, h.5.attn.c_proj.weight                            :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.48 GB, h.5.attn.c_proj.bias                              :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.48 GB, h.5.mlp.c_fc.weight                               :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.50 GB, h.5.mlp.c_fc.bias                                 :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.50 GB, h.5.mlp.c_proj.weight                             :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.51 GB, h.5.mlp.c_proj.bias                               :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.51 GB, h.5.ln_1.weight                                   :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.51 GB, h.5.ln_1.bias                                     :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.51 GB, h.5.ln_2.weight                                   :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.51 GB, h.5.ln_2.bias                                     :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.51 GB, h.6.attn.c_attn.weight                            :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.52 GB, h.6.attn.c_attn.bias                              :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.52 GB, h.6.attn.c_proj.weight                            :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.53 GB, h.6.attn.c_proj.bias                              :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.53 GB, h.6.mlp.c_fc.weight                               :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.55 GB, h.6.mlp.c_fc.bias                                 :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.55 GB, h.6.mlp.c_proj.weight                             :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.56 GB, h.6.mlp.c_proj.bias                               :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.56 GB, h.6.ln_1.weight                                   :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.56 GB, h.6.ln_1.bias                                     :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.56 GB, h.6.ln_2.weight                                   :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.56 GB, h.6.ln_2.bias                                     :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.56 GB, h.7.attn.c_attn.weight                            :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.58 GB, h.7.attn.c_attn.bias                              :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.58 GB, h.7.attn.c_proj.weight                            :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.58 GB, h.7.attn.c_proj.bias                              :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.58 GB, h.7.mlp.c_fc.weight                               :  15%|█▍        | 43/293 [00:00<00:01, 127.03it/s]
ram used:  0.58 GB, h.7.mlp.c_fc.weight                               :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.60 GB, h.7.mlp.c_fc.bias                                 :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.60 GB, h.7.mlp.c_proj.weight                             :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.61 GB, h.7.mlp.c_proj.bias                               :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.61 GB, h.7.ln_1.weight                                   :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.61 GB, h.7.ln_1.bias                                     :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.61 GB, h.7.ln_2.weight                                   :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.61 GB, h.7.ln_2.bias                                     :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.61 GB, h.8.attn.c_attn.weight                            :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.63 GB, h.8.attn.c_attn.bias                              :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.63 GB, h.8.attn.c_proj.weight                            :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.63 GB, h.8.attn.c_proj.bias                              :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.63 GB, h.8.mlp.c_fc.weight                               :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.65 GB, h.8.mlp.c_fc.bias                                 :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.65 GB, h.8.mlp.c_proj.weight                             :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.66 GB, h.8.mlp.c_proj.bias                               :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.66 GB, h.8.ln_1.weight                                   :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.66 GB, h.8.ln_1.bias                                     :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.66 GB, h.8.ln_2.weight                                   :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.66 GB, h.8.ln_2.bias                                     :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.66 GB, h.9.attn.c_attn.weight                            :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.68 GB, h.9.attn.c_attn.bias                              :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.68 GB, h.9.attn.c_proj.weight                            :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.68 GB, h.9.attn.c_proj.bias                              :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.68 GB, h.9.mlp.c_fc.weight                               :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.70 GB, h.9.mlp.c_fc.bias                                 :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.70 GB, h.9.mlp.c_proj.weight                             :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.71 GB, h.9.mlp.c_proj.bias                               :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.71 GB, h.9.ln_1.weight                                   :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.71 GB, h.9.ln_1.bias                                     :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.71 GB, h.9.ln_2.weight                                   :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.71 GB, h.9.ln_2.bias                                     :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.71 GB, h.10.attn.c_attn.weight                           :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.73 GB, h.10.attn.c_attn.bias                             :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.73 GB, h.10.attn.c_proj.weight                           :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.73 GB, h.10.attn.c_proj.bias                             :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.73 GB, h.10.mlp.c_fc.weight                              :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.75 GB, h.10.mlp.c_fc.bias                                :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.75 GB, h.10.mlp.c_proj.weight                            :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.76 GB, h.10.mlp.c_proj.bias                              :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.76 GB, h.10.ln_1.weight                                  :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.76 GB, h.10.ln_1.bias                                    :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.76 GB, h.10.ln_2.weight                                  :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.76 GB, h.10.ln_2.bias                                    :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.76 GB, h.11.attn.c_attn.weight                           :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.78 GB, h.11.attn.c_attn.bias                             :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.78 GB, h.11.attn.c_proj.weight                           :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.78 GB, h.11.attn.c_proj.bias                             :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.78 GB, h.11.mlp.c_fc.weight                              :  31%|███       | 91/293 [00:00<00:00, 229.99it/s]
ram used:  0.78 GB, h.11.mlp.c_fc.weight                              :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.80 GB, h.11.mlp.c_fc.bias                                :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.80 GB, h.11.mlp.c_proj.weight                            :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.81 GB, h.11.mlp.c_proj.bias                              :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.81 GB, h.11.ln_1.weight                                  :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.81 GB, h.11.ln_1.bias                                    :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.81 GB, h.11.ln_2.weight                                  :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.81 GB, h.11.ln_2.bias                                    :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.81 GB, h.12.attn.c_attn.weight                           :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.83 GB, h.12.attn.c_attn.bias                             :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.83 GB, h.12.attn.c_proj.weight                           :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.83 GB, h.12.attn.c_proj.bias                             :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.83 GB, h.12.mlp.c_fc.weight                              :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.85 GB, h.12.mlp.c_fc.bias                                :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.85 GB, h.12.mlp.c_proj.weight                            :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.87 GB, h.12.mlp.c_proj.bias                              :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.87 GB, h.12.ln_1.weight                                  :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.87 GB, h.12.ln_1.bias                                    :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.87 GB, h.12.ln_2.weight                                  :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.87 GB, h.12.ln_2.bias                                    :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.87 GB, h.13.attn.c_attn.weight                           :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.88 GB, h.13.attn.c_attn.bias                             :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.88 GB, h.13.attn.c_proj.weight                           :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.88 GB, h.13.attn.c_proj.bias                             :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.88 GB, h.13.mlp.c_fc.weight                              :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.90 GB, h.13.mlp.c_fc.bias                                :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.90 GB, h.13.mlp.c_proj.weight                            :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.92 GB, h.13.mlp.c_proj.bias                              :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.92 GB, h.13.ln_1.weight                                  :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.92 GB, h.13.ln_1.bias                                    :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.92 GB, h.13.ln_2.weight                                  :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.92 GB, h.13.ln_2.bias                                    :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.92 GB, h.14.attn.c_attn.weight                           :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.93 GB, h.14.attn.c_attn.bias                             :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.93 GB, h.14.attn.c_proj.weight                           :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.93 GB, h.14.attn.c_proj.bias                             :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.93 GB, h.14.mlp.c_fc.weight                              :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.95 GB, h.14.mlp.c_fc.bias                                :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.95 GB, h.14.mlp.c_proj.weight                            :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.97 GB, h.14.mlp.c_proj.bias                              :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.97 GB, h.14.ln_1.weight                                  :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.97 GB, h.14.ln_1.bias                                    :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.97 GB, h.14.ln_2.weight                                  :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.97 GB, h.14.ln_2.bias                                    :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.97 GB, h.15.attn.c_attn.weight                           :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.98 GB, h.15.attn.c_attn.bias                             :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.98 GB, h.15.attn.c_proj.weight                           :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.98 GB, h.15.attn.c_proj.bias                             :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.98 GB, h.15.mlp.c_fc.weight                              :  47%|████▋     | 139/293 [00:00<00:00, 301.56it/s]
ram used:  0.98 GB, h.15.mlp.c_fc.weight                              :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.00 GB, h.15.mlp.c_fc.bias                                :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.00 GB, h.15.mlp.c_proj.weight                            :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.02 GB, h.15.mlp.c_proj.bias                              :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.02 GB, h.15.ln_1.weight                                  :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.02 GB, h.15.ln_1.bias                                    :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.02 GB, h.15.ln_2.weight                                  :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.02 GB, h.15.ln_2.bias                                    :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.02 GB, h.16.attn.c_attn.weight                           :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.03 GB, h.16.attn.c_attn.bias                             :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.03 GB, h.16.attn.c_proj.weight                           :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.03 GB, h.16.attn.c_proj.bias                             :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.03 GB, h.16.mlp.c_fc.weight                              :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.05 GB, h.16.mlp.c_fc.bias                                :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.05 GB, h.16.mlp.c_proj.weight                            :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.07 GB, h.16.mlp.c_proj.bias                              :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.07 GB, h.16.ln_1.weight                                  :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.07 GB, h.16.ln_1.bias                                    :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.07 GB, h.16.ln_2.weight                                  :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.07 GB, h.16.ln_2.bias                                    :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.07 GB, h.17.attn.c_attn.weight                           :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.08 GB, h.17.attn.c_attn.bias                             :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.08 GB, h.17.attn.c_proj.weight                           :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.08 GB, h.17.attn.c_proj.bias                             :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.08 GB, h.17.mlp.c_fc.weight                              :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.10 GB, h.17.mlp.c_fc.bias                                :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.10 GB, h.17.mlp.c_proj.weight                            :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.12 GB, h.17.mlp.c_proj.bias                              :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.12 GB, h.17.ln_1.weight                                  :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.12 GB, h.17.ln_1.bias                                    :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.12 GB, h.17.ln_2.weight                                  :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.12 GB, h.17.ln_2.bias                                    :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.12 GB, h.18.attn.c_attn.weight                           :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.13 GB, h.18.attn.c_attn.bias                             :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.13 GB, h.18.attn.c_proj.weight                           :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.13 GB, h.18.attn.c_proj.bias                             :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.13 GB, h.18.mlp.c_fc.weight                              :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.15 GB, h.18.mlp.c_fc.bias                                :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.15 GB, h.18.mlp.c_proj.weight                            :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.17 GB, h.18.mlp.c_proj.bias                              :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.17 GB, h.18.ln_1.weight                                  :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.17 GB, h.18.ln_1.bias                                    :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.17 GB, h.18.ln_2.weight                                  :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.17 GB, h.18.ln_2.bias                                    :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.17 GB, h.19.attn.c_attn.weight                           :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.18 GB, h.19.attn.c_attn.bias                             :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.18 GB, h.19.attn.c_proj.weight                           :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.18 GB, h.19.attn.c_proj.bias                             :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.18 GB, h.19.mlp.c_fc.weight                              :  64%|██████▍   | 187/293 [00:00<00:00, 351.28it/s]
ram used:  1.18 GB, h.19.mlp.c_fc.weight                              :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.20 GB, h.19.mlp.c_fc.bias                                :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.20 GB, h.19.mlp.c_proj.weight                            :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.22 GB, h.19.mlp.c_proj.bias                              :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.22 GB, h.19.ln_1.weight                                  :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.22 GB, h.19.ln_1.bias                                    :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.22 GB, h.19.ln_2.weight                                  :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.22 GB, h.19.ln_2.bias                                    :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.22 GB, h.20.attn.c_attn.weight                           :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.23 GB, h.20.attn.c_attn.bias                             :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.23 GB, h.20.attn.c_proj.weight                           :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.23 GB, h.20.attn.c_proj.bias                             :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.23 GB, h.20.mlp.c_fc.weight                              :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.25 GB, h.20.mlp.c_fc.bias                                :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.25 GB, h.20.mlp.c_proj.weight                            :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.27 GB, h.20.mlp.c_proj.bias                              :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.27 GB, h.20.ln_1.weight                                  :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.27 GB, h.20.ln_1.bias                                    :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.27 GB, h.20.ln_2.weight                                  :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.27 GB, h.20.ln_2.bias                                    :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.27 GB, h.21.attn.c_attn.weight                           :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.28 GB, h.21.attn.c_attn.bias                             :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.28 GB, h.21.attn.c_proj.weight                           :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.28 GB, h.21.attn.c_proj.bias                             :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.28 GB, h.21.mlp.c_fc.weight                              :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.30 GB, h.21.mlp.c_fc.bias                                :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.30 GB, h.21.mlp.c_proj.weight                            :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.32 GB, h.21.mlp.c_proj.bias                              :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.32 GB, h.21.ln_1.weight                                  :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.32 GB, h.21.ln_1.bias                                    :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.32 GB, h.21.ln_2.weight                                  :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.32 GB, h.21.ln_2.bias                                    :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.32 GB, h.22.attn.c_attn.weight                           :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.33 GB, h.22.attn.c_attn.bias                             :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.33 GB, h.22.attn.c_proj.weight                           :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.34 GB, h.22.attn.c_proj.bias                             :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.34 GB, h.22.mlp.c_fc.weight                              :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.35 GB, h.22.mlp.c_fc.bias                                :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.35 GB, h.22.mlp.c_proj.weight                            :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.37 GB, h.22.mlp.c_proj.bias                              :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.37 GB, h.22.ln_1.weight                                  :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.37 GB, h.22.ln_1.bias                                    :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.37 GB, h.22.ln_2.weight                                  :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.37 GB, h.22.ln_2.bias                                    :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.37 GB, h.23.attn.c_attn.weight                           :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.38 GB, h.23.attn.c_attn.bias                             :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.38 GB, h.23.attn.c_proj.weight                           :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.39 GB, h.23.attn.c_proj.bias                             :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.39 GB, h.23.mlp.c_fc.weight                              :  80%|████████  | 235/293 [00:00<00:00, 387.09it/s]
ram used:  1.39 GB, h.23.mlp.c_fc.weight                              :  97%|█████████▋| 283/293 [00:00<00:00, 412.40it/s]
ram used:  1.40 GB, h.23.mlp.c_fc.bias                                :  97%|█████████▋| 283/293 [00:00<00:00, 412.40it/s]
ram used:  1.40 GB, h.23.mlp.c_proj.weight                            :  97%|█████████▋| 283/293 [00:00<00:00, 412.40it/s]
ram used:  1.42 GB, h.23.mlp.c_proj.bias                              :  97%|█████████▋| 283/293 [00:00<00:00, 412.40it/s]
ram used:  1.42 GB, h.23.ln_1.weight                                  :  97%|█████████▋| 283/293 [00:00<00:00, 412.40it/s]
ram used:  1.42 GB, h.23.ln_1.bias                                    :  97%|█████████▋| 283/293 [00:00<00:00, 412.40it/s]
ram used:  1.42 GB, h.23.ln_2.weight                                  :  97%|█████████▋| 283/293 [00:00<00:00, 412.40it/s]
ram used:  1.42 GB, h.23.ln_2.bias                                    :  97%|█████████▋| 283/293 [00:00<00:00, 412.40it/s]
ram used:  1.42 GB, ln_f.weight                                       :  97%|█████████▋| 283/293 [00:00<00:00, 412.40it/s]
ram used:  1.42 GB, ln_f.bias                                         :  97%|█████████▋| 283/293 [00:00<00:00, 412.40it/s]
ram used:  1.42 GB, lm_head.weight                                    :  97%|█████████▋| 283/293 [00:00<00:00, 412.40it/s]
ram used:  1.42 GB, lm_head.weight                                    : 100%|██████████| 293/293 [00:01<00:00, 284.19it/s]
loaded weights in 1034.68 ms, 1.63 GB loaded at 1.57 GB/s
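The loader's summary line is plain bytes-over-wall-time arithmetic. A minimal sketch, using the 1.63 GB and 1034.68 ms figures from the line above (the printed 1.57 GB/s comes from unrounded internal values, so the recomputed rate lands close but not exactly on it):

```python
# Recompute the "1.57 GB/s" figure from the loader's summary line.
gb_loaded = 1.63       # total weight bytes loaded, from the log
ms_elapsed = 1034.68   # wall-clock load time, from the log

gb_per_s = gb_loaded / (ms_elapsed / 1000.0)
print(f"{gb_per_s:.2f} GB/s")  # within rounding of the printed 1.57 GB/s
```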

  0%|          | 0/100 [00:00<?, ?it/s]
  1%|          | 1/100 [00:00<00:23,  4.14it/s]
  2%|▏         | 2/100 [00:00<00:23,  4.14it/s]
  3%|▎         | 3/100 [00:00<00:19,  5.00it/s]
  4%|▍         | 4/100 [00:00<00:17,  5.54it/s]
  5%|▌         | 5/100 [00:00<00:16,  5.90it/s]
  6%|▌         | 6/100 [00:01<00:15,  6.14it/s]
  7%|▋         | 7/100 [00:01<00:15,  6.06it/s]
  8%|▊         | 8/100 [00:01<00:14,  6.23it/s]
  9%|▉         | 9/100 [00:01<00:14,  6.35it/s]
 10%|█         | 10/100 [00:01<00:13,  6.44it/s]
 11%|█         | 11/100 [00:01<00:13,  6.51it/s]
 12%|█▏        | 12/100 [00:02<00:13,  6.55it/s]
 13%|█▎        | 13/100 [00:02<00:13,  6.58it/s]
 14%|█▍        | 14/100 [00:02<00:13,  6.60it/s]
 15%|█▌        | 15/100 [00:02<00:13,  6.37it/s]
 16%|█▌        | 16/100 [00:02<00:13,  6.45it/s]
 17%|█▋        | 17/100 [00:02<00:12,  6.51it/s]
 18%|█▊        | 18/100 [00:02<00:12,  6.56it/s]
 19%|█▉        | 19/100 [00:03<00:12,  6.59it/s]
 20%|██        | 20/100 [00:03<00:12,  6.60it/s]
 21%|██        | 21/100 [00:03<00:11,  6.62it/s]
 22%|██▏       | 22/100 [00:03<00:11,  6.64it/s]
 23%|██▎       | 23/100 [00:03<00:12,  6.38it/s]
 24%|██▍       | 24/100 [00:03<00:11,  6.46it/s]
 25%|██▌       | 25/100 [00:03<00:11,  6.51it/s]
 26%|██▌       | 26/100 [00:04<00:11,  6.56it/s]
 27%|██▋       | 27/100 [00:04<00:11,  6.59it/s]
 28%|██▊       | 28/100 [00:04<00:10,  6.61it/s]
 29%|██▉       | 29/100 [00:04<00:10,  6.63it/s]
 30%|███       | 30/100 [00:04<00:10,  6.64it/s]
 31%|███       | 31/100 [00:04<00:10,  6.36it/s]
 32%|███▏      | 32/100 [00:05<00:10,  6.43it/s]
 33%|███▎      | 33/100 [00:05<00:10,  6.49it/s]
 34%|███▍      | 34/100 [00:05<00:10,  6.54it/s]
 35%|███▌      | 35/100 [00:05<00:09,  6.58it/s]
 36%|███▌      | 36/100 [00:05<00:09,  6.59it/s]
 37%|███▋      | 37/100 [00:05<00:09,  6.61it/s]
 38%|███▊      | 38/100 [00:05<00:09,  6.62it/s]
 39%|███▉      | 39/100 [00:06<00:09,  6.33it/s]
 40%|████      | 40/100 [00:06<00:09,  6.41it/s]
 41%|████      | 41/100 [00:06<00:09,  6.47it/s]
 42%|████▏     | 42/100 [00:06<00:08,  6.52it/s]
 43%|████▎     | 43/100 [00:06<00:08,  6.55it/s]
 44%|████▍     | 44/100 [00:06<00:08,  6.58it/s]
 45%|████▌     | 45/100 [00:07<00:08,  6.60it/s]
 46%|████▌     | 46/100 [00:07<00:08,  6.61it/s]
 47%|████▋     | 47/100 [00:07<00:08,  6.61it/s]
 48%|████▊     | 48/100 [00:07<00:08,  6.32it/s]
 49%|████▉     | 49/100 [00:07<00:07,  6.39it/s]
 50%|█████     | 50/100 [00:07<00:07,  6.46it/s]
 51%|█████     | 51/100 [00:07<00:07,  6.52it/s]
 52%|█████▏    | 52/100 [00:08<00:07,  6.55it/s]
 53%|█████▎    | 53/100 [00:08<00:07,  6.58it/s]
 54%|█████▍    | 54/100 [00:08<00:06,  6.59it/s]
 55%|█████▌    | 55/100 [00:08<00:06,  6.61it/s]
 56%|█████▌    | 56/100 [00:08<00:06,  6.62it/s]
 57%|█████▋    | 57/100 [00:08<00:06,  6.63it/s]
 58%|█████▊    | 58/100 [00:09<00:06,  6.28it/s]
 59%|█████▉    | 59/100 [00:09<00:06,  6.38it/s]
 60%|██████    | 60/100 [00:09<00:06,  6.45it/s]
 61%|██████    | 61/100 [00:09<00:05,  6.50it/s]
 62%|██████▏   | 62/100 [00:09<00:05,  6.55it/s]
 63%|██████▎   | 63/100 [00:09<00:05,  6.58it/s]
 64%|██████▍   | 64/100 [00:09<00:05,  6.60it/s]
 65%|██████▌   | 65/100 [00:10<00:05,  6.62it/s]
 66%|██████▌   | 66/100 [00:10<00:05,  6.63it/s]
 67%|██████▋   | 67/100 [00:10<00:04,  6.64it/s]
 68%|██████▊   | 68/100 [00:10<00:05,  6.27it/s]
 69%|██████▉   | 69/100 [00:10<00:04,  6.37it/s]
 70%|███████   | 70/100 [00:10<00:04,  6.45it/s]
 71%|███████   | 71/100 [00:11<00:04,  6.50it/s]
 72%|███████▏  | 72/100 [00:11<00:04,  6.54it/s]
 73%|███████▎  | 73/100 [00:11<00:04,  6.58it/s]
 74%|███████▍  | 74/100 [00:11<00:03,  6.59it/s]
 75%|███████▌  | 75/100 [00:11<00:03,  6.60it/s]
 76%|███████▌  | 76/100 [00:11<00:03,  6.61it/s]
 77%|███████▋  | 77/100 [00:11<00:03,  6.62it/s]
 78%|███████▊  | 78/100 [00:12<00:03,  6.24it/s]
 79%|███████▉  | 79/100 [00:12<00:03,  6.34it/s]
 80%|████████  | 80/100 [00:12<00:03,  6.42it/s]
 81%|████████  | 81/100 [00:12<00:02,  6.48it/s]
 82%|████████▏ | 82/100 [00:12<00:02,  6.52it/s]
 83%|████████▎ | 83/100 [00:12<00:02,  6.55it/s]
 84%|████████▍ | 84/100 [00:13<00:02,  6.58it/s]
 85%|████████▌ | 85/100 [00:13<00:02,  6.60it/s]
 86%|████████▌ | 86/100 [00:13<00:02,  6.60it/s]
 87%|████████▋ | 87/100 [00:13<00:01,  6.61it/s]
 88%|████████▊ | 88/100 [00:13<00:01,  6.62it/s]
 89%|████████▉ | 89/100 [00:13<00:01,  6.20it/s]
 90%|█████████ | 90/100 [00:13<00:01,  6.31it/s]
 91%|█████████ | 91/100 [00:14<00:01,  6.40it/s]
 92%|█████████▏| 92/100 [00:14<00:01,  6.47it/s]
 93%|█████████▎| 93/100 [00:14<00:01,  6.51it/s]
 94%|█████████▍| 94/100 [00:14<00:00,  6.55it/s]
 95%|█████████▌| 95/100 [00:14<00:00,  6.58it/s]
 96%|█████████▌| 96/100 [00:14<00:00,  6.60it/s]
 97%|█████████▋| 97/100 [00:15<00:00,  6.61it/s]
 98%|█████████▊| 98/100 [00:15<00:00,  6.62it/s]
 99%|█████████▉| 99/100 [00:15<00:00,  6.63it/s]
100%|██████████| 100/100 [00:15<00:00,  6.19it/s]
100%|██████████| 100/100 [00:15<00:00,  6.44it/s]
Generating text...
What is the answer to life, the universe, and everything? You can't. If you can solve it, you'll find a way. But don't try to do it alone. See what I have done, and what you can do to solve it. The universe has no solution."

Note that Domino is not referring to the existence of God: He is referencing the existence of nine Upside-Down Order-Holes.

P.S. I just re-read the RIT book, pointed out this in the comments,
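Assuming the generation bar above ticks once per generated token (batch size 1), tqdm's final rate line doubles as a tokens-per-second figure:

```python
# 100 bar ticks over the ~15.5 s elapsed time shown as [00:15] on the
# final line recovers tqdm's reported average of 6.44 it/s.
total_tokens = 100
elapsed_s = 15.5            # from the final bar's elapsed time
tok_per_s = total_tokens / elapsed_s
print(f"~{tok_per_s:.1f} tok/s")   # consistent with the 6.44 it/s shown
```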

handcode_resnet50_opt.py

optimizing for HIP
***    2.25 ms : kernel  0 r_64_8_7_7_2_16_4_3_7_4_4_7           [49, 8, 64]        [4, 16, 2]   takes    2.25 ms,   6881 GFLOPS
***    2.58 ms : kernel  1 r_2048_7_7_2_8_8_3_3                  [7, 7, 2048]       [8, 8, 2]    takes    0.33 ms,    351 GFLOPS
***    2.77 ms : kernel  2 r_64_2_49_8_16_16_4_4_4               [49, 2, 64]        [16, 8]      takes    0.19 ms,   9245 GFLOPS
***    3.72 ms : kernel  3 r_64_2_7_7_8_8_2_64_4_4_3_3           [49, 2, 64]        [2, 8, 8]    takes    0.95 ms,  15765 GFLOPS
***    4.38 ms : kernel  4 r_64_8_49_8_16_16_4_4_4               [49, 8, 64]        [16, 8]      takes    0.66 ms,   9984 GFLOPS
***    5.09 ms : kernel  5 r_64_8_49_8_16_16_4_4_4n1             [49, 8, 64]        [16, 8]      takes    0.71 ms,  10228 GFLOPS
***    5.66 ms : kernel  6 r_64_2_49_8_16_64_4_4_4               [49, 2, 64]        [16, 8]      takes    0.57 ms,  11626 GFLOPS
***    6.60 ms : kernel  7 r_64_2_7_7_8_8_2_64_4_4_3_3n1         [49, 2, 64]        [2, 8, 8]    takes    0.95 ms,  15765 GFLOPS
***    7.32 ms : kernel  8 r_64_8_49_8_16_16_4_4_4n2             [49, 8, 64]        [16, 8]      takes    0.71 ms,   9875 GFLOPS
***    7.89 ms : kernel  9 r_64_2_49_8_16_64_4_4_4n1             [49, 2, 64]        [16, 8]      takes    0.57 ms,  11626 GFLOPS
***    8.84 ms : kernel 10 r_64_2_7_7_8_8_2_64_4_4_3_3n2         [49, 2, 64]        [2, 8, 8]    takes    0.95 ms,  15765 GFLOPS
***    9.55 ms : kernel 11 r_64_8_49_8_16_16_4_4_4n3             [49, 8, 64]        [16, 8]      takes    0.71 ms,   9875 GFLOPS
***   10.38 ms : kernel 12 r_64_4_49_8_16_64_4_4_4               [49, 4, 64]        [16, 8]      takes    0.83 ms,  16069 GFLOPS
***   11.95 ms : kernel 13 r_32_2_7_7_2_16_4_128_4_4_3_3         [49, 2, 32]        [4, 16, 2]   takes    1.57 ms,   9461 GFLOPS
***   13.60 ms : kernel 14 r_32_8_7_7_2_16_4_64_4_4_4            [49, 8, 32]        [4, 16, 2]   takes    1.65 ms,   7984 GFLOPS
***   14.37 ms : kernel 15 r_32_8_49_2_16_4_32_4_4_4             [49, 8, 32]        [4, 16, 2]   takes    0.77 ms,   8986 GFLOPS
***   15.19 ms : kernel 16 r_32_2_49_2_16_4_128_4_4_4            [49, 2, 32]        [4, 16, 2]   takes    0.82 ms,   8049 GFLOPS
***   16.47 ms : kernel 17 r_32_2_7_7_2_16_4_128_4_4_3_3n1       [49, 2, 32]        [4, 16, 2]   takes    1.28 ms,  11623 GFLOPS
***   17.25 ms : kernel 18 r_32_8_49_2_16_4_32_4_4_4n1           [49, 8, 32]        [4, 16, 2]   takes    0.78 ms,   8731 GFLOPS
***   18.07 ms : kernel 19 r_32_2_49_2_16_4_128_4_4_4n1          [49, 2, 32]        [4, 16, 2]   takes    0.82 ms,   8049 GFLOPS
***   19.35 ms : kernel 20 r_32_2_7_7_2_16_4_128_4_4_3_3n2       [49, 2, 32]        [4, 16, 2]   takes    1.28 ms,  11623 GFLOPS
***   20.13 ms : kernel 21 r_32_8_49_2_16_4_32_4_4_4n2           [49, 8, 32]        [4, 16, 2]   takes    0.78 ms,   8731 GFLOPS
***   20.95 ms : kernel 22 r_32_2_49_2_16_4_128_4_4_4n2          [49, 2, 32]        [4, 16, 2]   takes    0.82 ms,   8049 GFLOPS
***   22.23 ms : kernel 23 r_32_2_7_7_2_16_4_128_4_4_3_3n3       [49, 2, 32]        [4, 16, 2]   takes    1.28 ms,  11623 GFLOPS
***   23.01 ms : kernel 24 r_32_8_49_2_16_4_32_4_4_4n3           [49, 8, 32]        [4, 16, 2]   takes    0.78 ms,   8731 GFLOPS
***   24.26 ms : kernel 25 r_32_4_49_2_16_4_128_4_4_4            [49, 4, 32]        [4, 16, 2]   takes    1.25 ms,  10619 GFLOPS
***   26.39 ms : kernel 26 r_16_4_7_7_16_2_2_256_4_4_3_3         [49, 4, 16]        [2, 2, 16]   takes    2.13 ms,   6957 GFLOPS
***   29.24 ms : kernel 27 r_16_16_7_7_16_2_2_128_4_4_4          [49, 16, 16]       [2, 2, 16]   takes    2.85 ms,   4618 GFLOPS
***   30.42 ms : kernel 28 r_8_16_49_8_16_64_4_4_4               [49, 16, 8]        [16, 8]      takes    1.18 ms,   5702 GFLOPS
***   31.83 ms : kernel 29 r_8_4_49_8_16_256_4_4_4               [49, 4, 8]         [16, 8]      takes    1.41 ms,   4669 GFLOPS
***   33.35 ms : kernel 30 r_16_4_7_7_16_2_2_256_4_4_3_3n1       [49, 4, 16]        [2, 2, 16]   takes    1.52 ms,   9776 GFLOPS
***   34.55 ms : kernel 31 r_8_16_49_8_16_64_4_4_4n1             [49, 16, 8]        [16, 8]      takes    1.20 ms,   5569 GFLOPS
***   35.97 ms : kernel 32 r_8_4_49_8_16_256_4_4_4n1             [49, 4, 8]         [16, 8]      takes    1.41 ms,   4669 GFLOPS
***   37.48 ms : kernel 33 r_16_4_7_7_16_2_2_256_4_4_3_3n2       [49, 4, 16]        [2, 2, 16]   takes    1.52 ms,   9776 GFLOPS
***   38.68 ms : kernel 34 r_8_16_49_8_16_64_4_4_4n2             [49, 16, 8]        [16, 8]      takes    1.20 ms,   5569 GFLOPS
***   40.10 ms : kernel 35 r_8_4_49_8_16_256_4_4_4n2             [49, 4, 8]         [16, 8]      takes    1.41 ms,   4669 GFLOPS
***   41.61 ms : kernel 36 r_16_4_7_7_16_2_2_256_4_4_3_3n3       [49, 4, 16]        [2, 2, 16]   takes    1.52 ms,   9776 GFLOPS
***   42.82 ms : kernel 37 r_8_16_49_8_16_64_4_4_4n3             [49, 16, 8]        [16, 8]      takes    1.20 ms,   5569 GFLOPS
***   44.23 ms : kernel 38 r_8_4_49_8_16_256_4_4_4n3             [49, 4, 8]         [16, 8]      takes    1.41 ms,   4669 GFLOPS
***   45.75 ms : kernel 39 r_16_4_7_7_16_2_2_256_4_4_3_3n4       [49, 4, 16]        [2, 2, 16]   takes    1.52 ms,   9776 GFLOPS
***   46.95 ms : kernel 40 r_8_16_49_8_16_64_4_4_4n4             [49, 16, 8]        [16, 8]      takes    1.20 ms,   5569 GFLOPS
***   48.36 ms : kernel 41 r_8_4_49_8_16_256_4_4_4n4             [49, 4, 8]         [16, 8]      takes    1.41 ms,   4669 GFLOPS
***   49.88 ms : kernel 42 r_16_4_7_7_16_2_2_256_4_4_3_3n5       [49, 4, 16]        [2, 2, 16]   takes    1.52 ms,   9776 GFLOPS
***   51.08 ms : kernel 43 r_8_16_49_8_16_64_4_4_4n5             [49, 16, 8]        [16, 8]      takes    1.20 ms,   5569 GFLOPS
***   53.78 ms : kernel 44 r_8_8_49_8_16_256_4_4_4               [49, 8, 8]         [16, 8]      takes    2.70 ms,   4896 GFLOPS
***  100.26 ms : kernel 45 r_8_8_8_16_512_3_3_7_7_4              [8, 8]             [16, 8]      takes   46.48 ms,    319 GFLOPS
***  104.56 ms : kernel 46 r_2_32_7_7_8_16_256_4_4_4             [49, 32, 2]        [16, 8]      takes    4.31 ms,   3055 GFLOPS
***  106.54 ms : kernel 47 r_2_32_49_8_16_128_4_4_4              [49, 32, 2]        [16, 8]      takes    1.98 ms,   3367 GFLOPS
***  108.96 ms : kernel 48 r_2_8_49_8_16_512_4_4_4               [49, 8, 2]         [16, 8]      takes    2.41 ms,   2731 GFLOPS
***  126.40 ms : kernel 49 r_8_8_8_16_512_3_3_7_7_4n1            [8, 8]             [16, 8]      takes   17.44 ms,    849 GFLOPS
***  128.35 ms : kernel 50 r_2_32_49_8_16_128_4_4_4n1            [49, 32, 2]        [16, 8]      takes    1.95 ms,   3401 GFLOPS
***  130.76 ms : kernel 51 r_2_8_49_8_16_512_4_4_4n1             [49, 8, 2]         [16, 8]      takes    2.41 ms,   2731 GFLOPS
***  148.20 ms : kernel 52 r_8_8_8_16_512_3_3_7_7_4n2            [8, 8]             [16, 8]      takes   17.44 ms,    849 GFLOPS
***  150.15 ms : kernel 53 r_2_32_49_8_16_128_4_4_4n2            [49, 32, 2]        [16, 8]      takes    1.95 ms,   3401 GFLOPS
***  150.24 ms : kernel 54 r_1024_32_49_4                        [1024]             [32]         takes    0.08 ms,     79 GFLOPS
***  150.43 ms : kernel 55 r_125_16_2_512_4_4_4                  [125]              [2, 16]      takes    0.19 ms,   1382 GFLOPS
***  150.45 ms : kernel 56 r_2_32_250_4                          [2]                [32]         takes    0.03 ms,      2 GFLOPS
***  150.53 ms : kernel 57 r_2_32_250_4n1                        [2]                [32]         takes    0.07 ms,      3 GFLOPS
***  150.54 ms : kernel 58 E_2_125_32_2_4                        [125, 2]           [2, 32]      takes    0.01 ms,      9 GFLOPS
******* total 150.54 ms,   3515 GFLOPS
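The summary line can be reconciled with the per-kernel rows: each row's GFLOPS is that kernel's work divided by its time, while the aggregate 3515 GFLOPS is total work over total wall time, so one slow kernel drags the average far below the ~10 TFLOPS peaks. A sketch with numbers taken from the table above:

```python
# work (GFLOP) = rate (GFLOP/s) * time (s)
total_ms, total_gflops = 150.54, 3515
total_work = total_gflops * total_ms / 1e3   # ~529 GFLOP for the whole pass

k45_ms, k45_gflops = 46.48, 319              # kernel 45, the slowest row
k45_share = k45_ms / total_ms
print(f"{total_work:.0f} GFLOP total; kernel 45 takes {k45_share:.0%} of runtime")
```

At 319 GFLOPS for roughly 31% of the runtime, kernel 45 alone explains most of the gap between the per-kernel peaks and the 3515 GFLOPS aggregate.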

hlb_cifar10.py

shuffling training dataset in 1337.90 ms (epoch=0)
  0 15108.61 ms run, 15098.47 ms python,   10.13 ms HIP, 1198.26 loss, 0.000015 LR, 0.84 GB used,     44.80 GFLOPS,    676.91 GOPS
  1 5837.29 ms run, 5834.97 ms python,    2.32 ms HIP, 1197.49 loss, 0.000030 LR, 4.54 GB used,    115.64 GFLOPS,    675.05 GOPS
  2   74.37 ms run,    4.43 ms python,   69.93 ms HIP, 1188.08 loss, 0.000045 LR, 4.54 GB used,   9077.39 GFLOPS,    675.05 GOPS
  3   73.70 ms run,    2.85 ms python,   70.85 ms HIP, 1171.24 loss, 0.000060 LR, 4.54 GB used,   9159.11 GFLOPS,    675.05 GOPS
  4   71.96 ms run,    2.86 ms python,   69.10 ms HIP, 1160.91 loss, 0.000075 LR, 4.54 GB used,   9380.72 GFLOPS,    675.05 GOPS
  5   70.66 ms run,    2.79 ms python,   67.87 ms HIP, 1158.75 loss, 0.000090 LR, 4.54 GB used,   9553.50 GFLOPS,    675.05 GOPS
  6   70.78 ms run,    2.77 ms python,   68.02 ms HIP, 1149.44 loss, 0.000105 LR, 4.54 GB used,   9537.02 GFLOPS,    675.05 GOPS
  7   69.50 ms run,    2.79 ms python,   66.71 ms HIP, 1172.52 loss, 0.000120 LR, 4.54 GB used,   9713.08 GFLOPS,    675.05 GOPS
  8   69.50 ms run,    2.76 ms python,   66.73 ms HIP, 1143.33 loss, 0.000135 LR, 4.54 GB used,   9713.20 GFLOPS,    675.05 GOPS
  9   69.41 ms run,    2.74 ms python,   66.67 ms HIP, 1129.94 loss, 0.000149 LR, 4.54 GB used,   9725.38 GFLOPS,    675.05 GOPS
 10   69.42 ms run,    2.96 ms python,   66.46 ms HIP, 1114.84 loss, 0.000164 LR, 4.54 GB used,   9724.70 GFLOPS,    675.05 GOPS
 11   69.34 ms run,    2.75 ms python,   66.59 ms HIP, 1099.61 loss, 0.000179 LR, 4.54 GB used,   9735.30 GFLOPS,    675.05 GOPS
 12   68.99 ms run,    2.71 ms python,   66.28 ms HIP, 1092.18 loss, 0.000194 LR, 4.54 GB used,   9784.85 GFLOPS,    675.05 GOPS
 13   68.94 ms run,    2.72 ms python,   66.22 ms HIP, 1068.63 loss, 0.000209 LR, 4.54 GB used,   9791.70 GFLOPS,    675.05 GOPS
 14   68.94 ms run,    2.74 ms python,   66.19 ms HIP, 1066.79 loss, 0.000224 LR, 4.54 GB used,   9792.07 GFLOPS,    675.05 GOPS
 15   69.61 ms run,    2.74 ms python,   66.86 ms HIP, 1065.05 loss, 0.000239 LR, 4.54 GB used,   9698.12 GFLOPS,    675.05 GOPS
 16   68.99 ms run,    2.76 ms python,   66.23 ms HIP, 1030.24 loss, 0.000254 LR, 4.54 GB used,   9785.08 GFLOPS,    675.05 GOPS
 17   69.05 ms run,    2.74 ms python,   66.31 ms HIP, 1034.32 loss, 0.000269 LR, 4.54 GB used,   9776.24 GFLOPS,    675.05 GOPS
 18   69.09 ms run,    2.79 ms python,   66.30 ms HIP, 1012.50 loss, 0.000284 LR, 4.54 GB used,   9770.48 GFLOPS,    675.05 GOPS
 19   68.84 ms run,    2.75 ms python,   66.10 ms HIP,  995.76 loss, 0.000299 LR, 4.54 GB used,   9805.71 GFLOPS,    675.05 GOPS
 20   69.74 ms run,    2.72 ms python,   67.02 ms HIP,  985.93 loss, 0.000314 LR, 4.54 GB used,   9679.54 GFLOPS,    675.05 GOPS
 21   69.37 ms run,    2.67 ms python,   66.69 ms HIP,  970.91 loss, 0.000329 LR, 4.54 GB used,   9731.73 GFLOPS,    675.05 GOPS
 22   69.33 ms run,    2.76 ms python,   66.58 ms HIP,  978.42 loss, 0.000344 LR, 4.54 GB used,   9736.07 GFLOPS,    675.05 GOPS
 23   69.28 ms run,    2.69 ms python,   66.58 ms HIP,  993.64 loss, 0.000359 LR, 4.54 GB used,   9744.05 GFLOPS,    675.05 GOPS
 24   69.03 ms run,    2.72 ms python,   66.31 ms HIP,  950.65 loss, 0.000374 LR, 4.54 GB used,   9779.28 GFLOPS,    675.05 GOPS
 25   69.20 ms run,    2.72 ms python,   66.48 ms HIP,  928.26 loss, 0.000389 LR, 4.54 GB used,   9754.64 GFLOPS,    675.05 GOPS
 26   69.84 ms run,    2.67 ms python,   67.16 ms HIP,  941.32 loss, 0.000404 LR, 4.54 GB used,   9665.76 GFLOPS,    675.05 GOPS
 27   69.34 ms run,    2.70 ms python,   66.64 ms HIP,  932.31 loss, 0.000418 LR, 4.54 GB used,   9734.64 GFLOPS,    675.05 GOPS
 28   69.07 ms run,    2.75 ms python,   66.32 ms HIP,  928.35 loss, 0.000433 LR, 4.54 GB used,   9773.78 GFLOPS,    675.05 GOPS
 29   68.97 ms run,    2.76 ms python,   66.22 ms HIP,  896.15 loss, 0.000448 LR, 4.54 GB used,   9787.18 GFLOPS,    675.05 GOPS
 30   68.71 ms run,    2.74 ms python,   65.97 ms HIP,  928.76 loss, 0.000463 LR, 4.54 GB used,   9824.42 GFLOPS,    675.05 GOPS
 31   69.22 ms run,    2.78 ms python,   66.44 ms HIP,  907.65 loss, 0.000478 LR, 4.54 GB used,   9751.77 GFLOPS,    675.05 GOPS
 32   69.47 ms run,    2.75 ms python,   66.73 ms HIP,  892.33 loss, 0.000493 LR, 4.54 GB used,   9716.56 GFLOPS,    675.05 GOPS
 33   68.84 ms run,    2.69 ms python,   66.15 ms HIP,  873.42 loss, 0.000508 LR, 4.54 GB used,   9805.90 GFLOPS,    675.05 GOPS
 34   69.59 ms run,    2.72 ms python,   66.86 ms HIP,  873.31 loss, 0.000523 LR, 4.54 GB used,   9700.65 GFLOPS,    675.05 GOPS
 35   69.19 ms run,    2.70 ms python,   66.49 ms HIP,  880.48 loss, 0.000538 LR, 4.54 GB used,   9757.04 GFLOPS,    675.05 GOPS
 36   69.74 ms run,    2.74 ms python,   67.01 ms HIP,  889.98 loss, 0.000553 LR, 4.54 GB used,   9678.76 GFLOPS,    675.05 GOPS
 37   68.99 ms run,    2.72 ms python,   66.27 ms HIP,  944.16 loss, 0.000568 LR, 4.54 GB used,   9784.54 GFLOPS,    675.05 GOPS
 38   69.25 ms run,    2.69 ms python,   66.56 ms HIP,  906.17 loss, 0.000583 LR, 4.54 GB used,   9747.65 GFLOPS,    675.05 GOPS
 39   69.12 ms run,    2.73 ms python,   66.40 ms HIP,  912.06 loss, 0.000598 LR, 4.54 GB used,   9765.62 GFLOPS,    675.05 GOPS
 40   69.11 ms run,    2.68 ms python,   66.42 ms HIP,  872.09 loss, 0.000613 LR, 4.54 GB used,   9768.14 GFLOPS,    675.05 GOPS
 41   68.29 ms run,    2.70 ms python,   65.59 ms HIP,  847.76 loss, 0.000628 LR, 4.54 GB used,   9885.47 GFLOPS,    675.05 GOPS
 42   69.38 ms run,    2.74 ms python,   66.64 ms HIP,  847.61 loss, 0.000643 LR, 4.54 GB used,   9729.01 GFLOPS,    675.05 GOPS
 43   69.86 ms run,    2.73 ms python,   67.12 ms HIP,  852.14 loss, 0.000658 LR, 4.54 GB used,   9663.11 GFLOPS,    675.05 GOPS
 44   69.20 ms run,    2.74 ms python,   66.47 ms HIP,  858.28 loss, 0.000673 LR, 4.54 GB used,   9754.62 GFLOPS,    675.05 GOPS
 45   69.28 ms run,    2.71 ms python,   66.57 ms HIP,  883.35 loss, 0.000688 LR, 4.54 GB used,   9743.96 GFLOPS,    675.05 GOPS
 46   69.97 ms run,    2.66 ms python,   67.31 ms HIP,  854.11 loss, 0.000702 LR, 4.54 GB used,   9647.80 GFLOPS,    675.05 GOPS
 47   69.51 ms run,    2.72 ms python,   66.78 ms HIP,  807.23 loss, 0.000717 LR, 4.54 GB used,   9712.03 GFLOPS,    675.05 GOPS
 48   69.31 ms run,    2.71 ms python,   66.60 ms HIP,  809.48 loss, 0.000732 LR, 4.54 GB used,   9739.65 GFLOPS,    675.05 GOPS
 49   69.46 ms run,    2.70 ms python,   66.76 ms HIP,  838.32 loss, 0.000747 LR, 4.54 GB used,   9718.94 GFLOPS,    675.05 GOPS
 50   69.30 ms run,    2.72 ms python,   66.58 ms HIP,  825.54 loss, 0.000762 LR, 4.54 GB used,   9741.24 GFLOPS,    675.05 GOPS
 51   69.55 ms run,    2.67 ms python,   66.88 ms HIP,  781.49 loss, 0.000777 LR, 4.54 GB used,   9706.30 GFLOPS,    675.05 GOPS
 52   69.25 ms run,    2.71 ms python,   66.54 ms HIP,  841.42 loss, 0.000792 LR, 4.54 GB used,   9747.96 GFLOPS,    675.05 GOPS
 53   69.24 ms run,    2.66 ms python,   66.58 ms HIP,  798.52 loss, 0.000807 LR, 4.54 GB used,   9748.76 GFLOPS,    675.05 GOPS
 54   69.38 ms run,    2.68 ms python,   66.70 ms HIP,  809.88 loss, 0.000822 LR, 4.54 GB used,   9729.31 GFLOPS,    675.05 GOPS
 55   69.59 ms run,    2.67 ms python,   66.92 ms HIP,  823.67 loss, 0.000837 LR, 4.54 GB used,   9700.17 GFLOPS,    675.05 GOPS
 56   68.85 ms run,    2.74 ms python,   66.11 ms HIP,  815.30 loss, 0.000852 LR, 4.54 GB used,   9804.06 GFLOPS,    675.05 GOPS
 57   69.44 ms run,    2.70 ms python,   66.74 ms HIP,  800.62 loss, 0.000867 LR, 4.54 GB used,   9721.16 GFLOPS,    675.05 GOPS
 58   69.37 ms run,    2.74 ms python,   66.63 ms HIP,  782.18 loss, 0.000882 LR, 4.54 GB used,   9731.23 GFLOPS,    675.05 GOPS
 59   69.41 ms run,    2.72 ms python,   66.69 ms HIP,  811.72 loss, 0.000897 LR, 4.54 GB used,   9725.58 GFLOPS,    675.05 GOPS
 60   68.69 ms run,    2.70 ms python,   65.99 ms HIP,  834.58 loss, 0.000912 LR, 4.54 GB used,   9826.86 GFLOPS,    675.05 GOPS
 61   70.02 ms run,    2.67 ms python,   67.35 ms HIP,  817.35 loss, 0.000927 LR, 4.54 GB used,   9640.55 GFLOPS,    675.05 GOPS
 62   69.54 ms run,    2.67 ms python,   66.87 ms HIP,  840.39 loss, 0.000942 LR, 4.54 GB used,   9707.34 GFLOPS,    675.05 GOPS
 63   69.11 ms run,    2.89 ms python,   66.22 ms HIP,  789.49 loss, 0.000957 LR, 4.54 GB used,   9767.09 GFLOPS,    675.05 GOPS
 64   69.37 ms run,    2.79 ms python,   66.58 ms HIP,  767.33 loss, 0.000971 LR, 4.54 GB used,   9730.40 GFLOPS,    675.05 GOPS
 65   68.84 ms run,    2.74 ms python,   66.10 ms HIP,  735.83 loss, 0.000986 LR, 4.54 GB used,   9806.38 GFLOPS,    675.05 GOPS
 66   69.71 ms run,    2.81 ms python,   66.90 ms HIP,  767.32 loss, 0.001001 LR, 4.54 GB used,   9683.70 GFLOPS,    675.05 GOPS
 67   69.49 ms run,    2.73 ms python,   66.76 ms HIP,  740.48 loss, 0.001016 LR, 4.54 GB used,   9714.26 GFLOPS,    675.05 GOPS
 68   69.04 ms run,    2.74 ms python,   66.31 ms HIP,  754.44 loss, 0.001031 LR, 4.54 GB used,   9777.48 GFLOPS,    675.05 GOPS
 69   68.58 ms run,    2.78 ms python,   65.80 ms HIP,  751.04 loss, 0.001046 LR, 4.54 GB used,   9843.00 GFLOPS,    675.05 GOPS
 70   69.71 ms run,    2.75 ms python,   66.95 ms HIP,  758.91 loss, 0.001061 LR, 4.54 GB used,   9684.11 GFLOPS,    675.05 GOPS
 71   69.41 ms run,    2.76 ms python,   66.64 ms HIP,  753.18 loss, 0.001076 LR, 4.54 GB used,   9725.77 GFLOPS,    675.05 GOPS
 72   69.59 ms run,    2.72 ms python,   66.87 ms HIP,  770.21 loss, 0.001091 LR, 4.54 GB used,   9699.87 GFLOPS,    675.05 GOPS
 73   69.39 ms run,    2.72 ms python,   66.67 ms HIP,  758.43 loss, 0.001106 LR, 4.54 GB used,   9727.67 GFLOPS,    675.05 GOPS
 74   69.18 ms run,    2.72 ms python,   66.46 ms HIP,  734.02 loss, 0.001121 LR, 4.54 GB used,   9757.70 GFLOPS,    675.05 GOPS
 75   68.85 ms run,    2.75 ms python,   66.09 ms HIP,  737.91 loss, 0.001136 LR, 4.54 GB used,   9805.03 GFLOPS,    675.05 GOPS
 76   69.30 ms run,    2.71 ms python,   66.59 ms HIP,  727.93 loss, 0.001151 LR, 4.54 GB used,   9741.20 GFLOPS,    675.05 GOPS
 77   69.46 ms run,    2.74 ms python,   66.71 ms HIP,  746.44 loss, 0.001166 LR, 4.54 GB used,   9719.09 GFLOPS,    675.05 GOPS
 78   69.37 ms run,    2.76 ms python,   66.62 ms HIP,  729.42 loss, 0.001181 LR, 4.54 GB used,   9731.01 GFLOPS,    675.05 GOPS
 79   69.36 ms run,    2.86 ms python,   66.50 ms HIP,  763.18 loss, 0.001196 LR, 4.54 GB used,   9731.99 GFLOPS,    675.05 GOPS
 80   69.00 ms run,    2.73 ms python,   66.27 ms HIP,  728.07 loss, 0.001211 LR, 4.54 GB used,   9783.30 GFLOPS,    675.05 GOPS
 81   69.05 ms run,    2.75 ms python,   66.30 ms HIP,  732.20 loss, 0.001226 LR, 4.54 GB used,   9776.00 GFLOPS,    675.05 GOPS
 82   69.80 ms run,    2.73 ms python,   67.08 ms HIP,  731.84 loss, 0.001240 LR, 4.54 GB used,   9670.57 GFLOPS,    675.05 GOPS
 83   69.34 ms run,    2.72 ms python,   66.62 ms HIP,  723.89 loss, 0.001255 LR, 4.54 GB used,   9735.15 GFLOPS,    675.05 GOPS
 84   69.09 ms run,    2.75 ms python,   66.34 ms HIP,  716.49 loss, 0.001270 LR, 4.54 GB used,   9770.21 GFLOPS,    675.05 GOPS
 85   68.92 ms run,    2.73 ms python,   66.19 ms HIP,  721.01 loss, 0.001285 LR, 4.54 GB used,   9795.17 GFLOPS,    675.05 GOPS
 86   69.31 ms run,    2.73 ms python,   66.58 ms HIP,  726.47 loss, 0.001300 LR, 4.54 GB used,   9739.78 GFLOPS,    675.05 GOPS
 87   69.16 ms run,    2.83 ms python,   66.33 ms HIP,  743.07 loss, 0.001315 LR, 4.54 GB used,   9761.19 GFLOPS,    675.05 GOPS
 88   69.55 ms run,    2.77 ms python,   66.78 ms HIP,  751.18 loss, 0.001330 LR, 4.54 GB used,   9706.38 GFLOPS,    675.05 GOPS
 89   69.36 ms run,    2.74 ms python,   66.61 ms HIP,  720.70 loss, 0.001345 LR, 4.54 GB used,   9732.68 GFLOPS,    675.05 GOPS
 90   69.07 ms run,    2.72 ms python,   66.35 ms HIP,  715.80 loss, 0.001360 LR, 4.54 GB used,   9773.50 GFLOPS,    675.05 GOPS
 91   68.65 ms run,    2.73 ms python,   65.91 ms HIP,  716.98 loss, 0.001375 LR, 4.54 GB used,   9833.84 GFLOPS,    675.05 GOPS
 92   69.23 ms run,    2.79 ms python,   66.45 ms HIP,  715.26 loss, 0.001390 LR, 4.54 GB used,   9750.10 GFLOPS,    675.05 GOPS
 93   69.16 ms run,    2.74 ms python,   66.42 ms HIP,  692.31 loss, 0.001405 LR, 4.54 GB used,   9760.43 GFLOPS,    675.05 GOPS
 94   69.43 ms run,    2.71 ms python,   66.72 ms HIP,  678.61 loss, 0.001420 LR, 4.54 GB used,   9723.11 GFLOPS,    675.05 GOPS
 95   69.07 ms run,    2.76 ms python,   66.31 ms HIP,  702.36 loss, 0.001435 LR, 4.54 GB used,   9772.82 GFLOPS,    675.05 GOPS
 96   68.79 ms run,    2.72 ms python,   66.07 ms HIP,  657.52 loss, 0.001450 LR, 4.54 GB used,   9813.28 GFLOPS,    675.05 GOPS
 97   68.94 ms run,    2.75 ms python,   66.19 ms HIP,  665.23 loss, 0.001465 LR, 4.54 GB used,   9792.20 GFLOPS,    675.05 GOPS
shuffling training dataset in 755.98 ms (epoch=1)
 98  831.82 ms run,  759.18 ms python,   72.64 ms HIP,  665.56 loss, 0.001480 LR, 4.54 GB used,    811.88 GFLOPS,    675.34 GOPS
 99   73.72 ms run,    2.85 ms python,   70.87 ms HIP,  695.91 loss, 0.001495 LR, 4.54 GB used,   9157.12 GFLOPS,    675.05 GOPS
100   72.84 ms run,    2.77 ms python,   70.07 ms HIP,  715.10 loss, 0.001510 LR, 4.54 GB used,   9267.20 GFLOPS,    675.05 GOPS
101   71.20 ms run,    2.87 ms python,   68.33 ms HIP,  703.98 loss, 0.001524 LR, 4.54 GB used,   9480.52 GFLOPS,    675.05 GOPS
102   71.12 ms run,    2.77 ms python,   68.34 ms HIP,  691.39 loss, 0.001539 LR, 4.54 GB used,   9492.06 GFLOPS,    675.05 GOPS
103   69.92 ms run,    2.75 ms python,   67.17 ms HIP,  678.40 loss, 0.001554 LR, 4.54 GB used,   9655.06 GFLOPS,    675.05 GOPS
104   69.41 ms run,    2.71 ms python,   66.70 ms HIP,  679.19 loss, 0.001569 LR, 4.54 GB used,   9725.23 GFLOPS,    675.05 GOPS
105   69.77 ms run,    2.70 ms python,   67.07 ms HIP,  684.38 loss, 0.001584 LR, 4.54 GB used,   9675.54 GFLOPS,    675.05 GOPS
106   69.57 ms run,    2.72 ms python,   66.85 ms HIP,  679.57 loss, 0.001599 LR, 4.54 GB used,   9702.74 GFLOPS,    675.05 GOPS
107   69.07 ms run,    2.76 ms python,   66.31 ms HIP,  675.04 loss, 0.001614 LR, 4.54 GB used,   9773.89 GFLOPS,    675.05 GOPS
108   69.64 ms run,    2.76 ms python,   66.88 ms HIP,  663.53 loss, 0.001629 LR, 4.54 GB used,   9693.30 GFLOPS,    675.05 GOPS
109   70.40 ms run,    2.83 ms python,   67.57 ms HIP,  669.80 loss, 0.001644 LR, 4.54 GB used,   9589.22 GFLOPS,    675.05 GOPS
110   69.53 ms run,    2.72 ms python,   66.82 ms HIP,  675.51 loss, 0.001659 LR, 4.54 GB used,   9708.04 GFLOPS,    675.05 GOPS
111   68.61 ms run,    2.80 ms python,   65.81 ms HIP,  675.21 loss, 0.001674 LR, 4.54 GB used,   9838.29 GFLOPS,    675.05 GOPS
112   68.84 ms run,    2.69 ms python,   66.15 ms HIP,  697.70 loss, 0.001689 LR, 4.54 GB used,   9806.06 GFLOPS,    675.05 GOPS
113   68.99 ms run,    2.69 ms python,   66.30 ms HIP,  699.45 loss, 0.001704 LR, 4.54 GB used,   9785.10 GFLOPS,    675.05 GOPS
114   69.07 ms run,    2.74 ms python,   66.34 ms HIP,  666.35 loss, 0.001719 LR, 4.54 GB used,   9772.85 GFLOPS,    675.05 GOPS
115   69.38 ms run,    2.73 ms python,   66.65 ms HIP,  685.84 loss, 0.001734 LR, 4.54 GB used,   9729.75 GFLOPS,    675.05 GOPS
116   69.55 ms run,    2.82 ms python,   66.74 ms HIP,  675.04 loss, 0.001749 LR, 4.54 GB used,   9705.29 GFLOPS,    675.05 GOPS
117   69.10 ms run,    2.72 ms python,   66.37 ms HIP,  659.46 loss, 0.001764 LR, 4.54 GB used,   9769.32 GFLOPS,    675.05 GOPS
118   69.31 ms run,    2.72 ms python,   66.59 ms HIP,  664.42 loss, 0.001779 LR, 4.54 GB used,   9739.88 GFLOPS,    675.05 GOPS
119   69.29 ms run,    2.69 ms python,   66.60 ms HIP,  687.12 loss, 0.001793 LR, 4.54 GB used,   9742.33 GFLOPS,    675.05 GOPS
120   69.46 ms run,    2.71 ms python,   66.76 ms HIP,  702.90 loss, 0.001808 LR, 4.54 GB used,   9717.85 GFLOPS,    675.05 GOPS
121   69.46 ms run,    2.74 ms python,   66.72 ms HIP,  705.35 loss, 0.001823 LR, 4.54 GB used,   9718.82 GFLOPS,    675.05 GOPS
122   69.70 ms run,    2.76 ms python,   66.94 ms HIP,  664.24 loss, 0.001838 LR, 4.54 GB used,   9685.62 GFLOPS,    675.05 GOPS
123   69.16 ms run,    2.75 ms python,   66.41 ms HIP,  670.65 loss, 0.001853 LR, 4.54 GB used,   9760.12 GFLOPS,    675.05 GOPS
124   69.02 ms run,    2.68 ms python,   66.34 ms HIP,  659.85 loss, 0.001868 LR, 4.54 GB used,   9780.20 GFLOPS,    675.05 GOPS
125   69.18 ms run,    2.79 ms python,   66.39 ms HIP,  668.93 loss, 0.001883 LR, 4.54 GB used,   9757.52 GFLOPS,    675.05 GOPS
126   69.38 ms run,    2.70 ms python,   66.68 ms HIP,  661.00 loss, 0.001898 LR, 4.54 GB used,   9729.95 GFLOPS,    675.05 GOPS
127   69.16 ms run,    2.74 ms python,   66.42 ms HIP,  654.85 loss, 0.001913 LR, 4.54 GB used,   9760.06 GFLOPS,    675.05 GOPS
128   69.01 ms run,    2.74 ms python,   66.27 ms HIP,  676.15 loss, 0.001928 LR, 4.54 GB used,   9781.70 GFLOPS,    675.05 GOPS
129   68.86 ms run,    2.70 ms python,   66.16 ms HIP,  676.86 loss, 0.001943 LR, 4.54 GB used,   9802.63 GFLOPS,    675.05 GOPS
130   68.56 ms run,    2.70 ms python,   65.86 ms HIP,  663.13 loss, 0.001958 LR, 4.54 GB used,   9846.06 GFLOPS,    675.05 GOPS
131   69.10 ms run,    2.68 ms python,   66.41 ms HIP,  645.75 loss, 0.001973 LR, 4.54 GB used,   9769.36 GFLOPS,    675.05 GOPS
132   70.37 ms run,    2.70 ms python,   67.67 ms HIP,  670.99 loss, 0.001988 LR, 4.54 GB used,   9593.29 GFLOPS,    675.05 GOPS
133   69.58 ms run,    2.70 ms python,   66.88 ms HIP,  661.26 loss, 0.002003 LR, 4.54 GB used,   9701.80 GFLOPS,    675.05 GOPS
134   69.39 ms run,    2.69 ms python,   66.71 ms HIP,  669.69 loss, 0.002018 LR, 4.54 GB used,   9727.68 GFLOPS,    675.05 GOPS
135   69.94 ms run,    2.70 ms python,   67.24 ms HIP,  673.50 loss, 0.002033 LR, 4.54 GB used,   9651.33 GFLOPS,    675.05 GOPS
136   70.10 ms run,    2.76 ms python,   67.34 ms HIP,  657.75 loss, 0.002048 LR, 4.54 GB used,   9630.03 GFLOPS,    675.05 GOPS
137   70.11 ms run,    2.72 ms python,   67.38 ms HIP,  660.81 loss, 0.002063 LR, 4.54 GB used,   9628.57 GFLOPS,    675.05 GOPS
138   69.64 ms run,    2.73 ms python,   66.91 ms HIP,  671.17 loss, 0.002077 LR, 4.54 GB used,   9693.96 GFLOPS,    675.05 GOPS
139   69.33 ms run,    2.72 ms python,   66.61 ms HIP,  688.35 loss, 0.002092 LR, 4.54 GB used,   9736.29 GFLOPS,    675.05 GOPS
140   69.38 ms run,    2.71 ms python,   66.67 ms HIP,  648.27 loss, 0.002107 LR, 4.54 GB used,   9729.40 GFLOPS,    675.05 GOPS
141   70.56 ms run,    2.72 ms python,   67.84 ms HIP,  645.85 loss, 0.002122 LR, 4.54 GB used,   9567.45 GFLOPS,    675.05 GOPS
142   69.73 ms run,    2.69 ms python,   67.04 ms HIP,  665.99 loss, 0.002137 LR, 4.54 GB used,   9680.96 GFLOPS,    675.05 GOPS
143   69.40 ms run,    2.80 ms python,   66.60 ms HIP,  692.06 loss, 0.002152 LR, 4.54 GB used,   9726.19 GFLOPS,    675.05 GOPS
144   68.76 ms run,    2.68 ms python,   66.08 ms HIP,  675.49 loss, 0.002167 LR, 4.54 GB used,   9817.55 GFLOPS,    675.05 GOPS
145   68.66 ms run,    2.68 ms python,   65.99 ms HIP,  715.16 loss, 0.002182 LR, 4.54 GB used,   9831.15 GFLOPS,    675.05 GOPS
146   69.04 ms run,    2.71 ms python,   66.33 ms HIP,  681.11 loss, 0.002197 LR, 4.54 GB used,   9777.15 GFLOPS,    675.05 GOPS
147   69.47 ms run,    2.67 ms python,   66.79 ms HIP,  713.74 loss, 0.002212 LR, 4.54 GB used,   9717.48 GFLOPS,    675.05 GOPS
148   69.13 ms run,    2.67 ms python,   66.47 ms HIP,  696.30 loss, 0.002227 LR, 4.54 GB used,   9764.18 GFLOPS,    675.05 GOPS
149   69.17 ms run,    2.67 ms python,   66.50 ms HIP,  651.40 loss, 0.002242 LR, 4.54 GB used,   9759.92 GFLOPS,    675.05 GOPS
150   69.74 ms run,    2.74 ms python,   67.00 ms HIP,  656.05 loss, 0.002257 LR, 4.54 GB used,   9680.08 GFLOPS,    675.05 GOPS
151   69.64 ms run,    2.68 ms python,   66.96 ms HIP,  659.93 loss, 0.002272 LR, 4.54 GB used,   9692.70 GFLOPS,    675.05 GOPS
152   70.08 ms run,    2.73 ms python,   67.35 ms HIP,  655.41 loss, 0.002287 LR, 4.54 GB used,   9633.18 GFLOPS,    675.05 GOPS
153   69.50 ms run,    2.71 ms python,   66.79 ms HIP,  642.93 loss, 0.002302 LR, 4.54 GB used,   9713.06 GFLOPS,    675.05 GOPS
154   69.17 ms run,    2.66 ms python,   66.51 ms HIP,  661.86 loss, 0.002317 LR, 4.54 GB used,   9758.88 GFLOPS,    675.05 GOPS
155   69.83 ms run,    2.70 ms python,   67.13 ms HIP,  656.04 loss, 0.002332 LR, 4.54 GB used,   9667.62 GFLOPS,    675.05 GOPS
156   69.37 ms run,    2.71 ms python,   66.66 ms HIP,  671.41 loss, 0.002346 LR, 4.54 GB used,   9731.45 GFLOPS,    675.05 GOPS
157   69.66 ms run,    2.75 ms python,   66.90 ms HIP,  670.28 loss, 0.002361 LR, 4.54 GB used,   9691.02 GFLOPS,    675.05 GOPS
158   70.13 ms run,    2.73 ms python,   67.39 ms HIP,  653.53 loss, 0.002376 LR, 4.54 GB used,   9625.92 GFLOPS,    675.05 GOPS
159   69.86 ms run,    2.68 ms python,   67.18 ms HIP,  645.35 loss, 0.002391 LR, 4.54 GB used,   9662.92 GFLOPS,    675.05 GOPS
160   69.78 ms run,    2.70 ms python,   67.08 ms HIP,  667.87 loss, 0.002406 LR, 4.54 GB used,   9674.32 GFLOPS,    675.05 GOPS
161   68.98 ms run,    2.72 ms python,   66.26 ms HIP,  646.49 loss, 0.002421 LR, 4.54 GB used,   9786.22 GFLOPS,    675.05 GOPS
162   69.74 ms run,    2.67 ms python,   67.07 ms HIP,  649.51 loss, 0.002436 LR, 4.54 GB used,   9679.95 GFLOPS,    675.05 GOPS
163   69.49 ms run,    2.71 ms python,   66.79 ms HIP,  643.96 loss, 0.002451 LR, 4.54 GB used,   9714.14 GFLOPS,    675.05 GOPS
164   69.13 ms run,    2.68 ms python,   66.45 ms HIP,  656.23 loss, 0.002466 LR, 4.54 GB used,   9764.87 GFLOPS,    675.05 GOPS
165   69.57 ms run,    2.66 ms python,   66.90 ms HIP,  670.91 loss, 0.002481 LR, 4.54 GB used,   9703.63 GFLOPS,    675.05 GOPS
166   69.08 ms run,    2.66 ms python,   66.42 ms HIP,  653.54 loss, 0.002496 LR, 4.54 GB used,   9771.72 GFLOPS,    675.05 GOPS
167   69.15 ms run,    2.69 ms python,   66.46 ms HIP,  664.16 loss, 0.002511 LR, 4.54 GB used,   9762.65 GFLOPS,    675.05 GOPS
168   69.67 ms run,    2.68 ms python,   66.98 ms HIP,  649.77 loss, 0.002526 LR, 4.54 GB used,   9689.66 GFLOPS,    675.05 GOPS
169   69.39 ms run,    2.67 ms python,   66.72 ms HIP,  644.44 loss, 0.002541 LR, 4.54 GB used,   9728.23 GFLOPS,    675.05 GOPS
170   69.52 ms run,    2.71 ms python,   66.81 ms HIP,  629.33 loss, 0.002556 LR, 4.54 GB used,   9710.44 GFLOPS,    675.05 GOPS
171   69.34 ms run,    2.73 ms python,   66.61 ms HIP,  655.48 loss, 0.002571 LR, 4.54 GB used,   9735.60 GFLOPS,    675.05 GOPS
172   69.39 ms run,    2.70 ms python,   66.70 ms HIP,  669.01 loss, 0.002586 LR, 4.54 GB used,   9727.61 GFLOPS,    675.05 GOPS
173   68.89 ms run,    2.68 ms python,   66.21 ms HIP,  678.96 loss, 0.002601 LR, 4.54 GB used,   9798.66 GFLOPS,    675.05 GOPS
174   69.34 ms run,    2.71 ms python,   66.63 ms HIP,  695.76 loss, 0.002615 LR, 4.54 GB used,   9735.53 GFLOPS,    675.05 GOPS
175   68.74 ms run,    2.71 ms python,   66.04 ms HIP,  657.40 loss, 0.002630 LR, 4.54 GB used,   9819.56 GFLOPS,    675.05 GOPS
176   69.06 ms run,    2.68 ms python,   66.37 ms HIP,  649.10 loss, 0.002645 LR, 4.54 GB used,   9775.22 GFLOPS,    675.05 GOPS
177   69.97 ms run,    2.66 ms python,   67.30 ms HIP,  640.65 loss, 0.002660 LR, 4.54 GB used,   9648.05 GFLOPS,    675.05 GOPS
178   69.51 ms run,    2.69 ms python,   66.82 ms HIP,  627.96 loss, 0.002675 LR, 4.54 GB used,   9711.71 GFLOPS,    675.05 GOPS
179   68.84 ms run,    2.64 ms python,   66.20 ms HIP,  677.60 loss, 0.002690 LR, 4.54 GB used,   9805.53 GFLOPS,    675.05 GOPS
180   68.97 ms run,    2.68 ms python,   66.29 ms HIP,  646.24 loss, 0.002705 LR, 4.54 GB used,   9787.94 GFLOPS,    675.05 GOPS
181   69.45 ms run,    2.67 ms python,   66.78 ms HIP,  667.96 loss, 0.002720 LR, 4.54 GB used,   9720.16 GFLOPS,    675.05 GOPS
182   69.08 ms run,    2.66 ms python,   66.42 ms HIP,  629.36 loss, 0.002735 LR, 4.54 GB used,   9771.26 GFLOPS,    675.05 GOPS
183   69.24 ms run,    2.66 ms python,   66.58 ms HIP,  662.15 loss, 0.002750 LR, 4.54 GB used,   9749.65 GFLOPS,    675.05 GOPS
184   69.35 ms run,    2.67 ms python,   66.68 ms HIP,  655.82 loss, 0.002765 LR, 4.54 GB used,   9733.97 GFLOPS,    675.05 GOPS
185   69.36 ms run,    2.67 ms python,   66.68 ms HIP,  660.38 loss, 0.002780 LR, 4.54 GB used,   9733.08 GFLOPS,    675.05 GOPS
186   69.16 ms run,    2.68 ms python,   66.48 ms HIP,  653.15 loss, 0.002795 LR, 4.54 GB used,   9760.66 GFLOPS,    675.05 GOPS
187   69.70 ms run,    2.73 ms python,   66.97 ms HIP,  660.77 loss, 0.002810 LR, 4.54 GB used,   9685.19 GFLOPS,    675.05 GOPS
188   69.29 ms run,    2.68 ms python,   66.62 ms HIP,  639.66 loss, 0.002825 LR, 4.54 GB used,   9741.91 GFLOPS,    675.05 GOPS
189   69.36 ms run,    2.68 ms python,   66.68 ms HIP,  677.11 loss, 0.002840 LR, 4.54 GB used,   9732.29 GFLOPS,    675.05 GOPS
190   68.75 ms run,    2.77 ms python,   65.97 ms HIP,  657.45 loss, 0.002855 LR, 4.54 GB used,   9819.44 GFLOPS,    675.05 GOPS
191   69.18 ms run,    2.70 ms python,   66.48 ms HIP,  657.44 loss, 0.002870 LR, 4.54 GB used,   9757.84 GFLOPS,    675.05 GOPS
192   69.28 ms run,    2.70 ms python,   66.58 ms HIP,  644.28 loss, 0.002885 LR, 4.54 GB used,   9743.48 GFLOPS,    675.05 GOPS
193   69.36 ms run,    2.64 ms python,   66.72 ms HIP,  664.24 loss, 0.002899 LR, 4.54 GB used,   9732.33 GFLOPS,    675.05 GOPS
194   69.23 ms run,    2.66 ms python,   66.57 ms HIP,  658.53 loss, 0.002914 LR, 4.54 GB used,   9751.01 GFLOPS,    675.05 GOPS
195   69.44 ms run,    2.65 ms python,   66.79 ms HIP,  625.03 loss, 0.002929 LR, 4.54 GB used,   9721.33 GFLOPS,    675.05 GOPS
shuffling training dataset in 755.85 ms (epoch=2)
196  832.06 ms run,  759.15 ms python,   72.91 ms HIP,  659.99 loss, 0.002944 LR, 4.54 GB used,    811.65 GFLOPS,    675.34 GOPS
197   73.49 ms run,    2.80 ms python,   70.69 ms HIP,  634.10 loss, 0.002959 LR, 4.54 GB used,   9185.29 GFLOPS,    675.05 GOPS
198   72.67 ms run,    2.79 ms python,   69.89 ms HIP,  629.92 loss, 0.002974 LR, 4.54 GB used,   9288.55 GFLOPS,    675.05 GOPS
199   71.07 ms run,    2.72 ms python,   68.35 ms HIP,  619.03 loss, 0.002989 LR, 4.54 GB used,   9498.03 GFLOPS,    675.05 GOPS
200   70.13 ms run,    2.70 ms python,   67.43 ms HIP,  622.81 loss, 0.003004 LR, 4.54 GB used,   9625.49 GFLOPS,    675.05 GOPS
201   69.80 ms run,    2.69 ms python,   67.11 ms HIP,  653.30 loss, 0.003019 LR, 4.54 GB used,   9671.01 GFLOPS,    675.05 GOPS
202   70.01 ms run,    2.66 ms python,   67.35 ms HIP,  669.36 loss, 0.003034 LR, 4.54 GB used,   9642.53 GFLOPS,    675.05 GOPS
203   69.19 ms run,    2.68 ms python,   66.51 ms HIP,  636.66 loss, 0.003049 LR, 4.54 GB used,   9756.09 GFLOPS,    675.05 GOPS
204   69.73 ms run,    2.70 ms python,   67.03 ms HIP,  638.22 loss, 0.003064 LR, 4.54 GB used,   9681.18 GFLOPS,    675.05 GOPS
205   70.08 ms run,    2.67 ms python,   67.41 ms HIP,  637.03 loss, 0.003079 LR, 4.54 GB used,   9631.95 GFLOPS,    675.05 GOPS
206   69.47 ms run,    2.73 ms python,   66.74 ms HIP,  650.79 loss, 0.003094 LR, 4.54 GB used,   9716.82 GFLOPS,    675.05 GOPS
207   69.78 ms run,    2.73 ms python,   67.05 ms HIP,  626.88 loss, 0.003109 LR, 4.54 GB used,   9674.44 GFLOPS,    675.05 GOPS
208   69.31 ms run,    2.66 ms python,   66.65 ms HIP,  640.41 loss, 0.003124 LR, 4.54 GB used,   9739.37 GFLOPS,    675.05 GOPS
209   70.36 ms run,    2.71 ms python,   67.65 ms HIP,  655.87 loss, 0.003139 LR, 4.54 GB used,   9593.98 GFLOPS,    675.05 GOPS
210   69.85 ms run,    2.67 ms python,   67.18 ms HIP,  641.10 loss, 0.003154 LR, 4.54 GB used,   9663.86 GFLOPS,    675.05 GOPS
211   69.48 ms run,    2.67 ms python,   66.80 ms HIP,  612.88 loss, 0.003168 LR, 4.54 GB used,   9716.12 GFLOPS,    675.05 GOPS
212   69.22 ms run,    2.71 ms python,   66.51 ms HIP,  620.64 loss, 0.003183 LR, 4.54 GB used,   9751.83 GFLOPS,    675.05 GOPS
213   69.32 ms run,    2.73 ms python,   66.59 ms HIP,  632.55 loss, 0.003198 LR, 4.54 GB used,   9737.44 GFLOPS,    675.05 GOPS
214   69.71 ms run,    2.69 ms python,   67.02 ms HIP,  647.31 loss, 0.003213 LR, 4.54 GB used,   9683.01 GFLOPS,    675.05 GOPS
215   69.61 ms run,    2.72 ms python,   66.89 ms HIP,  656.43 loss, 0.003228 LR, 4.54 GB used,   9697.12 GFLOPS,    675.05 GOPS
216   69.15 ms run,    2.66 ms python,   66.49 ms HIP,  625.25 loss, 0.003243 LR, 4.54 GB used,   9761.35 GFLOPS,    675.05 GOPS
217   69.35 ms run,    2.67 ms python,   66.67 ms HIP,  657.98 loss, 0.003258 LR, 4.54 GB used,   9733.98 GFLOPS,    675.05 GOPS
218   69.91 ms run,    2.73 ms python,   67.19 ms HIP,  629.03 loss, 0.003273 LR, 4.54 GB used,   9655.39 GFLOPS,    675.05 GOPS
219   69.45 ms run,    2.76 ms python,   66.69 ms HIP,  627.14 loss, 0.003288 LR, 4.54 GB used,   9719.18 GFLOPS,    675.05 GOPS
220   69.20 ms run,    2.68 ms python,   66.52 ms HIP,  627.50 loss, 0.003303 LR, 4.54 GB used,   9754.88 GFLOPS,    675.05 GOPS
221   69.24 ms run,    2.70 ms python,   66.54 ms HIP,  640.78 loss, 0.003318 LR, 4.54 GB used,   9749.73 GFLOPS,    675.05 GOPS
222   69.41 ms run,    2.77 ms python,   66.64 ms HIP,  645.41 loss, 0.003333 LR, 4.54 GB used,   9724.79 GFLOPS,    675.05 GOPS
223   69.49 ms run,    2.70 ms python,   66.79 ms HIP,  653.10 loss, 0.003348 LR, 4.54 GB used,   9714.20 GFLOPS,    675.05 GOPS
224   69.62 ms run,    2.69 ms python,   66.93 ms HIP,  638.37 loss, 0.003363 LR, 4.54 GB used,   9695.58 GFLOPS,    675.05 GOPS
225   69.07 ms run,    2.66 ms python,   66.41 ms HIP,  638.95 loss, 0.003378 LR, 4.54 GB used,   9773.32 GFLOPS,    675.05 GOPS
226   68.82 ms run,    2.67 ms python,   66.15 ms HIP,  620.94 loss, 0.003393 LR, 4.54 GB used,   9808.29 GFLOPS,    675.05 GOPS
227   68.87 ms run,    2.68 ms python,   66.19 ms HIP,  627.10 loss, 0.003408 LR, 4.54 GB used,   9801.68 GFLOPS,    675.05 GOPS
228   69.23 ms run,    2.68 ms python,   66.54 ms HIP,  638.22 loss, 0.003423 LR, 4.54 GB used,   9751.06 GFLOPS,    675.05 GOPS
229   69.44 ms run,    2.65 ms python,   66.78 ms HIP,  630.57 loss, 0.003437 LR, 4.54 GB used,   9721.73 GFLOPS,    675.05 GOPS
230   68.87 ms run,    2.66 ms python,   66.21 ms HIP,  641.26 loss, 0.003433 LR, 4.54 GB used,   9802.19 GFLOPS,    675.05 GOPS
231   69.57 ms run,    2.75 ms python,   66.82 ms HIP,  643.91 loss, 0.003429 LR, 4.54 GB used,   9703.59 GFLOPS,    675.05 GOPS
232   69.32 ms run,    2.68 ms python,   66.64 ms HIP,  635.29 loss, 0.003424 LR, 4.54 GB used,   9737.75 GFLOPS,    675.05 GOPS
233   69.05 ms run,    2.71 ms python,   66.34 ms HIP,  662.56 loss, 0.003420 LR, 4.54 GB used,   9776.68 GFLOPS,    675.05 GOPS
234   69.36 ms run,    2.66 ms python,   66.69 ms HIP,  676.33 loss, 0.003416 LR, 4.54 GB used,   9732.88 GFLOPS,    675.05 GOPS
235   69.24 ms run,    2.71 ms python,   66.52 ms HIP,  685.30 loss, 0.003411 LR, 4.54 GB used,   9749.93 GFLOPS,    675.05 GOPS
236   69.62 ms run,    2.68 ms python,   66.94 ms HIP,  650.38 loss, 0.003407 LR, 4.54 GB used,   9696.14 GFLOPS,    675.05 GOPS
237   69.17 ms run,    2.66 ms python,   66.51 ms HIP,  663.55 loss, 0.003403 LR, 4.54 GB used,   9759.73 GFLOPS,    675.05 GOPS
238   69.29 ms run,    2.66 ms python,   66.63 ms HIP,  647.40 loss, 0.003398 LR, 4.54 GB used,   9742.70 GFLOPS,    675.05 GOPS
239   68.74 ms run,    2.65 ms python,   66.08 ms HIP,  619.72 loss, 0.003394 LR, 4.54 GB used,   9820.60 GFLOPS,    675.05 GOPS
240   69.47 ms run,    2.69 ms python,   66.78 ms HIP,  636.31 loss, 0.003390 LR, 4.54 GB used,   9717.35 GFLOPS,    675.05 GOPS
241   69.68 ms run,    2.69 ms python,   66.99 ms HIP,  645.92 loss, 0.003385 LR, 4.54 GB used,   9687.70 GFLOPS,    675.05 GOPS
242   69.31 ms run,    2.68 ms python,   66.63 ms HIP,  632.35 loss, 0.003381 LR, 4.54 GB used,   9738.96 GFLOPS,    675.05 GOPS
243   69.45 ms run,    2.68 ms python,   66.77 ms HIP,  633.76 loss, 0.003377 LR, 4.54 GB used,   9719.79 GFLOPS,    675.05 GOPS
244   69.14 ms run,    2.72 ms python,   66.42 ms HIP,  637.14 loss, 0.003372 LR, 4.54 GB used,   9763.85 GFLOPS,    675.05 GOPS
245   68.80 ms run,    2.66 ms python,   66.13 ms HIP,  642.28 loss, 0.003368 LR, 4.54 GB used,   9811.88 GFLOPS,    675.05 GOPS
246   69.97 ms run,    2.69 ms python,   67.29 ms HIP,  647.99 loss, 0.003364 LR, 4.54 GB used,   9647.21 GFLOPS,    675.05 GOPS
247   69.52 ms run,    2.67 ms python,   66.85 ms HIP,  619.52 loss, 0.003359 LR, 4.54 GB used,   9709.50 GFLOPS,    675.05 GOPS
248   69.24 ms run,    2.65 ms python,   66.59 ms HIP,  628.87 loss, 0.003355 LR, 4.54 GB used,   9749.61 GFLOPS,    675.05 GOPS
249   69.69 ms run,    2.68 ms python,   67.01 ms HIP,  613.62 loss, 0.003350 LR, 4.54 GB used,   9686.41 GFLOPS,    675.05 GOPS
250   68.42 ms run,    2.64 ms python,   65.78 ms HIP,  639.52 loss, 0.003346 LR, 4.54 GB used,   9866.10 GFLOPS,    675.05 GOPS
251   70.09 ms run,    2.63 ms python,   67.46 ms HIP,  633.70 loss, 0.003342 LR, 4.54 GB used,   9631.68 GFLOPS,    675.05 GOPS
252   68.83 ms run,    2.72 ms python,   66.11 ms HIP,  604.80 loss, 0.003337 LR, 4.54 GB used,   9807.18 GFLOPS,    675.05 GOPS
253   69.05 ms run,    2.72 ms python,   66.33 ms HIP,  621.34 loss, 0.003333 LR, 4.54 GB used,   9775.99 GFLOPS,    675.05 GOPS
254   69.15 ms run,    2.74 ms python,   66.41 ms HIP,  629.10 loss, 0.003329 LR, 4.54 GB used,   9762.59 GFLOPS,    675.05 GOPS
255   69.57 ms run,    2.69 ms python,   66.88 ms HIP,  639.65 loss, 0.003324 LR, 4.54 GB used,   9703.32 GFLOPS,    675.05 GOPS
256   70.20 ms run,    2.76 ms python,   67.45 ms HIP,  615.65 loss, 0.003320 LR, 4.54 GB used,   9615.88 GFLOPS,    675.05 GOPS
257   69.17 ms run,    2.69 ms python,   66.48 ms HIP,  633.85 loss, 0.003316 LR, 4.54 GB used,   9759.26 GFLOPS,    675.05 GOPS
258   69.09 ms run,    2.70 ms python,   66.38 ms HIP,  614.25 loss, 0.003311 LR, 4.54 GB used,   9770.97 GFLOPS,    675.05 GOPS
259   68.95 ms run,    2.68 ms python,   66.27 ms HIP,  616.02 loss, 0.003307 LR, 4.54 GB used,   9790.04 GFLOPS,    675.05 GOPS
260   69.49 ms run,    2.67 ms python,   66.82 ms HIP,  629.08 loss, 0.003303 LR, 4.54 GB used,   9714.19 GFLOPS,    675.05 GOPS
261   69.19 ms run,    2.66 ms python,   66.52 ms HIP,  618.21 loss, 0.003298 LR, 4.54 GB used,   9756.94 GFLOPS,    675.05 GOPS
262   69.17 ms run,    2.63 ms python,   66.54 ms HIP,  647.45 loss, 0.003294 LR, 4.54 GB used,   9758.55 GFLOPS,    675.05 GOPS
263   68.86 ms run,    2.67 ms python,   66.19 ms HIP,  623.23 loss, 0.003290 LR, 4.54 GB used,   9803.43 GFLOPS,    675.05 GOPS
264   69.24 ms run,    2.66 ms python,   66.58 ms HIP,  661.51 loss, 0.003285 LR, 4.54 GB used,   9748.80 GFLOPS,    675.05 GOPS
265   69.57 ms run,    2.66 ms python,   66.92 ms HIP,  644.22 loss, 0.003281 LR, 4.54 GB used,   9702.66 GFLOPS,    675.05 GOPS
266   68.90 ms run,    2.66 ms python,   66.24 ms HIP,  636.75 loss, 0.003276 LR, 4.54 GB used,   9797.61 GFLOPS,    675.05 GOPS
267   68.77 ms run,    2.64 ms python,   66.13 ms HIP,  642.13 loss, 0.003272 LR, 4.54 GB used,   9815.75 GFLOPS,    675.05 GOPS
268   69.19 ms run,    2.69 ms python,   66.50 ms HIP,  629.14 loss, 0.003268 LR, 4.54 GB used,   9756.40 GFLOPS,    675.05 GOPS
269   69.51 ms run,    2.67 ms python,   66.85 ms HIP,  616.75 loss, 0.003263 LR, 4.54 GB used,   9710.94 GFLOPS,    675.05 GOPS
270   69.58 ms run,    2.68 ms python,   66.90 ms HIP,  644.76 loss, 0.003259 LR, 4.54 GB used,   9701.83 GFLOPS,    675.05 GOPS
271   69.62 ms run,    2.65 ms python,   66.97 ms HIP,  644.48 loss, 0.003255 LR, 4.54 GB used,   9696.53 GFLOPS,    675.05 GOPS
272   69.69 ms run,    2.66 ms python,   67.03 ms HIP,  632.84 loss, 0.003250 LR, 4.54 GB used,   9686.25 GFLOPS,    675.05 GOPS
273   69.17 ms run,    2.78 ms python,   66.39 ms HIP,  616.68 loss, 0.003246 LR, 4.54 GB used,   9758.53 GFLOPS,    675.05 GOPS
274   69.09 ms run,    2.67 ms python,   66.42 ms HIP,  622.35 loss, 0.003242 LR, 4.54 GB used,   9770.47 GFLOPS,    675.05 GOPS
275   69.79 ms run,    2.74 ms python,   67.05 ms HIP,  645.60 loss, 0.003237 LR, 4.54 GB used,   9672.81 GFLOPS,    675.05 GOPS
276   69.70 ms run,    2.66 ms python,   67.03 ms HIP,  612.62 loss, 0.003233 LR, 4.54 GB used,   9685.45 GFLOPS,    675.05 GOPS
277   68.45 ms run,    2.70 ms python,   65.75 ms HIP,  602.66 loss, 0.003229 LR, 4.54 GB used,   9861.52 GFLOPS,    675.05 GOPS
278   69.37 ms run,    2.65 ms python,   66.72 ms HIP,  616.91 loss, 0.003224 LR, 4.54 GB used,   9730.95 GFLOPS,    675.05 GOPS
279   70.57 ms run,    2.74 ms python,   67.83 ms HIP,  630.11 loss, 0.003220 LR, 4.54 GB used,   9566.13 GFLOPS,    675.05 GOPS
280   69.70 ms run,    2.67 ms python,   67.03 ms HIP,  650.35 loss, 0.003216 LR, 4.54 GB used,   9685.41 GFLOPS,    675.05 GOPS
281   69.23 ms run,    2.67 ms python,   66.57 ms HIP,  612.47 loss, 0.003211 LR, 4.54 GB used,   9750.26 GFLOPS,    675.05 GOPS
282   69.25 ms run,    2.68 ms python,   66.57 ms HIP,  632.91 loss, 0.003207 LR, 4.54 GB used,   9748.08 GFLOPS,    675.05 GOPS
283   68.70 ms run,    2.66 ms python,   66.05 ms HIP,  602.64 loss, 0.003202 LR, 4.54 GB used,   9825.40 GFLOPS,    675.05 GOPS
284   69.28 ms run,    2.66 ms python,   66.62 ms HIP,  616.49 loss, 0.003198 LR, 4.54 GB used,   9744.36 GFLOPS,    675.05 GOPS
285   69.14 ms run,    2.68 ms python,   66.47 ms HIP,  645.61 loss, 0.003194 LR, 4.54 GB used,   9762.77 GFLOPS,    675.05 GOPS
286   69.64 ms run,    2.78 ms python,   66.86 ms HIP,  615.17 loss, 0.003189 LR, 4.54 GB used,   9693.46 GFLOPS,    675.05 GOPS
287   69.62 ms run,    2.71 ms python,   66.91 ms HIP,  642.57 loss, 0.003185 LR, 4.54 GB used,   9695.77 GFLOPS,    675.05 GOPS
288   69.24 ms run,    2.65 ms python,   66.59 ms HIP,  628.23 loss, 0.003181 LR, 4.54 GB used,   9749.11 GFLOPS,    675.05 GOPS
289   69.29 ms run,    2.63 ms python,   66.66 ms HIP,  652.79 loss, 0.003176 LR, 4.54 GB used,   9742.13 GFLOPS,    675.05 GOPS
290   69.26 ms run,    2.66 ms python,   66.60 ms HIP,  646.73 loss, 0.003172 LR, 4.54 GB used,   9746.35 GFLOPS,    675.05 GOPS
291   69.11 ms run,    2.68 ms python,   66.43 ms HIP,  608.35 loss, 0.003168 LR, 4.54 GB used,   9767.87 GFLOPS,    675.05 GOPS
292   69.11 ms run,    2.64 ms python,   66.48 ms HIP,  607.70 loss, 0.003163 LR, 4.54 GB used,   9767.05 GFLOPS,    675.05 GOPS
293   68.87 ms run,    2.66 ms python,   66.21 ms HIP,  592.41 loss, 0.003159 LR, 4.54 GB used,   9801.96 GFLOPS,    675.05 GOPS
shuffling training dataset in 756.48 ms (epoch=3)
294  832.10 ms run,  759.70 ms python,   72.40 ms HIP,  608.68 loss, 0.003155 LR, 4.54 GB used,    811.61 GFLOPS,    675.34 GOPS
295   74.06 ms run,    2.77 ms python,   71.30 ms HIP,  610.76 loss, 0.003150 LR, 4.54 GB used,   9114.37 GFLOPS,    675.05 GOPS
296   72.09 ms run,    2.76 ms python,   69.33 ms HIP,  602.13 loss, 0.003146 LR, 4.54 GB used,   9364.36 GFLOPS,    675.05 GOPS
297   71.02 ms run,    2.70 ms python,   68.32 ms HIP,  605.34 loss, 0.003142 LR, 4.54 GB used,   9505.38 GFLOPS,    675.05 GOPS
298   70.45 ms run,    2.65 ms python,   67.79 ms HIP,  596.22 loss, 0.003137 LR, 4.54 GB used,   9582.37 GFLOPS,    675.05 GOPS
299   70.28 ms run,    2.71 ms python,   67.57 ms HIP,  588.64 loss, 0.003133 LR, 4.54 GB used,   9604.62 GFLOPS,    675.05 GOPS
300   69.83 ms run,    2.73 ms python,   67.10 ms HIP,  621.56 loss, 0.003128 LR, 4.54 GB used,   9667.02 GFLOPS,    675.05 GOPS
301   69.90 ms run,    2.75 ms python,   67.15 ms HIP,  629.12 loss, 0.003124 LR, 4.54 GB used,   9657.46 GFLOPS,    675.05 GOPS
302   69.82 ms run,    2.69 ms python,   67.13 ms HIP,  612.17 loss, 0.003120 LR, 4.54 GB used,   9668.33 GFLOPS,    675.05 GOPS
303   69.37 ms run,    2.68 ms python,   66.69 ms HIP,  590.49 loss, 0.003115 LR, 4.54 GB used,   9730.96 GFLOPS,    675.05 GOPS
304   69.43 ms run,    2.69 ms python,   66.74 ms HIP,  604.18 loss, 0.003111 LR, 4.54 GB used,   9721.99 GFLOPS,    675.05 GOPS
305   69.31 ms run,    2.68 ms python,   66.62 ms HIP,  618.26 loss, 0.003107 LR, 4.54 GB used,   9740.19 GFLOPS,    675.05 GOPS
306   69.64 ms run,    2.65 ms python,   66.99 ms HIP,  601.56 loss, 0.003102 LR, 4.54 GB used,   9693.74 GFLOPS,    675.05 GOPS
307   69.08 ms run,    2.70 ms python,   66.38 ms HIP,  602.58 loss, 0.003098 LR, 4.54 GB used,   9772.04 GFLOPS,    675.05 GOPS
308   69.51 ms run,    2.66 ms python,   66.86 ms HIP,  592.20 loss, 0.003094 LR, 4.54 GB used,   9710.86 GFLOPS,    675.05 GOPS
309   68.93 ms run,    2.76 ms python,   66.18 ms HIP,  581.20 loss, 0.003089 LR, 4.54 GB used,   9793.17 GFLOPS,    675.05 GOPS
310   69.35 ms run,    2.67 ms python,   66.68 ms HIP,  595.50 loss, 0.003085 LR, 4.54 GB used,   9733.98 GFLOPS,    675.05 GOPS
311   69.28 ms run,    2.64 ms python,   66.64 ms HIP,  602.04 loss, 0.003081 LR, 4.54 GB used,   9743.81 GFLOPS,    675.05 GOPS
312   69.76 ms run,    2.67 ms python,   67.10 ms HIP,  613.41 loss, 0.003076 LR, 4.54 GB used,   9676.11 GFLOPS,    675.05 GOPS
313   69.49 ms run,    2.74 ms python,   66.75 ms HIP,  612.33 loss, 0.003072 LR, 4.54 GB used,   9714.14 GFLOPS,    675.05 GOPS
314   69.55 ms run,    2.67 ms python,   66.88 ms HIP,  601.82 loss, 0.003068 LR, 4.54 GB used,   9706.48 GFLOPS,    675.05 GOPS
315   69.46 ms run,    2.70 ms python,   66.76 ms HIP,  599.91 loss, 0.003063 LR, 4.54 GB used,   9718.26 GFLOPS,    675.05 GOPS
316   69.61 ms run,    2.73 ms python,   66.88 ms HIP,  603.31 loss, 0.003059 LR, 4.54 GB used,   9697.68 GFLOPS,    675.05 GOPS
317   69.45 ms run,    2.67 ms python,   66.78 ms HIP,  606.40 loss, 0.003054 LR, 4.54 GB used,   9720.29 GFLOPS,    675.05 GOPS
318   69.77 ms run,    2.67 ms python,   67.10 ms HIP,  599.24 loss, 0.003050 LR, 4.54 GB used,   9675.79 GFLOPS,    675.05 GOPS
319   69.22 ms run,    2.66 ms python,   66.56 ms HIP,  580.43 loss, 0.003046 LR, 4.54 GB used,   9752.39 GFLOPS,    675.05 GOPS
320   69.93 ms run,    2.67 ms python,   67.26 ms HIP,  605.67 loss, 0.003041 LR, 4.54 GB used,   9653.45 GFLOPS,    675.05 GOPS
321   69.24 ms run,    2.63 ms python,   66.61 ms HIP,  618.09 loss, 0.003037 LR, 4.54 GB used,   9748.92 GFLOPS,    675.05 GOPS
322   69.77 ms run,    2.69 ms python,   67.09 ms HIP,  624.88 loss, 0.003033 LR, 4.54 GB used,   9674.90 GFLOPS,    675.05 GOPS
323   69.21 ms run,    2.78 ms python,   66.43 ms HIP,  617.77 loss, 0.003028 LR, 4.54 GB used,   9754.27 GFLOPS,    675.05 GOPS
324   68.91 ms run,    2.65 ms python,   66.26 ms HIP,  599.90 loss, 0.003024 LR, 4.54 GB used,   9796.42 GFLOPS,    675.05 GOPS
325   69.33 ms run,    2.69 ms python,   66.64 ms HIP,  604.47 loss, 0.003020 LR, 4.54 GB used,   9736.84 GFLOPS,    675.05 GOPS
326   68.80 ms run,    2.65 ms python,   66.16 ms HIP,  616.35 loss, 0.003015 LR, 4.54 GB used,   9811.55 GFLOPS,    675.05 GOPS
327   69.53 ms run,    2.66 ms python,   66.87 ms HIP,  599.59 loss, 0.003011 LR, 4.54 GB used,   9708.45 GFLOPS,    675.05 GOPS
328   68.94 ms run,    2.67 ms python,   66.27 ms HIP,  615.25 loss, 0.003007 LR, 4.54 GB used,   9792.02 GFLOPS,    675.05 GOPS
329   69.34 ms run,    2.65 ms python,   66.69 ms HIP,  599.05 loss, 0.003002 LR, 4.54 GB used,   9735.76 GFLOPS,    675.05 GOPS
330   69.63 ms run,    2.63 ms python,   67.00 ms HIP,  623.61 loss, 0.002998 LR, 4.54 GB used,   9694.07 GFLOPS,    675.05 GOPS
331   69.41 ms run,    2.64 ms python,   66.77 ms HIP,  646.88 loss, 0.002994 LR, 4.54 GB used,   9725.43 GFLOPS,    675.05 GOPS
332   69.64 ms run,    2.74 ms python,   66.90 ms HIP,  604.86 loss, 0.002989 LR, 4.54 GB used,   9693.58 GFLOPS,    675.05 GOPS
333   69.29 ms run,    2.66 ms python,   66.63 ms HIP,  590.82 loss, 0.002985 LR, 4.54 GB used,   9742.17 GFLOPS,    675.05 GOPS
334   69.44 ms run,    2.68 ms python,   66.76 ms HIP,  600.96 loss, 0.002980 LR, 4.54 GB used,   9720.99 GFLOPS,    675.05 GOPS
335   69.36 ms run,    2.68 ms python,   66.68 ms HIP,  599.57 loss, 0.002976 LR, 4.54 GB used,   9732.55 GFLOPS,    675.05 GOPS
336   69.23 ms run,    2.66 ms python,   66.57 ms HIP,  600.89 loss, 0.002972 LR, 4.54 GB used,   9750.55 GFLOPS,    675.05 GOPS
337   69.57 ms run,    2.65 ms python,   66.92 ms HIP,  606.04 loss, 0.002967 LR, 4.54 GB used,   9702.82 GFLOPS,    675.05 GOPS
338   68.98 ms run,    2.71 ms python,   66.26 ms HIP,  600.90 loss, 0.002963 LR, 4.54 GB used,   9786.77 GFLOPS,    675.05 GOPS
339   68.97 ms run,    2.65 ms python,   66.32 ms HIP,  627.56 loss, 0.002959 LR, 4.54 GB used,   9787.82 GFLOPS,    675.05 GOPS
340   68.49 ms run,    2.68 ms python,   65.81 ms HIP,  616.45 loss, 0.002954 LR, 4.54 GB used,   9856.42 GFLOPS,    675.05 GOPS
341   68.23 ms run,    2.73 ms python,   65.50 ms HIP,  584.24 loss, 0.002950 LR, 4.54 GB used,   9893.86 GFLOPS,    675.05 GOPS
342   68.86 ms run,    2.66 ms python,   66.20 ms HIP,  603.59 loss, 0.002946 LR, 4.54 GB used,   9802.71 GFLOPS,    675.05 GOPS
343   69.03 ms run,    2.67 ms python,   66.36 ms HIP,  605.93 loss, 0.002941 LR, 4.54 GB used,   9778.69 GFLOPS,    675.05 GOPS
344   68.69 ms run,    2.68 ms python,   66.00 ms HIP,  593.60 loss, 0.002937 LR, 4.54 GB used,   9827.99 GFLOPS,    675.05 GOPS
345   69.16 ms run,    2.68 ms python,   66.49 ms HIP,  594.98 loss, 0.002933 LR, 4.54 GB used,   9760.08 GFLOPS,    675.05 GOPS
346   69.51 ms run,    2.71 ms python,   66.80 ms HIP,  576.23 loss, 0.002928 LR, 4.54 GB used,   9712.07 GFLOPS,    675.05 GOPS
347   69.89 ms run,    2.65 ms python,   67.24 ms HIP,  583.97 loss, 0.002924 LR, 4.54 GB used,   9658.74 GFLOPS,    675.05 GOPS
348   69.54 ms run,    2.66 ms python,   66.88 ms HIP,  606.75 loss, 0.002920 LR, 4.54 GB used,   9707.00 GFLOPS,    675.05 GOPS
349   69.49 ms run,    2.60 ms python,   66.88 ms HIP,  586.33 loss, 0.002915 LR, 4.54 GB used,   9714.95 GFLOPS,    675.05 GOPS
350   70.11 ms run,    2.65 ms python,   67.46 ms HIP,  623.88 loss, 0.002911 LR, 4.54 GB used,   9627.77 GFLOPS,    675.05 GOPS
351   69.72 ms run,    2.65 ms python,   67.07 ms HIP,  633.02 loss, 0.002906 LR, 4.54 GB used,   9681.56 GFLOPS,    675.05 GOPS
352   69.37 ms run,    2.68 ms python,   66.70 ms HIP,  594.73 loss, 0.002902 LR, 4.54 GB used,   9730.85 GFLOPS,    675.05 GOPS
353   69.62 ms run,    2.67 ms python,   66.95 ms HIP,  577.12 loss, 0.002898 LR, 4.54 GB used,   9695.89 GFLOPS,    675.05 GOPS
354   69.93 ms run,    2.66 ms python,   67.27 ms HIP,  617.79 loss, 0.002893 LR, 4.54 GB used,   9653.10 GFLOPS,    675.05 GOPS
355   69.53 ms run,    2.70 ms python,   66.82 ms HIP,  619.25 loss, 0.002889 LR, 4.54 GB used,   9709.34 GFLOPS,    675.05 GOPS
356   69.69 ms run,    2.68 ms python,   67.01 ms HIP,  604.83 loss, 0.002885 LR, 4.54 GB used,   9687.09 GFLOPS,    675.05 GOPS
357   69.60 ms run,    2.67 ms python,   66.93 ms HIP,  586.27 loss, 0.002880 LR, 4.54 GB used,   9699.40 GFLOPS,    675.05 GOPS
358   69.78 ms run,    2.69 ms python,   67.09 ms HIP,  608.11 loss, 0.002876 LR, 4.54 GB used,   9674.21 GFLOPS,    675.05 GOPS
359   69.36 ms run,    2.69 ms python,   66.67 ms HIP,  595.93 loss, 0.002872 LR, 4.54 GB used,   9732.25 GFLOPS,    675.05 GOPS
360   69.24 ms run,    2.70 ms python,   66.54 ms HIP,  618.43 loss, 0.002867 LR, 4.54 GB used,   9750.03 GFLOPS,    675.05 GOPS
361   68.86 ms run,    2.62 ms python,   66.23 ms HIP,  611.55 loss, 0.002863 LR, 4.54 GB used,   9803.30 GFLOPS,    675.05 GOPS
362   69.63 ms run,    2.66 ms python,   66.97 ms HIP,  605.26 loss, 0.002859 LR, 4.54 GB used,   9695.27 GFLOPS,    675.05 GOPS
363   69.32 ms run,    2.75 ms python,   66.56 ms HIP,  611.81 loss, 0.002854 LR, 4.54 GB used,   9738.57 GFLOPS,    675.05 GOPS
364   68.96 ms run,    2.68 ms python,   66.29 ms HIP,  607.24 loss, 0.002850 LR, 4.54 GB used,   9788.85 GFLOPS,    675.05 GOPS
365   69.30 ms run,    2.70 ms python,   66.60 ms HIP,  613.96 loss, 0.002846 LR, 4.54 GB used,   9740.94 GFLOPS,    675.05 GOPS
366   70.14 ms run,    2.64 ms python,   67.50 ms HIP,  616.48 loss, 0.002841 LR, 4.54 GB used,   9624.49 GFLOPS,    675.05 GOPS
367   69.24 ms run,    2.68 ms python,   66.56 ms HIP,  617.21 loss, 0.002837 LR, 4.54 GB used,   9749.79 GFLOPS,    675.05 GOPS
368   69.28 ms run,    2.68 ms python,   66.60 ms HIP,  601.58 loss, 0.002832 LR, 4.54 GB used,   9743.34 GFLOPS,    675.05 GOPS
369   69.19 ms run,    2.76 ms python,   66.43 ms HIP,  601.81 loss, 0.002828 LR, 4.54 GB used,   9756.90 GFLOPS,    675.05 GOPS
370   69.64 ms run,    2.72 ms python,   66.92 ms HIP,  608.82 loss, 0.002824 LR, 4.54 GB used,   9693.09 GFLOPS,    675.05 GOPS
371   69.56 ms run,    2.71 ms python,   66.84 ms HIP,  611.55 loss, 0.002819 LR, 4.54 GB used,   9705.04 GFLOPS,    675.05 GOPS
372   69.52 ms run,    2.69 ms python,   66.82 ms HIP,  589.56 loss, 0.002815 LR, 4.54 GB used,   9710.32 GFLOPS,    675.05 GOPS
373   69.17 ms run,    2.69 ms python,   66.48 ms HIP,  607.22 loss, 0.002811 LR, 4.54 GB used,   9759.81 GFLOPS,    675.05 GOPS
374   69.31 ms run,    2.72 ms python,   66.59 ms HIP,  605.37 loss, 0.002806 LR, 4.54 GB used,   9739.58 GFLOPS,    675.05 GOPS
375   69.51 ms run,    2.67 ms python,   66.83 ms HIP,  591.62 loss, 0.002802 LR, 4.54 GB used,   9711.56 GFLOPS,    675.05 GOPS
376   69.93 ms run,    2.65 ms python,   67.27 ms HIP,  599.20 loss, 0.002798 LR, 4.54 GB used,   9653.40 GFLOPS,    675.05 GOPS
377   69.33 ms run,    2.64 ms python,   66.68 ms HIP,  626.14 loss, 0.002793 LR, 4.54 GB used,   9737.21 GFLOPS,    675.05 GOPS
378   69.49 ms run,    2.65 ms python,   66.84 ms HIP,  600.17 loss, 0.002789 LR, 4.54 GB used,   9714.43 GFLOPS,    675.05 GOPS
379   70.17 ms run,    2.71 ms python,   67.46 ms HIP,  593.42 loss, 0.002785 LR, 4.54 GB used,   9620.51 GFLOPS,    675.05 GOPS
380   68.98 ms run,    2.64 ms python,   66.34 ms HIP,  606.59 loss, 0.002780 LR, 4.54 GB used,   9786.28 GFLOPS,    675.05 GOPS
381   68.92 ms run,    2.68 ms python,   66.24 ms HIP,  600.88 loss, 0.002776 LR, 4.54 GB used,   9795.10 GFLOPS,    675.05 GOPS
382   69.30 ms run,    2.65 ms python,   66.65 ms HIP,  581.13 loss, 0.002772 LR, 4.54 GB used,   9740.97 GFLOPS,    675.05 GOPS
383   69.86 ms run,    2.80 ms python,   67.06 ms HIP,  600.93 loss, 0.002767 LR, 4.54 GB used,   9663.49 GFLOPS,    675.05 GOPS
384   69.91 ms run,    2.66 ms python,   67.25 ms HIP,  644.06 loss, 0.002763 LR, 4.54 GB used,   9655.47 GFLOPS,    675.05 GOPS
385   71.23 ms run,    2.68 ms python,   68.55 ms HIP,  579.32 loss, 0.002758 LR, 4.54 GB used,   9476.84 GFLOPS,    675.05 GOPS
386   69.61 ms run,    2.57 ms python,   67.04 ms HIP,  606.70 loss, 0.002754 LR, 4.54 GB used,   9697.09 GFLOPS,    675.05 GOPS
387   69.46 ms run,    2.74 ms python,   66.72 ms HIP,  598.76 loss, 0.002750 LR, 4.54 GB used,   9718.86 GFLOPS,    675.05 GOPS
388   69.43 ms run,    2.63 ms python,   66.80 ms HIP,  593.29 loss, 0.002745 LR, 4.54 GB used,   9722.94 GFLOPS,    675.05 GOPS
389   69.34 ms run,    2.68 ms python,   66.65 ms HIP,  615.35 loss, 0.002741 LR, 4.54 GB used,   9735.92 GFLOPS,    675.05 GOPS
390   69.72 ms run,    2.62 ms python,   67.09 ms HIP,  584.42 loss, 0.002737 LR, 4.54 GB used,   9682.56 GFLOPS,    675.05 GOPS
391   69.64 ms run,    2.69 ms python,   66.94 ms HIP,  567.33 loss, 0.002732 LR, 4.54 GB used,   9693.97 GFLOPS,    675.05 GOPS
shuffling training dataset in 756.63 ms (epoch=4)
392  832.02 ms run,  759.77 ms python,   72.25 ms HIP,  582.36 loss, 0.002728 LR, 4.54 GB used,    811.69 GFLOPS,    675.34 GOPS
393   74.18 ms run,    2.79 ms python,   71.39 ms HIP,  568.34 loss, 0.002724 LR, 4.54 GB used,   9100.01 GFLOPS,    675.05 GOPS
394   72.54 ms run,    2.70 ms python,   69.84 ms HIP,  577.83 loss, 0.002719 LR, 4.54 GB used,   9306.23 GFLOPS,    675.05 GOPS
395   71.62 ms run,    2.70 ms python,   68.92 ms HIP,  585.91 loss, 0.002715 LR, 4.54 GB used,   9425.68 GFLOPS,    675.05 GOPS
396   70.43 ms run,    2.68 ms python,   67.76 ms HIP,  577.13 loss, 0.002711 LR, 4.54 GB used,   9584.01 GFLOPS,    675.05 GOPS
397   70.22 ms run,    2.66 ms python,   67.56 ms HIP,  569.58 loss, 0.002706 LR, 4.54 GB used,   9612.95 GFLOPS,    675.05 GOPS
398   69.91 ms run,    2.65 ms python,   67.27 ms HIP,  577.52 loss, 0.002702 LR, 4.54 GB used,   9655.27 GFLOPS,    675.05 GOPS
399   69.58 ms run,    2.76 ms python,   66.82 ms HIP,  612.04 loss, 0.002698 LR, 4.54 GB used,   9701.20 GFLOPS,    675.05 GOPS
400   69.40 ms run,    2.69 ms python,   66.72 ms HIP,  604.85 loss, 0.002693 LR, 4.54 GB used,   9726.28 GFLOPS,    675.05 GOPS
401   69.48 ms run,    2.75 ms python,   66.73 ms HIP,  596.98 loss, 0.002689 LR, 4.54 GB used,   9715.27 GFLOPS,    675.05 GOPS
402   69.22 ms run,    2.64 ms python,   66.58 ms HIP,  592.40 loss, 0.002684 LR, 4.54 GB used,   9752.47 GFLOPS,    675.05 GOPS
403   69.29 ms run,    2.69 ms python,   66.60 ms HIP,  606.83 loss, 0.002680 LR, 4.54 GB used,   9742.85 GFLOPS,    675.05 GOPS
404   68.71 ms run,    2.67 ms python,   66.05 ms HIP,  591.62 loss, 0.002676 LR, 4.54 GB used,   9824.14 GFLOPS,    675.05 GOPS
405   68.61 ms run,    2.69 ms python,   65.92 ms HIP,  583.69 loss, 0.002671 LR, 4.54 GB used,   9838.98 GFLOPS,    675.05 GOPS
406   69.00 ms run,    2.65 ms python,   66.36 ms HIP,  571.33 loss, 0.002667 LR, 4.54 GB used,   9782.59 GFLOPS,    675.05 GOPS
407   68.76 ms run,    2.65 ms python,   66.11 ms HIP,  588.53 loss, 0.002663 LR, 4.54 GB used,   9817.98 GFLOPS,    675.05 GOPS
408   69.14 ms run,    2.67 ms python,   66.46 ms HIP,  615.86 loss, 0.002658 LR, 4.54 GB used,   9764.08 GFLOPS,    675.05 GOPS
409   68.86 ms run,    2.69 ms python,   66.16 ms HIP,  592.83 loss, 0.002654 LR, 4.54 GB used,   9803.67 GFLOPS,    675.05 GOPS
410   68.96 ms run,    2.68 ms python,   66.28 ms HIP,  593.65 loss, 0.002650 LR, 4.54 GB used,   9788.92 GFLOPS,    675.05 GOPS
411   69.54 ms run,    2.68 ms python,   66.85 ms HIP,  581.44 loss, 0.002645 LR, 4.54 GB used,   9707.84 GFLOPS,    675.05 GOPS
412   69.68 ms run,    2.64 ms python,   67.04 ms HIP,  579.73 loss, 0.002641 LR, 4.54 GB used,   9688.36 GFLOPS,    675.05 GOPS
413   69.88 ms run,    2.64 ms python,   67.24 ms HIP,  581.51 loss, 0.002637 LR, 4.54 GB used,   9659.46 GFLOPS,    675.05 GOPS
414   69.52 ms run,    2.70 ms python,   66.82 ms HIP,  585.01 loss, 0.002632 LR, 4.54 GB used,   9710.34 GFLOPS,    675.05 GOPS
415   69.01 ms run,    2.76 ms python,   66.25 ms HIP,  589.84 loss, 0.002628 LR, 4.54 GB used,   9782.43 GFLOPS,    675.05 GOPS
416   68.36 ms run,    2.71 ms python,   65.65 ms HIP,  577.49 loss, 0.002624 LR, 4.54 GB used,   9875.12 GFLOPS,    675.05 GOPS
417   69.35 ms run,    2.67 ms python,   66.68 ms HIP,  576.82 loss, 0.002619 LR, 4.54 GB used,   9734.03 GFLOPS,    675.05 GOPS
418   69.35 ms run,    2.65 ms python,   66.70 ms HIP,  598.78 loss, 0.002615 LR, 4.54 GB used,   9734.31 GFLOPS,    675.05 GOPS
419   69.68 ms run,    2.66 ms python,   67.02 ms HIP,  582.48 loss, 0.002610 LR, 4.54 GB used,   9687.70 GFLOPS,    675.05 GOPS
420   70.64 ms run,    2.65 ms python,   67.99 ms HIP,  594.60 loss, 0.002606 LR, 4.54 GB used,   9555.55 GFLOPS,    675.05 GOPS
421   70.61 ms run,    2.72 ms python,   67.89 ms HIP,  592.59 loss, 0.002602 LR, 4.54 GB used,   9560.37 GFLOPS,    675.05 GOPS
422   70.40 ms run,    2.62 ms python,   67.77 ms HIP,  577.48 loss, 0.002597 LR, 4.54 GB used,   9589.23 GFLOPS,    675.05 GOPS
423   70.03 ms run,    2.72 ms python,   67.31 ms HIP,  617.56 loss, 0.002593 LR, 4.54 GB used,   9640.01 GFLOPS,    675.05 GOPS
424   70.42 ms run,    2.65 ms python,   67.78 ms HIP,  615.94 loss, 0.002589 LR, 4.54 GB used,   9585.44 GFLOPS,    675.05 GOPS
425   70.32 ms run,    2.64 ms python,   67.69 ms HIP,  603.24 loss, 0.002584 LR, 4.54 GB used,   9599.25 GFLOPS,    675.05 GOPS
426   69.76 ms run,    2.78 ms python,   66.98 ms HIP,  584.85 loss, 0.002580 LR, 4.54 GB used,   9676.22 GFLOPS,    675.05 GOPS
427   69.61 ms run,    2.65 ms python,   66.95 ms HIP,  592.30 loss, 0.002576 LR, 4.54 GB used,   9698.03 GFLOPS,    675.05 GOPS
428   69.93 ms run,    2.73 ms python,   67.19 ms HIP,  579.46 loss, 0.002571 LR, 4.54 GB used,   9653.71 GFLOPS,    675.05 GOPS
429   69.83 ms run,    2.66 ms python,   67.18 ms HIP,  595.40 loss, 0.002567 LR, 4.54 GB used,   9666.58 GFLOPS,    675.05 GOPS
430   69.72 ms run,    2.70 ms python,   67.01 ms HIP,  589.41 loss, 0.002563 LR, 4.54 GB used,   9682.79 GFLOPS,    675.05 GOPS
431   69.85 ms run,    2.63 ms python,   67.22 ms HIP,  589.12 loss, 0.002558 LR, 4.54 GB used,   9664.38 GFLOPS,    675.05 GOPS
432   69.36 ms run,    2.71 ms python,   66.65 ms HIP,  583.52 loss, 0.002554 LR, 4.54 GB used,   9732.81 GFLOPS,    675.05 GOPS
433   69.06 ms run,    2.63 ms python,   66.43 ms HIP,  601.16 loss, 0.002550 LR, 4.54 GB used,   9774.94 GFLOPS,    675.05 GOPS
434   69.68 ms run,    2.63 ms python,   67.05 ms HIP,  595.74 loss, 0.002545 LR, 4.54 GB used,   9688.07 GFLOPS,    675.05 GOPS
435   69.27 ms run,    2.65 ms python,   66.62 ms HIP,  578.42 loss, 0.002541 LR, 4.54 GB used,   9744.83 GFLOPS,    675.05 GOPS
436   69.72 ms run,    2.64 ms python,   67.08 ms HIP,  579.35 loss, 0.002536 LR, 4.54 GB used,   9681.54 GFLOPS,    675.05 GOPS
437   69.53 ms run,    2.69 ms python,   66.84 ms HIP,  575.27 loss, 0.002532 LR, 4.54 GB used,   9708.32 GFLOPS,    675.05 GOPS
438   69.31 ms run,    2.65 ms python,   66.66 ms HIP,  589.69 loss, 0.002528 LR, 4.54 GB used,   9738.86 GFLOPS,    675.05 GOPS
439   68.81 ms run,    2.70 ms python,   66.11 ms HIP,  577.37 loss, 0.002523 LR, 4.54 GB used,   9810.67 GFLOPS,    675.05 GOPS
440   69.26 ms run,    2.64 ms python,   66.62 ms HIP,  594.12 loss, 0.002519 LR, 4.54 GB used,   9746.05 GFLOPS,    675.05 GOPS
441   69.77 ms run,    2.69 ms python,   67.08 ms HIP,  601.86 loss, 0.002515 LR, 4.54 GB used,   9675.42 GFLOPS,    675.05 GOPS
442   69.41 ms run,    2.68 ms python,   66.73 ms HIP,  584.64 loss, 0.002510 LR, 4.54 GB used,   9725.30 GFLOPS,    675.05 GOPS
443   69.14 ms run,    2.74 ms python,   66.40 ms HIP,  593.23 loss, 0.002506 LR, 4.54 GB used,   9764.04 GFLOPS,    675.05 GOPS
444   68.74 ms run,    2.64 ms python,   66.11 ms HIP,  579.12 loss, 0.002502 LR, 4.54 GB used,   9820.21 GFLOPS,    675.05 GOPS
445   69.27 ms run,    2.63 ms python,   66.65 ms HIP,  575.39 loss, 0.002497 LR, 4.54 GB used,   9744.64 GFLOPS,    675.05 GOPS
446   69.20 ms run,    2.62 ms python,   66.58 ms HIP,  581.47 loss, 0.002493 LR, 4.54 GB used,   9755.36 GFLOPS,    675.05 GOPS
447   69.39 ms run,    2.66 ms python,   66.73 ms HIP,  580.66 loss, 0.002489 LR, 4.54 GB used,   9727.94 GFLOPS,    675.05 GOPS
448   69.30 ms run,    2.61 ms python,   66.69 ms HIP,  577.62 loss, 0.002484 LR, 4.54 GB used,   9740.59 GFLOPS,    675.05 GOPS
449   70.02 ms run,    2.73 ms python,   67.29 ms HIP,  586.47 loss, 0.002480 LR, 4.54 GB used,   9640.13 GFLOPS,    675.05 GOPS
450   69.28 ms run,    2.70 ms python,   66.58 ms HIP,  591.01 loss, 0.002476 LR, 4.54 GB used,   9743.76 GFLOPS,    675.05 GOPS
451   69.43 ms run,    2.75 ms python,   66.68 ms HIP,  603.39 loss, 0.002471 LR, 4.54 GB used,   9723.26 GFLOPS,    675.05 GOPS
452   68.91 ms run,    2.66 ms python,   66.25 ms HIP,  608.29 loss, 0.002467 LR, 4.54 GB used,   9796.44 GFLOPS,    675.05 GOPS
453   69.21 ms run,    2.68 ms python,   66.53 ms HIP,  589.10 loss, 0.002463 LR, 4.54 GB used,   9753.33 GFLOPS,    675.05 GOPS
454   68.97 ms run,    2.73 ms python,   66.25 ms HIP,  604.86 loss, 0.002458 LR, 4.54 GB used,   9786.83 GFLOPS,    675.05 GOPS
455   69.34 ms run,    2.64 ms python,   66.70 ms HIP,  586.66 loss, 0.002454 LR, 4.54 GB used,   9734.98 GFLOPS,    675.05 GOPS
456   69.24 ms run,    2.67 ms python,   66.56 ms HIP,  576.25 loss, 0.002449 LR, 4.54 GB used,   9749.74 GFLOPS,    675.05 GOPS
457   69.19 ms run,    2.67 ms python,   66.51 ms HIP,  589.74 loss, 0.002445 LR, 4.54 GB used,   9756.83 GFLOPS,    675.05 GOPS
458   68.60 ms run,    2.65 ms python,   65.96 ms HIP,  593.15 loss, 0.002441 LR, 4.54 GB used,   9839.60 GFLOPS,    675.05 GOPS
459   69.54 ms run,    2.71 ms python,   66.83 ms HIP,  582.82 loss, 0.002436 LR, 4.54 GB used,   9707.40 GFLOPS,    675.05 GOPS
460   69.08 ms run,    2.63 ms python,   66.45 ms HIP,  590.13 loss, 0.002432 LR, 4.54 GB used,   9771.37 GFLOPS,    675.05 GOPS
461   69.15 ms run,    2.74 ms python,   66.41 ms HIP,  586.29 loss, 0.002428 LR, 4.54 GB used,   9762.12 GFLOPS,    675.05 GOPS
462   69.12 ms run,    2.62 ms python,   66.50 ms HIP,  582.98 loss, 0.002423 LR, 4.54 GB used,   9766.12 GFLOPS,    675.05 GOPS
463   69.67 ms run,    2.64 ms python,   67.03 ms HIP,  565.42 loss, 0.002419 LR, 4.54 GB used,   9689.48 GFLOPS,    675.05 GOPS
464   69.61 ms run,    2.65 ms python,   66.96 ms HIP,  573.48 loss, 0.002415 LR, 4.54 GB used,   9697.12 GFLOPS,    675.05 GOPS
465   69.30 ms run,    2.72 ms python,   66.59 ms HIP,  593.76 loss, 0.002410 LR, 4.54 GB used,   9740.30 GFLOPS,    675.05 GOPS
466   69.44 ms run,    2.71 ms python,   66.74 ms HIP,  566.20 loss, 0.002406 LR, 4.54 GB used,   9720.68 GFLOPS,    675.05 GOPS
467   69.39 ms run,    2.69 ms python,   66.70 ms HIP,  581.88 loss, 0.002402 LR, 4.54 GB used,   9728.38 GFLOPS,    675.05 GOPS
468   69.24 ms run,    2.64 ms python,   66.60 ms HIP,  578.37 loss, 0.002397 LR, 4.54 GB used,   9749.12 GFLOPS,    675.05 GOPS
469   69.55 ms run,    2.65 ms python,   66.90 ms HIP,  585.02 loss, 0.002393 LR, 4.54 GB used,   9705.48 GFLOPS,    675.05 GOPS
470   69.91 ms run,    2.74 ms python,   67.17 ms HIP,  582.28 loss, 0.002389 LR, 4.54 GB used,   9656.55 GFLOPS,    675.05 GOPS
471   69.71 ms run,    2.64 ms python,   67.07 ms HIP,  585.25 loss, 0.002384 LR, 4.54 GB used,   9684.00 GFLOPS,    675.05 GOPS
472   69.15 ms run,    2.74 ms python,   66.41 ms HIP,  581.99 loss, 0.002380 LR, 4.54 GB used,   9761.66 GFLOPS,    675.05 GOPS
473   69.56 ms run,    2.66 ms python,   66.90 ms HIP,  577.38 loss, 0.002375 LR, 4.54 GB used,   9704.28 GFLOPS,    675.05 GOPS
474   70.51 ms run,    2.73 ms python,   67.78 ms HIP,  580.28 loss, 0.002371 LR, 4.54 GB used,   9574.16 GFLOPS,    675.05 GOPS
475   69.59 ms run,    2.70 ms python,   66.88 ms HIP,  586.50 loss, 0.002367 LR, 4.54 GB used,   9700.53 GFLOPS,    675.05 GOPS
476   69.72 ms run,    2.72 ms python,   67.00 ms HIP,  597.28 loss, 0.002362 LR, 4.54 GB used,   9682.55 GFLOPS,    675.05 GOPS
477   69.12 ms run,    2.66 ms python,   66.46 ms HIP,  589.34 loss, 0.002358 LR, 4.54 GB used,   9766.60 GFLOPS,    675.05 GOPS
478   69.86 ms run,    2.63 ms python,   67.23 ms HIP,  573.59 loss, 0.002354 LR, 4.54 GB used,   9662.94 GFLOPS,    675.05 GOPS
479   69.15 ms run,    2.71 ms python,   66.44 ms HIP,  562.75 loss, 0.002349 LR, 4.54 GB used,   9761.58 GFLOPS,    675.05 GOPS
480   69.32 ms run,    2.71 ms python,   66.61 ms HIP,  593.10 loss, 0.002345 LR, 4.54 GB used,   9737.73 GFLOPS,    675.05 GOPS
481   69.49 ms run,    2.72 ms python,   66.77 ms HIP,  579.93 loss, 0.002341 LR, 4.54 GB used,   9714.20 GFLOPS,    675.05 GOPS
482   70.00 ms run,    2.64 ms python,   67.36 ms HIP,  590.04 loss, 0.002336 LR, 4.54 GB used,   9643.66 GFLOPS,    675.05 GOPS
483   68.91 ms run,    2.71 ms python,   66.20 ms HIP,  584.13 loss, 0.002332 LR, 4.54 GB used,   9795.87 GFLOPS,    675.05 GOPS
484   69.08 ms run,    2.64 ms python,   66.45 ms HIP,  594.45 loss, 0.002328 LR, 4.54 GB used,   9771.60 GFLOPS,    675.05 GOPS
485   69.07 ms run,    2.65 ms python,   66.42 ms HIP,  598.35 loss, 0.002323 LR, 4.54 GB used,   9772.80 GFLOPS,    675.05 GOPS
486   68.98 ms run,    2.66 ms python,   66.31 ms HIP,  566.22 loss, 0.002319 LR, 4.54 GB used,   9786.77 GFLOPS,    675.05 GOPS
487   69.25 ms run,    2.67 ms python,   66.58 ms HIP,  571.19 loss, 0.002315 LR, 4.54 GB used,   9747.93 GFLOPS,    675.05 GOPS
488   70.21 ms run,    2.62 ms python,   67.59 ms HIP,  581.04 loss, 0.002310 LR, 4.54 GB used,   9614.92 GFLOPS,    675.05 GOPS
489   69.87 ms run,    2.67 ms python,   67.20 ms HIP,  553.44 loss, 0.002306 LR, 4.54 GB used,   9660.97 GFLOPS,    675.05 GOPS
shuffling training dataset in 753.43 ms (epoch=5)
490  828.96 ms run,  756.58 ms python,   72.39 ms HIP,  560.60 loss, 0.002301 LR, 4.54 GB used,    814.68 GFLOPS,    675.34 GOPS
491   73.86 ms run,    2.78 ms python,   71.08 ms HIP,  548.40 loss, 0.002297 LR, 4.54 GB used,   9139.33 GFLOPS,    675.05 GOPS
492   72.50 ms run,    2.68 ms python,   69.82 ms HIP,  559.04 loss, 0.002293 LR, 4.54 GB used,   9310.86 GFLOPS,    675.05 GOPS
493   71.02 ms run,    2.68 ms python,   68.34 ms HIP,  559.92 loss, 0.002288 LR, 4.54 GB used,   9504.53 GFLOPS,    675.05 GOPS
494   71.47 ms run,    2.72 ms python,   68.75 ms HIP,  566.71 loss, 0.002284 LR, 4.54 GB used,   9445.23 GFLOPS,    675.05 GOPS
495   70.28 ms run,    2.71 ms python,   67.57 ms HIP,  566.45 loss, 0.002280 LR, 4.54 GB used,   9605.55 GFLOPS,    675.05 GOPS
496   70.03 ms run,    2.65 ms python,   67.38 ms HIP,  573.89 loss, 0.002275 LR, 4.54 GB used,   9639.65 GFLOPS,    675.05 GOPS
497   70.01 ms run,    2.64 ms python,   67.37 ms HIP,  571.42 loss, 0.002271 LR, 4.54 GB used,   9641.88 GFLOPS,    675.05 GOPS
498   69.38 ms run,    2.66 ms python,   66.72 ms HIP,  562.14 loss, 0.002267 LR, 4.54 GB used,   9729.92 GFLOPS,    675.05 GOPS
499   69.61 ms run,    2.75 ms python,   66.86 ms HIP,  551.55 loss, 0.002262 LR, 4.54 GB used,   9697.79 GFLOPS,    675.05 GOPS
500   69.50 ms run,    2.65 ms python,   66.84 ms HIP,  557.15 loss, 0.002258 LR, 4.54 GB used,   9713.18 GFLOPS,    675.05 GOPS
501   69.49 ms run,    2.65 ms python,   66.83 ms HIP,  591.58 loss, 0.002254 LR, 4.54 GB used,   9714.97 GFLOPS,    675.05 GOPS
502   69.68 ms run,    2.66 ms python,   67.02 ms HIP,  603.83 loss, 0.002249 LR, 4.54 GB used,   9688.20 GFLOPS,    675.05 GOPS
503   69.32 ms run,    2.71 ms python,   66.61 ms HIP,  562.22 loss, 0.002245 LR, 4.54 GB used,   9737.52 GFLOPS,    675.05 GOPS
504   69.64 ms run,    2.64 ms python,   67.01 ms HIP,  580.17 loss, 0.002241 LR, 4.54 GB used,   9692.68 GFLOPS,    675.05 GOPS
505   69.34 ms run,    2.67 ms python,   66.67 ms HIP,  566.71 loss, 0.002236 LR, 4.54 GB used,   9735.38 GFLOPS,    675.05 GOPS
506   69.94 ms run,    2.63 ms python,   67.31 ms HIP,  591.87 loss, 0.002232 LR, 4.54 GB used,   9651.92 GFLOPS,    675.05 GOPS
507   69.58 ms run,    2.64 ms python,   66.94 ms HIP,  566.65 loss, 0.002227 LR, 4.54 GB used,   9701.37 GFLOPS,    675.05 GOPS
508   68.80 ms run,    2.64 ms python,   66.15 ms HIP,  569.30 loss, 0.002223 LR, 4.54 GB used,   9811.99 GFLOPS,    675.05 GOPS
509   69.45 ms run,    2.64 ms python,   66.82 ms HIP,  554.92 loss, 0.002219 LR, 4.54 GB used,   9719.59 GFLOPS,    675.05 GOPS
510   68.93 ms run,    2.62 ms python,   66.30 ms HIP,  593.63 loss, 0.002214 LR, 4.54 GB used,   9793.74 GFLOPS,    675.05 GOPS
511   69.09 ms run,    2.73 ms python,   66.36 ms HIP,  574.09 loss, 0.002210 LR, 4.54 GB used,   9770.91 GFLOPS,    675.05 GOPS
512   69.08 ms run,    2.71 ms python,   66.37 ms HIP,  558.48 loss, 0.002206 LR, 4.54 GB used,   9771.45 GFLOPS,    675.05 GOPS
513   69.20 ms run,    2.66 ms python,   66.54 ms HIP,  571.87 loss, 0.002201 LR, 4.54 GB used,   9754.31 GFLOPS,    675.05 GOPS
514   69.76 ms run,    2.68 ms python,   67.08 ms HIP,  566.25 loss, 0.002197 LR, 4.54 GB used,   9676.61 GFLOPS,    675.05 GOPS
515   69.53 ms run,    2.67 ms python,   66.86 ms HIP,  575.40 loss, 0.002193 LR, 4.54 GB used,   9708.89 GFLOPS,    675.05 GOPS
516   69.52 ms run,    2.64 ms python,   66.87 ms HIP,  585.70 loss, 0.002188 LR, 4.54 GB used,   9710.32 GFLOPS,    675.05 GOPS
517   69.81 ms run,    2.66 ms python,   67.16 ms HIP,  575.09 loss, 0.002184 LR, 4.54 GB used,   9669.11 GFLOPS,    675.05 GOPS
518   69.25 ms run,    2.65 ms python,   66.60 ms HIP,  577.65 loss, 0.002180 LR, 4.54 GB used,   9748.00 GFLOPS,    675.05 GOPS
519   69.38 ms run,    2.64 ms python,   66.74 ms HIP,  562.22 loss, 0.002175 LR, 4.54 GB used,   9729.67 GFLOPS,    675.05 GOPS
520   69.59 ms run,    2.66 ms python,   66.92 ms HIP,  588.31 loss, 0.002171 LR, 4.54 GB used,   9700.68 GFLOPS,    675.05 GOPS
521   69.61 ms run,    2.68 ms python,   66.92 ms HIP,  563.62 loss, 0.002167 LR, 4.54 GB used,   9697.95 GFLOPS,    675.05 GOPS
522   69.46 ms run,    2.64 ms python,   66.81 ms HIP,  575.65 loss, 0.002162 LR, 4.54 GB used,   9718.78 GFLOPS,    675.05 GOPS
523   69.85 ms run,    2.63 ms python,   67.22 ms HIP,  582.18 loss, 0.002158 LR, 4.54 GB used,   9664.21 GFLOPS,    675.05 GOPS
524   69.46 ms run,    2.60 ms python,   66.86 ms HIP,  574.66 loss, 0.002153 LR, 4.54 GB used,   9718.70 GFLOPS,    675.05 GOPS
525   69.56 ms run,    2.65 ms python,   66.91 ms HIP,  586.75 loss, 0.002149 LR, 4.54 GB used,   9704.57 GFLOPS,    675.05 GOPS
526   69.75 ms run,    2.71 ms python,   67.05 ms HIP,  586.70 loss, 0.002145 LR, 4.54 GB used,   9677.48 GFLOPS,    675.05 GOPS
527   69.51 ms run,    2.71 ms python,   66.80 ms HIP,  571.59 loss, 0.002140 LR, 4.54 GB used,   9711.94 GFLOPS,    675.05 GOPS
528   69.46 ms run,    2.62 ms python,   66.84 ms HIP,  564.29 loss, 0.002136 LR, 4.54 GB used,   9718.11 GFLOPS,    675.05 GOPS
529   69.58 ms run,    2.65 ms python,   66.93 ms HIP,  572.51 loss, 0.002132 LR, 4.54 GB used,   9701.15 GFLOPS,    675.05 GOPS
530   69.35 ms run,    2.66 ms python,   66.69 ms HIP,  570.33 loss, 0.002127 LR, 4.54 GB used,   9733.48 GFLOPS,    675.05 GOPS
531   69.59 ms run,    2.66 ms python,   66.93 ms HIP,  566.56 loss, 0.002123 LR, 4.54 GB used,   9699.89 GFLOPS,    675.05 GOPS
532   69.29 ms run,    2.65 ms python,   66.64 ms HIP,  581.82 loss, 0.002119 LR, 4.54 GB used,   9742.58 GFLOPS,    675.05 GOPS
533   69.48 ms run,    2.66 ms python,   66.82 ms HIP,  561.05 loss, 0.002114 LR, 4.54 GB used,   9715.61 GFLOPS,    675.05 GOPS
534   69.53 ms run,    2.65 ms python,   66.89 ms HIP,  594.77 loss, 0.002110 LR, 4.54 GB used,   9708.01 GFLOPS,    675.05 GOPS
535   69.76 ms run,    2.72 ms python,   67.04 ms HIP,  558.92 loss, 0.002106 LR, 4.54 GB used,   9676.33 GFLOPS,    675.05 GOPS
536   69.24 ms run,    2.67 ms python,   66.57 ms HIP,  586.19 loss, 0.002101 LR, 4.54 GB used,   9748.85 GFLOPS,    675.05 GOPS
537   69.22 ms run,    2.71 ms python,   66.51 ms HIP,  578.50 loss, 0.002097 LR, 4.54 GB used,   9752.34 GFLOPS,    675.05 GOPS
538   69.18 ms run,    2.65 ms python,   66.54 ms HIP,  572.07 loss, 0.002093 LR, 4.54 GB used,   9757.39 GFLOPS,    675.05 GOPS
539   69.19 ms run,    2.60 ms python,   66.60 ms HIP,  565.12 loss, 0.002088 LR, 4.54 GB used,   9756.18 GFLOPS,    675.05 GOPS
540   68.98 ms run,    2.59 ms python,   66.39 ms HIP,  562.70 loss, 0.002084 LR, 4.54 GB used,   9785.63 GFLOPS,    675.05 GOPS
541   69.21 ms run,    2.62 ms python,   66.59 ms HIP,  579.45 loss, 0.002079 LR, 4.54 GB used,   9752.89 GFLOPS,    675.05 GOPS
542   68.45 ms run,    2.61 ms python,   65.84 ms HIP,  572.89 loss, 0.002075 LR, 4.54 GB used,   9861.98 GFLOPS,    675.05 GOPS
543   70.01 ms run,    2.66 ms python,   67.35 ms HIP,  569.13 loss, 0.002071 LR, 4.54 GB used,   9641.54 GFLOPS,    675.05 GOPS
544   69.05 ms run,    2.64 ms python,   66.42 ms HIP,  569.87 loss, 0.002066 LR, 4.54 GB used,   9775.58 GFLOPS,    675.05 GOPS
545   69.32 ms run,    2.65 ms python,   66.67 ms HIP,  555.79 loss, 0.002062 LR, 4.54 GB used,   9738.47 GFLOPS,    675.05 GOPS
546   69.17 ms run,    2.61 ms python,   66.56 ms HIP,  558.11 loss, 0.002058 LR, 4.54 GB used,   9759.22 GFLOPS,    675.05 GOPS
547   69.30 ms run,    2.74 ms python,   66.56 ms HIP,  557.49 loss, 0.002053 LR, 4.54 GB used,   9740.97 GFLOPS,    675.05 GOPS
548   69.23 ms run,    2.66 ms python,   66.57 ms HIP,  569.55 loss, 0.002049 LR, 4.54 GB used,   9750.78 GFLOPS,    675.05 GOPS
549   69.23 ms run,    2.66 ms python,   66.57 ms HIP,  564.51 loss, 0.002045 LR, 4.54 GB used,   9750.81 GFLOPS,    675.05 GOPS
550   69.27 ms run,    2.67 ms python,   66.60 ms HIP,  564.71 loss, 0.002040 LR, 4.54 GB used,   9744.46 GFLOPS,    675.05 GOPS
551   68.79 ms run,    2.65 ms python,   66.15 ms HIP,  571.34 loss, 0.002036 LR, 4.54 GB used,   9812.75 GFLOPS,    675.05 GOPS
552   69.87 ms run,    2.65 ms python,   67.22 ms HIP,  560.34 loss, 0.002032 LR, 4.54 GB used,   9661.72 GFLOPS,    675.05 GOPS
553   69.18 ms run,    2.64 ms python,   66.54 ms HIP,  581.30 loss, 0.002027 LR, 4.54 GB used,   9757.62 GFLOPS,    675.05 GOPS
554   69.23 ms run,    2.63 ms python,   66.60 ms HIP,  567.74 loss, 0.002023 LR, 4.54 GB used,   9751.29 GFLOPS,    675.05 GOPS
555   69.11 ms run,    2.69 ms python,   66.42 ms HIP,  565.75 loss, 0.002019 LR, 4.54 GB used,   9767.84 GFLOPS,    675.05 GOPS
556   69.28 ms run,    2.61 ms python,   66.67 ms HIP,  560.14 loss, 0.002014 LR, 4.54 GB used,   9743.20 GFLOPS,    675.05 GOPS
557   69.16 ms run,    2.64 ms python,   66.52 ms HIP,  566.14 loss, 0.002010 LR, 4.54 GB used,   9761.05 GFLOPS,    675.05 GOPS
558   69.59 ms run,    2.67 ms python,   66.93 ms HIP,  576.05 loss, 0.002005 LR, 4.54 GB used,   9699.90 GFLOPS,    675.05 GOPS
559   69.34 ms run,    2.65 ms python,   66.70 ms HIP,  568.57 loss, 0.002001 LR, 4.54 GB used,   9734.75 GFLOPS,    675.05 GOPS
560   68.96 ms run,    2.62 ms python,   66.34 ms HIP,  547.41 loss, 0.001997 LR, 4.54 GB used,   9788.71 GFLOPS,    675.05 GOPS
561   69.55 ms run,    2.62 ms python,   66.93 ms HIP,  575.74 loss, 0.001992 LR, 4.54 GB used,   9705.91 GFLOPS,    675.05 GOPS
562   69.27 ms run,    2.67 ms python,   66.60 ms HIP,  568.30 loss, 0.001988 LR, 4.54 GB used,   9745.32 GFLOPS,    675.05 GOPS
563   70.53 ms run,    2.65 ms python,   67.88 ms HIP,  573.20 loss, 0.001984 LR, 4.54 GB used,   9570.72 GFLOPS,    675.05 GOPS
564   69.40 ms run,    2.70 ms python,   66.70 ms HIP,  569.90 loss, 0.001979 LR, 4.54 GB used,   9727.09 GFLOPS,    675.05 GOPS
565   68.97 ms run,    2.67 ms python,   66.30 ms HIP,  584.92 loss, 0.001975 LR, 4.54 GB used,   9786.93 GFLOPS,    675.05 GOPS
566   69.16 ms run,    2.63 ms python,   66.53 ms HIP,  574.28 loss, 0.001971 LR, 4.54 GB used,   9760.44 GFLOPS,    675.05 GOPS
567   69.56 ms run,    2.67 ms python,   66.89 ms HIP,  574.55 loss, 0.001966 LR, 4.54 GB used,   9704.27 GFLOPS,    675.05 GOPS
568   69.87 ms run,    2.70 ms python,   67.17 ms HIP,  576.63 loss, 0.001962 LR, 4.54 GB used,   9661.82 GFLOPS,    675.05 GOPS
569   69.29 ms run,    2.72 ms python,   66.57 ms HIP,  583.46 loss, 0.001958 LR, 4.54 GB used,   9742.41 GFLOPS,    675.05 GOPS
570   69.35 ms run,    2.66 ms python,   66.69 ms HIP,  576.70 loss, 0.001953 LR, 4.54 GB used,   9733.60 GFLOPS,    675.05 GOPS
571   69.36 ms run,    2.64 ms python,   66.72 ms HIP,  572.55 loss, 0.001949 LR, 4.54 GB used,   9732.30 GFLOPS,    675.05 GOPS
572   69.22 ms run,    2.64 ms python,   66.57 ms HIP,  561.28 loss, 0.001945 LR, 4.54 GB used,   9752.43 GFLOPS,    675.05 GOPS
573   69.22 ms run,    2.67 ms python,   66.55 ms HIP,  556.27 loss, 0.001940 LR, 4.54 GB used,   9752.26 GFLOPS,    675.05 GOPS
574   69.27 ms run,    2.64 ms python,   66.63 ms HIP,  564.08 loss, 0.001936 LR, 4.54 GB used,   9744.94 GFLOPS,    675.05 GOPS
575   70.44 ms run,    2.62 ms python,   67.82 ms HIP,  553.18 loss, 0.001931 LR, 4.54 GB used,   9582.95 GFLOPS,    675.05 GOPS
576   70.21 ms run,    2.73 ms python,   67.48 ms HIP,  559.14 loss, 0.001927 LR, 4.54 GB used,   9614.34 GFLOPS,    675.05 GOPS
577   69.58 ms run,    2.64 ms python,   66.94 ms HIP,  573.60 loss, 0.001923 LR, 4.54 GB used,   9701.77 GFLOPS,    675.05 GOPS
578   69.91 ms run,    2.61 ms python,   67.29 ms HIP,  575.48 loss, 0.001918 LR, 4.54 GB used,   9656.14 GFLOPS,    675.05 GOPS
579   69.53 ms run,    2.63 ms python,   66.91 ms HIP,  573.98 loss, 0.001914 LR, 4.54 GB used,   9708.07 GFLOPS,    675.05 GOPS
580   69.42 ms run,    2.76 ms python,   66.65 ms HIP,  574.23 loss, 0.001910 LR, 4.54 GB used,   9724.32 GFLOPS,    675.05 GOPS
581   69.01 ms run,    2.64 ms python,   66.37 ms HIP,  569.52 loss, 0.001905 LR, 4.54 GB used,   9781.98 GFLOPS,    675.05 GOPS
582   69.88 ms run,    2.66 ms python,   67.21 ms HIP,  562.67 loss, 0.001901 LR, 4.54 GB used,   9660.34 GFLOPS,    675.05 GOPS
583   69.47 ms run,    2.64 ms python,   66.82 ms HIP,  548.51 loss, 0.001897 LR, 4.54 GB used,   9717.55 GFLOPS,    675.05 GOPS
584   69.13 ms run,    2.64 ms python,   66.50 ms HIP,  568.97 loss, 0.001892 LR, 4.54 GB used,   9764.27 GFLOPS,    675.05 GOPS
585   69.10 ms run,    2.63 ms python,   66.48 ms HIP,  565.35 loss, 0.001888 LR, 4.54 GB used,   9768.65 GFLOPS,    675.05 GOPS
586   68.92 ms run,    2.62 ms python,   66.29 ms HIP,  578.32 loss, 0.001884 LR, 4.54 GB used,   9795.32 GFLOPS,    675.05 GOPS
587   69.53 ms run,    2.64 ms python,   66.89 ms HIP,  538.04 loss, 0.001879 LR, 4.54 GB used,   9708.32 GFLOPS,    675.05 GOPS
shuffling training dataset in 1532.30 ms (epoch=6)
588 1607.52 ms run, 1535.42 ms python,   72.11 ms HIP,  571.49 loss, 0.001875 LR, 4.54 GB used,    420.28 GFLOPS,    675.61 GOPS
589   73.93 ms run,    2.84 ms python,   71.09 ms HIP,  564.70 loss, 0.001871 LR, 4.54 GB used,   9130.80 GFLOPS,    675.05 GOPS
590   72.13 ms run,    2.73 ms python,   69.41 ms HIP,  570.04 loss, 0.001866 LR, 4.54 GB used,   9358.34 GFLOPS,    675.05 GOPS
591   71.02 ms run,    2.75 ms python,   68.28 ms HIP,  569.78 loss, 0.001862 LR, 4.54 GB used,   9504.48 GFLOPS,    675.05 GOPS
592   70.83 ms run,    3.07 ms python,   67.76 ms HIP,  578.29 loss, 0.001857 LR, 4.54 GB used,   9531.00 GFLOPS,    675.05 GOPS
593   69.67 ms run,    2.67 ms python,   67.00 ms HIP,  573.34 loss, 0.001853 LR, 4.54 GB used,   9689.20 GFLOPS,    675.05 GOPS
594   69.52 ms run,    2.66 ms python,   66.86 ms HIP,  578.10 loss, 0.001849 LR, 4.54 GB used,   9710.61 GFLOPS,    675.05 GOPS
595   69.88 ms run,    2.70 ms python,   67.19 ms HIP,  591.64 loss, 0.001844 LR, 4.54 GB used,   9659.64 GFLOPS,    675.05 GOPS
596   69.11 ms run,    2.66 ms python,   66.45 ms HIP,  603.30 loss, 0.001840 LR, 4.54 GB used,   9767.51 GFLOPS,    675.05 GOPS
597   68.57 ms run,    2.68 ms python,   65.89 ms HIP,  567.99 loss, 0.001836 LR, 4.54 GB used,   9844.15 GFLOPS,    675.05 GOPS
598   69.68 ms run,    2.68 ms python,   67.00 ms HIP,  575.65 loss, 0.001831 LR, 4.54 GB used,   9687.80 GFLOPS,    675.05 GOPS
599   68.71 ms run,    2.64 ms python,   66.07 ms HIP,  570.96 loss, 0.001827 LR, 4.54 GB used,   9825.15 GFLOPS,    675.05 GOPS
600   68.25 ms run,    2.65 ms python,   65.60 ms HIP,  557.41 loss, 0.001823 LR, 4.54 GB used,   9890.67 GFLOPS,    675.05 GOPS
601   69.30 ms run,    2.66 ms python,   66.64 ms HIP,  571.70 loss, 0.001818 LR, 4.54 GB used,   9740.51 GFLOPS,    675.05 GOPS
602   69.25 ms run,    2.63 ms python,   66.61 ms HIP,  569.83 loss, 0.001814 LR, 4.54 GB used,   9748.63 GFLOPS,    675.05 GOPS
603   69.96 ms run,    2.64 ms python,   67.32 ms HIP,  571.17 loss, 0.001810 LR, 4.54 GB used,   9648.41 GFLOPS,    675.05 GOPS
604   69.57 ms run,    2.68 ms python,   66.89 ms HIP,  575.71 loss, 0.001805 LR, 4.54 GB used,   9702.87 GFLOPS,    675.05 GOPS
605   68.99 ms run,    2.62 ms python,   66.37 ms HIP,  577.77 loss, 0.001801 LR, 4.54 GB used,   9784.19 GFLOPS,    675.05 GOPS
606   68.93 ms run,    2.64 ms python,   66.29 ms HIP,  569.05 loss, 0.001797 LR, 4.54 GB used,   9793.24 GFLOPS,    675.05 GOPS
607   69.57 ms run,    2.66 ms python,   66.91 ms HIP,  573.34 loss, 0.001792 LR, 4.54 GB used,   9703.31 GFLOPS,    675.05 GOPS
608   69.33 ms run,    2.66 ms python,   66.67 ms HIP,  570.30 loss, 0.001788 LR, 4.54 GB used,   9737.12 GFLOPS,    675.05 GOPS
609   69.04 ms run,    2.66 ms python,   66.38 ms HIP,  570.39 loss, 0.001783 LR, 4.54 GB used,   9777.89 GFLOPS,    675.05 GOPS
610   69.05 ms run,    2.68 ms python,   66.37 ms HIP,  566.33 loss, 0.001779 LR, 4.54 GB used,   9776.76 GFLOPS,    675.05 GOPS
611   69.30 ms run,    2.63 ms python,   66.66 ms HIP,  566.64 loss, 0.001775 LR, 4.54 GB used,   9741.48 GFLOPS,    675.05 GOPS
612   68.96 ms run,    2.61 ms python,   66.34 ms HIP,  569.52 loss, 0.001770 LR, 4.54 GB used,   9789.49 GFLOPS,    675.05 GOPS
613   68.96 ms run,    2.62 ms python,   66.34 ms HIP,  581.94 loss, 0.001766 LR, 4.54 GB used,   9788.75 GFLOPS,    675.05 GOPS
614   69.12 ms run,    2.64 ms python,   66.48 ms HIP,  578.21 loss, 0.001762 LR, 4.54 GB used,   9766.66 GFLOPS,    675.05 GOPS
615   69.22 ms run,    2.68 ms python,   66.54 ms HIP,  577.52 loss, 0.001757 LR, 4.54 GB used,   9751.63 GFLOPS,    675.05 GOPS
616   69.22 ms run,    2.65 ms python,   66.57 ms HIP,  573.82 loss, 0.001753 LR, 4.54 GB used,   9752.63 GFLOPS,    675.05 GOPS
617   68.98 ms run,    2.65 ms python,   66.33 ms HIP,  570.04 loss, 0.001749 LR, 4.54 GB used,   9785.64 GFLOPS,    675.05 GOPS
618   68.79 ms run,    2.63 ms python,   66.16 ms HIP,  569.29 loss, 0.001744 LR, 4.54 GB used,   9813.12 GFLOPS,    675.05 GOPS
619   69.47 ms run,    2.62 ms python,   66.86 ms HIP,  587.24 loss, 0.001740 LR, 4.54 GB used,   9716.59 GFLOPS,    675.05 GOPS
620   69.73 ms run,    2.65 ms python,   67.07 ms HIP,  568.84 loss, 0.001736 LR, 4.54 GB used,   9681.47 GFLOPS,    675.05 GOPS
621   69.34 ms run,    2.65 ms python,   66.70 ms HIP,  566.43 loss, 0.001731 LR, 4.54 GB used,   9734.93 GFLOPS,    675.05 GOPS
622   69.13 ms run,    2.63 ms python,   66.50 ms HIP,  569.45 loss, 0.001727 LR, 4.54 GB used,   9764.84 GFLOPS,    675.05 GOPS
623   68.65 ms run,    2.63 ms python,   66.02 ms HIP,  581.13 loss, 0.001723 LR, 4.54 GB used,   9832.96 GFLOPS,    675.05 GOPS
624   69.20 ms run,    2.67 ms python,   66.54 ms HIP,  564.13 loss, 0.001718 LR, 4.54 GB used,   9754.55 GFLOPS,    675.05 GOPS
625   68.98 ms run,    2.64 ms python,   66.34 ms HIP,  573.42 loss, 0.001714 LR, 4.54 GB used,   9786.71 GFLOPS,    675.05 GOPS
626   69.54 ms run,    2.72 ms python,   66.82 ms HIP,  563.98 loss, 0.001709 LR, 4.54 GB used,   9707.19 GFLOPS,    675.05 GOPS
627   69.17 ms run,    2.61 ms python,   66.56 ms HIP,  569.14 loss, 0.001705 LR, 4.54 GB used,   9758.61 GFLOPS,    675.05 GOPS
628   69.14 ms run,    2.64 ms python,   66.50 ms HIP,  567.16 loss, 0.001701 LR, 4.54 GB used,   9763.47 GFLOPS,    675.05 GOPS
629   69.67 ms run,    2.62 ms python,   67.04 ms HIP,  573.60 loss, 0.001696 LR, 4.54 GB used,   9689.26 GFLOPS,    675.05 GOPS
630   70.03 ms run,    2.68 ms python,   67.34 ms HIP,  562.19 loss, 0.001692 LR, 4.54 GB used,   9639.76 GFLOPS,    675.05 GOPS
631   70.27 ms run,    2.70 ms python,   67.58 ms HIP,  575.71 loss, 0.001688 LR, 4.54 GB used,   9605.96 GFLOPS,    675.05 GOPS
632   69.10 ms run,    2.68 ms python,   66.41 ms HIP,  570.74 loss, 0.001683 LR, 4.54 GB used,   9769.67 GFLOPS,    675.05 GOPS
633   69.40 ms run,    2.66 ms python,   66.74 ms HIP,  567.73 loss, 0.001679 LR, 4.54 GB used,   9727.11 GFLOPS,    675.05 GOPS
634   69.35 ms run,    2.60 ms python,   66.75 ms HIP,  568.36 loss, 0.001675 LR, 4.54 GB used,   9734.21 GFLOPS,    675.05 GOPS
635   69.23 ms run,    2.63 ms python,   66.60 ms HIP,  586.04 loss, 0.001670 LR, 4.54 GB used,   9751.31 GFLOPS,    675.05 GOPS
636   69.23 ms run,    2.66 ms python,   66.57 ms HIP,  574.17 loss, 0.001666 LR, 4.54 GB used,   9751.14 GFLOPS,    675.05 GOPS
637   69.20 ms run,    2.67 ms python,   66.54 ms HIP,  579.93 loss, 0.001662 LR, 4.54 GB used,   9754.30 GFLOPS,    675.05 GOPS
638   69.40 ms run,    2.72 ms python,   66.68 ms HIP,  560.20 loss, 0.001657 LR, 4.54 GB used,   9727.38 GFLOPS,    675.05 GOPS
639   69.54 ms run,    2.66 ms python,   66.88 ms HIP,  576.22 loss, 0.001653 LR, 4.54 GB used,   9707.03 GFLOPS,    675.05 GOPS
640   70.02 ms run,    2.63 ms python,   67.39 ms HIP,  583.38 loss, 0.001649 LR, 4.54 GB used,   9640.45 GFLOPS,    675.05 GOPS
641   69.67 ms run,    2.70 ms python,   66.96 ms HIP,  563.03 loss, 0.001644 LR, 4.54 GB used,   9689.69 GFLOPS,    675.05 GOPS
642   69.39 ms run,    2.65 ms python,   66.74 ms HIP,  577.68 loss, 0.001640 LR, 4.54 GB used,   9727.86 GFLOPS,    675.05 GOPS
643   69.51 ms run,    2.60 ms python,   66.91 ms HIP,  561.45 loss, 0.001635 LR, 4.54 GB used,   9710.91 GFLOPS,    675.05 GOPS
644   68.99 ms run,    2.65 ms python,   66.34 ms HIP,  580.14 loss, 0.001631 LR, 4.54 GB used,   9784.95 GFLOPS,    675.05 GOPS
645   69.26 ms run,    2.67 ms python,   66.59 ms HIP,  579.88 loss, 0.001627 LR, 4.54 GB used,   9747.02 GFLOPS,    675.05 GOPS
646   69.48 ms run,    2.61 ms python,   66.88 ms HIP,  575.16 loss, 0.001622 LR, 4.54 GB used,   9715.32 GFLOPS,    675.05 GOPS
647   69.02 ms run,    2.64 ms python,   66.38 ms HIP,  596.25 loss, 0.001618 LR, 4.54 GB used,   9781.13 GFLOPS,    675.05 GOPS
648   70.13 ms run,    2.62 ms python,   67.51 ms HIP,  564.59 loss, 0.001614 LR, 4.54 GB used,   9626.22 GFLOPS,    675.05 GOPS
649   69.09 ms run,    2.76 ms python,   66.33 ms HIP,  569.52 loss, 0.001609 LR, 4.54 GB used,   9770.15 GFLOPS,    675.05 GOPS
650   69.10 ms run,    2.65 ms python,   66.45 ms HIP,  586.70 loss, 0.001605 LR, 4.54 GB used,   9769.54 GFLOPS,    675.05 GOPS
651   69.49 ms run,    2.65 ms python,   66.83 ms HIP,  577.35 loss, 0.001601 LR, 4.54 GB used,   9714.46 GFLOPS,    675.05 GOPS
652   70.05 ms run,    2.66 ms python,   67.40 ms HIP,  575.93 loss, 0.001596 LR, 4.54 GB used,   9636.33 GFLOPS,    675.05 GOPS
653   69.37 ms run,    2.62 ms python,   66.75 ms HIP,  576.70 loss, 0.001592 LR, 4.54 GB used,   9730.73 GFLOPS,    675.05 GOPS
654   69.51 ms run,    2.67 ms python,   66.84 ms HIP,  566.77 loss, 0.001588 LR, 4.54 GB used,   9711.19 GFLOPS,    675.05 GOPS
655   69.64 ms run,    2.63 ms python,   67.01 ms HIP,  576.23 loss, 0.001583 LR, 4.54 GB used,   9693.35 GFLOPS,    675.05 GOPS
656   69.26 ms run,    2.63 ms python,   66.63 ms HIP,  562.23 loss, 0.001579 LR, 4.54 GB used,   9746.43 GFLOPS,    675.05 GOPS
657   69.32 ms run,    2.64 ms python,   66.68 ms HIP,  564.83 loss, 0.001575 LR, 4.54 GB used,   9738.11 GFLOPS,    675.05 GOPS
658   69.05 ms run,    2.63 ms python,   66.42 ms HIP,  577.87 loss, 0.001570 LR, 4.54 GB used,   9775.95 GFLOPS,    675.05 GOPS
659   69.75 ms run,    2.66 ms python,   67.09 ms HIP,  574.14 loss, 0.001566 LR, 4.54 GB used,   9677.60 GFLOPS,    675.05 GOPS
660   69.21 ms run,    2.74 ms python,   66.47 ms HIP,  582.64 loss, 0.001561 LR, 4.54 GB used,   9753.05 GFLOPS,    675.05 GOPS
661   69.58 ms run,    2.62 ms python,   66.96 ms HIP,  586.00 loss, 0.001557 LR, 4.54 GB used,   9701.75 GFLOPS,    675.05 GOPS
662   70.11 ms run,    2.63 ms python,   67.48 ms HIP,  552.82 loss, 0.001553 LR, 4.54 GB used,   9628.76 GFLOPS,    675.05 GOPS
663   69.24 ms run,    2.61 ms python,   66.63 ms HIP,  569.97 loss, 0.001548 LR, 4.54 GB used,   9749.22 GFLOPS,    675.05 GOPS
664   69.93 ms run,    2.62 ms python,   67.31 ms HIP,  576.78 loss, 0.001544 LR, 4.54 GB used,   9653.37 GFLOPS,    675.05 GOPS
665   69.92 ms run,    2.79 ms python,   67.13 ms HIP,  565.56 loss, 0.001540 LR, 4.54 GB used,   9654.51 GFLOPS,    675.05 GOPS
666   69.48 ms run,    2.70 ms python,   66.79 ms HIP,  579.94 loss, 0.001535 LR, 4.54 GB used,   9715.04 GFLOPS,    675.05 GOPS
667   68.98 ms run,    2.66 ms python,   66.32 ms HIP,  564.24 loss, 0.001531 LR, 4.54 GB used,   9785.96 GFLOPS,    675.05 GOPS
668   69.42 ms run,    2.62 ms python,   66.80 ms HIP,  573.23 loss, 0.001527 LR, 4.54 GB used,   9724.37 GFLOPS,    675.05 GOPS
669   69.36 ms run,    2.65 ms python,   66.71 ms HIP,  571.86 loss, 0.001522 LR, 4.54 GB used,   9732.52 GFLOPS,    675.05 GOPS
670   69.28 ms run,    2.63 ms python,   66.66 ms HIP,  596.87 loss, 0.001518 LR, 4.54 GB used,   9743.53 GFLOPS,    675.05 GOPS
671   69.96 ms run,    2.71 ms python,   67.25 ms HIP,  570.27 loss, 0.001514 LR, 4.54 GB used,   9648.68 GFLOPS,    675.05 GOPS
672   69.56 ms run,    2.63 ms python,   66.93 ms HIP,  575.17 loss, 0.001509 LR, 4.54 GB used,   9704.28 GFLOPS,    675.05 GOPS
673   69.26 ms run,    2.64 ms python,   66.62 ms HIP,  563.80 loss, 0.001505 LR, 4.54 GB used,   9746.84 GFLOPS,    675.05 GOPS
674   69.59 ms run,    2.65 ms python,   66.94 ms HIP,  575.78 loss, 0.001501 LR, 4.54 GB used,   9700.17 GFLOPS,    675.05 GOPS
675   69.84 ms run,    2.62 ms python,   67.22 ms HIP,  563.96 loss, 0.001496 LR, 4.54 GB used,   9665.74 GFLOPS,    675.05 GOPS
676   69.58 ms run,    2.64 ms python,   66.94 ms HIP,  575.17 loss, 0.001492 LR, 4.54 GB used,   9701.21 GFLOPS,    675.05 GOPS
677   69.35 ms run,    2.63 ms python,   66.72 ms HIP,  576.46 loss, 0.001488 LR, 4.54 GB used,   9733.49 GFLOPS,    675.05 GOPS
678   69.84 ms run,    2.66 ms python,   67.18 ms HIP,  563.43 loss, 0.001483 LR, 4.54 GB used,   9665.56 GFLOPS,    675.05 GOPS
679   69.77 ms run,    2.63 ms python,   67.14 ms HIP,  582.41 loss, 0.001479 LR, 4.54 GB used,   9675.46 GFLOPS,    675.05 GOPS
680   69.82 ms run,    2.62 ms python,   67.20 ms HIP,  565.88 loss, 0.001474 LR, 4.54 GB used,   9667.93 GFLOPS,    675.05 GOPS
681   69.72 ms run,    2.68 ms python,   67.04 ms HIP,  571.82 loss, 0.001470 LR, 4.54 GB used,   9681.60 GFLOPS,    675.05 GOPS
682   69.43 ms run,    2.60 ms python,   66.84 ms HIP,  566.03 loss, 0.001466 LR, 4.54 GB used,   9722.06 GFLOPS,    675.05 GOPS
683   69.22 ms run,    2.76 ms python,   66.46 ms HIP,  568.31 loss, 0.001461 LR, 4.54 GB used,   9751.85 GFLOPS,    675.05 GOPS
684   69.83 ms run,    2.65 ms python,   67.18 ms HIP,  573.44 loss, 0.001457 LR, 4.54 GB used,   9667.25 GFLOPS,    675.05 GOPS
685   69.66 ms run,    2.66 ms python,   66.99 ms HIP,  533.51 loss, 0.001453 LR, 4.54 GB used,   9690.88 GFLOPS,    675.05 GOPS
shuffling training dataset in 1159.49 ms (epoch=7)
686 1235.82 ms run, 1162.57 ms python,   73.25 ms HIP,  554.17 loss, 0.001448 LR, 4.54 GB used,    546.69 GFLOPS,    675.61 GOPS
687   74.20 ms run,    2.77 ms python,   71.43 ms HIP,  546.76 loss, 0.001444 LR, 4.54 GB used,   9097.61 GFLOPS,    675.05 GOPS
688   72.21 ms run,    2.71 ms python,   69.50 ms HIP,  554.29 loss, 0.001440 LR, 4.54 GB used,   9348.90 GFLOPS,    675.05 GOPS
689   71.74 ms run,    2.66 ms python,   69.08 ms HIP,  560.47 loss, 0.001435 LR, 4.54 GB used,   9409.93 GFLOPS,    675.05 GOPS
690   70.23 ms run,    2.69 ms python,   67.54 ms HIP,  552.45 loss, 0.001431 LR, 4.54 GB used,   9612.55 GFLOPS,    675.05 GOPS
691   69.90 ms run,    2.70 ms python,   67.20 ms HIP,  558.22 loss, 0.001427 LR, 4.54 GB used,   9656.94 GFLOPS,    675.05 GOPS
692   69.89 ms run,    2.65 ms python,   67.24 ms HIP,  548.10 loss, 0.001422 LR, 4.54 GB used,   9658.98 GFLOPS,    675.05 GOPS
693   69.28 ms run,    2.64 ms python,   66.64 ms HIP,  550.90 loss, 0.001418 LR, 4.54 GB used,   9743.44 GFLOPS,    675.05 GOPS
694   69.40 ms run,    2.71 ms python,   66.69 ms HIP,  545.27 loss, 0.001414 LR, 4.54 GB used,   9727.47 GFLOPS,    675.05 GOPS
695   69.11 ms run,    2.70 ms python,   66.40 ms HIP,  555.22 loss, 0.001409 LR, 4.54 GB used,   9768.15 GFLOPS,    675.05 GOPS
696   69.04 ms run,    2.64 ms python,   66.41 ms HIP,  553.44 loss, 0.001405 LR, 4.54 GB used,   9777.12 GFLOPS,    675.05 GOPS
697   69.41 ms run,    2.62 ms python,   66.79 ms HIP,  542.86 loss, 0.001400 LR, 4.54 GB used,   9726.06 GFLOPS,    675.05 GOPS
698   70.41 ms run,    2.65 ms python,   67.76 ms HIP,  538.98 loss, 0.001396 LR, 4.54 GB used,   9588.01 GFLOPS,    675.05 GOPS
699   69.45 ms run,    2.72 ms python,   66.73 ms HIP,  547.87 loss, 0.001392 LR, 4.54 GB used,   9719.51 GFLOPS,    675.05 GOPS
700   69.36 ms run,    2.65 ms python,   66.72 ms HIP,  550.62 loss, 0.001387 LR, 4.54 GB used,   9732.17 GFLOPS,    675.05 GOPS
701   69.73 ms run,    2.61 ms python,   67.12 ms HIP,  548.99 loss, 0.001383 LR, 4.54 GB used,   9680.67 GFLOPS,    675.05 GOPS
702   69.49 ms run,    2.66 ms python,   66.83 ms HIP,  562.56 loss, 0.001379 LR, 4.54 GB used,   9714.51 GFLOPS,    675.05 GOPS
703   69.42 ms run,    2.72 ms python,   66.70 ms HIP,  545.80 loss, 0.001374 LR, 4.54 GB used,   9724.47 GFLOPS,    675.05 GOPS
704   69.53 ms run,    2.65 ms python,   66.88 ms HIP,  543.95 loss, 0.001370 LR, 4.54 GB used,   9708.93 GFLOPS,    675.05 GOPS
705   69.95 ms run,    2.65 ms python,   67.31 ms HIP,  550.51 loss, 0.001366 LR, 4.54 GB used,   9650.16 GFLOPS,    675.05 GOPS
706   69.06 ms run,    2.64 ms python,   66.42 ms HIP,  563.62 loss, 0.001361 LR, 4.54 GB used,   9774.28 GFLOPS,    675.05 GOPS
707   69.70 ms run,    2.66 ms python,   67.04 ms HIP,  553.62 loss, 0.001357 LR, 4.54 GB used,   9684.64 GFLOPS,    675.05 GOPS
708   69.86 ms run,    2.63 ms python,   67.24 ms HIP,  548.91 loss, 0.001353 LR, 4.54 GB used,   9662.51 GFLOPS,    675.05 GOPS
709   69.76 ms run,    2.64 ms python,   67.12 ms HIP,  553.52 loss, 0.001348 LR, 4.54 GB used,   9676.87 GFLOPS,    675.05 GOPS
710   69.71 ms run,    2.63 ms python,   67.09 ms HIP,  552.99 loss, 0.001344 LR, 4.54 GB used,   9683.37 GFLOPS,    675.05 GOPS
711   69.57 ms run,    2.65 ms python,   66.91 ms HIP,  554.13 loss, 0.001340 LR, 4.54 GB used,   9703.72 GFLOPS,    675.05 GOPS
712   70.20 ms run,    2.64 ms python,   67.56 ms HIP,  559.88 loss, 0.001335 LR, 4.54 GB used,   9616.10 GFLOPS,    675.05 GOPS
713   69.14 ms run,    2.70 ms python,   66.44 ms HIP,  561.09 loss, 0.001331 LR, 4.54 GB used,   9762.93 GFLOPS,    675.05 GOPS
714   69.68 ms run,    2.67 ms python,   67.01 ms HIP,  570.73 loss, 0.001326 LR, 4.54 GB used,   9688.16 GFLOPS,    675.05 GOPS
715   69.28 ms run,    2.64 ms python,   66.64 ms HIP,  549.33 loss, 0.001322 LR, 4.54 GB used,   9744.16 GFLOPS,    675.05 GOPS
716   69.36 ms run,    2.65 ms python,   66.72 ms HIP,  560.05 loss, 0.001318 LR, 4.54 GB used,   9732.05 GFLOPS,    675.05 GOPS
717   68.89 ms run,    2.65 ms python,   66.24 ms HIP,  558.79 loss, 0.001313 LR, 4.54 GB used,   9799.27 GFLOPS,    675.05 GOPS
718   69.34 ms run,    2.64 ms python,   66.70 ms HIP,  567.84 loss, 0.001309 LR, 4.54 GB used,   9735.95 GFLOPS,    675.05 GOPS
719   69.56 ms run,    2.61 ms python,   66.95 ms HIP,  556.73 loss, 0.001305 LR, 4.54 GB used,   9705.03 GFLOPS,    675.05 GOPS
720   68.84 ms run,    2.63 ms python,   66.21 ms HIP,  552.36 loss, 0.001300 LR, 4.54 GB used,   9805.99 GFLOPS,    675.05 GOPS
721   69.28 ms run,    2.62 ms python,   66.67 ms HIP,  548.49 loss, 0.001296 LR, 4.54 GB used,   9743.15 GFLOPS,    675.05 GOPS
722   69.55 ms run,    2.66 ms python,   66.89 ms HIP,  545.67 loss, 0.001292 LR, 4.54 GB used,   9705.72 GFLOPS,    675.05 GOPS
723   69.17 ms run,    2.62 ms python,   66.54 ms HIP,  562.48 loss, 0.001287 LR, 4.54 GB used,   9759.35 GFLOPS,    675.05 GOPS
724   69.49 ms run,    2.65 ms python,   66.84 ms HIP,  557.03 loss, 0.001283 LR, 4.54 GB used,   9714.21 GFLOPS,    675.05 GOPS
725   69.28 ms run,    2.68 ms python,   66.60 ms HIP,  552.82 loss, 0.001279 LR, 4.54 GB used,   9743.30 GFLOPS,    675.05 GOPS
726   70.04 ms run,    2.65 ms python,   67.39 ms HIP,  554.57 loss, 0.001274 LR, 4.54 GB used,   9638.21 GFLOPS,    675.05 GOPS
727   69.25 ms run,    2.62 ms python,   66.63 ms HIP,  546.55 loss, 0.001270 LR, 4.54 GB used,   9748.51 GFLOPS,    675.05 GOPS
728   69.16 ms run,    2.64 ms python,   66.52 ms HIP,  559.60 loss, 0.001266 LR, 4.54 GB used,   9760.15 GFLOPS,    675.05 GOPS
729   69.81 ms run,    2.63 ms python,   67.18 ms HIP,  552.28 loss, 0.001261 LR, 4.54 GB used,   9670.00 GFLOPS,    675.05 GOPS
730   69.50 ms run,    2.64 ms python,   66.86 ms HIP,  562.51 loss, 0.001257 LR, 4.54 GB used,   9712.98 GFLOPS,    675.05 GOPS
731   69.31 ms run,    2.62 ms python,   66.70 ms HIP,  557.94 loss, 0.001252 LR, 4.54 GB used,   9739.10 GFLOPS,    675.05 GOPS
732   69.84 ms run,    2.63 ms python,   67.21 ms HIP,  557.27 loss, 0.001248 LR, 4.54 GB used,   9665.30 GFLOPS,    675.05 GOPS
733   69.26 ms run,    2.74 ms python,   66.52 ms HIP,  545.38 loss, 0.001244 LR, 4.54 GB used,   9746.24 GFLOPS,    675.05 GOPS
734   69.50 ms run,    2.68 ms python,   66.82 ms HIP,  557.56 loss, 0.001239 LR, 4.54 GB used,   9713.20 GFLOPS,    675.05 GOPS
735   69.29 ms run,    2.65 ms python,   66.65 ms HIP,  539.31 loss, 0.001235 LR, 4.54 GB used,   9741.67 GFLOPS,    675.05 GOPS
736   69.40 ms run,    2.67 ms python,   66.73 ms HIP,  550.31 loss, 0.001231 LR, 4.54 GB used,   9726.19 GFLOPS,    675.05 GOPS
737   69.08 ms run,    2.64 ms python,   66.44 ms HIP,  550.49 loss, 0.001226 LR, 4.54 GB used,   9771.43 GFLOPS,    675.05 GOPS
738   69.15 ms run,    2.61 ms python,   66.54 ms HIP,  569.39 loss, 0.001222 LR, 4.54 GB used,   9762.12 GFLOPS,    675.05 GOPS
739   68.99 ms run,    2.69 ms python,   66.30 ms HIP,  557.04 loss, 0.001218 LR, 4.54 GB used,   9784.27 GFLOPS,    675.05 GOPS
740   69.56 ms run,    2.62 ms python,   66.94 ms HIP,  559.56 loss, 0.001213 LR, 4.54 GB used,   9704.54 GFLOPS,    675.05 GOPS
741   69.13 ms run,    2.62 ms python,   66.52 ms HIP,  551.71 loss, 0.001209 LR, 4.54 GB used,   9764.18 GFLOPS,    675.05 GOPS
742   69.37 ms run,    2.71 ms python,   66.65 ms HIP,  547.54 loss, 0.001205 LR, 4.54 GB used,   9731.62 GFLOPS,    675.05 GOPS
743   68.66 ms run,    2.63 ms python,   66.04 ms HIP,  557.30 loss, 0.001200 LR, 4.54 GB used,   9831.25 GFLOPS,    675.05 GOPS
744   69.95 ms run,    2.65 ms python,   67.31 ms HIP,  558.61 loss, 0.001196 LR, 4.54 GB used,   9649.86 GFLOPS,    675.05 GOPS
745   69.91 ms run,    2.65 ms python,   67.26 ms HIP,  550.68 loss, 0.001192 LR, 4.54 GB used,   9656.39 GFLOPS,    675.05 GOPS
746   70.53 ms run,    2.62 ms python,   67.91 ms HIP,  555.99 loss, 0.001187 LR, 4.54 GB used,   9570.85 GFLOPS,    675.05 GOPS
747   70.22 ms run,    2.75 ms python,   67.47 ms HIP,  555.76 loss, 0.001183 LR, 4.54 GB used,   9612.83 GFLOPS,    675.05 GOPS
748   69.29 ms run,    2.68 ms python,   66.61 ms HIP,  557.85 loss, 0.001178 LR, 4.54 GB used,   9742.13 GFLOPS,    675.05 GOPS
749   69.46 ms run,    2.64 ms python,   66.82 ms HIP,  562.48 loss, 0.001174 LR, 4.54 GB used,   9717.84 GFLOPS,    675.05 GOPS
750   69.44 ms run,    2.65 ms python,   66.80 ms HIP,  539.62 loss, 0.001170 LR, 4.54 GB used,   9721.07 GFLOPS,    675.05 GOPS
751   69.20 ms run,    2.60 ms python,   66.60 ms HIP,  550.59 loss, 0.001165 LR, 4.54 GB used,   9754.40 GFLOPS,    675.05 GOPS
752   68.78 ms run,    2.64 ms python,   66.14 ms HIP,  545.35 loss, 0.001161 LR, 4.54 GB used,   9814.52 GFLOPS,    675.05 GOPS
753   69.49 ms run,    2.62 ms python,   66.87 ms HIP,  562.82 loss, 0.001157 LR, 4.54 GB used,   9714.61 GFLOPS,    675.05 GOPS
754   69.30 ms run,    2.71 ms python,   66.59 ms HIP,  553.77 loss, 0.001152 LR, 4.54 GB used,   9740.49 GFLOPS,    675.05 GOPS
755   69.49 ms run,    2.70 ms python,   66.79 ms HIP,  558.77 loss, 0.001148 LR, 4.54 GB used,   9714.21 GFLOPS,    675.05 GOPS
756   69.36 ms run,    2.67 ms python,   66.69 ms HIP,  553.38 loss, 0.001144 LR, 4.54 GB used,   9732.63 GFLOPS,    675.05 GOPS
757   69.12 ms run,    2.67 ms python,   66.45 ms HIP,  558.24 loss, 0.001139 LR, 4.54 GB used,   9766.93 GFLOPS,    675.05 GOPS
758   68.50 ms run,    2.62 ms python,   65.88 ms HIP,  542.75 loss, 0.001135 LR, 4.54 GB used,   9855.19 GFLOPS,    675.05 GOPS
759   69.96 ms run,    2.65 ms python,   67.31 ms HIP,  567.17 loss, 0.001131 LR, 4.54 GB used,   9648.36 GFLOPS,    675.05 GOPS
760   68.99 ms run,    2.63 ms python,   66.36 ms HIP,  565.41 loss, 0.001126 LR, 4.54 GB used,   9784.37 GFLOPS,    675.05 GOPS
761   69.63 ms run,    2.64 ms python,   66.99 ms HIP,  561.65 loss, 0.001122 LR, 4.54 GB used,   9694.19 GFLOPS,    675.05 GOPS
762   69.51 ms run,    2.63 ms python,   66.89 ms HIP,  551.63 loss, 0.001118 LR, 4.54 GB used,   9710.90 GFLOPS,    675.05 GOPS
763   69.10 ms run,    2.74 ms python,   66.36 ms HIP,  552.97 loss, 0.001113 LR, 4.54 GB used,   9769.21 GFLOPS,    675.05 GOPS
764   69.26 ms run,    2.65 ms python,   66.62 ms HIP,  544.57 loss, 0.001109 LR, 4.54 GB used,   9745.87 GFLOPS,    675.05 GOPS
765   69.40 ms run,    2.64 ms python,   66.76 ms HIP,  545.29 loss, 0.001104 LR, 4.54 GB used,   9726.92 GFLOPS,    675.05 GOPS
766   69.41 ms run,    2.62 ms python,   66.79 ms HIP,  545.26 loss, 0.001100 LR, 4.54 GB used,   9725.45 GFLOPS,    675.05 GOPS
767   69.17 ms run,    2.62 ms python,   66.55 ms HIP,  543.88 loss, 0.001096 LR, 4.54 GB used,   9759.39 GFLOPS,    675.05 GOPS
768   69.20 ms run,    2.70 ms python,   66.51 ms HIP,  550.40 loss, 0.001091 LR, 4.54 GB used,   9754.92 GFLOPS,    675.05 GOPS
769   69.48 ms run,    2.63 ms python,   66.85 ms HIP,  551.71 loss, 0.001087 LR, 4.54 GB used,   9715.69 GFLOPS,    675.05 GOPS
770   69.94 ms run,    2.73 ms python,   67.21 ms HIP,  559.03 loss, 0.001083 LR, 4.54 GB used,   9651.86 GFLOPS,    675.05 GOPS
771   70.06 ms run,    2.63 ms python,   67.43 ms HIP,  568.44 loss, 0.001078 LR, 4.54 GB used,   9635.35 GFLOPS,    675.05 GOPS
772   69.31 ms run,    2.77 ms python,   66.54 ms HIP,  541.96 loss, 0.001074 LR, 4.54 GB used,   9740.15 GFLOPS,    675.05 GOPS
773   69.65 ms run,    2.65 ms python,   67.01 ms HIP,  550.99 loss, 0.001070 LR, 4.54 GB used,   9691.46 GFLOPS,    675.05 GOPS
774   69.71 ms run,    2.68 ms python,   67.03 ms HIP,  564.27 loss, 0.001065 LR, 4.54 GB used,   9683.71 GFLOPS,    675.05 GOPS
775   70.05 ms run,    2.71 ms python,   67.34 ms HIP,  542.48 loss, 0.001061 LR, 4.54 GB used,   9636.43 GFLOPS,    675.05 GOPS
776   69.61 ms run,    2.63 ms python,   66.98 ms HIP,  550.75 loss, 0.001057 LR, 4.54 GB used,   9697.27 GFLOPS,    675.05 GOPS
777   69.24 ms run,    2.74 ms python,   66.50 ms HIP,  538.54 loss, 0.001052 LR, 4.54 GB used,   9749.74 GFLOPS,    675.05 GOPS
778   69.04 ms run,    2.63 ms python,   66.42 ms HIP,  545.96 loss, 0.001048 LR, 4.54 GB used,   9776.91 GFLOPS,    675.05 GOPS
779   68.63 ms run,    2.69 ms python,   65.94 ms HIP,  542.59 loss, 0.001044 LR, 4.54 GB used,   9836.05 GFLOPS,    675.05 GOPS
780   69.08 ms run,    2.69 ms python,   66.39 ms HIP,  571.61 loss, 0.001039 LR, 4.54 GB used,   9771.52 GFLOPS,    675.05 GOPS
781   69.83 ms run,    2.64 ms python,   67.19 ms HIP,  547.73 loss, 0.001035 LR, 4.54 GB used,   9667.32 GFLOPS,    675.05 GOPS
782   69.63 ms run,    2.65 ms python,   66.98 ms HIP,  550.16 loss, 0.001030 LR, 4.54 GB used,   9695.20 GFLOPS,    675.05 GOPS
783   69.15 ms run,    2.69 ms python,   66.46 ms HIP,  526.68 loss, 0.001026 LR, 4.54 GB used,   9762.53 GFLOPS,    675.05 GOPS
shuffling training dataset in 1248.18 ms (epoch=8)
784 1323.85 ms run, 1251.22 ms python,   72.63 ms HIP,  546.24 loss, 0.001022 LR, 4.54 GB used,    510.34 GFLOPS,    675.61 GOPS
785   74.02 ms run,    2.81 ms python,   71.21 ms HIP,  541.10 loss, 0.001017 LR, 4.54 GB used,   9119.92 GFLOPS,    675.05 GOPS
786   72.56 ms run,    2.70 ms python,   69.86 ms HIP,  551.43 loss, 0.001013 LR, 4.54 GB used,   9303.12 GFLOPS,    675.05 GOPS
787   71.45 ms run,    2.70 ms python,   68.74 ms HIP,  547.53 loss, 0.001009 LR, 4.54 GB used,   9448.06 GFLOPS,    675.05 GOPS
788   71.17 ms run,    2.69 ms python,   68.48 ms HIP,  532.55 loss, 0.001004 LR, 4.54 GB used,   9484.84 GFLOPS,    675.05 GOPS
789   70.74 ms run,    2.63 ms python,   68.11 ms HIP,  535.94 loss, 0.001000 LR, 4.54 GB used,   9542.38 GFLOPS,    675.05 GOPS
790   69.83 ms run,    2.72 ms python,   67.11 ms HIP,  536.94 loss, 0.000996 LR, 4.54 GB used,   9666.81 GFLOPS,    675.05 GOPS
791   70.25 ms run,    2.66 ms python,   67.60 ms HIP,  529.04 loss, 0.000991 LR, 4.54 GB used,   9608.96 GFLOPS,    675.05 GOPS
792   69.84 ms run,    2.64 ms python,   67.20 ms HIP,  529.08 loss, 0.000987 LR, 4.54 GB used,   9665.85 GFLOPS,    675.05 GOPS
793   69.01 ms run,    2.63 ms python,   66.38 ms HIP,  548.76 loss, 0.000983 LR, 4.54 GB used,   9782.01 GFLOPS,    675.05 GOPS
794   69.28 ms run,    2.65 ms python,   66.62 ms HIP,  543.41 loss, 0.000978 LR, 4.54 GB used,   9743.95 GFLOPS,    675.05 GOPS
795   68.82 ms run,    2.66 ms python,   66.16 ms HIP,  543.51 loss, 0.000974 LR, 4.54 GB used,   9809.33 GFLOPS,    675.05 GOPS
796   69.00 ms run,    2.65 ms python,   66.35 ms HIP,  544.11 loss, 0.000970 LR, 4.54 GB used,   9782.78 GFLOPS,    675.05 GOPS
797   68.95 ms run,    2.67 ms python,   66.28 ms HIP,  528.20 loss, 0.000965 LR, 4.54 GB used,   9790.26 GFLOPS,    675.05 GOPS
798   69.07 ms run,    2.65 ms python,   66.42 ms HIP,  539.38 loss, 0.000961 LR, 4.54 GB used,   9773.15 GFLOPS,    675.05 GOPS
799   69.50 ms run,    2.62 ms python,   66.88 ms HIP,  544.07 loss, 0.000956 LR, 4.54 GB used,   9713.17 GFLOPS,    675.05 GOPS
800   69.56 ms run,    2.61 ms python,   66.95 ms HIP,  531.35 loss, 0.000952 LR, 4.54 GB used,   9704.85 GFLOPS,    675.05 GOPS
801   69.45 ms run,    2.59 ms python,   66.86 ms HIP,  532.82 loss, 0.000948 LR, 4.54 GB used,   9719.67 GFLOPS,    675.05 GOPS
802   68.74 ms run,    2.67 ms python,   66.07 ms HIP,  541.33 loss, 0.000943 LR, 4.54 GB used,   9820.35 GFLOPS,    675.05 GOPS
803   69.12 ms run,    2.64 ms python,   66.48 ms HIP,  528.60 loss, 0.000939 LR, 4.54 GB used,   9766.26 GFLOPS,    675.05 GOPS
804   69.52 ms run,    2.75 ms python,   66.77 ms HIP,  525.63 loss, 0.000935 LR, 4.54 GB used,   9710.05 GFLOPS,    675.05 GOPS
805   69.72 ms run,    2.62 ms python,   67.10 ms HIP,  536.51 loss, 0.000930 LR, 4.54 GB used,   9681.63 GFLOPS,    675.05 GOPS
806   69.32 ms run,    2.64 ms python,   66.68 ms HIP,  532.97 loss, 0.000926 LR, 4.54 GB used,   9737.54 GFLOPS,    675.05 GOPS
807   68.84 ms run,    2.66 ms python,   66.19 ms HIP,  532.18 loss, 0.000922 LR, 4.54 GB used,   9805.46 GFLOPS,    675.05 GOPS
808   69.55 ms run,    2.65 ms python,   66.90 ms HIP,  533.36 loss, 0.000917 LR, 4.54 GB used,   9706.15 GFLOPS,    675.05 GOPS
809   69.58 ms run,    2.70 ms python,   66.88 ms HIP,  535.91 loss, 0.000913 LR, 4.54 GB used,   9701.92 GFLOPS,    675.05 GOPS
810   69.74 ms run,    2.65 ms python,   67.09 ms HIP,  543.84 loss, 0.000909 LR, 4.54 GB used,   9678.97 GFLOPS,    675.05 GOPS
811   69.24 ms run,    2.61 ms python,   66.63 ms HIP,  547.66 loss, 0.000904 LR, 4.54 GB used,   9749.23 GFLOPS,    675.05 GOPS
812   69.11 ms run,    2.72 ms python,   66.39 ms HIP,  536.60 loss, 0.000900 LR, 4.54 GB used,   9768.01 GFLOPS,    675.05 GOPS
813   68.71 ms run,    2.64 ms python,   66.07 ms HIP,  542.15 loss, 0.000896 LR, 4.54 GB used,   9824.97 GFLOPS,    675.05 GOPS
814   69.23 ms run,    2.66 ms python,   66.57 ms HIP,  534.80 loss, 0.000891 LR, 4.54 GB used,   9750.34 GFLOPS,    675.05 GOPS
815   69.53 ms run,    2.68 ms python,   66.85 ms HIP,  531.42 loss, 0.000887 LR, 4.54 GB used,   9709.17 GFLOPS,    675.05 GOPS
816   70.02 ms run,    2.60 ms python,   67.41 ms HIP,  528.62 loss, 0.000882 LR, 4.54 GB used,   9641.35 GFLOPS,    675.05 GOPS
817   69.68 ms run,    2.64 ms python,   67.04 ms HIP,  538.65 loss, 0.000878 LR, 4.54 GB used,   9688.05 GFLOPS,    675.05 GOPS
818   69.74 ms run,    2.63 ms python,   67.10 ms HIP,  534.32 loss, 0.000874 LR, 4.54 GB used,   9680.00 GFLOPS,    675.05 GOPS
819   69.87 ms run,    2.64 ms python,   67.23 ms HIP,  528.08 loss, 0.000869 LR, 4.54 GB used,   9661.57 GFLOPS,    675.05 GOPS
820   69.37 ms run,    2.70 ms python,   66.67 ms HIP,  541.12 loss, 0.000865 LR, 4.54 GB used,   9730.47 GFLOPS,    675.05 GOPS
821   69.86 ms run,    2.64 ms python,   67.21 ms HIP,  527.85 loss, 0.000861 LR, 4.54 GB used,   9663.30 GFLOPS,    675.05 GOPS
822   69.52 ms run,    2.68 ms python,   66.84 ms HIP,  531.59 loss, 0.000856 LR, 4.54 GB used,   9710.65 GFLOPS,    675.05 GOPS
823   69.42 ms run,    2.62 ms python,   66.80 ms HIP,  538.39 loss, 0.000852 LR, 4.54 GB used,   9724.43 GFLOPS,    675.05 GOPS
824   70.23 ms run,    2.62 ms python,   67.61 ms HIP,  541.76 loss, 0.000848 LR, 4.54 GB used,   9611.38 GFLOPS,    675.05 GOPS
825   69.53 ms run,    2.68 ms python,   66.85 ms HIP,  528.67 loss, 0.000843 LR, 4.54 GB used,   9708.25 GFLOPS,    675.05 GOPS
826   68.89 ms run,    2.61 ms python,   66.28 ms HIP,  535.09 loss, 0.000839 LR, 4.54 GB used,   9798.49 GFLOPS,    675.05 GOPS
827   69.24 ms run,    2.68 ms python,   66.56 ms HIP,  533.99 loss, 0.000835 LR, 4.54 GB used,   9749.74 GFLOPS,    675.05 GOPS
828   69.66 ms run,    2.64 ms python,   67.02 ms HIP,  530.62 loss, 0.000830 LR, 4.54 GB used,   9690.68 GFLOPS,    675.05 GOPS
829   69.72 ms run,    2.66 ms python,   67.06 ms HIP,  537.17 loss, 0.000826 LR, 4.54 GB used,   9682.39 GFLOPS,    675.05 GOPS
830   69.32 ms run,    2.66 ms python,   66.66 ms HIP,  538.48 loss, 0.000822 LR, 4.54 GB used,   9737.89 GFLOPS,    675.05 GOPS
831   69.51 ms run,    2.63 ms python,   66.88 ms HIP,  535.70 loss, 0.000817 LR, 4.54 GB used,   9710.83 GFLOPS,    675.05 GOPS
832   69.41 ms run,    2.62 ms python,   66.79 ms HIP,  525.67 loss, 0.000813 LR, 4.54 GB used,   9725.40 GFLOPS,    675.05 GOPS
833   69.50 ms run,    2.67 ms python,   66.83 ms HIP,  532.23 loss, 0.000808 LR, 4.54 GB used,   9712.42 GFLOPS,    675.05 GOPS
834   69.33 ms run,    2.63 ms python,   66.69 ms HIP,  528.42 loss, 0.000804 LR, 4.54 GB used,   9737.16 GFLOPS,    675.05 GOPS
835   69.31 ms run,    2.68 ms python,   66.63 ms HIP,  535.32 loss, 0.000800 LR, 4.54 GB used,   9739.56 GFLOPS,    675.05 GOPS
836   69.73 ms run,    2.67 ms python,   67.07 ms HIP,  536.20 loss, 0.000795 LR, 4.54 GB used,   9680.20 GFLOPS,    675.05 GOPS
837   69.43 ms run,    2.64 ms python,   66.78 ms HIP,  538.89 loss, 0.000791 LR, 4.54 GB used,   9723.36 GFLOPS,    675.05 GOPS
838   69.13 ms run,    2.61 ms python,   66.52 ms HIP,  535.20 loss, 0.000787 LR, 4.54 GB used,   9764.50 GFLOPS,    675.05 GOPS
839   69.10 ms run,    2.60 ms python,   66.50 ms HIP,  533.34 loss, 0.000782 LR, 4.54 GB used,   9768.46 GFLOPS,    675.05 GOPS
840   68.19 ms run,    2.62 ms python,   65.57 ms HIP,  539.02 loss, 0.000778 LR, 4.54 GB used,   9899.76 GFLOPS,    675.05 GOPS
841   69.23 ms run,    2.65 ms python,   66.57 ms HIP,  536.14 loss, 0.000774 LR, 4.54 GB used,   9751.34 GFLOPS,    675.05 GOPS
842   69.47 ms run,    2.65 ms python,   66.83 ms HIP,  534.74 loss, 0.000769 LR, 4.54 GB used,   9716.57 GFLOPS,    675.05 GOPS
843   69.61 ms run,    2.62 ms python,   67.00 ms HIP,  539.64 loss, 0.000765 LR, 4.54 GB used,   9697.05 GFLOPS,    675.05 GOPS
844   69.04 ms run,    2.64 ms python,   66.40 ms HIP,  530.05 loss, 0.000761 LR, 4.54 GB used,   9777.32 GFLOPS,    675.05 GOPS
845   69.49 ms run,    2.65 ms python,   66.83 ms HIP,  536.00 loss, 0.000756 LR, 4.54 GB used,   9714.84 GFLOPS,    675.05 GOPS
846   69.04 ms run,    2.70 ms python,   66.34 ms HIP,  534.40 loss, 0.000752 LR, 4.54 GB used,   9777.87 GFLOPS,    675.05 GOPS
847   69.78 ms run,    2.63 ms python,   67.15 ms HIP,  536.08 loss, 0.000748 LR, 4.54 GB used,   9673.98 GFLOPS,    675.05 GOPS
848   68.62 ms run,    2.61 ms python,   66.02 ms HIP,  541.98 loss, 0.000743 LR, 4.54 GB used,   9836.97 GFLOPS,    675.05 GOPS
849   69.78 ms run,    2.60 ms python,   67.19 ms HIP,  526.80 loss, 0.000739 LR, 4.54 GB used,   9673.56 GFLOPS,    675.05 GOPS
850   69.34 ms run,    2.64 ms python,   66.70 ms HIP,  531.86 loss, 0.000734 LR, 4.54 GB used,   9735.31 GFLOPS,    675.05 GOPS
851   69.14 ms run,    2.65 ms python,   66.49 ms HIP,  534.87 loss, 0.000730 LR, 4.54 GB used,   9764.06 GFLOPS,    675.05 GOPS
852   69.53 ms run,    2.63 ms python,   66.90 ms HIP,  523.91 loss, 0.000726 LR, 4.54 GB used,   9709.10 GFLOPS,    675.05 GOPS
853   69.87 ms run,    3.20 ms python,   66.67 ms HIP,  535.79 loss, 0.000721 LR, 4.54 GB used,   9660.99 GFLOPS,    675.05 GOPS
854   69.08 ms run,    2.67 ms python,   66.42 ms HIP,  529.02 loss, 0.000717 LR, 4.54 GB used,   9771.59 GFLOPS,    675.05 GOPS
855   68.56 ms run,    2.65 ms python,   65.90 ms HIP,  525.92 loss, 0.000713 LR, 4.54 GB used,   9846.09 GFLOPS,    675.05 GOPS
856   69.18 ms run,    2.64 ms python,   66.54 ms HIP,  536.56 loss, 0.000708 LR, 4.54 GB used,   9757.63 GFLOPS,    675.05 GOPS
857   69.45 ms run,    2.70 ms python,   66.75 ms HIP,  520.37 loss, 0.000704 LR, 4.54 GB used,   9719.67 GFLOPS,    675.05 GOPS
858   69.41 ms run,    2.64 ms python,   66.77 ms HIP,  531.96 loss, 0.000700 LR, 4.54 GB used,   9725.64 GFLOPS,    675.05 GOPS
859   70.08 ms run,    2.66 ms python,   67.41 ms HIP,  540.40 loss, 0.000695 LR, 4.54 GB used,   9632.94 GFLOPS,    675.05 GOPS
860   69.25 ms run,    2.65 ms python,   66.60 ms HIP,  535.21 loss, 0.000691 LR, 4.54 GB used,   9747.91 GFLOPS,    675.05 GOPS
861   69.53 ms run,    2.64 ms python,   66.90 ms HIP,  538.28 loss, 0.000687 LR, 4.54 GB used,   9708.03 GFLOPS,    675.05 GOPS
862   69.53 ms run,    2.63 ms python,   66.90 ms HIP,  523.75 loss, 0.000682 LR, 4.54 GB used,   9708.76 GFLOPS,    675.05 GOPS
863   69.10 ms run,    2.63 ms python,   66.47 ms HIP,  540.04 loss, 0.000678 LR, 4.54 GB used,   9768.97 GFLOPS,    675.05 GOPS
864   69.50 ms run,    2.65 ms python,   66.86 ms HIP,  532.45 loss, 0.000674 LR, 4.54 GB used,   9712.18 GFLOPS,    675.05 GOPS
865   69.51 ms run,    2.71 ms python,   66.80 ms HIP,  535.42 loss, 0.000669 LR, 4.54 GB used,   9710.89 GFLOPS,    675.05 GOPS
866   68.93 ms run,    2.71 ms python,   66.22 ms HIP,  530.29 loss, 0.000665 LR, 4.54 GB used,   9793.27 GFLOPS,    675.05 GOPS
867   69.27 ms run,    2.68 ms python,   66.59 ms HIP,  513.62 loss, 0.000660 LR, 4.54 GB used,   9744.63 GFLOPS,    675.05 GOPS
868   69.18 ms run,    2.59 ms python,   66.58 ms HIP,  533.72 loss, 0.000656 LR, 4.54 GB used,   9758.46 GFLOPS,    675.05 GOPS
869   69.22 ms run,    2.64 ms python,   66.58 ms HIP,  520.23 loss, 0.000652 LR, 4.54 GB used,   9752.79 GFLOPS,    675.05 GOPS
870   69.57 ms run,    2.63 ms python,   66.94 ms HIP,  531.70 loss, 0.000647 LR, 4.54 GB used,   9702.61 GFLOPS,    675.05 GOPS
871   69.55 ms run,    2.68 ms python,   66.87 ms HIP,  540.77 loss, 0.000643 LR, 4.54 GB used,   9705.38 GFLOPS,    675.05 GOPS
872   69.33 ms run,    2.63 ms python,   66.70 ms HIP,  533.11 loss, 0.000639 LR, 4.54 GB used,   9736.47 GFLOPS,    675.05 GOPS
873   69.83 ms run,    2.65 ms python,   67.19 ms HIP,  527.30 loss, 0.000634 LR, 4.54 GB used,   9666.60 GFLOPS,    675.05 GOPS
874   69.72 ms run,    2.68 ms python,   67.03 ms HIP,  537.70 loss, 0.000630 LR, 4.54 GB used,   9682.89 GFLOPS,    675.05 GOPS
875   69.59 ms run,    2.64 ms python,   66.95 ms HIP,  547.15 loss, 0.000626 LR, 4.54 GB used,   9700.12 GFLOPS,    675.05 GOPS
876   69.41 ms run,    2.65 ms python,   66.77 ms HIP,  532.26 loss, 0.000621 LR, 4.54 GB used,   9725.11 GFLOPS,    675.05 GOPS
877   69.81 ms run,    2.61 ms python,   67.20 ms HIP,  546.89 loss, 0.000617 LR, 4.54 GB used,   9669.47 GFLOPS,    675.05 GOPS
878   69.05 ms run,    2.61 ms python,   66.43 ms HIP,  526.37 loss, 0.000613 LR, 4.54 GB used,   9776.26 GFLOPS,    675.05 GOPS
879   69.45 ms run,    2.61 ms python,   66.84 ms HIP,  537.78 loss, 0.000608 LR, 4.54 GB used,   9720.34 GFLOPS,    675.05 GOPS
880   69.33 ms run,    2.68 ms python,   66.64 ms HIP,  532.12 loss, 0.000604 LR, 4.54 GB used,   9737.14 GFLOPS,    675.05 GOPS
881   69.40 ms run,    2.69 ms python,   66.71 ms HIP,  512.16 loss, 0.000600 LR, 4.54 GB used,   9727.35 GFLOPS,    675.05 GOPS
shuffling training dataset in 1159.44 ms (epoch=9)
882 1235.03 ms run, 1162.57 ms python,   72.46 ms HIP,  520.87 loss, 0.000595 LR, 4.54 GB used,    547.04 GFLOPS,    675.61 GOPS
883   74.16 ms run,    2.88 ms python,   71.29 ms HIP,  516.05 loss, 0.000591 LR, 4.54 GB used,   9102.24 GFLOPS,    675.05 GOPS
884   72.66 ms run,    2.72 ms python,   69.94 ms HIP,  525.34 loss, 0.000586 LR, 4.54 GB used,   9291.01 GFLOPS,    675.05 GOPS
885   71.26 ms run,    2.67 ms python,   68.59 ms HIP,  506.30 loss, 0.000582 LR, 4.54 GB used,   9472.61 GFLOPS,    675.05 GOPS
886   70.69 ms run,    2.71 ms python,   67.98 ms HIP,  511.95 loss, 0.000578 LR, 4.54 GB used,   9550.00 GFLOPS,    675.05 GOPS
887   70.02 ms run,    2.62 ms python,   67.40 ms HIP,  516.78 loss, 0.000573 LR, 4.54 GB used,   9640.87 GFLOPS,    675.05 GOPS
888   69.19 ms run,    2.66 ms python,   66.53 ms HIP,  515.21 loss, 0.000569 LR, 4.54 GB used,   9757.04 GFLOPS,    675.05 GOPS
889   69.68 ms run,    2.62 ms python,   67.06 ms HIP,  517.03 loss, 0.000565 LR, 4.54 GB used,   9687.99 GFLOPS,    675.05 GOPS
890   69.62 ms run,    2.63 ms python,   66.99 ms HIP,  507.96 loss, 0.000560 LR, 4.54 GB used,   9696.45 GFLOPS,    675.05 GOPS
891   69.55 ms run,    2.64 ms python,   66.91 ms HIP,  515.07 loss, 0.000556 LR, 4.54 GB used,   9705.92 GFLOPS,    675.05 GOPS
892   69.32 ms run,    2.63 ms python,   66.69 ms HIP,  521.80 loss, 0.000552 LR, 4.54 GB used,   9738.24 GFLOPS,    675.05 GOPS
893   69.32 ms run,    2.63 ms python,   66.69 ms HIP,  524.90 loss, 0.000547 LR, 4.54 GB used,   9737.80 GFLOPS,    675.05 GOPS
894   69.05 ms run,    2.63 ms python,   66.42 ms HIP,  518.19 loss, 0.000543 LR, 4.54 GB used,   9776.12 GFLOPS,    675.05 GOPS
895   68.94 ms run,    2.61 ms python,   66.34 ms HIP,  514.16 loss, 0.000539 LR, 4.54 GB used,   9791.23 GFLOPS,    675.05 GOPS
896   69.71 ms run,    2.61 ms python,   67.09 ms HIP,  514.89 loss, 0.000534 LR, 4.54 GB used,   9683.90 GFLOPS,    675.05 GOPS
897   69.34 ms run,    2.65 ms python,   66.69 ms HIP,  511.81 loss, 0.000530 LR, 4.54 GB used,   9734.89 GFLOPS,    675.05 GOPS
898   69.24 ms run,    2.64 ms python,   66.61 ms HIP,  512.98 loss, 0.000526 LR, 4.54 GB used,   9749.11 GFLOPS,    675.05 GOPS
899   69.35 ms run,    2.63 ms python,   66.72 ms HIP,  508.40 loss, 0.000521 LR, 4.54 GB used,   9734.27 GFLOPS,    675.05 GOPS
900   68.77 ms run,    2.63 ms python,   66.14 ms HIP,  511.02 loss, 0.000517 LR, 4.54 GB used,   9815.58 GFLOPS,    675.05 GOPS
901   69.17 ms run,    2.64 ms python,   66.53 ms HIP,  515.47 loss, 0.000513 LR, 4.54 GB used,   9759.06 GFLOPS,    675.05 GOPS
902   69.13 ms run,    2.64 ms python,   66.49 ms HIP,  514.25 loss, 0.000508 LR, 4.54 GB used,   9764.42 GFLOPS,    675.05 GOPS
903   69.23 ms run,    2.74 ms python,   66.49 ms HIP,  514.87 loss, 0.000504 LR, 4.54 GB used,   9751.23 GFLOPS,    675.05 GOPS
904   69.54 ms run,    2.67 ms python,   66.87 ms HIP,  510.42 loss, 0.000499 LR, 4.54 GB used,   9707.15 GFLOPS,    675.05 GOPS
905   68.88 ms run,    2.63 ms python,   66.25 ms HIP,  511.87 loss, 0.000495 LR, 4.54 GB used,   9800.72 GFLOPS,    675.05 GOPS
906   68.98 ms run,    2.66 ms python,   66.32 ms HIP,  508.94 loss, 0.000491 LR, 4.54 GB used,   9785.88 GFLOPS,    675.05 GOPS
907   68.82 ms run,    2.65 ms python,   66.18 ms HIP,  519.92 loss, 0.000486 LR, 4.54 GB used,   9808.29 GFLOPS,    675.05 GOPS
908   68.83 ms run,    2.69 ms python,   66.14 ms HIP,  507.73 loss, 0.000482 LR, 4.54 GB used,   9806.73 GFLOPS,    675.05 GOPS
909   69.73 ms run,    2.63 ms python,   67.10 ms HIP,  518.91 loss, 0.000478 LR, 4.54 GB used,   9680.49 GFLOPS,    675.05 GOPS
910   69.38 ms run,    2.64 ms python,   66.74 ms HIP,  520.09 loss, 0.000473 LR, 4.54 GB used,   9729.58 GFLOPS,    675.05 GOPS
911   69.32 ms run,    2.66 ms python,   66.66 ms HIP,  520.63 loss, 0.000469 LR, 4.54 GB used,   9738.13 GFLOPS,    675.05 GOPS
912   69.55 ms run,    2.62 ms python,   66.93 ms HIP,  501.50 loss, 0.000465 LR, 4.54 GB used,   9705.62 GFLOPS,    675.05 GOPS
913   68.85 ms run,    2.63 ms python,   66.22 ms HIP,  515.14 loss, 0.000460 LR, 4.54 GB used,   9804.22 GFLOPS,    675.05 GOPS
914   69.07 ms run,    2.64 ms python,   66.43 ms HIP,  515.34 loss, 0.000456 LR, 4.54 GB used,   9773.31 GFLOPS,    675.05 GOPS
915   69.08 ms run,    2.66 ms python,   66.42 ms HIP,  515.66 loss, 0.000452 LR, 4.54 GB used,   9771.55 GFLOPS,    675.05 GOPS
916   69.16 ms run,    2.69 ms python,   66.47 ms HIP,  522.36 loss, 0.000447 LR, 4.54 GB used,   9760.84 GFLOPS,    675.05 GOPS
917   69.71 ms run,    2.60 ms python,   67.11 ms HIP,  514.66 loss, 0.000443 LR, 4.54 GB used,   9683.72 GFLOPS,    675.05 GOPS
918   69.54 ms run,    2.64 ms python,   66.90 ms HIP,  515.18 loss, 0.000439 LR, 4.54 GB used,   9707.37 GFLOPS,    675.05 GOPS
919   69.04 ms run,    2.63 ms python,   66.41 ms HIP,  515.85 loss, 0.000434 LR, 4.54 GB used,   9777.36 GFLOPS,    675.05 GOPS
920   69.66 ms run,    2.64 ms python,   67.02 ms HIP,  509.09 loss, 0.000430 LR, 4.54 GB used,   9689.92 GFLOPS,    675.05 GOPS
921   69.06 ms run,    2.60 ms python,   66.46 ms HIP,  507.59 loss, 0.000425 LR, 4.54 GB used,   9775.43 GFLOPS,    675.05 GOPS
922   69.37 ms run,    2.62 ms python,   66.75 ms HIP,  513.96 loss, 0.000421 LR, 4.54 GB used,   9730.39 GFLOPS,    675.05 GOPS
923   69.24 ms run,    2.59 ms python,   66.65 ms HIP,  509.86 loss, 0.000417 LR, 4.54 GB used,   9749.35 GFLOPS,    675.05 GOPS
924   69.20 ms run,    2.63 ms python,   66.57 ms HIP,  516.51 loss, 0.000412 LR, 4.54 GB used,   9755.15 GFLOPS,    675.05 GOPS
925   69.61 ms run,    2.61 ms python,   67.00 ms HIP,  504.50 loss, 0.000408 LR, 4.54 GB used,   9698.17 GFLOPS,    675.05 GOPS
926   68.96 ms run,    2.66 ms python,   66.30 ms HIP,  512.93 loss, 0.000404 LR, 4.54 GB used,   9789.02 GFLOPS,    675.05 GOPS
927   69.25 ms run,    2.63 ms python,   66.61 ms HIP,  510.84 loss, 0.000399 LR, 4.54 GB used,   9748.62 GFLOPS,    675.05 GOPS
928   69.52 ms run,    2.63 ms python,   66.90 ms HIP,  522.02 loss, 0.000395 LR, 4.54 GB used,   9709.53 GFLOPS,    675.05 GOPS
929   69.28 ms run,    2.62 ms python,   66.66 ms HIP,  512.55 loss, 0.000391 LR, 4.54 GB used,   9743.84 GFLOPS,    675.05 GOPS
930   69.60 ms run,    2.64 ms python,   66.96 ms HIP,  503.71 loss, 0.000386 LR, 4.54 GB used,   9699.44 GFLOPS,    675.05 GOPS
931   69.42 ms run,    2.67 ms python,   66.75 ms HIP,  509.71 loss, 0.000382 LR, 4.54 GB used,   9723.59 GFLOPS,    675.05 GOPS
932   69.20 ms run,    2.63 ms python,   66.57 ms HIP,  506.57 loss, 0.000378 LR, 4.54 GB used,   9754.58 GFLOPS,    675.05 GOPS
933   69.03 ms run,    2.68 ms python,   66.35 ms HIP,  509.72 loss, 0.000373 LR, 4.54 GB used,   9779.67 GFLOPS,    675.05 GOPS
934   68.65 ms run,    2.66 ms python,   65.99 ms HIP,  519.52 loss, 0.000369 LR, 4.54 GB used,   9832.97 GFLOPS,    675.05 GOPS
935   69.10 ms run,    2.63 ms python,   66.47 ms HIP,  523.69 loss, 0.000365 LR, 4.54 GB used,   9769.73 GFLOPS,    675.05 GOPS
936   68.79 ms run,    2.62 ms python,   66.16 ms HIP,  515.27 loss, 0.000360 LR, 4.54 GB used,   9813.41 GFLOPS,    675.05 GOPS
937   70.71 ms run,    2.65 ms python,   68.06 ms HIP,  518.81 loss, 0.000356 LR, 4.54 GB used,   9546.80 GFLOPS,    675.05 GOPS
938   69.36 ms run,    2.64 ms python,   66.71 ms HIP,  504.38 loss, 0.000351 LR, 4.54 GB used,   9733.15 GFLOPS,    675.05 GOPS
939   69.48 ms run,    2.64 ms python,   66.84 ms HIP,  507.79 loss, 0.000347 LR, 4.54 GB used,   9715.61 GFLOPS,    675.05 GOPS
940   68.71 ms run,    2.69 ms python,   66.02 ms HIP,  514.01 loss, 0.000343 LR, 4.54 GB used,   9824.71 GFLOPS,    675.05 GOPS
941   69.19 ms run,    2.64 ms python,   66.55 ms HIP,  507.03 loss, 0.000338 LR, 4.54 GB used,   9756.46 GFLOPS,    675.05 GOPS
942   69.27 ms run,    2.67 ms python,   66.60 ms HIP,  506.73 loss, 0.000334 LR, 4.54 GB used,   9744.75 GFLOPS,    675.05 GOPS
943   69.16 ms run,    2.65 ms python,   66.52 ms HIP,  524.35 loss, 0.000330 LR, 4.54 GB used,   9760.49 GFLOPS,    675.05 GOPS
944   68.93 ms run,    2.62 ms python,   66.31 ms HIP,  501.72 loss, 0.000325 LR, 4.54 GB used,   9793.02 GFLOPS,    675.05 GOPS
945   69.79 ms run,    2.62 ms python,   67.17 ms HIP,  522.89 loss, 0.000321 LR, 4.54 GB used,   9672.81 GFLOPS,    675.05 GOPS
946   69.44 ms run,    2.69 ms python,   66.75 ms HIP,  500.60 loss, 0.000317 LR, 4.54 GB used,   9720.65 GFLOPS,    675.05 GOPS
947   70.38 ms run,    2.59 ms python,   67.79 ms HIP,  526.98 loss, 0.000312 LR, 4.54 GB used,   9591.57 GFLOPS,    675.05 GOPS
948   69.53 ms run,    2.66 ms python,   66.87 ms HIP,  509.99 loss, 0.000308 LR, 4.54 GB used,   9708.92 GFLOPS,    675.05 GOPS
949   69.39 ms run,    2.60 ms python,   66.78 ms HIP,  507.01 loss, 0.000304 LR, 4.54 GB used,   9728.96 GFLOPS,    675.05 GOPS
950   69.55 ms run,    2.61 ms python,   66.94 ms HIP,  510.29 loss, 0.000299 LR, 4.54 GB used,   9705.98 GFLOPS,    675.05 GOPS
951   69.38 ms run,    2.69 ms python,   66.69 ms HIP,  508.08 loss, 0.000295 LR, 4.54 GB used,   9729.52 GFLOPS,    675.05 GOPS
952   69.67 ms run,    2.60 ms python,   67.07 ms HIP,  511.04 loss, 0.000291 LR, 4.54 GB used,   9688.63 GFLOPS,    675.05 GOPS
953   68.88 ms run,    2.66 ms python,   66.22 ms HIP,  506.13 loss, 0.000286 LR, 4.54 GB used,   9800.60 GFLOPS,    675.05 GOPS
954   69.86 ms run,    2.64 ms python,   67.22 ms HIP,  503.12 loss, 0.000282 LR, 4.54 GB used,   9662.69 GFLOPS,    675.05 GOPS
955   70.15 ms run,    2.67 ms python,   67.48 ms HIP,  509.37 loss, 0.000277 LR, 4.54 GB used,   9622.85 GFLOPS,    675.05 GOPS
956   68.98 ms run,    2.67 ms python,   66.31 ms HIP,  514.90 loss, 0.000273 LR, 4.54 GB used,   9786.24 GFLOPS,    675.05 GOPS
957   69.87 ms run,    2.61 ms python,   67.26 ms HIP,  508.61 loss, 0.000269 LR, 4.54 GB used,   9661.85 GFLOPS,    675.05 GOPS
958   69.52 ms run,    2.64 ms python,   66.88 ms HIP,  506.84 loss, 0.000264 LR, 4.54 GB used,   9710.24 GFLOPS,    675.05 GOPS
959   69.13 ms run,    2.62 ms python,   66.50 ms HIP,  513.77 loss, 0.000260 LR, 4.54 GB used,   9765.39 GFLOPS,    675.05 GOPS
960   69.53 ms run,    2.62 ms python,   66.91 ms HIP,  512.25 loss, 0.000256 LR, 4.54 GB used,   9709.08 GFLOPS,    675.05 GOPS
961   69.65 ms run,    2.64 ms python,   67.01 ms HIP,  516.46 loss, 0.000251 LR, 4.54 GB used,   9692.63 GFLOPS,    675.05 GOPS
962   69.41 ms run,    2.64 ms python,   66.77 ms HIP,  513.07 loss, 0.000247 LR, 4.54 GB used,   9724.91 GFLOPS,    675.05 GOPS
963   69.36 ms run,    2.62 ms python,   66.74 ms HIP,  499.24 loss, 0.000243 LR, 4.54 GB used,   9732.38 GFLOPS,    675.05 GOPS
964   69.86 ms run,    2.68 ms python,   67.18 ms HIP,  508.07 loss, 0.000238 LR, 4.54 GB used,   9663.18 GFLOPS,    675.05 GOPS
965   69.71 ms run,    2.64 ms python,   67.07 ms HIP,  503.10 loss, 0.000234 LR, 4.54 GB used,   9683.32 GFLOPS,    675.05 GOPS
966   69.43 ms run,    2.64 ms python,   66.79 ms HIP,  503.75 loss, 0.000230 LR, 4.54 GB used,   9722.44 GFLOPS,    675.05 GOPS
967   69.82 ms run,    2.65 ms python,   67.17 ms HIP,  507.46 loss, 0.000225 LR, 4.54 GB used,   9668.10 GFLOPS,    675.05 GOPS
968   69.49 ms run,    2.62 ms python,   66.87 ms HIP,  501.20 loss, 0.000221 LR, 4.54 GB used,   9713.78 GFLOPS,    675.05 GOPS
969   69.42 ms run,    2.63 ms python,   66.79 ms HIP,  518.10 loss, 0.000217 LR, 4.54 GB used,   9723.82 GFLOPS,    675.05 GOPS
970   69.72 ms run,    2.63 ms python,   67.08 ms HIP,  504.68 loss, 0.000212 LR, 4.54 GB used,   9682.81 GFLOPS,    675.05 GOPS
971   69.56 ms run,    2.69 ms python,   66.87 ms HIP,  518.01 loss, 0.000208 LR, 4.54 GB used,   9704.35 GFLOPS,    675.05 GOPS
972   69.55 ms run,    2.61 ms python,   66.94 ms HIP,  508.69 loss, 0.000203 LR, 4.54 GB used,   9706.43 GFLOPS,    675.05 GOPS
973   69.74 ms run,    2.67 ms python,   67.07 ms HIP,  503.42 loss, 0.000199 LR, 4.54 GB used,   9680.06 GFLOPS,    675.05 GOPS
974   69.33 ms run,    2.68 ms python,   66.65 ms HIP,  507.29 loss, 0.000195 LR, 4.54 GB used,   9736.17 GFLOPS,    675.05 GOPS
975   69.80 ms run,    2.61 ms python,   67.18 ms HIP,  498.26 loss, 0.000190 LR, 4.54 GB used,   9671.72 GFLOPS,    675.05 GOPS
976   69.22 ms run,    2.63 ms python,   66.59 ms HIP,  512.58 loss, 0.000186 LR, 4.54 GB used,   9752.85 GFLOPS,    675.05 GOPS
977   69.06 ms run,    2.63 ms python,   66.44 ms HIP,  513.81 loss, 0.000182 LR, 4.54 GB used,   9774.40 GFLOPS,    675.05 GOPS
978   68.90 ms run,    2.62 ms python,   66.28 ms HIP,  501.11 loss, 0.000177 LR, 4.54 GB used,   9797.71 GFLOPS,    675.05 GOPS
979   69.44 ms run,    2.63 ms python,   66.81 ms HIP,  508.49 loss, 0.000173 LR, 4.54 GB used,   9720.73 GFLOPS,    675.05 GOPS
shuffling training dataset in 1154.76 ms (epoch=10)
980 1230.07 ms run, 1157.80 ms python,   72.27 ms HIP,  496.13 loss, 0.000169 LR, 4.54 GB used,    549.25 GFLOPS,    675.61 GOPS
981   73.90 ms run,    2.78 ms python,   71.13 ms HIP,  495.23 loss, 0.000164 LR, 4.54 GB used,   9134.13 GFLOPS,    675.05 GOPS
982   72.39 ms run,    2.67 ms python,   69.72 ms HIP,  496.88 loss, 0.000160 LR, 4.54 GB used,   9325.09 GFLOPS,    675.05 GOPS
983   71.53 ms run,    2.73 ms python,   68.80 ms HIP,  490.28 loss, 0.000156 LR, 4.54 GB used,   9436.97 GFLOPS,    675.05 GOPS
984   70.56 ms run,    2.68 ms python,   67.88 ms HIP,  489.54 loss, 0.000151 LR, 4.54 GB used,   9567.28 GFLOPS,    675.05 GOPS
985   69.59 ms run,    2.66 ms python,   66.92 ms HIP,  495.99 loss, 0.000147 LR, 4.54 GB used,   9700.61 GFLOPS,    675.05 GOPS
986   70.18 ms run,    2.67 ms python,   67.51 ms HIP,  503.37 loss, 0.000143 LR, 4.54 GB used,   9619.28 GFLOPS,    675.05 GOPS
987   69.76 ms run,    2.73 ms python,   67.02 ms HIP,  495.15 loss, 0.000138 LR, 4.54 GB used,   9677.25 GFLOPS,    675.05 GOPS
988   69.16 ms run,    2.78 ms python,   66.38 ms HIP,  501.15 loss, 0.000134 LR, 4.54 GB used,   9760.82 GFLOPS,    675.05 GOPS
989   70.33 ms run,    2.66 ms python,   67.67 ms HIP,  500.14 loss, 0.000129 LR, 4.54 GB used,   9598.59 GFLOPS,    675.05 GOPS
990   69.20 ms run,    2.63 ms python,   66.56 ms HIP,  496.23 loss, 0.000125 LR, 4.54 GB used,   9755.47 GFLOPS,    675.05 GOPS
991   68.96 ms run,    2.66 ms python,   66.29 ms HIP,  500.80 loss, 0.000121 LR, 4.54 GB used,   9789.60 GFLOPS,    675.05 GOPS
992   69.12 ms run,    2.67 ms python,   66.45 ms HIP,  497.86 loss, 0.000116 LR, 4.54 GB used,   9766.96 GFLOPS,    675.05 GOPS
993   69.31 ms run,    2.69 ms python,   66.62 ms HIP,  498.34 loss, 0.000112 LR, 4.54 GB used,   9739.42 GFLOPS,    675.05 GOPS
994   70.40 ms run,    2.62 ms python,   67.78 ms HIP,  495.24 loss, 0.000108 LR, 4.54 GB used,   9588.55 GFLOPS,    675.05 GOPS
995   70.55 ms run,    2.61 ms python,   67.93 ms HIP,  494.52 loss, 0.000103 LR, 4.54 GB used,   9568.90 GFLOPS,    675.05 GOPS
996   69.80 ms run,    2.63 ms python,   67.18 ms HIP,  494.78 loss, 0.000099 LR, 4.54 GB used,   9670.67 GFLOPS,    675.05 GOPS
997   69.70 ms run,    2.69 ms python,   67.01 ms HIP,  494.93 loss, 0.000095 LR, 4.54 GB used,   9684.90 GFLOPS,    675.05 GOPS
998   69.11 ms run,    2.65 ms python,   66.46 ms HIP,  497.52 loss, 0.000090 LR, 4.54 GB used,   9768.35 GFLOPS,    675.05 GOPS
999   69.46 ms run,    2.65 ms python,   66.81 ms HIP,  491.65 loss, 0.000086 LR, 4.54 GB used,   9718.04 GFLOPS,    675.05 GOPS
shuffling test dataset in 185.55 ms (epoch=0)
eval     9616/10240 93.91%,    0.40 val_loss STEP=1000 (in 1416.17 ms)
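
Two of the figures in the log above can be sanity-checked with a couple of lines of arithmetic: the per-step GFLOPS is just the fixed GOPS count divided by the step's wall-clock run time, and the eval percentage is correct/total over the test set. A minimal sketch, using the numbers from step 823 and the eval line:

```python
# Sanity-check two figures from the training log above.
# GFLOPS is the step's GOPS divided by its wall-clock run time.
gops, run_ms = 675.05, 69.42            # from step 823 in the log
gflops = gops / (run_ms / 1000)
print(f"{gflops:.1f} GFLOPS")           # close to the 9724.43 GFLOPS printed in the log

# The eval percentage is plain correct/total.
correct, total = 9616, 10240            # from the eval line
print(f"{100 * correct / total:.2f}%")  # 93.91%
```

The small difference in the GFLOPS figure comes from the log using the unrounded run time.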

llama.py

using HIP backend
using LLaMA-7B model
Traceback (most recent call last):
  File "/home/jebba/devel/tinygrad/tinygrad/examples/llama.py", line 386, in <module>
    llama = LLaMa.build(MODEL_PATH, TOKENIZER_PATH, model_gen=args.gen, model_size=args.size, quantize=args.quantize, device=device)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jebba/devel/tinygrad/tinygrad/examples/llama.py", line 155, in build
    sp_model = SentencePieceProcessor(model_file=str(tokenizer_path))
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jebba/devel/tinygrad/tinygrad/venv/lib/python3.11/site-packages/sentencepiece/__init__.py", line 447, in Init
    self.Load(model_file=model_file, model_proto=model_proto)
  File "/home/jebba/devel/tinygrad/tinygrad/venv/lib/python3.11/site-packages/sentencepiece/__init__.py", line 905, in Load
    return self.LoadFromFile(model_file)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jebba/devel/tinygrad/tinygrad/venv/lib/python3.11/site-packages/sentencepiece/__init__.py", line 310, in LoadFromFile
    return _sentencepiece.SentencePieceProcessor_LoadFromFile(self, arg)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
OSError: Not found: "/home/jebba/devel/tinygrad/tinygrad/weights/LLaMA/tokenizer.model": No such file or directory Error #2
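
The crash above is simply a missing weights file: `SentencePieceProcessor` raises `OSError` when `tokenizer.model` is absent. A hedged sketch of the kind of pre-flight check that would fail with a readable message instead of a traceback (the filename comes from the traceback; the helper itself is illustrative and not part of llama.py):

```python
from pathlib import Path

def missing_llama_files(weights_dir: str) -> list:
    """Return the required files that are absent from weights_dir.

    Only tokenizer.model is checked here, since that is the file
    SentencePiece failed to find in the run above.
    """
    required = ["tokenizer.model"]
    return [f for f in required if not (Path(weights_dir) / f).is_file()]

# For a directory that does not exist, every required file is reported missing.
print(missing_llama_files("/path/that/does/not/exist"))  # -> ['tokenizer.model']
```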

mask_rcnn.py

Traceback (most recent call last):
  File "/home/jebba/devel/tinygrad/tinygrad/venv/lib/python3.11/site-packages/PIL/Image.py", line 3135, in open
    fp.seek(0)
    ^^^^^^^
AttributeError: 'NoneType' object has no attribute 'seek'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/jebba/devel/tinygrad/tinygrad/examples/mask_rcnn.py", line 290, in <module>
    img = Image.open(args.image)
          ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jebba/devel/tinygrad/tinygrad/venv/lib/python3.11/site-packages/PIL/Image.py", line 3137, in open
    fp = io.BytesIO(fp.read())
                    ^^^^^^^
AttributeError: 'NoneType' object has no attribute 'read'
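
The two chained exceptions above come from PIL receiving `None` as the file: the script was run without an image argument, so `Image.open(args.image)` crashed inside PIL rather than at argument parsing. A minimal sketch of validation that fails fast instead (the flag name is an assumption inferred from `args.image` in the traceback):

```python
import argparse

def parse_args(argv):
    """Reject a missing --image up front, instead of letting
    PIL.Image.open(None) raise the AttributeErrors seen above."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--image", type=str, default=None)
    args = parser.parse_args(argv)
    if args.image is None:
        parser.error("--image is required")
    return args

print(parse_args(["--image", "input.jpg"]).image)  # -> input.jpg
```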

mixtral.py

Traceback (most recent call last):
  File "/home/jebba/devel/tinygrad/tinygrad/examples/mixtral.py", line 33, in <module>
    state = torch_load(args.weights + "/consolidated.00.pth.b")
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jebba/devel/tinygrad/tinygrad/tinygrad/nn/state.py", line 77, in torch_load
    t = Tensor.empty(os.stat(fn).st_size, dtype=dtypes.uint8, device=f"disk:{fn}")
                     ^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/home/jebba/devel/tinygrad/tinygrad/weights/mixtral-8x7b-32kseqlen/consolidated.00.pth.b'
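
mixtral.py fails the same way: `torch_load` stats a checkpoint that was never downloaded, and the `os.stat` call raises `FileNotFoundError`. A small illustrative guard that turns this into an actionable message (the path below is shortened and hypothetical; `torch_load` itself is tinygrad's loader and is not reimplemented here):

```python
import os

def checkpoint_exists(path: str) -> bool:
    # torch_load calls os.stat on the file, so checking existence
    # first lets the script explain what to download instead of
    # raising FileNotFoundError deep inside the loader.
    return os.path.isfile(path)

path = "/weights/mixtral-8x7b-32kseqlen/consolidated.00.pth.b"
if not checkpoint_exists(path):
    print(f"missing checkpoint: {path} -- download the weights first")
```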

mnist_gan.py

  0%|          | 0/300 [00:00<?, ?it/s]
Generator loss: 1.392429107471424, Discriminator loss: 1.2234591077876222:   0%|          | 1/300 [00:52<4:23:15, 52.83s/it]
Generator loss: 0.742053180713864, Discriminator loss: 1.3610961077844395:   1%|          | 2/300 [01:10<2:39:21, 32.09s/it]
Generator loss: 0.7467478732852375, Discriminator loss: 1.3732893361764795:   1%|          | 3/300 [01:28<2:06:44, 25.61s/it]
Generator loss: 0.7906162344357547, Discriminator loss: 1.280748259057017:   1%|▏         | 4/300 [01:45<1:49:47, 22.25s/it]
Generator loss: 1.3312474861741066, Discriminator loss: 1.0106008916216738:   2%|▏         | 5/300 [01:56<1:28:50, 18.07s/it]
Generator loss: 1.783903177608462, Discriminator loss: 0.7816843103398295:   2%|▏         | 6/300 [02:05<1:13:42, 15.04s/it]
Generator loss: 2.0209721011274002, Discriminator loss: 0.7753454947515446:   2%|▏         | 7/300 [02:21<1:15:22, 15.43s/it]
Generator loss: 1.7905255674439318, Discriminator loss: 0.8429553416721961:   3%|▎         | 8/300 [02:36<1:13:46, 15.16s/it]
Generator loss: 1.8539055073085953, Discriminator loss: 0.8918271034079439:   3%|▎         | 9/300 [02:50<1:12:13, 14.89s/it]
Generator loss: 1.3095603074659319, Discriminator loss: 1.036813088637941:   3%|▎         | 10/300 [03:05<1:13:03, 15.12s/it]
Generator loss: 1.7761961335645002, Discriminator loss: 0.8414131459944388:   4%|▎         | 11/300 [03:18<1:08:35, 14.24s/it]
Generator loss: 2.118274845621165, Discriminator loss: 0.6414801717242774:   4%|▍         | 12/300 [03:27<1:01:02, 12.72s/it]
Generator loss: 2.3443840575568817, Discriminator loss: 0.6470773925676065:   4%|▍         | 13/300 [03:36<55:28, 11.60s/it]
Generator loss: 2.4566313190495266, Discriminator loss: 0.5863546194399104:   5%|▍         | 14/300 [03:45<51:37, 10.83s/it]
Generator loss: 2.6201670660692105, Discriminator loss: 0.6070238498642164:   5%|▌         | 15/300 [03:54<48:49, 10.28s/it]
Generator loss: 2.6223588322891906, Discriminator loss: 0.5109086793792599:   5%|▌         | 16/300 [04:02<46:03,  9.73s/it]
Generator loss: 2.5351900554755153, Discriminator loss: 0.5703090098412598:   6%|▌         | 17/300 [04:10<42:44,  9.06s/it]
Generator loss: 2.224751223536099, Discriminator loss: 0.6933160715681665:   6%|▌         | 18/300 [04:17<39:40,  8.44s/it]
Generator loss: 2.2140680621652042, Discriminator loss: 0.6735449100241941:   6%|▋         | 19/300 [04:24<37:23,  7.98s/it]
Generator loss: 1.9777411288198303, Discriminator loss: 0.692805947845473:   7%|▋         | 20/300 [04:31<35:43,  7.66s/it]
Generator loss: 1.8741068846600897, Discriminator loss: 0.7449631901348338:   7%|▋         | 21/300 [04:38<34:32,  7.43s/it]
Generator loss: 1.9462997685460484, Discriminator loss: 0.61833321062081:   7%|▋         | 22/300 [04:45<33:38,  7.26s/it]
Generator loss: 2.117940893944572, Discriminator loss: 0.6480395973605269:   8%|▊         | 23/300 [04:51<33:01,  7.15s/it]
Generator loss: 2.0316804112756954, Discriminator loss: 0.5923001387101763:   8%|▊         | 24/300 [04:58<32:34,  7.08s/it]
Generator loss: 2.110353120109614, Discriminator loss: 0.5987094748107826:   8%|▊         | 25/300 [05:05<32:13,  7.03s/it]
Generator loss: 2.12758731053156, Discriminator loss: 0.6420223644989378:   9%|▊         | 26/300 [05:12<31:53,  6.99s/it]
Generator loss: 1.9290453379645067, Discriminator loss: 0.7173247856690603:   9%|▉         | 27/300 [05:19<31:42,  6.97s/it]
Generator loss: 1.9236001933322233, Discriminator loss: 0.7348716399248909:   9%|▉         | 28/300 [05:26<31:32,  6.96s/it]
Generator loss: 1.9243448284618996, Discriminator loss: 0.6838795056237894:  10%|▉         | 29/300 [05:33<31:26,  6.96s/it]
Generator loss: 1.9530906686011482, Discriminator loss: 0.6574842614286086:  10%|█         | 30/300 [05:40<31:42,  7.05s/it]
Generator loss: 1.987439405392198, Discriminator loss: 0.6398052040706662:  10%|█         | 31/300 [05:47<30:59,  6.91s/it]
Generator loss: 2.044340126654681, Discriminator loss: 0.6637535426108276:  11%|█         | 32/300 [05:53<30:30,  6.83s/it]
Generator loss: 1.987223917070557, Discriminator loss: 0.6655241660773754:  11%|█         | 33/300 [06:00<30:05,  6.76s/it]
Generator loss: 1.9940989157732796, Discriminator loss: 0.6812811452237999:  11%|█▏        | 34/300 [06:07<29:52,  6.74s/it]
Generator loss: 1.9257937928333002, Discriminator loss: 0.6992776880369467:  12%|█▏        | 35/300 [06:13<29:33,  6.69s/it]
Generator loss: 1.9709613064632696, Discriminator loss: 0.6311690412900027:  12%|█▏        | 36/300 [06:20<29:22,  6.68s/it]
Generator loss: 1.860101638471379, Discriminator loss: 0.7256270524333505:  12%|█▏        | 37/300 [06:27<29:12,  6.66s/it]
Generator loss: 1.7695811439086409, Discriminator loss: 0.7627294414183673:  13%|█▎        | 38/300 [06:33<29:06,  6.67s/it]
Generator loss: 1.710711133392418, Discriminator loss: 0.7783913958598586:  13%|█▎        | 39/300 [06:40<28:51,  6.63s/it]
Generator loss: 1.6298308666138088, Discriminator loss: 0.8163589966647765:  13%|█▎        | 40/300 [06:46<28:42,  6.62s/it]
Generator loss: 1.6522921586737913, Discriminator loss: 0.7979075299466357:  14%|█▎        | 41/300 [06:53<28:35,  6.62s/it]
Generator loss: 1.672510720351163, Discriminator loss: 0.7909478799385183:  14%|█▎        | 41/300 [07:00<28:35,  6.62s/it]
Generator loss: 1.672510720351163, Discriminator loss: 0.7909478799385183:  14%|█▍        | 42/300 [07:00<28:29,  6.63s/it]
Generator loss: 1.6667017340660095, Discriminator loss: 0.7982207159785664:  14%|█▍        | 42/300 [07:06<28:29,  6.63s/it]
Generator loss: 1.6667017340660095, Discriminator loss: 0.7982207159785664:  14%|█▍        | 43/300 [07:06<28:17,  6.61s/it]
Generator loss: 1.65825504020733, Discriminator loss: 0.7997007856474203:  14%|█▍        | 43/300 [07:13<28:17,  6.61s/it]  
Generator loss: 1.65825504020733, Discriminator loss: 0.7997007856474203:  15%|█▍        | 44/300 [07:13<28:10,  6.61s/it]
Generator loss: 1.6209210477331106, Discriminator loss: 0.8447715745252722:  15%|█▍        | 44/300 [07:20<28:10,  6.61s/it]
Generator loss: 1.6209210477331106, Discriminator loss: 0.8447715745252722:  15%|█▌        | 45/300 [07:20<28:08,  6.62s/it]
Generator loss: 1.5814028741682278, Discriminator loss: 0.8335039265015546:  15%|█▌        | 45/300 [07:26<28:08,  6.62s/it]
Generator loss: 1.5814028741682278, Discriminator loss: 0.8335039265015546:  15%|█▌        | 46/300 [07:26<28:02,  6.63s/it]
Generator loss: 1.584620900452137, Discriminator loss: 0.8714165779597619:  15%|█▌        | 46/300 [07:33<28:02,  6.63s/it] 
Generator loss: 1.584620900452137, Discriminator loss: 0.8714165779597619:  16%|█▌        | 47/300 [07:33<28:01,  6.65s/it]
Generator loss: 1.5452900244032635, Discriminator loss: 0.841843509060495:  16%|█▌        | 47/300 [07:39<28:01,  6.65s/it]
Generator loss: 1.5452900244032635, Discriminator loss: 0.841843509060495:  16%|█▌        | 48/300 [07:39<27:47,  6.62s/it]
Generator loss: 1.542084024671246, Discriminator loss: 0.8812410480835858:  16%|█▌        | 48/300 [07:46<27:47,  6.62s/it]
Generator loss: 1.542084024671246, Discriminator loss: 0.8812410480835858:  16%|█▋        | 49/300 [07:46<27:36,  6.60s/it]
Generator loss: 1.486083539969781, Discriminator loss: 0.8791569059385973:  16%|█▋        | 49/300 [07:53<27:36,  6.60s/it]
Generator loss: 1.486083539969781, Discriminator loss: 0.8791569059385973:  17%|█▋        | 50/300 [07:53<27:27,  6.59s/it]
Generator loss: 1.5112405147622614, Discriminator loss: 0.8912068890298114:  17%|█▋        | 50/300 [07:59<27:27,  6.59s/it]
Generator loss: 1.5112405147622614, Discriminator loss: 0.8912068890298114:  17%|█▋        | 51/300 [07:59<27:29,  6.62s/it]
Generator loss: 1.4893103610066807, Discriminator loss: 0.8984880587633919:  17%|█▋        | 51/300 [08:06<27:29,  6.62s/it]
Generator loss: 1.4893103610066807, Discriminator loss: 0.8984880587633919:  17%|█▋        | 52/300 [08:06<27:22,  6.62s/it]
Generator loss: 1.5350522933637394, Discriminator loss: 0.8891902101390502:  17%|█▋        | 52/300 [08:13<27:22,  6.62s/it]
Generator loss: 1.5350522933637394, Discriminator loss: 0.8891902101390502:  18%|█▊        | 53/300 [08:13<27:16,  6.63s/it]
Generator loss: 1.5181330429280506, Discriminator loss: 0.899337364031988:  18%|█▊        | 53/300 [08:19<27:16,  6.63s/it] 
Generator loss: 1.5181330429280506, Discriminator loss: 0.899337364031988:  18%|█▊        | 54/300 [08:19<27:10,  6.63s/it]
Generator loss: 1.5080074895830715, Discriminator loss: 0.8935804108486456:  18%|█▊        | 54/300 [08:26<27:10,  6.63s/it]
Generator loss: 1.5080074895830715, Discriminator loss: 0.8935804108486456:  18%|█▊        | 55/300 [08:26<27:11,  6.66s/it]
Generator loss: 1.4986476161900688, Discriminator loss: 0.9006087337346638:  18%|█▊        | 55/300 [08:33<27:11,  6.66s/it]
Generator loss: 1.4986476161900688, Discriminator loss: 0.9006087337346638:  19%|█▊        | 56/300 [08:33<27:02,  6.65s/it]
Generator loss: 1.478811984114787, Discriminator loss: 0.9054143661085297:  19%|█▊        | 56/300 [08:39<27:02,  6.65s/it] 
Generator loss: 1.478811984114787, Discriminator loss: 0.9054143661085297:  19%|█▉        | 57/300 [08:39<26:52,  6.63s/it]
Generator loss: 1.5209060469094444, Discriminator loss: 0.9018364202450303:  19%|█▉        | 57/300 [08:46<26:52,  6.63s/it]
Generator loss: 1.5209060469094444, Discriminator loss: 0.9018364202450303:  19%|█▉        | 58/300 [08:46<26:39,  6.61s/it]
Generator loss: 1.4833326471202515, Discriminator loss: 0.8973440760198761:  19%|█▉        | 58/300 [08:52<26:39,  6.61s/it]
Generator loss: 1.4833326471202515, Discriminator loss: 0.8973440760198761:  20%|█▉        | 59/300 [08:52<26:29,  6.59s/it]
Generator loss: 1.5183102063396399, Discriminator loss: 0.895128549898372:  20%|█▉        | 59/300 [08:59<26:29,  6.59s/it] 
Generator loss: 1.5183102063396399, Discriminator loss: 0.895128549898372:  20%|██        | 60/300 [08:59<26:25,  6.61s/it]
Generator loss: 1.5287320368430193, Discriminator loss: 0.8949343741816633:  20%|██        | 60/300 [09:05<26:25,  6.61s/it]
Generator loss: 1.5287320368430193, Discriminator loss: 0.8949343741816633:  20%|██        | 61/300 [09:05<26:15,  6.59s/it]
Generator loss: 1.5220893643358175, Discriminator loss: 0.9018049928195336:  20%|██        | 61/300 [09:12<26:15,  6.59s/it]
Generator loss: 1.5220893643358175, Discriminator loss: 0.9018049928195336:  21%|██        | 62/300 [09:12<26:06,  6.58s/it]
Generator loss: 1.5172377960646855, Discriminator loss: 0.8894347741323358:  21%|██        | 62/300 [09:19<26:06,  6.58s/it]
Generator loss: 1.5172377960646855, Discriminator loss: 0.8894347741323358:  21%|██        | 63/300 [09:19<26:04,  6.60s/it]
Generator loss: 1.5071446154924, Discriminator loss: 0.896211767459617:  21%|██        | 63/300 [09:25<26:04,  6.60s/it]    
Generator loss: 1.5071446154924, Discriminator loss: 0.896211767459617:  21%|██▏       | 64/300 [09:25<26:07,  6.64s/it]
Generator loss: 1.521011765827151, Discriminator loss: 0.9010778538444463:  21%|██▏       | 64/300 [09:32<26:07,  6.64s/it]
Generator loss: 1.521011765827151, Discriminator loss: 0.9010778538444463:  22%|██▏       | 65/300 [09:32<26:00,  6.64s/it]
Generator loss: 1.5088214804144466, Discriminator loss: 0.8929120166336789:  22%|██▏       | 65/300 [09:39<26:00,  6.64s/it]
Generator loss: 1.5088214804144466, Discriminator loss: 0.8929120166336789:  22%|██▏       | 66/300 [09:39<25:53,  6.64s/it]
Generator loss: 1.5387992946540607, Discriminator loss: 0.8957542154718848:  22%|██▏       | 66/300 [09:45<25:53,  6.64s/it]
Generator loss: 1.5387992946540607, Discriminator loss: 0.8957542154718848:  22%|██▏       | 67/300 [09:45<25:45,  6.63s/it]
Generator loss: 1.5478724832920467, Discriminator loss: 0.8870287239551544:  22%|██▏       | 67/300 [09:52<25:45,  6.63s/it]
Generator loss: 1.5478724832920467, Discriminator loss: 0.8870287239551544:  23%|██▎       | 68/300 [09:52<25:44,  6.66s/it]
Generator loss: 1.5466448664665222, Discriminator loss: 0.8750020063975278:  23%|██▎       | 68/300 [09:59<25:44,  6.66s/it]
Generator loss: 1.5466448664665222, Discriminator loss: 0.8750020063975278:  23%|██▎       | 69/300 [09:59<25:35,  6.65s/it]
Generator loss: 1.5378692430608414, Discriminator loss: 0.8932287136421484:  23%|██▎       | 69/300 [10:05<25:35,  6.65s/it]
Generator loss: 1.5378692430608414, Discriminator loss: 0.8932287136421484:  23%|██▎       | 70/300 [10:05<25:25,  6.63s/it]
Generator loss: 1.5668704426463913, Discriminator loss: 0.866356801460771:  23%|██▎       | 70/300 [10:12<25:25,  6.63s/it] 
Generator loss: 1.5668704426463913, Discriminator loss: 0.866356801460771:  24%|██▎       | 71/300 [10:12<25:18,  6.63s/it]
Generator loss: 1.5750630353303516, Discriminator loss: 0.8744459419566042:  24%|██▎       | 71/300 [10:18<25:18,  6.63s/it]
Generator loss: 1.5750630353303516, Discriminator loss: 0.8744459419566042:  24%|██▍       | 72/300 [10:18<25:12,  6.63s/it]
Generator loss: 1.5859676148085033, Discriminator loss: 0.8690760284662247:  24%|██▍       | 72/300 [10:25<25:12,  6.63s/it]
Generator loss: 1.5859676148085033, Discriminator loss: 0.8690760284662247:  24%|██▍       | 73/300 [10:25<25:11,  6.66s/it]
Generator loss: 1.5942121735390495, Discriminator loss: 0.8656245947760695:  24%|██▍       | 73/300 [10:32<25:11,  6.66s/it]
Generator loss: 1.5942121735390495, Discriminator loss: 0.8656245947760695:  25%|██▍       | 74/300 [10:32<25:03,  6.65s/it]
Generator loss: 1.5940522244747948, Discriminator loss: 0.8590863057795692:  25%|██▍       | 74/300 [10:38<25:03,  6.65s/it]
Generator loss: 1.5940522244747948, Discriminator loss: 0.8590863057795692:  25%|██▌       | 75/300 [10:38<24:55,  6.65s/it]
Generator loss: 1.625133322442279, Discriminator loss: 0.8461951021762455:  25%|██▌       | 75/300 [10:45<24:55,  6.65s/it] 
Generator loss: 1.625133322442279, Discriminator loss: 0.8461951021762455:  25%|██▌       | 76/300 [10:45<24:47,  6.64s/it]
Generator loss: 1.6150044386877733, Discriminator loss: 0.8720687991556:  25%|██▌       | 76/300 [10:52<24:47,  6.64s/it]  
Generator loss: 1.6150044386877733, Discriminator loss: 0.8720687991556:  26%|██▌       | 77/300 [10:52<24:45,  6.66s/it]
Generator loss: 1.5708426285315962, Discriminator loss: 0.8573583491584834:  26%|██▌       | 77/300 [10:58<24:45,  6.66s/it]
Generator loss: 1.5708426285315962, Discriminator loss: 0.8573583491584834:  26%|██▌       | 78/300 [10:58<24:34,  6.64s/it]
Generator loss: 1.5813802248414826, Discriminator loss: 0.8782505826915011:  26%|██▌       | 78/300 [11:05<24:34,  6.64s/it]
Generator loss: 1.5813802248414826, Discriminator loss: 0.8782505826915011:  26%|██▋       | 79/300 [11:05<24:24,  6.63s/it]
Generator loss: 1.567047947908149, Discriminator loss: 0.8690432971891235:  26%|██▋       | 79/300 [11:12<24:24,  6.63s/it] 
Generator loss: 1.567047947908149, Discriminator loss: 0.8690432971891235:  27%|██▋       | 80/300 [11:12<24:13,  6.61s/it]
Generator loss: 1.583584943676696, Discriminator loss: 0.8649900998262798:  27%|██▋       | 80/300 [11:18<24:13,  6.61s/it]
Generator loss: 1.583584943676696, Discriminator loss: 0.8649900998262798:  27%|██▋       | 81/300 [11:18<24:13,  6.64s/it]
Generator loss: 1.5720649168771856, Discriminator loss: 0.8745317069046638:  27%|██▋       | 81/300 [11:25<24:13,  6.64s/it]
Generator loss: 1.5720649168771856, Discriminator loss: 0.8745317069046638:  27%|██▋       | 82/300 [11:25<24:05,  6.63s/it]
Generator loss: 1.5929746877621203, Discriminator loss: 0.884861863711301:  27%|██▋       | 82/300 [11:31<24:05,  6.63s/it] 
Generator loss: 1.5929746877621203, Discriminator loss: 0.884861863711301:  28%|██▊       | 83/300 [11:31<23:56,  6.62s/it]
Generator loss: 1.5993040927192743, Discriminator loss: 0.8676384678658318:  28%|██▊       | 83/300 [11:38<23:56,  6.62s/it]
Generator loss: 1.5993040927192743, Discriminator loss: 0.8676384678658318:  28%|██▊       | 84/300 [11:38<23:48,  6.61s/it]
Generator loss: 1.6024239904740278, Discriminator loss: 0.8599830475800178:  28%|██▊       | 84/300 [11:45<23:48,  6.61s/it]
Generator loss: 1.6024239904740278, Discriminator loss: 0.8599830475800178:  28%|██▊       | 85/300 [11:45<23:43,  6.62s/it]
Generator loss: 1.5926220316220732, Discriminator loss: 0.8531831862295375:  28%|██▊       | 85/300 [11:51<23:43,  6.62s/it]
Generator loss: 1.5926220316220732, Discriminator loss: 0.8531831862295375:  29%|██▊       | 86/300 [11:51<23:42,  6.65s/it]
Generator loss: 1.6410702589680166, Discriminator loss: 0.8368448203100878:  29%|██▊       | 86/300 [11:58<23:42,  6.65s/it]
Generator loss: 1.6410702589680166, Discriminator loss: 0.8368448203100878:  29%|██▉       | 87/300 [11:58<23:34,  6.64s/it]
Generator loss: 1.612115778028965, Discriminator loss: 0.866695671397097:  29%|██▉       | 87/300 [12:05<23:34,  6.64s/it]  
Generator loss: 1.612115778028965, Discriminator loss: 0.866695671397097:  29%|██▉       | 88/300 [12:05<23:26,  6.63s/it]
Generator loss: 1.6191413227249594, Discriminator loss: 0.8370693299700233:  29%|██▉       | 88/300 [12:11<23:26,  6.63s/it]
Generator loss: 1.6191413227249594, Discriminator loss: 0.8370693299700233:  30%|██▉       | 89/300 [12:11<23:19,  6.63s/it]
Generator loss: 1.6128742370535345, Discriminator loss: 0.8590594262761229:  30%|██▉       | 89/300 [12:18<23:19,  6.63s/it]
Generator loss: 1.6128742370535345, Discriminator loss: 0.8590594262761229:  30%|███       | 90/300 [12:18<23:18,  6.66s/it]
Generator loss: 1.6292471666546429, Discriminator loss: 0.8545262178077417:  30%|███       | 90/300 [12:25<23:18,  6.66s/it]
Generator loss: 1.6292471666546429, Discriminator loss: 0.8545262178077417:  30%|███       | 91/300 [12:25<23:07,  6.64s/it]
Generator loss: 1.645205161150764, Discriminator loss: 0.8198951066416853:  30%|███       | 91/300 [12:31<23:07,  6.64s/it] 
Generator loss: 1.645205161150764, Discriminator loss: 0.8198951066416853:  31%|███       | 92/300 [12:31<23:00,  6.64s/it]
Generator loss: 1.625604101401918, Discriminator loss: 0.8479677473797518:  31%|███       | 92/300 [12:38<23:00,  6.64s/it]
Generator loss: 1.625604101401918, Discriminator loss: 0.8479677473797518:  31%|███       | 93/300 [12:38<22:51,  6.63s/it]
Generator loss: 1.6520345750100471, Discriminator loss: 0.8335142885060871:  31%|███       | 93/300 [12:44<22:51,  6.63s/it]
Generator loss: 1.6520345750100471, Discriminator loss: 0.8335142885060871:  31%|███▏      | 94/300 [12:44<22:49,  6.65s/it]
Generator loss: 1.6731279856141876, Discriminator loss: 0.8364165591842988:  31%|███▏      | 94/300 [12:51<22:49,  6.65s/it]
Generator loss: 1.6731279856141876, Discriminator loss: 0.8364165591842988:  32%|███▏      | 95/300 [12:51<22:40,  6.64s/it]
Generator loss: 1.6628490156110596, Discriminator loss: 0.8328475991592688:  32%|███▏      | 95/300 [12:58<22:40,  6.64s/it]
Generator loss: 1.6628490156110596, Discriminator loss: 0.8328475991592688:  32%|███▏      | 96/300 [12:58<22:33,  6.63s/it]
Generator loss: 1.6615130590165363, Discriminator loss: 0.8281888374510933:  32%|███▏      | 96/300 [13:04<22:33,  6.63s/it]
Generator loss: 1.6615130590165363, Discriminator loss: 0.8281888374510933:  32%|███▏      | 97/300 [13:04<22:25,  6.63s/it]
Generator loss: 1.6148593184702538, Discriminator loss: 0.853957395781489:  32%|███▏      | 97/300 [13:11<22:25,  6.63s/it] 
Generator loss: 1.6148593184702538, Discriminator loss: 0.853957395781489:  33%|███▎      | 98/300 [13:11<22:17,  6.62s/it]
Generator loss: 1.6571329468313385, Discriminator loss: 0.8237307260141653:  33%|███▎      | 98/300 [13:18<22:17,  6.62s/it]
Generator loss: 1.6571329468313385, Discriminator loss: 0.8237307260141653:  33%|███▎      | 99/300 [13:18<22:15,  6.65s/it]
Generator loss: 1.6786996196298039, Discriminator loss: 0.8207834148231674:  33%|███▎      | 99/300 [13:24<22:15,  6.65s/it]
Generator loss: 1.6786996196298039, Discriminator loss: 0.8207834148231674:  33%|███▎      | 100/300 [13:24<22:07,  6.64s/it]
Generator loss: 1.6578268540256165, Discriminator loss: 0.8502317338305361:  33%|███▎      | 100/300 [13:31<22:07,  6.64s/it]
Generator loss: 1.6578268540256165, Discriminator loss: 0.8502317338305361:  34%|███▎      | 101/300 [13:31<22:00,  6.63s/it]
Generator loss: 1.6500777036828154, Discriminator loss: 0.8360809832811356:  34%|███▎      | 101/300 [13:37<22:00,  6.63s/it]
Generator loss: 1.6500777036828154, Discriminator loss: 0.8360809832811356:  34%|███▍      | 102/300 [13:37<21:49,  6.61s/it]
Generator loss: 1.667227153392399, Discriminator loss: 0.8129227284122916:  34%|███▍      | 102/300 [13:44<21:49,  6.61s/it] 
Generator loss: 1.667227153392399, Discriminator loss: 0.8129227284122916:  34%|███▍      | 103/300 [13:44<21:47,  6.64s/it]
Generator loss: 1.6800949520924513, Discriminator loss: 0.8065380576778861:  34%|███▍      | 103/300 [13:51<21:47,  6.64s/it]
Generator loss: 1.6800949520924513, Discriminator loss: 0.8065380576778861:  35%|███▍      | 104/300 [13:51<21:41,  6.64s/it]
Generator loss: 1.7048907402683706, Discriminator loss: 0.8073845773058779:  35%|███▍      | 104/300 [13:57<21:41,  6.64s/it]
Generator loss: 1.7048907402683706, Discriminator loss: 0.8073845773058779:  35%|███▌      | 105/300 [13:57<21:33,  6.63s/it]
Generator loss: 1.678999839898418, Discriminator loss: 0.8190580729176017:  35%|███▌      | 105/300 [14:04<21:33,  6.63s/it] 
Generator loss: 1.678999839898418, Discriminator loss: 0.8190580729176017:  35%|███▌      | 106/300 [14:04<21:22,  6.61s/it]
Generator loss: 1.6873094570987366, Discriminator loss: 0.8126953007543788:  35%|███▌      | 106/300 [14:11<21:22,  6.61s/it]
Generator loss: 1.6873094570987366, Discriminator loss: 0.8126953007543788:  36%|███▌      | 107/300 [14:11<21:22,  6.65s/it]
Generator loss: 1.678116421927424, Discriminator loss: 0.8270426212864763:  36%|███▌      | 107/300 [14:17<21:22,  6.65s/it] 
Generator loss: 1.678116421927424, Discriminator loss: 0.8270426212864763:  36%|███▌      | 108/300 [14:17<21:15,  6.64s/it]
Generator loss: 1.6950164301430477, Discriminator loss: 0.8026039232225979:  36%|███▌      | 108/300 [14:24<21:15,  6.64s/it]
Generator loss: 1.6950164301430477, Discriminator loss: 0.8026039232225979:  36%|███▋      | 109/300 [14:24<21:04,  6.62s/it]
Generator loss: 1.7014941169935114, Discriminator loss: 0.7969357147812843:  36%|███▋      | 109/300 [14:30<21:04,  6.62s/it]
Generator loss: 1.7014941169935114, Discriminator loss: 0.7969357147812843:  37%|███▋      | 110/300 [14:30<20:53,  6.60s/it]
Generator loss: 1.719758610953303, Discriminator loss: 0.7976097617955769:  37%|███▋      | 110/300 [14:37<20:53,  6.60s/it] 
Generator loss: 1.719758610953303, Discriminator loss: 0.7976097617955769:  37%|███▋      | 111/300 [14:37<20:44,  6.59s/it]
Generator loss: 1.7388707302949007, Discriminator loss: 0.7936992846867618:  37%|███▋      | 111/300 [14:44<20:44,  6.59s/it]
Generator loss: 1.7388707302949007, Discriminator loss: 0.7936992846867618:  37%|███▋      | 112/300 [14:44<20:40,  6.60s/it]
Generator loss: 1.7232475412242554, Discriminator loss: 0.7827038243412971:  37%|███▋      | 112/300 [14:50<20:40,  6.60s/it]
Generator loss: 1.7232475412242554, Discriminator loss: 0.7827038243412971:  38%|███▊      | 113/300 [14:50<20:31,  6.59s/it]
Generator loss: 1.7635971491827684, Discriminator loss: 0.7883220731335527:  38%|███▊      | 113/300 [14:57<20:31,  6.59s/it]
Generator loss: 1.7635971491827684, Discriminator loss: 0.7883220731335527:  38%|███▊      | 114/300 [14:57<20:22,  6.58s/it]
Generator loss: 1.7574936063850628, Discriminator loss: 0.7908676126424004:  38%|███▊      | 114/300 [15:03<20:22,  6.58s/it]
Generator loss: 1.7574936063850628, Discriminator loss: 0.7908676126424004:  38%|███▊      | 115/300 [15:03<20:15,  6.57s/it]
Generator loss: 1.7410221165593933, Discriminator loss: 0.796536782208611:  38%|███▊      | 115/300 [15:10<20:15,  6.57s/it] 
Generator loss: 1.7410221165593933, Discriminator loss: 0.796536782208611:  39%|███▊      | 116/300 [15:10<20:12,  6.59s/it]
Generator loss: 1.7575040148461567, Discriminator loss: 0.7798431997790056:  39%|███▊      | 116/300 [15:16<20:12,  6.59s/it]
Generator loss: 1.7575040148461567, Discriminator loss: 0.7798431997790056:  39%|███▉      | 117/300 [15:16<20:03,  6.58s/it]
Generator loss: 1.7489790890146704, Discriminator loss: 0.7853775230400702:  39%|███▉      | 117/300 [15:23<20:03,  6.58s/it]
Generator loss: 1.7489790890146704, Discriminator loss: 0.7853775230400702:  39%|███▉      | 118/300 [15:23<19:55,  6.57s/it]
Generator loss: 1.7569141054854673, Discriminator loss: 0.7929856251267826:  39%|███▉      | 118/300 [15:30<19:55,  6.57s/it]
Generator loss: 1.7569141054854673, Discriminator loss: 0.7929856251267826:  40%|███▉      | 119/300 [15:30<19:48,  6.56s/it]
Generator loss: 1.7644893553327112, Discriminator loss: 0.7732302866437856:  40%|███▉      | 119/300 [15:36<19:48,  6.56s/it]
Generator loss: 1.7644893553327112, Discriminator loss: 0.7732302866437856:  40%|████      | 120/300 [15:36<19:45,  6.59s/it]
Generator loss: 1.7821472046129845, Discriminator loss: 0.7615856252172414:  40%|████      | 120/300 [15:43<19:45,  6.59s/it]
Generator loss: 1.7821472046129845, Discriminator loss: 0.7615856252172414:  40%|████      | 121/300 [15:43<19:27,  6.52s/it]
Generator loss: 1.7836939162191223, Discriminator loss: 0.7784580820623566:  40%|████      | 121/300 [15:49<19:27,  6.52s/it]
Generator loss: 1.7836939162191223, Discriminator loss: 0.7784580820623566:  41%|████      | 122/300 [15:49<19:13,  6.48s/it]
Generator loss: 1.7743018295835047, Discriminator loss: 0.7797000899034388:  41%|████      | 122/300 [15:55<19:13,  6.48s/it]
Generator loss: 1.7743018295835047, Discriminator loss: 0.7797000899034388:  41%|████      | 123/300 [15:55<19:01,  6.45s/it]
Generator loss: 1.7860442395595943, Discriminator loss: 0.779537913553855:  41%|████      | 123/300 [16:02<19:01,  6.45s/it] 
Generator loss: 1.7860442395595943, Discriminator loss: 0.779537913553855:  41%|████▏     | 124/300 [16:02<18:52,  6.43s/it]
Generator loss: 1.7961063314886654, Discriminator loss: 0.7632119296228185:  41%|████▏     | 124/300 [16:08<18:52,  6.43s/it]
Generator loss: 1.7961063314886654, Discriminator loss: 0.7632119296228185:  42%|████▏     | 125/300 [16:08<18:46,  6.44s/it]
Generator loss: 1.7912442175781025, Discriminator loss: 0.7627700041322147:  42%|████▏     | 125/300 [16:15<18:46,  6.44s/it]
Generator loss: 1.7912442175781025, Discriminator loss: 0.7627700041322147:  42%|████▏     | 126/300 [16:15<18:37,  6.42s/it]
Generator loss: 1.8034317142823164, Discriminator loss: 0.7996259073124212:  42%|████▏     | 126/300 [16:21<18:37,  6.42s/it]
Generator loss: 1.8034317142823164, Discriminator loss: 0.7996259073124212:  42%|████▏     | 127/300 [16:21<18:31,  6.42s/it]
Generator loss: 1.7972839588628096, Discriminator loss: 0.7540580669746679:  42%|████▏     | 127/300 [16:27<18:31,  6.42s/it]
Generator loss: 1.7972839588628096, Discriminator loss: 0.7540580669746679:  43%|████▎     | 128/300 [16:27<18:22,  6.41s/it]
Generator loss: 1.819234922528267, Discriminator loss: 0.751131749328445:  43%|████▎     | 128/300 [16:34<18:22,  6.41s/it]  
Generator loss: 1.819234922528267, Discriminator loss: 0.751131749328445:  43%|████▎     | 129/300 [16:34<18:14,  6.40s/it]
Generator loss: 1.8124022290987127, Discriminator loss: 0.757165546364644:  43%|████▎     | 129/300 [16:40<18:14,  6.40s/it]
Generator loss: 1.8124022290987127, Discriminator loss: 0.757165546364644:  43%|████▎     | 130/300 [16:40<18:06,  6.39s/it]
Generator loss: 1.8343695323256886, Discriminator loss: 0.7554672592703033:  43%|████▎     | 130/300 [16:46<18:06,  6.39s/it]
Generator loss: 1.8343695323256886, Discriminator loss: 0.7554672592703033:  44%|████▎     | 131/300 [16:46<17:57,  6.38s/it]
Generator loss: 1.8135301514583475, Discriminator loss: 0.7552496725145508:  44%|████▎     | 131/300 [16:53<17:57,  6.38s/it]
Generator loss: 1.8135301514583475, Discriminator loss: 0.7552496725145508:  44%|████▍     | 132/300 [16:53<17:53,  6.39s/it]
Generator loss: 1.7663639468305252, Discriminator loss: 0.843885213136673:  44%|████▍     | 132/300 [16:59<17:53,  6.39s/it] 
Generator loss: 1.7663639468305252, Discriminator loss: 0.843885213136673:  44%|████▍     | 133/300 [16:59<17:53,  6.43s/it]
Generator loss: 1.7719000402618856, Discriminator loss: 0.7477432011681444:  44%|████▍     | 133/300 [17:06<17:53,  6.43s/it]
Generator loss: 1.7719000402618856, Discriminator loss: 0.7477432011681444:  45%|████▍     | 134/300 [17:06<17:46,  6.43s/it]
Generator loss: 1.8142524659633636, Discriminator loss: 0.7370717661345706:  45%|████▍     | 134/300 [17:12<17:46,  6.43s/it]
Generator loss: 1.8142524659633636, Discriminator loss: 0.7370717661345706:  45%|████▌     | 135/300 [17:12<17:38,  6.41s/it]
Generator loss: 1.8233654722571373, Discriminator loss: 0.7425804826266625:  45%|████▌     | 135/300 [17:19<17:38,  6.41s/it]
Generator loss: 1.8233654722571373, Discriminator loss: 0.7425804826266625:  45%|████▌     | 136/300 [17:19<17:32,  6.42s/it]
Generator loss: 1.8471009555984945, Discriminator loss: 0.7435824047116673:  45%|████▌     | 136/300 [17:25<17:32,  6.42s/it]
Generator loss: 1.8471009555984945, Discriminator loss: 0.7435824047116673:  46%|████▌     | 137/300 [17:25<17:30,  6.44s/it]
Generator loss: 1.8517840381930857, Discriminator loss: 0.7490521963028347:  46%|████▌     | 137/300 [17:32<17:30,  6.44s/it]
Generator loss: 1.8517840381930857, Discriminator loss: 0.7490521963028347:  46%|████▌     | 138/300 [17:32<17:23,  6.44s/it]
Generator loss: 1.8829423031386208, Discriminator loss: 0.7372819693649516:  46%|████▌     | 138/300 [17:38<17:23,  6.44s/it]
Generator loss: 1.8829423031386208, Discriminator loss: 0.7372819693649516:  46%|████▋     | 139/300 [17:38<17:24,  6.49s/it]
Generator loss: 1.8680496364831924, Discriminator loss: 0.7670731268384877:  46%|████▋     | 139/300 [17:45<17:24,  6.49s/it]
Generator loss: 1.8680496364831924, Discriminator loss: 0.7670731268384877:  47%|████▋     | 140/300 [17:45<17:16,  6.48s/it]
Generator loss: 1.857833515633555, Discriminator loss: 0.7407474833376267:  47%|████▋     | 140/300 [17:51<17:16,  6.48s/it] 
Generator loss: 1.857833515633555, Discriminator loss: 0.7407474833376267:  47%|████▋     | 141/300 [17:51<17:08,  6.47s/it]
Generator loss: 1.8678517113713657, Discriminator loss: 0.7261389905915541:  47%|████▋     | 141/300 [17:58<17:08,  6.47s/it]
Generator loss: 1.8678517113713657, Discriminator loss: 0.7261389905915541:  47%|████▋     | 142/300 [17:58<17:00,  6.46s/it]
Generator loss: 1.8683480313595604, Discriminator loss: 0.7343018848229858:  47%|████▋     | 142/300 [18:04<17:00,  6.46s/it]
Generator loss: 1.8683480313595604, Discriminator loss: 0.7343018848229858:  48%|████▊     | 143/300 [18:04<16:56,  6.47s/it]
Generator loss: 1.8793934167307966, Discriminator loss: 0.735251775559257:  48%|████▊     | 143/300 [18:11<16:56,  6.47s/it] 
Generator loss: 1.8793934167307966, Discriminator loss: 0.735251775559257:  48%|████▊     | 144/300 [18:11<16:49,  6.47s/it]
Generator loss: 1.8663447096067316, Discriminator loss: 0.7562430614934248:  48%|████▊     | 144/300 [18:17<16:49,  6.47s/it]
Generator loss: 1.8663447096067316, Discriminator loss: 0.7562430614934248:  48%|████▊     | 145/300 [18:17<16:46,  6.49s/it]
Generator loss: 1.853991342818036, Discriminator loss: 0.7314641164506183:  48%|████▊     | 145/300 [18:23<16:46,  6.49s/it] 
Generator loss: 1.853991342818036, Discriminator loss: 0.7314641164506183:  49%|████▊     | 146/300 [18:23<16:36,  6.47s/it]
Generator loss: 1.8588006592848723, Discriminator loss: 0.7245916391120237:  49%|████▊     | 146/300 [18:30<16:36,  6.47s/it]
Generator loss: 1.8588006592848723, Discriminator loss: 0.7245916391120237:  49%|████▉     | 147/300 [18:30<16:31,  6.48s/it]
Generator loss: 1.887835018336773, Discriminator loss: 0.7272283965173889:  49%|████▉     | 147/300 [18:36<16:31,  6.48s/it] 
Generator loss: 1.887835018336773, Discriminator loss: 0.7272283965173889:  49%|████▉     | 148/300 [18:36<16:21,  6.45s/it]
Generator loss: 1.870099083027419, Discriminator loss: 0.7440555796903723:  49%|████▉     | 148/300 [18:43<16:21,  6.45s/it]
Generator loss: 1.870099083027419, Discriminator loss: 0.7440555796903723:  50%|████▉     | 149/300 [18:43<16:14,  6.45s/it]
Generator loss: 1.8805445914759356, Discriminator loss: 0.7166378572583199:  50%|████▉     | 149/300 [18:49<16:14,  6.45s/it]
Generator loss: 1.8805445914759356, Discriminator loss: 0.7166378572583199:  50%|█████     | 150/300 [18:49<16:06,  6.44s/it]
Generator loss: 1.9056233597152374, Discriminator loss: 0.730847950805636:  50%|█████     | 150/300 [18:56<16:06,  6.44s/it] 
Generator loss: 1.9056233597152374, Discriminator loss: 0.730847950805636:  50%|█████     | 151/300 [18:56<16:03,  6.47s/it]
Generator loss: 1.9140165527077282, Discriminator loss: 0.7123787074404604:  50%|█████     | 151/300 [19:02<16:03,  6.47s/it]
Generator loss: 1.9140165527077282, Discriminator loss: 0.7123787074404604:  51%|█████     | 152/300 [19:02<15:56,  6.46s/it]
Generator loss: 1.9222180150887545, Discriminator loss: 0.7199763120973811:  51%|█████     | 152/300 [19:09<15:56,  6.46s/it]
Generator loss: 1.9222180150887545, Discriminator loss: 0.7199763120973811:  51%|█████     | 153/300 [19:09<15:50,  6.46s/it]
Generator loss: 1.8877925149658148, Discriminator loss: 0.7476223606397124:  51%|█████     | 153/300 [19:15<15:50,  6.46s/it]
Generator loss: 1.8877925149658148, Discriminator loss: 0.7476223606397124:  51%|█████▏    | 154/300 [19:15<15:40,  6.44s/it]
Generator loss: 1.883141139850897, Discriminator loss: 0.7233283677521873:  51%|█████▏    | 154/300 [19:22<15:40,  6.44s/it] 
Generator loss: 1.883141139850897, Discriminator loss: 0.7233283677521873:  52%|█████▏    | 155/300 [19:22<15:34,  6.45s/it]
Generator loss: 1.8941808974041658, Discriminator loss: 0.7128842459882007:  52%|█████▏    | 155/300 [19:28<15:34,  6.45s/it]
Generator loss: 1.8941808974041658, Discriminator loss: 0.7128842459882007:  52%|█████▏    | 156/300 [19:28<15:31,  6.47s/it]
Generator loss: 1.9113889801151611, Discriminator loss: 0.7131333828848951:  52%|█████▏    | 156/300 [19:35<15:31,  6.47s/it]
Generator loss: 1.9113889801151611, Discriminator loss: 0.7131333828848951:  52%|█████▏    | 157/300 [19:35<15:30,  6.51s/it]
Generator loss: 1.907938465476036, Discriminator loss: 0.7078887916663114:  52%|█████▏    | 157/300 [19:41<15:30,  6.51s/it] 
Generator loss: 1.907938465476036, Discriminator loss: 0.7078887916663114:  53%|█████▎    | 158/300 [19:41<15:24,  6.51s/it]
Generator loss: 1.911291787729544, Discriminator loss: 0.7252739961532986:  53%|█████▎    | 158/300 [19:48<15:24,  6.51s/it]
Generator loss: 1.911291787729544, Discriminator loss: 0.7252739961532986:  53%|█████▎    | 159/300 [19:48<15:16,  6.50s/it]
Generator loss: 1.9184895096456303, Discriminator loss: 0.7103787857820006:  53%|█████▎    | 159/300 [19:54<15:16,  6.50s/it]
Generator loss: 1.9184895096456303, Discriminator loss: 0.7103787857820006:  53%|█████▎    | 160/300 [19:54<15:07,  6.48s/it]
Generator loss: 1.6518787658076903, Discriminator loss: 1.6146739887840607:  53%|█████▎    | 160/300 [20:01<15:07,  6.48s/it]
Generator loss: 1.6518787658076903, Discriminator loss: 1.6146739887840607:  54%|█████▎    | 161/300 [20:01<14:59,  6.47s/it]
Generator loss: 1.1205224596402223, Discriminator loss: 1.0990365059936749:  54%|█████▍    | 162/300 [20:07<14:54,  6.48s/it]
Generator loss: 1.3062065813471289, Discriminator loss: 0.9539765860227978:  54%|█████▍    | 163/300 [20:13<14:47,  6.48s/it]
Generator loss: 1.5725498107426308, Discriminator loss: 0.8217732069246909:  55%|█████▍    | 164/300 [20:20<14:43,  6.50s/it]
Generator loss: 1.6757422453340363, Discriminator loss: 0.7526570115895832:  55%|█████▌    | 165/300 [20:26<14:35,  6.49s/it]
Generator loss: 1.7489459041286917, Discriminator loss: 0.7313809710390428:  55%|█████▌    | 166/300 [20:33<14:26,  6.46s/it]
Generator loss: 1.793169997194234, Discriminator loss: 0.7243372022229082:  56%|█████▌    | 167/300 [20:39<14:17,  6.45s/it]
Generator loss: 1.8074285098735023, Discriminator loss: 0.7162313873276991:  56%|█████▌    | 168/300 [20:46<14:11,  6.45s/it]
Generator loss: 1.8447622586699093, Discriminator loss: 0.7173904425957623:  56%|█████▋    | 169/300 [20:52<14:06,  6.46s/it]
Generator loss: 1.8570084054680431, Discriminator loss: 0.7179789012845825:  57%|█████▋    | 170/300 [20:59<14:03,  6.49s/it]
Generator loss: 1.8499646011520834, Discriminator loss: 0.7092674168593743:  57%|█████▋    | 171/300 [21:05<13:55,  6.48s/it]
Generator loss: 1.8719150879803825, Discriminator loss: 0.7033579261863933:  57%|█████▋    | 172/300 [21:12<13:47,  6.47s/it]
Generator loss: 1.8675539554918514, Discriminator loss: 0.7109484506003997:  58%|█████▊    | 173/300 [21:18<13:39,  6.46s/it]
Generator loss: 1.886877525378676, Discriminator loss: 0.715422235429287:  58%|█████▊    | 174/300 [21:25<13:30,  6.43s/it]
Generator loss: 1.8639318084015566, Discriminator loss: 0.7092845672193695:  58%|█████▊    | 175/300 [21:31<13:23,  6.43s/it]
Generator loss: 1.8762365180779905, Discriminator loss: 0.7277653979904511:  59%|█████▊    | 176/300 [21:37<13:20,  6.46s/it]
Generator loss: 1.8840555084102295, Discriminator loss: 0.7117495133596308:  59%|█████▉    | 177/300 [21:44<13:09,  6.42s/it]
Generator loss: 1.8722961474867428, Discriminator loss: 0.7381857567850281:  59%|█████▉    | 178/300 [21:50<12:58,  6.38s/it]
Generator loss: 1.879811488530215, Discriminator loss: 0.7050962079973782:  60%|█████▉    | 179/300 [21:57<12:54,  6.40s/it]
Generator loss: 1.8758899578276802, Discriminator loss: 0.7048510142108974:  60%|██████    | 180/300 [22:03<12:50,  6.42s/it]
Generator loss: 1.8873987224172144, Discriminator loss: 0.7038660505238701:  60%|██████    | 181/300 [22:09<12:44,  6.42s/it]
Generator loss: 1.8888916346956701, Discriminator loss: 0.7137206582462087:  61%|██████    | 182/300 [22:16<12:37,  6.42s/it]
Generator loss: 1.9036011213765425, Discriminator loss: 0.7122918327941614:  61%|██████    | 183/300 [22:22<12:27,  6.39s/it]
Generator loss: 1.8918571752660416, Discriminator loss: 0.7075192012331065:  61%|██████▏   | 184/300 [22:28<12:18,  6.37s/it]
Generator loss: 1.9028138803208576, Discriminator loss: 0.7169827043133623:  62%|██████▏   | 185/300 [22:35<12:13,  6.38s/it]
Generator loss: 1.8884707106386913, Discriminator loss: 0.7273798655061161:  62%|██████▏   | 186/300 [22:41<12:10,  6.41s/it]
Generator loss: 1.89948268848307, Discriminator loss: 0.6963861961575115:  62%|██████▏   | 187/300 [22:48<12:03,  6.41s/it]
Generator loss: 1.8734046777381617, Discriminator loss: 0.71642957058023:  63%|██████▎   | 188/300 [22:54<11:57,  6.41s/it]
Generator loss: 1.897829573820619, Discriminator loss: 0.7021876989918596:  63%|██████▎   | 189/300 [23:00<11:47,  6.38s/it]
Generator loss: 1.9079211196478676, Discriminator loss: 0.7051354537115377:  63%|██████▎   | 190/300 [23:07<11:38,  6.35s/it]
Generator loss: 1.8897704271709217, Discriminator loss: 0.7202173291760332:  64%|██████▎   | 191/300 [23:13<11:29,  6.32s/it]
Generator loss: 1.896269340725506, Discriminator loss: 0.6983693526948199:  64%|██████▍   | 192/300 [23:19<11:22,  6.32s/it]
Generator loss: 1.9090866350075777, Discriminator loss: 0.7042679291437653:  64%|██████▍   | 193/300 [23:26<11:15,  6.32s/it]
Generator loss: 1.9113440697684008, Discriminator loss: 0.6876289436922354:  65%|██████▍   | 194/300 [23:32<11:13,  6.35s/it]
Generator loss: 1.903952716904528, Discriminator loss: 0.7130992004976553:  65%|██████▌   | 195/300 [23:38<11:06,  6.35s/it]
Generator loss: 1.9276543259620667, Discriminator loss: 0.6819565396975068:  65%|██████▌   | 196/300 [23:45<11:00,  6.35s/it]
Generator loss: 1.9072776890414602, Discriminator loss: 0.7404027844176573:  66%|██████▌   | 197/300 [23:51<10:53,  6.35s/it]
Generator loss: 1.9140455117997002, Discriminator loss: 0.695412377224249:  66%|██████▌   | 198/300 [23:57<10:47,  6.35s/it]
Generator loss: 1.939264622681281, Discriminator loss: 0.6866086449693231:  66%|██████▋   | 199/300 [24:04<10:40,  6.34s/it]
Generator loss: 1.8983901568195398, Discriminator loss: 0.7000106559956775:  67%|██████▋   | 200/300 [24:10<10:37,  6.37s/it]
Generator loss: 1.9391444851370419, Discriminator loss: 0.6873982890563852:  67%|██████▋   | 201/300 [24:17<10:29,  6.36s/it]
Generator loss: 1.937527555753203, Discriminator loss: 0.6812970143030671:  67%|██████▋   | 202/300 [24:23<10:22,  6.36s/it]
Generator loss: 1.9439553898923538, Discriminator loss: 0.6840969950837248:  68%|██████▊   | 203/300 [24:29<10:15,  6.35s/it]
Generator loss: 1.9516027254216812, Discriminator loss: 0.6815296915524146:  68%|██████▊   | 204/300 [24:36<10:09,  6.35s/it]
Generator loss: 1.947387943373007, Discriminator loss: 0.7077010602635496:  68%|██████▊   | 205/300 [24:42<10:02,  6.34s/it]
Generator loss: 1.943607038434814, Discriminator loss: 0.6858728102901402:  69%|██████▊   | 206/300 [24:48<09:56,  6.34s/it]
Generator loss: 1.955567304702366, Discriminator loss: 0.6870937211548581:  69%|██████▉   | 207/300 [24:55<09:52,  6.37s/it]
Generator loss: 1.939436419045224, Discriminator loss: 0.684797972878989:  69%|██████▉   | 208/300 [25:01<09:45,  6.36s/it]
Generator loss: 1.951779685476247, Discriminator loss: 0.6865587295854793:  70%|██████▉   | 209/300 [25:07<09:38,  6.36s/it]
Generator loss: 1.9524799935957964, Discriminator loss: 0.6790948058752453:  70%|███████   | 210/300 [25:14<09:31,  6.35s/it]
Generator loss: 1.9670421332120895, Discriminator loss: 0.6861023486537092:  70%|███████   | 211/300 [25:20<09:24,  6.35s/it]
Generator loss: 1.9545674709712757, Discriminator loss: 0.6863196763922187:  71%|███████   | 212/300 [25:26<09:18,  6.34s/it]
Generator loss: 1.958948059117093, Discriminator loss: 0.6826620995998383:  71%|███████   | 213/300 [25:33<09:14,  6.37s/it]
Generator loss: 1.9341519448687048, Discriminator loss: 0.6950189532602534:  71%|███████▏  | 214/300 [25:39<09:06,  6.36s/it]
Generator loss: 1.9501662600566358, Discriminator loss: 0.6898260752067846:  72%|███████▏  | 215/300 [25:46<08:59,  6.35s/it]
Generator loss: 1.9576767919694675, Discriminator loss: 0.6672668312402332:  72%|███████▏  | 216/300 [25:52<08:53,  6.35s/it]
Generator loss: 1.9782413156593548, Discriminator loss: 0.6659881433143335:  72%|███████▏  | 217/300 [25:58<08:46,  6.35s/it]
Generator loss: 1.97869812039768, Discriminator loss: 0.6723256720339551:  73%|███████▎  | 218/300 [26:05<08:40,  6.34s/it]
Generator loss: 1.9745400394586956, Discriminator loss: 0.6871117230723885:  73%|███████▎  | 219/300 [26:11<08:35,  6.37s/it]
Generator loss: 1.9665792154915192, Discriminator loss: 0.6812433035058134:  73%|███████▎  | 220/300 [26:17<08:28,  6.36s/it]
Generator loss: 1.9785992275266087, Discriminator loss: 0.67591892182827:  74%|███████▎  | 221/300 [26:24<08:21,  6.35s/it]
Generator loss: 1.973692320725497, Discriminator loss: 0.676657390945098:  74%|███████▍  | 222/300 [26:30<08:15,  6.35s/it]
Generator loss: 1.9833708035157007, Discriminator loss: 0.6975493062944973:  74%|███████▍  | 223/300 [26:36<08:08,  6.34s/it]
Generator loss: 1.9668212234973907, Discriminator loss: 0.6715009891811539:  75%|███████▍  | 224/300 [26:43<08:02,  6.34s/it]
Generator loss: 1.9748070257551529, Discriminator loss: 0.6686904412858626:  75%|███████▌  | 225/300 [26:49<07:57,  6.37s/it]
Generator loss: 2.0004925999571297, Discriminator loss: 0.6750858426094055:  75%|███████▌  | 226/300 [26:55<07:52,  6.38s/it]
Generator loss: 1.9894653199350132, Discriminator loss: 0.6713695771553937:  76%|███████▌  | 227/300 [27:02<07:47,  6.40s/it]
Generator loss: 1.9966776563840754, Discriminator loss: 0.6589639384080382:  76%|███████▌  | 228/300 [27:08<07:40,  6.39s/it]
Generator loss: 1.994829827810035, Discriminator loss: 0.6754506212823531:  76%|███████▋  | 229/300 [27:15<07:33,  6.39s/it]
Generator loss: 2.001828783575226, Discriminator loss: 0.650514531223213:  77%|███████▋  | 230/300 [27:21<07:28,  6.40s/it]
Generator loss: 2.0208084548220917, Discriminator loss: 0.6529218304683181:  77%|███████▋  | 231/300 [27:28<07:23,  6.43s/it]
Generator loss: 2.000203351764118, Discriminator loss: 0.665237603818669:  77%|███████▋  | 232/300 [27:34<07:15,  6.40s/it]
Generator loss: 2.0168945166994545, Discriminator loss: 0.6575310440624461:  78%|███████▊  | 233/300 [27:40<07:08,  6.40s/it]
Generator loss: 2.0304839768830467, Discriminator loss: 0.6567836140885073:  78%|███████▊  | 234/300 [27:47<07:02,  6.40s/it]
Generator loss: 1.983285310513833, Discriminator loss: 0.692187463535982:  78%|███████▊  | 235/300 [27:53<06:55,  6.39s/it]
Generator loss: 1.9890784682596432, Discriminator loss: 0.6605642341515597:  79%|███████▊  | 236/300 [28:00<06:49,  6.39s/it]
Generator loss: 2.0142978377201977, Discriminator loss: 0.653194293818053:  79%|███████▉  | 237/300 [28:06<06:44,  6.42s/it]
Generator loss: 2.049765216953614, Discriminator loss: 0.6443047887262177:  79%|███████▉  | 238/300 [28:12<06:38,  6.43s/it]
Generator loss: 2.0278301063705895, Discriminator loss: 0.64878829963067:  80%|███████▉  | 239/300 [28:19<06:31,  6.42s/it]
Generator loss: 2.043862792498925, Discriminator loss: 0.658869181485737:  80%|████████  | 240/300 [28:25<06:24,  6.41s/it]
Generator loss: 2.0490779552389595, Discriminator loss: 0.6539010339800049:  80%|████████  | 241/300 [28:32<06:18,  6.41s/it]
Generator loss: 2.0562286771395626, Discriminator loss: 0.6415835838107502:  81%|████████  | 242/300 [28:38<06:12,  6.42s/it]
Generator loss: 2.0574584980221355, Discriminator loss: 0.633175873581101:  81%|████████  | 243/300 [28:45<06:07,  6.44s/it]
Generator loss: 2.0515074326711544, Discriminator loss: 0.6391502916812897:  81%|████████▏ | 244/300 [28:51<05:59,  6.43s/it]
Generator loss: 2.0736149461830364, Discriminator loss: 0.6443934633451349:  82%|████████▏ | 245/300 [28:57<05:53,  6.43s/it]
Generator loss: 2.0439358862007366, Discriminator loss: 0.675808913567487:  82%|████████▏ | 246/300 [29:04<05:46,  6.42s/it]
Generator loss: 2.0692701777991127, Discriminator loss: 0.6446802813340636:  82%|████████▏ | 247/300 [29:10<05:40,  6.42s/it]
Generator loss: 2.049621323452276, Discriminator loss: 0.6281508106519195:  83%|████████▎ | 248/300 [29:17<05:33,  6.41s/it]
Generator loss: 2.068838319357704, Discriminator loss: 0.6554880567333278:  83%|████████▎ | 249/300 [29:23<05:27,  6.42s/it]
Generator loss: 2.0447138083331726, Discriminator loss: 0.6430171342457042:  83%|████████▎ | 250/300 [29:30<05:22,  6.44s/it]
Generator loss: 2.0798104010960636, Discriminator loss: 0.6381222447928261:  84%|████████▎ | 251/300 [29:36<05:15,  6.43s/it]
Generator loss: 2.071589014985982, Discriminator loss: 0.6498981937766075:  84%|████████▍ | 252/300 [29:42<05:08,  6.43s/it]
Generator loss: 2.0728830090340447, Discriminator loss: 0.6422935833825785:  84%|████████▍ | 253/300 [29:49<05:01,  6.42s/it]
Generator loss: 2.059836439350072, Discriminator loss: 0.6349203437566757:  85%|████████▍ | 254/300 [29:55<04:55,  6.41s/it]
Generator loss: 2.091802353368086, Discriminator loss: 0.6277351261061781:  85%|████████▌ | 255/300 [30:02<04:48,  6.42s/it]
Generator loss: 2.0656274004894146, Discriminator loss: 0.6457043259459383:  85%|████████▌ | 256/300 [30:08<04:43,  6.45s/it]
Generator loss: 2.1072967964060165, Discriminator loss: 0.6247626034652486:  86%|████████▌ | 257/300 [30:15<04:36,  6.44s/it]
Generator loss: 2.0962683190317715, Discriminator loss: 0.631443692919086:  86%|████████▌ | 258/300 [30:21<04:30,  6.43s/it]
Generator loss: 2.1010903006090835, Discriminator loss: 0.6387709094321027:  86%|████████▋ | 259/300 [30:27<04:23,  6.43s/it]
Generator loss: 2.108405414749594, Discriminator loss: 0.6291532516479492:  87%|████████▋ | 260/300 [30:34<04:17,  6.43s/it]
Generator loss: 2.1114456373102524, Discriminator loss: 0.6167570541010183:  87%|████████▋ | 261/300 [30:40<04:10,  6.43s/it]
Generator loss: 2.1329608182696735, Discriminator loss: 0.6212209826883148:  87%|████████▋ | 262/300 [30:47<04:05,  6.45s/it]
Generator loss: 2.120411331162733, Discriminator loss: 0.6373535583124441:  88%|████████▊ | 263/300 [30:53<03:59,  6.48s/it]
Generator loss: 2.099952261237537, Discriminator loss: 0.6335013679721776:  88%|████████▊ | 264/300 [31:00<03:54,  6.52s/it]
Generator loss: 2.1003993810976254, Discriminator loss: 0.6331959783154375:  88%|████████▊ | 265/300 [31:07<03:49,  6.55s/it]
Generator loss: 2.088849120280322, Discriminator loss: 0.6298180586274933:  89%|████████▊ | 266/300 [31:13<03:43,  6.57s/it]
Generator loss: 2.1161189938292786, Discriminator loss: 0.6166971799205331:  89%|████████▉ | 267/300 [31:20<03:37,  6.59s/it]
Generator loss: 2.142376761226093, Discriminator loss: 0.6130306541043169:  89%|████████▉ | 268/300 [31:27<03:32,  6.65s/it]
Generator loss: 2.1195085732375873, Discriminator loss: 0.6244634438086959:  90%|████████▉ | 269/300 [31:33<03:26,  6.67s/it]
Generator loss: 2.136966422200203, Discriminator loss: 0.6256932173581684:  90%|█████████ | 270/300 [31:40<03:20,  6.69s/it]
Generator loss: 2.1253008071114037, Discriminator loss: 0.6211840646231875:  90%|█████████ | 271/300 [31:47<03:14,  6.70s/it]
Generator loss: 2.1245981340899185, Discriminator loss: 0.6270312956150841:  91%|█████████ | 272/300 [31:53<03:07,  6.71s/it]
Generator loss: 2.151730140342432, Discriminator loss: 0.6084378054913353:  91%|█████████ | 273/300 [32:00<03:01,  6.72s/it]
Generator loss: 2.1285239063641606, Discriminator loss: 0.6252047546646174:  91%|█████████▏| 274/300 [32:07<02:54,  6.73s/it]
Generator loss: 2.1367137300617554, Discriminator loss: 0.6136608995935496:  92%|█████████▏| 275/300 [32:14<02:47,  6.71s/it]
Generator loss: 2.147204104153549, Discriminator loss: 0.6176723807173616:  92%|█████████▏| 276/300 [32:20<02:41,  6.71s/it]
Generator loss: 2.1437266688136494, Discriminator loss: 0.6095429987591856:  92%|█████████▏| 277/300 [32:27<02:33,  6.68s/it]
Generator loss: 2.152301244875964, Discriminator loss: 0.618432903991026:  93%|█████████▎| 278/300 [32:34<02:27,  6.69s/it]
Generator loss: 2.1435503083116867, Discriminator loss: 0.6124763900742811:  93%|█████████▎| 279/300 [32:40<02:20,  6.69s/it]
Generator loss: 2.1760846148518955, Discriminator loss: 0.6109043350991081:  93%|█████████▎| 280/300 [32:47<02:14,  6.70s/it]
Generator loss: 2.1473377986865887, Discriminator loss: 0.6142729858265203:  94%|█████████▎| 281/300 [32:54<02:07,  6.70s/it]
Generator loss: 2.147419778739705, Discriminator loss: 0.6075115444905618:  94%|█████████▍| 282/300 [33:00<02:00,  6.68s/it]
Generator loss: 2.1586636971024906, Discriminator loss: 0.6037897603476748:  94%|█████████▍| 283/300 [33:07<01:53,  6.67s/it]
Generator loss: 2.1525688136325165, Discriminator loss: 0.611905101029312:  95%|█████████▍| 284/300 [33:14<01:46,  6.66s/it]
Generator loss: 2.1883829025661243, Discriminator loss: 0.6092779662679223:  95%|█████████▌| 285/300 [33:20<01:39,  6.66s/it]
Generator loss: 2.175755638410063, Discriminator loss: 0.5928393164101768:  95%|█████████▌| 286/300 [33:27<01:33,  6.67s/it]
Generator loss: 2.1687005433966133, Discriminator loss: 0.6157251898856724:  96%|█████████▌| 287/300 [33:34<01:26,  6.65s/it]
Generator loss: 2.16642308585784, Discriminator loss: 0.6245952195980969:  96%|█████████▌| 288/300 [33:40<01:19,  6.63s/it]
Generator loss: 2.178683908546672, Discriminator loss: 0.593368058476378:  96%|█████████▋| 289/300 [33:47<01:12,  6.62s/it]
Generator loss: 2.181570036446347, Discriminator loss: 0.5935841163291651:  97%|█████████▋| 290/300 [33:53<01:06,  6.62s/it]
Generator loss: 2.180004572167116, Discriminator loss: 0.6197993030004642:  97%|█████████▋| 291/300 [34:00<00:59,  6.61s/it]
Generator loss: 2.157050187096876, Discriminator loss: 0.5961371349061236:  97%|█████████▋| 292/300 [34:07<00:52,  6.62s/it]
Generator loss: 2.181761022876291, Discriminator loss: 0.5929972853730706:  98%|█████████▊| 293/300 [34:13<00:46,  6.65s/it]
Generator loss: 2.198557503959712, Discriminator loss: 0.6052266035009833:  98%|█████████▊| 294/300 [34:20<00:39,  6.64s/it]
Generator loss: 2.1911894468700184, Discriminator loss: 0.5976759361870149:  98%|█████████▊| 295/300 [34:27<00:33,  6.62s/it]
Generator loss: 2.207013448371607, Discriminator loss: 0.5941765610786045:  99%|█████████▊| 296/300 [34:33<00:26,  6.60s/it]
Generator loss: 2.2090228690820584, Discriminator loss: 0.5952651058049763:  99%|█████████▉| 297/300 [34:40<00:19,  6.61s/it]
Generator loss: 2.207116046372582, Discriminator loss: 0.5886701748651617:  99%|█████████▉| 298/300 [34:46<00:13,  6.60s/it]
Generator loss: 2.222158696721582, Discriminator loss: 0.5882657041006228: 100%|█████████▉| 299/300 [34:53<00:06,  6.64s/it]
Generator loss: 2.2106246729107464, Discriminator loss: 0.5984119886861128: 100%|██████████| 300/300 [35:00<00:00,  7.00s/it]
Training Completed!

serious_mnist.py

  0%|          | 0/1875 [00:00<?, ?it/s]
loss 2.32 accuracy 0.09:   0%|          | 1/1875 [00:07<3:53:47,  7.49s/it]
loss 2.32 accuracy 0.06:   0%|          | 2/1875 [00:10<2:23:39,  4.60s/it]
loss 2.34 accuracy 0.12:   0%|          | 2/1875 [00:10<2:23:39,  4.60s/it]
loss 2.29 accuracy 0.25:   0%|          | 2/1875 [00:10<2:23:39,  4.60s/it]
loss 2.29 accuracy 0.22:   0%|          | 2/1875 [00:10<2:23:39,  4.60s/it]
loss 2.24 accuracy 0.22:   0%|          | 2/1875 [00:10<2:23:39,  4.60s/it]
loss 2.26 accuracy 0.06:   0%|          | 7/1875 [00:10<28:06,  1.11it/s]  
loss 2.35 accuracy 0.19:   0%|          | 7/1875 [00:10<28:06,  1.11it/s]
loss 2.37 accuracy 0.19:   0%|          | 7/1875 [00:10<28:06,  1.11it/s]
loss 2.25 accuracy 0.12:   0%|          | 7/1875 [00:10<28:06,  1.11it/s]
loss 2.16 accuracy 0.31:   0%|          | 7/1875 [00:10<28:06,  1.11it/s]
loss 2.24 accuracy 0.12:   1%|          | 12/1875 [00:10<13:23,  2.32it/s]
loss 2.16 accuracy 0.16:   1%|          | 12/1875 [00:10<13:23,  2.32it/s]
loss 2.14 accuracy 0.09:   1%|          | 12/1875 [00:10<13:23,  2.32it/s]
loss 2.18 accuracy 0.22:   1%|          | 12/1875 [00:10<13:23,  2.32it/s]
loss 2.18 accuracy 0.25:   1%|          | 12/1875 [00:10<13:23,  2.32it/s]
loss 2.03 accuracy 0.41:   1%|          | 17/1875 [00:10<07:51,  3.94it/s]
loss 2.00 accuracy 0.25:   1%|          | 17/1875 [00:10<07:51,  3.94it/s]
loss 1.91 accuracy 0.47:   1%|          | 17/1875 [00:10<07:51,  3.94it/s]
loss 2.09 accuracy 0.22:   1%|          | 17/1875 [00:10<07:51,  3.94it/s]
loss 1.93 accuracy 0.34:   1%|          | 17/1875 [00:10<07:51,  3.94it/s]
loss 1.78 accuracy 0.38:   1%|          | 22/1875 [00:10<05:05,  6.06it/s]
loss 1.94 accuracy 0.38:   1%|          | 22/1875 [00:10<05:05,  6.06it/s]
loss 1.88 accuracy 0.47:   1%|          | 22/1875 [00:10<05:05,  6.06it/s]
loss 1.90 accuracy 0.38:   1%|          | 22/1875 [00:10<05:05,  6.06it/s]
loss 1.89 accuracy 0.41:   1%|▏         | 27/1875 [00:10<03:31,  8.74it/s]
loss 1.78 accuracy 0.31:   1%|▏         | 27/1875 [00:10<03:31,  8.74it/s]
loss 1.76 accuracy 0.41:   1%|▏         | 27/1875 [00:10<03:31,  8.74it/s]
loss 1.96 accuracy 0.19:   1%|▏         | 27/1875 [00:10<03:31,  8.74it/s]
loss 1.74 accuracy 0.31:   1%|▏         | 27/1875 [00:10<03:31,  8.74it/s]
loss 1.62 accuracy 0.44:   2%|▏         | 32/1875 [00:10<02:33, 12.00it/s]
loss 1.69 accuracy 0.44:   2%|▏         | 32/1875 [00:10<02:33, 12.00it/s]
loss 1.74 accuracy 0.41:   2%|▏         | 32/1875 [00:10<02:33, 12.00it/s]
loss 1.77 accuracy 0.38:   2%|▏         | 32/1875 [00:10<02:33, 12.00it/s]
loss 1.67 accuracy 0.44:   2%|▏         | 32/1875 [00:10<02:33, 12.00it/s]
loss 1.58 accuracy 0.47:   2%|▏         | 37/1875 [00:10<01:56, 15.75it/s]
loss 1.72 accuracy 0.34:   2%|▏         | 37/1875 [00:10<01:56, 15.75it/s]
loss 1.58 accuracy 0.41:   2%|▏         | 37/1875 [00:10<01:56, 15.75it/s]
loss 1.63 accuracy 0.50:   2%|▏         | 37/1875 [00:10<01:56, 15.75it/s]
loss 1.66 accuracy 0.44:   2%|▏         | 37/1875 [00:10<01:56, 15.75it/s]
loss 1.78 accuracy 0.38:   2%|▏         | 42/1875 [00:10<01:32, 19.90it/s]
loss 1.42 accuracy 0.47:   2%|▏         | 42/1875 [00:10<01:32, 19.90it/s]
loss 1.38 accuracy 0.53:   2%|▏         | 42/1875 [00:10<01:32, 19.90it/s]
loss 1.71 accuracy 0.34:   2%|▏         | 42/1875 [00:11<01:32, 19.90it/s]
loss 1.47 accuracy 0.41:   2%|▏         | 42/1875 [00:11<01:32, 19.90it/s]
loss 1.51 accuracy 0.47:   3%|▎         | 47/1875 [00:11<01:15, 24.18it/s]
loss 1.61 accuracy 0.38:   3%|▎         | 47/1875 [00:11<01:15, 24.18it/s]
loss 1.33 accuracy 0.53:   3%|▎         | 47/1875 [00:11<01:15, 24.18it/s]
loss 1.51 accuracy 0.56:   3%|▎         | 47/1875 [00:11<01:15, 24.18it/s]
loss 1.26 accuracy 0.69:   3%|▎         | 47/1875 [00:11<01:15, 24.18it/s]
loss 1.26 accuracy 0.56:   3%|▎         | 52/1875 [00:11<01:04, 28.33it/s]
loss 1.31 accuracy 0.56:   3%|▎         | 52/1875 [00:11<01:04, 28.33it/s]
loss 1.53 accuracy 0.44:   3%|▎         | 52/1875 [00:11<01:04, 28.33it/s]
loss 1.29 accuracy 0.62:   3%|▎         | 52/1875 [00:11<01:04, 28.33it/s]
loss 1.32 accuracy 0.53:   3%|▎         | 52/1875 [00:11<01:04, 28.33it/s]
loss 1.39 accuracy 0.53:   3%|▎         | 57/1875 [00:11<00:56, 32.13it/s]
loss 1.43 accuracy 0.47:   3%|▎         | 57/1875 [00:11<00:56, 32.13it/s]
loss 1.23 accuracy 0.59:   3%|▎         | 57/1875 [00:11<00:56, 32.13it/s]
loss 1.26 accuracy 0.72:   3%|▎         | 57/1875 [00:11<00:56, 32.13it/s]
loss 1.60 accuracy 0.50:   3%|▎         | 57/1875 [00:11<00:56, 32.13it/s]
loss 1.19 accuracy 0.72:   3%|▎         | 62/1875 [00:11<00:51, 35.39it/s]
loss 1.44 accuracy 0.59:   3%|▎         | 62/1875 [00:11<00:51, 35.39it/s]
loss 1.54 accuracy 0.44:   3%|▎         | 62/1875 [00:11<00:51, 35.39it/s]
loss 1.45 accuracy 0.47:   3%|▎         | 62/1875 [00:11<00:51, 35.39it/s]
loss 1.38 accuracy 0.53:   3%|▎         | 62/1875 [00:11<00:51, 35.39it/s]
loss 1.12 accuracy 0.66:   4%|▎         | 67/1875 [00:11<00:47, 38.05it/s]
loss 1.32 accuracy 0.53:   4%|▎         | 67/1875 [00:11<00:47, 38.05it/s]
loss 1.37 accuracy 0.47:   4%|▎         | 67/1875 [00:11<00:47, 38.05it/s]
loss 1.29 accuracy 0.44:   4%|▎         | 67/1875 [00:11<00:47, 38.05it/s]
loss 1.37 accuracy 0.41:   4%|▎         | 67/1875 [00:11<00:47, 38.05it/s]
loss 1.19 accuracy 0.59:   4%|▍         | 72/1875 [00:11<00:44, 40.18it/s]
loss 1.45 accuracy 0.47:   4%|▍         | 72/1875 [00:11<00:44, 40.18it/s]
loss 1.52 accuracy 0.53:   4%|▍         | 72/1875 [00:11<00:44, 40.18it/s]
loss 1.20 accuracy 0.56:   4%|▍         | 72/1875 [00:11<00:44, 40.18it/s]
loss 1.37 accuracy 0.50:   4%|▍         | 72/1875 [00:11<00:44, 40.18it/s]
loss 1.36 accuracy 0.59:   4%|▍         | 77/1875 [00:11<00:42, 41.83it/s]
loss 1.07 accuracy 0.69:   4%|▍         | 77/1875 [00:11<00:42, 41.83it/s]
loss 1.15 accuracy 0.62:   4%|▍         | 77/1875 [00:11<00:42, 41.83it/s]
loss 1.37 accuracy 0.53:   4%|▍         | 77/1875 [00:11<00:42, 41.83it/s]
loss 1.27 accuracy 0.66:   4%|▍         | 77/1875 [00:11<00:42, 41.83it/s]
loss 1.17 accuracy 0.66:   4%|▍         | 82/1875 [00:11<00:41, 43.02it/s]
loss 1.07 accuracy 0.62:   4%|▍         | 82/1875 [00:11<00:41, 43.02it/s]
loss 1.24 accuracy 0.69:   4%|▍         | 82/1875 [00:11<00:41, 43.02it/s]
loss 1.27 accuracy 0.47:   4%|▍         | 82/1875 [00:11<00:41, 43.02it/s]
loss 1.33 accuracy 0.56:   4%|▍         | 82/1875 [00:11<00:41, 43.02it/s]
loss 1.24 accuracy 0.59:   5%|▍         | 87/1875 [00:11<00:40, 43.93it/s]
loss 1.12 accuracy 0.62:   5%|▍         | 87/1875 [00:11<00:40, 43.93it/s]
loss 1.20 accuracy 0.59:   5%|▍         | 87/1875 [00:11<00:40, 43.93it/s]
loss 0.96 accuracy 0.75:   5%|▍         | 87/1875 [00:11<00:40, 43.93it/s]
loss 1.24 accuracy 0.53:   5%|▍         | 87/1875 [00:12<00:40, 43.93it/s]
loss 1.31 accuracy 0.56:   5%|▍         | 92/1875 [00:12<00:40, 44.55it/s]
loss 1.30 accuracy 0.41:   5%|▍         | 92/1875 [00:12<00:40, 44.55it/s]
loss 1.26 accuracy 0.56:   5%|▍         | 92/1875 [00:12<00:40, 44.55it/s]
loss 1.03 accuracy 0.69:   5%|▍         | 92/1875 [00:12<00:40, 44.55it/s]
loss 0.84 accuracy 0.88:   5%|▍         | 92/1875 [00:12<00:40, 44.55it/s]
loss 1.10 accuracy 0.66:   5%|▌         | 97/1875 [00:12<00:39, 45.01it/s]
loss 1.14 accuracy 0.56:   5%|▌         | 97/1875 [00:12<00:39, 45.01it/s]
loss 0.83 accuracy 0.88:   5%|▌         | 97/1875 [00:12<00:39, 45.01it/s]
loss 1.00 accuracy 0.69:   5%|▌         | 97/1875 [00:12<00:39, 45.01it/s]
loss 1.18 accuracy 0.47:   5%|▌         | 97/1875 [00:12<00:39, 45.01it/s]
loss 1.16 accuracy 0.50:   5%|▌         | 102/1875 [00:12<00:39, 45.34it/s]
loss 1.13 accuracy 0.66:   5%|▌         | 102/1875 [00:12<00:39, 45.34it/s]
loss 1.21 accuracy 0.66:   5%|▌         | 102/1875 [00:12<00:39, 45.34it/s]
loss 1.38 accuracy 0.44:   5%|▌         | 102/1875 [00:12<00:39, 45.34it/s]
loss 1.11 accuracy 0.66:   5%|▌         | 102/1875 [00:12<00:39, 45.34it/s]
loss 0.75 accuracy 0.81:   6%|▌         | 107/1875 [00:12<00:38, 45.57it/s]
loss 0.91 accuracy 0.75:   6%|▌         | 107/1875 [00:12<00:38, 45.57it/s]
loss 0.61 accuracy 0.88:   6%|▌         | 107/1875 [00:12<00:38, 45.57it/s]
loss 1.01 accuracy 0.69:   6%|▌         | 107/1875 [00:12<00:38, 45.57it/s]
loss 0.78 accuracy 0.81:   6%|▌         | 107/1875 [00:12<00:38, 45.57it/s]
loss 1.55 accuracy 0.47:   6%|▌         | 112/1875 [00:12<00:38, 45.75it/s]
loss 0.99 accuracy 0.78:   6%|▌         | 112/1875 [00:12<00:38, 45.75it/s]
loss 1.33 accuracy 0.56:   6%|▌         | 112/1875 [00:12<00:38, 45.75it/s]
loss 1.10 accuracy 0.66:   6%|▌         | 112/1875 [00:12<00:38, 45.75it/s]
loss 1.13 accuracy 0.69:   6%|▌         | 112/1875 [00:12<00:38, 45.75it/s]
loss 1.01 accuracy 0.78:   6%|▌         | 117/1875 [00:12<00:38, 45.83it/s]
loss 1.38 accuracy 0.56:   6%|▌         | 117/1875 [00:12<00:38, 45.83it/s]
loss 0.97 accuracy 0.59:   6%|▌         | 117/1875 [00:12<00:38, 45.83it/s]
loss 0.92 accuracy 0.75:   6%|▌         | 117/1875 [00:12<00:38, 45.83it/s]
loss 0.86 accuracy 0.75:   6%|▌         | 117/1875 [00:12<00:38, 45.83it/s]
loss 1.07 accuracy 0.69:   7%|▋         | 122/1875 [00:12<00:38, 45.89it/s]
loss 0.92 accuracy 0.81:   7%|▋         | 122/1875 [00:12<00:38, 45.89it/s]
loss 0.96 accuracy 0.62:   7%|▋         | 122/1875 [00:12<00:38, 45.89it/s]
loss 0.74 accuracy 0.81:   7%|▋         | 122/1875 [00:12<00:38, 45.89it/s]
loss 1.03 accuracy 0.50:   7%|▋         | 122/1875 [00:12<00:38, 45.89it/s]
loss 0.92 accuracy 0.62:   7%|▋         | 127/1875 [00:12<00:38, 45.86it/s]
loss 1.03 accuracy 0.69:   7%|▋         | 127/1875 [00:12<00:38, 45.86it/s]
loss 1.15 accuracy 0.62:   7%|▋         | 127/1875 [00:12<00:38, 45.86it/s]
loss 1.06 accuracy 0.69:   7%|▋         | 127/1875 [00:12<00:38, 45.86it/s]
loss 0.88 accuracy 0.59:   7%|▋         | 127/1875 [00:12<00:38, 45.86it/s]
loss 0.78 accuracy 0.75:   7%|▋         | 132/1875 [00:12<00:38, 45.76it/s]
loss 1.20 accuracy 0.56:   7%|▋         | 132/1875 [00:12<00:38, 45.76it/s]
loss 0.99 accuracy 0.69:   7%|▋         | 132/1875 [00:12<00:38, 45.76it/s]
loss 0.79 accuracy 0.81:   7%|▋         | 132/1875 [00:12<00:38, 45.76it/s]
loss 0.94 accuracy 0.62:   7%|▋         | 132/1875 [00:12<00:38, 45.76it/s]
loss 0.80 accuracy 0.78:   7%|▋         | 137/1875 [00:13<00:37, 45.79it/s]
loss 0.91 accuracy 0.66:   7%|▋         | 137/1875 [00:13<00:37, 45.79it/s]
loss 0.40 accuracy 0.94:   7%|▋         | 137/1875 [00:13<00:37, 45.79it/s]
loss 0.68 accuracy 0.78:   7%|▋         | 137/1875 [00:13<00:37, 45.79it/s]
loss 0.75 accuracy 0.75:   7%|▋         | 137/1875 [00:13<00:37, 45.79it/s]
loss 0.91 accuracy 0.72:   8%|▊         | 142/1875 [00:13<00:37, 45.80it/s]
loss 0.71 accuracy 0.75:   8%|▊         | 142/1875 [00:13<00:37, 45.80it/s]
loss 0.96 accuracy 0.66:   8%|▊         | 142/1875 [00:13<00:37, 45.80it/s]
loss 0.80 accuracy 0.75:   8%|▊         | 142/1875 [00:13<00:37, 45.80it/s]
loss 0.76 accuracy 0.84:   8%|▊         | 142/1875 [00:13<00:37, 45.80it/s]
loss 1.19 accuracy 0.75:   8%|▊         | 147/1875 [00:13<00:37, 45.70it/s]
loss 0.89 accuracy 0.75:   8%|▊         | 147/1875 [00:13<00:37, 45.70it/s]
loss 0.60 accuracy 0.88:   8%|▊         | 147/1875 [00:13<00:37, 45.70it/s]
loss 0.60 accuracy 0.84:   8%|▊         | 147/1875 [00:13<00:37, 45.70it/s]
loss 0.74 accuracy 0.84:   8%|▊         | 147/1875 [00:13<00:37, 45.70it/s]
loss 0.71 accuracy 0.78:   8%|▊         | 152/1875 [00:13<00:37, 45.75it/s]
loss 0.61 accuracy 0.94:   8%|▊         | 152/1875 [00:13<00:37, 45.75it/s]
loss 0.62 accuracy 0.84:   8%|▊         | 152/1875 [00:13<00:37, 45.75it/s]
loss 0.64 accuracy 0.84:   8%|▊         | 152/1875 [00:13<00:37, 45.75it/s]
loss 0.61 accuracy 0.84:   8%|▊         | 152/1875 [00:13<00:37, 45.75it/s]
loss 0.46 accuracy 0.91:   8%|▊         | 157/1875 [00:13<00:37, 45.72it/s]
loss 0.66 accuracy 0.81:   8%|▊         | 157/1875 [00:13<00:37, 45.72it/s]
loss 0.79 accuracy 0.78:   8%|▊         | 157/1875 [00:13<00:37, 45.72it/s]
loss 0.76 accuracy 0.72:   8%|▊         | 157/1875 [00:13<00:37, 45.72it/s]
loss 0.85 accuracy 0.69:   8%|▊         | 157/1875 [00:13<00:37, 45.72it/s]
loss 0.81 accuracy 0.75:   9%|▊         | 162/1875 [00:13<00:37, 45.74it/s]
loss 0.92 accuracy 0.69:   9%|▊         | 162/1875 [00:13<00:37, 45.74it/s]
loss 1.02 accuracy 0.66:   9%|▊         | 162/1875 [00:13<00:37, 45.74it/s]
loss 0.46 accuracy 0.97:   9%|▊         | 162/1875 [00:13<00:37, 45.74it/s]
loss 0.71 accuracy 0.78:   9%|▊         | 162/1875 [00:13<00:37, 45.74it/s]
loss 0.57 accuracy 0.78:   9%|▉         | 167/1875 [00:13<00:37, 45.82it/s]
loss 0.58 accuracy 0.78:   9%|▉         | 167/1875 [00:13<00:37, 45.82it/s]
loss 0.51 accuracy 0.91:   9%|▉         | 167/1875 [00:13<00:37, 45.82it/s]
loss 0.62 accuracy 0.84:   9%|▉         | 167/1875 [00:13<00:37, 45.82it/s]
loss 0.69 accuracy 0.84:   9%|▉         | 167/1875 [00:13<00:37, 45.82it/s]
loss 0.89 accuracy 0.72:   9%|▉         | 172/1875 [00:13<00:37, 45.90it/s]
loss 0.53 accuracy 0.94:   9%|▉         | 172/1875 [00:13<00:37, 45.90it/s]
loss 0.51 accuracy 0.84:   9%|▉         | 172/1875 [00:13<00:37, 45.90it/s]
loss 0.40 accuracy 0.94:   9%|▉         | 172/1875 [00:13<00:37, 45.90it/s]
loss 0.57 accuracy 0.78:   9%|▉         | 172/1875 [00:13<00:37, 45.90it/s]
loss 0.54 accuracy 0.91:   9%|▉         | 177/1875 [00:13<00:36, 45.94it/s]
loss 0.54 accuracy 0.84:   9%|▉         | 177/1875 [00:13<00:36, 45.94it/s]
loss 0.84 accuracy 0.72:   9%|▉         | 177/1875 [00:13<00:36, 45.94it/s]
loss 0.29 accuracy 0.94:   9%|▉         | 177/1875 [00:13<00:36, 45.94it/s]
loss 0.38 accuracy 0.84:   9%|▉         | 177/1875 [00:13<00:36, 45.94it/s]
loss 0.62 accuracy 0.75:  10%|▉         | 182/1875 [00:13<00:36, 45.99it/s]
loss 0.78 accuracy 0.81:  10%|▉         | 182/1875 [00:14<00:36, 45.99it/s]
loss 0.55 accuracy 0.81:  10%|▉         | 182/1875 [00:14<00:36, 45.99it/s]
loss 0.64 accuracy 0.88:  10%|▉         | 182/1875 [00:14<00:36, 45.99it/s]
loss 1.16 accuracy 0.66:  10%|▉         | 182/1875 [00:14<00:36, 45.99it/s]
loss 0.69 accuracy 0.78:  10%|▉         | 187/1875 [00:14<00:36, 46.00it/s]
loss 0.89 accuracy 0.75:  10%|▉         | 187/1875 [00:14<00:36, 46.00it/s]
loss 0.45 accuracy 0.88:  10%|▉         | 187/1875 [00:14<00:36, 46.00it/s]
loss 0.66 accuracy 0.81:  10%|▉         | 187/1875 [00:14<00:36, 46.00it/s]
loss 0.53 accuracy 0.84:  10%|▉         | 187/1875 [00:14<00:36, 46.00it/s]
loss 0.53 accuracy 0.88:  10%|█         | 192/1875 [00:14<00:36, 46.05it/s]
loss 0.62 accuracy 0.84:  10%|█         | 192/1875 [00:14<00:36, 46.05it/s]
loss 0.54 accuracy 0.78:  10%|█         | 192/1875 [00:14<00:36, 46.05it/s]
loss 0.60 accuracy 0.84:  10%|█         | 192/1875 [00:14<00:36, 46.05it/s]
loss 0.77 accuracy 0.72:  10%|█         | 192/1875 [00:14<00:36, 46.05it/s]
loss 0.65 accuracy 0.84:  11%|█         | 197/1875 [00:14<00:36, 46.06it/s]
loss 0.47 accuracy 0.81:  11%|█         | 197/1875 [00:14<00:36, 46.06it/s]
loss 0.67 accuracy 0.78:  11%|█         | 197/1875 [00:14<00:36, 46.06it/s]
loss 0.80 accuracy 0.69:  11%|█         | 197/1875 [00:14<00:36, 46.06it/s]
loss 0.61 accuracy 0.81:  11%|█         | 197/1875 [00:14<00:36, 46.06it/s]
loss 0.50 accuracy 0.88:  11%|█         | 202/1875 [00:14<00:36, 46.09it/s]
loss 0.54 accuracy 0.78:  11%|█         | 202/1875 [00:14<00:36, 46.09it/s]
loss 0.33 accuracy 0.94:  11%|█         | 202/1875 [00:14<00:36, 46.09it/s]
loss 0.45 accuracy 0.91:  11%|█         | 202/1875 [00:14<00:36, 46.09it/s]
loss 0.36 accuracy 0.94:  11%|█         | 202/1875 [00:14<00:36, 46.09it/s]
loss 0.54 accuracy 0.81:  11%|█         | 207/1875 [00:14<00:36, 46.12it/s]
loss 0.44 accuracy 0.84:  11%|█         | 207/1875 [00:14<00:36, 46.12it/s]
loss 0.61 accuracy 0.81:  11%|█         | 207/1875 [00:14<00:36, 46.12it/s]
loss 0.41 accuracy 0.94:  11%|█         | 207/1875 [00:14<00:36, 46.12it/s]
loss 0.40 accuracy 0.88:  11%|█         | 207/1875 [00:14<00:36, 46.12it/s]
loss 0.54 accuracy 0.88:  11%|█▏        | 212/1875 [00:14<00:36, 46.13it/s]
loss 0.56 accuracy 0.81:  11%|█▏        | 212/1875 [00:14<00:36, 46.13it/s]
loss 0.42 accuracy 0.88:  11%|█▏        | 212/1875 [00:14<00:36, 46.13it/s]
loss 0.35 accuracy 0.88:  11%|█▏        | 212/1875 [00:14<00:36, 46.13it/s]
loss 0.59 accuracy 0.88:  11%|█▏        | 212/1875 [00:14<00:36, 46.13it/s]
loss 0.60 accuracy 0.84:  12%|█▏        | 217/1875 [00:14<00:35, 46.14it/s]
loss 0.32 accuracy 0.91:  12%|█▏        | 217/1875 [00:14<00:35, 46.14it/s]
loss 0.42 accuracy 0.88:  12%|█▏        | 217/1875 [00:14<00:35, 46.14it/s]
loss 0.60 accuracy 0.81:  12%|█▏        | 217/1875 [00:14<00:35, 46.14it/s]
loss 0.42 accuracy 0.88:  12%|█▏        | 217/1875 [00:14<00:35, 46.14it/s]
loss 0.41 accuracy 0.91:  12%|█▏        | 222/1875 [00:14<00:35, 46.14it/s]
loss 0.35 accuracy 0.84:  12%|█▏        | 222/1875 [00:14<00:35, 46.14it/s]
loss 0.40 accuracy 0.88:  12%|█▏        | 222/1875 [00:14<00:35, 46.14it/s]
loss 0.64 accuracy 0.84:  12%|█▏        | 222/1875 [00:14<00:35, 46.14it/s]
loss 0.52 accuracy 0.91:  12%|█▏        | 222/1875 [00:14<00:35, 46.14it/s]
loss 0.27 accuracy 0.94:  12%|█▏        | 227/1875 [00:14<00:35, 46.15it/s]
loss 0.39 accuracy 0.94:  12%|█▏        | 227/1875 [00:14<00:35, 46.15it/s]
loss 0.30 accuracy 0.88:  12%|█▏        | 227/1875 [00:15<00:35, 46.15it/s]
loss 0.56 accuracy 0.78:  12%|█▏        | 227/1875 [00:15<00:35, 46.15it/s]
loss 0.32 accuracy 0.91:  12%|█▏        | 227/1875 [00:15<00:35, 46.15it/s]
loss 0.32 accuracy 0.88:  12%|█▏        | 232/1875 [00:15<00:35, 46.11it/s]
loss 0.39 accuracy 0.91:  12%|█▏        | 232/1875 [00:15<00:35, 46.11it/s]
loss 0.54 accuracy 0.91:  12%|█▏        | 232/1875 [00:15<00:35, 46.11it/s]
loss 0.33 accuracy 0.88:  12%|█▏        | 232/1875 [00:15<00:35, 46.11it/s]
loss 0.60 accuracy 0.84:  12%|█▏        | 232/1875 [00:15<00:35, 46.11it/s]
loss 0.36 accuracy 0.94:  13%|█▎        | 237/1875 [00:15<00:35, 46.11it/s]
loss 0.79 accuracy 0.81:  13%|█▎        | 237/1875 [00:15<00:35, 46.11it/s]
loss 0.51 accuracy 0.84:  13%|█▎        | 237/1875 [00:15<00:35, 46.11it/s]
loss 0.29 accuracy 0.97:  13%|█▎        | 237/1875 [00:15<00:35, 46.11it/s]
loss 0.33 accuracy 0.91:  13%|█▎        | 237/1875 [00:15<00:35, 46.11it/s]
loss 0.43 accuracy 0.84:  13%|█▎        | 242/1875 [00:15<00:35, 46.03it/s]
loss 0.37 accuracy 0.88:  13%|█▎        | 242/1875 [00:15<00:35, 46.03it/s]
loss 0.47 accuracy 0.84:  13%|█▎        | 242/1875 [00:15<00:35, 46.03it/s]
loss 0.60 accuracy 0.78:  13%|█▎        | 242/1875 [00:15<00:35, 46.03it/s]
loss 0.57 accuracy 0.81:  13%|█▎        | 242/1875 [00:15<00:35, 46.03it/s]
loss 0.36 accuracy 0.94:  13%|█▎        | 247/1875 [00:15<00:35, 45.94it/s]
loss 0.50 accuracy 0.81:  13%|█▎        | 247/1875 [00:15<00:35, 45.94it/s]
loss 0.35 accuracy 0.84:  13%|█▎        | 247/1875 [00:15<00:35, 45.94it/s]
loss 0.40 accuracy 0.91:  13%|█▎        | 247/1875 [00:15<00:35, 45.94it/s]
loss 0.46 accuracy 0.88:  13%|█▎        | 247/1875 [00:15<00:35, 45.94it/s]
loss 0.31 accuracy 0.91:  13%|█▎        | 252/1875 [00:15<00:35, 45.85it/s]
loss 0.43 accuracy 0.84:  13%|█▎        | 252/1875 [00:15<00:35, 45.85it/s]
loss 0.53 accuracy 0.81:  13%|█▎        | 252/1875 [00:15<00:35, 45.85it/s]
loss 0.28 accuracy 0.94:  13%|█▎        | 252/1875 [00:15<00:35, 45.85it/s]
loss 0.40 accuracy 0.91:  13%|█▎        | 252/1875 [00:15<00:35, 45.85it/s]
loss 0.26 accuracy 0.91:  14%|█▎        | 257/1875 [00:15<00:35, 45.90it/s]
loss 0.45 accuracy 0.91:  14%|█▎        | 257/1875 [00:15<00:35, 45.90it/s]
loss 0.28 accuracy 0.94:  14%|█▎        | 257/1875 [00:15<00:35, 45.90it/s]
loss 0.52 accuracy 0.84:  14%|█▎        | 257/1875 [00:15<00:35, 45.90it/s]
loss 0.24 accuracy 0.88:  14%|█▎        | 257/1875 [00:15<00:35, 45.90it/s]
loss 0.40 accuracy 0.91:  14%|█▍        | 262/1875 [00:15<00:35, 45.81it/s]
loss 0.35 accuracy 0.88:  14%|█▍        | 262/1875 [00:15<00:35, 45.81it/s]
loss 0.38 accuracy 0.94:  14%|█▍        | 262/1875 [00:15<00:35, 45.81it/s]
loss 0.30 accuracy 0.91:  14%|█▍        | 262/1875 [00:15<00:35, 45.81it/s]
loss 0.29 accuracy 0.91:  14%|█▍        | 262/1875 [00:15<00:35, 45.81it/s]
loss 0.41 accuracy 0.84:  14%|█▍        | 267/1875 [00:15<00:35, 45.74it/s]
loss 0.40 accuracy 0.84:  14%|█▍        | 267/1875 [00:15<00:35, 45.74it/s]
loss 0.45 accuracy 0.81:  14%|█▍        | 267/1875 [00:15<00:35, 45.74it/s]
loss 0.19 accuracy 1.00:  14%|█▍        | 267/1875 [00:15<00:35, 45.74it/s]
loss 0.27 accuracy 1.00:  14%|█▍        | 267/1875 [00:15<00:35, 45.74it/s]
loss 0.37 accuracy 0.88:  15%|█▍        | 272/1875 [00:15<00:35, 45.71it/s]
loss 0.46 accuracy 0.78:  15%|█▍        | 272/1875 [00:15<00:35, 45.71it/s]
loss 0.43 accuracy 0.88:  15%|█▍        | 272/1875 [00:15<00:35, 45.71it/s]
loss 0.47 accuracy 0.84:  15%|█▍        | 272/1875 [00:16<00:35, 45.71it/s]
loss 0.51 accuracy 0.88:  15%|█▍        | 272/1875 [00:16<00:35, 45.71it/s]
loss 0.20 accuracy 0.91:  15%|█▍        | 277/1875 [00:16<00:34, 45.69it/s]
loss 0.46 accuracy 0.88:  15%|█▍        | 277/1875 [00:16<00:34, 45.69it/s]
loss 0.31 accuracy 0.91:  15%|█▍        | 277/1875 [00:16<00:34, 45.69it/s]
loss 0.23 accuracy 0.94:  15%|█▍        | 277/1875 [00:16<00:34, 45.69it/s]
loss 0.25 accuracy 0.91:  15%|█▍        | 277/1875 [00:16<00:34, 45.69it/s]
loss 0.39 accuracy 0.88:  15%|█▌        | 282/1875 [00:16<00:34, 45.81it/s]
loss 0.21 accuracy 0.94:  15%|█▌        | 282/1875 [00:16<00:34, 45.81it/s]
loss 0.29 accuracy 0.91:  15%|█▌        | 282/1875 [00:16<00:34, 45.81it/s]
loss 0.51 accuracy 0.88:  15%|█▌        | 282/1875 [00:16<00:34, 45.81it/s]
loss 0.46 accuracy 0.78:  15%|█▌        | 282/1875 [00:16<00:34, 45.81it/s]
loss 0.27 accuracy 0.88:  15%|█▌        | 287/1875 [00:16<00:34, 45.87it/s]
loss 0.11 accuracy 1.00:  15%|█▌        | 287/1875 [00:16<00:34, 45.87it/s]
loss 0.13 accuracy 1.00:  15%|█▌        | 287/1875 [00:16<00:34, 45.87it/s]
loss 0.30 accuracy 0.91:  15%|█▌        | 287/1875 [00:16<00:34, 45.87it/s]
loss 0.23 accuracy 0.97:  15%|█▌        | 287/1875 [00:16<00:34, 45.87it/s]
loss 0.48 accuracy 0.91:  16%|█▌        | 292/1875 [00:16<00:34, 45.94it/s]
loss 0.50 accuracy 0.88:  16%|█▌        | 292/1875 [00:16<00:34, 45.94it/s]
loss 0.37 accuracy 0.88:  16%|█▌        | 292/1875 [00:16<00:34, 45.94it/s]
loss 0.37 accuracy 0.84:  16%|█▌        | 292/1875 [00:16<00:34, 45.94it/s]
loss 0.27 accuracy 0.94:  16%|█▌        | 292/1875 [00:16<00:34, 45.94it/s]
loss 0.25 accuracy 0.91:  16%|█▌        | 297/1875 [00:16<00:34, 46.00it/s]
loss 0.10 accuracy 1.00:  16%|█▌        | 297/1875 [00:16<00:34, 46.00it/s]
loss 0.38 accuracy 0.88:  16%|█▌        | 297/1875 [00:16<00:34, 46.00it/s]
loss 0.47 accuracy 0.88:  16%|█▌        | 297/1875 [00:16<00:34, 46.00it/s]
loss 0.17 accuracy 0.94:  16%|█▌        | 297/1875 [00:16<00:34, 46.00it/s]
loss 0.22 accuracy 0.97:  16%|█▌        | 302/1875 [00:16<00:34, 46.07it/s]
loss 0.20 accuracy 0.94:  16%|█▌        | 302/1875 [00:16<00:34, 46.07it/s]
loss 0.22 accuracy 0.97:  16%|█▌        | 302/1875 [00:16<00:34, 46.07it/s]
loss 0.15 accuracy 0.97:  16%|█▌        | 302/1875 [00:16<00:34, 46.07it/s]
loss 0.34 accuracy 0.88:  16%|█▌        | 302/1875 [00:16<00:34, 46.07it/s]
loss 0.22 accuracy 0.91:  16%|█▋        | 307/1875 [00:16<00:34, 46.11it/s]
loss 0.21 accuracy 0.94:  16%|█▋        | 307/1875 [00:16<00:34, 46.11it/s]
loss 0.28 accuracy 0.94:  16%|█▋        | 307/1875 [00:16<00:34, 46.11it/s]
loss 0.58 accuracy 0.78:  16%|█▋        | 307/1875 [00:16<00:34, 46.11it/s]
loss 0.40 accuracy 0.88:  16%|█▋        | 307/1875 [00:16<00:34, 46.11it/s]
loss 0.32 accuracy 0.91:  17%|█▋        | 312/1875 [00:16<00:33, 46.11it/s]
loss 0.43 accuracy 0.88:  17%|█▋        | 312/1875 [00:16<00:33, 46.11it/s]
loss 0.29 accuracy 0.94:  17%|█▋        | 312/1875 [00:16<00:33, 46.11it/s]
loss 0.42 accuracy 0.91:  17%|█▋        | 312/1875 [00:16<00:33, 46.11it/s]
loss 0.51 accuracy 0.81:  17%|█▋        | 312/1875 [00:16<00:33, 46.11it/s]
loss 0.33 accuracy 0.91:  17%|█▋        | 317/1875 [00:16<00:33, 46.10it/s]
loss 0.30 accuracy 0.94:  17%|█▋        | 317/1875 [00:16<00:33, 46.10it/s]
loss 0.24 accuracy 0.94:  17%|█▋        | 317/1875 [00:16<00:33, 46.10it/s]
loss 0.23 accuracy 0.97:  17%|█▋        | 317/1875 [00:16<00:33, 46.10it/s]
loss 0.56 accuracy 0.88:  17%|█▋        | 317/1875 [00:17<00:33, 46.10it/s]
loss 0.23 accuracy 0.97:  17%|█▋        | 317/1875 [00:17<00:33, 46.10it/s]
loss 0.23 accuracy 0.97:  17%|█▋        | 322/1875 [00:17<00:33, 46.13it/s]
loss 0.23 accuracy 0.94:  17%|█▋        | 322/1875 [00:17<00:33, 46.13it/s]
loss 0.22 accuracy 0.94:  17%|█▋        | 322/1875 [00:17<00:33, 46.13it/s]
loss 0.35 accuracy 0.91:  17%|█▋        | 322/1875 [00:17<00:33, 46.13it/s]
loss 0.31 accuracy 0.91:  17%|█▋        | 322/1875 [00:17<00:33, 46.13it/s]
loss 0.22 accuracy 0.94:  17%|█▋        | 322/1875 [00:17<00:33, 46.13it/s]
loss 0.22 accuracy 0.94:  17%|█▋        | 327/1875 [00:17<00:33, 46.14it/s]
loss 0.40 accuracy 0.88:  17%|█▋        | 327/1875 [00:17<00:33, 46.14it/s]
loss 0.30 accuracy 0.88:  17%|█▋        | 327/1875 [00:17<00:33, 46.14it/s]
loss 0.34 accuracy 0.94:  17%|█▋        | 327/1875 [00:17<00:33, 46.14it/s]
loss 0.81 accuracy 0.81:  17%|█▋        | 327/1875 [00:17<00:33, 46.14it/s]
loss 0.30 accuracy 0.91:  17%|█▋        | 327/1875 [00:17<00:33, 46.14it/s]
loss 0.30 accuracy 0.91:  18%|█▊        | 332/1875 [00:17<00:33, 46.14it/s]
loss 0.37 accuracy 0.91:  18%|█▊        | 332/1875 [00:17<00:33, 46.14it/s]
loss 0.10 accuracy 0.97:  18%|█▊        | 332/1875 [00:17<00:33, 46.14it/s]
loss 0.32 accuracy 0.94:  18%|█▊        | 332/1875 [00:17<00:33, 46.14it/s]
loss 0.26 accuracy 0.88:  18%|█▊        | 332/1875 [00:17<00:33, 46.14it/s]
loss 1.01 accuracy 0.78:  18%|█▊        | 332/1875 [00:17<00:33, 46.14it/s]
loss 1.01 accuracy 0.78:  18%|█▊        | 337/1875 [00:17<00:33, 46.14it/s]
loss 0.45 accuracy 0.91:  18%|█▊        | 337/1875 [00:17<00:33, 46.14it/s]
loss 0.38 accuracy 0.88:  18%|█▊        | 337/1875 [00:17<00:33, 46.14it/s]
loss 0.31 accuracy 0.91:  18%|█▊        | 337/1875 [00:17<00:33, 46.14it/s]
loss 0.27 accuracy 0.91:  18%|█▊        | 337/1875 [00:17<00:33, 46.14it/s]
loss 0.57 accuracy 0.72:  18%|█▊        | 337/1875 [00:17<00:33, 46.14it/s]
loss 0.57 accuracy 0.72:  18%|█▊        | 342/1875 [00:17<00:33, 46.14it/s]
loss 0.51 accuracy 0.84:  18%|█▊        | 342/1875 [00:17<00:33, 46.14it/s]
loss 0.37 accuracy 0.84:  18%|█▊        | 342/1875 [00:17<00:33, 46.14it/s]
loss 0.51 accuracy 0.88:  18%|█▊        | 342/1875 [00:17<00:33, 46.14it/s]
loss 0.50 accuracy 0.81:  18%|█▊        | 342/1875 [00:17<00:33, 46.14it/s]
loss 0.30 accuracy 0.94:  18%|█▊        | 342/1875 [00:17<00:33, 46.14it/s]
loss 0.30 accuracy 0.94:  19%|█▊        | 347/1875 [00:17<00:33, 46.11it/s]
loss 0.42 accuracy 0.84:  19%|█▊        | 347/1875 [00:17<00:33, 46.11it/s]
loss 0.62 accuracy 0.81:  19%|█▊        | 347/1875 [00:17<00:33, 46.11it/s]
loss 0.21 accuracy 0.97:  19%|█▊        | 347/1875 [00:17<00:33, 46.11it/s]
loss 0.17 accuracy 0.94:  19%|█▊        | 347/1875 [00:17<00:33, 46.11it/s]
loss 0.20 accuracy 0.94:  19%|█▊        | 347/1875 [00:17<00:33, 46.11it/s]
loss 0.20 accuracy 0.94:  19%|█▉        | 352/1875 [00:17<00:33, 46.08it/s]
loss 0.34 accuracy 0.88:  19%|█▉        | 352/1875 [00:17<00:33, 46.08it/s]
loss 0.40 accuracy 0.91:  19%|█▉        | 352/1875 [00:17<00:33, 46.08it/s]
loss 0.51 accuracy 0.88:  19%|█▉        | 352/1875 [00:17<00:33, 46.08it/s]
loss 0.16 accuracy 0.97:  19%|█▉        | 352/1875 [00:17<00:33, 46.08it/s]
loss 0.45 accuracy 0.84:  19%|█▉        | 352/1875 [00:17<00:33, 46.08it/s]
loss 0.45 accuracy 0.84:  19%|█▉        | 357/1875 [00:17<00:32, 46.08it/s]
loss 0.44 accuracy 0.81:  19%|█▉        | 357/1875 [00:17<00:32, 46.08it/s]
loss 0.21 accuracy 0.94:  19%|█▉        | 357/1875 [00:17<00:32, 46.08it/s]
loss 0.23 accuracy 0.94:  19%|█▉        | 357/1875 [00:17<00:32, 46.08it/s]
loss 0.34 accuracy 0.88:  19%|█▉        | 357/1875 [00:17<00:32, 46.08it/s]
loss 0.10 accuracy 0.94:  19%|█▉        | 357/1875 [00:17<00:32, 46.08it/s]
loss 0.10 accuracy 0.94:  19%|█▉        | 362/1875 [00:17<00:32, 46.04it/s]
loss 0.66 accuracy 0.78:  19%|█▉        | 362/1875 [00:17<00:32, 46.04it/s]
loss 0.25 accuracy 0.91:  19%|█▉        | 362/1875 [00:17<00:32, 46.04it/s]
loss 0.30 accuracy 0.91:  19%|█▉        | 362/1875 [00:17<00:32, 46.04it/s]
loss 0.24 accuracy 0.97:  19%|█▉        | 362/1875 [00:17<00:32, 46.04it/s]
loss 0.11 accuracy 0.97:  19%|█▉        | 362/1875 [00:18<00:32, 46.04it/s]
loss 0.11 accuracy 0.97:  20%|█▉        | 367/1875 [00:18<00:32, 46.02it/s]
loss 0.17 accuracy 0.97:  20%|█▉        | 367/1875 [00:18<00:32, 46.02it/s]
loss 0.39 accuracy 0.91:  20%|█▉        | 367/1875 [00:18<00:32, 46.02it/s]
loss 0.28 accuracy 0.91:  20%|█▉        | 367/1875 [00:18<00:32, 46.02it/s]
loss 0.17 accuracy 0.97:  20%|█▉        | 367/1875 [00:18<00:32, 46.02it/s]
loss 0.24 accuracy 0.94:  20%|█▉        | 367/1875 [00:18<00:32, 46.02it/s]
loss 0.24 accuracy 0.94:  20%|█▉        | 372/1875 [00:18<00:32, 45.89it/s]
loss 0.34 accuracy 0.88:  20%|█▉        | 372/1875 [00:18<00:32, 45.89it/s]
loss 0.29 accuracy 0.94:  20%|█▉        | 372/1875 [00:18<00:32, 45.89it/s]
loss 0.21 accuracy 0.91:  20%|█▉        | 372/1875 [00:18<00:32, 45.89it/s]
loss 0.28 accuracy 0.91:  20%|█▉        | 372/1875 [00:18<00:32, 45.89it/s]
loss 0.23 accuracy 0.88:  20%|█▉        | 372/1875 [00:18<00:32, 45.89it/s]
loss 0.23 accuracy 0.88:  20%|██        | 377/1875 [00:18<00:32, 45.87it/s]
loss 0.19 accuracy 0.94:  20%|██        | 377/1875 [00:18<00:32, 45.87it/s]
loss 0.33 accuracy 0.91:  20%|██        | 377/1875 [00:18<00:32, 45.87it/s]
loss 0.19 accuracy 0.94:  20%|██        | 377/1875 [00:18<00:32, 45.87it/s]
loss 0.11 accuracy 1.00:  20%|██        | 377/1875 [00:18<00:32, 45.87it/s]
loss 0.31 accuracy 0.91:  20%|██        | 377/1875 [00:18<00:32, 45.87it/s]
loss 0.31 accuracy 0.91:  20%|██        | 382/1875 [00:18<00:32, 45.87it/s]
loss 0.16 accuracy 0.97:  20%|██        | 382/1875 [00:18<00:32, 45.87it/s]
loss 0.20 accuracy 0.91:  20%|██        | 382/1875 [00:18<00:32, 45.87it/s]
loss 0.39 accuracy 0.84:  20%|██        | 382/1875 [00:18<00:32, 45.87it/s]
loss 0.16 accuracy 0.97:  20%|██        | 382/1875 [00:18<00:32, 45.87it/s]
loss 0.58 accuracy 0.84:  20%|██        | 382/1875 [00:18<00:32, 45.87it/s]
loss 0.58 accuracy 0.84:  21%|██        | 387/1875 [00:18<00:32, 45.89it/s]
loss 0.35 accuracy 0.91:  21%|██        | 387/1875 [00:18<00:32, 45.89it/s]
loss 0.44 accuracy 0.78:  21%|██        | 387/1875 [00:18<00:32, 45.89it/s]
loss 0.17 accuracy 0.94:  21%|██        | 387/1875 [00:18<00:32, 45.89it/s]
loss 0.45 accuracy 0.91:  21%|██        | 387/1875 [00:18<00:32, 45.89it/s]
loss 0.41 accuracy 0.94:  21%|██        | 387/1875 [00:18<00:32, 45.89it/s]
loss 0.41 accuracy 0.94:  21%|██        | 392/1875 [00:18<00:32, 45.78it/s]
loss 0.67 accuracy 0.91:  21%|██        | 392/1875 [00:18<00:32, 45.78it/s]
loss 0.35 accuracy 0.88:  21%|██        | 392/1875 [00:18<00:32, 45.78it/s]
loss 0.54 accuracy 0.84:  21%|██        | 392/1875 [00:18<00:32, 45.78it/s]
loss 0.27 accuracy 0.91:  21%|██        | 392/1875 [00:18<00:32, 45.78it/s]
loss 0.34 accuracy 0.94:  21%|██        | 392/1875 [00:18<00:32, 45.78it/s]
loss 0.34 accuracy 0.94:  21%|██        | 397/1875 [00:18<00:32, 45.81it/s]
loss 0.51 accuracy 0.81:  21%|██        | 397/1875 [00:18<00:32, 45.81it/s]
loss 0.78 accuracy 0.72:  21%|██        | 397/1875 [00:18<00:32, 45.81it/s]
loss 0.51 accuracy 0.75:  21%|██        | 397/1875 [00:18<00:32, 45.81it/s]
loss 0.42 accuracy 0.88:  21%|██        | 397/1875 [00:18<00:32, 45.81it/s]
loss 0.19 accuracy 0.97:  21%|██        | 397/1875 [00:18<00:32, 45.81it/s]
loss 0.19 accuracy 0.97:  21%|██▏       | 402/1875 [00:18<00:32, 45.81it/s]
loss 0.72 accuracy 0.81:  21%|██▏       | 402/1875 [00:18<00:32, 45.81it/s]
loss 0.33 accuracy 0.91:  21%|██▏       | 402/1875 [00:18<00:32, 45.81it/s]
loss 0.54 accuracy 0.81:  21%|██▏       | 402/1875 [00:18<00:32, 45.81it/s]
loss 0.61 accuracy 0.72:  21%|██▏       | 402/1875 [00:18<00:32, 45.81it/s]
loss 0.81 accuracy 0.72:  21%|██▏       | 402/1875 [00:18<00:32, 45.81it/s]
loss 0.81 accuracy 0.72:  22%|██▏       | 407/1875 [00:18<00:32, 45.80it/s]
loss 0.69 accuracy 0.72:  22%|██▏       | 407/1875 [00:18<00:32, 45.80it/s]
loss 0.21 accuracy 0.97:  22%|██▏       | 407/1875 [00:18<00:32, 45.80it/s]
loss 0.22 accuracy 0.94:  22%|██▏       | 407/1875 [00:18<00:32, 45.80it/s]
loss 0.33 accuracy 0.88:  22%|██▏       | 407/1875 [00:18<00:32, 45.80it/s]
loss 0.41 accuracy 0.88:  22%|██▏       | 407/1875 [00:18<00:32, 45.80it/s]
loss 0.41 accuracy 0.88:  22%|██▏       | 412/1875 [00:18<00:31, 45.85it/s]
loss 0.63 accuracy 0.81:  22%|██▏       | 412/1875 [00:19<00:31, 45.85it/s]
loss 0.19 accuracy 0.94:  22%|██▏       | 412/1875 [00:19<00:31, 45.85it/s]
loss 0.22 accuracy 0.91:  22%|██▏       | 412/1875 [00:19<00:31, 45.85it/s]
loss 0.38 accuracy 0.88:  22%|██▏       | 412/1875 [00:19<00:31, 45.85it/s]
loss 0.26 accuracy 0.94:  22%|██▏       | 412/1875 [00:19<00:31, 45.85it/s]
loss 0.26 accuracy 0.94:  22%|██▏       | 417/1875 [00:19<00:31, 45.89it/s]
loss 0.34 accuracy 0.94:  22%|██▏       | 417/1875 [00:19<00:31, 45.89it/s]
loss 0.37 accuracy 0.91:  22%|██▏       | 417/1875 [00:19<00:31, 45.89it/s]
loss 0.20 accuracy 0.94:  22%|██▏       | 417/1875 [00:19<00:31, 45.89it/s]
loss 0.19 accuracy 0.94:  22%|██▏       | 417/1875 [00:19<00:31, 45.89it/s]
loss 0.29 accuracy 0.91:  22%|██▏       | 417/1875 [00:19<00:31, 45.89it/s]
loss 0.29 accuracy 0.91:  23%|██▎       | 422/1875 [00:19<00:31, 45.98it/s]
loss 0.53 accuracy 0.88:  23%|██▎       | 422/1875 [00:19<00:31, 45.98it/s]
loss 0.38 accuracy 0.88:  23%|██▎       | 422/1875 [00:19<00:31, 45.98it/s]
loss 0.38 accuracy 0.91:  23%|██▎       | 422/1875 [00:19<00:31, 45.98it/s]
loss 0.45 accuracy 0.91:  23%|██▎       | 422/1875 [00:19<00:31, 45.98it/s]
loss 0.42 accuracy 0.84:  23%|██▎       | 422/1875 [00:19<00:31, 45.98it/s]
loss 0.42 accuracy 0.84:  23%|██▎       | 427/1875 [00:19<00:31, 46.01it/s]
loss 0.18 accuracy 0.97:  23%|██▎       | 427/1875 [00:19<00:31, 46.01it/s]
loss 0.34 accuracy 0.88:  23%|██▎       | 427/1875 [00:19<00:31, 46.01it/s]
loss 0.38 accuracy 0.88:  23%|██▎       | 427/1875 [00:19<00:31, 46.01it/s]
loss 0.50 accuracy 0.81:  23%|██▎       | 427/1875 [00:19<00:31, 46.01it/s]
loss 0.11 accuracy 0.97:  23%|██▎       | 427/1875 [00:19<00:31, 46.01it/s]
loss 0.11 accuracy 0.97:  23%|██▎       | 432/1875 [00:19<00:31, 46.04it/s]
loss 0.22 accuracy 0.91:  23%|██▎       | 432/1875 [00:19<00:31, 46.04it/s]
loss 0.25 accuracy 0.97:  23%|██▎       | 432/1875 [00:19<00:31, 46.04it/s]
loss 0.20 accuracy 0.97:  23%|██▎       | 432/1875 [00:19<00:31, 46.04it/s]
loss 0.27 accuracy 0.91:  23%|██▎       | 432/1875 [00:19<00:31, 46.04it/s]
loss 0.30 accuracy 0.91:  23%|██▎       | 432/1875 [00:19<00:31, 46.04it/s]
loss 0.30 accuracy 0.91:  23%|██▎       | 437/1875 [00:19<00:31, 46.08it/s]
loss 0.40 accuracy 0.88:  23%|██▎       | 437/1875 [00:19<00:31, 46.08it/s]
loss 0.64 accuracy 0.81:  23%|██▎       | 437/1875 [00:19<00:31, 46.08it/s]
loss 0.29 accuracy 0.91:  23%|██▎       | 437/1875 [00:19<00:31, 46.08it/s]
loss 0.46 accuracy 0.91:  23%|██▎       | 437/1875 [00:19<00:31, 46.08it/s]
loss 0.35 accuracy 0.91:  23%|██▎       | 437/1875 [00:19<00:31, 46.08it/s]
loss 0.35 accuracy 0.91:  24%|██▎       | 442/1875 [00:19<00:31, 46.08it/s]
loss 0.24 accuracy 0.94:  24%|██▎       | 442/1875 [00:19<00:31, 46.08it/s]
loss 0.30 accuracy 0.88:  24%|██▎       | 442/1875 [00:19<00:31, 46.08it/s]
loss 0.32 accuracy 0.88:  24%|██▎       | 442/1875 [00:19<00:31, 46.08it/s]
loss 0.20 accuracy 0.94:  24%|██▎       | 442/1875 [00:19<00:31, 46.08it/s]
loss 0.23 accuracy 0.97:  24%|██▎       | 442/1875 [00:19<00:31, 46.08it/s]
loss 0.23 accuracy 0.97:  24%|██▍       | 447/1875 [00:19<00:31, 46.05it/s]
loss 0.17 accuracy 0.94:  24%|██▍       | 447/1875 [00:19<00:31, 46.05it/s]
loss 0.14 accuracy 1.00:  24%|██▍       | 447/1875 [00:19<00:31, 46.05it/s]
loss 0.37 accuracy 0.91:  24%|██▍       | 447/1875 [00:19<00:31, 46.05it/s]
loss 0.43 accuracy 0.84:  24%|██▍       | 447/1875 [00:19<00:31, 46.05it/s]
loss 0.20 accuracy 0.94:  24%|██▍       | 447/1875 [00:19<00:31, 46.05it/s]
loss 0.20 accuracy 0.94:  24%|██▍       | 452/1875 [00:19<00:30, 46.05it/s]
loss 0.42 accuracy 0.91:  24%|██▍       | 452/1875 [00:19<00:30, 46.05it/s]
loss 0.45 accuracy 0.91:  24%|██▍       | 452/1875 [00:19<00:30, 46.05it/s]
loss 0.41 accuracy 0.84:  24%|██▍       | 452/1875 [00:19<00:30, 46.05it/s]
loss 0.25 accuracy 0.88:  24%|██▍       | 452/1875 [00:19<00:30, 46.05it/s]
loss 0.10 accuracy 1.00:  24%|██▍       | 452/1875 [00:19<00:30, 46.05it/s]
loss 0.10 accuracy 1.00:  24%|██▍       | 457/1875 [00:19<00:30, 46.09it/s]
loss 0.25 accuracy 0.94:  24%|██▍       | 457/1875 [00:19<00:30, 46.09it/s]
loss 0.36 accuracy 0.84:  24%|██▍       | 457/1875 [00:20<00:30, 46.09it/s]
loss 0.20 accuracy 0.94:  24%|██▍       | 457/1875 [00:20<00:30, 46.09it/s]
loss 0.16 accuracy 0.97:  24%|██▍       | 457/1875 [00:20<00:30, 46.09it/s]
loss 0.20 accuracy 0.97:  24%|██▍       | 457/1875 [00:20<00:30, 46.09it/s]
loss 0.20 accuracy 0.97:  25%|██▍       | 462/1875 [00:20<00:30, 46.05it/s]
loss 0.14 accuracy 0.97:  25%|██▍       | 462/1875 [00:20<00:30, 46.05it/s]
loss 0.44 accuracy 0.91:  25%|██▍       | 462/1875 [00:20<00:30, 46.05it/s]
loss 0.11 accuracy 1.00:  25%|██▍       | 462/1875 [00:20<00:30, 46.05it/s]
loss 0.32 accuracy 0.97:  25%|██▍       | 462/1875 [00:20<00:30, 46.05it/s]
loss 0.30 accuracy 0.91:  25%|██▍       | 462/1875 [00:20<00:30, 46.05it/s]
loss 0.30 accuracy 0.91:  25%|██▍       | 467/1875 [00:20<00:30, 46.04it/s]
loss 0.15 accuracy 0.97:  25%|██▍       | 467/1875 [00:20<00:30, 46.04it/s]
loss 0.10 accuracy 1.00:  25%|██▍       | 467/1875 [00:20<00:30, 46.04it/s]
loss 0.29 accuracy 0.88:  25%|██▍       | 467/1875 [00:20<00:30, 46.04it/s]
loss 0.18 accuracy 0.97:  25%|██▍       | 467/1875 [00:20<00:30, 46.04it/s]
loss 0.28 accuracy 0.94:  25%|██▍       | 467/1875 [00:20<00:30, 46.04it/s]
loss 0.28 accuracy 0.94:  25%|██▌       | 472/1875 [00:20<00:30, 45.99it/s]
loss 0.18 accuracy 0.94:  25%|██▌       | 472/1875 [00:20<00:30, 45.99it/s]
loss 0.35 accuracy 0.94:  25%|██▌       | 472/1875 [00:20<00:30, 45.99it/s]
loss 0.13 accuracy 0.97:  25%|██▌       | 472/1875 [00:20<00:30, 45.99it/s]
loss 0.20 accuracy 0.97:  25%|██▌       | 472/1875 [00:20<00:30, 45.99it/s]
loss 0.29 accuracy 0.91:  25%|██▌       | 472/1875 [00:20<00:30, 45.99it/s]
loss 0.29 accuracy 0.91:  25%|██▌       | 477/1875 [00:20<00:30, 45.86it/s]
loss 0.51 accuracy 0.88:  25%|██▌       | 477/1875 [00:20<00:30, 45.86it/s]
loss 0.19 accuracy 0.94:  25%|██▌       | 477/1875 [00:20<00:30, 45.86it/s]
loss 0.30 accuracy 0.88:  25%|██▌       | 477/1875 [00:20<00:30, 45.86it/s]
loss 0.10 accuracy 0.97:  25%|██▌       | 477/1875 [00:20<00:30, 45.86it/s]
loss 0.44 accuracy 0.81:  25%|██▌       | 477/1875 [00:20<00:30, 45.86it/s]
loss 0.44 accuracy 0.81:  26%|██▌       | 482/1875 [00:20<00:30, 45.87it/s]
loss 0.35 accuracy 0.91:  26%|██▌       | 482/1875 [00:20<00:30, 45.87it/s]
loss 0.57 accuracy 0.91:  26%|██▌       | 482/1875 [00:20<00:30, 45.87it/s]
loss 0.12 accuracy 0.97:  26%|██▌       | 482/1875 [00:20<00:30, 45.87it/s]
loss 0.18 accuracy 0.94:  26%|██▌       | 482/1875 [00:20<00:30, 45.87it/s]
loss 0.09 accuracy 0.97:  26%|██▌       | 482/1875 [00:20<00:30, 45.87it/s]
loss 0.09 accuracy 0.97:  26%|██▌       | 487/1875 [00:20<00:30, 45.88it/s]
loss 0.28 accuracy 0.94:  26%|██▌       | 487/1875 [00:20<00:30, 45.88it/s]
loss 0.28 accuracy 0.88:  26%|██▌       | 487/1875 [00:20<00:30, 45.88it/s]
loss 0.41 accuracy 0.88:  26%|██▌       | 487/1875 [00:20<00:30, 45.88it/s]
loss 0.29 accuracy 0.91:  26%|██▌       | 487/1875 [00:20<00:30, 45.88it/s]
loss 0.29 accuracy 0.88:  26%|██▌       | 487/1875 [00:20<00:30, 45.88it/s]
loss 0.29 accuracy 0.88:  26%|██▌       | 492/1875 [00:20<00:30, 45.70it/s]
loss 0.05 accuracy 1.00:  26%|██▌       | 492/1875 [00:20<00:30, 45.70it/s]
loss 0.10 accuracy 0.94:  26%|██▌       | 492/1875 [00:20<00:30, 45.70it/s]
loss 0.11 accuracy 0.97:  26%|██▌       | 492/1875 [00:20<00:30, 45.70it/s]
loss 0.15 accuracy 0.94:  26%|██▌       | 492/1875 [00:20<00:30, 45.70it/s]
loss 0.20 accuracy 0.91:  26%|██▌       | 492/1875 [00:20<00:30, 45.70it/s]
loss 0.20 accuracy 0.91:  27%|██▋       | 497/1875 [00:20<00:30, 45.74it/s]
loss 0.13 accuracy 0.97:  27%|██▋       | 497/1875 [00:20<00:30, 45.74it/s]
loss 0.41 accuracy 0.94:  27%|██▋       | 497/1875 [00:20<00:30, 45.74it/s]
loss 0.30 accuracy 0.91:  27%|██▋       | 497/1875 [00:20<00:30, 45.74it/s]
loss 0.57 accuracy 0.84:  27%|██▋       | 497/1875 [00:20<00:30, 45.74it/s]
loss 0.44 accuracy 0.88:  27%|██▋       | 497/1875 [00:20<00:30, 45.74it/s]
loss 0.44 accuracy 0.88:  27%|██▋       | 502/1875 [00:20<00:30, 45.68it/s]
loss 0.26 accuracy 0.91:  27%|██▋       | 502/1875 [00:20<00:30, 45.68it/s]
loss 0.17 accuracy 0.94:  27%|██▋       | 502/1875 [00:20<00:30, 45.68it/s]
loss 0.52 accuracy 0.88:  27%|██▋       | 502/1875 [00:21<00:30, 45.68it/s]
loss 0.06 accuracy 1.00:  27%|██▋       | 502/1875 [00:21<00:30, 45.68it/s]
loss 0.34 accuracy 0.88:  27%|██▋       | 502/1875 [00:21<00:30, 45.68it/s]
loss 0.34 accuracy 0.88:  27%|██▋       | 507/1875 [00:21<00:29, 45.75it/s]
loss 0.42 accuracy 0.94:  27%|██▋       | 507/1875 [00:21<00:29, 45.75it/s]
loss 0.32 accuracy 0.94:  27%|██▋       | 507/1875 [00:21<00:29, 45.75it/s]
loss 0.28 accuracy 0.84:  27%|██▋       | 507/1875 [00:21<00:29, 45.75it/s]
loss 0.26 accuracy 0.97:  27%|██▋       | 507/1875 [00:21<00:29, 45.75it/s]
loss 0.23 accuracy 0.91:  27%|██▋       | 507/1875 [00:21<00:29, 45.75it/s]
loss 0.23 accuracy 0.91:  27%|██▋       | 512/1875 [00:21<00:29, 45.84it/s]
loss 0.40 accuracy 0.81:  27%|██▋       | 512/1875 [00:21<00:29, 45.84it/s]
loss 0.20 accuracy 0.94:  27%|██▋       | 512/1875 [00:21<00:29, 45.84it/s]
loss 0.11 accuracy 1.00:  27%|██▋       | 512/1875 [00:21<00:29, 45.84it/s]
loss 0.28 accuracy 0.91:  27%|██▋       | 512/1875 [00:21<00:29, 45.84it/s]
loss 0.16 accuracy 0.97:  27%|██▋       | 512/1875 [00:21<00:29, 45.84it/s]
loss 0.16 accuracy 0.97:  28%|██▊       | 517/1875 [00:21<00:29, 45.92it/s]
loss 0.37 accuracy 0.91:  28%|██▊       | 517/1875 [00:21<00:29, 45.92it/s]
loss 0.16 accuracy 0.97:  28%|██▊       | 517/1875 [00:21<00:29, 45.92it/s]
loss 0.27 accuracy 0.84:  28%|██▊       | 517/1875 [00:21<00:29, 45.92it/s]
loss 0.58 accuracy 0.84:  28%|██▊       | 517/1875 [00:21<00:29, 45.92it/s]
loss 0.32 accuracy 0.97:  28%|██▊       | 517/1875 [00:21<00:29, 45.92it/s]
loss 0.32 accuracy 0.97:  28%|██▊       | 522/1875 [00:21<00:29, 45.97it/s]
loss 0.14 accuracy 0.97:  28%|██▊       | 522/1875 [00:21<00:29, 45.97it/s]
loss 0.14 accuracy 1.00:  28%|██▊       | 522/1875 [00:21<00:29, 45.97it/s]
loss 0.14 accuracy 0.97:  28%|██▊       | 522/1875 [00:21<00:29, 45.97it/s]
loss 0.08 accuracy 1.00:  28%|██▊       | 522/1875 [00:21<00:29, 45.97it/s]
loss 0.18 accuracy 0.94:  28%|██▊       | 522/1875 [00:21<00:29, 45.97it/s]
loss 0.18 accuracy 0.94:  28%|██▊       | 527/1875 [00:21<00:29, 45.98it/s]
loss 0.31 accuracy 0.91:  28%|██▊       | 527/1875 [00:21<00:29, 45.98it/s]
loss 0.07 accuracy 1.00:  28%|██▊       | 527/1875 [00:21<00:29, 45.98it/s]
loss 0.07 accuracy 1.00:  28%|██▊       | 527/1875 [00:21<00:29, 45.98it/s]
loss 0.32 accuracy 0.88:  28%|██▊       | 527/1875 [00:21<00:29, 45.98it/s]
loss 0.24 accuracy 0.91:  28%|██▊       | 527/1875 [00:21<00:29, 45.98it/s]
loss 0.24 accuracy 0.91:  28%|██▊       | 532/1875 [00:21<00:29, 46.00it/s]
loss 0.15 accuracy 0.97:  28%|██▊       | 532/1875 [00:21<00:29, 46.00it/s]
loss 0.11 accuracy 0.97:  28%|██▊       | 532/1875 [00:21<00:29, 46.00it/s]
loss 0.37 accuracy 0.84:  28%|██▊       | 532/1875 [00:21<00:29, 46.00it/s]
loss 0.15 accuracy 0.94:  28%|██▊       | 532/1875 [00:21<00:29, 46.00it/s]
loss 0.20 accuracy 0.94:  28%|██▊       | 532/1875 [00:21<00:29, 46.00it/s]
loss 0.20 accuracy 0.94:  29%|██▊       | 537/1875 [00:21<00:29, 46.04it/s]
loss 0.09 accuracy 1.00:  29%|██▊       | 537/1875 [00:21<00:29, 46.04it/s]
loss 0.22 accuracy 0.97:  29%|██▊       | 537/1875 [00:21<00:29, 46.04it/s]
loss 0.08 accuracy 1.00:  29%|██▊       | 537/1875 [00:21<00:29, 46.04it/s]
loss 0.27 accuracy 0.91:  29%|██▊       | 537/1875 [00:21<00:29, 46.04it/s]
loss 0.31 accuracy 0.91:  29%|██▊       | 537/1875 [00:21<00:29, 46.04it/s]
loss 0.31 accuracy 0.91:  29%|██▉       | 542/1875 [00:21<00:28, 46.03it/s]
loss 0.27 accuracy 0.91:  29%|██▉       | 542/1875 [00:21<00:28, 46.03it/s]
loss 0.44 accuracy 0.91:  29%|██▉       | 542/1875 [00:21<00:28, 46.03it/s]
loss 0.29 accuracy 0.91:  29%|██▉       | 542/1875 [00:21<00:28, 46.03it/s]
loss 0.26 accuracy 0.91:  29%|██▉       | 542/1875 [00:21<00:28, 46.03it/s]
loss 0.14 accuracy 0.97:  29%|██▉       | 542/1875 [00:21<00:28, 46.03it/s]
loss 0.14 accuracy 0.97:  29%|██▉       | 547/1875 [00:21<00:28, 45.99it/s]
loss 0.15 accuracy 0.97:  29%|██▉       | 547/1875 [00:21<00:28, 45.99it/s]
loss 0.13 accuracy 0.97:  29%|██▉       | 547/1875 [00:21<00:28, 45.99it/s]
loss 0.54 accuracy 0.88:  29%|██▉       | 547/1875 [00:21<00:28, 45.99it/s]
loss 0.31 accuracy 0.91:  29%|██▉       | 547/1875 [00:22<00:28, 45.99it/s]
loss 0.20 accuracy 0.94:  29%|██▉       | 547/1875 [00:22<00:28, 45.99it/s]
loss 0.20 accuracy 0.94:  29%|██▉       | 552/1875 [00:22<00:28, 45.96it/s]
loss 0.07 accuracy 1.00:  29%|██▉       | 552/1875 [00:22<00:28, 45.96it/s]
loss 0.09 accuracy 0.97:  29%|██▉       | 552/1875 [00:22<00:28, 45.96it/s]
loss 0.10 accuracy 0.97:  29%|██▉       | 552/1875 [00:22<00:28, 45.96it/s]
loss 0.34 accuracy 0.91:  29%|██▉       | 552/1875 [00:22<00:28, 45.96it/s]
loss 0.11 accuracy 0.97:  29%|██▉       | 552/1875 [00:22<00:28, 45.96it/s]
loss 0.11 accuracy 0.97:  30%|██▉       | 557/1875 [00:22<00:28, 45.83it/s]
loss 0.40 accuracy 0.88:  30%|██▉       | 557/1875 [00:22<00:28, 45.83it/s]
loss 0.25 accuracy 0.94:  30%|██▉       | 557/1875 [00:22<00:28, 45.83it/s]
loss 0.07 accuracy 1.00:  30%|██▉       | 557/1875 [00:22<00:28, 45.83it/s]
loss 0.11 accuracy 0.97:  30%|██▉       | 557/1875 [00:22<00:28, 45.83it/s]
loss 0.20 accuracy 0.97:  30%|██▉       | 557/1875 [00:22<00:28, 45.83it/s]
loss 0.20 accuracy 0.97:  30%|██▉       | 562/1875 [00:22<00:28, 45.84it/s]
loss 0.05 accuracy 1.00:  30%|██▉       | 562/1875 [00:22<00:28, 45.84it/s]
loss 0.15 accuracy 0.91:  30%|██▉       | 562/1875 [00:22<00:28, 45.84it/s]
loss 0.11 accuracy 1.00:  30%|██▉       | 562/1875 [00:22<00:28, 45.84it/s]
loss 0.11 accuracy 0.97:  30%|██▉       | 562/1875 [00:22<00:28, 45.84it/s]
loss 0.04 accuracy 1.00:  30%|██▉       | 562/1875 [00:22<00:28, 45.84it/s]
loss 0.04 accuracy 1.00:  30%|███       | 567/1875 [00:22<00:28, 45.80it/s]
loss 0.03 accuracy 1.00:  30%|███       | 567/1875 [00:22<00:28, 45.80it/s]
loss 0.19 accuracy 0.91:  30%|███       | 567/1875 [00:22<00:28, 45.80it/s]
loss 0.11 accuracy 0.97:  30%|███       | 567/1875 [00:22<00:28, 45.80it/s]
loss 0.18 accuracy 0.91:  30%|███       | 567/1875 [00:22<00:28, 45.80it/s]
loss 0.04 accuracy 1.00:  30%|███       | 567/1875 [00:22<00:28, 45.80it/s]
loss 0.04 accuracy 1.00:  31%|███       | 572/1875 [00:22<00:28, 45.69it/s]
loss 0.25 accuracy 0.94:  31%|███       | 572/1875 [00:22<00:28, 45.69it/s]
loss 0.33 accuracy 0.91:  31%|███       | 572/1875 [00:22<00:28, 45.69it/s]
loss 0.12 accuracy 0.97:  31%|███       | 572/1875 [00:22<00:28, 45.69it/s]
loss 0.11 accuracy 0.97:  31%|███       | 572/1875 [00:22<00:28, 45.69it/s]
loss 0.52 accuracy 0.91:  31%|███       | 572/1875 [00:22<00:28, 45.69it/s]
loss 0.52 accuracy 0.91:  31%|███       | 577/1875 [00:22<00:28, 45.70it/s]
loss 0.24 accuracy 0.94:  31%|███       | 577/1875 [00:22<00:28, 45.70it/s]
loss 0.16 accuracy 0.94:  31%|███       | 577/1875 [00:22<00:28, 45.70it/s]
loss 0.17 accuracy 0.94:  31%|███       | 577/1875 [00:22<00:28, 45.70it/s]
loss 0.17 accuracy 0.94:  31%|███       | 577/1875 [00:22<00:28, 45.70it/s]
loss 0.17 accuracy 0.91:  31%|███       | 577/1875 [00:22<00:28, 45.70it/s]
loss 0.17 accuracy 0.91:  31%|███       | 582/1875 [00:22<00:28, 45.71it/s]
loss 0.15 accuracy 0.97:  31%|███       | 582/1875 [00:22<00:28, 45.71it/s]
loss 0.13 accuracy 0.97:  31%|███       | 582/1875 [00:22<00:28, 45.71it/s]
loss 0.14 accuracy 0.94:  31%|███       | 582/1875 [00:22<00:28, 45.71it/s]
loss 0.18 accuracy 0.97:  31%|███       | 582/1875 [00:22<00:28, 45.71it/s]
loss 0.05 accuracy 1.00:  31%|███       | 582/1875 [00:22<00:28, 45.71it/s]
loss 0.05 accuracy 1.00:  31%|███▏      | 587/1875 [00:22<00:28, 45.81it/s]
loss 0.53 accuracy 0.91:  31%|███▏      | 587/1875 [00:22<00:28, 45.81it/s]
loss 0.10 accuracy 0.97:  31%|███▏      | 587/1875 [00:22<00:28, 45.81it/s]
loss 0.29 accuracy 0.91:  31%|███▏      | 587/1875 [00:22<00:28, 45.81it/s]
loss 0.12 accuracy 0.94:  31%|███▏      | 587/1875 [00:22<00:28, 45.81it/s]
loss 0.16 accuracy 0.94:  31%|███▏      | 587/1875 [00:22<00:28, 45.81it/s]
loss 0.16 accuracy 0.94:  32%|███▏      | 592/1875 [00:22<00:27, 45.88it/s]
loss 0.21 accuracy 0.94:  32%|███▏      | 592/1875 [00:22<00:27, 45.88it/s]
loss 0.15 accuracy 0.94:  32%|███▏      | 592/1875 [00:22<00:27, 45.88it/s]
loss 0.05 accuracy 1.00:  32%|███▏      | 592/1875 [00:22<00:27, 45.88it/s]
loss 0.15 accuracy 0.94:  32%|███▏      | 592/1875 [00:23<00:27, 45.88it/s]
loss 0.07 accuracy 0.97:  32%|███▏      | 592/1875 [00:23<00:27, 45.88it/s]
loss 0.07 accuracy 0.97:  32%|███▏      | 597/1875 [00:23<00:27, 45.94it/s]
loss 0.23 accuracy 0.91:  32%|███▏      | 597/1875 [00:23<00:27, 45.94it/s]
loss 0.16 accuracy 0.97:  32%|███▏      | 597/1875 [00:23<00:27, 45.94it/s]
loss 0.46 accuracy 0.88:  32%|███▏      | 597/1875 [00:23<00:27, 45.94it/s]
loss 0.34 accuracy 0.94:  32%|███▏      | 597/1875 [00:23<00:27, 45.94it/s]
loss 0.14 accuracy 0.97:  32%|███▏      | 597/1875 [00:23<00:27, 45.94it/s]
loss 0.14 accuracy 0.97:  32%|███▏      | 602/1875 [00:23<00:27, 45.99it/s]
loss 0.09 accuracy 1.00:  32%|███▏      | 602/1875 [00:23<00:27, 45.99it/s]
loss 0.23 accuracy 0.94:  32%|███▏      | 602/1875 [00:23<00:27, 45.99it/s]
loss 0.35 accuracy 0.88:  32%|███▏      | 602/1875 [00:23<00:27, 45.99it/s]
loss 0.34 accuracy 0.94:  32%|███▏      | 602/1875 [00:23<00:27, 45.99it/s]
loss 0.46 accuracy 0.88:  32%|███▏      | 602/1875 [00:23<00:27, 45.99it/s]
loss 0.46 accuracy 0.88:  32%|███▏      | 607/1875 [00:23<00:27, 46.05it/s]
loss 0.41 accuracy 0.84:  32%|███▏      | 607/1875 [00:23<00:27, 46.05it/s]
loss 0.15 accuracy 0.94:  32%|███▏      | 607/1875 [00:23<00:27, 46.05it/s]
loss 0.12 accuracy 0.97:  32%|███▏      | 607/1875 [00:23<00:27, 46.05it/s]
loss 0.39 accuracy 0.84:  32%|███▏      | 607/1875 [00:23<00:27, 46.05it/s]
loss 0.07 accuracy 0.97:  32%|███▏      | 607/1875 [00:23<00:27, 46.05it/s]
loss 0.07 accuracy 0.97:  33%|███▎      | 612/1875 [00:23<00:27, 46.08it/s]
loss 0.06 accuracy 0.97:  33%|███▎      | 612/1875 [00:23<00:27, 46.08it/s]
loss 0.11 accuracy 0.97:  33%|███▎      | 612/1875 [00:23<00:27, 46.08it/s]
loss 0.23 accuracy 0.91:  33%|███▎      | 612/1875 [00:23<00:27, 46.08it/s]
loss 0.11 accuracy 1.00:  33%|███▎      | 612/1875 [00:23<00:27, 46.08it/s]
loss 0.19 accuracy 0.94:  33%|███▎      | 612/1875 [00:23<00:27, 46.08it/s]
loss 0.19 accuracy 0.94:  33%|███▎      | 617/1875 [00:23<00:27, 46.07it/s]
loss 0.17 accuracy 0.94:  33%|███▎      | 617/1875 [00:23<00:27, 46.07it/s]
loss 0.06 accuracy 1.00:  33%|███▎      | 617/1875 [00:23<00:27, 46.07it/s]
loss 0.15 accuracy 0.97:  33%|███▎      | 617/1875 [00:23<00:27, 46.07it/s]
loss 0.07 accuracy 1.00:  33%|███▎      | 617/1875 [00:23<00:27, 46.07it/s]
loss 0.34 accuracy 0.94:  33%|███▎      | 617/1875 [00:23<00:27, 46.07it/s]
loss 0.34 accuracy 0.94:  33%|███▎      | 622/1875 [00:23<00:27, 46.09it/s]
loss 0.07 accuracy 0.97:  33%|███▎      | 622/1875 [00:23<00:27, 46.09it/s]
loss 0.16 accuracy 0.94:  33%|███▎      | 622/1875 [00:23<00:27, 46.09it/s]
loss 0.28 accuracy 0.91:  33%|███▎      | 622/1875 [00:23<00:27, 46.09it/s]
loss 0.04 accuracy 1.00:  33%|███▎      | 622/1875 [00:23<00:27, 46.09it/s]
loss 0.24 accuracy 0.97:  33%|███▎      | 622/1875 [00:23<00:27, 46.09it/s]
loss 0.24 accuracy 0.97:  33%|███▎      | 627/1875 [00:23<00:27, 46.12it/s]
loss 0.11 accuracy 0.97:  33%|███▎      | 627/1875 [00:23<00:27, 46.12it/s]
loss 0.15 accuracy 0.97:  33%|███▎      | 627/1875 [00:23<00:27, 46.12it/s]
loss 0.09 accuracy 1.00:  33%|███▎      | 627/1875 [00:23<00:27, 46.12it/s]
loss 0.07 accuracy 1.00:  33%|███▎      | 627/1875 [00:23<00:27, 46.12it/s]
loss 0.04 accuracy 1.00:  33%|███▎      | 627/1875 [00:23<00:27, 46.12it/s]
loss 0.04 accuracy 1.00:  34%|███▎      | 632/1875 [00:23<00:26, 46.14it/s]
loss 0.07 accuracy 1.00:  34%|███▎      | 632/1875 [00:23<00:26, 46.14it/s]
loss 0.52 accuracy 0.88:  34%|███▎      | 632/1875 [00:23<00:26, 46.14it/s]
loss 0.23 accuracy 0.94:  34%|███▎      | 632/1875 [00:23<00:26, 46.14it/s]
loss 0.16 accuracy 0.91:  34%|███▎      | 632/1875 [00:23<00:26, 46.14it/s]
loss 0.14 accuracy 0.97:  34%|███▎      | 632/1875 [00:23<00:26, 46.14it/s]
loss 0.14 accuracy 0.97:  34%|███▍      | 637/1875 [00:23<00:26, 46.16it/s]
loss 0.50 accuracy 0.91:  34%|███▍      | 637/1875 [00:23<00:26, 46.16it/s]
loss 0.18 accuracy 0.97:  34%|███▍      | 637/1875 [00:23<00:26, 46.16it/s]
loss 0.12 accuracy 1.00:  34%|███▍      | 637/1875 [00:23<00:26, 46.16it/s]
loss 0.28 accuracy 0.88:  34%|███▍      | 637/1875 [00:23<00:26, 46.16it/s]
loss 0.09 accuracy 1.00:  34%|███▍      | 637/1875 [00:23<00:26, 46.16it/s]
loss 0.09 accuracy 1.00:  34%|███▍      | 642/1875 [00:23<00:26, 46.18it/s]
loss 0.22 accuracy 0.94:  34%|███▍      | 642/1875 [00:24<00:26, 46.18it/s]
loss 0.05 accuracy 1.00:  34%|███▍      | 642/1875 [00:24<00:26, 46.18it/s]
loss 0.26 accuracy 0.88:  34%|███▍      | 642/1875 [00:24<00:26, 46.18it/s]
loss 0.14 accuracy 0.97:  34%|███▍      | 642/1875 [00:24<00:26, 46.18it/s]
loss 0.23 accuracy 0.94:  34%|███▍      | 642/1875 [00:24<00:26, 46.18it/s]
loss 0.23 accuracy 0.94:  35%|███▍      | 647/1875 [00:24<00:26, 46.14it/s]
loss 0.17 accuracy 0.97:  35%|███▍      | 647/1875 [00:24<00:26, 46.14it/s]
[... tqdm progress redraws trimmed: steps 647–1042 of 1875, loss roughly 0.02–0.51, accuracy 0.81–1.00, throughput steady at ~45.7–46.2 it/s ...]
loss 0.15 accuracy 0.91:  56%|█████▌    | 1042/1875 [00:32<00:18, 45.75it/s]
loss 0.29 accuracy 0.91:  56%|█████▌    | 1042/1875 [00:32<00:18, 45.75it/s]
loss 0.12 accuracy 0.97:  56%|█████▌    | 1042/1875 [00:32<00:18, 45.75it/s]
loss 0.16 accuracy 0.91:  56%|█████▌    | 1042/1875 [00:32<00:18, 45.75it/s]
loss 0.16 accuracy 0.94:  56%|█████▌    | 1042/1875 [00:32<00:18, 45.75it/s]
loss 0.11 accuracy 0.97:  56%|█████▌    | 1042/1875 [00:32<00:18, 45.75it/s]
loss 0.11 accuracy 0.97:  56%|█████▌    | 1047/1875 [00:32<00:18, 45.73it/s]
loss 0.07 accuracy 1.00:  56%|█████▌    | 1047/1875 [00:32<00:18, 45.73it/s]
loss 0.25 accuracy 0.97:  56%|█████▌    | 1047/1875 [00:32<00:18, 45.73it/s]
loss 0.04 accuracy 1.00:  56%|█████▌    | 1047/1875 [00:32<00:18, 45.73it/s]
loss 0.20 accuracy 0.94:  56%|█████▌    | 1047/1875 [00:32<00:18, 45.73it/s]
loss 0.18 accuracy 0.94:  56%|█████▌    | 1047/1875 [00:32<00:18, 45.73it/s]
loss 0.18 accuracy 0.94:  56%|█████▌    | 1052/1875 [00:32<00:17, 45.74it/s]
loss 0.15 accuracy 0.97:  56%|█████▌    | 1052/1875 [00:32<00:17, 45.74it/s]
loss 0.10 accuracy 0.94:  56%|█████▌    | 1052/1875 [00:32<00:17, 45.74it/s]
loss 0.02 accuracy 1.00:  56%|█████▌    | 1052/1875 [00:32<00:17, 45.74it/s]
loss 0.05 accuracy 1.00:  56%|█████▌    | 1052/1875 [00:33<00:17, 45.74it/s]
loss 0.07 accuracy 0.97:  56%|█████▌    | 1052/1875 [00:33<00:17, 45.74it/s]
loss 0.07 accuracy 0.97:  56%|█████▋    | 1057/1875 [00:33<00:17, 45.73it/s]
loss 0.06 accuracy 0.97:  56%|█████▋    | 1057/1875 [00:33<00:17, 45.73it/s]
loss 0.17 accuracy 0.94:  56%|█████▋    | 1057/1875 [00:33<00:17, 45.73it/s]
loss 0.06 accuracy 1.00:  56%|█████▋    | 1057/1875 [00:33<00:17, 45.73it/s]
loss 0.22 accuracy 0.94:  56%|█████▋    | 1057/1875 [00:33<00:17, 45.73it/s]
loss 0.15 accuracy 0.97:  56%|█████▋    | 1057/1875 [00:33<00:17, 45.73it/s]
loss 0.15 accuracy 0.97:  57%|█████▋    | 1062/1875 [00:33<00:17, 45.78it/s]
loss 0.11 accuracy 0.97:  57%|█████▋    | 1062/1875 [00:33<00:17, 45.78it/s]
loss 0.02 accuracy 1.00:  57%|█████▋    | 1062/1875 [00:33<00:17, 45.78it/s]
loss 0.02 accuracy 1.00:  57%|█████▋    | 1062/1875 [00:33<00:17, 45.78it/s]
loss 0.09 accuracy 0.97:  57%|█████▋    | 1062/1875 [00:33<00:17, 45.78it/s]
loss 0.10 accuracy 0.97:  57%|█████▋    | 1062/1875 [00:33<00:17, 45.78it/s]
loss 0.10 accuracy 0.97:  57%|█████▋    | 1067/1875 [00:33<00:17, 45.87it/s]
loss 0.07 accuracy 0.97:  57%|█████▋    | 1067/1875 [00:33<00:17, 45.87it/s]
loss 0.21 accuracy 0.91:  57%|█████▋    | 1067/1875 [00:33<00:17, 45.87it/s]
loss 0.10 accuracy 1.00:  57%|█████▋    | 1067/1875 [00:33<00:17, 45.87it/s]
loss 0.15 accuracy 0.97:  57%|█████▋    | 1067/1875 [00:33<00:17, 45.87it/s]
loss 0.32 accuracy 0.97:  57%|█████▋    | 1067/1875 [00:33<00:17, 45.87it/s]
loss 0.32 accuracy 0.97:  57%|█████▋    | 1072/1875 [00:33<00:17, 45.93it/s]
loss 0.09 accuracy 0.97:  57%|█████▋    | 1072/1875 [00:33<00:17, 45.93it/s]
loss 0.09 accuracy 0.97:  57%|█████▋    | 1072/1875 [00:33<00:17, 45.93it/s]
loss 0.23 accuracy 0.97:  57%|█████▋    | 1072/1875 [00:33<00:17, 45.93it/s]
loss 0.14 accuracy 0.94:  57%|█████▋    | 1072/1875 [00:33<00:17, 45.93it/s]
loss 0.30 accuracy 0.97:  57%|█████▋    | 1072/1875 [00:33<00:17, 45.93it/s]
loss 0.30 accuracy 0.97:  57%|█████▋    | 1077/1875 [00:33<00:17, 45.98it/s]
loss 0.11 accuracy 1.00:  57%|█████▋    | 1077/1875 [00:33<00:17, 45.98it/s]
loss 0.09 accuracy 0.97:  57%|█████▋    | 1077/1875 [00:33<00:17, 45.98it/s]
loss 0.06 accuracy 1.00:  57%|█████▋    | 1077/1875 [00:33<00:17, 45.98it/s]
loss 0.11 accuracy 0.94:  57%|█████▋    | 1077/1875 [00:33<00:17, 45.98it/s]
loss 0.21 accuracy 0.94:  57%|█████▋    | 1077/1875 [00:33<00:17, 45.98it/s]
loss 0.21 accuracy 0.94:  58%|█████▊    | 1082/1875 [00:33<00:17, 46.04it/s]
loss 0.05 accuracy 1.00:  58%|█████▊    | 1082/1875 [00:33<00:17, 46.04it/s]
loss 0.08 accuracy 0.97:  58%|█████▊    | 1082/1875 [00:33<00:17, 46.04it/s]
loss 0.14 accuracy 0.97:  58%|█████▊    | 1082/1875 [00:33<00:17, 46.04it/s]
loss 0.22 accuracy 0.94:  58%|█████▊    | 1082/1875 [00:33<00:17, 46.04it/s]
loss 0.11 accuracy 1.00:  58%|█████▊    | 1082/1875 [00:33<00:17, 46.04it/s]
loss 0.11 accuracy 1.00:  58%|█████▊    | 1087/1875 [00:33<00:17, 46.08it/s]
loss 0.13 accuracy 0.97:  58%|█████▊    | 1087/1875 [00:33<00:17, 46.08it/s]
loss 0.05 accuracy 0.97:  58%|█████▊    | 1087/1875 [00:33<00:17, 46.08it/s]
loss 0.16 accuracy 0.91:  58%|█████▊    | 1087/1875 [00:33<00:17, 46.08it/s]
loss 0.05 accuracy 1.00:  58%|█████▊    | 1087/1875 [00:33<00:17, 46.08it/s]
loss 0.20 accuracy 0.91:  58%|█████▊    | 1087/1875 [00:33<00:17, 46.08it/s]
loss 0.20 accuracy 0.91:  58%|█████▊    | 1092/1875 [00:33<00:17, 46.05it/s]
loss 0.08 accuracy 0.97:  58%|█████▊    | 1092/1875 [00:33<00:17, 46.05it/s]
loss 0.04 accuracy 1.00:  58%|█████▊    | 1092/1875 [00:33<00:17, 46.05it/s]
loss 0.14 accuracy 0.94:  58%|█████▊    | 1092/1875 [00:33<00:17, 46.05it/s]
loss 0.06 accuracy 1.00:  58%|█████▊    | 1092/1875 [00:33<00:17, 46.05it/s]
loss 0.04 accuracy 1.00:  58%|█████▊    | 1092/1875 [00:33<00:17, 46.05it/s]
loss 0.04 accuracy 1.00:  59%|█████▊    | 1097/1875 [00:33<00:16, 46.08it/s]
loss 0.02 accuracy 1.00:  59%|█████▊    | 1097/1875 [00:33<00:16, 46.08it/s]
loss 0.29 accuracy 0.91:  59%|█████▊    | 1097/1875 [00:33<00:16, 46.08it/s]
loss 0.03 accuracy 1.00:  59%|█████▊    | 1097/1875 [00:33<00:16, 46.08it/s]
loss 0.13 accuracy 0.97:  59%|█████▊    | 1097/1875 [00:33<00:16, 46.08it/s]
loss 0.06 accuracy 1.00:  59%|█████▊    | 1097/1875 [00:34<00:16, 46.08it/s]
loss 0.06 accuracy 1.00:  59%|█████▉    | 1102/1875 [00:34<00:16, 46.08it/s]
loss 0.14 accuracy 0.97:  59%|█████▉    | 1102/1875 [00:34<00:16, 46.08it/s]
loss 0.31 accuracy 0.91:  59%|█████▉    | 1102/1875 [00:34<00:16, 46.08it/s]
loss 0.22 accuracy 0.94:  59%|█████▉    | 1102/1875 [00:34<00:16, 46.08it/s]
loss 0.14 accuracy 0.97:  59%|█████▉    | 1102/1875 [00:34<00:16, 46.08it/s]
loss 0.10 accuracy 0.97:  59%|█████▉    | 1102/1875 [00:34<00:16, 46.08it/s]
loss 0.10 accuracy 0.97:  59%|█████▉    | 1107/1875 [00:34<00:16, 46.07it/s]
loss 0.07 accuracy 1.00:  59%|█████▉    | 1107/1875 [00:34<00:16, 46.07it/s]
loss 0.04 accuracy 1.00:  59%|█████▉    | 1107/1875 [00:34<00:16, 46.07it/s]
loss 0.20 accuracy 0.91:  59%|█████▉    | 1107/1875 [00:34<00:16, 46.07it/s]
loss 0.27 accuracy 0.94:  59%|█████▉    | 1107/1875 [00:34<00:16, 46.07it/s]
loss 0.24 accuracy 0.97:  59%|█████▉    | 1107/1875 [00:34<00:16, 46.07it/s]
loss 0.24 accuracy 0.97:  59%|█████▉    | 1112/1875 [00:34<00:16, 46.07it/s]
loss 0.18 accuracy 0.94:  59%|█████▉    | 1112/1875 [00:34<00:16, 46.07it/s]
loss 0.04 accuracy 1.00:  59%|█████▉    | 1112/1875 [00:34<00:16, 46.07it/s]
loss 0.11 accuracy 0.94:  59%|█████▉    | 1112/1875 [00:34<00:16, 46.07it/s]
loss 0.04 accuracy 1.00:  59%|█████▉    | 1112/1875 [00:34<00:16, 46.07it/s]
loss 0.04 accuracy 1.00:  59%|█████▉    | 1112/1875 [00:34<00:16, 46.07it/s]
loss 0.04 accuracy 1.00:  60%|█████▉    | 1117/1875 [00:34<00:16, 46.07it/s]
loss 0.11 accuracy 0.97:  60%|█████▉    | 1117/1875 [00:34<00:16, 46.07it/s]
loss 0.06 accuracy 1.00:  60%|█████▉    | 1117/1875 [00:34<00:16, 46.07it/s]
loss 0.44 accuracy 0.84:  60%|█████▉    | 1117/1875 [00:34<00:16, 46.07it/s]
loss 0.17 accuracy 0.94:  60%|█████▉    | 1117/1875 [00:34<00:16, 46.07it/s]
loss 0.17 accuracy 0.94:  60%|█████▉    | 1117/1875 [00:34<00:16, 46.07it/s]
loss 0.17 accuracy 0.94:  60%|█████▉    | 1122/1875 [00:34<00:16, 46.01it/s]
loss 0.11 accuracy 0.97:  60%|█████▉    | 1122/1875 [00:34<00:16, 46.01it/s]
loss 0.20 accuracy 0.91:  60%|█████▉    | 1122/1875 [00:34<00:16, 46.01it/s]
loss 0.06 accuracy 1.00:  60%|█████▉    | 1122/1875 [00:34<00:16, 46.01it/s]
loss 0.27 accuracy 0.88:  60%|█████▉    | 1122/1875 [00:34<00:16, 46.01it/s]
loss 0.31 accuracy 0.91:  60%|█████▉    | 1122/1875 [00:34<00:16, 46.01it/s]
loss 0.31 accuracy 0.91:  60%|██████    | 1127/1875 [00:34<00:16, 45.87it/s]
loss 0.12 accuracy 0.97:  60%|██████    | 1127/1875 [00:34<00:16, 45.87it/s]
loss 0.11 accuracy 0.97:  60%|██████    | 1127/1875 [00:34<00:16, 45.87it/s]
loss 0.09 accuracy 0.97:  60%|██████    | 1127/1875 [00:34<00:16, 45.87it/s]
loss 0.20 accuracy 0.91:  60%|██████    | 1127/1875 [00:34<00:16, 45.87it/s]
loss 0.08 accuracy 1.00:  60%|██████    | 1127/1875 [00:34<00:16, 45.87it/s]
loss 0.08 accuracy 1.00:  60%|██████    | 1132/1875 [00:34<00:16, 45.88it/s]
loss 0.11 accuracy 0.97:  60%|██████    | 1132/1875 [00:34<00:16, 45.88it/s]
loss 0.24 accuracy 0.94:  60%|██████    | 1132/1875 [00:34<00:16, 45.88it/s]
loss 0.27 accuracy 0.88:  60%|██████    | 1132/1875 [00:34<00:16, 45.88it/s]
loss 0.04 accuracy 1.00:  60%|██████    | 1132/1875 [00:34<00:16, 45.88it/s]
loss 0.09 accuracy 0.97:  60%|██████    | 1132/1875 [00:34<00:16, 45.88it/s]
loss 0.09 accuracy 0.97:  61%|██████    | 1137/1875 [00:34<00:16, 45.89it/s]
loss 0.12 accuracy 0.97:  61%|██████    | 1137/1875 [00:34<00:16, 45.89it/s]
loss 0.13 accuracy 0.97:  61%|██████    | 1137/1875 [00:34<00:16, 45.89it/s]
loss 0.34 accuracy 0.97:  61%|██████    | 1137/1875 [00:34<00:16, 45.89it/s]
loss 0.07 accuracy 0.97:  61%|██████    | 1137/1875 [00:34<00:16, 45.89it/s]
loss 0.39 accuracy 0.84:  61%|██████    | 1137/1875 [00:34<00:16, 45.89it/s]
loss 0.39 accuracy 0.84:  61%|██████    | 1142/1875 [00:34<00:16, 45.77it/s]
loss 0.14 accuracy 0.97:  61%|██████    | 1142/1875 [00:34<00:16, 45.77it/s]
loss 0.11 accuracy 0.94:  61%|██████    | 1142/1875 [00:34<00:16, 45.77it/s]
loss 0.12 accuracy 0.97:  61%|██████    | 1142/1875 [00:34<00:16, 45.77it/s]
loss 0.17 accuracy 0.94:  61%|██████    | 1142/1875 [00:34<00:16, 45.77it/s]
loss 0.22 accuracy 0.91:  61%|██████    | 1142/1875 [00:34<00:16, 45.77it/s]
loss 0.22 accuracy 0.91:  61%|██████    | 1147/1875 [00:34<00:15, 45.76it/s]
loss 0.14 accuracy 0.94:  61%|██████    | 1147/1875 [00:35<00:15, 45.76it/s]
loss 0.13 accuracy 0.94:  61%|██████    | 1147/1875 [00:35<00:15, 45.76it/s]
loss 0.03 accuracy 1.00:  61%|██████    | 1147/1875 [00:35<00:15, 45.76it/s]
loss 0.10 accuracy 0.94:  61%|██████    | 1147/1875 [00:35<00:15, 45.76it/s]
loss 0.05 accuracy 1.00:  61%|██████    | 1147/1875 [00:35<00:15, 45.76it/s]
loss 0.05 accuracy 1.00:  61%|██████▏   | 1152/1875 [00:35<00:15, 45.73it/s]
loss 0.08 accuracy 0.97:  61%|██████▏   | 1152/1875 [00:35<00:15, 45.73it/s]
loss 0.08 accuracy 1.00:  61%|██████▏   | 1152/1875 [00:35<00:15, 45.73it/s]
loss 0.03 accuracy 1.00:  61%|██████▏   | 1152/1875 [00:35<00:15, 45.73it/s]
loss 0.30 accuracy 0.94:  61%|██████▏   | 1152/1875 [00:35<00:15, 45.73it/s]
loss 0.13 accuracy 0.94:  61%|██████▏   | 1152/1875 [00:35<00:15, 45.73it/s]
loss 0.13 accuracy 0.94:  62%|██████▏   | 1157/1875 [00:35<00:15, 45.73it/s]
loss 0.06 accuracy 1.00:  62%|██████▏   | 1157/1875 [00:35<00:15, 45.73it/s]
loss 0.17 accuracy 0.97:  62%|██████▏   | 1157/1875 [00:35<00:15, 45.73it/s]
loss 0.23 accuracy 0.91:  62%|██████▏   | 1157/1875 [00:35<00:15, 45.73it/s]
loss 0.12 accuracy 0.94:  62%|██████▏   | 1157/1875 [00:35<00:15, 45.73it/s]
loss 0.06 accuracy 1.00:  62%|██████▏   | 1157/1875 [00:35<00:15, 45.73it/s]
loss 0.06 accuracy 1.00:  62%|██████▏   | 1162/1875 [00:35<00:15, 45.81it/s]
loss 0.19 accuracy 0.94:  62%|██████▏   | 1162/1875 [00:35<00:15, 45.81it/s]
loss 0.07 accuracy 0.97:  62%|██████▏   | 1162/1875 [00:35<00:15, 45.81it/s]
loss 0.09 accuracy 1.00:  62%|██████▏   | 1162/1875 [00:35<00:15, 45.81it/s]
loss 0.21 accuracy 0.94:  62%|██████▏   | 1162/1875 [00:35<00:15, 45.81it/s]
loss 0.19 accuracy 0.91:  62%|██████▏   | 1162/1875 [00:35<00:15, 45.81it/s]
loss 0.19 accuracy 0.91:  62%|██████▏   | 1167/1875 [00:35<00:15, 45.85it/s]
loss 0.11 accuracy 0.97:  62%|██████▏   | 1167/1875 [00:35<00:15, 45.85it/s]
loss 0.23 accuracy 0.94:  62%|██████▏   | 1167/1875 [00:35<00:15, 45.85it/s]
loss 0.13 accuracy 0.97:  62%|██████▏   | 1167/1875 [00:35<00:15, 45.85it/s]
loss 0.06 accuracy 0.97:  62%|██████▏   | 1167/1875 [00:35<00:15, 45.85it/s]
loss 0.09 accuracy 0.97:  62%|██████▏   | 1167/1875 [00:35<00:15, 45.85it/s]
loss 0.09 accuracy 0.97:  63%|██████▎   | 1172/1875 [00:35<00:15, 45.96it/s]
loss 0.11 accuracy 0.97:  63%|██████▎   | 1172/1875 [00:35<00:15, 45.96it/s]
loss 0.16 accuracy 0.97:  63%|██████▎   | 1172/1875 [00:35<00:15, 45.96it/s]
loss 0.06 accuracy 0.97:  63%|██████▎   | 1172/1875 [00:35<00:15, 45.96it/s]
loss 0.06 accuracy 1.00:  63%|██████▎   | 1172/1875 [00:35<00:15, 45.96it/s]
loss 0.13 accuracy 0.97:  63%|██████▎   | 1172/1875 [00:35<00:15, 45.96it/s]
loss 0.13 accuracy 0.97:  63%|██████▎   | 1177/1875 [00:35<00:15, 45.99it/s]
loss 0.26 accuracy 0.91:  63%|██████▎   | 1177/1875 [00:35<00:15, 45.99it/s]
loss 0.28 accuracy 0.94:  63%|██████▎   | 1177/1875 [00:35<00:15, 45.99it/s]
loss 0.06 accuracy 1.00:  63%|██████▎   | 1177/1875 [00:35<00:15, 45.99it/s]
loss 0.08 accuracy 0.97:  63%|██████▎   | 1177/1875 [00:35<00:15, 45.99it/s]
loss 0.10 accuracy 0.97:  63%|██████▎   | 1177/1875 [00:35<00:15, 45.99it/s]
loss 0.10 accuracy 0.97:  63%|██████▎   | 1182/1875 [00:35<00:15, 46.04it/s]
loss 0.35 accuracy 0.88:  63%|██████▎   | 1182/1875 [00:35<00:15, 46.04it/s]
loss 0.09 accuracy 0.97:  63%|██████▎   | 1182/1875 [00:35<00:15, 46.04it/s]
loss 0.34 accuracy 0.91:  63%|██████▎   | 1182/1875 [00:35<00:15, 46.04it/s]
loss 0.08 accuracy 0.97:  63%|██████▎   | 1182/1875 [00:35<00:15, 46.04it/s]
loss 0.18 accuracy 0.94:  63%|██████▎   | 1182/1875 [00:35<00:15, 46.04it/s]
loss 0.18 accuracy 0.94:  63%|██████▎   | 1187/1875 [00:35<00:14, 46.06it/s]
loss 0.28 accuracy 0.91:  63%|██████▎   | 1187/1875 [00:35<00:14, 46.06it/s]
loss 0.14 accuracy 0.97:  63%|██████▎   | 1187/1875 [00:35<00:14, 46.06it/s]
loss 0.04 accuracy 1.00:  63%|██████▎   | 1187/1875 [00:35<00:14, 46.06it/s]
loss 0.08 accuracy 1.00:  63%|██████▎   | 1187/1875 [00:35<00:14, 46.06it/s]
loss 0.18 accuracy 0.94:  63%|██████▎   | 1187/1875 [00:35<00:14, 46.06it/s]
loss 0.18 accuracy 0.94:  64%|██████▎   | 1192/1875 [00:35<00:14, 46.07it/s]
loss 0.21 accuracy 0.94:  64%|██████▎   | 1192/1875 [00:35<00:14, 46.07it/s]
loss 0.08 accuracy 0.97:  64%|██████▎   | 1192/1875 [00:36<00:14, 46.07it/s]
loss 0.26 accuracy 0.94:  64%|██████▎   | 1192/1875 [00:36<00:14, 46.07it/s]
loss 0.04 accuracy 1.00:  64%|██████▎   | 1192/1875 [00:36<00:14, 46.07it/s]
loss 0.04 accuracy 1.00:  64%|██████▎   | 1192/1875 [00:36<00:14, 46.07it/s]
loss 0.04 accuracy 1.00:  64%|██████▍   | 1197/1875 [00:36<00:14, 46.14it/s]
loss 0.05 accuracy 1.00:  64%|██████▍   | 1197/1875 [00:36<00:14, 46.14it/s]
loss 0.19 accuracy 0.91:  64%|██████▍   | 1197/1875 [00:36<00:14, 46.14it/s]
loss 0.11 accuracy 1.00:  64%|██████▍   | 1197/1875 [00:36<00:14, 46.14it/s]
loss 0.20 accuracy 0.91:  64%|██████▍   | 1197/1875 [00:36<00:14, 46.14it/s]
loss 0.13 accuracy 0.94:  64%|██████▍   | 1197/1875 [00:36<00:14, 46.14it/s]
loss 0.13 accuracy 0.94:  64%|██████▍   | 1202/1875 [00:36<00:14, 46.16it/s]
loss 0.05 accuracy 0.97:  64%|██████▍   | 1202/1875 [00:36<00:14, 46.16it/s]
loss 0.11 accuracy 0.97:  64%|██████▍   | 1202/1875 [00:36<00:14, 46.16it/s]
loss 0.09 accuracy 0.97:  64%|██████▍   | 1202/1875 [00:36<00:14, 46.16it/s]
loss 0.08 accuracy 0.97:  64%|██████▍   | 1202/1875 [00:36<00:14, 46.16it/s]
loss 0.13 accuracy 0.94:  64%|██████▍   | 1202/1875 [00:36<00:14, 46.16it/s]
loss 0.13 accuracy 0.94:  64%|██████▍   | 1207/1875 [00:36<00:14, 46.15it/s]
loss 0.11 accuracy 0.97:  64%|██████▍   | 1207/1875 [00:36<00:14, 46.15it/s]
loss 0.09 accuracy 0.97:  64%|██████▍   | 1207/1875 [00:36<00:14, 46.15it/s]
loss 0.11 accuracy 0.94:  64%|██████▍   | 1207/1875 [00:36<00:14, 46.15it/s]
loss 0.06 accuracy 1.00:  64%|██████▍   | 1207/1875 [00:36<00:14, 46.15it/s]
loss 0.43 accuracy 0.94:  64%|██████▍   | 1207/1875 [00:36<00:14, 46.15it/s]
loss 0.43 accuracy 0.94:  65%|██████▍   | 1212/1875 [00:36<00:14, 46.19it/s]
loss 0.11 accuracy 0.94:  65%|██████▍   | 1212/1875 [00:36<00:14, 46.19it/s]
loss 0.09 accuracy 1.00:  65%|██████▍   | 1212/1875 [00:36<00:14, 46.19it/s]
loss 0.10 accuracy 0.97:  65%|██████▍   | 1212/1875 [00:36<00:14, 46.19it/s]
loss 0.06 accuracy 0.97:  65%|██████▍   | 1212/1875 [00:36<00:14, 46.19it/s]
loss 0.22 accuracy 0.94:  65%|██████▍   | 1212/1875 [00:36<00:14, 46.19it/s]
loss 0.22 accuracy 0.94:  65%|██████▍   | 1217/1875 [00:36<00:14, 46.18it/s]
loss 0.11 accuracy 0.97:  65%|██████▍   | 1217/1875 [00:36<00:14, 46.18it/s]
loss 0.17 accuracy 0.94:  65%|██████▍   | 1217/1875 [00:36<00:14, 46.18it/s]
loss 0.14 accuracy 0.91:  65%|██████▍   | 1217/1875 [00:36<00:14, 46.18it/s]
loss 0.03 accuracy 1.00:  65%|██████▍   | 1217/1875 [00:36<00:14, 46.18it/s]
loss 0.04 accuracy 1.00:  65%|██████▍   | 1217/1875 [00:36<00:14, 46.18it/s]
loss 0.04 accuracy 1.00:  65%|██████▌   | 1222/1875 [00:36<00:14, 46.17it/s]
loss 0.09 accuracy 0.97:  65%|██████▌   | 1222/1875 [00:36<00:14, 46.17it/s]
loss 0.18 accuracy 0.94:  65%|██████▌   | 1222/1875 [00:36<00:14, 46.17it/s]
loss 0.03 accuracy 1.00:  65%|██████▌   | 1222/1875 [00:36<00:14, 46.17it/s]
loss 0.24 accuracy 0.97:  65%|██████▌   | 1222/1875 [00:36<00:14, 46.17it/s]
loss 0.04 accuracy 1.00:  65%|██████▌   | 1222/1875 [00:36<00:14, 46.17it/s]
loss 0.04 accuracy 1.00:  65%|██████▌   | 1227/1875 [00:36<00:14, 46.15it/s]
loss 0.10 accuracy 1.00:  65%|██████▌   | 1227/1875 [00:36<00:14, 46.15it/s]
loss 0.20 accuracy 0.91:  65%|██████▌   | 1227/1875 [00:36<00:14, 46.15it/s]
loss 0.07 accuracy 0.97:  65%|██████▌   | 1227/1875 [00:36<00:14, 46.15it/s]
loss 0.07 accuracy 0.97:  65%|██████▌   | 1227/1875 [00:36<00:14, 46.15it/s]
loss 0.04 accuracy 1.00:  65%|██████▌   | 1227/1875 [00:36<00:14, 46.15it/s]
loss 0.04 accuracy 1.00:  66%|██████▌   | 1232/1875 [00:36<00:13, 46.15it/s]
loss 0.05 accuracy 1.00:  66%|██████▌   | 1232/1875 [00:36<00:13, 46.15it/s]
loss 0.06 accuracy 0.97:  66%|██████▌   | 1232/1875 [00:36<00:13, 46.15it/s]
loss 0.30 accuracy 0.94:  66%|██████▌   | 1232/1875 [00:36<00:13, 46.15it/s]
loss 0.10 accuracy 0.97:  66%|██████▌   | 1232/1875 [00:36<00:13, 46.15it/s]
loss 0.20 accuracy 0.94:  66%|██████▌   | 1232/1875 [00:36<00:13, 46.15it/s]
loss 0.20 accuracy 0.94:  66%|██████▌   | 1237/1875 [00:36<00:13, 46.15it/s]
loss 0.06 accuracy 0.97:  66%|██████▌   | 1237/1875 [00:36<00:13, 46.15it/s]
loss 0.22 accuracy 0.91:  66%|██████▌   | 1237/1875 [00:36<00:13, 46.15it/s]
loss 0.05 accuracy 1.00:  66%|██████▌   | 1237/1875 [00:37<00:13, 46.15it/s]
loss 0.22 accuracy 0.97:  66%|██████▌   | 1237/1875 [00:37<00:13, 46.15it/s]
loss 0.24 accuracy 0.91:  66%|██████▌   | 1237/1875 [00:37<00:13, 46.15it/s]
loss 0.24 accuracy 0.91:  66%|██████▌   | 1242/1875 [00:37<00:13, 46.13it/s]
loss 0.06 accuracy 0.97:  66%|██████▌   | 1242/1875 [00:37<00:13, 46.13it/s]
loss 0.07 accuracy 1.00:  66%|██████▌   | 1242/1875 [00:37<00:13, 46.13it/s]
loss 0.06 accuracy 1.00:  66%|██████▌   | 1242/1875 [00:37<00:13, 46.13it/s]
loss 0.09 accuracy 0.97:  66%|██████▌   | 1242/1875 [00:37<00:13, 46.13it/s]
loss 0.11 accuracy 0.97:  66%|██████▌   | 1242/1875 [00:37<00:13, 46.13it/s]
loss 0.11 accuracy 0.97:  67%|██████▋   | 1247/1875 [00:37<00:13, 46.09it/s]
loss 0.13 accuracy 0.94:  67%|██████▋   | 1247/1875 [00:37<00:13, 46.09it/s]
loss 0.02 accuracy 1.00:  67%|██████▋   | 1247/1875 [00:37<00:13, 46.09it/s]
loss 0.04 accuracy 1.00:  67%|██████▋   | 1247/1875 [00:37<00:13, 46.09it/s]
loss 0.14 accuracy 0.97:  67%|██████▋   | 1247/1875 [00:37<00:13, 46.09it/s]
loss 0.05 accuracy 1.00:  67%|██████▋   | 1247/1875 [00:37<00:13, 46.09it/s]
loss 0.05 accuracy 1.00:  67%|██████▋   | 1252/1875 [00:37<00:13, 46.06it/s]
loss 0.08 accuracy 0.97:  67%|██████▋   | 1252/1875 [00:37<00:13, 46.06it/s]
loss 0.05 accuracy 1.00:  67%|██████▋   | 1252/1875 [00:37<00:13, 46.06it/s]
loss 0.17 accuracy 0.91:  67%|██████▋   | 1252/1875 [00:37<00:13, 46.06it/s]
loss 0.06 accuracy 1.00:  67%|██████▋   | 1252/1875 [00:37<00:13, 46.06it/s]
loss 0.02 accuracy 1.00:  67%|██████▋   | 1252/1875 [00:37<00:13, 46.06it/s]
loss 0.02 accuracy 1.00:  67%|██████▋   | 1257/1875 [00:37<00:13, 46.05it/s]
loss 0.07 accuracy 1.00:  67%|██████▋   | 1257/1875 [00:37<00:13, 46.05it/s]
loss 0.04 accuracy 1.00:  67%|██████▋   | 1257/1875 [00:37<00:13, 46.05it/s]
loss 0.08 accuracy 0.97:  67%|██████▋   | 1257/1875 [00:37<00:13, 46.05it/s]
loss 0.06 accuracy 1.00:  67%|██████▋   | 1257/1875 [00:37<00:13, 46.05it/s]
loss 0.16 accuracy 0.97:  67%|██████▋   | 1257/1875 [00:37<00:13, 46.05it/s]
loss 0.16 accuracy 0.97:  67%|██████▋   | 1262/1875 [00:37<00:13, 46.01it/s]
loss 0.16 accuracy 0.94:  67%|██████▋   | 1262/1875 [00:37<00:13, 46.01it/s]
loss 0.02 accuracy 1.00:  67%|██████▋   | 1262/1875 [00:37<00:13, 46.01it/s]
loss 0.16 accuracy 0.94:  67%|██████▋   | 1262/1875 [00:37<00:13, 46.01it/s]
loss 0.05 accuracy 1.00:  67%|██████▋   | 1262/1875 [00:37<00:13, 46.01it/s]
loss 0.07 accuracy 0.97:  67%|██████▋   | 1262/1875 [00:37<00:13, 46.01it/s]
loss 0.07 accuracy 0.97:  68%|██████▊   | 1267/1875 [00:37<00:13, 45.86it/s]
loss 0.04 accuracy 1.00:  68%|██████▊   | 1267/1875 [00:37<00:13, 45.86it/s]
loss 0.24 accuracy 0.91:  68%|██████▊   | 1267/1875 [00:37<00:13, 45.86it/s]
loss 0.04 accuracy 1.00:  68%|██████▊   | 1267/1875 [00:37<00:13, 45.86it/s]
loss 0.04 accuracy 1.00:  68%|██████▊   | 1267/1875 [00:37<00:13, 45.86it/s]
loss 0.03 accuracy 1.00:  68%|██████▊   | 1267/1875 [00:37<00:13, 45.86it/s]
loss 0.03 accuracy 1.00:  68%|██████▊   | 1272/1875 [00:37<00:13, 45.84it/s]
loss 0.05 accuracy 1.00:  68%|██████▊   | 1272/1875 [00:37<00:13, 45.84it/s]
loss 0.12 accuracy 0.97:  68%|██████▊   | 1272/1875 [00:37<00:13, 45.84it/s]
loss 0.26 accuracy 0.91:  68%|██████▊   | 1272/1875 [00:37<00:13, 45.84it/s]
loss 0.26 accuracy 0.94:  68%|██████▊   | 1272/1875 [00:37<00:13, 45.84it/s]
loss 0.07 accuracy 1.00:  68%|██████▊   | 1272/1875 [00:37<00:13, 45.84it/s]
loss 0.07 accuracy 1.00:  68%|██████▊   | 1277/1875 [00:37<00:13, 45.79it/s]
loss 0.24 accuracy 0.94:  68%|██████▊   | 1277/1875 [00:37<00:13, 45.79it/s]
loss 0.04 accuracy 1.00:  68%|██████▊   | 1277/1875 [00:37<00:13, 45.79it/s]
loss 0.17 accuracy 0.97:  68%|██████▊   | 1277/1875 [00:37<00:13, 45.79it/s]
loss 0.12 accuracy 0.94:  68%|██████▊   | 1277/1875 [00:37<00:13, 45.79it/s]
loss 0.22 accuracy 0.97:  68%|██████▊   | 1277/1875 [00:37<00:13, 45.79it/s]
loss 0.22 accuracy 0.97:  68%|██████▊   | 1282/1875 [00:37<00:12, 45.72it/s]
loss 0.04 accuracy 1.00:  68%|██████▊   | 1282/1875 [00:37<00:12, 45.72it/s]
loss 0.18 accuracy 0.94:  68%|██████▊   | 1282/1875 [00:37<00:12, 45.72it/s]
loss 0.16 accuracy 0.94:  68%|██████▊   | 1282/1875 [00:37<00:12, 45.72it/s]
loss 0.13 accuracy 0.97:  68%|██████▊   | 1282/1875 [00:38<00:12, 45.72it/s]
loss 0.14 accuracy 0.94:  68%|██████▊   | 1282/1875 [00:38<00:12, 45.72it/s]
loss 0.14 accuracy 0.94:  69%|██████▊   | 1287/1875 [00:38<00:12, 45.77it/s]
loss 0.06 accuracy 1.00:  69%|██████▊   | 1287/1875 [00:38<00:12, 45.77it/s]
loss 0.15 accuracy 0.94:  69%|██████▊   | 1287/1875 [00:38<00:12, 45.77it/s]
loss 0.23 accuracy 0.94:  69%|██████▊   | 1287/1875 [00:38<00:12, 45.77it/s]
loss 0.04 accuracy 1.00:  69%|██████▊   | 1287/1875 [00:38<00:12, 45.77it/s]
loss 0.10 accuracy 0.97:  69%|██████▊   | 1287/1875 [00:38<00:12, 45.77it/s]
loss 0.10 accuracy 0.97:  69%|██████▉   | 1292/1875 [00:38<00:12, 45.69it/s]
loss 0.14 accuracy 0.94:  69%|██████▉   | 1292/1875 [00:38<00:12, 45.69it/s]
loss 0.32 accuracy 0.94:  69%|██████▉   | 1292/1875 [00:38<00:12, 45.69it/s]
loss 0.07 accuracy 0.97:  69%|██████▉   | 1292/1875 [00:38<00:12, 45.69it/s]
loss 0.22 accuracy 0.91:  69%|██████▉   | 1292/1875 [00:38<00:12, 45.69it/s]
loss 0.24 accuracy 0.94:  69%|██████▉   | 1292/1875 [00:38<00:12, 45.69it/s]
loss 0.24 accuracy 0.94:  69%|██████▉   | 1297/1875 [00:38<00:12, 45.74it/s]
loss 0.06 accuracy 1.00:  69%|██████▉   | 1297/1875 [00:38<00:12, 45.74it/s]
loss 0.13 accuracy 0.97:  69%|██████▉   | 1297/1875 [00:38<00:12, 45.74it/s]
loss 0.26 accuracy 0.94:  69%|██████▉   | 1297/1875 [00:38<00:12, 45.74it/s]
loss 0.20 accuracy 0.97:  69%|██████▉   | 1297/1875 [00:38<00:12, 45.74it/s]
loss 0.20 accuracy 0.97:  69%|██████▉   | 1297/1875 [00:38<00:12, 45.74it/s]
loss 0.20 accuracy 0.97:  69%|██████▉   | 1302/1875 [00:38<00:12, 45.82it/s]
loss 0.09 accuracy 0.97:  69%|██████▉   | 1302/1875 [00:38<00:12, 45.82it/s]
loss 0.09 accuracy 0.97:  69%|██████▉   | 1302/1875 [00:38<00:12, 45.82it/s]
loss 0.06 accuracy 0.97:  69%|██████▉   | 1302/1875 [00:38<00:12, 45.82it/s]
loss 0.18 accuracy 0.97:  69%|██████▉   | 1302/1875 [00:38<00:12, 45.82it/s]
loss 0.07 accuracy 1.00:  69%|██████▉   | 1302/1875 [00:38<00:12, 45.82it/s]
loss 0.07 accuracy 1.00:  70%|██████▉   | 1307/1875 [00:38<00:12, 45.90it/s]
loss 0.03 accuracy 1.00:  70%|██████▉   | 1307/1875 [00:38<00:12, 45.90it/s]
loss 0.22 accuracy 0.91:  70%|██████▉   | 1307/1875 [00:38<00:12, 45.90it/s]
loss 0.04 accuracy 1.00:  70%|██████▉   | 1307/1875 [00:38<00:12, 45.90it/s]
loss 0.19 accuracy 0.94:  70%|██████▉   | 1307/1875 [00:38<00:12, 45.90it/s]
loss 0.05 accuracy 1.00:  70%|██████▉   | 1307/1875 [00:38<00:12, 45.90it/s]
loss 0.05 accuracy 1.00:  70%|██████▉   | 1312/1875 [00:38<00:12, 45.98it/s]
loss 0.16 accuracy 0.97:  70%|██████▉   | 1312/1875 [00:38<00:12, 45.98it/s]
loss 0.30 accuracy 0.91:  70%|██████▉   | 1312/1875 [00:38<00:12, 45.98it/s]
loss 0.12 accuracy 0.97:  70%|██████▉   | 1312/1875 [00:38<00:12, 45.98it/s]
loss 0.21 accuracy 0.97:  70%|██████▉   | 1312/1875 [00:38<00:12, 45.98it/s]
loss 0.30 accuracy 0.94:  70%|██████▉   | 1312/1875 [00:38<00:12, 45.98it/s]
loss 0.30 accuracy 0.94:  70%|███████   | 1317/1875 [00:38<00:12, 46.01it/s]
loss 0.15 accuracy 0.94:  70%|███████   | 1317/1875 [00:38<00:12, 46.01it/s]
loss 0.06 accuracy 1.00:  70%|███████   | 1317/1875 [00:38<00:12, 46.01it/s]
loss 0.16 accuracy 0.97:  70%|███████   | 1317/1875 [00:38<00:12, 46.01it/s]
loss 0.02 accuracy 1.00:  70%|███████   | 1317/1875 [00:38<00:12, 46.01it/s]
loss 0.34 accuracy 0.88:  70%|███████   | 1317/1875 [00:38<00:12, 46.01it/s]
loss 0.34 accuracy 0.88:  71%|███████   | 1322/1875 [00:38<00:12, 46.06it/s]
loss 0.04 accuracy 1.00:  71%|███████   | 1322/1875 [00:38<00:12, 46.06it/s]
loss 0.10 accuracy 0.97:  71%|███████   | 1322/1875 [00:38<00:12, 46.06it/s]
loss 0.06 accuracy 1.00:  71%|███████   | 1322/1875 [00:38<00:12, 46.06it/s]
loss 0.04 accuracy 1.00:  71%|███████   | 1322/1875 [00:38<00:12, 46.06it/s]
loss 0.37 accuracy 0.94:  71%|███████   | 1322/1875 [00:38<00:12, 46.06it/s]
loss 0.37 accuracy 0.94:  71%|███████   | 1327/1875 [00:38<00:11, 46.07it/s]
loss 0.06 accuracy 1.00:  71%|███████   | 1327/1875 [00:38<00:11, 46.07it/s]
loss 0.05 accuracy 1.00:  71%|███████   | 1327/1875 [00:38<00:11, 46.07it/s]
loss 0.04 accuracy 1.00:  71%|███████   | 1327/1875 [00:38<00:11, 46.07it/s]
loss 0.07 accuracy 0.97:  71%|███████   | 1327/1875 [00:38<00:11, 46.07it/s]
loss 0.03 accuracy 1.00:  71%|███████   | 1327/1875 [00:39<00:11, 46.07it/s]
loss 0.03 accuracy 1.00:  71%|███████   | 1332/1875 [00:39<00:11, 46.07it/s]
loss 0.14 accuracy 0.94:  71%|███████   | 1332/1875 [00:39<00:11, 46.07it/s]
loss 0.13 accuracy 0.97:  71%|███████   | 1332/1875 [00:39<00:11, 46.07it/s]
loss 0.11 accuracy 0.97:  71%|███████   | 1332/1875 [00:39<00:11, 46.07it/s]
loss 0.16 accuracy 0.97:  71%|███████   | 1332/1875 [00:39<00:11, 46.07it/s]
loss 0.02 accuracy 1.00:  71%|███████   | 1332/1875 [00:39<00:11, 46.07it/s]
loss 0.02 accuracy 1.00:  71%|███████▏  | 1337/1875 [00:39<00:11, 46.09it/s]
loss 0.09 accuracy 0.97:  71%|███████▏  | 1337/1875 [00:39<00:11, 46.09it/s]
loss 0.10 accuracy 1.00:  71%|███████▏  | 1337/1875 [00:39<00:11, 46.09it/s]
loss 0.05 accuracy 1.00:  71%|███████▏  | 1337/1875 [00:39<00:11, 46.09it/s]
loss 0.18 accuracy 0.97:  71%|███████▏  | 1337/1875 [00:39<00:11, 46.09it/s]
loss 0.06 accuracy 1.00:  71%|███████▏  | 1337/1875 [00:39<00:11, 46.09it/s]
loss 0.06 accuracy 1.00:  72%|███████▏  | 1342/1875 [00:39<00:11, 46.05it/s]
loss 0.18 accuracy 0.94:  72%|███████▏  | 1342/1875 [00:39<00:11, 46.05it/s]
loss 0.17 accuracy 0.94:  72%|███████▏  | 1342/1875 [00:39<00:11, 46.05it/s]
loss 0.04 accuracy 1.00:  72%|███████▏  | 1342/1875 [00:39<00:11, 46.05it/s]
loss 0.10 accuracy 0.97:  72%|███████▏  | 1342/1875 [00:39<00:11, 46.05it/s]
loss 0.07 accuracy 0.97:  72%|███████▏  | 1342/1875 [00:39<00:11, 46.05it/s]
[... tqdm redraw lines omitted; per-batch loss ranges 0.01-0.63, accuracy 0.84-1.00, throughput steady at ~46 it/s ...]
loss 0.09 accuracy 0.97:  75%|███████▌  | 1407/1875 [00:40<00:10, 46.07it/s]
[... tqdm redraw lines omitted ...]
loss 0.11 accuracy 0.97:  80%|████████  | 1502/1875 [00:42<00:08, 45.93it/s]
[... tqdm redraw lines omitted ...]
loss 0.12 accuracy 0.97:  85%|████████▍ | 1587/1875 [00:44<00:06, 46.05it/s]
[... tqdm redraw lines omitted ...]
loss 0.18 accuracy 0.94:  90%|█████████ | 1692/1875 [00:46<00:03, 45.80it/s]
[... tqdm redraw lines omitted ...]
loss 0.03 accuracy 1.00:  92%|█████████▏| 1732/1875 [00:47<00:03, 46.00it/s]
loss 0.10 accuracy 0.97:  92%|█████████▏| 1732/1875 [00:47<00:03, 46.00it/s]
loss 0.10 accuracy 0.97:  93%|█████████▎| 1737/1875 [00:47<00:02, 46.05it/s]
loss 0.06 accuracy 0.97:  93%|█████████▎| 1737/1875 [00:47<00:02, 46.05it/s]
loss 0.25 accuracy 0.91:  93%|█████████▎| 1737/1875 [00:47<00:02, 46.05it/s]
loss 0.12 accuracy 0.97:  93%|█████████▎| 1737/1875 [00:47<00:02, 46.05it/s]
loss 0.06 accuracy 1.00:  93%|█████████▎| 1737/1875 [00:47<00:02, 46.05it/s]
loss 0.04 accuracy 1.00:  93%|█████████▎| 1737/1875 [00:47<00:02, 46.05it/s]
loss 0.04 accuracy 1.00:  93%|█████████▎| 1742/1875 [00:47<00:02, 46.04it/s]
loss 0.21 accuracy 0.97:  93%|█████████▎| 1742/1875 [00:47<00:02, 46.04it/s]
loss 0.03 accuracy 1.00:  93%|█████████▎| 1742/1875 [00:47<00:02, 46.04it/s]
loss 0.17 accuracy 0.94:  93%|█████████▎| 1742/1875 [00:48<00:02, 46.04it/s]
loss 0.03 accuracy 1.00:  93%|█████████▎| 1742/1875 [00:48<00:02, 46.04it/s]
loss 0.07 accuracy 1.00:  93%|█████████▎| 1742/1875 [00:48<00:02, 46.04it/s]
loss 0.07 accuracy 1.00:  93%|█████████▎| 1747/1875 [00:48<00:02, 46.04it/s]
loss 0.27 accuracy 0.91:  93%|█████████▎| 1747/1875 [00:48<00:02, 46.04it/s]
loss 0.02 accuracy 1.00:  93%|█████████▎| 1747/1875 [00:48<00:02, 46.04it/s]
loss 0.25 accuracy 0.97:  93%|█████████▎| 1747/1875 [00:48<00:02, 46.04it/s]
loss 0.06 accuracy 1.00:  93%|█████████▎| 1747/1875 [00:48<00:02, 46.04it/s]
loss 0.04 accuracy 1.00:  93%|█████████▎| 1747/1875 [00:48<00:02, 46.04it/s]
loss 0.04 accuracy 1.00:  93%|█████████▎| 1752/1875 [00:48<00:02, 46.02it/s]
loss 0.33 accuracy 0.91:  93%|█████████▎| 1752/1875 [00:48<00:02, 46.02it/s]
loss 0.10 accuracy 0.94:  93%|█████████▎| 1752/1875 [00:48<00:02, 46.02it/s]
loss 0.03 accuracy 1.00:  93%|█████████▎| 1752/1875 [00:48<00:02, 46.02it/s]
loss 0.02 accuracy 1.00:  93%|█████████▎| 1752/1875 [00:48<00:02, 46.02it/s]
loss 0.12 accuracy 0.94:  93%|█████████▎| 1752/1875 [00:48<00:02, 46.02it/s]
loss 0.12 accuracy 0.94:  94%|█████████▎| 1757/1875 [00:48<00:02, 45.99it/s]
loss 0.03 accuracy 1.00:  94%|█████████▎| 1757/1875 [00:48<00:02, 45.99it/s]
loss 0.12 accuracy 0.97:  94%|█████████▎| 1757/1875 [00:48<00:02, 45.99it/s]
loss 0.03 accuracy 1.00:  94%|█████████▎| 1757/1875 [00:48<00:02, 45.99it/s]
loss 0.14 accuracy 0.94:  94%|█████████▎| 1757/1875 [00:48<00:02, 45.99it/s]
loss 0.12 accuracy 0.94:  94%|█████████▎| 1757/1875 [00:48<00:02, 45.99it/s]
loss 0.12 accuracy 0.94:  94%|█████████▍| 1762/1875 [00:48<00:02, 45.94it/s]
loss 0.11 accuracy 0.94:  94%|█████████▍| 1762/1875 [00:48<00:02, 45.94it/s]
loss 0.02 accuracy 1.00:  94%|█████████▍| 1762/1875 [00:48<00:02, 45.94it/s]
loss 0.04 accuracy 1.00:  94%|█████████▍| 1762/1875 [00:48<00:02, 45.94it/s]
loss 0.10 accuracy 0.94:  94%|█████████▍| 1762/1875 [00:48<00:02, 45.94it/s]
loss 0.25 accuracy 0.94:  94%|█████████▍| 1762/1875 [00:48<00:02, 45.94it/s]
loss 0.25 accuracy 0.94:  94%|█████████▍| 1767/1875 [00:48<00:02, 45.80it/s]
loss 0.17 accuracy 0.97:  94%|█████████▍| 1767/1875 [00:48<00:02, 45.80it/s]
loss 0.11 accuracy 0.97:  94%|█████████▍| 1767/1875 [00:48<00:02, 45.80it/s]
loss 0.13 accuracy 0.94:  94%|█████████▍| 1767/1875 [00:48<00:02, 45.80it/s]
loss 0.24 accuracy 0.91:  94%|█████████▍| 1767/1875 [00:48<00:02, 45.80it/s]
loss 0.04 accuracy 1.00:  94%|█████████▍| 1767/1875 [00:48<00:02, 45.80it/s]
loss 0.04 accuracy 1.00:  95%|█████████▍| 1772/1875 [00:48<00:02, 45.81it/s]
loss 0.24 accuracy 0.97:  95%|█████████▍| 1772/1875 [00:48<00:02, 45.81it/s]
loss 0.06 accuracy 1.00:  95%|█████████▍| 1772/1875 [00:48<00:02, 45.81it/s]
loss 0.02 accuracy 1.00:  95%|█████████▍| 1772/1875 [00:48<00:02, 45.81it/s]
loss 0.05 accuracy 1.00:  95%|█████████▍| 1772/1875 [00:48<00:02, 45.81it/s]
loss 0.02 accuracy 1.00:  95%|█████████▍| 1772/1875 [00:48<00:02, 45.81it/s]
loss 0.02 accuracy 1.00:  95%|█████████▍| 1777/1875 [00:48<00:02, 45.67it/s]
loss 0.04 accuracy 1.00:  95%|█████████▍| 1777/1875 [00:48<00:02, 45.67it/s]
loss 0.15 accuracy 0.97:  95%|█████████▍| 1777/1875 [00:48<00:02, 45.67it/s]
loss 0.06 accuracy 0.97:  95%|█████████▍| 1777/1875 [00:48<00:02, 45.67it/s]
loss 0.15 accuracy 0.94:  95%|█████████▍| 1777/1875 [00:48<00:02, 45.67it/s]
loss 0.08 accuracy 0.97:  95%|█████████▍| 1777/1875 [00:48<00:02, 45.67it/s]
loss 0.08 accuracy 0.97:  95%|█████████▌| 1782/1875 [00:48<00:02, 45.69it/s]
loss 0.08 accuracy 0.97:  95%|█████████▌| 1782/1875 [00:48<00:02, 45.69it/s]
loss 0.04 accuracy 1.00:  95%|█████████▌| 1782/1875 [00:48<00:02, 45.69it/s]
loss 0.09 accuracy 0.97:  95%|█████████▌| 1782/1875 [00:48<00:02, 45.69it/s]
loss 0.03 accuracy 1.00:  95%|█████████▌| 1782/1875 [00:48<00:02, 45.69it/s]
loss 0.03 accuracy 1.00:  95%|█████████▌| 1782/1875 [00:48<00:02, 45.69it/s]
loss 0.03 accuracy 1.00:  95%|█████████▌| 1787/1875 [00:48<00:01, 45.67it/s]
loss 0.04 accuracy 1.00:  95%|█████████▌| 1787/1875 [00:48<00:01, 45.67it/s]
loss 0.01 accuracy 1.00:  95%|█████████▌| 1787/1875 [00:48<00:01, 45.67it/s]
loss 0.03 accuracy 1.00:  95%|█████████▌| 1787/1875 [00:48<00:01, 45.67it/s]
loss 0.21 accuracy 0.97:  95%|█████████▌| 1787/1875 [00:49<00:01, 45.67it/s]
loss 0.09 accuracy 0.94:  95%|█████████▌| 1787/1875 [00:49<00:01, 45.67it/s]
loss 0.09 accuracy 0.94:  96%|█████████▌| 1792/1875 [00:49<00:01, 45.72it/s]
loss 0.20 accuracy 0.97:  96%|█████████▌| 1792/1875 [00:49<00:01, 45.72it/s]
loss 0.04 accuracy 1.00:  96%|█████████▌| 1792/1875 [00:49<00:01, 45.72it/s]
loss 0.02 accuracy 1.00:  96%|█████████▌| 1792/1875 [00:49<00:01, 45.72it/s]
loss 0.03 accuracy 1.00:  96%|█████████▌| 1792/1875 [00:49<00:01, 45.72it/s]
loss 0.02 accuracy 1.00:  96%|█████████▌| 1792/1875 [00:49<00:01, 45.72it/s]
loss 0.02 accuracy 1.00:  96%|█████████▌| 1797/1875 [00:49<00:01, 45.77it/s]
loss 0.07 accuracy 0.97:  96%|█████████▌| 1797/1875 [00:49<00:01, 45.77it/s]
loss 0.08 accuracy 0.97:  96%|█████████▌| 1797/1875 [00:49<00:01, 45.77it/s]
loss 0.09 accuracy 0.97:  96%|█████████▌| 1797/1875 [00:49<00:01, 45.77it/s]
loss 0.04 accuracy 1.00:  96%|█████████▌| 1797/1875 [00:49<00:01, 45.77it/s]
loss 0.06 accuracy 0.97:  96%|█████████▌| 1797/1875 [00:49<00:01, 45.77it/s]
loss 0.06 accuracy 0.97:  96%|█████████▌| 1802/1875 [00:49<00:01, 45.86it/s]
loss 0.12 accuracy 0.97:  96%|█████████▌| 1802/1875 [00:49<00:01, 45.86it/s]
loss 0.05 accuracy 0.97:  96%|█████████▌| 1802/1875 [00:49<00:01, 45.86it/s]
loss 0.31 accuracy 0.94:  96%|█████████▌| 1802/1875 [00:49<00:01, 45.86it/s]
loss 0.16 accuracy 0.94:  96%|█████████▌| 1802/1875 [00:49<00:01, 45.86it/s]
loss 0.06 accuracy 0.97:  96%|█████████▌| 1802/1875 [00:49<00:01, 45.86it/s]
loss 0.06 accuracy 0.97:  96%|█████████▋| 1807/1875 [00:49<00:01, 45.96it/s]
loss 0.09 accuracy 0.94:  96%|█████████▋| 1807/1875 [00:49<00:01, 45.96it/s]
loss 0.17 accuracy 0.97:  96%|█████████▋| 1807/1875 [00:49<00:01, 45.96it/s]
loss 0.08 accuracy 0.97:  96%|█████████▋| 1807/1875 [00:49<00:01, 45.96it/s]
loss 0.04 accuracy 1.00:  96%|█████████▋| 1807/1875 [00:49<00:01, 45.96it/s]
loss 0.02 accuracy 1.00:  96%|█████████▋| 1807/1875 [00:49<00:01, 45.96it/s]
loss 0.02 accuracy 1.00:  97%|█████████▋| 1812/1875 [00:49<00:01, 45.99it/s]
loss 0.02 accuracy 1.00:  97%|█████████▋| 1812/1875 [00:49<00:01, 45.99it/s]
loss 0.02 accuracy 1.00:  97%|█████████▋| 1812/1875 [00:49<00:01, 45.99it/s]
loss 0.08 accuracy 1.00:  97%|█████████▋| 1812/1875 [00:49<00:01, 45.99it/s]
loss 0.16 accuracy 0.94:  97%|█████████▋| 1812/1875 [00:49<00:01, 45.99it/s]
loss 0.11 accuracy 0.94:  97%|█████████▋| 1812/1875 [00:49<00:01, 45.99it/s]
loss 0.11 accuracy 0.94:  97%|█████████▋| 1817/1875 [00:49<00:01, 46.06it/s]
loss 0.03 accuracy 1.00:  97%|█████████▋| 1817/1875 [00:49<00:01, 46.06it/s]
loss 0.01 accuracy 1.00:  97%|█████████▋| 1817/1875 [00:49<00:01, 46.06it/s]
loss 0.31 accuracy 0.91:  97%|█████████▋| 1817/1875 [00:49<00:01, 46.06it/s]
loss 0.02 accuracy 1.00:  97%|█████████▋| 1817/1875 [00:49<00:01, 46.06it/s]
loss 0.16 accuracy 0.97:  97%|█████████▋| 1817/1875 [00:49<00:01, 46.06it/s]
loss 0.16 accuracy 0.97:  97%|█████████▋| 1822/1875 [00:49<00:01, 46.10it/s]
loss 0.07 accuracy 0.97:  97%|█████████▋| 1822/1875 [00:49<00:01, 46.10it/s]
loss 0.03 accuracy 1.00:  97%|█████████▋| 1822/1875 [00:49<00:01, 46.10it/s]
loss 0.02 accuracy 1.00:  97%|█████████▋| 1822/1875 [00:49<00:01, 46.10it/s]
loss 0.06 accuracy 0.97:  97%|█████████▋| 1822/1875 [00:49<00:01, 46.10it/s]
loss 0.02 accuracy 1.00:  97%|█████████▋| 1822/1875 [00:49<00:01, 46.10it/s]
loss 0.02 accuracy 1.00:  97%|█████████▋| 1827/1875 [00:49<00:01, 46.13it/s]
loss 0.04 accuracy 1.00:  97%|█████████▋| 1827/1875 [00:49<00:01, 46.13it/s]
loss 0.07 accuracy 0.97:  97%|█████████▋| 1827/1875 [00:49<00:01, 46.13it/s]
loss 0.05 accuracy 0.97:  97%|█████████▋| 1827/1875 [00:49<00:01, 46.13it/s]
loss 0.28 accuracy 0.94:  97%|█████████▋| 1827/1875 [00:49<00:01, 46.13it/s]
loss 0.09 accuracy 0.97:  97%|█████████▋| 1827/1875 [00:49<00:01, 46.13it/s]
loss 0.09 accuracy 0.97:  98%|█████████▊| 1832/1875 [00:49<00:00, 46.11it/s]
loss 0.04 accuracy 1.00:  98%|█████████▊| 1832/1875 [00:49<00:00, 46.11it/s]
loss 0.05 accuracy 1.00:  98%|█████████▊| 1832/1875 [00:49<00:00, 46.11it/s]
loss 0.02 accuracy 1.00:  98%|█████████▊| 1832/1875 [00:49<00:00, 46.11it/s]
loss 0.03 accuracy 1.00:  98%|█████████▊| 1832/1875 [00:49<00:00, 46.11it/s]
loss 0.20 accuracy 0.94:  98%|█████████▊| 1832/1875 [00:50<00:00, 46.11it/s]
loss 0.20 accuracy 0.94:  98%|█████████▊| 1837/1875 [00:50<00:00, 46.08it/s]
loss 0.23 accuracy 0.97:  98%|█████████▊| 1837/1875 [00:50<00:00, 46.08it/s]
loss 0.04 accuracy 1.00:  98%|█████████▊| 1837/1875 [00:50<00:00, 46.08it/s]
loss 0.07 accuracy 0.97:  98%|█████████▊| 1837/1875 [00:50<00:00, 46.08it/s]
loss 0.02 accuracy 1.00:  98%|█████████▊| 1837/1875 [00:50<00:00, 46.08it/s]
loss 0.15 accuracy 0.97:  98%|█████████▊| 1837/1875 [00:50<00:00, 46.08it/s]
loss 0.15 accuracy 0.97:  98%|█████████▊| 1842/1875 [00:50<00:00, 46.04it/s]
loss 0.08 accuracy 0.97:  98%|█████████▊| 1842/1875 [00:50<00:00, 46.04it/s]
loss 0.05 accuracy 0.97:  98%|█████████▊| 1842/1875 [00:50<00:00, 46.04it/s]
loss 0.07 accuracy 0.97:  98%|█████████▊| 1842/1875 [00:50<00:00, 46.04it/s]
loss 0.16 accuracy 0.94:  98%|█████████▊| 1842/1875 [00:50<00:00, 46.04it/s]
loss 0.09 accuracy 0.97:  98%|█████████▊| 1842/1875 [00:50<00:00, 46.04it/s]
loss 0.09 accuracy 0.97:  99%|█████████▊| 1847/1875 [00:50<00:00, 45.98it/s]
loss 0.06 accuracy 0.97:  99%|█████████▊| 1847/1875 [00:50<00:00, 45.98it/s]
loss 0.17 accuracy 0.94:  99%|█████████▊| 1847/1875 [00:50<00:00, 45.98it/s]
loss 0.15 accuracy 0.94:  99%|█████████▊| 1847/1875 [00:50<00:00, 45.98it/s]
loss 0.09 accuracy 0.94:  99%|█████████▊| 1847/1875 [00:50<00:00, 45.98it/s]
loss 0.03 accuracy 1.00:  99%|█████████▊| 1847/1875 [00:50<00:00, 45.98it/s]
loss 0.03 accuracy 1.00:  99%|█████████▉| 1852/1875 [00:50<00:00, 45.84it/s]
loss 0.06 accuracy 0.97:  99%|█████████▉| 1852/1875 [00:50<00:00, 45.84it/s]
loss 0.04 accuracy 1.00:  99%|█████████▉| 1852/1875 [00:50<00:00, 45.84it/s]
loss 0.02 accuracy 1.00:  99%|█████████▉| 1852/1875 [00:50<00:00, 45.84it/s]
loss 0.17 accuracy 0.94:  99%|█████████▉| 1852/1875 [00:50<00:00, 45.84it/s]
loss 0.15 accuracy 0.97:  99%|█████████▉| 1852/1875 [00:50<00:00, 45.84it/s]
loss 0.15 accuracy 0.97:  99%|█████████▉| 1857/1875 [00:50<00:00, 45.81it/s]
loss 0.12 accuracy 0.97:  99%|█████████▉| 1857/1875 [00:50<00:00, 45.81it/s]
loss 0.13 accuracy 0.94:  99%|█████████▉| 1857/1875 [00:50<00:00, 45.81it/s]
loss 0.09 accuracy 0.97:  99%|█████████▉| 1857/1875 [00:50<00:00, 45.81it/s]
loss 0.10 accuracy 0.94:  99%|█████████▉| 1857/1875 [00:50<00:00, 45.81it/s]
loss 0.02 accuracy 1.00:  99%|█████████▉| 1857/1875 [00:50<00:00, 45.81it/s]
loss 0.02 accuracy 1.00:  99%|█████████▉| 1862/1875 [00:50<00:00, 45.76it/s]
loss 0.05 accuracy 1.00:  99%|█████████▉| 1862/1875 [00:50<00:00, 45.76it/s]
loss 0.03 accuracy 1.00:  99%|█████████▉| 1862/1875 [00:50<00:00, 45.76it/s]
loss 0.06 accuracy 1.00:  99%|█████████▉| 1862/1875 [00:50<00:00, 45.76it/s]
loss 0.08 accuracy 0.97:  99%|█████████▉| 1862/1875 [00:50<00:00, 45.76it/s]
loss 0.02 accuracy 1.00:  99%|█████████▉| 1862/1875 [00:50<00:00, 45.76it/s]
loss 0.02 accuracy 1.00: 100%|█████████▉| 1867/1875 [00:50<00:00, 45.69it/s]
loss 0.06 accuracy 1.00: 100%|█████████▉| 1867/1875 [00:50<00:00, 45.69it/s]
loss 0.13 accuracy 0.94: 100%|█████████▉| 1867/1875 [00:50<00:00, 45.69it/s]
loss 0.02 accuracy 1.00: 100%|█████████▉| 1867/1875 [00:50<00:00, 45.69it/s]
loss 0.07 accuracy 0.97: 100%|█████████▉| 1867/1875 [00:50<00:00, 45.69it/s]
loss 0.03 accuracy 1.00: 100%|█████████▉| 1867/1875 [00:50<00:00, 45.69it/s]
loss 0.03 accuracy 1.00: 100%|█████████▉| 1872/1875 [00:50<00:00, 45.74it/s]
loss 0.24 accuracy 0.94: 100%|█████████▉| 1872/1875 [00:50<00:00, 45.74it/s]
loss 0.03 accuracy 1.00: 100%|█████████▉| 1872/1875 [00:50<00:00, 45.74it/s]
loss 0.05 accuracy 1.00: 100%|█████████▉| 1872/1875 [00:50<00:00, 45.74it/s]
loss 0.05 accuracy 1.00: 100%|██████████| 1875/1875 [00:50<00:00, 36.88it/s]

  0%|          | 0/313 [00:00<?, ?it/s]
  0%|          | 1/313 [00:00<03:12,  1.62it/s]
  3%|▎         | 9/313 [00:00<00:18, 16.01it/s]
  5%|▌         | 17/313 [00:00<00:10, 28.79it/s]
  8%|▊         | 25/313 [00:00<00:07, 39.58it/s]
 10%|█         | 32/313 [00:01<00:06, 43.70it/s]
 13%|█▎        | 40/313 [00:01<00:05, 51.35it/s]
 15%|█▌        | 48/313 [00:01<00:04, 57.25it/s]
 18%|█▊        | 56/313 [00:01<00:04, 61.63it/s]
 20%|██        | 64/313 [00:01<00:03, 64.95it/s]
 23%|██▎       | 72/313 [00:01<00:03, 62.33it/s]
 26%|██▌       | 80/313 [00:01<00:03, 65.47it/s]
 28%|██▊       | 88/313 [00:01<00:03, 67.75it/s]
 31%|███       | 96/313 [00:01<00:03, 69.27it/s]
 33%|███▎      | 104/313 [00:02<00:03, 65.15it/s]
 36%|███▌      | 112/313 [00:02<00:02, 67.43it/s]
 38%|███▊      | 120/313 [00:02<00:02, 69.06it/s]
 41%|████      | 128/313 [00:02<00:02, 70.22it/s]
 43%|████▎     | 136/313 [00:02<00:02, 65.75it/s]
 46%|████▌     | 144/313 [00:02<00:02, 67.79it/s]
 49%|████▊     | 152/313 [00:02<00:02, 69.14it/s]
 51%|█████     | 160/313 [00:02<00:02, 70.12it/s]
 54%|█████▎    | 168/313 [00:03<00:02, 65.53it/s]
 56%|█████▌    | 176/313 [00:03<00:02, 67.52it/s]
 59%|█████▉    | 184/313 [00:03<00:01, 69.06it/s]
 61%|██████▏   | 192/313 [00:03<00:01, 70.09it/s]
 64%|██████▍   | 200/313 [00:03<00:01, 65.59it/s]
 66%|██████▋   | 208/313 [00:03<00:01, 67.65it/s]
 69%|██████▉   | 216/313 [00:03<00:01, 69.11it/s]
 72%|███████▏  | 224/313 [00:03<00:01, 70.37it/s]
 74%|███████▍  | 232/313 [00:03<00:01, 65.93it/s]
 77%|███████▋  | 240/313 [00:04<00:01, 67.85it/s]
 79%|███████▉  | 248/313 [00:04<00:00, 69.22it/s]
 82%|████████▏ | 256/313 [00:04<00:00, 70.28it/s]
 84%|████████▍ | 264/313 [00:04<00:00, 65.71it/s]
 87%|████████▋ | 272/313 [00:04<00:00, 67.59it/s]
 89%|████████▉ | 280/313 [00:04<00:00, 69.10it/s]
 92%|█████████▏| 288/313 [00:04<00:00, 70.17it/s]
 95%|█████████▍| 296/313 [00:04<00:00, 65.61it/s]
 97%|█████████▋| 304/313 [00:05<00:00, 67.60it/s]
100%|█████████▉| 312/313 [00:05<00:00, 69.04it/s]
100%|██████████| 313/313 [00:06<00:00, 49.51it/s]
test set accuracy is 0.973000
Traceback (most recent call last):
  File "/home/jebba/devel/tinygrad/tinygrad/examples/serious_mnist.py", line 136, in <module>
    model.save(f'examples/checkpoint{accuracy * 1e6:.0f}')
  File "/home/jebba/devel/tinygrad/tinygrad/examples/serious_mnist.py", line 72, in save
    with open(filename+'.npy', 'wb') as f:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: 'examples/checkpoint973000.npy'
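The run itself succeeded (97.3% test accuracy); only the checkpoint save crashes, because `examples/` does not exist relative to the working directory and `open(..., 'wb')` never creates parent directories. A minimal guard for that failure mode (a sketch, not the script's actual `save` method):

```python
import os

def save_checkpoint(filename: str, data: bytes) -> str:
    """Write checkpoint bytes, creating the parent directory if needed."""
    dirname = os.path.dirname(filename)
    if dirname:
        # open() raises FileNotFoundError if this directory is missing,
        # which is exactly the traceback above.
        os.makedirs(dirname, exist_ok=True)
    out = filename + '.npy'
    with open(out, 'wb') as f:
        f.write(data)
    return out
```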

simple_conv_bn.py

running network

so_vits_svc.py

Traceback (most recent call last):
  File "/home/jebba/devel/tinygrad/tinygrad/examples/so_vits_svc.py", line 10, in <module>
    from examples.vits import ResidualCouplingBlock, PosteriorEncoder, Encoder, ResBlock1, ResBlock2, LRELU_SLOPE, sequence_mask, split, download_if_not_present, get_hparams_from_file, load_checkpoint, weight_norm, HParams
ImportError: cannot import name 'download_if_not_present' from 'examples.vits' (/home/jebba/devel/tinygrad/tinygrad/examples/vits.py)
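`so_vits_svc.py` fails at import time because `examples/vits.py` no longer exports `download_if_not_present` (evidently renamed or removed upstream). The name suggests a fetch-once helper along these lines (an assumption about the intended behavior, not the removed code):

```python
import os
import urllib.request

def download_if_not_present(file_path: str, url: str) -> str:
    """Fetch url into file_path only when the file is missing."""
    if not os.path.exists(file_path):
        os.makedirs(os.path.dirname(file_path) or ".", exist_ok=True)
        urllib.request.urlretrieve(url, file_path)
    return file_path
```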

stable_diffusion.py

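Each progress line below pairs the process's resident memory with the name of the tensor being loaded. A stdlib-only sketch of that reporting pattern (hypothetical `assign_fn`; tinygrad's actual loader differs):

```python
import resource

def ram_gb() -> float:
    # ru_maxrss is the peak resident set size: kilobytes on Linux
    # (bytes on macOS), so treat this as approximate and platform-dependent.
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / (1024 ** 2)

def load_weights(state_dict, assign_fn):
    # Report memory next to each tensor name, like the lines below.
    for name, tensor in state_dict.items():
        assign_fn(name, tensor)
        print(f"ram used: {ram_gb():5.2f} GB, {name:<50}")
```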
  0%|          | 0/1131 [00:00<?, ?it/s]
ram used:  0.00 GB, alphas_cumprod                                    :   0%|          | 0/1131 [00:00<?, ?it/s]
ram used:  0.00 GB, alphas_cumprod                                    :   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.00 GB, model.diffusion_model.time_embed.0.weight         :   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.00 GB, model.diffusion_model.time_embed.0.bias           :   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.00 GB, model.diffusion_model.time_embed.2.weight         :   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.01 GB, model.diffusion_model.time_embed.2.bias           :   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.01 GB, model.diffusion_model.input_blocks.0.0.weight     :   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.01 GB, model.diffusion_model.input_blocks.0.0.bias       :   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.01 GB, model.diffusion_model.input_blocks.1.0.in_layers.0.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.01 GB, model.diffusion_model.input_blocks.1.0.in_layers.0.bias:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]  
ram used:  0.01 GB, model.diffusion_model.input_blocks.1.0.in_layers.2.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.01 GB, model.diffusion_model.input_blocks.1.0.in_layers.2.bias:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]  
ram used:  0.01 GB, model.diffusion_model.input_blocks.1.0.emb_layers.1.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.01 GB, model.diffusion_model.input_blocks.1.0.emb_layers.1.bias:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]  
ram used:  0.01 GB, model.diffusion_model.input_blocks.1.0.out_layers.0.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.01 GB, model.diffusion_model.input_blocks.1.0.out_layers.0.bias:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]  
ram used:  0.01 GB, model.diffusion_model.input_blocks.1.0.out_layers.3.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.02 GB, model.diffusion_model.input_blocks.1.0.out_layers.3.bias:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]  
ram used:  0.02 GB, model.diffusion_model.input_blocks.1.1.norm.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]      
ram used:  0.02 GB, model.diffusion_model.input_blocks.1.1.norm.bias  :   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.02 GB, model.diffusion_model.input_blocks.1.1.proj_in.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.02 GB, model.diffusion_model.input_blocks.1.1.proj_in.bias:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]  
ram used:  0.02 GB, model.diffusion_model.input_blocks.1.1.transformer_blocks.0.attn1.to_q.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.02 GB, model.diffusion_model.input_blocks.1.1.transformer_blocks.0.attn1.to_k.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.02 GB, model.diffusion_model.input_blocks.1.1.transformer_blocks.0.attn1.to_v.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.02 GB, model.diffusion_model.input_blocks.1.1.transformer_blocks.0.attn1.to_out.0.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.02 GB, model.diffusion_model.input_blocks.1.1.transformer_blocks.0.attn1.to_out.0.bias:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]  
ram used:  0.02 GB, model.diffusion_model.input_blocks.1.1.transformer_blocks.0.ff.net.0.proj.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.02 GB, model.diffusion_model.input_blocks.1.1.transformer_blocks.0.ff.net.0.proj.bias:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]  
ram used:  0.02 GB, model.diffusion_model.input_blocks.1.1.transformer_blocks.0.ff.net.2.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]   
ram used:  0.02 GB, model.diffusion_model.input_blocks.1.1.transformer_blocks.0.ff.net.2.bias:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]  
ram used:  0.02 GB, model.diffusion_model.input_blocks.1.1.transformer_blocks.0.attn2.to_q.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.02 GB, model.diffusion_model.input_blocks.1.1.transformer_blocks.0.attn2.to_k.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.03 GB, model.diffusion_model.input_blocks.1.1.transformer_blocks.0.attn2.to_v.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.03 GB, model.diffusion_model.input_blocks.1.1.transformer_blocks.0.attn2.to_out.0.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.03 GB, model.diffusion_model.input_blocks.1.1.transformer_blocks.0.attn2.to_out.0.bias:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]  
ram used:  0.03 GB, model.diffusion_model.input_blocks.1.1.transformer_blocks.0.norm1.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]       
ram used:  0.03 GB, model.diffusion_model.input_blocks.1.1.transformer_blocks.0.norm1.bias:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]  
ram used:  0.03 GB, model.diffusion_model.input_blocks.1.1.transformer_blocks.0.norm2.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.03 GB, model.diffusion_model.input_blocks.1.1.transformer_blocks.0.norm2.bias:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]  
ram used:  0.03 GB, model.diffusion_model.input_blocks.1.1.transformer_blocks.0.norm3.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.03 GB, model.diffusion_model.input_blocks.1.1.transformer_blocks.0.norm3.bias:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]  
ram used:  0.03 GB, model.diffusion_model.input_blocks.1.1.proj_out.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]                
ram used:  0.03 GB, model.diffusion_model.input_blocks.1.1.proj_out.bias:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]  
ram used:  0.03 GB, model.diffusion_model.input_blocks.2.0.in_layers.0.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.03 GB, model.diffusion_model.input_blocks.2.0.in_layers.0.bias:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]  
ram used:  0.03 GB, model.diffusion_model.input_blocks.2.0.in_layers.2.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.03 GB, model.diffusion_model.input_blocks.2.0.in_layers.2.bias:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]  
ram used:  0.03 GB, model.diffusion_model.input_blocks.2.0.emb_layers.1.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.03 GB, model.diffusion_model.input_blocks.2.0.emb_layers.1.bias:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]  
ram used:  0.03 GB, model.diffusion_model.input_blocks.2.0.out_layers.0.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.03 GB, model.diffusion_model.input_blocks.2.0.out_layers.0.bias:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]  
ram used:  0.03 GB, model.diffusion_model.input_blocks.2.0.out_layers.3.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.04 GB, model.diffusion_model.input_blocks.2.0.out_layers.3.bias:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]  
ram used:  0.04 GB, model.diffusion_model.input_blocks.2.1.norm.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]      
ram used:  0.04 GB, model.diffusion_model.input_blocks.2.1.norm.bias  :   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.04 GB, model.diffusion_model.input_blocks.2.1.proj_in.weight:   0%|          | 1/1131 [00:00<03:58,  4.73it/s]
ram used:  0.04 GB, model.diffusion_model.input_blocks.2.1.proj_in.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.04 GB, model.diffusion_model.input_blocks.2.1.proj_in.bias:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]  
ram used:  0.04 GB, model.diffusion_model.input_blocks.2.1.transformer_blocks.0.attn1.to_q.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.04 GB, model.diffusion_model.input_blocks.2.1.transformer_blocks.0.attn1.to_k.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.04 GB, model.diffusion_model.input_blocks.2.1.transformer_blocks.0.attn1.to_v.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.04 GB, model.diffusion_model.input_blocks.2.1.transformer_blocks.0.attn1.to_out.0.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.04 GB, model.diffusion_model.input_blocks.2.1.transformer_blocks.0.attn1.to_out.0.bias:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]  
ram used:  0.04 GB, model.diffusion_model.input_blocks.2.1.transformer_blocks.0.ff.net.0.proj.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.04 GB, model.diffusion_model.input_blocks.2.1.transformer_blocks.0.ff.net.0.proj.bias:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]  
ram used:  0.04 GB, model.diffusion_model.input_blocks.2.1.transformer_blocks.0.ff.net.2.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]   
ram used:  0.04 GB, model.diffusion_model.input_blocks.2.1.transformer_blocks.0.ff.net.2.bias:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]  
ram used:  0.04 GB, model.diffusion_model.input_blocks.2.1.transformer_blocks.0.attn2.to_q.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.04 GB, model.diffusion_model.input_blocks.2.1.transformer_blocks.0.attn2.to_k.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.04 GB, model.diffusion_model.input_blocks.2.1.transformer_blocks.0.attn2.to_v.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.05 GB, model.diffusion_model.input_blocks.2.1.transformer_blocks.0.attn2.to_out.0.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.05 GB, model.diffusion_model.input_blocks.2.1.transformer_blocks.0.attn2.to_out.0.bias:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]  
ram used:  0.05 GB, model.diffusion_model.input_blocks.2.1.transformer_blocks.0.norm1.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]       
ram used:  0.05 GB, model.diffusion_model.input_blocks.2.1.transformer_blocks.0.norm1.bias:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]  
ram used:  0.05 GB, model.diffusion_model.input_blocks.2.1.transformer_blocks.0.norm2.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.05 GB, model.diffusion_model.input_blocks.2.1.transformer_blocks.0.norm2.bias:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]  
ram used:  0.05 GB, model.diffusion_model.input_blocks.2.1.transformer_blocks.0.norm3.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.05 GB, model.diffusion_model.input_blocks.2.1.transformer_blocks.0.norm3.bias:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]  
ram used:  0.05 GB, model.diffusion_model.input_blocks.2.1.proj_out.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]                
ram used:  0.05 GB, model.diffusion_model.input_blocks.2.1.proj_out.bias:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]  
ram used:  0.05 GB, model.diffusion_model.input_blocks.3.0.op.weight  :   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]  
ram used:  0.05 GB, model.diffusion_model.input_blocks.3.0.op.bias    :   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.05 GB, model.diffusion_model.input_blocks.4.0.in_layers.0.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.05 GB, model.diffusion_model.input_blocks.4.0.in_layers.0.bias:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]  
ram used:  0.05 GB, model.diffusion_model.input_blocks.4.0.in_layers.2.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
ram used:  0.06 GB, model.diffusion_model.input_blocks.4.0.in_layers.2.bias:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]  
ram used:  0.06 GB, model.diffusion_model.input_blocks.4.0.emb_layers.1.weight:   5%|▍         | 56/1131 [00:00<00:04, 224.33it/s]
[per-parameter weight-loading progress condensed: ram used climbs from 0.06 GB to 2.16 GB as parameters 56–322 of 1131 load through input_blocks 4–11, middle_block, and output_blocks 0–3, at roughly 220–440 it/s]
ram used:  2.16 GB, model.diffusion_model.output_blocks.3.0.in_layers.2.bias:  28%|██▊       | 322/1131 [00:01<00:03, 248.71it/s]
ram used:  2.16 GB, model.diffusion_model.output_blocks.3.0.emb_layers.1.weight:  28%|██▊       | 322/1131 [00:01<00:03, 248.71it/s]
ram used:  2.17 GB, model.diffusion_model.output_blocks.3.0.emb_layers.1.bias:  28%|██▊       | 322/1131 [00:01<00:03, 248.71it/s]  
ram used:  2.17 GB, model.diffusion_model.output_blocks.3.0.out_layers.0.weight:  28%|██▊       | 322/1131 [00:01<00:03, 248.71it/s]
ram used:  2.17 GB, model.diffusion_model.output_blocks.3.0.out_layers.0.bias:  28%|██▊       | 322/1131 [00:01<00:03, 248.71it/s]  
ram used:  2.17 GB, model.diffusion_model.output_blocks.3.0.out_layers.3.weight:  28%|██▊       | 322/1131 [00:01<00:03, 248.71it/s]
ram used:  2.23 GB, model.diffusion_model.output_blocks.3.0.out_layers.3.bias:  28%|██▊       | 322/1131 [00:01<00:03, 248.71it/s]  
ram used:  2.23 GB, model.diffusion_model.output_blocks.3.0.skip_connection.weight:  28%|██▊       | 322/1131 [00:01<00:03, 248.71it/s]
ram used:  2.24 GB, model.diffusion_model.output_blocks.3.0.skip_connection.bias:  28%|██▊       | 322/1131 [00:01<00:03, 248.71it/s]  
ram used:  2.24 GB, model.diffusion_model.output_blocks.3.1.norm.weight:  28%|██▊       | 322/1131 [00:01<00:03, 248.71it/s]         
ram used:  2.24 GB, model.diffusion_model.output_blocks.3.1.norm.bias :  28%|██▊       | 322/1131 [00:01<00:03, 248.71it/s] 
ram used:  2.24 GB, model.diffusion_model.output_blocks.3.1.proj_in.weight:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]
ram used:  2.25 GB, model.diffusion_model.output_blocks.3.1.proj_in.bias:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]  
ram used:  2.25 GB, model.diffusion_model.output_blocks.3.1.transformer_blocks.0.attn1.to_q.weight:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]
ram used:  2.25 GB, model.diffusion_model.output_blocks.3.1.transformer_blocks.0.attn1.to_k.weight:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]
ram used:  2.26 GB, model.diffusion_model.output_blocks.3.1.transformer_blocks.0.attn1.to_v.weight:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]
ram used:  2.27 GB, model.diffusion_model.output_blocks.3.1.transformer_blocks.0.attn1.to_out.0.weight:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]
ram used:  2.27 GB, model.diffusion_model.output_blocks.3.1.transformer_blocks.0.attn1.to_out.0.bias:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]  
ram used:  2.27 GB, model.diffusion_model.output_blocks.3.1.transformer_blocks.0.ff.net.0.proj.weight:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]
ram used:  2.33 GB, model.diffusion_model.output_blocks.3.1.transformer_blocks.0.ff.net.0.proj.bias:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]  
ram used:  2.33 GB, model.diffusion_model.output_blocks.3.1.transformer_blocks.0.ff.net.2.weight:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]   
ram used:  2.35 GB, model.diffusion_model.output_blocks.3.1.transformer_blocks.0.ff.net.2.bias:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]  
ram used:  2.35 GB, model.diffusion_model.output_blocks.3.1.transformer_blocks.0.attn2.to_q.weight:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]
ram used:  2.36 GB, model.diffusion_model.output_blocks.3.1.transformer_blocks.0.attn2.to_k.weight:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]
ram used:  2.36 GB, model.diffusion_model.output_blocks.3.1.transformer_blocks.0.attn2.to_v.weight:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]
ram used:  2.37 GB, model.diffusion_model.output_blocks.3.1.transformer_blocks.0.attn2.to_out.0.weight:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]
ram used:  2.37 GB, model.diffusion_model.output_blocks.3.1.transformer_blocks.0.attn2.to_out.0.bias:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]  
ram used:  2.37 GB, model.diffusion_model.output_blocks.3.1.transformer_blocks.0.norm1.weight:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]       
ram used:  2.37 GB, model.diffusion_model.output_blocks.3.1.transformer_blocks.0.norm1.bias:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]  
ram used:  2.37 GB, model.diffusion_model.output_blocks.3.1.transformer_blocks.0.norm2.weight:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]
ram used:  2.37 GB, model.diffusion_model.output_blocks.3.1.transformer_blocks.0.norm2.bias:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]  
ram used:  2.37 GB, model.diffusion_model.output_blocks.3.1.transformer_blocks.0.norm3.weight:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]
ram used:  2.37 GB, model.diffusion_model.output_blocks.3.1.transformer_blocks.0.norm3.bias:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]  
ram used:  2.37 GB, model.diffusion_model.output_blocks.3.1.proj_out.weight:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]                
ram used:  2.38 GB, model.diffusion_model.output_blocks.3.1.proj_out.bias:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]  
ram used:  2.38 GB, model.diffusion_model.output_blocks.4.0.in_layers.0.weight:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]
ram used:  2.38 GB, model.diffusion_model.output_blocks.4.0.in_layers.0.bias:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]  
ram used:  2.38 GB, model.diffusion_model.output_blocks.4.0.in_layers.2.weight:  31%|███       | 351/1131 [00:01<00:03, 216.94it/s]
ram used:  2.50 GB, model.diffusion_model.output_blocks.4.0.in_layers.2.bias:  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s]  
ram used:  2.50 GB, model.diffusion_model.output_blocks.4.0.emb_layers.1.weight:  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s]
ram used:  2.50 GB, model.diffusion_model.output_blocks.4.0.emb_layers.1.bias:  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s]  
ram used:  2.50 GB, model.diffusion_model.output_blocks.4.0.out_layers.0.weight:  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s]
ram used:  2.50 GB, model.diffusion_model.output_blocks.4.0.out_layers.0.bias:  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s]  
ram used:  2.50 GB, model.diffusion_model.output_blocks.4.0.out_layers.3.weight:  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s]
ram used:  2.56 GB, model.diffusion_model.output_blocks.4.0.out_layers.3.bias:  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s]  
ram used:  2.56 GB, model.diffusion_model.output_blocks.4.0.skip_connection.weight:  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s]
ram used:  2.58 GB, model.diffusion_model.output_blocks.4.0.skip_connection.bias:  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s]  
ram used:  2.58 GB, model.diffusion_model.output_blocks.4.1.norm.weight:  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s]         
ram used:  2.58 GB, model.diffusion_model.output_blocks.4.1.norm.bias :  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s] 
ram used:  2.58 GB, model.diffusion_model.output_blocks.4.1.proj_in.weight:  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s]
ram used:  2.58 GB, model.diffusion_model.output_blocks.4.1.proj_in.bias:  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s]  
ram used:  2.58 GB, model.diffusion_model.output_blocks.4.1.transformer_blocks.0.attn1.to_q.weight:  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s]
ram used:  2.59 GB, model.diffusion_model.output_blocks.4.1.transformer_blocks.0.attn1.to_k.weight:  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s]
ram used:  2.60 GB, model.diffusion_model.output_blocks.4.1.transformer_blocks.0.attn1.to_v.weight:  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s]
ram used:  2.60 GB, model.diffusion_model.output_blocks.4.1.transformer_blocks.0.attn1.to_out.0.weight:  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s]
ram used:  2.61 GB, model.diffusion_model.output_blocks.4.1.transformer_blocks.0.attn1.to_out.0.bias:  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s]  
ram used:  2.61 GB, model.diffusion_model.output_blocks.4.1.transformer_blocks.0.ff.net.0.proj.weight:  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s]
ram used:  2.66 GB, model.diffusion_model.output_blocks.4.1.transformer_blocks.0.ff.net.0.proj.bias:  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s]  
ram used:  2.66 GB, model.diffusion_model.output_blocks.4.1.transformer_blocks.0.ff.net.2.weight:  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s]   
ram used:  2.69 GB, model.diffusion_model.output_blocks.4.1.transformer_blocks.0.ff.net.2.bias:  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s]  
ram used:  2.69 GB, model.diffusion_model.output_blocks.4.1.transformer_blocks.0.attn2.to_q.weight:  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s]
ram used:  2.69 GB, model.diffusion_model.output_blocks.4.1.transformer_blocks.0.attn2.to_k.weight:  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s]
ram used:  2.70 GB, model.diffusion_model.output_blocks.4.1.transformer_blocks.0.attn2.to_v.weight:  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s]
ram used:  2.70 GB, model.diffusion_model.output_blocks.4.1.transformer_blocks.0.attn2.to_out.0.weight:  33%|███▎      | 378/1131 [00:01<00:03, 222.81it/s]
ram used:  2.71 GB, model.diffusion_model.output_blocks.4.1.transformer_blocks.0.attn2.to_out.0.bias:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]  
ram used:  2.71 GB, model.diffusion_model.output_blocks.4.1.transformer_blocks.0.norm1.weight:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]       
ram used:  2.71 GB, model.diffusion_model.output_blocks.4.1.transformer_blocks.0.norm1.bias:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]  
ram used:  2.71 GB, model.diffusion_model.output_blocks.4.1.transformer_blocks.0.norm2.weight:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]
ram used:  2.71 GB, model.diffusion_model.output_blocks.4.1.transformer_blocks.0.norm2.bias:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]  
ram used:  2.71 GB, model.diffusion_model.output_blocks.4.1.transformer_blocks.0.norm3.weight:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]
ram used:  2.71 GB, model.diffusion_model.output_blocks.4.1.transformer_blocks.0.norm3.bias:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]  
ram used:  2.71 GB, model.diffusion_model.output_blocks.4.1.proj_out.weight:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]                
ram used:  2.72 GB, model.diffusion_model.output_blocks.4.1.proj_out.bias:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]  
ram used:  2.72 GB, model.diffusion_model.output_blocks.5.0.in_layers.0.weight:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]
ram used:  2.72 GB, model.diffusion_model.output_blocks.5.0.in_layers.0.bias:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]  
ram used:  2.72 GB, model.diffusion_model.output_blocks.5.0.in_layers.2.weight:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]
ram used:  2.80 GB, model.diffusion_model.output_blocks.5.0.in_layers.2.bias:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]  
ram used:  2.80 GB, model.diffusion_model.output_blocks.5.0.emb_layers.1.weight:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]
ram used:  2.81 GB, model.diffusion_model.output_blocks.5.0.emb_layers.1.bias:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]  
ram used:  2.81 GB, model.diffusion_model.output_blocks.5.0.out_layers.0.weight:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]
ram used:  2.81 GB, model.diffusion_model.output_blocks.5.0.out_layers.0.bias:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]  
ram used:  2.81 GB, model.diffusion_model.output_blocks.5.0.out_layers.3.weight:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]
ram used:  2.87 GB, model.diffusion_model.output_blocks.5.0.out_layers.3.bias:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]  
ram used:  2.87 GB, model.diffusion_model.output_blocks.5.0.skip_connection.weight:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]
ram used:  2.88 GB, model.diffusion_model.output_blocks.5.0.skip_connection.bias:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]  
ram used:  2.88 GB, model.diffusion_model.output_blocks.5.1.norm.weight:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]         
ram used:  2.88 GB, model.diffusion_model.output_blocks.5.1.norm.bias :  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s] 
ram used:  2.88 GB, model.diffusion_model.output_blocks.5.1.proj_in.weight:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]
ram used:  2.89 GB, model.diffusion_model.output_blocks.5.1.proj_in.bias:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]  
ram used:  2.89 GB, model.diffusion_model.output_blocks.5.1.transformer_blocks.0.attn1.to_q.weight:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]
ram used:  2.89 GB, model.diffusion_model.output_blocks.5.1.transformer_blocks.0.attn1.to_k.weight:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]
ram used:  2.90 GB, model.diffusion_model.output_blocks.5.1.transformer_blocks.0.attn1.to_v.weight:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]
ram used:  2.91 GB, model.diffusion_model.output_blocks.5.1.transformer_blocks.0.attn1.to_out.0.weight:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]
ram used:  2.91 GB, model.diffusion_model.output_blocks.5.1.transformer_blocks.0.attn1.to_out.0.bias:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]  
ram used:  2.91 GB, model.diffusion_model.output_blocks.5.1.transformer_blocks.0.ff.net.0.proj.weight:  36%|███▌      | 404/1131 [00:01<00:03, 230.72it/s]
ram used:  2.96 GB, model.diffusion_model.output_blocks.5.1.transformer_blocks.0.ff.net.0.proj.bias:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]  
ram used:  2.96 GB, model.diffusion_model.output_blocks.5.1.transformer_blocks.0.ff.net.2.weight:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]   
ram used:  2.99 GB, model.diffusion_model.output_blocks.5.1.transformer_blocks.0.ff.net.2.bias:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]  
ram used:  2.99 GB, model.diffusion_model.output_blocks.5.1.transformer_blocks.0.attn2.to_q.weight:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]
ram used:  3.00 GB, model.diffusion_model.output_blocks.5.1.transformer_blocks.0.attn2.to_k.weight:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]
ram used:  3.00 GB, model.diffusion_model.output_blocks.5.1.transformer_blocks.0.attn2.to_v.weight:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]
ram used:  3.01 GB, model.diffusion_model.output_blocks.5.1.transformer_blocks.0.attn2.to_out.0.weight:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]
ram used:  3.01 GB, model.diffusion_model.output_blocks.5.1.transformer_blocks.0.attn2.to_out.0.bias:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]  
ram used:  3.01 GB, model.diffusion_model.output_blocks.5.1.transformer_blocks.0.norm1.weight:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]       
ram used:  3.01 GB, model.diffusion_model.output_blocks.5.1.transformer_blocks.0.norm1.bias:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]  
ram used:  3.01 GB, model.diffusion_model.output_blocks.5.1.transformer_blocks.0.norm2.weight:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]
ram used:  3.01 GB, model.diffusion_model.output_blocks.5.1.transformer_blocks.0.norm2.bias:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]  
ram used:  3.01 GB, model.diffusion_model.output_blocks.5.1.transformer_blocks.0.norm3.weight:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]
ram used:  3.01 GB, model.diffusion_model.output_blocks.5.1.transformer_blocks.0.norm3.bias:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]  
ram used:  3.01 GB, model.diffusion_model.output_blocks.5.1.proj_out.weight:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]                
ram used:  3.02 GB, model.diffusion_model.output_blocks.5.1.proj_out.bias:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]  
ram used:  3.02 GB, model.diffusion_model.output_blocks.5.2.conv.weight:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]  
ram used:  3.08 GB, model.diffusion_model.output_blocks.5.2.conv.bias :  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s] 
ram used:  3.08 GB, model.diffusion_model.output_blocks.6.0.in_layers.0.weight:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]
ram used:  3.08 GB, model.diffusion_model.output_blocks.6.0.in_layers.0.bias:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]  
ram used:  3.08 GB, model.diffusion_model.output_blocks.6.0.in_layers.2.weight:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]
ram used:  3.12 GB, model.diffusion_model.output_blocks.6.0.in_layers.2.bias:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]  
ram used:  3.12 GB, model.diffusion_model.output_blocks.6.0.emb_layers.1.weight:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]
ram used:  3.12 GB, model.diffusion_model.output_blocks.6.0.emb_layers.1.bias:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]  
ram used:  3.12 GB, model.diffusion_model.output_blocks.6.0.out_layers.0.weight:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]
ram used:  3.12 GB, model.diffusion_model.output_blocks.6.0.out_layers.0.bias:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]  
ram used:  3.12 GB, model.diffusion_model.output_blocks.6.0.out_layers.3.weight:  38%|███▊      | 435/1131 [00:01<00:02, 240.90it/s]
ram used:  3.14 GB, model.diffusion_model.output_blocks.6.0.out_layers.3.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.14 GB, model.diffusion_model.output_blocks.6.0.skip_connection.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.14 GB, model.diffusion_model.output_blocks.6.0.skip_connection.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.14 GB, model.diffusion_model.output_blocks.6.1.norm.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]         
ram used:  3.14 GB, model.diffusion_model.output_blocks.6.1.norm.bias :  41%|████      | 462/1131 [00:01<00:02, 245.25it/s] 
ram used:  3.14 GB, model.diffusion_model.output_blocks.6.1.proj_in.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.15 GB, model.diffusion_model.output_blocks.6.1.proj_in.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.15 GB, model.diffusion_model.output_blocks.6.1.transformer_blocks.0.attn1.to_q.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.15 GB, model.diffusion_model.output_blocks.6.1.transformer_blocks.0.attn1.to_k.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.15 GB, model.diffusion_model.output_blocks.6.1.transformer_blocks.0.attn1.to_v.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.15 GB, model.diffusion_model.output_blocks.6.1.transformer_blocks.0.attn1.to_out.0.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.15 GB, model.diffusion_model.output_blocks.6.1.transformer_blocks.0.attn1.to_out.0.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.15 GB, model.diffusion_model.output_blocks.6.1.transformer_blocks.0.ff.net.0.proj.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.17 GB, model.diffusion_model.output_blocks.6.1.transformer_blocks.0.ff.net.0.proj.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.17 GB, model.diffusion_model.output_blocks.6.1.transformer_blocks.0.ff.net.2.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]   
ram used:  3.17 GB, model.diffusion_model.output_blocks.6.1.transformer_blocks.0.ff.net.2.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.17 GB, model.diffusion_model.output_blocks.6.1.transformer_blocks.0.attn2.to_q.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.17 GB, model.diffusion_model.output_blocks.6.1.transformer_blocks.0.attn2.to_k.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.18 GB, model.diffusion_model.output_blocks.6.1.transformer_blocks.0.attn2.to_v.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.18 GB, model.diffusion_model.output_blocks.6.1.transformer_blocks.0.attn2.to_out.0.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.18 GB, model.diffusion_model.output_blocks.6.1.transformer_blocks.0.attn2.to_out.0.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.18 GB, model.diffusion_model.output_blocks.6.1.transformer_blocks.0.norm1.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]       
ram used:  3.18 GB, model.diffusion_model.output_blocks.6.1.transformer_blocks.0.norm1.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.18 GB, model.diffusion_model.output_blocks.6.1.transformer_blocks.0.norm2.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.18 GB, model.diffusion_model.output_blocks.6.1.transformer_blocks.0.norm2.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.18 GB, model.diffusion_model.output_blocks.6.1.transformer_blocks.0.norm3.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.18 GB, model.diffusion_model.output_blocks.6.1.transformer_blocks.0.norm3.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.18 GB, model.diffusion_model.output_blocks.6.1.proj_out.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]                
ram used:  3.18 GB, model.diffusion_model.output_blocks.6.1.proj_out.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.18 GB, model.diffusion_model.output_blocks.7.0.in_layers.0.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.18 GB, model.diffusion_model.output_blocks.7.0.in_layers.0.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.18 GB, model.diffusion_model.output_blocks.7.0.in_layers.2.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.21 GB, model.diffusion_model.output_blocks.7.0.in_layers.2.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.21 GB, model.diffusion_model.output_blocks.7.0.emb_layers.1.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.21 GB, model.diffusion_model.output_blocks.7.0.emb_layers.1.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.21 GB, model.diffusion_model.output_blocks.7.0.out_layers.0.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.21 GB, model.diffusion_model.output_blocks.7.0.out_layers.0.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.21 GB, model.diffusion_model.output_blocks.7.0.out_layers.3.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.23 GB, model.diffusion_model.output_blocks.7.0.out_layers.3.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.23 GB, model.diffusion_model.output_blocks.7.0.skip_connection.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.23 GB, model.diffusion_model.output_blocks.7.0.skip_connection.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.23 GB, model.diffusion_model.output_blocks.7.1.norm.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]         
ram used:  3.23 GB, model.diffusion_model.output_blocks.7.1.norm.bias :  41%|████      | 462/1131 [00:01<00:02, 245.25it/s] 
ram used:  3.23 GB, model.diffusion_model.output_blocks.7.1.proj_in.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.23 GB, model.diffusion_model.output_blocks.7.1.proj_in.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.23 GB, model.diffusion_model.output_blocks.7.1.transformer_blocks.0.attn1.to_q.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.24 GB, model.diffusion_model.output_blocks.7.1.transformer_blocks.0.attn1.to_k.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.24 GB, model.diffusion_model.output_blocks.7.1.transformer_blocks.0.attn1.to_v.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.24 GB, model.diffusion_model.output_blocks.7.1.transformer_blocks.0.attn1.to_out.0.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.24 GB, model.diffusion_model.output_blocks.7.1.transformer_blocks.0.attn1.to_out.0.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.24 GB, model.diffusion_model.output_blocks.7.1.transformer_blocks.0.ff.net.0.proj.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.25 GB, model.diffusion_model.output_blocks.7.1.transformer_blocks.0.ff.net.0.proj.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.25 GB, model.diffusion_model.output_blocks.7.1.transformer_blocks.0.ff.net.2.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]   
ram used:  3.26 GB, model.diffusion_model.output_blocks.7.1.transformer_blocks.0.ff.net.2.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.26 GB, model.diffusion_model.output_blocks.7.1.transformer_blocks.0.attn2.to_q.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.26 GB, model.diffusion_model.output_blocks.7.1.transformer_blocks.0.attn2.to_k.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.26 GB, model.diffusion_model.output_blocks.7.1.transformer_blocks.0.attn2.to_v.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.27 GB, model.diffusion_model.output_blocks.7.1.transformer_blocks.0.attn2.to_out.0.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.27 GB, model.diffusion_model.output_blocks.7.1.transformer_blocks.0.attn2.to_out.0.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.27 GB, model.diffusion_model.output_blocks.7.1.transformer_blocks.0.norm1.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]       
ram used:  3.27 GB, model.diffusion_model.output_blocks.7.1.transformer_blocks.0.norm1.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.27 GB, model.diffusion_model.output_blocks.7.1.transformer_blocks.0.norm2.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.27 GB, model.diffusion_model.output_blocks.7.1.transformer_blocks.0.norm2.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.27 GB, model.diffusion_model.output_blocks.7.1.transformer_blocks.0.norm3.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.27 GB, model.diffusion_model.output_blocks.7.1.transformer_blocks.0.norm3.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.27 GB, model.diffusion_model.output_blocks.7.1.proj_out.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]                
ram used:  3.27 GB, model.diffusion_model.output_blocks.7.1.proj_out.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.27 GB, model.diffusion_model.output_blocks.8.0.in_layers.0.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.27 GB, model.diffusion_model.output_blocks.8.0.in_layers.0.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.27 GB, model.diffusion_model.output_blocks.8.0.in_layers.2.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.29 GB, model.diffusion_model.output_blocks.8.0.in_layers.2.bias:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]  
ram used:  3.29 GB, model.diffusion_model.output_blocks.8.0.emb_layers.1.weight:  41%|████      | 462/1131 [00:01<00:02, 245.25it/s]
ram used:  3.29 GB, model.diffusion_model.output_blocks.8.0.emb_layers.1.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.29 GB, model.diffusion_model.output_blocks.8.0.out_layers.0.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.29 GB, model.diffusion_model.output_blocks.8.0.out_layers.0.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.29 GB, model.diffusion_model.output_blocks.8.0.out_layers.3.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.31 GB, model.diffusion_model.output_blocks.8.0.out_layers.3.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.31 GB, model.diffusion_model.output_blocks.8.0.skip_connection.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.31 GB, model.diffusion_model.output_blocks.8.0.skip_connection.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.31 GB, model.diffusion_model.output_blocks.8.1.norm.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]         
ram used:  3.31 GB, model.diffusion_model.output_blocks.8.1.norm.bias :  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s] 
ram used:  3.31 GB, model.diffusion_model.output_blocks.8.1.proj_in.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.31 GB, model.diffusion_model.output_blocks.8.1.proj_in.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.31 GB, model.diffusion_model.output_blocks.8.1.transformer_blocks.0.attn1.to_q.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.31 GB, model.diffusion_model.output_blocks.8.1.transformer_blocks.0.attn1.to_k.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.32 GB, model.diffusion_model.output_blocks.8.1.transformer_blocks.0.attn1.to_v.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.32 GB, model.diffusion_model.output_blocks.8.1.transformer_blocks.0.attn1.to_out.0.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.32 GB, model.diffusion_model.output_blocks.8.1.transformer_blocks.0.attn1.to_out.0.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.32 GB, model.diffusion_model.output_blocks.8.1.transformer_blocks.0.ff.net.0.proj.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.33 GB, model.diffusion_model.output_blocks.8.1.transformer_blocks.0.ff.net.0.proj.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.33 GB, model.diffusion_model.output_blocks.8.1.transformer_blocks.0.ff.net.2.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]   
ram used:  3.34 GB, model.diffusion_model.output_blocks.8.1.transformer_blocks.0.ff.net.2.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.34 GB, model.diffusion_model.output_blocks.8.1.transformer_blocks.0.attn2.to_q.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.34 GB, model.diffusion_model.output_blocks.8.1.transformer_blocks.0.attn2.to_k.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.34 GB, model.diffusion_model.output_blocks.8.1.transformer_blocks.0.attn2.to_v.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.34 GB, model.diffusion_model.output_blocks.8.1.transformer_blocks.0.attn2.to_out.0.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.35 GB, model.diffusion_model.output_blocks.8.1.transformer_blocks.0.attn2.to_out.0.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.35 GB, model.diffusion_model.output_blocks.8.1.transformer_blocks.0.norm1.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]       
ram used:  3.35 GB, model.diffusion_model.output_blocks.8.1.transformer_blocks.0.norm1.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.35 GB, model.diffusion_model.output_blocks.8.1.transformer_blocks.0.norm2.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.35 GB, model.diffusion_model.output_blocks.8.1.transformer_blocks.0.norm2.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.35 GB, model.diffusion_model.output_blocks.8.1.transformer_blocks.0.norm3.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.35 GB, model.diffusion_model.output_blocks.8.1.transformer_blocks.0.norm3.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.35 GB, model.diffusion_model.output_blocks.8.1.proj_out.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]                
ram used:  3.35 GB, model.diffusion_model.output_blocks.8.1.proj_out.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.35 GB, model.diffusion_model.output_blocks.8.2.conv.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.36 GB, model.diffusion_model.output_blocks.8.2.conv.bias :  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s] 
ram used:  3.36 GB, model.diffusion_model.output_blocks.9.0.in_layers.0.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.36 GB, model.diffusion_model.output_blocks.9.0.in_layers.0.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.36 GB, model.diffusion_model.output_blocks.9.0.in_layers.2.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.37 GB, model.diffusion_model.output_blocks.9.0.in_layers.2.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.37 GB, model.diffusion_model.output_blocks.9.0.emb_layers.1.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.38 GB, model.diffusion_model.output_blocks.9.0.emb_layers.1.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.38 GB, model.diffusion_model.output_blocks.9.0.out_layers.0.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.38 GB, model.diffusion_model.output_blocks.9.0.out_layers.0.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.38 GB, model.diffusion_model.output_blocks.9.0.out_layers.3.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.38 GB, model.diffusion_model.output_blocks.9.0.out_layers.3.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.38 GB, model.diffusion_model.output_blocks.9.0.skip_connection.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.38 GB, model.diffusion_model.output_blocks.9.0.skip_connection.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.38 GB, model.diffusion_model.output_blocks.9.1.norm.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]         
ram used:  3.38 GB, model.diffusion_model.output_blocks.9.1.norm.bias :  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s] 
ram used:  3.38 GB, model.diffusion_model.output_blocks.9.1.proj_in.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.38 GB, model.diffusion_model.output_blocks.9.1.proj_in.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.38 GB, model.diffusion_model.output_blocks.9.1.transformer_blocks.0.attn1.to_q.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.38 GB, model.diffusion_model.output_blocks.9.1.transformer_blocks.0.attn1.to_k.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.38 GB, model.diffusion_model.output_blocks.9.1.transformer_blocks.0.attn1.to_v.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.38 GB, model.diffusion_model.output_blocks.9.1.transformer_blocks.0.attn1.to_out.0.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.38 GB, model.diffusion_model.output_blocks.9.1.transformer_blocks.0.attn1.to_out.0.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.38 GB, model.diffusion_model.output_blocks.9.1.transformer_blocks.0.ff.net.0.proj.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.39 GB, model.diffusion_model.output_blocks.9.1.transformer_blocks.0.ff.net.0.proj.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.39 GB, model.diffusion_model.output_blocks.9.1.transformer_blocks.0.ff.net.2.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]   
ram used:  3.39 GB, model.diffusion_model.output_blocks.9.1.transformer_blocks.0.ff.net.2.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.39 GB, model.diffusion_model.output_blocks.9.1.transformer_blocks.0.attn2.to_q.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.39 GB, model.diffusion_model.output_blocks.9.1.transformer_blocks.0.attn2.to_k.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.39 GB, model.diffusion_model.output_blocks.9.1.transformer_blocks.0.attn2.to_v.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.39 GB, model.diffusion_model.output_blocks.9.1.transformer_blocks.0.attn2.to_out.0.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.39 GB, model.diffusion_model.output_blocks.9.1.transformer_blocks.0.attn2.to_out.0.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.39 GB, model.diffusion_model.output_blocks.9.1.transformer_blocks.0.norm1.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]       
ram used:  3.39 GB, model.diffusion_model.output_blocks.9.1.transformer_blocks.0.norm1.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.39 GB, model.diffusion_model.output_blocks.9.1.transformer_blocks.0.norm2.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.39 GB, model.diffusion_model.output_blocks.9.1.transformer_blocks.0.norm2.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.39 GB, model.diffusion_model.output_blocks.9.1.transformer_blocks.0.norm3.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.39 GB, model.diffusion_model.output_blocks.9.1.transformer_blocks.0.norm3.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.39 GB, model.diffusion_model.output_blocks.9.1.proj_out.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]                
ram used:  3.39 GB, model.diffusion_model.output_blocks.9.1.proj_out.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.39 GB, model.diffusion_model.output_blocks.10.0.in_layers.0.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.39 GB, model.diffusion_model.output_blocks.10.0.in_layers.0.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.39 GB, model.diffusion_model.output_blocks.10.0.in_layers.2.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.40 GB, model.diffusion_model.output_blocks.10.0.in_layers.2.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.40 GB, model.diffusion_model.output_blocks.10.0.emb_layers.1.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.40 GB, model.diffusion_model.output_blocks.10.0.emb_layers.1.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.40 GB, model.diffusion_model.output_blocks.10.0.out_layers.0.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.40 GB, model.diffusion_model.output_blocks.10.0.out_layers.0.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.40 GB, model.diffusion_model.output_blocks.10.0.out_layers.3.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.40 GB, model.diffusion_model.output_blocks.10.0.out_layers.3.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.40 GB, model.diffusion_model.output_blocks.10.0.skip_connection.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.40 GB, model.diffusion_model.output_blocks.10.0.skip_connection.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.40 GB, model.diffusion_model.output_blocks.10.1.norm.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]         
ram used:  3.40 GB, model.diffusion_model.output_blocks.10.1.norm.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.40 GB, model.diffusion_model.output_blocks.10.1.proj_in.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.40 GB, model.diffusion_model.output_blocks.10.1.proj_in.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.40 GB, model.diffusion_model.output_blocks.10.1.transformer_blocks.0.attn1.to_q.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.40 GB, model.diffusion_model.output_blocks.10.1.transformer_blocks.0.attn1.to_k.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.41 GB, model.diffusion_model.output_blocks.10.1.transformer_blocks.0.attn1.to_v.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.41 GB, model.diffusion_model.output_blocks.10.1.transformer_blocks.0.attn1.to_out.0.weight:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]
ram used:  3.41 GB, model.diffusion_model.output_blocks.10.1.transformer_blocks.0.attn1.to_out.0.bias:  47%|████▋     | 534/1131 [00:01<00:01, 366.95it/s]  
ram used:  3.41 GB, model.diffusion_model.output_blocks.10.1.transformer_blocks.0.ff.net.0.proj.weight:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]
ram used:  3.41 GB, model.diffusion_model.output_blocks.10.1.transformer_blocks.0.ff.net.0.proj.bias:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]  
ram used:  3.41 GB, model.diffusion_model.output_blocks.10.1.transformer_blocks.0.ff.net.2.weight:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]   
ram used:  3.41 GB, model.diffusion_model.output_blocks.10.1.transformer_blocks.0.ff.net.2.bias:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]  
ram used:  3.41 GB, model.diffusion_model.output_blocks.10.1.transformer_blocks.0.attn2.to_q.weight:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]
ram used:  3.41 GB, model.diffusion_model.output_blocks.10.1.transformer_blocks.0.attn2.to_k.weight:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]
ram used:  3.41 GB, model.diffusion_model.output_blocks.10.1.transformer_blocks.0.attn2.to_v.weight:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]
ram used:  3.41 GB, model.diffusion_model.output_blocks.10.1.transformer_blocks.0.attn2.to_out.0.weight:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]
ram used:  3.41 GB, model.diffusion_model.output_blocks.10.1.transformer_blocks.0.attn2.to_out.0.bias:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]  
ram used:  3.41 GB, model.diffusion_model.output_blocks.10.1.transformer_blocks.0.norm1.weight:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]       
ram used:  3.41 GB, model.diffusion_model.output_blocks.10.1.transformer_blocks.0.norm1.bias:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]  
ram used:  3.41 GB, model.diffusion_model.output_blocks.10.1.transformer_blocks.0.norm2.weight:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]
ram used:  3.41 GB, model.diffusion_model.output_blocks.10.1.transformer_blocks.0.norm2.bias:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]  
ram used:  3.41 GB, model.diffusion_model.output_blocks.10.1.transformer_blocks.0.norm3.weight:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]
ram used:  3.41 GB, model.diffusion_model.output_blocks.10.1.transformer_blocks.0.norm3.bias:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]  
ram used:  3.41 GB, model.diffusion_model.output_blocks.10.1.proj_out.weight:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]                
ram used:  3.41 GB, model.diffusion_model.output_blocks.10.1.proj_out.bias:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]  
ram used:  3.41 GB, model.diffusion_model.output_blocks.11.0.in_layers.0.weight:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]
ram used:  3.41 GB, model.diffusion_model.output_blocks.11.0.in_layers.0.bias:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]  
ram used:  3.41 GB, model.diffusion_model.output_blocks.11.0.in_layers.2.weight:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]
ram used:  3.42 GB, model.diffusion_model.output_blocks.11.0.in_layers.2.bias:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]  
ram used:  3.42 GB, model.diffusion_model.output_blocks.11.0.emb_layers.1.weight:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]
ram used:  3.42 GB, model.diffusion_model.output_blocks.11.0.emb_layers.1.bias:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]  
ram used:  3.42 GB, model.diffusion_model.output_blocks.11.0.out_layers.0.weight:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]
ram used:  3.42 GB, model.diffusion_model.output_blocks.11.0.out_layers.0.bias:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]  
ram used:  3.42 GB, model.diffusion_model.output_blocks.11.0.out_layers.3.weight:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]
ram used:  3.43 GB, model.diffusion_model.output_blocks.11.0.out_layers.3.bias:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]  
ram used:  3.43 GB, model.diffusion_model.output_blocks.11.0.skip_connection.weight:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]
ram used:  3.43 GB, model.diffusion_model.output_blocks.11.0.skip_connection.bias:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]  
ram used:  3.43 GB, model.diffusion_model.output_blocks.11.1.norm.weight:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]         
ram used:  3.43 GB, model.diffusion_model.output_blocks.11.1.norm.bias:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]  
ram used:  3.43 GB, model.diffusion_model.output_blocks.11.1.proj_in.weight:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]
ram used:  3.43 GB, model.diffusion_model.output_blocks.11.1.proj_in.bias:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]  
ram used:  3.43 GB, model.diffusion_model.output_blocks.11.1.transformer_blocks.0.attn1.to_q.weight:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]
ram used:  3.43 GB, model.diffusion_model.output_blocks.11.1.transformer_blocks.0.attn1.to_k.weight:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]
ram used:  3.43 GB, model.diffusion_model.output_blocks.11.1.transformer_blocks.0.attn1.to_v.weight:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]
ram used:  3.43 GB, model.diffusion_model.output_blocks.11.1.transformer_blocks.0.attn1.to_out.0.weight:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]
ram used:  3.43 GB, model.diffusion_model.output_blocks.11.1.transformer_blocks.0.attn1.to_out.0.bias:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]  
ram used:  3.43 GB, model.diffusion_model.output_blocks.11.1.transformer_blocks.0.ff.net.0.proj.weight:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]
ram used:  3.43 GB, model.diffusion_model.output_blocks.11.1.transformer_blocks.0.ff.net.0.proj.bias:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]  
ram used:  3.43 GB, model.diffusion_model.output_blocks.11.1.transformer_blocks.0.ff.net.2.weight:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]   
ram used:  3.43 GB, model.diffusion_model.output_blocks.11.1.transformer_blocks.0.ff.net.2.bias:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]  
ram used:  3.43 GB, model.diffusion_model.output_blocks.11.1.transformer_blocks.0.attn2.to_q.weight:  56%|█████▌    | 629/1131 [00:01<00:00, 523.12it/s]
ram used:  3.44 GB, model.diffusion_model.output_blocks.11.1.transformer_blocks.0.attn2.to_k.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.44 GB, model.diffusion_model.output_blocks.11.1.transformer_blocks.0.attn2.to_v.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.44 GB, model.diffusion_model.output_blocks.11.1.transformer_blocks.0.attn2.to_out.0.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.44 GB, model.diffusion_model.output_blocks.11.1.transformer_blocks.0.attn2.to_out.0.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.44 GB, model.diffusion_model.output_blocks.11.1.transformer_blocks.0.norm1.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]       
ram used:  3.44 GB, model.diffusion_model.output_blocks.11.1.transformer_blocks.0.norm1.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.44 GB, model.diffusion_model.output_blocks.11.1.transformer_blocks.0.norm2.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.44 GB, model.diffusion_model.output_blocks.11.1.transformer_blocks.0.norm2.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.44 GB, model.diffusion_model.output_blocks.11.1.transformer_blocks.0.norm3.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.44 GB, model.diffusion_model.output_blocks.11.1.transformer_blocks.0.norm3.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.44 GB, model.diffusion_model.output_blocks.11.1.proj_out.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]                
ram used:  3.44 GB, model.diffusion_model.output_blocks.11.1.proj_out.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.44 GB, model.diffusion_model.out.0.weight                :  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]    
ram used:  3.44 GB, model.diffusion_model.out.0.bias                  :  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.44 GB, model.diffusion_model.out.2.weight                :  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.44 GB, model.diffusion_model.out.2.bias                  :  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.44 GB, first_stage_model.encoder.conv_in.weight          :  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.44 GB, first_stage_model.encoder.conv_in.bias            :  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.44 GB, first_stage_model.encoder.down.0.block.0.norm1.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.44 GB, first_stage_model.encoder.down.0.block.0.norm1.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.44 GB, first_stage_model.encoder.down.0.block.0.conv1.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.44 GB, first_stage_model.encoder.down.0.block.0.conv1.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.44 GB, first_stage_model.encoder.down.0.block.0.norm2.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.44 GB, first_stage_model.encoder.down.0.block.0.norm2.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.44 GB, first_stage_model.encoder.down.0.block.0.conv2.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.44 GB, first_stage_model.encoder.down.0.block.0.conv2.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.44 GB, first_stage_model.encoder.down.0.block.1.norm1.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.44 GB, first_stage_model.encoder.down.0.block.1.norm1.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.44 GB, first_stage_model.encoder.down.0.block.1.conv1.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.44 GB, first_stage_model.encoder.down.0.block.1.conv1.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.44 GB, first_stage_model.encoder.down.0.block.1.norm2.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.44 GB, first_stage_model.encoder.down.0.block.1.norm2.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.44 GB, first_stage_model.encoder.down.0.block.1.conv2.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.44 GB, first_stage_model.encoder.down.0.block.1.conv2.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.44 GB, first_stage_model.encoder.down.0.downsample.conv.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.44 GB, first_stage_model.encoder.down.0.downsample.conv.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.44 GB, first_stage_model.encoder.down.1.block.0.norm1.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.44 GB, first_stage_model.encoder.down.1.block.0.norm1.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.44 GB, first_stage_model.encoder.down.1.block.0.conv1.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.44 GB, first_stage_model.encoder.down.1.block.0.conv1.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.44 GB, first_stage_model.encoder.down.1.block.0.norm2.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.44 GB, first_stage_model.encoder.down.1.block.0.norm2.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.44 GB, first_stage_model.encoder.down.1.block.0.conv2.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.44 GB, first_stage_model.encoder.down.1.block.0.conv2.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.44 GB, first_stage_model.encoder.down.1.block.0.nin_shortcut.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.44 GB, first_stage_model.encoder.down.1.block.0.nin_shortcut.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.44 GB, first_stage_model.encoder.down.1.block.1.norm1.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]     
ram used:  3.44 GB, first_stage_model.encoder.down.1.block.1.norm1.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.44 GB, first_stage_model.encoder.down.1.block.1.conv1.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.45 GB, first_stage_model.encoder.down.1.block.1.conv1.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.45 GB, first_stage_model.encoder.down.1.block.1.norm2.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.45 GB, first_stage_model.encoder.down.1.block.1.norm2.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.45 GB, first_stage_model.encoder.down.1.block.1.conv2.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.45 GB, first_stage_model.encoder.down.1.block.1.conv2.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.45 GB, first_stage_model.encoder.down.1.downsample.conv.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.45 GB, first_stage_model.encoder.down.1.downsample.conv.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.45 GB, first_stage_model.encoder.down.2.block.0.norm1.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.45 GB, first_stage_model.encoder.down.2.block.0.norm1.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.45 GB, first_stage_model.encoder.down.2.block.0.conv1.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.46 GB, first_stage_model.encoder.down.2.block.0.conv1.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.46 GB, first_stage_model.encoder.down.2.block.0.norm2.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.46 GB, first_stage_model.encoder.down.2.block.0.norm2.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.46 GB, first_stage_model.encoder.down.2.block.0.conv2.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.47 GB, first_stage_model.encoder.down.2.block.0.conv2.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.47 GB, first_stage_model.encoder.down.2.block.0.nin_shortcut.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.47 GB, first_stage_model.encoder.down.2.block.0.nin_shortcut.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.47 GB, first_stage_model.encoder.down.2.block.1.norm1.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]     
ram used:  3.47 GB, first_stage_model.encoder.down.2.block.1.norm1.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.47 GB, first_stage_model.encoder.down.2.block.1.conv1.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.48 GB, first_stage_model.encoder.down.2.block.1.conv1.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.48 GB, first_stage_model.encoder.down.2.block.1.norm2.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.48 GB, first_stage_model.encoder.down.2.block.1.norm2.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.48 GB, first_stage_model.encoder.down.2.block.1.conv2.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.49 GB, first_stage_model.encoder.down.2.block.1.conv2.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.49 GB, first_stage_model.encoder.down.2.downsample.conv.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.49 GB, first_stage_model.encoder.down.2.downsample.conv.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.49 GB, first_stage_model.encoder.down.3.block.0.norm1.weight:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]
ram used:  3.49 GB, first_stage_model.encoder.down.3.block.0.norm1.bias:  56%|█████▌    | 629/1131 [00:02<00:00, 523.12it/s]  
ram used:  3.49 GB, first_stage_model.encoder.down.3.block.0.conv1.weight:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.50 GB, first_stage_model.encoder.down.3.block.0.conv1.bias:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]  
ram used:  3.50 GB, first_stage_model.encoder.down.3.block.0.norm2.weight:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.50 GB, first_stage_model.encoder.down.3.block.0.norm2.bias:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]  
ram used:  3.50 GB, first_stage_model.encoder.down.3.block.0.conv2.weight:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.51 GB, first_stage_model.encoder.down.3.block.0.conv2.bias:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]  
ram used:  3.51 GB, first_stage_model.encoder.down.3.block.1.norm1.weight:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.51 GB, first_stage_model.encoder.down.3.block.1.norm1.bias:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]  
ram used:  3.51 GB, first_stage_model.encoder.down.3.block.1.conv1.weight:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.52 GB, first_stage_model.encoder.down.3.block.1.conv1.bias:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]  
ram used:  3.52 GB, first_stage_model.encoder.down.3.block.1.norm2.weight:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
ram used:  3.52 GB, first_stage_model.encoder.down.3.block.1.norm2.bias:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]  
ram used:  3.52 GB, first_stage_model.encoder.down.3.block.1.conv2.weight:  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
...
ram used:  3.57 GB, first_stage_model.decoder.conv_in.weight          :  66%|██████▋   | 750/1131 [00:02<00:00, 708.08it/s]
...
ram used:  3.77 GB, first_stage_model.post_quant_conv.bias            :  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s]
ram used:  3.77 GB, cond_stage_model.transformer.text_model.embeddings.token_embedding.weight:  75%|███████▌  | 853/1131 [00:02<00:00, 798.64it/s]
...
ram used:  3.92 GB, cond_stage_model.transformer.text_model.embeddings.position_embedding.weight:  83%|████████▎ | 937/1131 [00:02<00:00, 704.23it/s]
...
ram used:  4.06 GB, cond_stage_model.transformer.text_model.encoder.layers.4.mlp.fc2.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
...
ram used:  4.09 GB, cond_stage_model.transformer.text_model.encoder.layers.5.mlp.fc1.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.09 GB, cond_stage_model.transformer.text_model.encoder.layers.5.mlp.fc2.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.09 GB, cond_stage_model.transformer.text_model.encoder.layers.5.mlp.fc2.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.09 GB, cond_stage_model.transformer.text_model.encoder.layers.5.layer_norm2.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.09 GB, cond_stage_model.transformer.text_model.encoder.layers.5.layer_norm2.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.09 GB, cond_stage_model.transformer.text_model.encoder.layers.6.self_attn.k_proj.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.10 GB, cond_stage_model.transformer.text_model.encoder.layers.6.self_attn.k_proj.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.10 GB, cond_stage_model.transformer.text_model.encoder.layers.6.self_attn.v_proj.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.10 GB, cond_stage_model.transformer.text_model.encoder.layers.6.self_attn.v_proj.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.10 GB, cond_stage_model.transformer.text_model.encoder.layers.6.self_attn.q_proj.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.10 GB, cond_stage_model.transformer.text_model.encoder.layers.6.self_attn.q_proj.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.10 GB, cond_stage_model.transformer.text_model.encoder.layers.6.self_attn.out_proj.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.10 GB, cond_stage_model.transformer.text_model.encoder.layers.6.self_attn.out_proj.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.10 GB, cond_stage_model.transformer.text_model.encoder.layers.6.layer_norm1.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]     
ram used:  4.10 GB, cond_stage_model.transformer.text_model.encoder.layers.6.layer_norm1.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.10 GB, cond_stage_model.transformer.text_model.encoder.layers.6.mlp.fc1.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.11 GB, cond_stage_model.transformer.text_model.encoder.layers.6.mlp.fc1.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.11 GB, cond_stage_model.transformer.text_model.encoder.layers.6.mlp.fc2.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.12 GB, cond_stage_model.transformer.text_model.encoder.layers.6.mlp.fc2.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.12 GB, cond_stage_model.transformer.text_model.encoder.layers.6.layer_norm2.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.12 GB, cond_stage_model.transformer.text_model.encoder.layers.6.layer_norm2.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.12 GB, cond_stage_model.transformer.text_model.encoder.layers.7.self_attn.k_proj.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.13 GB, cond_stage_model.transformer.text_model.encoder.layers.7.self_attn.k_proj.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.13 GB, cond_stage_model.transformer.text_model.encoder.layers.7.self_attn.v_proj.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.13 GB, cond_stage_model.transformer.text_model.encoder.layers.7.self_attn.v_proj.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.13 GB, cond_stage_model.transformer.text_model.encoder.layers.7.self_attn.q_proj.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.13 GB, cond_stage_model.transformer.text_model.encoder.layers.7.self_attn.q_proj.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.13 GB, cond_stage_model.transformer.text_model.encoder.layers.7.self_attn.out_proj.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.13 GB, cond_stage_model.transformer.text_model.encoder.layers.7.self_attn.out_proj.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.13 GB, cond_stage_model.transformer.text_model.encoder.layers.7.layer_norm1.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]     
ram used:  4.13 GB, cond_stage_model.transformer.text_model.encoder.layers.7.layer_norm1.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.13 GB, cond_stage_model.transformer.text_model.encoder.layers.7.mlp.fc1.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.14 GB, cond_stage_model.transformer.text_model.encoder.layers.7.mlp.fc1.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.14 GB, cond_stage_model.transformer.text_model.encoder.layers.7.mlp.fc2.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.15 GB, cond_stage_model.transformer.text_model.encoder.layers.7.mlp.fc2.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.15 GB, cond_stage_model.transformer.text_model.encoder.layers.7.layer_norm2.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.15 GB, cond_stage_model.transformer.text_model.encoder.layers.7.layer_norm2.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.15 GB, cond_stage_model.transformer.text_model.encoder.layers.8.self_attn.k_proj.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.15 GB, cond_stage_model.transformer.text_model.encoder.layers.8.self_attn.k_proj.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.15 GB, cond_stage_model.transformer.text_model.encoder.layers.8.self_attn.v_proj.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.16 GB, cond_stage_model.transformer.text_model.encoder.layers.8.self_attn.v_proj.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.16 GB, cond_stage_model.transformer.text_model.encoder.layers.8.self_attn.q_proj.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.16 GB, cond_stage_model.transformer.text_model.encoder.layers.8.self_attn.q_proj.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.16 GB, cond_stage_model.transformer.text_model.encoder.layers.8.self_attn.out_proj.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.16 GB, cond_stage_model.transformer.text_model.encoder.layers.8.self_attn.out_proj.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.16 GB, cond_stage_model.transformer.text_model.encoder.layers.8.layer_norm1.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]     
ram used:  4.16 GB, cond_stage_model.transformer.text_model.encoder.layers.8.layer_norm1.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.16 GB, cond_stage_model.transformer.text_model.encoder.layers.8.mlp.fc1.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.17 GB, cond_stage_model.transformer.text_model.encoder.layers.8.mlp.fc1.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.17 GB, cond_stage_model.transformer.text_model.encoder.layers.8.mlp.fc2.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.18 GB, cond_stage_model.transformer.text_model.encoder.layers.8.mlp.fc2.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.18 GB, cond_stage_model.transformer.text_model.encoder.layers.8.layer_norm2.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.18 GB, cond_stage_model.transformer.text_model.encoder.layers.8.layer_norm2.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.18 GB, cond_stage_model.transformer.text_model.encoder.layers.9.self_attn.k_proj.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.18 GB, cond_stage_model.transformer.text_model.encoder.layers.9.self_attn.k_proj.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.18 GB, cond_stage_model.transformer.text_model.encoder.layers.9.self_attn.v_proj.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.18 GB, cond_stage_model.transformer.text_model.encoder.layers.9.self_attn.v_proj.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.18 GB, cond_stage_model.transformer.text_model.encoder.layers.9.self_attn.q_proj.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.19 GB, cond_stage_model.transformer.text_model.encoder.layers.9.self_attn.q_proj.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.19 GB, cond_stage_model.transformer.text_model.encoder.layers.9.self_attn.out_proj.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.19 GB, cond_stage_model.transformer.text_model.encoder.layers.9.self_attn.out_proj.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.19 GB, cond_stage_model.transformer.text_model.encoder.layers.9.layer_norm1.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]     
ram used:  4.19 GB, cond_stage_model.transformer.text_model.encoder.layers.9.layer_norm1.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.19 GB, cond_stage_model.transformer.text_model.encoder.layers.9.mlp.fc1.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.20 GB, cond_stage_model.transformer.text_model.encoder.layers.9.mlp.fc1.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.20 GB, cond_stage_model.transformer.text_model.encoder.layers.9.mlp.fc2.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.21 GB, cond_stage_model.transformer.text_model.encoder.layers.9.mlp.fc2.bias:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]  
ram used:  4.21 GB, cond_stage_model.transformer.text_model.encoder.layers.9.layer_norm2.weight:  90%|████████▉ | 1014/1131 [00:02<00:00, 714.28it/s]
ram used:  4.21 GB, cond_stage_model.transformer.text_model.encoder.layers.9.layer_norm2.weight:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]
ram used:  4.21 GB, cond_stage_model.transformer.text_model.encoder.layers.9.layer_norm2.bias:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]  
ram used:  4.21 GB, cond_stage_model.transformer.text_model.encoder.layers.10.self_attn.k_proj.weight:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]
ram used:  4.21 GB, cond_stage_model.transformer.text_model.encoder.layers.10.self_attn.k_proj.bias:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]  
ram used:  4.21 GB, cond_stage_model.transformer.text_model.encoder.layers.10.self_attn.v_proj.weight:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]
ram used:  4.21 GB, cond_stage_model.transformer.text_model.encoder.layers.10.self_attn.v_proj.bias:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]  
ram used:  4.21 GB, cond_stage_model.transformer.text_model.encoder.layers.10.self_attn.q_proj.weight:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]
ram used:  4.22 GB, cond_stage_model.transformer.text_model.encoder.layers.10.self_attn.q_proj.bias:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]  
ram used:  4.22 GB, cond_stage_model.transformer.text_model.encoder.layers.10.self_attn.out_proj.weight:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]
ram used:  4.22 GB, cond_stage_model.transformer.text_model.encoder.layers.10.self_attn.out_proj.bias:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]  
ram used:  4.22 GB, cond_stage_model.transformer.text_model.encoder.layers.10.layer_norm1.weight:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]     
ram used:  4.22 GB, cond_stage_model.transformer.text_model.encoder.layers.10.layer_norm1.bias:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]  
ram used:  4.22 GB, cond_stage_model.transformer.text_model.encoder.layers.10.mlp.fc1.weight:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]  
ram used:  4.23 GB, cond_stage_model.transformer.text_model.encoder.layers.10.mlp.fc1.bias:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]  
ram used:  4.23 GB, cond_stage_model.transformer.text_model.encoder.layers.10.mlp.fc2.weight:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]
ram used:  4.24 GB, cond_stage_model.transformer.text_model.encoder.layers.10.mlp.fc2.bias:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]  
ram used:  4.24 GB, cond_stage_model.transformer.text_model.encoder.layers.10.layer_norm2.weight:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]
ram used:  4.24 GB, cond_stage_model.transformer.text_model.encoder.layers.10.layer_norm2.bias:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]  
ram used:  4.24 GB, cond_stage_model.transformer.text_model.encoder.layers.11.self_attn.k_proj.weight:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]
ram used:  4.24 GB, cond_stage_model.transformer.text_model.encoder.layers.11.self_attn.k_proj.bias:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]  
ram used:  4.24 GB, cond_stage_model.transformer.text_model.encoder.layers.11.self_attn.v_proj.weight:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]
ram used:  4.24 GB, cond_stage_model.transformer.text_model.encoder.layers.11.self_attn.v_proj.bias:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]  
ram used:  4.24 GB, cond_stage_model.transformer.text_model.encoder.layers.11.self_attn.q_proj.weight:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]
ram used:  4.24 GB, cond_stage_model.transformer.text_model.encoder.layers.11.self_attn.q_proj.bias:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]  
ram used:  4.24 GB, cond_stage_model.transformer.text_model.encoder.layers.11.self_attn.out_proj.weight:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]
ram used:  4.25 GB, cond_stage_model.transformer.text_model.encoder.layers.11.self_attn.out_proj.bias:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]  
ram used:  4.25 GB, cond_stage_model.transformer.text_model.encoder.layers.11.layer_norm1.weight:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]     
ram used:  4.25 GB, cond_stage_model.transformer.text_model.encoder.layers.11.layer_norm1.bias:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]  
ram used:  4.25 GB, cond_stage_model.transformer.text_model.encoder.layers.11.mlp.fc1.weight:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]  
ram used:  4.26 GB, cond_stage_model.transformer.text_model.encoder.layers.11.mlp.fc1.bias:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]  
ram used:  4.26 GB, cond_stage_model.transformer.text_model.encoder.layers.11.mlp.fc2.weight:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]
ram used:  4.26 GB, cond_stage_model.transformer.text_model.encoder.layers.11.mlp.fc2.bias:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]  
ram used:  4.26 GB, cond_stage_model.transformer.text_model.encoder.layers.11.layer_norm2.weight:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]
ram used:  4.26 GB, cond_stage_model.transformer.text_model.encoder.layers.11.layer_norm2.bias:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]  
ram used:  4.26 GB, cond_stage_model.transformer.text_model.final_layer_norm.weight:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]           
ram used:  4.26 GB, cond_stage_model.transformer.text_model.final_layer_norm.bias:  97%|█████████▋| 1096/1131 [00:02<00:00, 742.75it/s]  
ram used:  4.26 GB, cond_stage_model.transformer.text_model.final_layer_norm.bias: 100%|██████████| 1131/1131 [00:02<00:00, 440.76it/s]
loaded weights in 2571.99 ms, 4.26 GB loaded at 1.66 GB/s
got CLIP context (1, 77, 768)
got unconditional CLIP context (1, 77, 768)
running for [1, 201, 401, 601, 801] timesteps
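The five timesteps printed above look like an evenly spaced subset of the diffusion schedule. A minimal sketch, assuming 1000 total diffusion timesteps and 5 sampling steps (both assumptions, inferred from the log, not taken from the script):

```python
# Sketch: evenly spaced sampling timesteps, assuming 1000 total
# diffusion timesteps and 5 sampling steps as the log suggests.
num_timesteps = 1000
steps = 5
timesteps = list(range(1, num_timesteps, num_timesteps // steps))
print(timesteps)  # → [1, 201, 401, 601, 801]
```

Note the run then iterates these in reverse (801 down to 1), matching the `4 801` through `0   1` descriptions on the progress bar.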

  0%|          | 0/5 [00:00<?, ?it/s]
  4 801:   0%|          | 0/5 [00:00<?, ?it/s]
  4 801:  20%|██        | 1/5 [00:13<00:55, 13.99s/it]
  3 601:  20%|██        | 1/5 [00:13<00:55, 13.99s/it]
  3 601:  40%|████      | 2/5 [00:14<00:18,  6.01s/it]
  2 401:  40%|████      | 2/5 [00:14<00:18,  6.01s/it]
  1 201:  40%|████      | 2/5 [00:14<00:18,  6.01s/it]
  0   1:  40%|████      | 2/5 [00:14<00:18,  6.01s/it]
  0   1: 100%|██████████| 5/5 [00:14<00:00,  2.88s/it]
decode (1, 512, 64, 64)
decode (1, 512, 128, 128)
decode (1, 512, 256, 256)
decode (1, 256, 512, 512)
(512, 512, 3)
saving /tmp/rendered.png
Error: no "view" mailcap rules found for type "image/png"
/usr/bin/xdg-open: 882: www-browser: not found
/usr/bin/xdg-open: 882: links2: not found
/usr/bin/xdg-open: 882: elinks: not found
/usr/bin/xdg-open: 882: links: not found
/usr/bin/xdg-open: 882: lynx: not found
/usr/bin/xdg-open: 882: w3m: not found
xdg-open: no method available for opening '/tmp/tmphj455rbh.PNG'
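The weight-loading summary earlier in this run reports 4.26 GB loaded in 2571.99 ms at 1.66 GB/s; the quoted rate is just size over wall time, which checks out:

```python
# Throughput check for the "loaded weights" summary line:
# 4.26 GB over 2571.99 ms should give the reported 1.66 GB/s.
gb = 4.26
ms = 2571.99
rate = gb / (ms / 1000)
print(f"{rate:.2f} GB/s")  # → 1.66 GB/s
```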

train_efficientnet.py

parameter count 296
training with batch size 16 for 2048 steps
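The remaining-time estimates in the progress lines below follow tqdm's usual arithmetic: remaining iterations divided by the recent iteration rate. A quick sketch using figures taken from the log (tqdm smooths the rate, so the real bar's estimate can differ slightly):

```python
# ETA arithmetic behind the tqdm remaining-time display, using
# figures from the log: around step 58 of 2048 at ~1.37 it/s the
# bar shows roughly 24 minutes remaining.
total_steps = 2048
done = 58
rate = 1.37  # iterations per second, from the log
eta_s = (total_steps - done) / rate
print(f"{int(eta_s // 60)}:{int(eta_s % 60):02d} remaining")
```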

  0%|          | 0/2048 [00:00<?, ?it/s]
loss 2.68 accuracy 0.00 -- 60.07 + 59.54 + 31476.56 + 219.46 = 31815.64:   0%|          | 0/2048 [00:31<?, ?it/s]
loss 2.68 accuracy 0.00 -- 60.07 + 59.54 + 31476.56 + 219.46 = 31815.64:   0%|          | 1/2048 [00:31<18:06:37, 31.85s/it]
loss 2.68 accuracy 0.06 -- 54.64 + 153.94 + 14521.00 + 5.87 = 14735.45:   0%|          | 1/2048 [00:46<18:06:37, 31.85s/it] 
loss 2.68 accuracy 0.06 -- 54.64 + 153.94 + 14521.00 + 5.87 = 14735.45:   0%|          | 2/2048 [00:46<12:23:21, 21.80s/it]
loss 2.92 accuracy 0.12 -- 55.55 + 56.57 + 620.43 + 4.97 = 737.52:   0%|          | 2/2048 [00:47<12:23:21, 21.80s/it]     
loss 2.92 accuracy 0.12 -- 55.55 + 56.57 + 620.43 + 4.97 = 737.52:   0%|          | 3/2048 [00:47<6:55:39, 12.20s/it] 
loss 3.24 accuracy 0.12 -- 56.70 + 56.73 + 501.23 + 5.02 = 619.68:   0%|          | 3/2048 [00:48<6:55:39, 12.20s/it]
loss 3.24 accuracy 0.12 -- 56.70 + 56.73 + 501.23 + 5.02 = 619.68:   0%|          | 4/2048 [00:48<4:20:08,  7.64s/it]
loss 2.99 accuracy 0.19 -- 166.41 + 57.14 + 505.88 + 4.99 = 734.43:   0%|          | 4/2048 [00:48<4:20:08,  7.64s/it]
loss 2.99 accuracy 0.19 -- 166.41 + 57.14 + 505.88 + 4.99 = 734.43:   0%|          | 5/2048 [00:48<2:55:36,  5.16s/it]
loss 3.15 accuracy 0.12 -- 57.06 + 57.74 + 633.05 + 5.03 = 752.89:   0%|          | 5/2048 [00:49<2:55:36,  5.16s/it] 
loss 3.15 accuracy 0.12 -- 57.06 + 57.74 + 633.05 + 5.03 = 752.89:   0%|          | 6/2048 [00:49<2:04:52,  3.67s/it]
loss 3.58 accuracy 0.19 -- 57.63 + 57.02 + 513.18 + 5.00 = 632.83:   0%|          | 6/2048 [00:50<2:04:52,  3.67s/it]
loss 3.58 accuracy 0.19 -- 57.63 + 57.02 + 513.18 + 5.00 = 632.83:   0%|          | 7/2048 [00:50<1:31:21,  2.69s/it]
loss 3.65 accuracy 0.12 -- 56.84 + 57.49 + 629.53 + 5.02 = 748.88:   0%|          | 7/2048 [00:51<1:31:21,  2.69s/it]
loss 3.65 accuracy 0.12 -- 56.84 + 57.49 + 629.53 + 5.02 = 748.88:   0%|          | 8/2048 [00:51<1:10:38,  2.08s/it]
loss 3.52 accuracy 0.12 -- 57.34 + 57.04 + 511.09 + 4.98 = 630.45:   0%|          | 8/2048 [00:51<1:10:38,  2.08s/it]
loss 3.52 accuracy 0.12 -- 57.34 + 57.04 + 511.09 + 4.98 = 630.45:   0%|          | 9/2048 [00:51<55:31,  1.63s/it]  
loss 3.62 accuracy 0.12 -- 56.32 + 170.81 + 509.88 + 4.98 = 741.99:   0%|          | 9/2048 [00:52<55:31,  1.63s/it]
loss 3.62 accuracy 0.12 -- 56.32 + 170.81 + 509.88 + 4.98 = 741.99:   0%|          | 10/2048 [00:52<46:48,  1.38s/it]
loss 3.23 accuracy 0.06 -- 56.67 + 56.50 + 509.40 + 4.98 = 627.56:   0%|          | 10/2048 [00:53<46:48,  1.38s/it] 
loss 3.23 accuracy 0.06 -- 56.67 + 56.50 + 509.40 + 4.98 = 627.56:   1%|          | 11/2048 [00:53<39:16,  1.16s/it]
loss 3.66 accuracy 0.00 -- 57.39 + 57.07 + 506.83 + 5.00 = 626.28:   1%|          | 11/2048 [00:53<39:16,  1.16s/it]
loss 3.66 accuracy 0.00 -- 57.39 + 57.07 + 506.83 + 5.00 = 626.28:   1%|          | 12/2048 [00:53<35:12,  1.04s/it]
loss 3.66 accuracy 0.06 -- 56.38 + 57.46 + 501.93 + 4.92 = 620.70:   1%|          | 12/2048 [00:54<35:12,  1.04s/it]
loss 3.66 accuracy 0.06 -- 56.38 + 57.46 + 501.93 + 4.92 = 620.70:   1%|          | 13/2048 [00:54<31:11,  1.09it/s]
loss 2.83 accuracy 0.19 -- 161.58 + 56.90 + 497.35 + 4.97 = 720.81:   1%|          | 13/2048 [00:55<31:11,  1.09it/s]
loss 2.83 accuracy 0.19 -- 161.58 + 56.90 + 497.35 + 4.97 = 720.81:   1%|          | 14/2048 [00:55<29:25,  1.15it/s]
loss 2.59 accuracy 0.25 -- 56.46 + 171.02 + 510.33 + 5.01 = 742.81:   1%|          | 14/2048 [00:56<29:25,  1.15it/s]
loss 2.59 accuracy 0.25 -- 56.46 + 171.02 + 510.33 + 5.01 = 742.81:   1%|          | 15/2048 [00:56<28:24,  1.19it/s]
loss 4.20 accuracy 0.00 -- 57.55 + 57.12 + 506.81 + 4.95 = 626.43:   1%|          | 15/2048 [00:56<28:24,  1.19it/s] 
loss 4.20 accuracy 0.00 -- 57.55 + 57.12 + 506.81 + 4.95 = 626.43:   1%|          | 16/2048 [00:56<26:31,  1.28it/s]
loss 3.20 accuracy 0.25 -- 166.51 + 57.89 + 506.22 + 4.98 = 735.60:   1%|          | 16/2048 [00:57<26:31,  1.28it/s]
loss 3.20 accuracy 0.25 -- 166.51 + 57.89 + 506.22 + 4.98 = 735.60:   1%|          | 17/2048 [00:57<26:40,  1.27it/s]
loss 2.69 accuracy 0.19 -- 56.37 + 57.45 + 632.64 + 4.96 = 751.41:   1%|          | 17/2048 [00:58<26:40,  1.27it/s] 
loss 2.69 accuracy 0.19 -- 56.37 + 57.45 + 632.64 + 4.96 = 751.41:   1%|          | 18/2048 [00:58<26:33,  1.27it/s]
loss 3.83 accuracy 0.12 -- 57.34 + 56.92 + 513.08 + 4.96 = 632.30:   1%|          | 18/2048 [00:58<26:33,  1.27it/s]
loss 3.83 accuracy 0.12 -- 57.34 + 56.92 + 513.08 + 4.96 = 632.30:   1%|          | 19/2048 [00:58<25:16,  1.34it/s]
loss 3.53 accuracy 0.00 -- 56.67 + 57.76 + 627.03 + 4.96 = 746.42:   1%|          | 19/2048 [00:59<25:16,  1.34it/s]
loss 3.53 accuracy 0.00 -- 56.67 + 57.76 + 627.03 + 4.96 = 746.42:   1%|          | 20/2048 [00:59<25:32,  1.32it/s]
loss 2.85 accuracy 0.12 -- 57.14 + 56.67 + 511.39 + 5.01 = 630.21:   1%|          | 20/2048 [01:00<25:32,  1.32it/s]
loss 2.85 accuracy 0.12 -- 57.14 + 56.67 + 511.39 + 5.01 = 630.21:   1%|          | 21/2048 [01:00<24:32,  1.38it/s]
loss 3.75 accuracy 0.00 -- 56.73 + 170.30 + 510.66 + 4.94 = 742.63:   1%|          | 21/2048 [01:01<24:32,  1.38it/s]
loss 3.75 accuracy 0.00 -- 56.73 + 170.30 + 510.66 + 4.94 = 742.63:   1%|          | 22/2048 [01:01<24:58,  1.35it/s]
loss 3.16 accuracy 0.00 -- 56.37 + 56.61 + 508.78 + 4.96 = 626.71:   1%|          | 22/2048 [01:01<24:58,  1.35it/s] 
loss 3.16 accuracy 0.00 -- 56.37 + 56.61 + 508.78 + 4.96 = 626.71:   1%|          | 23/2048 [01:01<24:06,  1.40it/s]
loss 2.89 accuracy 0.25 -- 57.06 + 56.84 + 506.63 + 4.96 = 625.48:   1%|          | 23/2048 [01:02<24:06,  1.40it/s]
loss 2.89 accuracy 0.25 -- 57.06 + 56.84 + 506.63 + 4.96 = 625.48:   1%|          | 24/2048 [01:02<24:35,  1.37it/s]
loss 3.16 accuracy 0.19 -- 56.42 + 57.74 + 510.59 + 4.93 = 629.68:   1%|          | 24/2048 [01:03<24:35,  1.37it/s]
loss 3.16 accuracy 0.19 -- 56.42 + 57.74 + 510.59 + 4.93 = 629.68:   1%|          | 25/2048 [01:03<23:51,  1.41it/s]
loss 3.20 accuracy 0.06 -- 161.59 + 57.25 + 497.86 + 4.92 = 721.62:   1%|          | 25/2048 [01:03<23:51,  1.41it/s]
loss 3.20 accuracy 0.06 -- 161.59 + 57.25 + 497.86 + 4.92 = 721.62:   1%|▏         | 26/2048 [01:03<24:16,  1.39it/s]
loss 3.72 accuracy 0.19 -- 56.32 + 171.12 + 509.18 + 4.96 = 741.58:   1%|▏         | 26/2048 [01:04<24:16,  1.39it/s]
loss 3.72 accuracy 0.19 -- 56.32 + 171.12 + 509.18 + 4.96 = 741.58:   1%|▏         | 27/2048 [01:04<24:45,  1.36it/s]
loss 2.21 accuracy 0.25 -- 57.25 + 56.99 + 505.71 + 4.99 = 624.95:   1%|▏         | 27/2048 [01:05<24:45,  1.36it/s] 
loss 2.21 accuracy 0.25 -- 57.25 + 56.99 + 505.71 + 4.99 = 624.95:   1%|▏         | 28/2048 [01:05<23:54,  1.41it/s]
loss 2.45 accuracy 0.12 -- 166.51 + 57.68 + 504.50 + 4.97 = 733.66:   1%|▏         | 28/2048 [01:06<23:54,  1.41it/s]
loss 2.45 accuracy 0.12 -- 166.51 + 57.68 + 504.50 + 4.97 = 733.66:   1%|▏         | 29/2048 [01:06<24:24,  1.38it/s]
loss 2.34 accuracy 0.31 -- 56.83 + 57.68 + 631.80 + 4.96 = 751.27:   1%|▏         | 29/2048 [01:06<24:24,  1.38it/s] 
loss 2.34 accuracy 0.31 -- 56.83 + 57.68 + 631.80 + 4.96 = 751.27:   1%|▏         | 30/2048 [01:06<24:56,  1.35it/s]
loss 3.25 accuracy 0.12 -- 57.14 + 56.79 + 512.39 + 5.04 = 631.36:   1%|▏         | 30/2048 [01:07<24:56,  1.35it/s]
loss 3.25 accuracy 0.12 -- 57.14 + 56.79 + 512.39 + 5.04 = 631.36:   2%|▏         | 31/2048 [01:07<24:06,  1.39it/s]
loss 2.57 accuracy 0.25 -- 56.36 + 57.58 + 627.26 + 5.01 = 746.21:   2%|▏         | 31/2048 [01:08<24:06,  1.39it/s]
loss 2.57 accuracy 0.25 -- 56.36 + 57.58 + 627.26 + 5.01 = 746.21:   2%|▏         | 32/2048 [01:08<24:39,  1.36it/s]
loss 3.08 accuracy 0.06 -- 57.02 + 56.94 + 512.38 + 4.95 = 631.28:   2%|▏         | 32/2048 [01:09<24:39,  1.36it/s]
loss 3.08 accuracy 0.06 -- 57.02 + 56.94 + 512.38 + 4.95 = 631.28:   2%|▏         | 33/2048 [01:09<23:53,  1.41it/s]
loss 2.50 accuracy 0.19 -- 56.46 + 170.47 + 510.74 + 4.92 = 742.59:   2%|▏         | 33/2048 [01:09<23:53,  1.41it/s]
loss 2.50 accuracy 0.19 -- 56.46 + 170.47 + 510.74 + 4.92 = 742.59:   2%|▏         | 34/2048 [01:09<24:28,  1.37it/s]
loss 2.49 accuracy 0.25 -- 56.65 + 56.58 + 509.35 + 4.95 = 627.53:   2%|▏         | 34/2048 [01:10<24:28,  1.37it/s] 
loss 2.49 accuracy 0.25 -- 56.65 + 56.58 + 509.35 + 4.95 = 627.53:   2%|▏         | 35/2048 [01:10<23:42,  1.41it/s]
loss 2.66 accuracy 0.19 -- 57.15 + 56.66 + 506.31 + 4.96 = 625.09:   2%|▏         | 35/2048 [01:11<23:42,  1.41it/s]
loss 2.66 accuracy 0.19 -- 57.15 + 56.66 + 506.31 + 4.96 = 625.09:   2%|▏         | 36/2048 [01:11<24:16,  1.38it/s]
loss 2.45 accuracy 0.06 -- 56.72 + 57.53 + 503.47 + 4.93 = 622.65:   2%|▏         | 36/2048 [01:11<24:16,  1.38it/s]
loss 2.45 accuracy 0.06 -- 56.72 + 57.53 + 503.47 + 4.93 = 622.65:   2%|▏         | 37/2048 [01:11<23:31,  1.42it/s]
loss 2.52 accuracy 0.12 -- 161.43 + 57.13 + 498.34 + 4.95 = 721.86:   2%|▏         | 37/2048 [01:12<23:31,  1.42it/s]
loss 2.52 accuracy 0.12 -- 161.43 + 57.13 + 498.34 + 4.95 = 721.86:   2%|▏         | 38/2048 [01:12<23:59,  1.40it/s]
loss 2.90 accuracy 0.19 -- 56.60 + 172.00 + 510.17 + 4.94 = 743.70:   2%|▏         | 38/2048 [01:13<23:59,  1.40it/s]
loss 2.90 accuracy 0.19 -- 56.60 + 172.00 + 510.17 + 4.94 = 743.70:   2%|▏         | 39/2048 [01:13<24:32,  1.36it/s]
loss 3.13 accuracy 0.12 -- 57.61 + 56.68 + 507.47 + 4.94 = 626.71:   2%|▏         | 39/2048 [01:14<24:32,  1.36it/s] 
loss 3.13 accuracy 0.12 -- 57.61 + 56.68 + 507.47 + 4.94 = 626.71:   2%|▏         | 40/2048 [01:14<23:44,  1.41it/s]
loss 2.63 accuracy 0.12 -- 166.51 + 57.57 + 504.92 + 4.96 = 733.95:   2%|▏         | 40/2048 [01:14<23:44,  1.41it/s]
loss 2.63 accuracy 0.12 -- 166.51 + 57.57 + 504.92 + 4.96 = 733.95:   2%|▏         | 41/2048 [01:14<24:14,  1.38it/s]
loss 3.17 accuracy 0.06 -- 56.51 + 57.76 + 632.84 + 4.97 = 752.08:   2%|▏         | 41/2048 [01:15<24:14,  1.38it/s] 
loss 3.17 accuracy 0.06 -- 56.51 + 57.76 + 632.84 + 4.97 = 752.08:   2%|▏         | 42/2048 [01:15<24:46,  1.35it/s]
loss 2.67 accuracy 0.19 -- 57.36 + 57.28 + 515.07 + 4.93 = 634.63:   2%|▏         | 42/2048 [01:16<24:46,  1.35it/s]
loss 2.67 accuracy 0.19 -- 57.36 + 57.28 + 515.07 + 4.93 = 634.63:   2%|▏         | 43/2048 [01:16<23:58,  1.39it/s]
loss 2.22 accuracy 0.25 -- 57.00 + 57.66 + 626.79 + 4.93 = 746.37:   2%|▏         | 43/2048 [01:17<23:58,  1.39it/s]
loss 2.22 accuracy 0.25 -- 57.00 + 57.66 + 626.79 + 4.93 = 746.37:   2%|▏         | 44/2048 [01:17<24:31,  1.36it/s]
loss 2.57 accuracy 0.19 -- 57.19 + 56.80 + 509.33 + 4.97 = 628.30:   2%|▏         | 44/2048 [01:17<24:31,  1.36it/s]
loss 2.57 accuracy 0.19 -- 57.19 + 56.80 + 509.33 + 4.97 = 628.30:   2%|▏         | 45/2048 [01:17<23:43,  1.41it/s]
loss 2.96 accuracy 0.12 -- 56.87 + 171.29 + 512.56 + 4.99 = 745.72:   2%|▏         | 45/2048 [01:18<23:43,  1.41it/s]
loss 2.96 accuracy 0.12 -- 56.87 + 171.29 + 512.56 + 4.99 = 745.72:   2%|▏         | 46/2048 [01:18<24:20,  1.37it/s]
loss 2.29 accuracy 0.19 -- 56.80 + 56.98 + 508.70 + 4.93 = 627.41:   2%|▏         | 46/2048 [01:19<24:20,  1.37it/s] 
loss 2.29 accuracy 0.19 -- 56.80 + 56.98 + 508.70 + 4.93 = 627.41:   2%|▏         | 47/2048 [01:19<23:35,  1.41it/s]
loss 2.38 accuracy 0.31 -- 56.89 + 56.38 + 505.75 + 4.96 = 623.98:   2%|▏         | 47/2048 [01:19<23:35,  1.41it/s]
loss 2.38 accuracy 0.31 -- 56.89 + 56.38 + 505.75 + 4.96 = 623.98:   2%|▏         | 48/2048 [01:19<24:07,  1.38it/s]
loss 2.82 accuracy 0.12 -- 56.45 + 57.53 + 503.36 + 4.99 = 622.34:   2%|▏         | 48/2048 [01:20<24:07,  1.38it/s]
loss 2.82 accuracy 0.12 -- 56.45 + 57.53 + 503.36 + 4.99 = 622.34:   2%|▏         | 49/2048 [01:20<23:22,  1.43it/s]
loss 2.84 accuracy 0.06 -- 161.31 + 57.22 + 498.24 + 4.95 = 721.72:   2%|▏         | 49/2048 [01:21<23:22,  1.43it/s]
loss 2.84 accuracy 0.06 -- 161.31 + 57.22 + 498.24 + 4.95 = 721.72:   2%|▏         | 50/2048 [01:21<23:50,  1.40it/s]
loss 2.82 accuracy 0.25 -- 56.49 + 170.20 + 509.57 + 4.94 = 741.20:   2%|▏         | 50/2048 [01:22<23:50,  1.40it/s]
loss 2.82 accuracy 0.25 -- 56.49 + 170.20 + 509.57 + 4.94 = 741.20:   2%|▏         | 51/2048 [01:22<24:21,  1.37it/s]
loss 2.55 accuracy 0.06 -- 57.40 + 56.92 + 505.32 + 4.97 = 624.61:   2%|▏         | 51/2048 [01:22<24:21,  1.37it/s] 
loss 2.55 accuracy 0.06 -- 57.40 + 56.92 + 505.32 + 4.97 = 624.61:   3%|▎         | 52/2048 [01:22<23:33,  1.41it/s]
loss 2.79 accuracy 0.06 -- 166.14 + 57.77 + 505.58 + 4.95 = 734.43:   3%|▎         | 52/2048 [01:23<23:33,  1.41it/s]
loss 2.79 accuracy 0.06 -- 166.14 + 57.77 + 505.58 + 4.95 = 734.43:   3%|▎         | 53/2048 [01:23<24:04,  1.38it/s]
loss 3.23 accuracy 0.00 -- 56.58 + 57.94 + 635.67 + 4.95 = 755.14:   3%|▎         | 53/2048 [01:24<24:04,  1.38it/s] 
loss 3.23 accuracy 0.00 -- 56.58 + 57.94 + 635.67 + 4.95 = 755.14:   3%|▎         | 54/2048 [01:24<24:39,  1.35it/s]
loss 2.25 accuracy 0.25 -- 57.36 + 56.66 + 514.99 + 4.96 = 633.97:   3%|▎         | 54/2048 [01:24<24:39,  1.35it/s]
loss 2.25 accuracy 0.25 -- 57.36 + 56.66 + 514.99 + 4.96 = 633.97:   3%|▎         | 55/2048 [01:24<23:50,  1.39it/s]
loss 2.72 accuracy 0.00 -- 56.95 + 57.45 + 626.68 + 4.96 = 746.04:   3%|▎         | 56/2048 [01:25<24:23,  1.36it/s]
loss 2.89 accuracy 0.19 -- 57.37 + 56.63 + 510.67 + 4.92 = 629.59:   3%|▎         | 57/2048 [01:26<23:36,  1.41it/s]
loss 2.52 accuracy 0.19 -- 56.62 + 169.89 + 510.45 + 4.96 = 741.92:   3%|▎         | 58/2048 [01:27<24:10,  1.37it/s]
loss 2.18 accuracy 0.19 -- 56.57 + 56.86 + 508.15 + 4.95 = 626.53:   3%|▎         | 59/2048 [01:27<23:25,  1.42it/s]
loss 2.77 accuracy 0.19 -- 57.27 + 56.81 + 505.29 + 4.92 = 624.28:   3%|▎         | 60/2048 [01:28<23:57,  1.38it/s]
loss 2.40 accuracy 0.06 -- 56.41 + 57.76 + 504.98 + 4.95 = 624.09:   3%|▎         | 61/2048 [01:29<23:14,  1.42it/s]
loss 2.54 accuracy 0.25 -- 161.78 + 57.10 + 497.81 + 4.95 = 721.64:   3%|▎         | 62/2048 [01:29<23:42,  1.40it/s]
loss 1.99 accuracy 0.31 -- 56.19 + 169.55 + 508.93 + 4.96 = 739.63:   3%|▎         | 63/2048 [01:30<24:12,  1.37it/s]
loss 2.31 accuracy 0.19 -- 57.19 + 56.49 + 504.98 + 4.96 = 623.62:   3%|▎         | 64/2048 [01:31<23:24,  1.41it/s]
loss 2.24 accuracy 0.25 -- 165.95 + 57.86 + 503.87 + 4.96 = 732.64:   3%|▎         | 65/2048 [01:32<23:54,  1.38it/s]
loss 2.30 accuracy 0.06 -- 56.39 + 57.30 + 634.27 + 4.99 = 752.95:   3%|▎         | 66/2048 [01:32<24:28,  1.35it/s]
loss 2.42 accuracy 0.12 -- 57.70 + 57.06 + 512.80 + 4.92 = 632.48:   3%|▎         | 67/2048 [01:33<23:39,  1.40it/s]
loss 2.42 accuracy 0.19 -- 56.63 + 57.77 + 627.63 + 4.95 = 746.98:   3%|▎         | 68/2048 [01:34<24:13,  1.36it/s]
loss 2.10 accuracy 0.19 -- 57.41 + 57.06 + 510.26 + 4.95 = 629.68:   3%|▎         | 69/2048 [01:34<23:27,  1.41it/s]
loss 2.33 accuracy 0.25 -- 56.52 + 169.98 + 509.45 + 4.97 = 740.92:   3%|▎         | 70/2048 [01:35<24:21,  1.35it/s]
loss 2.46 accuracy 0.25 -- 56.48 + 56.69 + 507.07 + 4.96 = 625.19:   3%|▎         | 71/2048 [01:36<23:30,  1.40it/s]
loss 2.55 accuracy 0.25 -- 57.14 + 57.07 + 506.64 + 4.92 = 625.77:   4%|▎         | 72/2048 [01:37<23:59,  1.37it/s]
loss 2.54 accuracy 0.19 -- 56.46 + 57.77 + 502.40 + 4.94 = 621.57:   4%|▎         | 73/2048 [01:37<23:11,  1.42it/s]
loss 2.63 accuracy 0.06 -- 161.08 + 57.27 + 498.39 + 4.91 = 721.65:   4%|▎         | 74/2048 [01:38<23:37,  1.39it/s]
loss 2.35 accuracy 0.06 -- 56.17 + 170.50 + 508.45 + 4.92 = 740.04:   4%|▎         | 75/2048 [01:39<24:06,  1.36it/s]
loss 2.07 accuracy 0.19 -- 56.85 + 56.61 + 504.88 + 4.88 = 623.21:   4%|▎         | 76/2048 [01:40<23:17,  1.41it/s]
loss 2.45 accuracy 0.12 -- 299.43 + 102.71 + 563.90 + 5.17 = 971.21:   4%|▍         | 77/2048 [01:41<26:09,  1.26it/s]
loss 2.24 accuracy 0.12 -- 60.17 + 60.60 + 666.19 + 5.00 = 791.96:   4%|▍         | 78/2048 [01:41<26:24,  1.24it/s]
loss 2.33 accuracy 0.12 -- 58.22 + 58.99 + 533.25 + 5.00 = 655.47:   4%|▍         | 79/2048 [01:42<25:12,  1.30it/s]
loss 2.39 accuracy 0.19 -- 58.16 + 59.63 + 657.85 + 5.02 = 780.67:   4%|▍         | 80/2048 [01:43<25:36,  1.28it/s]
loss 2.35 accuracy 0.19 -- 57.73 + 56.80 + 510.00 + 4.88 = 629.41:   4%|▍         | 81/2048 [01:43<24:22,  1.34it/s]
loss 2.18 accuracy 0.25 -- 56.52 + 171.05 + 508.90 + 4.89 = 741.36:   4%|▍         | 82/2048 [01:44<24:37,  1.33it/s]
loss 2.37 accuracy 0.25 -- 57.18 + 56.73 + 507.95 + 4.88 = 626.74:   4%|▍         | 83/2048 [01:45<23:39,  1.38it/s]
loss 2.51 accuracy 0.19 -- 56.79 + 56.61 + 506.11 + 4.88 = 624.39:   4%|▍         | 84/2048 [01:46<24:03,  1.36it/s]
loss 2.49 accuracy 0.19 -- 56.30 + 57.36 + 503.51 + 4.90 = 622.07:   4%|▍         | 85/2048 [01:46<23:12,  1.41it/s]
loss 2.18 accuracy 0.19 -- 163.99 + 56.96 + 498.16 + 4.87 = 723.98:   4%|▍         | 86/2048 [01:47<23:36,  1.38it/s]
loss 2.32 accuracy 0.06 -- 56.21 + 171.62 + 510.57 + 4.94 = 743.33:   4%|▍         | 87/2048 [01:48<24:05,  1.36it/s]
loss 2.29 accuracy 0.12 -- 56.90 + 56.52 + 505.62 + 4.92 = 623.96:   4%|▍         | 88/2048 [01:48<23:14,  1.41it/s]
loss 2.05 accuracy 0.38 -- 168.24 + 57.38 + 504.51 + 4.89 = 735.02:   4%|▍         | 89/2048 [01:49<23:43,  1.38it/s]
loss 2.09 accuracy 0.06 -- 56.53 + 57.52 + 634.54 + 4.89 = 753.47:   4%|▍         | 90/2048 [01:50<24:14,  1.35it/s]
loss 2.06 accuracy 0.19 -- 57.51 + 57.05 + 514.55 + 4.89 = 634.00:   4%|▍         | 91/2048 [01:51<23:26,  1.39it/s]
loss 2.11 accuracy 0.25 -- 56.34 + 57.30 + 630.04 + 4.89 = 748.57:   4%|▍         | 92/2048 [01:51<23:59,  1.36it/s]
loss 2.39 accuracy 0.06 -- 57.09 + 56.58 + 512.42 + 4.89 = 630.98:   5%|▍         | 93/2048 [01:52<23:13,  1.40it/s]
loss 2.29 accuracy 0.00 -- 56.31 + 173.65 + 511.67 + 4.89 = 746.52:   5%|▍         | 94/2048 [01:53<23:48,  1.37it/s]
loss 2.99 accuracy 0.06 -- 56.38 + 56.44 + 509.84 + 4.89 = 627.55:   5%|▍         | 95/2048 [01:54<23:03,  1.41it/s]
loss 2.44 accuracy 0.00 -- 56.81 + 56.63 + 506.91 + 4.90 = 625.25:   5%|▍         | 96/2048 [01:54<23:36,  1.38it/s]
loss 2.11 accuracy 0.25 -- 56.16 + 57.61 + 503.34 + 4.91 = 622.02:   5%|▍         | 97/2048 [01:55<22:50,  1.42it/s]
loss 2.00 accuracy 0.19 -- 164.18 + 57.11 + 497.57 + 4.91 = 723.77:   5%|▍         | 98/2048 [01:56<23:18,  1.39it/s]
loss 2.82 accuracy 0.25 -- 56.15 + 171.95 + 509.83 + 4.89 = 742.82:   5%|▍         | 99/2048 [01:57<23:49,  1.36it/s]
loss 2.33 accuracy 0.19 -- 57.16 + 56.40 + 505.48 + 4.91 = 623.96:   5%|▍         | 100/2048 [01:57<23:01,  1.41it/s]
loss 2.19 accuracy 0.25 -- 168.88 + 57.41 + 505.31 + 4.92 = 736.52:   5%|▍         | 101/2048 [01:58<23:32,  1.38it/s]
loss 2.25 accuracy 0.12 -- 56.40 + 58.54 + 636.19 + 4.92 = 756.04:   5%|▍         | 102/2048 [01:59<24:06,  1.35it/s]
loss 2.18 accuracy 0.06 -- 57.06 + 56.66 + 513.96 + 4.90 = 632.57:   5%|▌         | 103/2048 [01:59<23:17,  1.39it/s]
loss 2.83 accuracy 0.00 -- 56.38 + 57.24 + 630.24 + 4.92 = 748.78:   5%|▌         | 104/2048 [02:00<23:50,  1.36it/s]
loss 2.13 accuracy 0.19 -- 57.32 + 56.92 + 510.92 + 4.91 = 630.07:   5%|▌         | 105/2048 [02:01<23:04,  1.40it/s]
loss 2.14 accuracy 0.06 -- 56.20 + 173.14 + 510.19 + 4.90 = 744.43:   5%|▌         | 106/2048 [02:02<23:38,  1.37it/s]
loss 2.32 accuracy 0.19 -- 56.71 + 56.60 + 508.37 + 4.94 = 626.62:   5%|▌         | 107/2048 [02:02<22:53,  1.41it/s]
loss 2.05 accuracy 0.25 -- 66.62 + 67.52 + 513.98 + 4.89 = 653.01:   5%|▌         | 108/2048 [02:03<24:42,  1.31it/s]
loss 1.98 accuracy 0.25 -- 56.12 + 57.40 + 503.46 + 4.89 = 621.88:   5%|▌         | 109/2048 [02:04<23:36,  1.37it/s]
loss 2.48 accuracy 0.12 -- 164.12 + 56.85 + 497.11 + 4.88 = 722.96:   5%|▌         | 110/2048 [02:05<23:47,  1.36it/s]
loss 2.08 accuracy 0.25 -- 56.28 + 171.83 + 512.36 + 4.91 = 745.37:   5%|▌         | 111/2048 [02:05<24:07,  1.34it/s]
loss 2.20 accuracy 0.19 -- 57.54 + 57.25 + 507.58 + 4.94 = 627.30:   5%|▌         | 112/2048 [02:06<23:13,  1.39it/s]
loss 2.21 accuracy 0.25 -- 168.50 + 57.37 + 506.37 + 4.89 = 737.13:   6%|▌         | 113/2048 [02:07<23:38,  1.36it/s]
loss 2.33 accuracy 0.19 -- 56.80 + 57.45 + 637.17 + 4.97 = 756.39:   6%|▌         | 114/2048 [02:08<24:07,  1.34it/s]
loss 2.30 accuracy 0.06 -- 57.42 + 56.76 + 516.45 + 4.91 = 635.54:   6%|▌         | 115/2048 [02:08<23:16,  1.38it/s]
loss 2.09 accuracy 0.25 -- 56.93 + 59.77 + 632.29 + 4.93 = 753.91:   6%|▌         | 116/2048 [02:09<23:50,  1.35it/s]
loss 2.21 accuracy 0.31 -- 57.53 + 56.92 + 512.99 + 4.91 = 632.36:   6%|▌         | 117/2048 [02:10<23:02,  1.40it/s]
loss 2.14 accuracy 0.25 -- 56.98 + 172.06 + 513.29 + 4.95 = 747.27:   6%|▌         | 118/2048 [02:10<23:36,  1.36it/s]
loss 2.01 accuracy 0.31 -- 57.46 + 56.96 + 510.27 + 4.93 = 629.62:   6%|▌         | 119/2048 [02:11<23:11,  1.39it/s]
loss 2.31 accuracy 0.12 -- 57.30 + 56.82 + 508.06 + 4.92 = 627.10:   6%|▌         | 120/2048 [02:12<23:37,  1.36it/s]
loss 2.35 accuracy 0.12 -- 56.79 + 57.69 + 507.46 + 4.92 = 626.86:   6%|▌         | 121/2048 [02:12<22:50,  1.41it/s]
loss 2.48 accuracy 0.12 -- 164.00 + 57.10 + 499.85 + 4.94 = 725.90:   6%|▌         | 122/2048 [02:13<23:13,  1.38it/s]
loss 2.10 accuracy 0.25 -- 56.52 + 172.57 + 512.89 + 4.95 = 746.94:   6%|▌         | 123/2048 [02:14<23:42,  1.35it/s]
loss 2.35 accuracy 0.19 -- 57.54 + 56.75 + 508.21 + 4.91 = 627.42:   6%|▌         | 124/2048 [02:15<22:53,  1.40it/s]
loss 2.11 accuracy 0.19 -- 168.71 + 57.16 + 506.84 + 4.94 = 737.66:   6%|▌         | 125/2048 [02:15<23:22,  1.37it/s]
loss 2.33 accuracy 0.25 -- 57.03 + 57.84 + 637.25 + 4.93 = 757.06:   6%|▌         | 126/2048 [02:16<24:13,  1.32it/s]
loss 1.86 accuracy 0.38 -- 57.79 + 56.64 + 516.31 + 4.95 = 635.70:   6%|▌         | 127/2048 [02:17<23:19,  1.37it/s]
loss 2.07 accuracy 0.25 -- 56.49 + 57.88 + 631.75 + 4.90 = 751.02:   6%|▋         | 128/2048 [02:18<23:47,  1.35it/s]
loss 2.44 accuracy 0.19 -- 57.35 + 56.65 + 512.75 + 4.96 = 631.70:   6%|▋         | 129/2048 [02:18<22:57,  1.39it/s]
loss 2.72 accuracy 0.06 -- 57.21 + 172.68 + 512.15 + 4.94 = 746.98:   6%|▋         | 130/2048 [02:19<23:29,  1.36it/s]
loss 1.97 accuracy 0.31 -- 56.74 + 56.52 + 509.16 + 4.92 = 627.34:   6%|▋         | 131/2048 [02:20<22:42,  1.41it/s]
loss 1.96 accuracy 0.25 -- 57.28 + 56.70 + 510.21 + 4.91 = 629.10:   6%|▋         | 132/2048 [02:21<23:14,  1.37it/s]
loss 1.83 accuracy 0.38 -- 56.41 + 57.49 + 504.92 + 4.93 = 623.75:   6%|▋         | 133/2048 [02:21<22:30,  1.42it/s]
loss 2.27 accuracy 0.12 -- 164.79 + 57.08 + 499.86 + 4.91 = 726.65:   7%|▋         | 134/2048 [02:22<22:57,  1.39it/s]
loss 2.22 accuracy 0.19 -- 56.63 + 172.73 + 511.81 + 4.95 = 746.13:   7%|▋         | 135/2048 [02:23<23:27,  1.36it/s]
loss 2.32 accuracy 0.19 -- 57.06 + 56.75 + 507.38 + 4.89 = 626.08:   7%|▋         | 136/2048 [02:23<22:40,  1.41it/s]
loss 2.10 accuracy 0.12 -- 169.74 + 57.23 + 506.27 + 4.91 = 738.14:   7%|▋         | 137/2048 [02:24<23:10,  1.37it/s]
loss 2.07 accuracy 0.25 -- 56.96 + 58.00 + 635.54 + 4.96 = 755.47:   7%|▋         | 138/2048 [02:25<23:41,  1.34it/s]
loss 2.37 accuracy 0.06 -- 57.04 + 56.86 + 515.98 + 4.90 = 634.79:   7%|▋         | 139/2048 [02:26<22:53,  1.39it/s]
loss 2.22 accuracy 0.25 -- 57.00 + 57.28 + 631.30 + 4.94 = 750.53:   7%|▋         | 140/2048 [02:26<23:46,  1.34it/s]
loss 2.31 accuracy 0.12 -- 57.24 + 56.69 + 511.23 + 4.93 = 630.09:   7%|▋         | 141/2048 [02:27<22:54,  1.39it/s]
loss 2.09 accuracy 0.44 -- 56.88 + 171.64 + 512.40 + 4.94 = 745.86:   7%|▋         | 142/2048 [02:28<23:23,  1.36it/s]
loss 2.37 accuracy 0.12 -- 56.85 + 56.98 + 511.36 + 4.88 = 630.07:   7%|▋         | 143/2048 [02:29<22:37,  1.40it/s]
loss 2.15 accuracy 0.25 -- 57.32 + 56.56 + 620.44 + 5.27 = 739.59:   7%|▋         | 144/2048 [02:29<24:11,  1.31it/s]
loss 2.00 accuracy 0.38 -- 64.58 + 62.25 + 506.10 + 4.92 = 637.85:   7%|▋         | 145/2048 [02:30<23:20,  1.36it/s]
loss 1.91 accuracy 0.25 -- 163.47 + 56.87 + 499.36 + 4.89 = 724.59:   7%|▋         | 146/2048 [02:31<23:29,  1.35it/s]
loss 2.23 accuracy 0.19 -- 56.77 + 172.28 + 513.99 + 4.92 = 747.96:   7%|▋         | 147/2048 [02:32<23:48,  1.33it/s]
loss 2.77 accuracy 0.12 -- 57.02 + 56.90 + 509.43 + 4.88 = 628.23:   7%|▋         | 148/2048 [02:32<22:53,  1.38it/s]
loss 2.19 accuracy 0.12 -- 168.88 + 57.69 + 507.05 + 4.90 = 738.53:   7%|▋         | 149/2048 [02:33<23:17,  1.36it/s]
loss 2.25 accuracy 0.12 -- 56.84 + 57.61 + 636.93 + 4.89 = 756.27:   7%|▋         | 150/2048 [02:34<23:43,  1.33it/s]
loss 1.90 accuracy 0.25 -- 57.47 + 56.70 + 514.85 + 4.90 = 633.92:   7%|▋         | 151/2048 [02:34<22:53,  1.38it/s]
loss 2.24 accuracy 0.25 -- 56.65 + 57.49 + 629.80 + 4.91 = 748.84:   7%|▋         | 152/2048 [02:35<23:22,  1.35it/s]
loss 2.01 accuracy 0.31 -- 57.27 + 56.62 + 512.15 + 4.93 = 630.97:   7%|▋         | 153/2048 [02:36<22:35,  1.40it/s]
loss 2.15 accuracy 0.12 -- 57.08 + 171.91 + 513.96 + 4.95 = 747.91:   8%|▊         | 154/2048 [02:37<23:09,  1.36it/s]
loss 2.17 accuracy 0.12 -- 56.86 + 56.93 + 510.64 + 4.93 = 629.36:   8%|▊         | 155/2048 [02:37<22:25,  1.41it/s]
loss 2.52 accuracy 0.12 -- 57.06 + 56.65 + 507.95 + 4.93 = 626.58:   8%|▊         | 156/2048 [02:38<22:55,  1.38it/s]
loss 2.81 accuracy 0.12 -- 56.52 + 57.54 + 505.01 + 4.93 = 624.01:   8%|▊         | 157/2048 [02:39<22:12,  1.42it/s]
loss 2.05 accuracy 0.06 -- 163.52 + 57.22 + 499.79 + 4.88 = 725.41:   8%|▊         | 158/2048 [02:39<22:38,  1.39it/s]
loss 1.97 accuracy 0.19 -- 56.30 + 171.44 + 510.95 + 4.88 = 743.58:   8%|▊         | 159/2048 [02:40<23:07,  1.36it/s]
loss 2.12 accuracy 0.25 -- 56.95 + 56.33 + 505.41 + 4.95 = 623.65:   8%|▊         | 160/2048 [02:41<22:19,  1.41it/s]
loss 2.07 accuracy 0.19 -- 169.23 + 57.17 + 506.95 + 4.93 = 738.29:   8%|▊         | 161/2048 [02:42<22:51,  1.38it/s]
loss 2.32 accuracy 0.00 -- 56.79 + 57.18 + 636.49 + 4.94 = 755.40:   8%|▊         | 162/2048 [02:42<23:22,  1.34it/s]
loss 1.97 accuracy 0.31 -- 57.67 + 56.96 + 514.96 + 4.92 = 634.51:   8%|▊         | 163/2048 [02:43<22:35,  1.39it/s]
loss 2.74 accuracy 0.12 -- 56.47 + 57.61 + 630.52 + 4.93 = 749.53:   8%|▊         | 164/2048 [02:44<23:07,  1.36it/s]
loss 2.02 accuracy 0.25 -- 57.34 + 56.89 + 513.60 + 4.95 = 632.78:   8%|▊         | 165/2048 [02:45<22:23,  1.40it/s]
loss 2.11 accuracy 0.12 -- 56.52 + 171.89 + 510.56 + 4.92 = 743.89:   8%|▊         | 166/2048 [02:45<22:55,  1.37it/s]
loss 1.97 accuracy 0.31 -- 57.26 + 56.55 + 509.10 + 4.89 = 627.81:   8%|▊         | 167/2048 [02:46<22:12,  1.41it/s]
loss 2.23 accuracy 0.12 -- 57.19 + 56.49 + 508.97 + 4.92 = 627.58:   8%|▊         | 168/2048 [02:47<22:45,  1.38it/s]
loss 1.89 accuracy 0.31 -- 56.58 + 57.20 + 504.29 + 4.90 = 622.97:   8%|▊         | 169/2048 [02:47<22:01,  1.42it/s]
loss 2.49 accuracy 0.19 -- 163.59 + 57.06 + 498.10 + 4.97 = 723.73:   8%|▊         | 170/2048 [02:48<22:27,  1.39it/s]
loss 2.04 accuracy 0.19 -- 57.12 + 172.37 + 512.25 + 4.91 = 746.65:   8%|▊         | 171/2048 [02:49<22:59,  1.36it/s]
loss 2.29 accuracy 0.19 -- 57.48 + 56.50 + 510.72 + 4.92 = 629.61:   8%|▊         | 172/2048 [02:50<22:14,  1.41it/s]
loss 1.98 accuracy 0.31 -- 169.24 + 57.41 + 505.79 + 4.90 = 737.35:   8%|▊         | 173/2048 [02:50<22:43,  1.37it/s]
loss 2.11 accuracy 0.06 -- 56.80 + 57.32 + 636.36 + 4.90 = 755.38:   8%|▊         | 174/2048 [02:51<23:14,  1.34it/s]
loss 2.19 accuracy 0.12 -- 57.36 + 56.94 + 516.25 + 4.93 = 635.49:   9%|▊         | 175/2048 [02:52<22:28,  1.39it/s]
loss 2.35 accuracy 0.06 -- 56.71 + 57.53 + 634.73 + 4.90 = 753.86:   9%|▊         | 176/2048 [02:53<23:02,  1.35it/s]
loss 2.22 accuracy 0.25 -- 57.22 + 56.85 + 512.31 + 4.91 = 631.29:   9%|▊         | 177/2048 [02:53<22:16,  1.40it/s]
loss 2.32 accuracy 0.06 -- 56.72 + 172.46 + 512.29 + 4.94 = 746.41:   9%|▊         | 178/2048 [02:54<22:49,  1.37it/s]
loss 1.94 accuracy 0.38 -- 56.87 + 56.51 + 509.52 + 4.94 = 627.83:   9%|▊         | 179/2048 [02:55<22:05,  1.41it/s]
loss 2.28 accuracy 0.06 -- 57.41 + 56.51 + 507.89 + 4.93 = 626.74:   9%|▉         | 180/2048 [02:55<22:37,  1.38it/s]
loss 1.94 accuracy 0.25 -- 56.52 + 57.80 + 504.99 + 4.92 = 624.24:   9%|▉         | 181/2048 [02:56<21:54,  1.42it/s]
loss 2.05 accuracy 0.38 -- 164.59 + 57.28 + 500.50 + 4.94 = 727.30:   9%|▉         | 182/2048 [02:57<22:42,  1.37it/s]
loss 2.28 accuracy 0.19 -- 56.40 + 172.88 + 511.83 + 4.95 = 746.06:   9%|▉         | 183/2048 [02:58<23:05,  1.35it/s]
loss 2.40 accuracy 0.06 -- 57.77 + 56.96 + 509.54 + 4.91 = 629.18:   9%|▉         | 184/2048 [02:58<22:16,  1.39it/s]
loss 1.92 accuracy 0.19 -- 169.13 + 57.23 + 507.24 + 4.93 = 738.54:   9%|▉         | 185/2048 [02:59<22:43,  1.37it/s]
loss 2.04 accuracy 0.25 -- 56.77 + 57.83 + 637.08 + 4.91 = 756.58:   9%|▉         | 186/2048 [03:00<23:11,  1.34it/s]
loss 2.33 accuracy 0.19 -- 57.50 + 56.78 + 517.36 + 4.92 = 636.56:   9%|▉         | 187/2048 [03:01<22:24,  1.38it/s]
loss 2.19 accuracy 0.25 -- 56.87 + 57.57 + 632.57 + 4.94 = 751.95:   9%|▉         | 188/2048 [03:01<22:55,  1.35it/s]
loss 2.34 accuracy 0.12 -- 57.35 + 57.10 + 511.51 + 4.91 = 630.87:   9%|▉         | 189/2048 [03:02<22:09,  1.40it/s]
loss 2.06 accuracy 0.31 -- 56.61 + 171.98 + 512.44 + 4.89 = 745.92:   9%|▉         | 190/2048 [03:03<22:41,  1.36it/s]
loss 1.74 accuracy 0.44 -- 56.83 + 56.69 + 509.75 + 4.92 = 628.20:   9%|▉         | 190/2048 [03:03<22:41,  1.36it/s] 
loss 1.74 accuracy 0.44 -- 56.83 + 56.69 + 509.75 + 4.92 = 628.20:   9%|▉         | 191/2048 [03:03<21:57,  1.41it/s]
loss 2.21 accuracy 0.19 -- 57.00 + 56.49 + 507.99 + 4.95 = 626.42:   9%|▉         | 191/2048 [03:04<21:57,  1.41it/s]
loss 2.21 accuracy 0.19 -- 57.00 + 56.49 + 507.99 + 4.95 = 626.42:   9%|▉         | 192/2048 [03:04<22:28,  1.38it/s]
loss 2.49 accuracy 0.19 -- 57.09 + 58.09 + 508.30 + 4.93 = 628.41:   9%|▉         | 192/2048 [03:05<22:28,  1.38it/s]
loss 2.49 accuracy 0.19 -- 57.09 + 58.09 + 508.30 + 4.93 = 628.41:   9%|▉         | 193/2048 [03:05<21:48,  1.42it/s]
loss 1.81 accuracy 0.38 -- 163.69 + 57.27 + 500.14 + 4.94 = 726.03:   9%|▉         | 193/2048 [03:06<21:48,  1.42it/s]
loss 1.81 accuracy 0.38 -- 163.69 + 57.27 + 500.14 + 4.94 = 726.03:   9%|▉         | 194/2048 [03:06<22:14,  1.39it/s]
loss 2.65 accuracy 0.25 -- 56.18 + 172.23 + 511.22 + 4.93 = 744.56:   9%|▉         | 194/2048 [03:06<22:14,  1.39it/s]
loss 2.65 accuracy 0.25 -- 56.18 + 172.23 + 511.22 + 4.93 = 744.56:  10%|▉         | 195/2048 [03:06<22:42,  1.36it/s]
loss 1.69 accuracy 0.31 -- 57.36 + 56.79 + 508.60 + 4.95 = 627.69:  10%|▉         | 195/2048 [03:07<22:42,  1.36it/s] 
loss 1.69 accuracy 0.31 -- 57.36 + 56.79 + 508.60 + 4.95 = 627.69:  10%|▉         | 196/2048 [03:07<22:17,  1.38it/s]
loss 1.94 accuracy 0.38 -- 168.32 + 57.32 + 507.41 + 4.93 = 737.98:  10%|▉         | 196/2048 [03:08<22:17,  1.38it/s]
loss 1.94 accuracy 0.38 -- 168.32 + 57.32 + 507.41 + 4.93 = 737.98:  10%|▉         | 197/2048 [03:08<22:40,  1.36it/s]
loss 2.44 accuracy 0.12 -- 57.02 + 57.82 + 638.13 + 5.01 = 757.98:  10%|▉         | 197/2048 [03:09<22:40,  1.36it/s] 
loss 2.44 accuracy 0.12 -- 57.02 + 57.82 + 638.13 + 5.01 = 757.98:  10%|▉         | 198/2048 [03:09<23:08,  1.33it/s]
loss 2.11 accuracy 0.19 -- 57.63 + 57.05 + 516.01 + 4.94 = 635.62:  10%|▉         | 198/2048 [03:09<23:08,  1.33it/s]
loss 2.11 accuracy 0.19 -- 57.63 + 57.05 + 516.01 + 4.94 = 635.62:  10%|▉         | 199/2048 [03:09<22:18,  1.38it/s]
loss 1.91 accuracy 0.31 -- 56.81 + 57.88 + 632.58 + 4.94 = 752.22:  10%|▉         | 199/2048 [03:10<22:18,  1.38it/s]
loss 1.91 accuracy 0.31 -- 56.81 + 57.88 + 632.58 + 4.94 = 752.22:  10%|▉         | 200/2048 [03:10<22:48,  1.35it/s]
loss 2.18 accuracy 0.25 -- 57.42 + 57.02 + 511.97 + 4.95 = 631.35:  10%|▉         | 200/2048 [03:11<22:48,  1.35it/s]
loss 2.18 accuracy 0.25 -- 57.42 + 57.02 + 511.97 + 4.95 = 631.35:  10%|▉         | 201/2048 [03:11<22:02,  1.40it/s]
loss 2.10 accuracy 0.12 -- 56.63 + 171.83 + 511.85 + 4.92 = 745.22:  10%|▉         | 201/2048 [03:11<22:02,  1.40it/s]
loss 2.10 accuracy 0.12 -- 56.63 + 171.83 + 511.85 + 4.92 = 745.22:  10%|▉         | 202/2048 [03:11<22:33,  1.36it/s]
loss 2.07 accuracy 0.12 -- 56.89 + 56.81 + 509.83 + 4.92 = 628.44:  10%|▉         | 202/2048 [03:12<22:33,  1.36it/s] 
loss 2.07 accuracy 0.12 -- 56.89 + 56.81 + 509.83 + 4.92 = 628.44:  10%|▉         | 203/2048 [03:12<22:09,  1.39it/s]
loss 1.77 accuracy 0.38 -- 57.74 + 56.96 + 507.62 + 4.92 = 627.25:  10%|▉         | 203/2048 [03:13<22:09,  1.39it/s]
loss 1.77 accuracy 0.38 -- 57.74 + 56.96 + 507.62 + 4.92 = 627.25:  10%|▉         | 204/2048 [03:13<22:34,  1.36it/s]
loss 2.33 accuracy 0.31 -- 56.72 + 57.69 + 505.65 + 4.91 = 624.97:  10%|▉         | 204/2048 [03:14<22:34,  1.36it/s]
loss 2.33 accuracy 0.31 -- 56.72 + 57.69 + 505.65 + 4.91 = 624.97:  10%|█         | 205/2048 [03:14<21:48,  1.41it/s]
loss 2.38 accuracy 0.38 -- 164.29 + 57.38 + 499.15 + 4.93 = 725.75:  10%|█         | 205/2048 [03:14<21:48,  1.41it/s]
loss 2.38 accuracy 0.38 -- 164.29 + 57.38 + 499.15 + 4.93 = 725.75:  10%|█         | 206/204