Name | ParityModified-1647049376-5105-3-0_0 |
Workunit | 11639850 |
Created | 21 Apr 2022, 8:13:27 UTC |
Sent | 21 Apr 2022, 8:13:55 UTC |
Report deadline | 29 Apr 2022, 8:13:55 UTC |
Received | 26 Apr 2022, 0:40:15 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | -1073741523 (0xC000012D) Unknown error code |
Computer ID | 7896 |
Run time | 1 sec |
CPU time | |
Validate state | Invalid |
Credit | 0.00 |
Device peak FLOPS | 3,940.75 GFLOPS |
Application version | Machine Learning Dataset Generator (GPU) v9.75 (cuda10200) windows_x86_64 |
Peak disk usage | 1.54 GB |
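The exit status appears on this page in two forms: the signed value -1073741523 in the table and the unsigned value 3221225773 in the client message, both shown as 0xC000012D. A minimal Python sketch of the 32-bit two's-complement conversion between them follows. The helper names `to_unsigned32` and `to_signed32` are hypothetical and not part of BOINC. In Microsoft's NTSTATUS listing this value appears to correspond to STATUS_COMMITMENT_LIMIT (paging file too small), although the page itself reports it only as an unknown error code.

```python
def to_unsigned32(code: int) -> int:
    """Reinterpret a signed 32-bit exit status as unsigned."""
    return code & 0xFFFFFFFF


def to_signed32(code: int) -> int:
    """Reinterpret an unsigned 32-bit exit status as signed."""
    code &= 0xFFFFFFFF
    # Values with the top bit set are negative in two's complement.
    return code - 0x100000000 if code & 0x80000000 else code


signed_status = -1073741523            # "Exit status" as shown in the table
unsigned_status = to_unsigned32(signed_status)

print(unsigned_status, hex(unsigned_status))
print(to_signed32(unsigned_status))
```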
<core_client_version>7.16.20</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code 3221225773 (0xc000012d)</message>
<stderr_txt>
Machine Learning Dataset Generator v9.75 (Windows/x64) (libTorch: release/1.6 GPU: NVIDIA GeForce GTX 1060 3GB)
[2022-04-26 02:33:20 main:435] : INFO : Set logging level to 1
[2022-04-26 02:33:20 main:441] : INFO : Running in BOINC Client mode
[2022-04-26 02:33:20 main:444] : INFO : Resolving all filenames
[2022-04-26 02:33:20 main:452] : INFO : Resolved: dataset.hdf5 => dataset.hdf5 (exists = 1)
[2022-04-26 02:33:20 main:452] : INFO : Resolved: model.cfg => model.cfg (exists = 0)
[2022-04-26 02:33:21 main:452] : INFO : Resolved: model-final.pt => model-final.pt (exists = 0)
[2022-04-26 02:33:21 main:452] : INFO : Resolved: model-input.pt => model-input.pt (exists = 1)
[2022-04-26 02:33:21 main:452] : INFO : Resolved: snapshot.pt => snapshot.pt (exists = 0)
[2022-04-26 02:33:21 main:472] : INFO : Dataset filename: dataset.hdf5
[2022-04-26 02:33:21 main:474] : INFO : Configuration:
[2022-04-26 02:33:21 main:475] : INFO : Model type: GRU
[2022-04-26 02:33:21 main:476] : INFO : Validation Loss Threshold: 0.0001
[2022-04-26 02:33:21 main:477] : INFO : Max Epochs: 2048
[2022-04-26 02:33:21 main:478] : INFO : Batch Size: 128
[2022-04-26 02:33:21 main:479] : INFO : Learning Rate: 0.01
[2022-04-26 02:33:21 main:480] : INFO : Patience: 10
[2022-04-26 02:33:21 main:481] : INFO : Hidden Width: 12
[2022-04-26 02:33:21 main:482] : INFO : # Recurrent Layers: 4
[2022-04-26 02:33:21 main:483] : INFO : # Backend Layers: 4
[2022-04-26 02:33:21 main:484] : INFO : # Threads: 1
[2022-04-26 02:33:21 main:486] : INFO : Preparing Dataset
[2022-04-26 02:33:21 load_hdf5_ds_into_tensor:28] : INFO : Loading Dataset /Xt from dataset.hdf5 into memory
[2022-04-26 02:33:30 load_hdf5_ds_into_tensor:28] : INFO : Loading Dataset /Yt from dataset.hdf5 into memory
[2022-04-26 02:35:38 load:106] : INFO : Successfully loaded dataset of 2048 examples into memory.
[2022-04-26 02:35:38 load_hdf5_ds_into_tensor:28] : INFO : Loading Dataset /Xv from dataset.hdf5 into memory
[2022-04-26 02:35:38 load_hdf5_ds_into_tensor:28] : INFO : Loading Dataset /Yv from dataset.hdf5 into memory
[2022-04-26 02:35:38 load:106] : INFO : Successfully loaded dataset of 512 examples into memory.
[2022-04-26 02:35:38 main:494] : INFO : Creating Model
[2022-04-26 02:35:38 main:507] : INFO : Preparing config file
[2022-04-26 02:35:38 main:519] : INFO : Creating new config file
[2022-04-26 02:35:38 main:538] : INFO : This is a continuation WU, loading previous network
[2022-04-26 02:36:20 main:559] : INFO : Loading DataLoader into Memory
[2022-04-26 02:36:20 main:562] : INFO : Starting Training
[2022-04-26 02:36:35 main:574] : INFO : Epoch 1 | loss: 0.0311919 | val_loss: 0.0311539 | Time: 15054.3 ms
[2022-04-26 02:36:42 main:574] : INFO : Epoch 2 | loss: 0.031135 | val_loss: 0.0311496 | Time: 6294.24 ms
[2022-04-26 02:36:48 main:574] : INFO : Epoch 3 | loss: 0.0311308 | val_loss: 0.0311497 | Time: 5520.59 ms
[2022-04-26 02:36:55 main:574] : INFO : Epoch 4 | loss: 0.0311281 | val_loss: 0.0311455 | Time: 6115.88 ms
[2022-04-26 02:37:01 main:574] : INFO : Epoch 5 | loss: 0.0311267 | val_loss: 0.0311512 | Time: 6277.16 ms
[2022-04-26 02:37:07 main:574] : INFO : Epoch 6 | loss: 0.0311346 | val_loss: 0.0311507 | Time: 5672.59 ms
[2022-04-26 02:37:15 main:574] : INFO : Epoch 7 | loss: 0.031134 | val_loss: 0.0311488 | Time: 6599.5 ms
[2022-04-26 02:37:21 main:574] : INFO : Epoch 8 | loss: 0.0311324 | val_loss: 0.0311434 | Time: 5973.81 ms
[2022-04-26 02:37:25 main:574] : INFO : Epoch 9 | loss: 0.0311331 | val_loss: 0.0311445 | Time: 3624.96 ms
[2022-04-26 02:37:29 main:574] : INFO : Epoch 10 | loss: 0.0311298 | val_loss: 0.0311439 | Time: 4731.15 ms
[2022-04-26 02:37:34 main:574] : INFO : Epoch 11 | loss: 0.0311288 | val_loss: 0.031144 | Time: 3327.49 ms
[2022-04-26 02:37:38 main:574] : INFO : Epoch 12 | loss: 0.0311291 | val_loss: 0.0311435 | Time: 3904.07 ms
[2022-04-26 02:37:43 main:574] : INFO : Epoch 13 | loss: 0.0311281 | val_loss: 0.0311467 | Time: 4858.4 ms
[2022-04-26 02:37:47 main:574] : INFO : Epoch 14 | loss: 0.0311258 | val_loss: 0.0311451 | Time: 3882.4 ms
[2022-04-26 02:37:51 main:574] : INFO : Epoch 15 | loss: 0.0311259 | val_loss: 0.0311439 | Time: 3340.91 ms
[2022-04-26 02:37:56 main:574] : INFO : Epoch 16 | loss: 0.0311296 | val_loss: 0.031146 | Time: 3634.65 ms
[2022-04-26 02:37:59 main:574] : INFO : Epoch 17 | loss: 0.0311289 | val_loss: 0.0311471 | Time: 2956.97 ms
[2022-04-26 02:38:03 main:574] : INFO : Epoch 18 | loss: 0.0311276 | val_loss: 0.031147 | Time: 3268.24 ms
[2022-04-26 02:38:07 main:574] : INFO : Epoch 19 | loss: 0.03113 | val_loss: 0.0311493 | Time: 3298.65 ms
[2022-04-26 02:38:10 main:574] : INFO : Epoch 20 | loss: 0.0311412 | val_loss: 0.031155 | Time: 3085.24 ms
[2022-04-26 02:38:14 main:574] : INFO : Epoch 21 | loss: 0.0311436 | val_loss: 0.0311562 | Time: 3148.21 ms
[2022-04-26 02:38:17 main:574] : INFO : Epoch 22 | loss: 0.0311461 | val_loss: 0.0311594 | Time: 3598.59 ms
[2022-04-26 02:38:21 main:574] : INFO : Epoch 23 | loss: 0.0311478 | val_loss: 0.0311603 | Time: 3114.74 ms
[2022-04-26 02:38:25 main:574] : INFO : Epoch 24 | loss: 0.0311598 | val_loss: 0.03117 | Time: 4398.75 ms
</stderr_txt>
]]>
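The epoch entries in the stderr output share a fixed `Epoch N | loss: X | val_loss: Y | Time: Z ms` shape, so they can be extracted with a regular expression to find, for example, which epoch had the lowest validation loss before the task failed. The sketch below is an illustration only: `EPOCH_RE` and `parse_epochs` are hypothetical names, and `sample` holds just three entries copied from the log rather than the full output.

```python
import re

# Matches the epoch summary entries written by the application.
EPOCH_RE = re.compile(
    r"Epoch (\d+) \| loss: ([\d.eE+-]+) \| val_loss: ([\d.eE+-]+) \| Time: ([\d.]+) ms"
)


def parse_epochs(text):
    """Return (epoch, loss, val_loss, time_ms) tuples found in the log text."""
    return [
        (int(n), float(loss), float(val), float(ms))
        for n, loss, val, ms in EPOCH_RE.findall(text)
    ]


# Three entries copied from the stderr output above (not the full log).
sample = (
    "[2022-04-26 02:37:21 main:574] : INFO : Epoch 8 | loss: 0.0311324 | val_loss: 0.0311434 | Time: 5973.81 ms "
    "[2022-04-26 02:37:38 main:574] : INFO : Epoch 12 | loss: 0.0311291 | val_loss: 0.0311435 | Time: 3904.07 ms "
    "[2022-04-26 02:38:25 main:574] : INFO : Epoch 24 | loss: 0.0311598 | val_loss: 0.03117 | Time: 4398.75 ms"
)

epochs = parse_epochs(sample)
best = min(epochs, key=lambda row: row[2])  # lowest val_loss in the sample
print("best epoch:", best[0], "val_loss:", best[2])
```

Because `findall` scans the whole string, the same function works whether the entries sit on separate lines or run together on one line.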
©2023 MLC@Home Team
A project of the Cognition, Robotics, and Learning (CORAL) Lab at the University of Maryland, Baltimore County (UMBC)