| Name | ParityModified-1607233793-4287-35_2 |
| Workunit | 3076243 |
| Created | 2 May 2021, 10:09:02 UTC |
| Sent | 2 May 2021, 10:18:54 UTC |
| Report deadline | 9 May 2021, 10:18:54 UTC |
| Received | 3 May 2021, 8:42:25 UTC |
| Server state | Over |
| Outcome | Computation error |
| Client state | Compute error |
| Exit status | 197 (0x000000C5) EXIT_TIME_LIMIT_EXCEEDED |
| Computer ID | 11158 |
| Run time | 10 hours 57 min 44 sec |
| CPU time | |
| Validate state | Invalid |
| Credit | 0.00 |
| Device peak FLOPS | 13,837.92 GFLOPS |
| Application version | Machine Learning Dataset Generator (test) v9.80 (amdrocm) x86_64-pc-linux-gnu |
| Peak working set size | 1.58 GB |
| Peak swap size | 8.10 GB |
| Peak disk usage | 2.25 GB |
<core_client_version>7.16.6</core_client_version>
<![CDATA[
<message>
exceeded elapsed time limit 39418.03 (400000.00G/10.15G)</message>
<stderr_txt>
DEBUG: Args: ../../projects/www.mlcathome.org_mlcathome/mlds-gpu_9.80_x86_64-pc-linux-gnu__rocm -c --maxepoch 128
nthreads: 1
gpudev: 0
Re-exec()-ing to set environment correctly
14:44:10 (10214): start_timer_thread(): pthread_create(): 22
Machine Learning Dataset Generator v9.80 (Linux/x86_64) (libTorch: release/1.7 GPU: Vega 20 [Radeon VII])
[2021-05-02 14:44:10 main:442] : INFO : Set logging level to 1
[2021-05-02 14:44:10 main:448] : INFO : Running in BOINC Client mode
[2021-05-02 14:44:10 main:451] : INFO : Resolving all filenames
[2021-05-02 14:44:10 main:459] : INFO : Resolved: dataset.hdf5 => ../../projects/www.mlcathome.org_mlcathome/ParityModified-train-val-dataset.hdf5 (exists = 1)
[2021-05-02 14:44:10 main:459] : INFO : Resolved: model.cfg => ../../projects/www.mlcathome.org_mlcathome/ParityModified-1607233793-4287-35_2_r2144645146_1 (exists = 0)
[2021-05-02 14:44:10 main:459] : INFO : Resolved: model-final.pt => ../../projects/www.mlcathome.org_mlcathome/ParityModified-1607233793-4287-35_2_r2144645146_0 (exists = 0)
[2021-05-02 14:44:10 main:459] : INFO : Resolved: model-input.pt => ../../projects/www.mlcathome.org_mlcathome/ParityModified-1607233793-4287-35 (exists = 1)
[2021-05-02 14:44:10 main:459] : INFO : Resolved: snapshot.pt => snapshot.pt (exists = 0)
[2021-05-02 14:44:10 main:479] : INFO : Dataset filename: ../../projects/www.mlcathome.org_mlcathome/ParityModified-train-val-dataset.hdf5
[2021-05-02 14:44:10 main:481] : INFO : Configuration:
[2021-05-02 14:44:10 main:482] : INFO : Model type: GRU
[2021-05-02 14:44:10 main:483] : INFO : Validation Loss Threshold: 0.0001
[2021-05-02 14:44:10 main:484] : INFO : Max Epochs: 128
[2021-05-02 14:44:10 main:485] : INFO : Batch Size: 128
[2021-05-02 14:44:10 main:486] : INFO : Learning Rate: 0.01
[2021-05-02 14:44:10 main:487] : INFO : Patience: 10
[2021-05-02 14:44:10 main:488] : INFO : Hidden Width: 12
[2021-05-02 14:44:10 main:489] : INFO : # Recurrent Layers: 4
[2021-05-02 14:44:10 main:490] : INFO : # Backend Layers: 4
[2021-05-02 14:44:10 main:491] : INFO : # Threads: 1
[2021-05-02 14:44:10 main:493] : INFO : Preparing Dataset
[2021-05-02 14:44:10 load_hdf5_ds_into_tensor:28] : INFO : Loading Dataset /Xt from ../../projects/www.mlcathome.org_mlcathome/ParityModified-train-val-dataset.hdf5 into memory
[2021-05-02 14:44:10 load_hdf5_ds_into_tensor:28] : INFO : Loading Dataset /Yt from ../../projects/www.mlcathome.org_mlcathome/ParityModified-train-val-dataset.hdf5 into memory
[2021-05-02 14:44:11 load:106] : INFO : Successfully loaded dataset of 2048 examples into memory.
[2021-05-02 14:44:11 load_hdf5_ds_into_tensor:28] : INFO : Loading Dataset /Xv from ../../projects/www.mlcathome.org_mlcathome/ParityModified-train-val-dataset.hdf5 into memory
[2021-05-02 14:44:11 load_hdf5_ds_into_tensor:28] : INFO : Loading Dataset /Yv from ../../projects/www.mlcathome.org_mlcathome/ParityModified-train-val-dataset.hdf5 into memory
[2021-05-02 14:44:11 load:106] : INFO : Successfully loaded dataset of 512 examples into memory.
[2021-05-02 14:44:11 main:501] : INFO : Creating Model
[2021-05-02 14:44:11 main:514] : INFO : Preparing config file
[2021-05-02 14:44:11 main:526] : INFO : Creating new config file
[2021-05-02 14:44:11 main:545] : INFO : This is a continuation WU, loading previous network
[2021-05-02 14:44:11 main:566] : INFO : Loading DataLoader into Memory
[2021-05-02 14:44:11 main:569] : INFO : Starting Training
</stderr_txt>
]]>
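The figures in the `<message>` line can be cross-checked against the reported run time. The sketch below assumes the two values in parentheses are the workunit's floating-point operation bound and the client's estimated application speed (both in units of 1e9), and that the elapsed time limit is their quotient. That reading is an interpretation of the message format, not something this report states. The variable names are illustrative only.

```python
# Values copied from the task report above.
run_time_s = 10 * 3600 + 57 * 60 + 44   # "Run time: 10 hours 57 min 44 sec"
limit_s = 39418.03                      # elapsed time limit from <message>
fpops_bound = 400000.00e9               # first figure in parentheses (G = 1e9)

# Assumption: limit_s == fpops_bound / estimated_flops, so the estimated
# speed can be recovered from the two unrounded-looking figures.
implied_flops = fpops_bound / limit_s

# How far past the limit the task ran before it was aborted.
overrun_s = run_time_s - limit_s

print(f"run time:      {run_time_s} s")
print(f"implied speed: {implied_flops / 1e9:.2f} GFLOPS")
print(f"overrun:       {overrun_s:.2f} s")
```

Under that assumption, the implied speed rounds to the 10.15G shown in the message, and the run time exceeds the limit by under a minute, which is consistent with exit status 197 (EXIT_TIME_LIMIT_EXCEEDED).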
©2022 MLC@Home Team
A project of the Cognition, Robotics, and Learning (CORAL) Lab at the University of Maryland, Baltimore County (UMBC)