Task 14627319

Name ParityModified-1647049376-5105-3-0_0
Workunit 11639850
Created 21 Apr 2022, 8:13:27 UTC
Sent 21 Apr 2022, 8:13:55 UTC
Report deadline 29 Apr 2022, 8:13:55 UTC
Received 26 Apr 2022, 0:40:15 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status -1073741523 (0xC000012D) Unknown error code
Computer ID 7896
Run time 1 sec
CPU time
Validate state Invalid
Credit 0.00
Device peak FLOPS 3,940.75 GFLOPS
Application version Machine Learning Dataset Generator (GPU) v9.75 (cuda10200) windows_x86_64
Peak disk usage 1.54 GB
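
The exit status is reported in two forms on this page: as a signed value (-1073741523) in the task summary and as an unsigned value (3221225773) in the client message. Both are the same 32-bit pattern, 0xC000012D. The Windows NTSTATUS table lists this value as STATUS_COMMITMENT_LIMIT (paging file too small), though nothing in the log itself confirms the cause. The sketch below only shows the arithmetic; the helper names are illustrative and not part of BOINC.

```python
def to_unsigned32(status: int) -> int:
    """Reinterpret a signed 32-bit exit status as an unsigned value."""
    return status & 0xFFFFFFFF


def to_signed32(status: int) -> int:
    """Reinterpret an unsigned 32-bit exit status as a signed value."""
    status &= 0xFFFFFFFF
    return status - 0x100000000 if status & 0x80000000 else status


signed_status = -1073741523            # as shown in the "Exit status" field
unsigned_status = to_unsigned32(signed_status)

print(unsigned_status)                 # 3221225773
print(f"0x{unsigned_status:08X}")      # 0xC000012D
print(to_signed32(unsigned_status))    # -1073741523
```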

Stderr output

<core_client_version>7.16.20</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code 3221225773 (0xc000012d)</message>
<stderr_txt>
Machine Learning Dataset Generator v9.75 (Windows/x64) (libTorch: release/1.6 GPU: NVIDIA GeForce GTX 1060 3GB)
[2022-04-26 02:33:20	                main:435]	:	INFO	:	Set logging level to 1
[2022-04-26 02:33:20	                main:441]	:	INFO	:	Running in BOINC Client mode
[2022-04-26 02:33:20	                main:444]	:	INFO	:	Resolving all filenames
[2022-04-26 02:33:20	                main:452]	:	INFO	:	Resolved: dataset.hdf5 => dataset.hdf5 (exists = 1)
[2022-04-26 02:33:20	                main:452]	:	INFO	:	Resolved: model.cfg => model.cfg (exists = 0)
[2022-04-26 02:33:21	                main:452]	:	INFO	:	Resolved: model-final.pt => model-final.pt (exists = 0)
[2022-04-26 02:33:21	                main:452]	:	INFO	:	Resolved: model-input.pt => model-input.pt (exists = 1)
[2022-04-26 02:33:21	                main:452]	:	INFO	:	Resolved: snapshot.pt => snapshot.pt (exists = 0)
[2022-04-26 02:33:21	                main:472]	:	INFO	:	Dataset filename: dataset.hdf5
[2022-04-26 02:33:21	                main:474]	:	INFO	:	Configuration: 
[2022-04-26 02:33:21	                main:475]	:	INFO	:	    Model type: GRU
[2022-04-26 02:33:21	                main:476]	:	INFO	:	    Validation Loss Threshold: 0.0001
[2022-04-26 02:33:21	                main:477]	:	INFO	:	    Max Epochs: 2048
[2022-04-26 02:33:21	                main:478]	:	INFO	:	    Batch Size: 128
[2022-04-26 02:33:21	                main:479]	:	INFO	:	    Learning Rate: 0.01
[2022-04-26 02:33:21	                main:480]	:	INFO	:	    Patience: 10
[2022-04-26 02:33:21	                main:481]	:	INFO	:	    Hidden Width: 12
[2022-04-26 02:33:21	                main:482]	:	INFO	:	    # Recurrent Layers: 4
[2022-04-26 02:33:21	                main:483]	:	INFO	:	    # Backend Layers: 4
[2022-04-26 02:33:21	                main:484]	:	INFO	:	    # Threads: 1
[2022-04-26 02:33:21	                main:486]	:	INFO	:	Preparing Dataset
[2022-04-26 02:33:21	load_hdf5_ds_into_tensor:28]	:	INFO	:	Loading Dataset /Xt from dataset.hdf5 into memory
[2022-04-26 02:33:30	load_hdf5_ds_into_tensor:28]	:	INFO	:	Loading Dataset /Yt from dataset.hdf5 into memory
[2022-04-26 02:35:38	                load:106]	:	INFO	:	Successfully loaded dataset of 2048 examples into memory.
[2022-04-26 02:35:38	load_hdf5_ds_into_tensor:28]	:	INFO	:	Loading Dataset /Xv from dataset.hdf5 into memory
[2022-04-26 02:35:38	load_hdf5_ds_into_tensor:28]	:	INFO	:	Loading Dataset /Yv from dataset.hdf5 into memory
[2022-04-26 02:35:38	                load:106]	:	INFO	:	Successfully loaded dataset of 512 examples into memory.
[2022-04-26 02:35:38	                main:494]	:	INFO	:	Creating Model
[2022-04-26 02:35:38	                main:507]	:	INFO	:	Preparing config file
[2022-04-26 02:35:38	                main:519]	:	INFO	:	Creating new config file
[2022-04-26 02:35:38	                main:538]	:	INFO	:	This is a continuation WU, loading previous network
[2022-04-26 02:36:20	                main:559]	:	INFO	:	Loading DataLoader into Memory
[2022-04-26 02:36:20	                main:562]	:	INFO	:	Starting Training
[2022-04-26 02:36:35	                main:574]	:	INFO	:	Epoch 1 | loss: 0.0311919 | val_loss: 0.0311539 | Time: 15054.3 ms
[2022-04-26 02:36:42	                main:574]	:	INFO	:	Epoch 2 | loss: 0.031135 | val_loss: 0.0311496 | Time: 6294.24 ms
[2022-04-26 02:36:48	                main:574]	:	INFO	:	Epoch 3 | loss: 0.0311308 | val_loss: 0.0311497 | Time: 5520.59 ms
[2022-04-26 02:36:55	                main:574]	:	INFO	:	Epoch 4 | loss: 0.0311281 | val_loss: 0.0311455 | Time: 6115.88 ms
[2022-04-26 02:37:01	                main:574]	:	INFO	:	Epoch 5 | loss: 0.0311267 | val_loss: 0.0311512 | Time: 6277.16 ms
[2022-04-26 02:37:07	                main:574]	:	INFO	:	Epoch 6 | loss: 0.0311346 | val_loss: 0.0311507 | Time: 5672.59 ms
[2022-04-26 02:37:15	                main:574]	:	INFO	:	Epoch 7 | loss: 0.031134 | val_loss: 0.0311488 | Time: 6599.5 ms
[2022-04-26 02:37:21	                main:574]	:	INFO	:	Epoch 8 | loss: 0.0311324 | val_loss: 0.0311434 | Time: 5973.81 ms
[2022-04-26 02:37:25	                main:574]	:	INFO	:	Epoch 9 | loss: 0.0311331 | val_loss: 0.0311445 | Time: 3624.96 ms
[2022-04-26 02:37:29	                main:574]	:	INFO	:	Epoch 10 | loss: 0.0311298 | val_loss: 0.0311439 | Time: 4731.15 ms
[2022-04-26 02:37:34	                main:574]	:	INFO	:	Epoch 11 | loss: 0.0311288 | val_loss: 0.031144 | Time: 3327.49 ms
[2022-04-26 02:37:38	                main:574]	:	INFO	:	Epoch 12 | loss: 0.0311291 | val_loss: 0.0311435 | Time: 3904.07 ms
[2022-04-26 02:37:43	                main:574]	:	INFO	:	Epoch 13 | loss: 0.0311281 | val_loss: 0.0311467 | Time: 4858.4 ms
[2022-04-26 02:37:47	                main:574]	:	INFO	:	Epoch 14 | loss: 0.0311258 | val_loss: 0.0311451 | Time: 3882.4 ms
[2022-04-26 02:37:51	                main:574]	:	INFO	:	Epoch 15 | loss: 0.0311259 | val_loss: 0.0311439 | Time: 3340.91 ms
[2022-04-26 02:37:56	                main:574]	:	INFO	:	Epoch 16 | loss: 0.0311296 | val_loss: 0.031146 | Time: 3634.65 ms
[2022-04-26 02:37:59	                main:574]	:	INFO	:	Epoch 17 | loss: 0.0311289 | val_loss: 0.0311471 | Time: 2956.97 ms
[2022-04-26 02:38:03	                main:574]	:	INFO	:	Epoch 18 | loss: 0.0311276 | val_loss: 0.031147 | Time: 3268.24 ms
[2022-04-26 02:38:07	                main:574]	:	INFO	:	Epoch 19 | loss: 0.03113 | val_loss: 0.0311493 | Time: 3298.65 ms
[2022-04-26 02:38:10	                main:574]	:	INFO	:	Epoch 20 | loss: 0.0311412 | val_loss: 0.031155 | Time: 3085.24 ms
[2022-04-26 02:38:14	                main:574]	:	INFO	:	Epoch 21 | loss: 0.0311436 | val_loss: 0.0311562 | Time: 3148.21 ms
[2022-04-26 02:38:17	                main:574]	:	INFO	:	Epoch 22 | loss: 0.0311461 | val_loss: 0.0311594 | Time: 3598.59 ms
[2022-04-26 02:38:21	                main:574]	:	INFO	:	Epoch 23 | loss: 0.0311478 | val_loss: 0.0311603 | Time: 3114.74 ms
[2022-04-26 02:38:25	                main:574]	:	INFO	:	Epoch 24 | loss: 0.0311598 | val_loss: 0.03117 | Time: 4398.75 ms

</stderr_txt>
]]>
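
The per-epoch lines in the stderr output follow a fixed pattern, so they can be pulled out with a regular expression to see where validation loss was lowest. The following is a minimal sketch under that assumption; `parse_epochs` and `best_epoch` are hypothetical helper names, and the sample text is three lines copied from the log above with whitespace simplified.

```python
import re

# Matches the "Epoch N | loss: X | val_loss: Y | Time: Z ms" part of a log line.
EPOCH_RE = re.compile(
    r"Epoch (\d+) \| loss: ([\d.]+) \| val_loss: ([\d.]+) \| Time: ([\d.]+) ms"
)


def parse_epochs(log_text):
    """Return one dict per epoch line found in the log text."""
    rows = []
    for m in EPOCH_RE.finditer(log_text):
        rows.append({
            "epoch": int(m.group(1)),
            "loss": float(m.group(2)),
            "val_loss": float(m.group(3)),
            "time_ms": float(m.group(4)),
        })
    return rows


def best_epoch(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda r: r["val_loss"])


sample = """
[2022-04-26 02:37:15 main:574] : INFO : Epoch 7 | loss: 0.031134 | val_loss: 0.0311488 | Time: 6599.5 ms
[2022-04-26 02:37:21 main:574] : INFO : Epoch 8 | loss: 0.0311324 | val_loss: 0.0311434 | Time: 5973.81 ms
[2022-04-26 02:37:25 main:574] : INFO : Epoch 9 | loss: 0.0311331 | val_loss: 0.0311445 | Time: 3624.96 ms
"""

rows = parse_epochs(sample)
print(best_epoch(rows)["epoch"])  # 8
```

Run against the full log, the same helpers would cover all 24 reported epochs. The log ends after epoch 24 without a shutdown message, so it does not show how the application's own patience setting would have been applied.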


©2024 MLC@Home Team
A project of the Cognition, Robotics, and Learning (CORAL) Lab at the University of Maryland, Baltimore County (UMBC)