MLDS v0.90 issues roundup

pianoman [MLC@Home Admin]
Project administrator
Project developer
Project tester
Project scientist

Joined: 30 Jun 20
Posts: 462
Credit: 21,406,548
RAC: 0
Message 48 - Posted: 3 Jul 2020, 2:39:58 UTC
Last modified: 3 Jul 2020, 2:40:47 UTC

I'm working on v0.91 and hope to release it for beta tonight or tomorrow. Here are the issues I know about for mlds 0.90; please respond with any others:


  • Issue resuming from a snapshot after system reboot: I know what's going on and am working on a fix. The bug doesn't affect the final trained network, but could cause other issues. This will be fixed for 0.91, or snapshotting will be disabled.
  • Results with "finish file present too long" in stderr: I still need to track this down, but at least some results with that message appear valid on manual inspection.
  • CentOS 7 glibc version error: This is fixed for v0.91.
  • OpenSUSE failure with fusermount: This is actually an openSUSE bug, see https://forums.opensuse.org/showthread.php/525251-sshfs-problem-quot-fuse-failed-to-exec-fusermount-Permission-denied-quot.
  • Windows support: I've coaxed VS2019 to at least attempt to compile the code, but it will not be ready for another few days at least.

pianoman [MLC@Home Admin]
Message 51 - Posted: 3 Jul 2020, 7:40:59 UTC - in response to Message 48.  

Application v0.91 is up as a beta test and should fix both the CentOS 7 and checkpoint/snapshotting issues.
Please enable "test applications" in your profile and try it out.
Sergey Kovalchuk

Joined: 1 Jul 20
Posts: 31
Credit: 123,959
RAC: 0
Message 56 - Posted: 3 Jul 2020, 12:06:27 UTC - in response to Message 51.  

Ubuntu 18.04.3 LTS [4.19.104+|libc 2.27 (Ubuntu GLIBC 2.27-3ubuntu1)]

<stderr_txt>
../../projects/www.mlcathome.org_mlcathome/mlds_0.91_x86_64-pc-linux-gnu: /tmp/.mount_mlds_0YZG0le/usr/bin/../lib/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by /usr/lib/x86_64-linux-gnu/libtcmalloc.so.4)
../../projects/www.mlcathome.org_mlcathome/mlds_0.91_x86_64-pc-linux-gnu: /tmp/.mount_mlds_0YZG0le/usr/bin/../lib/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by /usr/lib/x86_64-linux-gnu/libtcmalloc.so.4)
</stderr_txt>
pianoman [MLC@Home Admin]
Message 61 - Posted: 3 Jul 2020, 15:12:13 UTC - in response to Message 56.  

Hmm.. I can't reproduce this:
root@sacred-akita:~# ./mlds-0.91.appimage 
Machine Learning Dataset Generator v0.91
[2020-07-03 15:05:50	                main:246]	:	INFO	:	Set logging level to 1
[2020-07-03 15:05:50	                main:254]	:	INFO	:	Running in BOINC Standalone mode
[2020-07-03 15:05:50	                main:259]	:	INFO	:	Resolving all filenames
[2020-07-03 15:05:50	                main:267]	:	INFO	:	Resolved: dataset.hdf5 => dataset.hdf5 (exists = 0)
[2020-07-03 15:05:50	                main:267]	:	INFO	:	Resolved: model.cfg => model.cfg (exists = 0)
[2020-07-03 15:05:50	                main:267]	:	INFO	:	Resolved: model-final.pt => model-final.pt (exists = 0)
[2020-07-03 15:05:50	                main:267]	:	INFO	:	Resolved: model-input.pt => model-input.pt (exists = 0)
[2020-07-03 15:05:50	                main:267]	:	INFO	:	Resolved: snapshot.pt => snapshot.pt (exists = 0)
[2020-07-03 15:05:50	                main:272]	:	ERROR	:	Resolved dataset filename doesn't exist, exiting
root@sacred-akita:~# lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 18.04.4 LTS
Release:	18.04
Codename:	bionic


Do you happen to have the libtcmalloc-minimal4 package installed? I do not, and I don't see this issue. I'm not suggesting you remove it; I'm just trying to track down the issue.
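
For comparing two machines, one rough check is to see which GLIBCXX_/CXXABI_ version tags a given libstdc++.so.6 carries. The Python sketch below is only a heuristic, similar in spirit to running strings on the file and grepping: it scans the raw bytes for tag-shaped strings rather than parsing the ELF version tables. The function names and the command-line usage are made up for illustration and are not part of mlds.

```python
import re
import sys

# Tag-shaped strings such as GLIBCXX_3.4.21 or CXXABI_1.3.8.
_TAG = re.compile(rb"(?:GLIBCXX|CXXABI)_\d+(?:\.\d+)*")


def version_tags(data):
    """Return the version tags found in raw bytes, sorted as plain strings."""
    return sorted({m.group(0).decode("ascii") for m in _TAG.finditer(data)})


def file_has_tag(path, tag):
    """Heuristic: True if the file at `path` contains the given tag string."""
    with open(path, "rb") as fh:
        return tag in version_tags(fh.read())


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python check_tags.py <path-to-libstdc++.so.6> GLIBCXX_3.4.21
    print(file_has_tag(sys.argv[1], sys.argv[2]))
```

Note that a library which merely requires a tag also contains the string, so this is most meaningful when pointed at a libstdc++.so.6 itself.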
Bok

Joined: 1 Jul 20
Posts: 7
Credit: 1,181,193
RAC: 0
Message 62 - Posted: 3 Jul 2020, 15:15:11 UTC

I have a bunch of tasks that have been running on my centos7 build just fine, due to complete within an hour now.
pianoman [MLC@Home Admin]
Message 63 - Posted: 3 Jul 2020, 15:31:39 UTC - in response to Message 56.  

Are you manually LD_PRELOAD-ing libtcmalloc?
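
A quick way to answer this is to look at the two standard glibc preload mechanisms: the LD_PRELOAD environment variable and the /etc/ld.so.preload file. The Python sketch below does that; the function name is hypothetical, and the handling of the file (whitespace- or colon-separated entries, with # comments ignored) is an assumption about its format.

```python
import os


def find_preloaded(name, environ=None, preload_file="/etc/ld.so.preload"):
    """Return preload entries whose path contains `name`."""
    environ = os.environ if environ is None else environ
    # LD_PRELOAD entries may be separated by colons or spaces.
    entries = environ.get("LD_PRELOAD", "").replace(":", " ").split()
    try:
        with open(preload_file) as fh:
            for line in fh:
                # Assumption: anything after '#' on a line is a comment.
                entries.extend(line.split("#", 1)[0].replace(":", " ").split())
    except OSError:
        pass  # the file is absent on many systems
    return [entry for entry in entries if name in entry]


if __name__ == "__main__":
    hits = find_preloaded("tcmalloc")
    print(hits if hits else "no tcmalloc preload found")
```

This only inspects the environment of the process that runs it, so it would need to be run in the same environment as the BOINC client to be conclusive.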
pianoman [MLC@Home Admin]
Message 64 - Posted: 3 Jul 2020, 15:33:09 UTC - in response to Message 62.  

Bok wrote: "I have a bunch of tasks that have been running on my centos7 build just fine, due to complete within an hour now."

Great, and thanks.
Sergey Kovalchuk

Message 68 - Posted: 3 Jul 2020, 16:18:36 UTC - in response to Message 61.  
Last modified: 3 Jul 2020, 16:46:57 UTC

dpkg --list | grep -E "libtcmalloc"
ii  libtcmalloc-minimal4                   2.5-2.2ubuntu3                                    amd64        efficient thread-caching malloc


lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 18.04.3 LTS
Release:	18.04
Codename:	bionic


./tmp/mlds_0.91_x86_64-pc-linux-gnu
./tmp/mlds_0.91_x86_64-pc-linux-gnu: /tmp/.mount_mlds_0E2SoYp/usr/bin/../lib/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by /usr/lib/x86_64-linux-gnu/libtcmalloc.so.4)
./tmp/mlds_0.91_x86_64-pc-linux-gnu: /tmp/.mount_mlds_0E2SoYp/usr/bin/../lib/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by /usr/lib/x86_64-linux-gnu/libtcmalloc.so.4)
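
The two loader messages above share one shape: a library that was loaded (here the libstdc++.so.6 under /tmp/.mount_mlds_..., apparently the copy shipped with the application) lacks a symbol version that another library (the system libtcmalloc.so.4) requires. When collecting stderr from several hosts, a small parser can summarize such messages. This is a minimal Python sketch; the function name and the regular expression are illustrative assumptions based only on the messages quoted in this thread.

```python
import re

# Matches messages of the form:
#   <prog>: <provider>: version `<VERSION>' not found (required by <consumer>)
_PATTERN = re.compile(
    r"(?P<provider>\S+): version `(?P<version>[^']+)' not found "
    r"\(required by (?P<consumer>[^)]+)\)"
)

# The two lines reported in this thread.
SAMPLE = (
    "./tmp/mlds_0.91_x86_64-pc-linux-gnu: "
    "/tmp/.mount_mlds_0E2SoYp/usr/bin/../lib/libstdc++.so.6: "
    "version `CXXABI_1.3.8' not found "
    "(required by /usr/lib/x86_64-linux-gnu/libtcmalloc.so.4)\n"
    "./tmp/mlds_0.91_x86_64-pc-linux-gnu: "
    "/tmp/.mount_mlds_0E2SoYp/usr/bin/../lib/libstdc++.so.6: "
    "version `GLIBCXX_3.4.21' not found "
    "(required by /usr/lib/x86_64-linux-gnu/libtcmalloc.so.4)\n"
)


def parse_version_errors(stderr_text):
    """Return (provider, version, consumer) tuples found in stderr_text."""
    return [
        (m.group("provider"), m.group("version"), m.group("consumer"))
        for m in _PATTERN.finditer(stderr_text)
    ]


if __name__ == "__main__":
    for provider, version, consumer in parse_version_errors(SAMPLE):
        print(f"{consumer} needs {version}, missing from {provider}")
```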


./tmp/mlds_0.90_x86_64-pc-linux-gnu

Machine Learning Dataset Generator v0.90
[2020-07-03 16:45:03	                main:281]	:	INFO	:	Set logging level to 1
[2020-07-03 16:45:03	                main:289]	:	INFO	:	Running in BOINC Standalone mode
[2020-07-03 16:45:03	                main:311]	:	INFO	:	Dataset filename: dataset.hdf5
[2020-07-03 16:45:03	                main:313]	:	INFO	:	Configuration: 
[2020-07-03 16:45:03	                main:314]	:	INFO	:	    Validation Loss Threshold: 0.0001
[2020-07-03 16:45:03	                main:315]	:	INFO	:	    Max Epochs: 100
[2020-07-03 16:45:03	                main:316]	:	INFO	:	    Batch Size: 128
[2020-07-03 16:45:03	                main:317]	:	INFO	:	    Patience: 10
[2020-07-03 16:45:03	                main:318]	:	INFO	:	    Hidden Width: 12
[2020-07-03 16:45:03	                main:319]	:	INFO	:	    # Recurrent Layers: 4
[2020-07-03 16:45:03	                main:320]	:	INFO	:	    # Backend Layers: 4
[2020-07-03 16:45:03	                main:322]	:	INFO	:	Preparing Dataset


©2022 MLC@Home Team
A project of the Cognition, Robotics, and Learning (CORAL) Lab at the University of Maryland, Baltimore County (UMBC)