Updated CPU client 9.9x release and issues
Author | Message |
---|---|
Joined: 30 Jun 20 Posts: 462 Credit: 21,406,548 RAC: 0 |
Earlier this week, we released the latest v9.90 CPU client after almost 3 weeks of testing. While it initially seemed to be working fine, a number of errors started accumulating over the last 24 hours. We've identified a server configuration issue and believe it is now fixed as of 6AM UTC today. The server was generating invalid WUs for the MLDS queue. We've cancelled all of the problematic WUs and are adding new ones to the main queue. The GPU clients and MLDSTEST queue remained unaffected.

v9.90 is an important release for MLDS, as it contains support for CNNs and Dense feed forward network types needed for DS4. Highlights include:

- Statically linked binary for Linux (no more AppImage)
- DS4 support! (CNN and Dense networks)
- Better NaN handling
- Update to libTorch 1.9
- Wrapper instead of BOINC native API
|
Joined: 30 Jun 20 Posts: 462 Credit: 21,406,548 RAC: 0 |
This fix will not solve ALL outstanding issues, but it should help with:

* Computation errors with no output in the logs that became prevalent in the last 24 hours (the 24h failure rate jumped from 1% to 80% over the past two days)
* The memory limit has been set back to 800MB, as originally intended. It turned out that was not the issue.

There are still known DLL issues on Windows, at least one report of a crash involving a file already existing that shouldn't exist, and one crash on an ODROID (ARM) system. Please keep reporting these issues and we'll tackle them as we can. I want to reassure anyone experiencing those that we're not ignoring you at all.

Thanks for volunteering your compute time! |
Joined: 9 Jul 20 Posts: 142 Credit: 11,536,204 RAC: 3 |
Awesome news! I know from personal experience how frustrating it can be to suddenly start seeing errors after weeks of coding and testing. I applaud your continued commitment and am especially excited for the DS4 support and the future science we can help to inform with new data sets and network types. It's also great to see better NaN handling incorporated into version 9.90+, as NaNs often resulted in GPU WUs crunching through hundreds of epochs while just carrying the NaN through, with the erroneous result only becoming apparent when inspecting the WU log afterwards. Thanks for all your work! |
Joined: 2 May 21 Posts: 9 Credit: 2,016,461 RAC: 2 |
I'm getting no WUs for aarch64-unknown-linux-gnu. I'm not intending to be impatient, just concerned in case you are unaware. |
Joined: 2 May 21 Posts: 9 Credit: 2,016,461 RAC: 2 |
"I'm getting no wu's for aarch64-unknown-linux-gnu" All good now, thanks. Edit: it's picking up 32-bit ARM tasks. |
©2023 MLC@Home Team
A project of the Cognition, Robotics, and Learning (CORAL) Lab at the University of Maryland, Baltimore County (UMBC)