[TWIM Notes] Nov 9 2020

pianoman [MLC@Home Admin]
Project administrator
Project developer
Project tester
Project scientist

Joined: 30 Jun 20
Posts: 462
Credit: 21,406,548
RAC: 0
Message 792 - Posted: 10 Nov 2020, 5:48:06 UTC

This Week in MLC@Home
Notes for Nov 9 2020
A weekly summary of news and notes for MLC@Home

Summary
GPU week(s), part 3. We took some time off this past week to focus on things other than academics, but still managed to get an updated Linux/CUDA client out that now works on at least some systems. Making a single application that runs across multiple operating systems, driver versions, and CUDA versions is proving difficult. This is also a strange case where the Linux app is proving more difficult than the Windows one... 2020 is a really weird timeline.

News:

  • You can follow GPU client progress on several forum threads like this one: https://www.mlcathome.org/mlcathome/forum_thread.php?id=111
  • We've set up a BOINC AppPlan so that the CUDA client is only sent to hosts where it will work. You will need CUDA driver version 440 or higher, Windows 10 or Linux with GLIBC 2.27 or higher, and an NVIDIA card with compute capability 3.5 or greater.
  • We expect to have most CUDA issues ironed out and have the CUDA apps in general (non-beta) use by this coming week. ROCm support would be next, but it is a lower priority.
  • Datasets 1, 2, and 3 continue crunching away. GREAT progress so far!
  • GPU debugging continues to take too much time. We look forward to getting the GPU apps released so we can focus on the science.
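
The AppPlan requirements listed above can be read as a simple eligibility check. The sketch below is only an illustration in Python, not the project's actual BOINC plan-class configuration; the `Host` fields and the `can_receive_cuda_app` function are hypothetical names made up for this example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Host:
    """Hypothetical description of a volunteer host (not a BOINC type)."""
    os_name: str                              # "windows" or "linux"
    driver_version: int                       # driver version, e.g. 440
    compute_capability: Tuple[int, int]       # e.g. (3, 5)
    glibc: Optional[Tuple[int, int]] = None   # Linux only, e.g. (2, 27)
    windows_major: Optional[int] = None       # Windows only, e.g. 10


def can_receive_cuda_app(host: Host) -> bool:
    """Apply the requirements stated in the post to one host."""
    # Driver version 440 or higher.
    if host.driver_version < 440:
        return False
    # Compute capability 3.5 or greater (tuples compare element-wise).
    if host.compute_capability < (3, 5):
        return False
    # Windows 10, or Linux with GLIBC 2.27 or higher.
    if host.os_name == "windows":
        return host.windows_major == 10
    if host.os_name == "linux":
        return host.glibc is not None and host.glibc >= (2, 27)
    return False
```

For example, a Linux host with driver 450, GLIBC 2.31, and a compute capability 5.0 card would pass, while the same host with driver 430 would be filtered out.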



Project status snapshot:
(note these numbers are approximations)

Tasks
Tasks ready to send 16536
Tasks in progress 22529
Users
With credit 1134
Registered in past 24 hours 35
Hosts
With recent credit 2097
Registered in past 24 hours 24
Current GigaFLOPS 26267.99

Dataset 1 and 2 progress:

SingleDirectMachine      10002/10004
EightBitMachine          10001/10006
SingleInvertMachine      10001/10003
SimpleXORMachine         10000/10002
ParityMachine              949/10005

ParityModified             308/10005
EightBitModified          6713/10006
SimpleXORModified        10005/10005
SingleDirectModified     10004/10004
SingleInvertModified     10002/10002 

Dataset 3 progress:
Overall (so far): 49317/53359
Milestone 1, 100x100:  10000/10000
Milestone 2, 100x1000: 49317/100000
Milestone 3, 100x10000: 49317/1000000
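
For readers who prefer percentages, the done/total counts above can be converted with a few lines of Python. This is just a convenience sketch; `percent_complete` is a hypothetical helper, and the counts are copied from the snapshot above.

```python
def percent_complete(progress: str) -> float:
    """Convert a 'done/total' string into a percentage."""
    done_text, total_text = progress.split("/")
    done, total = int(done_text), int(total_text)
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * done / total


# Dataset 3 milestone counts as reported in this post.
milestones = {
    "Milestone 1, 100x100": "10000/10000",
    "Milestone 2, 100x1000": "49317/100000",
    "Milestone 3, 100x10000": "49317/1000000",
}

for name, progress in milestones.items():
    print(f"{name}: {percent_complete(progress):.1f}%")
```

With these numbers, Milestone 2 is roughly 49.3% done and Milestone 3 roughly 4.9%.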


Last week's TWIM Notes: Nov 2 2020

Thanks again to all our volunteers!

-- The MLC@Home Admins
bozz4science

Joined: 9 Jul 20
Posts: 142
Credit: 11,536,204
RAC: 3
Message 794 - Posted: 10 Nov 2020, 16:09:52 UTC

Great news! I got some GPU test WUs after all and they crunched away reliably in ~1,200 sec. The CPU equivalent runtime is ~35,000 sec on average, so for me at least that is a massive speedup. They were running from the get-go without any problems. Suspending and resuming WUs also worked like a charm. The only 2 errors I received were "-529697949 (0xE06D7363) Unknown error code", so that wasn't helpful. The 2 errors were thrown only while trying to run 2 WUs simultaneously via specific app config settings, so I guess it was some internal memory corruption.

After reverting to the stock settings for the GPU app, it works flawlessly again, and at least for my CPU/GPU pairing the speedup of ~30x is incredible. Testing was conducted on a Windows system with a Xeon X5660 and GTX 750 Ti.
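
As a quick sanity check on the ~30x figure, here is the arithmetic using the approximate runtimes quoted above (both are rough averages from this one host, so treat the result as a ballpark):

```python
cpu_seconds = 35_000  # approximate average CPU runtime from the post
gpu_seconds = 1_200   # approximate GPU runtime from the post

speedup = cpu_seconds / gpu_seconds
print(f"{speedup:.1f}x")  # prints "29.2x"
```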

Are you going to separate the GPU app version into its own respective application soon? And when are you planning to release the GPU WUs into the wild?

Thx
pianoman [MLC@Home Admin]
Project administrator
Project developer
Project tester
Project scientist

Joined: 30 Jun 20
Posts: 462
Credit: 21,406,548
RAC: 0
Message 800 - Posted: 11 Nov 2020, 7:36:36 UTC - in response to Message 794.  

The GPU apps in the release channel are live right now. WUs will be released in both channels as needed. We are currently waiting for 10,000 new WUs in the CPU channel to queue, and then we'll release another 10,000 into the GPU channel.

©2024 MLC@Home Team
A project of the Cognition, Robotics, and Learning (CORAL) Lab at the University of Maryland, Baltimore County (UMBC)