Posts by alex

1) Questions and Answers : Unix/Linux : Linux for Windows Users (Message 1211)
Posted 31 May 2021 by alex
Post:
I've tried that. The driver manager said no updates are available and all drivers are installed.

So I tried Ubuntu 21.04.
After installing BOINC I tried to start it. An error occurred, saying: chdir: access denied.
BOINC did not start.

I give up for now, it's too time consuming.

Anyway, thanks for the help.
2) Questions and Answers : Unix/Linux : Linux for Windows Users (Message 1209)
Posted 30 May 2021 by alex
Post:
I tried the Ubuntu x86 64-bit driver package. The installer script said:
Unable to install pin package.
This driver may not support the running operating system

Download page offers:
- windows drivers ... not good for Mint
- RHEL x86 64 Bit
- CentOS
- SLED/SLES 15
- and of course Ubuntu x86 64 Bit.

Which one is the right package?
3) Questions and Answers : Unix/Linux : Linux for Windows Users (Message 1208)
Posted 30 May 2021 by alex
Post:
I installed Linux Mint and BOINC.
BOINC Manager says: no usable GPU found.
I have the RX570 installed.
Please help me! What do I have to do to get my GPU recognized by BOINC?
4) Questions and Answers : Unix/Linux : Linux for Windows Users (Message 1199)
Posted 19 May 2021 by alex
Post:
Thank you for the reply.
The question was: which Linux distribution is a good choice for a Windows user who is not familiar with all the setup procedures required in Linux?
I have a spare system that can easily be converted to a Linux system. I tried this a couple of years ago to crunch Einstein, but I was not able to get the ATI card running. For daily business everything worked fine, but no crunching.
I had a three-month discussion on the Einstein board and got a lot of help, but nothing worked. I was told that different Linux distributions differ in how complex the installation is.

Since there are a lot of Linux crunchers here who have experience in using and maintaining Linux, I thought I could get a hint about which distribution to use, and the chance to ask questions here to get it running.
5) Message boards : Cafe : Badges (Message 1153)
Posted 18 Apr 2021 by alex
Post:
I would add 50M as well.
6) Message boards : News : [TWIM Notes] Feb 1 2021 (Message 1115)
Posted 5 Mar 2021 by alex
Post:
I've seen you are running only CPU WUs. If you get your RX550 up and running, please leave a note here. I have a backup system with an RX570 and could easily install another disk with Linux, if there is a chance to get it working.
7) Questions and Answers : Windows : Computation errors on 2080 Ti (Message 1107)
Posted 25 Feb 2021 by alex
Post:
Every failed WU should give you something like this:
Stderr output

<core_client_version>7.16.11</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code 3765269347 (0xe06d7363)</message>
<stderr_txt>
0.0309688 | val_loss: 0.0312789 | Time: 1741.63 ms
[2021-02-24 03:23:01 main:574] : INFO : Epoch 1361 | loss: 0.0309637 | val_loss: 0.0312642 | Time: 1752.23 ms
[2021-02-24 03:23:03 main:574] : INFO : Epoch 1362 | loss: 0.0309647 | val_loss: 0.0312205 | Time: 1745.28 ms
[2021-02-24 03:23:05 main:574] : INFO : Epoch 1363 | loss: 0.0309646 | val_loss: 0.031285 | Time: 1731.2 ms
[2021-02-24 03:23:06 main:574] : INFO : Epoch 1364 | loss: 0.0309602 | val_loss: 0.0312414 | Time: 1771.33 ms
[2021-02-24 03:23:08 main:574] : INFO : Epoch 1365 | loss: 0.0309603 | val_loss: 0.0312525 | Time: 1717.28 ms

The exit code might give Pianoman information about why it failed. The RTX 3090 is a brand new product, and it may be that MLC has not been tested on it anywhere yet.
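The exit code is easier to search for in its hexadecimal form. A small sketch of the conversion follows; the function names are made up for illustration. 0xE06D7363 is the code Windows uses for an unhandled Microsoft C++ exception (the low three bytes spell "msc"), so it says the app threw an exception but does not by itself name the cause.

```python
def exit_code_to_hex(code: int) -> str:
    """Format an exit code as the unsigned 32-bit hex value shown by BOINC."""
    return f"0x{code & 0xFFFFFFFF:08x}"


def exception_tag(code: int) -> str:
    """Return the low three bytes of the code decoded as ASCII text."""
    return (code & 0xFFFFFF).to_bytes(3, "big").decode("ascii", errors="replace")


if __name__ == "__main__":
    code = 3765269347  # decimal exit code from the stderr output above
    print(exit_code_to_hex(code))
    print(exception_tag(code))
```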
8) Questions and Answers : Windows : Computation errors on 2080 Ti (Message 1092)
Posted 16 Feb 2021 by alex
Post:
I too had problems when I started to crunch MLC. Driver updates helped in some cases.
I still have a lot of WUs that fail. Looking into the stderr output I can see a lot of 'NaN' entries (Not a Number), which happens here from time to time. But all WUs? Could you please post some of the error codes of the results? Maybe this gives hints to the problem. Is Windows up to date? Are you crunching multiple WUs on the card?
From my experience I can say: 3 different Windows systems (20H2, up to date) with 3 different CUDA GPUs and up-to-date drivers work fine; the error rate is well below 5%. Also 2 laptops with mobile Nvidia GPUs work fine.
9) Questions and Answers : Unix/Linux : Linux for Windows Users (Message 1090)
Posted 12 Feb 2021 by alex
Post:
A couple of years ago I tried to use Linux (Kubuntu) to crunch Einstein, because many people in the forum talked about a speed advantage with Linux. I was unable to get my AMD card working, even with help from the Einstein forum and a German Ubuntu forum. OpenCL did not work.

I do have a system (https://www.mlcathome.org/mlcathome/show_host_detail.php?hostid=5403) that could easily be converted to a dual-boot system; it's a backup system and not actively in use. It has an RX570 card installed. What would be the best Linux distribution for a user without Linux experience like me, one that makes everything easy to install, including the AMD card? What is the performance of the RX570 here?

Or will there be a Windows version for this card in the near future?
10) Questions and Answers : Windows : multiple cuda tasks (Message 1062)
Posted 26 Jan 2021 by alex
Post:
You can find it in the issue discussions: WUs fail with the error message "out of memory".

The reply was:
The error indicates the system ran out of GPU RAM.

Each WU takes on the order of 1.6GB-1.9GB of GPU memory when computing. And we developed the cuda app on a system with a 1650 with only 4GB of ram, so your 1060 6GB should have plenty of headroom with memory.
Were you running anything else graphics intensive at the time? Maybe a game? Or are you trying to run multiple WUs at the same time on a GPU? If so you could easily run out of GPU memory in total.

Hope that helps, and thanks for crunching!
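The memory figures in that reply allow a rough estimate of how many WUs fit on a card at once. This is only a sketch: the function name and the `reserved_gb` parameter (memory taken by the desktop or other apps) are made up for illustration, and 1.9 GB is the upper figure quoted above, not a measured limit.

```python
import math


def max_concurrent_wus(gpu_mem_gb: float,
                       per_wu_gb: float = 1.9,
                       reserved_gb: float = 0.0) -> int:
    """Estimate how many WUs fit into GPU memory at the same time."""
    usable_gb = gpu_mem_gb - reserved_gb
    if usable_gb <= 0:
        return 0
    return math.floor(usable_gb / per_wu_gb)


if __name__ == "__main__":
    for card_gb in (3, 4, 6):
        print(f"{card_gb} GB card: about {max_concurrent_wus(card_gb)} WU(s) at once")
```

With these numbers a 3 GB card only has room for one WU at a time, while 4 GB is just enough for two if nothing else uses the GPU.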
11) Questions and Answers : Windows : multiple cuda tasks (Message 1058)
Posted 25 Jan 2021 by alex
Post:
In earlier posts Pianoman wrote that the tasks need up to 2 GB of memory. Not always, but from time to time.
I tried it on my GTX1060 / 3GB as well; 80% of the WUs fail.
My 2 PCs with GPUs of 4 GB or more work fine with 2 WUs.
12) Questions and Answers : Issue Discussion : 2 GPUs, 1 failing to run with MLC (Message 1055)
Posted 24 Jan 2021 by alex
Post:
MLC recognizes your PC with 2 GTX980TIs with 4096 MB each. This seems to be the problem.
It might be a configuration error. You will need to wait for a Linux expert to get help.
13) Message boards : Cafe : PCIe bandwidth: influence on GPU performance (Message 1041)
Posted 16 Jan 2021 by alex
Post:
I see a difference in the driver versions I use.
One system runs with 457.51, another one with 452.06. Both are Win10 PCs.
14) Message boards : Cafe : PCIe bandwidth: influence on GPU performance (Message 1038)
Posted 14 Jan 2021 by alex
Post:
I don't have exactly the same configuration, but I do have a system with Nvidia and AMD.
I tested the numbers with GPU-Z. The summary can be found here in a PDF: https://www.dropbox.com/s/ifb83k02fi0i1do/MLC%20answer%201.pdf?dl=0

What I found on a system running 2 WUs on the Nvidia card is a message popping up relatively often: waiting for memory. This extends the runtime.
The system has 8 GB RAM and some apps are running together with BOINC. The CPU load is in the range of 80%, with spikes to nearly 100% when I start additional programs. The system has an on-chip AMD GPU running Einstein; this requires (shared) RAM.
Have you checked that CPU load or RAM is not limiting your system? A second GPU also requires RAM and CPU time.
I have 2 systems running 2 WUs on the Nvidia card. Runtimes are longer, but far from twice as long, so this should not be the problem.
15) Questions and Answers : Issue Discussion : Setup to run 2 wu's on one GPU (Message 1024)
Posted 6 Jan 2021 by alex
Post:
Looks good so far, credit increased. I have two systems running 2 WUs on the GPU. No more errors than usual.
I am curious how an RTX 3060 Ti would perform. 4 WUs (8 GB RAM) and > 19 TFlops ...
16) Message boards : News : [TWIM Notes] Jan 5 2021 -- A 6 Month Retrospective (Message 1023)
Posted 6 Jan 2021 by alex
Post:
A One-Man-Show?
Very brave to start this! I really hope that you will get help soon!
Great work, really!
17) Questions and Answers : Issue Discussion : Setup to run 2 wu's on one GPU (Message 1017)
Posted 5 Jan 2021 by alex
Post:
Big thanks for the responses, it's working now. GPU usage has increased to 66% avg, 71% peak. More interesting is that the memory usage has not really increased; it is always at 2012 MB. WUs finish and validate. Of course they take longer, but there is also twice the output. It will take some time to compare the performance.
If someone wants to follow the results, it's host ID 5172.
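The posts here do not spell out the setup itself. One common way to run 2 WUs on one GPU is BOINC's app_config.xml with a GPU usage of 0.5 per task, and the sketch below assumes that mechanism. The app name is a placeholder, not the real MLC app name; the actual short name has to be looked up in the client, and the helper function is hypothetical.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name: str,
                     gpu_usage: float = 0.5,
                     cpu_usage: float = 1.0) -> str:
    """Build the text of a BOINC app_config.xml for one application.

    gpu_usage=0.5 tells the client that each task needs half a GPU,
    so it schedules two tasks per GPU.
    """
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    gpu = ET.SubElement(app, "gpu_versions")
    ET.SubElement(gpu, "gpu_usage").text = str(gpu_usage)
    ET.SubElement(gpu, "cpu_usage").text = str(cpu_usage)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # "APP_NAME_PLACEHOLDER" is an assumption; replace it with the real app name.
    print(build_app_config("APP_NAME_PLACEHOLDER"))
```

The resulting file goes into the project's folder inside the BOINC data directory, and the client has to re-read its config files (or be restarted) before the change takes effect.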
18) Questions and Answers : Issue Discussion : cpu comparison (Message 1002)
Posted 30 Dec 2020 by alex
Post:
None of the CPUs are overclocked, just standard BIOS settings. No high-speed gaming memory, just standard desktop PCs.
All systems run Win10 (20H2).

hostid 5172  Ryzen 5 2400G   long runs ~ 30,680 sec @ 5 WUs running + 1 GPU WU
hostid 5168  Ryzen 5 1600    long runs ~ 28,020 sec @ 9 WUs running + 1 GPU WU
hostid 5173  Intel i7 860    long runs ~ 47,000 sec @ 5 WUs running + 1 GPU WU
hostid 6228  Athlon X4 860K  long runs ~ 22,000 sec @ 2 WUs running, no GPU WU

Looks like MLC is specialized for these old CPUs.
Since my old boy has only a small cooler, I will upgrade it later in January and try running 3 WUs; currently it gets quite hot.

Btw, the runtime prediction is way off. The initial runtime prediction on my Athlon says 1d 8h.
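Since the hosts run different numbers of WUs at once, the runtimes alone are hard to compare. A rough sketch that converts the figures above into long-run WUs per day follows (the function name is made up; it assumes the runtimes stay constant and ignores the CPU time taken by the GPU WU):

```python
SECONDS_PER_DAY = 86_400

# host ID -> (CPU, long-run runtime in seconds, CPU WUs running at once)
HOSTS = {
    5172: ("Ryzen 5 2400G", 30_680, 5),
    5168: ("Ryzen 5 1600", 28_020, 9),
    5173: ("Intel i7 860", 47_000, 5),
    6228: ("Athlon X4 860K", 22_000, 2),
}


def wus_per_day(runtime_s: float, concurrent: int) -> float:
    """Throughput in WUs per day for `concurrent` WUs of equal runtime."""
    return concurrent * SECONDS_PER_DAY / runtime_s


if __name__ == "__main__":
    for host_id, (cpu, runtime_s, concurrent) in HOSTS.items():
        rate = wus_per_day(runtime_s, concurrent)
        print(f"hostid {host_id}  {cpu:<15} {rate:5.1f} long-run WUs/day")
```

Seen this way the Athlon has the shortest runtime per WU, but the Ryzen 5 1600 finishes the most WUs per day because it runs 9 at once.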
19) Questions and Answers : Issue Discussion : Validate errors (Message 992)
Posted 29 Dec 2020 by alex
Post:
Some time ago I asked a similar question; one can see the thread here: https://www.mlcathome.org/mlcathome/forum_thread.php?id=133
My initial assumption was that it was related to Windows.
There one can find an answer from pianoman:
However, the presence of NaNs in your result should not cause a validation failure. It's not your fault the algorithm happened to search down an un-fruitful path in the error plane, so you should still get credit for the computation. Instead, we simply don't generate a follow-on WU to continue searching down that path. I'll need to check the DB later tonight to see why this particular WU failed validation, but the NaN isn't the reason.

So maybe it's a good idea to wait for a clear statement from the program developers. It makes no sense to kill WUs that are useful and would validate.
20) Questions and Answers : Issue Discussion : Validate errors (Message 989)
Posted 29 Dec 2020 by alex
Post:
I can manually abort them after "human-check".

Do you mean that they are cancelled during runtime, so as not to waste CPU/GPU time? Sounds good ...
There should be a way to do that from your script; BOINC Manager can be controlled remotely.

Are you sure that a NaN automatically signals a failed WU?
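One way to do this from a script is the boinccmd command-line tool that ships with BOINC, which accepts `--task URL task_name abort`. The following is only a sketch: the helper names are made up, the task name is hypothetical, and the project URL is assumed from the forum links here, so it should be checked against what the client lists.

```python
import subprocess
from typing import List


def abort_task_command(project_url: str, task_name: str) -> List[str]:
    """Build the boinccmd call that aborts a single task."""
    return ["boinccmd", "--task", project_url, task_name, "abort"]


def abort_task(project_url: str, task_name: str) -> None:
    """Run boinccmd against the local client; raises if the call fails."""
    subprocess.run(abort_task_command(project_url, task_name), check=True)


if __name__ == "__main__":
    # Assumed project URL and a hypothetical task name, for illustration only.
    cmd = abort_task_command("https://www.mlcathome.org/mlcathome/", "EXAMPLE_TASK_NAME")
    print(" ".join(cmd))  # only prints the command; call abort_task() to run it
```

For a client on another machine, boinccmd also takes `--host` and `--passwd` before the command.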


©2022 MLC@Home Team
A project of the Cognition, Robotics, and Learning (CORAL) Lab at the University of Maryland, Baltimore County (UMBC)