2 GPUs, 1 failing to run with MLC

Philipp Marc Neuhaus

Joined: 23 Jan 21
Posts: 1
Credit: 5,119
RAC: 0
Message 1054 - Posted: 23 Jan 2021, 17:30:04 UTC

Gents,

I have two GPUs running,
GPU 0 = GeForce GTX 980 Ti
GPU 1 = GeForce GTX 760 (192-bit)

GPU 0 is running fine,
GPU 1 comes back with a calculation error after some seconds.

In all other projects both GPUs are running fine.

What should I do and how?
If I should disable GPU 1 for this project, then how do I do this?

Thanks
Phil
alex

Joined: 4 Dec 20
Posts: 32
Credit: 47,319,359
RAC: 0
Message 1055 - Posted: 24 Jan 2021, 15:30:48 UTC - in response to Message 1054.  

MLC recognizes your PC as having two GTX 980 Tis with 4096 MB each. This seems to be the problem.
It might be a configuration error. You may need to wait for a Linux expert to help.
pianoman [MLC@Home Admin]
Project administrator
Project developer
Project tester
Project scientist

Joined: 30 Jun 20
Posts: 462
Credit: 21,406,548
RAC: 0
Message 1064 - Posted: 27 Jan 2021, 4:21:24 UTC

I'm sorry you're having this issue.

Unfortunately, we don't have a machine with two GPUs to test on. I'm pretty sure this setup has worked for others, so I'm not sure what the issue is here.

As Alex says, BOINC itself is reporting your machine as having two GTX 980s. How much RAM does your GTX 760 have?
2GB is cutting it really close for MLC, at least for the ParityModified WUs currently in the GPU channel
(the rand_automata ones aren't as memory intensive, somewhat unintuitively).
Since BOINC is telling us you have two 980s with 4GB of VRAM each, there's no way of filtering this out on our end.

What confuses me a bit is that when I look at your failed tasks, the error says that it can't find a valid kernel for your card.
It's almost as if the NVIDIA libs are choosing the wrong kernel for your card. Since the 980 and 760 are two different generations, it almost
seems like it's requesting a 980 kernel for your 760 card. This shouldn't be happening. Once we finish releasing the datasets we'll
try to take a look at this.
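
To make the "wrong kernel for the card" idea concrete, here is a minimal sketch of CUDA's general compatibility rule: native code built for one compute capability only runs on devices with the same major version and an equal or higher minor version, while embedded PTX can be JIT-compiled for any device at or above its target. The GTX 980 Ti is compute capability 5.2 (Maxwell) and the GTX 760 is 3.0 (Kepler). The build targets below are purely hypothetical; which architectures the MLC application actually ships is not stated in this thread.

```python
# Compute capabilities as published by NVIDIA.
DEVICE_CC = {
    "GeForce GTX 980 Ti": (5, 2),  # Maxwell
    "GeForce GTX 760": (3, 0),     # Kepler
}


def can_run(device_cc, cubin_archs, ptx_archs):
    """Return True if a binary with these targets can run on the device."""
    major, minor = device_cc
    # Native code (cubin) needs the same major version and minor <= device minor.
    native = any(m == major and n <= minor for m, n in cubin_archs)
    # PTX can be JIT-compiled for any device at or above its target capability.
    jit = any((m, n) <= (major, minor) for m, n in ptx_archs)
    return native or jit


# HYPOTHETICAL build targets, for illustration only (not MLC's real build).
HYPOTHETICAL_CUBIN = [(5, 2)]
HYPOTHETICAL_PTX = [(5, 2)]

if __name__ == "__main__":
    for name, cc in DEVICE_CC.items():
        ok = can_run(cc, HYPOTHETICAL_CUBIN, HYPOTHETICAL_PTX)
        print(f"{name}: {'kernel available' if ok else 'no valid kernel'}")
```

Under that assumed build, the 980 Ti finds a kernel and the 760 does not, which would match the symptom described above without proving it is the actual cause.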
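
On the original question of how to disable GPU 1 for this project only: the BOINC client supports an `<exclude_gpu>` option in `cc_config.xml`. The sketch below generates such a fragment. The project URL is a placeholder (an assumption) and must be replaced with the URL exactly as the BOINC Manager lists it for this project. The device number 1 follows the numbering in the first post and should be checked against the client's startup event log.

```python
import xml.etree.ElementTree as ET

# PLACEHOLDER (assumption): use the project URL exactly as BOINC shows it.
PROJECT_URL = "https://PROJECT-URL-AS-SHOWN-IN-BOINC/"


def build_cc_config(project_url, device_num, gpu_type="NVIDIA"):
    """Build a cc_config.xml that excludes one GPU for one project."""
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    exclude = ET.SubElement(options, "exclude_gpu")
    ET.SubElement(exclude, "url").text = project_url
    ET.SubElement(exclude, "device_num").text = str(device_num)
    ET.SubElement(exclude, "type").text = gpu_type
    # Pretty-print where available (ET.indent was added in Python 3.9).
    if hasattr(ET, "indent"):
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # GPU 1 is the GTX 760 in the first post.
    print(build_cc_config(PROJECT_URL, device_num=1))
```

If a `cc_config.xml` already exists in the BOINC data directory, the `<exclude_gpu>` element should be merged into its existing `<options>` section rather than overwriting the file. The client picks up the change after "Read config files" in the BOINC Manager, `boinccmd --read_cc_config`, or a client restart.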


©2022 MLC@Home Team
A project of the Cognition, Robotics, and Learning (CORAL) Lab at the University of Maryland, Baltimore County (UMBC)