Long run times on second GPU

Fardringle

Joined: 2 Jul 20
Posts: 6
Credit: 19,616,492
RAC: 0
Message 1350 - Posted: 28 Aug 2021, 13:55:43 UTC

I have an older computer that has been running MLC on an Nvidia Quadro K2200 for a while now. It's not a super powerful card, but it has been steady and reliable on the project, completing tasks in an average of about 1.75 hours.

I got my hands on a second K2200 yesterday so I put it in the same machine in the second PCIe slot. BOINC registered the second card and immediately started running MLC tasks on it as well. However, while the first card is still finishing tasks in around 1.75 hours, the second card is taking more than twice as long to complete its tasks, with an average time of 4.06 hours so far.

The computer has an i7-4790 CPU and 32 GB of DDR3-1600 (PC3-12800) RAM.

The only difference between the two is that the second card is in an x4 slot while the first card is in an x16 slot. Is MLC so heavily bandwidth-dependent that the slot speed would make that much of a difference, even on really old, really slow graphics cards?

This is the computer info page:
https://www.mlcathome.org/mlcathome/show_host_detail.php?hostid=5191

And this is the computer's task list:
https://www.mlcathome.org/mlcathome/results.php?hostid=5191&offset=0&show_names=0&state=4&appid=
pianoman [MLC@Home Admin]
Project administrator
Project developer
Project tester
Project scientist

Joined: 30 Jun 20
Posts: 462
Credit: 21,406,548
RAC: 0
Message 1351 - Posted: 28 Aug 2021, 15:03:25 UTC - in response to Message 1350.  

First, thanks for supporting the project.

Second, can you run "nvidia-smi" to verify that the client is actually using the second card (and not double-loading the 1st)? Here's a link on how to find this command on Windows: https://stackoverflow.com/questions/57100015/how-do-i-run-nvidia-smi-on-windows. Run it while BOINC is running. I don't have a dual-GPU machine to test with.
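
On most machines it's just a matter of opening a command prompt and running it; newer drivers usually drop nvidia-smi.exe into C:\Windows\System32, while older installs put it under C:\Program Files\NVIDIA Corporation\NVSMI. Something along these lines should show whether both cards have an MLC process attached (exact output varies by driver version):

    nvidia-smi -L
        lists the GPUs and their indices
    nvidia-smi
        shows per-GPU utilization, memory use, and the processes running on each card
    nvidia-smi --query-gpu=index,name,utilization.gpu,memory.used --format=csv -l 5
        the same numbers in CSV form, refreshed every 5 seconds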

But assuming it's working as intended, then yes: the way the current GPU WUs are crafted, there is a lot of activity moving the dataset and model between CPU and GPU RAM, so it's not /too/ surprising that a WU in an x4 slot might take 2-3x as long as one in an x16 slot. This is also why you'll see people here with very fast graphics cards reporting low GPU utilization.
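
If you want to put a number on the slot difference, here's a rough sketch of a bandwidth check (this assumes a Python install with PyTorch and CUDA; the function name and buffer sizes are just made up for illustration, not anything from our app):

import time
import torch

# Rough estimate of effective PCIe bandwidth per GPU: time repeated
# pinned host<->device copies of a ~256 MB buffer.
def pcie_roundtrip_gibps(device_index, mb=256, iters=20):
    dev = torch.device(f"cuda:{device_index}")
    buf = torch.empty(mb * 1024 * 1024 // 4, dtype=torch.float32).pin_memory()
    torch.cuda.synchronize(dev)
    start = time.time()
    for _ in range(iters):
        on_gpu = buf.to(dev, non_blocking=True)   # host -> device
        _ = on_gpu.to("cpu")                      # device -> host
    torch.cuda.synchronize(dev)
    gib_moved = 2 * mb * iters / 1024.0
    return gib_moved / (time.time() - start)

for i in range(torch.cuda.device_count()):
    print(i, torch.cuda.get_device_name(i), f"{pcie_roundtrip_gibps(i):.2f} GiB/s")

If the x4 card reports roughly a quarter of the x16 card's number, that would line up with the run time difference you're seeing.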
Fardringle

Joined: 2 Jul 20
Posts: 6
Credit: 19,616,492
RAC: 0
Message 1352 - Posted: 28 Aug 2021, 15:26:11 UTC - in response to Message 1351.  

nvidia-smi.exe says that both cards are being used. Not at 100%, but it was never at 100% utilization with just the one card either.

MSI Afterburner and GPU-Z also say that both cards are being used.

So it is probably just the limit of the x4 PCIe slot, then. I wouldn't expect such an old card to be able to fully use even an x4 slot, but as you said, this project moves a lot of data between the card and the CPU, so I guess I'll just take what I can get out of the second card.

These GPUs only draw about 15 watts under full load, so it's not really wasting any power to have them running this way...

I had my Quadro RTX 3000 running this project for a short time as well, for testing. I can't do it full time since it's in a notebook computer and gets too hot. But I noticed that the GPU utilization and credit awarded were a LOT higher (and the task times much shorter) if I didn't let BOINC run any other projects (CPU or GPU) on the notebook at the same time. Even a 15-20% load on the CPU from something else dramatically reduced the performance of the MLC app. I don't see the same boost in utilization from keeping the older i7 idle in this PC, though. It probably just doesn't have the CPU power and RAM bandwidth to really make a difference.
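
If I wanted to guarantee each GPU task a whole core to itself, BOINC's app_config.xml would be the way to do it. A rough sketch (the <name> below is just a placeholder; it has to match whatever app name the MLC GPU tasks show in client_state.xml), placed in the MLC@Home project directory and followed by "Options -> Read config files" in the BOINC Manager:

<app_config>
  <app>
    <name>gpu_app_name_here</name>
    <gpu_versions>
      <gpu_usage>1.0</gpu_usage>   <!-- one task per GPU -->
      <cpu_usage>1.0</cpu_usage>   <!-- reserve a full CPU core per GPU task -->
    </gpu_versions>
  </app>
</app_config>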

I'm not using this computer for anything else, and the K2200s are using such a small amount of electricity, so I'll just let them do whatever they can do. :)
kotenok2000

Joined: 17 Jul 20
Posts: 12
Credit: 7,473,347
RAC: 65
Message 1353 - Posted: 29 Aug 2021, 8:20:58 UTC

Can you try setting the GPU process priority to realtime with Process Hacker?
bozz4science

Joined: 9 Jul 20
Posts: 142
Credit: 11,536,204
RAC: 3
Message 1354 - Posted: 29 Aug 2021, 10:46:02 UTC

A while back I experienced the same issue when I added a second card to my host and both PCIe slots then ran in a dual x8 configuration instead of x16. PCIe bandwidth seemingly has a large influence on GPU WUs as far as I can tell. All of the cards I tested (750 Ti / 970 / 1660 Super) were faster in an x16 slot configuration than in x8 mode (this can easily be tweaked in most modern BIOS/UEFI settings). I eventually settled on a single-GPU setup with my 1660 Super running in x16 mode. I vaguely remember the cards taking roughly 50 to 100% longer in x8 mode, because the GPU WUs rely heavily on CPU support.


©2022 MLC@Home Team
A project of the Cognition, Robotics, and Learning (CORAL) Lab at the University of Maryland, Baltimore County (UMBC)