What is MLC@Home?

MLC@Home is a distributed computing project dedicated to understanding and interpreting complex machine learning models, with an emphasis on neural networks. It uses the BOINC distributed computing platform. You can find more information on our main website here: https://www.mlcathome.org.

Neural networks have fuelled a machine learning revolution over the past decade that has led to machines accomplishing amazingly complex tasks. However, these models are largely black boxes: we know they work, but they are so complex (up to hundreds of millions of parameters!) that we struggle to understand the limits of such systems. Yet understanding networks becomes extremely important as they are deployed in safety-critical fields, like medicine and autonomous vehicles.

MLC@Home provides an open, collaborative platform for researchers studying machine learning comprehension. It allows us to train thousands of networks in parallel, with tightly controlled inputs, hyperparameters, and network structures. We use this to gain insights into these complex models.

We ask for volunteers to donate some of their background computing time to help us continue our research. We use the time-tested BOINC distributed computing infrastructure — the same infrastructure that powers SETI@home's search for alien life, and Rosetta@home's search for effective medications. BOINC is fun — you get credit for each bit of compute that you do, with leaderboards and milestones. All while helping further open research. Please follow the link below to join, and happy crunching!

Join MLC@Home

News

[TMIM Notes] July 1 2021 --- Celebrating 1 year of MLC@Home!
This Month in MLC@Home
Notes for July 1 2021
A monthly summary of news and notes for MLC@Home

Summary
Happy first birthday to MLC@Home! This project went live on July 1, 2020, and caught on pretty quickly in the BOINC community. We've remained focused on our goal, which is breaking open the black box of neural networks to explain why they make the choices they do. This is so important as machine learning permeates more and more of our everyday life, from autonomous cars to banking decisions and medical diagnoses. We need research to understand how to keep bias out of these systems.

We are also the first, and to date only, public machine learning focused BOINC project. This means that while we could leverage the BOINC framework for job management, we had to build most of the ML client infrastructure from the ground up. This hasn't always been smooth, but we've accomplished so much in the past year regardless.

In the past year, we have:


  • Received contributions from 2,500+ volunteers and 9,200+ hosts
  • Processed over 3.4 million BOINC workunits
  • Trained over 1.1 million neural networks for analysis across 3 different datasets, the largest of their kind
  • Generated over 4.3TB of data for analysis
  • Published one academic paper (more coming...)
  • Presented at the 2021 BOINC Workshop
  • Released 47 client versions targeting 3 different CPU architectures, 2 GPU architectures, and multiple versions of Windows and Linux
  • Outgrew the initial server within the first few months!



I'm overwhelmed by our community and what we've accomplished together. We've already shown that networks trained with the same data cluster together in weight space, despite the randomness associated with neural network training. We've also shown we can use this clustering to detect networks trained with poisoned data versus clean data, a significant finding in the field.
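As a rough illustration of what "clustering in weight space" means, here is a minimal sketch using only the Python standard library. It is not the project's analysis code: the "networks" are synthetic weight vectors drawn around two made-up centers, and the names `fake_network`, `centroid`, and `nearest_label` are hypothetical helpers invented for this example. Each network is flattened into one vector, and a new network is assigned to whichever group's mean vector it sits closest to.

```python
import math
import random

def flatten(layers):
    """Concatenate per-layer weight lists into one flat vector."""
    return [w for layer in layers for w in layer]

def distance(a, b):
    """Euclidean distance between two equal-length weight vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(vectors):
    """Element-wise mean of a list of weight vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def nearest_label(vector, centroids):
    """Return the label whose centroid is closest to the vector."""
    return min(centroids, key=lambda label: distance(vector, centroids[label]))

def fake_network(center, rng, noise=0.1):
    """Synthetic stand-in for a trained network: two 'layers' of weights near center."""
    return [[c + rng.gauss(0, noise) for c in center[:2]],
            [c + rng.gauss(0, noise) for c in center[2:]]]

# Assumption: training on different data pulls the weights toward different
# regions, and random initialization only adds noise around those regions.
rng = random.Random(0)
centers = {"clean": [0.0, 0.0, 0.0, 0.0], "poisoned": [1.0, 1.0, 1.0, 1.0]}
training = {label: [flatten(fake_network(c, rng)) for _ in range(20)]
            for label, c in centers.items()}
centroids = {label: centroid(vs) for label, vs in training.items()}

# Classify a previously unseen network by its position in weight space.
probe = flatten(fake_network(centers["poisoned"], rng))
print(nearest_label(probe, centroids))
```

Real networks have vastly more parameters and far less tidy clusters, so this only conveys the idea of comparing networks by their weights rather than by their outputs.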

But there's still so much more to do! So while we want to acknowledge and celebrate what we've jointly accomplished so far, let's also look forward and set some loose goals for the next year of MLC@Home:


  • MLDS will continue near term!
    DS4 is (almost) ready and expands the dataset to include CNN network types as well as the RNNs used in DS1-3. DS5 will likely vary the shape and size of each network slightly to see if clustering still happens when the shape varies. Future MLDS work beyond DS5 is TBD, but we expect there to be plenty of DS4/DS5 WUs for many months to come. We expect to update the paper with the latest runs over the next month.

  • We'd like to expand beyond MLDS!
    We are the first project to do ML on a BOINC-sized scale. We would like to expand to supporting other areas of research, and want to commit to bringing at least one other ML project online within the next year. Please contact us if you are a researcher who is interested in working with the platform!

  • We need to improve the technical side of the project
    From the client supporting AMD GPUs and OSX to optimizing utilization of graphics cards to a better validation process for WUs, there's a laundry list of technical issues we'd like to address, and have not done so effectively in the past three months. We're also hitting some corner-cases of the BOINC software stack that are tricky to work around. If you are a developer and want to help, we'd welcome the support.

  • We'd like to improve outreach
    To get more people involved, we'd like to produce a few short videos about the project, what we've found and how others can help. These should be short, easily accessible, and easy to share. We'd like to produce at least one of these within the next 6 months.


These are loose goals but should give you an idea where we're concentrating our efforts for the next year. If you have further insights, please share them below or on Discord.

Thanks again for supporting MLC@Home, and here's to many more years of successful research in this important field.

Other News


  • DS3 is all but complete (just the last 130+ WUs trickling in!). I consider DS3 to be the most important dataset and can't wait to run our analysis on the whole thing!
  • From now on we'll be blasting DS1 (then DS2) WUs into both the GPU and CPU queues until that completes and/or until DS4 is ready. We'll try to get those over the hump ASAP.
  • Some fun news! MLC Discord user Tankbuster has updated our banner graphic! See the updated banner on the project and home pages!
  • Even more exciting, Tankbuster built a prototype graphics app for MLC@Home! You can see mockups and videos and follow the discussion at the MLC Discord server (link at the bottom).
  • Reminder: the MLC client is open source, and has an issues list at GitLab. If you're a programmer or data scientist and want to help, feel free to look over the issues and submit a pull request.



[Figure: project status snapshot; the numbers shown are approximations]

Last month's TMIM Notes: Jun 8 2021

Thanks again to all our volunteers!

-- The MLC@Home Admin(s)
Homepage: https://www.mlcathome.org/
Discord invite: https://discord.gg/BdE4PGpX2y
Twitter: @MLCHome2
2 Jul 2021, 1:40:09 UTC · Discuss


[TMIM Notes] Jun 8 2021 posted
MLC@Home has posted the Jun 8 2021 edition of its monthly "This Month In MLC@Home" newsletter!
A monthly update including the new client with DS4 support, a note on disk space on the server, and a move to monthly updates instead of weekly ones going forward.

Read the update and join the discussion here.
9 Jun 2021, 4:22:49 UTC · Discuss


[TMIM Notes] June 8 2021
This Month in MLC@Home
Notes for June 8 2021
A monthly summary of news and notes for MLC@Home

Summary
Updates have come slowly these past few months, since the presentation at the BOINC workshop and the release of our initial paper, as we're personally adjusting (fortunately!) to the beginnings of a post-pandemic life. Work, family life, and everything else is changing for many of us, and we're still trying to figure out the new normal. Because of this, these updates will be monthly going forward, since they take quite a bit of time to put together and we've been failing to get them out weekly for a while now anyway. And here's hoping all our volunteers all over the world are in an area where they too can start to move beyond the worst of the pandemic.

But that doesn't mean the project has been dormant!

DS1/DS2/DS3 are all nearing completion, especially DS3, which is sitting at 97%. We've been talking about DS4 for months, and the code is ready for larger testing. Unfortunately, we rolled out a test client a few weeks ago that failed miserably because of an incompatibility between PyTorch and the native BOINC API. There's a way around this, but it requires more development and a change to how WUs are specified, and we've been working on it ever since. We should be ready any day now, but it's been more involved than we thought, so we're not prepared to commit to a date. We do know we need to have it soon, as DS3 WUs are running out.

Another benefit of the new client is that it's statically linked, which vastly simplifies deployment. The extra development time has also given us a chance to make the client more robust to NaNs, which should cut down on the number of validation errors on the system.
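The notes don't say how the client handles NaNs, so the following is only a sketch of one common approach, not the MLC client's actual code. It checks each update for NaN values and, if any appear, stops and keeps the last finite weights instead of returning a corrupted result. The functions `train` and `toy_gradient` are hypothetical and exist only for this example.

```python
import math

def has_nan(values):
    """True if any value in the iterable is NaN."""
    return any(math.isnan(v) for v in values)

def train(weights, gradient_fn, steps, lr=0.1):
    """Plain gradient descent that stops and keeps the last finite weights if a NaN appears."""
    for step in range(steps):
        candidate = [w - lr * g for w, g in zip(weights, gradient_fn(weights, step))]
        if has_nan(candidate):
            # Report failure rather than handing back NaN weights for validation.
            return weights, False
        weights = candidate
    return weights, True

def toy_gradient(weights, step):
    """Gradient of sum(w^2), except that step 3 is deliberately corrupted with NaN."""
    if step == 3:
        return [float("nan")] * len(weights)
    return [2 * w for w in weights]

weights, ok = train([1.0, -2.0], toy_gradient, steps=10)
print(ok, weights)
```

A real client could also retry from a checkpoint or with a different seed; the point here is only that a NaN is caught before the result is reported.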

Another new issue is that the data partition on the server is running out of space: DS3 is taking over 4TB! Thanks to all of our volunteers! We've moved some things around to make a little space, so everything is still working for now. We received some new storage today and will need some downtime to get it installed. It shouldn't take more than a few minutes, so we'll just do it sometime within the next week.

So stay tuned: the next month's going to be interesting for MLC@Home as we move into DS4 and the next phase of this research.

Other News


  • DS1/DS2 continue along at a slow pace. They will continue in the background until we have 10,000 samples of each.
  • We're working on keeping the paper up to date and fleshing it out some more.
  • We're also looking at a slightly new set of work beyond dataset generation, so hopefully MLDS won't be the only project in the future.
  • Reminder: the MLC client is open source, and has an issues list at GitLab. If you're a programmer or data scientist and want to help, feel free to look over the issues and submit a pull request.



[Figure: project status snapshot; the numbers shown are approximations]

Last month's TMIM Notes: May 1 2021

Thanks again to all our volunteers!

-- The MLC@Home Admin(s)
9 Jun 2021, 4:16:08 UTC · Discuss


[TWIM Notes] May 1 2021 posted
MLC@Home has posted the May 1 2021 edition of its weekly "This Week In MLC@Home" newsletter!
A server hiccup, a note about possible issues on Linux with newer distributions with aggressive systemd sandboxing, and hope for a new client rollout this week!

Read the update and join the discussion here.
2 May 2021, 3:07:05 UTC · Discuss


[TWIM Notes] May 1 2021
This Week in MLC@Home
Notes for May 1 2021
A weekly summary of news and notes for MLC@Home

Summary
An overdue update this week.

First, we had a small server issue this morning, 5/1, and were down for about 10 hours until it was fixed. No data was lost, and we were able to restart with no further issues, although some WUs may have been marked invalid due to the unstable state of the system as it was going down. We're looking into that at the moment. It's the first bit of unscheduled downtime in a long time. Fortunately, we've been very stable since moving to the new server last year.

Second, thanks to an astute user, we've noticed a trend in newer Linux distributions that affects the MLC clients (as well as others like Einstein@Home and LHC). Some distributions are using systemd's sandboxing capabilities to limit the BOINC client and any project applications from interacting with the rest of the system for security reasons. Unfortunately, MLC's AppImage-based clients use /tmp, which is now restricted under this new policy. We've identified this as an issue with Ubuntu 21.04 and Gentoo, and it may become an issue further down the line for other systemd-based distributions. For now, there's a workaround listed in our forums (https://www.mlcathome.org/mlcathome/forum_thread.php?id=198). The next client update will drop AppImage support and thus won't be affected by the issue going forward.
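To show one way an application can notice this kind of restriction, here is a minimal sketch that probes a list of candidate directories and returns the first one where a file can actually be created. It does not reproduce the workaround from the forum thread, and it is not part of the MLC client. The function name `usable_scratch_dir` is hypothetical, and writability is only one of the things a sandbox can restrict.

```python
import os
import tempfile

def usable_scratch_dir(candidates):
    """Return the first candidate directory where a file can be created, else None."""
    for path in candidates:
        try:
            # Creating a temporary file fails with an OSError subclass if the
            # directory is missing, read-only, or otherwise off limits.
            with tempfile.NamedTemporaryFile(dir=path):
                return path
        except OSError:
            continue
    return None

# Prefer /tmp, but fall back to the current working directory if /tmp is unusable.
print(usable_scratch_dir(["/tmp", os.getcwd()]))
```

A probe like this lets a program fall back to another directory, or at least fail with a clear message, rather than crashing partway through startup.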

We've also spent some time working on the ROCm client, and have it working with Radeon VII as well as Vega graphics cards. Unfortunately, the current client requires you to have rocm-3.9.0 installed on your system.

Speaking of the next client version, DS4 support is implemented and works, so we hope to roll out the new CPU client with some test WUs this coming week. Features include CNN (DS4) support, static linking (no more AppImage!), and some more minor fixes.

Other News


  • Lots of great progress on DS3, we're ~80% complete. It's nice to see some green showing up on the scoreboard.
  • We participated in weeks 2 and 3 of the BOINC workshop, and look forward to the workshop posting the videos soon.
  • We're looking for conferences/workshops to which we can submit our published paper.
  • Reminder: the MLC client is open source, and has an issues list at GitLab. If you're a programmer or data scientist and want to help, feel free to look over the issues and submit a pull request.



[Figure: project status snapshot; the numbers shown are approximations]

Last week's TWIM Notes: Apr 22 2021

Thanks again to all our volunteers!

-- The MLC@Home Admin(s)
2 May 2021, 3:04:15 UTC · Discuss




©2021 MLC@Home Team
A project of the Cognition, Robotics, and Learning (CORAL) Lab at the University of Maryland, Baltimore County (UMBC)