What is MLC@Home?

MLC@Home is a distributed computing project dedicated to understanding and interpreting complex machine learning models, with an emphasis on neural networks. It uses the BOINC distributed computing platform. You can find more information on our main website here: https://www.mlcathome.org.

Neural networks have fuelled a machine learning revolution over the past decade that has led to machines accomplishing amazingly complex tasks. However, these models are largely black boxes: we know they work, but they are so complex (up to hundreds of millions of parameters!) that we struggle to understand the limits of such systems. Yet understanding networks becomes extremely important as they are deployed in safety-critical fields, like medicine and autonomous vehicles.

MLC@Home provides an open, collaborative platform for researchers studying machine learning comprehension. It allows us to train thousands of networks in parallel, with tightly controlled inputs, hyperparameters, and network structures. We use this to gain insights into these complex models.

We ask for volunteers to donate some of their background computing time to help us continue our research. We use the time-tested BOINC distributed computing infrastructure — the same infrastructure that powers SETI@home's search for alien life, and Rosetta@home's search for effective medications. BOINC is fun — you get credit for each bit of compute that you do, with leaderboards and milestones. All while helping further open research. Please follow the link below to join, and happy crunching!

Join MLC@Home

News

DS3 Dataset of 1 million trained neural networks is available for download!
Hello volunteers!

Just a quick note that Dataset 3 is finally posted for download at our site https://www.mlcathome.org/mlds.html! Dataset 3 was completed a few months ago, but due to its massive size (2.25TB in all), and our emphasis on our own analysis over packaging the results for download, it's taken us until now to make it available.

As a reminder, DS3 contains over 1 million trained neural networks (10,000 each, modelling 100 different automata), with a goal of analyzing how networks of the same size and shape encode similar-but-not-exact training data. Expect an updated paper soon!

We've always held that if the public is doing work for this project, then the results of that work should be made available back to the public to further science. As of right now, all of DS1, DS2, and DS3 are available to the public under a CC-BY-SA 4.0 license. We will do the same with DS4 when it completes.

DS3 is released via torrents due to its size. A few volunteers have already downloaded and seeded these (very large) files, so hopefully new downloads should be a bit quicker than if we were serving them from our single server. The torrent files are listed on our website, and we're using the Academic Torrents tracker (see: https://academictorrents.com/browse.php?search=mlds).

Thanks again to all our volunteers! DS3 is quite an accomplishment!

-- The MLC@Home Admin(s)
Homepage: https://www.mlcathome.org/
Discord invite: https://discord.gg/BdE4PGpX2y
Twitter: @MLCHome2
2 May 2022, 3:58:59 UTC · Discuss


MLC@Home inconsistent work generation for the next few months
TL;DR: MLC is entering an analysis phase, and new work will be bursty and inconsistent for at least the next few months. Please adjust your BOINC contributions accordingly!

Over the past several months we (MLC@Home admins) have turned our attention to the analysis of the results our volunteers have contributed. With the completion of DS1/2/3, and the partial results of DS4, we're really excited to polish up some papers and publish some results. (Along those lines, look for an announcement of the availability of all 5 tiers of DS3 datasets later today or tomorrow; we just need to set up a torrent for the 1.3TB DS3-10000 dataset.)

In addition, DS4 results are larger than we anticipated and also don't require as much computation time to complete. So when we release DS4 WUs, our volunteers churn through them in only a few days' time while also filling up the disk space on the server. This is a great problem to have, but it also forces us to be judicious about sending out work, to make sure we've archived enough old results off the server to handle the influx of new results.

The upshot of all this is that we don't have the resources to both do the analysis and prepare/maintain consistent, meaningful work units. So rather than just keep pushing out work that'll keep WUs flowing but has less scientific meaning (such as creating more DS3 networks just to create a bigger dataset), we'd rather just announce that MLC@Home work will be inconsistent for at least the next several months. We expect to release batches of DS4 WUs every few weeks, but it won't be the constant work availability you're used to from the project over the past two years.

We realize this will cause us to lose some volunteers, but that's why we're trying to be upfront about this now, so that everyone can decide if and how to allocate their BOINC contributions accordingly. We hope that you'll consider leaving MLC in your projects list and help us crunch WUs when we have them, but we understand if you choose not to.

A few key things to note:


  • Are you shutting down? No, not at this time. Beyond the stated goals for DS4 above, we have some ideas for where we would like to go in the future. But the main admin needs to spend time finishing up their thesis, so those plans will be on hold until after that is complete. If those plans don't come to fruition, then we will be up front here and actively shut down the project. We promise we won't just abandon it with no notice!
  • What about all the work the volunteers have done? DS1/DS2 datasets are all available for download at https://www.mlcathome.org/mlds.html, and DS3 will be soon (via torrent). As DS4 completes, we promise to make those results available at the same place too.



We hope this announcement reassures you that we're trying to be good stewards of the trust and resources you provide us as BOINC volunteers. We're really excited by the science and humbled by your support since we started in July 2020, and we hope you understand as we move into the next phases of our work. As things change we'll make more announcements here and on our Discord.

-- The MLC@Home Admin(s)
Homepage: https://www.mlcathome.org/
Discord invite: https://discord.gg/BdE4PGpX2y
Twitter: @MLCHome2
17 Apr 2022, 15:33:47 UTC · Discuss


Spring 2022 MLC Project Update: DS2 Complete edition!
It's been a while since we've posted an update, but that doesn't mean the project has been idle! If you've been following on our Discord server you'll know we've continued to make progress, and thanks to our volunteers, today is a day of celebration!

Here's a summary of the current project status:

Summary


  • DS2 Computation is complete! As of 1 Apr 2022, we finally crossed the 10,000 trained networks threshold for ParityModified, completing our computation for DS2. This has taken a long time, and the complete dataset should help researchers understand how neural networks encode data.
  • All DS1/DS2 tarballs are available for download from https://www.mlcathome.org/mlds.html. This is your work, and now it's free for you or anyone else to study and build upon!
  • DS3 tarballs still pending. Computation for DS3 completed last year, but we have not uploaded the full datasets to the website for download yet. We've been focused on analysis, and the sheer size of the dataset makes bundling a time-consuming task. We'll post here when they're available.
  • DS4 WUs are out! DS4 WUs are out for our CPU client, and progress has started there. DS4 is much more complicated to manage on the backend because it has multiple training sets that have different requirements, but we're pushing new WUs out as fast as we can.
  • We're pausing GPU WUs: It saddens us, but we have not been successful updating our GPU clients to support DS4 WUs. And as we shift our focus to analyzing the results we do have, we have less and less time to focus on client development beyond the CPU client. When the current GPU queue runs dry, we won't be sending out more GPU work until we have time to re-prioritize porting a GPU client again. Maintaining a GPU client has taken much more time and effort than anticipated, and unless we can get outside help it will remain a low priority for the time being. We truly appreciate our GPU volunteers, but at the moment we don't have any work to send, and encourage you to turn your hardware to support other worthwhile projects that can support your hardware!
  • We're exploring porting the CPU client to Rust. In addition, our reliance on PyTorch has become more of a hindrance to portability than an asset. While the neural network ecosystem in Rust is not nearly as robust, Rust's ability to compile a static binary targeting a large number of architectures and operating systems is very appealing for portability. As such, we're looking to port our MLC CPU client to pure Rust, with an option to support GPUs from the same code base in the future. If you know Rust and are interested, please contact the MLC Admins.



Please note that there are still DS2 WUs in the work queue. We ask that you please continue to crunch them, as it's always better to have more samples as spares. However, we don't plan to queue up any more DS1/2/3 WUs, and all new WUs added will be DS4 or later. This applies to the GPU queue as well.

We're really excited for DS4 WUs going forward, as they should help demonstrate our theory that similar networks cluster in parameter space in feed-forward and CNN-based networks as well as the RNNs used in DS1/2/3. Beyond DS4, we have some ideas but nothing concrete at the moment. We'll keep you updated as we move forward.

Thanks again to all our volunteers for supporting the project and helping science.

-- The MLC@Home Admin(s)
Homepage: https://www.mlcathome.org/
Discord invite: https://discord.gg/BdE4PGpX2y
Twitter: @MLCHome2
3 Apr 2022, 1:31:00 UTC · Discuss


Maintenance / Downtime 3/27/22
MLC's server will have a brief period of downtime today starting at approximately 3:30pm UTC to add more storage and prepare the main queue for DS4 workloads. The downtime shouldn't be more than an hour or two.

Thanks again for all your support.
-- MLC Admins
27 Mar 2022, 15:04:52 UTC · Discuss


Testing updates to backend services again
All,

We're going to be testing some new backend server updates again this weekend. Last time this led to some instability, but we've learned quite a bit from that and have taken steps to make sure it doesn't happen again, with an easy and quick revert path if necessary. There may be some small interruptions, but nothing serious. There is also nothing you need to do on the client side; this is all on the backend.

Wish us luck, and we'll be watching the results like a hawk for any new issues.
14 Nov 2021, 2:40:52 UTC · Discuss


News is available as an RSS feed.


©2022 MLC@Home Team
A project of the Cognition, Robotics, and Learning (CORAL) Lab at the University of Maryland, Baltimore County (UMBC)