What is MLC@Home?

MLC@Home is a distributed computing project dedicated to understanding and interpreting complex machine learning models, with an emphasis on neural networks. It uses the BOINC distributed computing platform. You can find more information on our main website here: https://www.mlcathome.org.

Neural networks have fuelled a machine learning revolution over the past decade that has led to machines accomplishing amazingly complex tasks. However, these models are largely black boxes: we know they work, but they are so complex (up to hundreds of millions of parameters!) that we struggle to understand the limits of such systems. Yet understanding these networks becomes extremely important as they are deployed in safety-critical fields, like medicine and autonomous vehicles.

MLC@Home provides an open, collaborative platform for researchers studying machine learning comprehension. It allows us to train thousands of networks in parallel, with tightly controlled inputs, hyperparameters, and network structures. We use this to gain insights into these complex models.

We ask for volunteers to donate some of their background computing time to help us continue our research. We use the time-tested BOINC distributed computing infrastructure — the same infrastructure that powers SETI@home's search for alien life, and Rosetta@home's search for effective medications. BOINC is fun — you get credit for each bit of compute that you do, with leaderboards and milestones. All while helping further open research. Please follow the link below to join, and happy crunching!

Join MLC@Home

Already joined? Log in.

News

[TWIM Notes] Feb 23 2021 posted
MLC@Home has posted the Feb 23 2021 edition of its weekly "This Week In MLC@Home" newsletter!
Paper updates, a note about the larger datasets, and more...

Read the update and join the discussion here.
24 Feb 2021, 6:00:39 UTC · Discuss


[TWIM Notes] Feb 23 2021
This Week in MLC@Home
Notes for Feb 23 2021
A weekly summary of news and notes for MLC@Home

Summary
A small update this week, since there's not much news overall. Honestly, yours truly got a little swamped with life last week and needed to step away for a bit. But we're back this week, and other than an unexpectedly empty CPU queue for about half a day last week (since corrected), the project has continued, and you volunteers have carved off another large chunk of datasets 1, 2, and 3.

The paper is coming along nicely, and we're finally happy with the data and analysis at this point, so all that remains is the writing. Some tables and graphs have been posted to Twitter if you'd like to follow along. Honestly, we're about 3 months overdue to get this out to at least arXiv, but we hope it'll be worth the wait. Most of that time was spent tweaking algorithms to get better results, and the datasets/feature vectors are too large to fit on an 8GB GPU, so it's mostly CPU training; 18 hours per epoch is not uncommon. On the bright side, what we're learning doing this will help DS4 when it's available.

DS3-500 and DS3-1000 are complete and ready to upload, but we're still deciding on the best place to host such large files. Seeding a torrent from the main server would cut into its already limited bandwidth. Until we find a better solution, they live on a backed-up data drive and are available by individual request.

Other News


  • DS4 remains on hold until after the paper is finished.
  • We're discussing internally whether to allow Gridcoin to whitelist this project. If the MLC community has strong opinions either way, please post them below. There are still one or two things we want to change before that happens, but the list of technical barriers on our side is nearly empty.
  • Reminder: the MLC client is open source and has an issues list on GitLab. If you're a programmer or data scientist and want to help, feel free to look over the issues and submit a pull request.



Project status snapshot:
(note these numbers are approximations)
[status snapshot image not reproduced here]

Last week's TWIM Notes: Feb 8 2021

Thanks again to all our volunteers!

-- The MLC@Home Admin(s)
Homepage: https://www.mlcathome.org/
Twitter: @MLCHome2
24 Feb 2021, 5:58:36 UTC · Discuss


[TWIM Notes] Feb 8 2021 posted
MLC@Home has posted the Feb 8 2021 edition of its weekly "This Week In MLC@Home" newsletter!
It's Dataset week! This week MLC@Home releases the first round of datasets computed by our volunteers. They are available for download on the https://www.mlcathome.org/ website. These datasets will help scientists better understand neural networks.

Read the update and join the discussion here.
9 Feb 2021, 5:17:21 UTC · Discuss


[TWIM Notes] Feb 8 2021
This Week in MLC@Home
Notes for Feb 8 2021
A weekly summary of news and notes for MLC@Home

Summary
It's Dataset week!

We've been working behind the scenes on a paper analyzing the results that you, our volunteers, have created so far. The paper is still in progress, but we've waited long enough to release the datasets themselves. Over the next few days, you'll see more and more datasets become available for download at https://www.mlcathome.org/mlds.html.

Each dataset archive contains a README.md file with details on what is included and some information on how to use it. Each trained network is in its own directory, with the native PyTorch version of the saved model, a JSON file containing the learned weights, and some metadata about each trained network, including its training history.
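As a rough sketch, the JSON weight dumps can be read with nothing more than the Python standard library. The key names and file contents below are illustrative assumptions on our part, not the archives' documented format; check each archive's README.md for the actual layout.

```python
import json

# Illustrative weight dump -- the real archives' key names and nesting may
# differ; each archive's README.md documents the actual layout.
sample = '{"fc1.weight": [[0.1, -0.2], [0.3, 0.4]], "fc1.bias": [0.0, 0.5]}'

def load_weights(json_text):
    """Parse a JSON weight dump into a dict of (nested) Python lists."""
    return json.loads(json_text)

weights = load_weights(sample)
print(sorted(weights))  # ['fc1.bias', 'fc1.weight']
```

The native PyTorch copy of each model can presumably be restored with `torch.load(...)` on the saved model file, subject to any model class definitions shipped alongside it.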

Currently available datasets are:

- MLDS-DS1: 100/ea, 500/ea, 1000/ea. (5000/ea and 10000/ea still computing)
- MLDS-DS2: 100/ea, 500/ea, 1000/ea. (5000/ea and 10000/ea still computing)
- MLDS-DS3: 100/ea (500/ea and 1000/ea zipping/uploading, 5000/ea and 10000/ea still computing)

The latter datasets are so large they would be better distributed as torrents, so we may make them available only as torrents. For researchers: if you use these datasets, we ask that you a) let us know, and b) cite our paper when it becomes available.

Detailed News


  • Lots of great work on a paper; stay tuned to the twitter account for some graphs and tables over the next week.
  • Building the dataset archives takes a long time because of the large number of files, and xz compression is not exactly fast. So DS3-500 and DS3-1000 might take a little while to compress and upload.
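For readers curious what building such an archive involves, a .tar.xz bundle can be produced directly from Python's standard library. This is only a minimal sketch of the general technique, not the project's actual pipeline, and the directory and file names are made up for illustration.

```python
import tarfile
from pathlib import Path

# Illustrative layout: one directory per trained network (names are made up).
root = Path("mlds-example")
(root / "net0001").mkdir(parents=True, exist_ok=True)
(root / "net0001" / "weights.json").write_text('{"fc1.bias": [0.0]}')

# mode "w:xz" streams the tar through LZMA/xz compression; for thousands of
# small files, the per-file overhead plus xz itself is what makes this slow.
with tarfile.open("mlds-example.tar.xz", "w:xz") as tar:
    tar.add(root)

# List the archive members to confirm the bundle round-trips.
with tarfile.open("mlds-example.tar.xz", "r:xz") as tar:
    names = tar.getnames()
print(names)
```

The command-line equivalent would be a plain `tar -cJf` invocation; either way the compression step dominates the build time for large datasets.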



Project status snapshot:
(note these numbers are approximations)
[status snapshot image not reproduced here]

Last week's TWIM Notes: Feb 1 2021

Thanks again to all our volunteers!

-- The MLC@Home Admin(s)
Homepage: https://www.mlcathome.org/
Twitter: @MLCHome2
9 Feb 2021, 5:01:59 UTC · Discuss


[TWIM Notes] Feb 1 2021 posted
MLC@Home has posted the Feb 1 2021 edition of its weekly "This Week In MLC@Home" newsletter!
Updates on the dataset release and the paper, and we've now surpassed 1000 examples for each of DS1/DS2, meaning a greater mix of WUs in the future.

Read the update and join the discussion here.
2 Feb 2021, 7:53:40 UTC · Discuss


... more

News is available as an RSS feed   RSS


©2021 MLC@Home Team
A project of the Cognition, Robotics, and Learning (CORAL) Lab at the University of Maryland, Baltimore County (UMBC)