[TWIM Notes] Jan 18 2021

Message boards : News : [TWIM Notes] Jan 18 2021
pianoman [MLC@Home Admin]
Project administrator
Project developer
Project tester
Project scientist

Joined: 30 Jun 20
Posts: 462
Credit: 21,406,548
RAC: 0
Message 1046 - Posted: 19 Jan 2021, 3:27:14 UTC

This Week in MLC@Home
Notes for Jan 18 2021
A weekly summary of news and notes for MLC@Home

Summary
After a week-long hiatus, we're back with our regular weekly update. We've spent a lot of time writing, especially over the last few days (a long weekend in the US). Project-wise, we've made real progress cleaning up the validation and assimilation process in preparation for DS4.

Detailed News

  • Writing is the top priority. Ideally we'd have 1000 examples each for the DS1/DS2 machine types, and we need about 160 more for ParityModified. We're close, though, and may pass 1000 by next week.
  • The modstest queue is already running an improved validation process. So far it's going well, and it will roll out to the other queues as well. There are no user-visible changes, but lots of internal changes to help maintenance and testing.
  • Not much progress on client support for DS4: it remains partially implemented and not yet ready for testing. The DS4 datasets were generated weeks ago; we've just prioritized writing over DS4 lately.
  • We looked at updating the ROCm client to enable POLARIS support and move to ROCm 4.0 (and possibly some NAVI support), but PyTorch doesn't support 4.0 yet, and we still need a little more time to convert to ROCm 3.10 to enable POLARIS support.



Project status snapshot:
(note these numbers are approximations)
[project status table not captured in this archive]
Last week's TWIM Notes: Jan 5 2021

Thanks again to all our volunteers!

-- The MLC@Home Admin(s)
Homepage: https://www.mlcathome.org/
Twitter: @MLCHome2

ID: 1046
Jonas

Joined: 19 Nov 20
Posts: 2
Credit: 2,347,589
RAC: 0
Message 1048 - Posted: 19 Jan 2021, 9:52:09 UTC - in response to Message 1046.  

Thank you for your informative update! Such an exciting project!

Is there a particular reason why ParityMachine and ParityModified take so much longer than the other machines?

Anyways, my GPU is working on it ;)
ID: 1048
pianoman [MLC@Home Admin]
Message 1049 - Posted: 19 Jan 2021, 16:14:57 UTC - in response to Message 1048.  
Last modified: 19 Jan 2021, 17:48:38 UTC

TL;DR: The DS1/DS2 network shapes are "underparameterized" (too small) for learning this machine type quickly/consistently.

How hard a machine of this type is for a neural network to learn depends on the number of meaningful inputs, the number of possible output states, and the number of hidden states. The five machines in DS1 and DS2 are designed to range from trivial to hard to learn while keeping the neural network shape used to learn them constant. Each machine type has 8 inputs and 8 outputs, but only certain ones are meaningful for a given machine. The loss is calculated over the machine's outputs, so if only one output is meaningful, the network quickly learns to set the others to 0 and ignore them.

For SingleDirect* and SingleInvert*, only one input and one output are meaningful, and there are no hidden states (hidden == output). The network quickly learns to ignore everything else, and training finishes quickly.

SimpleXOR* uses 2 valid inputs, 4 hidden states, and 1 output, so it has to learn the mapping of the 4 hidden states to the output state; but since it's just a 2-bit XOR, that's not hard.

EightBit* is actually simpler conceptually for humans, but all 8 inputs and all 8 outputs are active, so there are 2^8 output states to learn (though no hidden states, since hidden == output).

Parity*, like EightBit*, pays attention to all 8 inputs and has 2^8 hidden states, but only one output is active. So not only does it need to learn how to map the 8 inputs to the 2^8 hidden states, it also needs to learn the mapping from those 2^8 hidden states to the single output, i.e. a parity calculation. And since training has only one bit of output information to guide it while needing to learn 2^8 hidden states, it's just that much harder.
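To make the state-count contrast concrete, here's a toy Python sketch. This is not the project's actual machine definition; the update rule (XOR-fold the input into an 8-bit register, output its parity) is an assumed semantics based on the description above:

```python
from itertools import product

def single_direct_step(inp):
    # Only input bit 0 is meaningful; output bit 0 mirrors it.
    # hidden == output, so there is no extra state to learn.
    out = [0] * 8
    out[0] = inp[0]
    return out, out  # (output, hidden)

def parity_step(inp, hidden):
    # All 8 input bits are folded into an 8-bit hidden register (2**8 states),
    # but only one output bit is ever active: the parity of the register.
    hidden = [h ^ b for h, b in zip(hidden, inp)]
    out = [0] * 8
    out[0] = sum(hidden) % 2
    return out, hidden

# Drive the parity machine through all 256 input patterns: the loss sees only
# one meaningful output bit per step, yet the machine has 2**8 hidden states.
hidden = [0] * 8
for inp in product([0, 1], repeat=8):
    out, hidden = parity_step(list(inp), hidden)
print(2 ** 8, "hidden states vs 1 meaningful output bit")
```

The asymmetry the post describes is visible here: SingleDirect exposes its entire (trivial) state through the loss, while Parity compresses 256 hidden states into a single output bit of training signal.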

The advice in the field is almost always "just make the network bigger until it works". A larger network would have more parameters to play with, making this calculation easier to learn. However, one of the things we're comparing in this experiment is how training behaves over a range of complexities when we keep the shape of the network constant. We know Parity* networks can be learned in this shape; it just takes a lot longer than the other machine types. Part of what we're exploring is why and how long, so every contribution, even one that doesn't complete a network, is just as important as those that do.
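A rough illustration of why "underparameterized" bites here. The actual MLC@Home architectures aren't given in this post, so the vanilla-RNN-plus-linear-readout shape below is purely an assumption for counting purposes:

```python
def rnn_param_count(input_size, hidden_size, output_size):
    # Parameter count for one vanilla RNN layer plus a linear readout:
    # W_ih (hidden x input) + W_hh (hidden x hidden) + b_h
    # + W_ho (output x hidden) + b_o
    return (hidden_size * input_size
            + hidden_size * hidden_size
            + hidden_size
            + output_size * hidden_size
            + output_size)

# With the shape held fixed, every machine type gets the same parameter budget,
# even though Parity* demands 2**8 = 256 distinct hidden states.
print(rnn_param_count(8, 16, 8))
```

Growing hidden_size increases the count roughly quadratically, which is why "just make it bigger" usually works; this experiment deliberately holds the shape constant instead and watches how training time varies with machine complexity.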

It just takes a while; that's expected. I hope that helps!
ID: 1049
Jonas
Message 1050 - Posted: 19 Jan 2021, 21:16:54 UTC - in response to Message 1049.  

Well, thank you very much for such a comprehensive explanation! Even if I did not understand every detail, it is good to know that this behavior was expected.

Thanks once again and have a great day!
ID: 1050
pianoman [MLC@Home Admin]
Message 1051 - Posted: 19 Jan 2021, 22:02:00 UTC
Last modified: 19 Jan 2021, 22:25:16 UTC

[edited -- nevermind, of course 10 minutes after posting I find the email in a weird folder I must have shuffled it to.]
ID: 1051


©2024 MLC@Home Team
A project of the Cognition, Robotics, and Learning (CORAL) Lab at the University of Maryland, Baltimore County (UMBC)