Yandex finds a better way to train ML models over the Internet | MCUTimes

A new proposal from the tech giant Yandex overcomes a major obstacle to advances in machine learning by bringing the process to the masses so that anyone with a home computer can help train a large neural network.

Modern deep learning applications require many GPUs, which are expensive and therefore usually available only to well-funded companies and institutions. To ensure that ML model training does not become the exclusive domain of large organizations that can afford the necessary hardware, developers have experimented with pooling the computing resources of a group of volunteers. Grid, or volunteer, computing is a promising idea, but it suffers from high latency, asymmetric bandwidth, and other challenges specific to volunteer setups.

Yandex proposes a solution called Distributed Deep Learning in Open Collaborations (DeDLOC), which addresses these challenges by taking the best features of data parallelism across GPUs and improving on popular distributed training techniques. DeDLOC enables anyone in the ML community to run large-scale distributed training with their friends.

  • The new algorithmic framework adapts to the participants’ different network and hardware settings for efficient data transfer.
  • DeDLOC has been successfully tested – Yandex’s team of researchers, along with Hugging Face, a professor from the University of Toronto and others, used the method to pretrain sahajBERT, a model for the Bengali language, with 40 volunteers. On downstream tasks, the model achieves quality comparable to much larger models that used hundreds of high-tier accelerators.
  • DeDLOC can also be important for multilingual NLP. The community around any language can now train its own models without needing huge computational resources concentrated in one place.
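
To make the idea of data parallelism across uneven hardware more concrete, here is a minimal Python sketch. It is not DeDLOC's actual algorithm or API. It only illustrates one common building block, under the assumption that each volunteer computes a gradient on however many samples its hardware can handle and that the gradients are then averaged with weights proportional to those sample counts. The linear model, the peer data, and the function names (local_gradient, average_gradients) are all hypothetical.

```python
from typing import List, Tuple

Vector = List[float]
Sample = Tuple[Vector, float]  # (features, target)


def local_gradient(w: Vector, batch: List[Sample]) -> Vector:
    """Mean squared-error gradient of a linear model on one peer's local batch."""
    grad = [0.0] * len(w)
    for x, y in batch:
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        for i, xi in enumerate(x):
            grad[i] += 2.0 * err * xi
    return [g / len(batch) for g in grad]


def average_gradients(grads: List[Vector], sizes: List[int]) -> Vector:
    """Average peer gradients, weighting each by the samples that peer processed."""
    total = sum(sizes)
    dim = len(grads[0])
    return [sum(g[i] * n for g, n in zip(grads, sizes)) / total for i in range(dim)]


# Hypothetical peers: a fast one that processed 3 samples and a slow one with 1.
w = [0.5, -1.0]
fast_peer = [([1.0, 2.0], 1.0), ([0.0, 1.0], -1.0), ([2.0, 0.0], 2.0)]
slow_peer = [([1.0, 1.0], 0.0)]

grads = [local_gradient(w, fast_peer), local_gradient(w, slow_peer)]
combined = average_gradients(grads, [len(fast_peer), len(slow_peer)])

# The weighted average matches the gradient a single machine would compute
# on the union of both batches (up to floating-point rounding).
reference = local_gradient(w, fast_peer + slow_peer)
print(combined, reference)
```

Weighting by sample count means a slow home computer still contributes, just proportionally less, and the combined update is the same as if one large machine had processed every sample.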

The GitHub repo associated with DeDLOC can be found HERE.

The DeDLOC research paper can be found here: https://arxiv.org/abs/2106.10207
