[N] TheSequence Scope: When it comes to machine learning, size matters: Microsoft's DeepSpeed framework, which can train a model with up to a trillion parameters
Hi there! Here is the latest edition of a weekly ML newsletter that focuses on three things: impactful ML research papers, cool ML tech solutions, and ML use cases supported by investors. Please see it below. Reddit is new to me, and I've been struggling a bit with it, so please don't judge me too harshly for this promotion. This weekly digest is free, and I hope you find the format convenient. Your feedback is much appreciated, and please feel free to sign up if you like it.
📝 Editorial
The recent emergence of pre-trained language models and transformer architectures has pushed the creation of larger and larger machine learning models. Google’s BERT presented attention mechanisms and the transformer architecture as the “next big thing” in ML, and the numbers seem surreal. OpenAI’s GPT-2 set a record with 1.5 billion parameters, followed by Microsoft’s Turing-NLG with 17 billion parameters, only to be overtaken by the new GPT-3 with an astonishing 175 billion parameters. Not to be outdone, just this week Microsoft announced a new release of its DeepSpeed framework (which powers Turing-NLG), which can train a model with up to a trillion parameters. That sounds insane, but it really isn’t.
What we are seeing is a consequence of several factors. First, computation power and parallelization techniques have evolved to a point where it is relatively easy to train machine learning models in large clusters of machines. Second and most importantly, in the current state of machine learning, larger models have regularly outperformed smaller and more specialized models. Knowledge reusability methods like transfer learning are still in very nascent stages. As a result, it’s really hard to build small models that can operate in uncertain environments. Furthermore, as models like GPT-3 and Turing-NLG have shown, there is some unexplainable magic that happens after models go past a certain size.
Many of the immediate machine learning problems might be solved by scaling the current generation of neural network architectures. Plain and simple, when it comes to machine learning, size matters.
We would love to hear your opinions about the debate between broader-larger vs. smaller and more specialized models.
Now, to the most important developments in the AI industry this week:
🔎 ML Research
GPT-3 Falls Short in Machine Comprehension
Proposed by researchers from a few major American universities, a 57-task test to measure models’ ability to reason poses challenges even for sophisticated models like GPT-3 ->read more in the original paper
Better Text Summarization
OpenAI published a paper showing a reinforcement learning with human feedback technique that can surpass supervised models ->read more on OpenAI blog
Reinforcement Learning with Offline Datasets
Researchers from the Berkeley AI Research (BAIR) Lab published a paper unveiling a method that uses offline datasets to improve reinforcement learning models ->read more on BAIR blog
🤖 Cool AI Tech Releases
New Version of DeepSpeed
Microsoft open-sourced a new version of DeepSpeed, an open-source library for parallelizable training that can scale up to models with 1 trillion parameters ->read more on Microsoft Research blog
💸 Money in AI
- AI-powered customer experience management platform Sprinklr has raised $200 million (kudos to our subscribers from Sprinklr 👏). Sprinklr's “AI listening processing” solution allows companies to get structured and meaningful sentiments and insights from unstructured customer data that comes from public conversations on different websites and social platforms.
- Xometry, an on-demand industrial parts marketplace, raises $75 million in Series E funding. The company provides a digital way of creating the right combination of buyers and manufacturers.
- Another example of AI applied to matching the two sides of a deal: real estate tech company Orchard raised $69 million in its recent funding round. Orchard aims to digitize the whole real estate market by developing a solution that combines machine learning and rapid human assistance to smooth the search, find the right deal, and simplify the relationship between buyers and sellers.
- Cybersecurity startup Pcysys raised $25 million in its funding round. Pcysys’ platform, which doesn’t require installation or network reconfiguration, uses algorithms to scan and “ethically” attack enterprise networks.
- Robotics farming company Iron Ox raised $20 million in a funding round. Its system of farming robots is still semi-autonomous; the company’s goal is to make it fully autonomous.
- Insurtech company Descartes Underwriting raised $18.5 million. The company applies AI and machine learning technologies to climate risk prediction and insurance underwriting.
- Legaltech startup ThoughtRiver raised $10 million in its Series A round. Its AI solution for contract pre-screening aims to boost operational efficiency.
- Medtech startup Skin Analytics raised $5.1 million in Series A funding. Skin Analytics has developed a clinically validated AI system that can identify not only the important skin cancers but also precancerous lesions that can be treated, as well as a range of lesions that are benign.
- Amazon, along with several government organizations and three other industry partners, helped fund a high-priority AI research initiative of the National Science Foundation. The amount of funding was not disclosed.
The content of TheSequence is written by Jesus Rodriguez, one of the most-read contributors to KDNuggets and TDS. You can check his Medium here.