
‘Catastrophic overtraining’ could harm large language AI models that are trained on more data simply for the sake of it

  • Researchers from top US universities warn that extending pre-training can be detrimental to performance
  • Too much pre-training can deliver worse performance due to something akin to the butterfly effect
  • The more models are pre-trained, the more sensitive they become to small changes that can disrupt the end result

Researchers from Carnegie Mellon, Stanford, Harvard, and Princeton are challenging one of AI development’s accepted core beliefs – that the more pre-training data, the better the performance.

As reported by HPCwire, a new paper discusses the concept of “catastrophic overtraining,” whereby extended pre-training can harm a model’s performance after fine-tuning.

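The sensitivity the researchers describe can be illustrated with a simple probe: perturb a checkpoint’s weights with small random noise and measure how much its loss degrades. The sketch below is not from the paper; the `perturbation_sensitivity` function, the toy model, the data, and the noise scale `sigma` are all illustrative assumptions. In practice, one would compare the measured loss increase across checkpoints pre-trained on different token budgets.

```python
# Minimal sketch (assumed, not from the paper): estimate how fragile a model is
# by adding small Gaussian noise to its weights and measuring the loss increase.
import copy
import torch
import torch.nn as nn

def perturbation_sensitivity(model, loss_fn, inputs, targets, sigma=1e-3, trials=10):
    """Average loss increase when each parameter gets i.i.d. Gaussian noise of scale sigma."""
    model.eval()
    with torch.no_grad():
        base_loss = loss_fn(model(inputs), targets).item()
        increases = []
        for _ in range(trials):
            noisy = copy.deepcopy(model)          # leave the original checkpoint untouched
            for p in noisy.parameters():
                p.add_(sigma * torch.randn_like(p))
            increases.append(loss_fn(noisy(inputs), targets).item() - base_loss)
    return sum(increases) / len(increases)

if __name__ == "__main__":
    torch.manual_seed(0)
    # Placeholder "checkpoint" and data; real use would load checkpoints saved
    # at different pre-training token counts and compare their scores.
    model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
    x, y = torch.randn(256, 32), torch.randint(0, 10, (256,))
    print("loss increase under perturbation:",
          perturbation_sensitivity(model, nn.CrossEntropyLoss(), x, y))
```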