Topic: Training Data

3 chapters across the catalog

Talking Toilet
Episode 1751 5:45 - 10:49

AI Data Center Market Downturn and Inference Shift

According to a data center developer, the AI infrastructure market is in a significant downturn, with companies such as Microsoft reportedly canceling contracts. The emergence of the Chinese DeepSeek model has shifted expectations toward cheaper training methods, moving the industry's focus from remote training centers to low-latency "inference" hubs. Many struggling data centers are being repurposed for Bitcoin mining, while major firms such as KKR and BlackRock have already exited these investments.

Flash to Bang
Episode 1619 1:32:34 - 1:37:25

AI Training Data, Copyright Law and Coding

Proposed legislation would require AI developers to disclose the sources of their training data in order to protect copyright holders. Some argue that scanning all human knowledge into a database benefits society, while others worry about the loss of intellectual property rights. Separately, industry insiders claim that AI-generated code is often inferior to code written by human programmers.

Beast Train
Episode 1593 2:48:31 - 2:54:55

AI Model Collapse, Recursive Training, NPR Marketplace

A segment from NPR's Marketplace explains the concept of "model collapse," where AI systems trained on AI-generated content become increasingly deranged over generations. This recursive feedback loop is compared to photocopying a photocopy until the original image is unrecognizable. The only proposed solution is hiring humans to write original "prose" to freshen the training data.
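
The photocopy analogy can be illustrated with a toy simulation. The sketch below is not taken from the segment and is not how real language models are trained; it assumes a deliberately simple stand-in "model" (a Gaussian fitted to a small sample) that is refitted each generation using only data produced by the previous generation. The function name `simulate_collapse` and all parameter values are hypothetical choices for illustration.

```python
import random
import statistics


def simulate_collapse(generations=200, sample_size=10, seed=0):
    """Toy recursive-training loop (illustrative assumption, not a real LLM).

    Generation 0 is fitted to "human" data drawn from a standard normal.
    Every later generation is fitted only to samples produced by the
    previous generation's fitted model.
    Returns the fitted spread (standard deviation) for each generation.
    """
    rng = random.Random(seed)
    # "Original" data: samples from the true distribution.
    data = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
    spreads = []
    for _ in range(generations + 1):
        # "Train" the model: estimate mean and spread from the current data.
        mu = statistics.fmean(data)
        sigma = statistics.pstdev(data)
        spreads.append(sigma)
        # The next generation sees only this model's own output.
        data = [rng.gauss(mu, sigma) for _ in range(sample_size)]
    return spreads


if __name__ == "__main__":
    spreads = simulate_collapse()
    print(f"generation 0 spread:   {spreads[0]:.4f}")
    print(f"generation 200 spread: {spreads[-1]:.3e}")
```

With a small sample per generation, the fitted spread tends to shrink toward zero, so later generations reproduce less and less of the original variety. Mixing fresh samples from the true distribution back into `data` each generation would be the toy counterpart of adding newly written human text.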