Core Concepts

What Is Training Data?

Training data is the collection of text, images, or other examples a model learns from. Its size, quality, and composition strongly shape what a model can do and what biases it inherits. Questions about where training data comes from, and whether its use is fair or licensed, are at the center of legal and ethical debates over AI.

Go deeper: Harms and Risks

Further reading

Read more about training data in articles and blogs from around the web: