What are Books1 and Books2?

The GPT-3 paper says the models were trained on filtered Common Crawl, WebText2, Books1, Books2, and Wikipedia.

Is one of these books datasets Project Gutenberg? If not, is there any public information about these datasets?

Thanks.
