I'm as pro-tech as the next tech-nerd, but no, you don't get to steal everyone else's hard work just because you're a corporation instead of a single person.
@eff AI should be keeping a distributed ledger documenting every use of information to build & use AI. The ledger should be used to pay royalties no matter the official copyright status; deep pockets should not be a prerequisite for remuneration.
@eff Scraping data for a model at that kind of scale should only be legal if the resulting model is fully free (as in freedom) and open-source.
Regardless of what fair use technically allows, taking public data to train a model with private ownership is unethical, especially when the motive is corporate profit.
We can get this innovation with far fewer downsides when models are freely available, modifiable, inspectable, and open to all, just like the interpretations of the training data were.