@foone yeah, I gave this some thought. There's a WordPress plugin I use on my site that adds a bunch of specific "User-agent: ai-hooverbot-v69
Disallow: /"
entries to robots.txt, which at least keeps the well-behaved training crawlers from beating it to death.
Of course, I have seen evidence of badly behaved ones still coming through, alongside desirable things like search spiders and archive bots. I don't want to block the desirable ones inadvertently or annoy any users.
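
For what it's worth, here's a rough sketch of how one could sanity-check that a rule set like that blocks the crawler but leaves search and archive bots alone. It assumes Python's standard `urllib.robotparser`; the rules, the `example.com` URL, and the `allowed` helper are placeholders I made up, and `ai-hooverbot-v69` is just the joke bot name from above. It only tells you what a bot that actually obeys robots.txt would do, so it says nothing about the badly behaved ones.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents (not the plugin's real output).
ROBOTS_TXT = """\
User-agent: ai-hooverbot-v69
Disallow: /

User-agent: *
Allow: /
"""


def allowed(user_agent: str, url: str = "https://example.com/some-post/") -> bool:
    """Return True if a robots.txt-obeying bot with this user agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


# Check the crawler we want gone and a couple of bots we want to keep.
for agent in ("ai-hooverbot-v69", "Googlebot", "archive.org_bot"):
    print(agent, "allowed" if allowed(agent) else "blocked")
```

Swapping in the real robots.txt text and the user-agent strings of the bots you care about would show whether any of them got caught by accident.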