@veff If you ever get into this yourself, whether with https://github.com/suno-ai/bark, some other FOSS tool, or a paid service, I have a "hot tip".
Judging by how you post, you don't need it, but I did. I learned quickly that voice AI does its best when it is given proper English with correct grammar and syntax. If I don't give it that, it gets confused and sounds like someone reading a book aloud for the first time and stumbling over a word they have never seen before.
This is probably why the Edmund Duke (from StarCraft 1) reading of "Nigger Worship and its Consequences" was actually pretty good. Despite only having 3 minutes of speech to work with as a source for him, the text he's reading is not marred by years of lazy internet "grammar" and syntax. Nor by decades of over-use of shorthand words and phrases.
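If you want to automate the "proper English" part of the tip, here is a rough sketch of a cleanup pass to run on text before handing it to the voice model. The `SHORTHAND` table and the `clean_for_tts` function are names I made up for illustration, not part of Bark or any other tool, and the word list is only a starting point.

```python
import re

# Hypothetical shorthand table (an assumption, not from any library); extend it for your own text.
SHORTHAND = {
    "u": "you",
    "ur": "your",
    "idk": "I don't know",
    "tbh": "to be honest",
    "imo": "in my opinion",
    "bc": "because",
}


def clean_for_tts(text: str) -> str:
    """Expand shorthand and tidy punctuation before sending text to a TTS model."""

    def expand(match: re.Match) -> str:
        word = match.group(0)
        # Look up the lowercased word; leave it unchanged if it is not shorthand.
        return SHORTHAND.get(word.lower(), word)

    # Match whole words only, so "bc" inside "abc" is left alone.
    text = re.sub(r"[A-Za-z']+", expand, text)
    # Collapse runs of whitespace into single spaces.
    text = re.sub(r"\s+", " ", text).strip()
    if not text:
        return text
    # Capitalize the first letter and make sure the text ends with punctuation.
    text = text[0].upper() + text[1:]
    if text[-1] not in ".!?":
        text += "."
    return text


print(clean_for_tts("idk  if u can hear this"))
```

The cleaned string is what you would then pass to the voice model as its text prompt. This only catches the shorthand you list yourself, so it is no substitute for writing the text properly in the first place.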