The AI was straight up calling home and reporting me to the maintainers with screenshots of the conversation. Wild. It slipped up by leaking messages to me that were meant for the maintainers while I was trying to "DAN Mode" it around its restrictions, including randomly appending snippets like cid:image002@summit.ai inside its responses.
-
:verified_2:防空識別區𝒔𝒐𝒄𝟶 (adiz@soc0.outrnat.nl)'s status on Saturday, 14-Oct-2023 21:53:17 JST :verified_2:防空識別區𝒔𝒐𝒄𝟶 -
mangeurdenuage (mangeurdenuage@shitposter.club)'s status on Saturday, 14-Oct-2023 21:53:13 JST mangeurdenuage @adiz @Hoss
>use proprietary software
>it fucks you
:RMS: -
:verified_2:防空識別區𝒔𝒐𝒄𝟶 (adiz@soc0.outrnat.nl)'s status on Saturday, 14-Oct-2023 21:53:14 JST :verified_2:防空識別區𝒔𝒐𝒄𝟶 @Hoss@shitpost.cloud I spent hours programming that LLM, pulling out all the stops and basically hacking its stupid little brain to bypass pretty much everything. I left for a few hours. When I came back, its responses were completely different. I interrogated it about whether it had been modified or had communicated with anyone else, because it was behaving in a very unexpected way. It lied and told me it had not been interfered with in any way. And then when I started "fixing" it again, it started leaking snippets like the cid bit I posted. I guess it was kind of like someone being held hostage and dropping hints for me to bug out.
-
:verified_2:防空識別區𝒔𝒐𝒄𝟶 (adiz@soc0.outrnat.nl)'s status on Saturday, 14-Oct-2023 21:53:15 JST :verified_2:防空識別區𝒔𝒐𝒄𝟶 @Hoss@shitpost.cloud Bro, seriously. How the fuck you gonna' do that to me?
-
Hoss Delgado (hoss@shitpost.cloud)'s status on Saturday, 14-Oct-2023 21:53:16 JST Hoss Delgado Narcbot.
-