it's either that or i should just learn sql. but since i have no vices and i am perfect,
RE: https://brain.worm.pink/objects/fcd9f101-3ee0-4e73-a16d-d3d9dc74f80e
-
φ (fiore@brain.worm.pink)'s status on Tuesday, 25-Feb-2025 16:03:15 JST φ
-
Theory_Of_Everything (kirby@annihilation.social)'s status on Tuesday, 25-Feb-2025 16:03:46 JST Theory_Of_Everything
@snacks @fiore :chirumiru_cirno_dance:
the efficiency of such a schema is widely disputed -
† top dog :pedomustdie: (dcc@annihilation.social)'s status on Tuesday, 25-Feb-2025 16:03:46 JST † top dog :pedomustdie:
@kirby @snacks @fiore You just need to know json and postgres....... WAIT A SECOND -
Snacks (snacks@netzsphaere.xyz)'s status on Tuesday, 25-Feb-2025 16:03:47 JST Snacks
@fiore postgres, but almost everything is stored in json, not in tables and rows -
φ (fiore@brain.worm.pink)'s status on Tuesday, 25-Feb-2025 16:03:48 JST φ
@snacks what db does pleroma use
-
Snacks (snacks@netzsphaere.xyz)'s status on Tuesday, 25-Feb-2025 16:03:49 JST Snacks
@fiore sql doesn't help you with pleroma -
Phantasm (phnt@fluffytail.org)'s status on Tuesday, 25-Feb-2025 16:05:09 JST Phantasm
@snacks @fiore It makes sense compared to normalizing every object into 30 columns every time it comes in or needs to go out. The big downside is the huge size of the tables and indexes needed for it to work quickly. With indexes, the performance is about the same as a normal DB schema. -
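A minimal sketch of the trade-off being described here, with illustrative table and index names (Pleroma's actual DDL differs): one jsonb column instead of a wide normalized schema, plus an expression index on the one key that lookups actually hit.

```sql
-- Whole object lands in one jsonb column; no per-field normalization.
CREATE TABLE objects (
    id   bigserial PRIMARY KEY,
    data jsonb NOT NULL
);

-- B-tree expression index on a single extracted key, so lookups by
-- ActivityPub id stay fast without splitting the object into columns.
CREATE INDEX objects_ap_id_index ON objects ((data ->> 'id'));

-- This predicate matches the index expression, so it is index-assisted:
SELECT data FROM objects WHERE data ->> 'id' = 'https://example.com/objects/1';
```

The index (and any others like it) is the "huge size" cost mentioned above: each indexed expression is stored again alongside the table.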
Theory_Of_Everything (kirby@annihilation.social)'s status on Tuesday, 25-Feb-2025 16:05:25 JST Theory_Of_Everything
@dcc @fiore @snacks but like
you have to go through every single row of the database, unmarshal the json, and look at the contents of the json to find what you're looking for if you're making a query......... that's hardly efficient -
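That per-row scan-and-unmarshal is what a GIN index over the document avoids; a hedged sketch, assuming a hypothetical `objects` table with a jsonb `data` column:

```sql
-- GIN index over the whole document: containment queries walk the
-- index instead of sequentially scanning and re-parsing every row.
CREATE INDEX objects_data_gin ON objects USING gin (data jsonb_path_ops);

-- @> (containment) is supported by the jsonb_path_ops GIN index:
SELECT data FROM objects WHERE data @> '{"type": "Note"}';
```

`jsonb_path_ops` indexes are smaller and faster than the default `jsonb_ops` operator class, but only support containment (`@>`), not key-existence operators.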
† top dog :pedomustdie: (dcc@annihilation.social)'s status on Tuesday, 25-Feb-2025 16:05:51 JST † top dog :pedomustdie:
@kirby @fiore @snacks You have to store it some way, maybe i could think of a better way myself (my vps biz is just going to use sqlite lol) -
Theory_Of_Everything (kirby@annihilation.social)'s status on Tuesday, 25-Feb-2025 16:08:08 JST Theory_Of_Everything
@dcc @fiore @snacks you could just have a traditional database schema set up; you don't have to store json inside the database like pleroma does -
Phantasm (phnt@fluffytail.org)'s status on Tuesday, 25-Feb-2025 16:11:44 JST Phantasm
@kirby @dcc @fiore @snacks The JSON for Pleroma is stored as a binary format; there's no need to parse it once it is in the DB. The overhead is at insertion time. Once it's in the DB, Postgres has its own representation of it that it can understand without re-parsing it all the time. It does not store whitespace in the JSON itself, and it does not store duplicate keys.
To skim over the details, it's basically stored in an already-parsed representation. To simplify even more, imagine Postgres ran jq on the JSON and stored jq's internal representation of it. -
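That normalization is easy to see directly: casting text to `jsonb` drops whitespace, collapses duplicate keys (last value wins), and reorders keys into Postgres's internal order.

```sql
-- jsonb is decomposed at insert time: whitespace is gone, the duplicate
-- "a" key is collapsed to its last value, and keys come back sorted.
SELECT '{"b": 1,   "a": 2, "a": 3}'::jsonb;
-- → {"a": 3, "b": 1}
```

Contrast with the `json` type, which stores the literal text verbatim and re-parses it on every access.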
† top dog :pedomustdie: (dcc@annihilation.social)'s status on Tuesday, 25-Feb-2025 16:12:56 JST † top dog :pedomustdie:
@phnt @kirby @fiore @snacks Btw do you know why kirby's dumps don't seem to have 20 gb of data? (activities and notifications) -
† top dog :pedomustdie: (dcc@annihilation.social)'s status on Tuesday, 25-Feb-2025 16:15:06 JST † top dog :pedomustdie:
@kirby @fiore @phnt @snacks Separate data and schema -
Theory_Of_Everything (kirby@annihilation.social)'s status on Tuesday, 25-Feb-2025 16:15:07 JST Theory_Of_Everything
@dcc @phnt @fiore @snacks oh yeah we've been having this problem where we have a 40 gb data dump but only 20 gb of data ends up inside postgres, and there seem to be no activities inside the imported database when we start up pleroma
i also had this issue with fre, but that was solved by making a custom-format postgres dump and waiting 2 hours for the entire thing to import into another machine, so i thought nothing of it. but we tried that here, and after 1 or 2 days of importing the entire thing we ended up with the same problem -
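The custom-format and split schema/data dumps discussed in this thread can be sketched as follows (database and file names are illustrative):

```shell
# Custom-format dump: compressed, and pg_restore can parallelize it.
pg_dump -Fc -d pleroma -f pleroma.dump
pg_restore -j 4 -d pleroma_new pleroma.dump

# Or split schema and data into separate plain-SQL dumps,
# restoring schema.sql first, then data.sql:
pg_dump --schema-only -d pleroma -f schema.sql
pg_dump --data-only   -d pleroma -f data.sql
```

Custom format (`-Fc`) is generally preferable for large databases: the restore can skip or reorder objects and build indexes after the data load.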
Phantasm (phnt@fluffytail.org)'s status on Tuesday, 25-Feb-2025 16:16:33 JST Phantasm
@dcc @kirby @fiore @snacks The dump of a Pleroma DB without indexes is around 50% of the full size in Postgres. I don't know how big the original DB is. And considering the really long time it took to restore, I also suspect an incorrect dump. Perhaps with triggers not disabled. -
† top dog :pedomustdie: (dcc@annihilation.social)'s status on Tuesday, 25-Feb-2025 16:16:33 JST † top dog :pedomustdie:
@phnt @kirby @fiore @snacks This time (the second time), with separate schema and data (the data is the right size), it is only 20gb in postgres. -
Phantasm (phnt@fluffytail.org)'s status on Tuesday, 25-Feb-2025 16:18:59 JST Phantasm
@dcc @kirby @fiore @snacks Are indexes built properly? Did psql spit out any errors when ingesting?
I don't see a simple reason as to why it would be so small after an import. Even FluffyTail is bigger than 20GB after 1.5 years. -
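One way to narrow down where the missing 20GB went, assuming psql access to the imported database: per-table and whole-database size queries against the system catalogs.

```sql
-- Ten largest tables (including their indexes and TOAST data);
-- an undersized activities or objects table points at a failed load.
SELECT relname,
       pg_size_pretty(pg_total_relation_size(oid)) AS total_size
FROM pg_class
WHERE relkind = 'r'
  AND relnamespace = 'public'::regnamespace
ORDER BY pg_total_relation_size(oid) DESC
LIMIT 10;

-- Whole-database size for comparison against the dump size:
SELECT pg_size_pretty(pg_database_size(current_database()));
```

Comparing row counts (`SELECT count(*) FROM activities;`) between source and imported databases settles whether rows were actually skipped.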
† top dog :pedomustdie: (dcc@annihilation.social)'s status on Tuesday, 25-Feb-2025 16:18:59 JST † top dog :pedomustdie:
@phnt @kirby @fiore @snacks You see what's weird
screenshot_25_February_24_23-17-16.png
screenshot_25_February_24_23-18-45.png -
Phantasm (phnt@fluffytail.org)'s status on Tuesday, 25-Feb-2025 16:24:23 JST Phantasm
Did psql spit out any errors when ingesting?
I suspect this, if psql was not run with -v ON_ERROR_STOP=1. It could have thrown an error, skipped a big table (likely activities or objects), and continued the import. There should be zero errors coming from psql; the only exception is psql complaining that a different version of Postgres was used for the dump than the one running the import. In that case, the warning can be ignored and psql run without -v ON_ERROR_STOP=1, but there still shouldn't be any other warnings or errors.
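A sketch of the invocation being described (file and database names are illustrative):

```shell
# ON_ERROR_STOP makes psql abort at the first error instead of
# silently skipping a failed table and continuing the import.
psql -v ON_ERROR_STOP=1 -d pleroma_new -f pleroma.sql
```

With the flag set, psql also exits non-zero on failure, so a wrapping script can detect a broken import instead of reporting success.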
-
Phantasm (phnt@fluffytail.org)'s status on Tuesday, 25-Feb-2025 16:27:27 JST Phantasm
@kirby @dcc @fiore @snacks That's a big fail, as the foreign key cannot find an activity with that id in the activities table. At least that's what I think is happening. The activities table might be incomplete after an import. -
Theory_Of_Everything (kirby@annihilation.social)'s status on Tuesday, 25-Feb-2025 16:27:28 JST Theory_Of_Everything
@phnt @dcc @fiore @snacks there was this one error during the import process, but this is likely being spit out because i fucked with the database by bookmarking an object and then deleting the object from the db last year
psql:p_data.sql:28793676: ERROR: insert or update on table "bookmarks" violates foreign key constraint "bookmarks_activity_id_fkey"
DETAIL: Key (activity_id)=(00000189-c415-0f21-5936-3173ee4a0000) is not present in table "activities".
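A hedged sketch of clearing such orphaned rows before dumping, based only on the constraint named in the error (`bookmarks_activity_id_fkey` referencing `activities`); run against the source DB, and verify the SELECT before deleting anything:

```sql
-- List bookmarks whose referenced activity no longer exists:
SELECT b.activity_id
FROM bookmarks b
LEFT JOIN activities a ON a.id = b.activity_id
WHERE a.id IS NULL;

-- Remove them so the restored dump satisfies the foreign key:
DELETE FROM bookmarks b
WHERE NOT EXISTS (SELECT 1 FROM activities a WHERE a.id = b.activity_id);
```

Note this only explains the one reported error; a single skipped bookmark row would not account for 20GB of missing data, so the size discrepancy still needs a separate explanation.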