Conversation
-
emasculated (cope@eeeeeeeee.eu)'s status on Thursday, 29-Dec-2022 03:47:20 JST
@mint ever considered tracking statistics on which queries are most common on fba?
hunk.city still being in the top 50 will never not be funny
-
(mint@ryona.agency)'s status on Thursday, 29-Dec-2022 03:52:11 JST
@cope Maybe, but this would require either parsing the logs, which are rotated every N days, or logging everything in the webapp itself, which would cause additional strain. I'm a bit too lazy to figure that out. The main priority is keeping the crawler running: a bunch of admins started watching their own logs and reporting IPs on #fediblock, so I'd need to get a bunch of reliable proxies.
>hunk.city still being in the top 50 will never not be funny
People copypaste old blocklists without ever checking them. There are dozens, if not hundreds, of new instances that block, say, smugloli, which hasn't been online in 3 years, not to mention gab and truthsoc, neither of which even federates.
-
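Of the two options mint mentions, parsing the rotated logs is the one that adds no load to the webapp. A minimal sketch of that approach follows. It assumes the logs use the common/combined access-log request format, that older files may be gzip-compressed by rotation, and that searches arrive as a query-string parameter. The parameter name `domain` and every function name here are hypothetical, since the thread does not describe fba's actual URL scheme or log layout.

```python
import gzip
import re
from collections import Counter
from pathlib import Path
from urllib.parse import parse_qs, urlsplit

# Assumption: request lines look like the common/combined access-log format,
#   "GET /search?domain=example.com HTTP/1.1"
REQUEST_RE = re.compile(r'"(?:GET|POST) (\S+) HTTP/[\d.]+"')

# Assumption: hypothetical name of the query-string parameter being searched.
QUERY_PARAM = "domain"


def extract_query(line, param=QUERY_PARAM):
    """Return the normalized searched value from one log line, or None."""
    match = REQUEST_RE.search(line)
    if match is None:
        return None
    values = parse_qs(urlsplit(match.group(1)).query).get(param)
    value = values[0].strip().lower() if values else ""
    return value or None


def open_log(path):
    """Open a plain or gzip-compressed (rotated) log file as text."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt", encoding="utf-8", errors="replace")
    return open(path, "r", encoding="utf-8", errors="replace")


def count_queries(paths, param=QUERY_PARAM):
    """Tally searched values across all log files that are still on disk."""
    counts = Counter()
    for path in paths:
        with open_log(path) as handle:
            for line in handle:
                query = extract_query(line, param)
                if query is not None:
                    counts[query] += 1
    return counts
```

Calling `count_queries(paths).most_common(50)` on the current log plus its rotated siblings would give a top-50 list. Because the logs are rotated every N days, the tally only covers that retention window unless the counts are saved somewhere before the oldest file is deleted.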
pomstan (pomstan@xn--p1abe3d.xn--80asehdb)'s status on Thursday, 29-Dec-2022 04:04:44 JST
@mint @cope how about using cloud services for short-lived VMs?
-
(mint@ryona.agency)'s status on Thursday, 29-Dec-2022 04:07:30 JST
@pomstan @cope I'm thinking of pulling proxies from random public lists and using random ones for each crawl, but the reliability of those is likely not that good.
-
shitpisscum (shitpisscum@social.mrhands.horse)'s status on Thursday, 29-Dec-2022 04:13:21 JST
@mint @cope @pomstan Tor? Not sure how many of them completely block Tor exit nodes
-
(mint@ryona.agency)'s status on Thursday, 29-Dec-2022 04:13:20 JST
@shitpisscum @cope @pomstan Quite a few, plus a sizeable number are behind Cloudflare, which might give you a captcha page instead.
-
emasculated (cope@eeeeeeeee.eu)'s status on Thursday, 29-Dec-2022 04:21:13 JST
@mint @shitpisscum @pomstan easier just to scrape for open proxies, that's for sure: https://github.com/h1w/proxy-pool