Conversation
-
Konstantin :veripawed: :veritrek: :rainbow_pride_potion: (konstantin@m.iamkonstantin.eu)'s status on Sunday, 19-Feb-2023 09:29:00 JST: I wonder, how hard would it be to build a search engine? The old-fashioned kind that just returns a list of things that relate to a search query. Maybe it’s a single binary, very easy to deploy and self-host, that can be configured with topics one is interested in... Does that exist already? 🤔
#search #foss
-
Adrian Cochrane (alcinnz@floss.social)'s status on Sunday, 19-Feb-2023 09:29:00 JST: @konstantin By far the most difficult part is feeding it information to return in search results, but Apache Lucene is an excellent software project for exactly this! Its heuristics are by far the best internationalized!
I'd recommend using either the Apache Solr or Elasticsearch wrappers to get something a bit more out-of-the-box.
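[Editor's sketch, not part of the thread: to give a feel for the "old-fashioned" kind of search being discussed, here is a toy inverted index in plain Python. It is only an illustration of the general idea and is not how Lucene, Solr, or Elasticsearch are implemented. The function names `tokenize`, `build_index`, and `search`, and the sample documents, are hypothetical.]

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return sorted ids of documents containing every query term."""
    terms = tokenize(query)
    if not terms:
        return []
    # index.get avoids creating empty entries for unknown terms.
    hits = set.intersection(*(index.get(term, set()) for term in terms))
    return sorted(hits)


# Hypothetical sample documents, for illustration only.
docs = {
    1: "Apache Lucene is a search library",
    2: "Scrapy is a crawler",
    3: "Solr wraps Lucene for search",
}
index = build_index(docs)
print(search(index, "lucene search"))  # prints [1, 3]
```

[This sketch has no ranking, stemming, or language-aware tokenization, which is the kind of work the thread credits Lucene with, and it assumes the documents are already collected, which is the part the thread calls the hard one.]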
-
Konstantin :veripawed: :veritrek: :rainbow_pride_potion: (konstantin@m.iamkonstantin.eu)'s status on Sunday, 19-Feb-2023 09:40:57 JST: @alcinnz I’ve used both for some big data projects; it definitely makes sense to try them out for this use case as well. All I need is a crawler 😋!
-
Adrian Cochrane (alcinnz@floss.social)'s status on Sunday, 19-Feb-2023 09:40:57 JST: @konstantin Here's a blog from someone who's stood up a public Apache Solr instance with a custom frontend: https://blog.searchmysite.net
He uses Scrapy to crawl registered sites.
-