Search sucks! Yeah, it does, and here's why.
from julian@community.nodebb.org to fediverse@lemmy.world on 15 Jun 08:27
https://community.nodebb.org/post/104862

You might’ve heard that search sucks on software X… maybe software Y… definitely on software Z. The default one kind of sucks on NodeBB too, admittedly.

But why? It’s because search is really frickin’ hard to get right, and expensive to get good at.

Remember that Google started as a search company, and they became king because they got really good at it, and it was their only product (at the time, anyway!)

The easiest type of search is “full text” search. In its simplest form, it matches words exactly as you type them. For example, if you search lemmy, it matches posts that include the word lemmy but, depending on how the content was indexed, might not match lemmy.world, lemmy.ca, lemmyverse, etc.
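
To make the "depending on how the content was indexed" part concrete, here is a minimal sketch of an exact-match index built with two different tokenizers. The function names and sample posts are made up for illustration; real search engines are far more involved.

```python
import re

def tokenize_words(text):
    # Split on anything that is not a letter or digit,
    # so "lemmy.world" becomes "lemmy" and "world".
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]

def tokenize_whitespace(text):
    # Split on whitespace only, so "lemmy.world" stays a single token.
    return text.lower().split()

def build_index(posts, tokenize):
    # Inverted index: token -> set of post ids containing that token.
    index = {}
    for post_id, text in posts.items():
        for token in tokenize(text):
            index.setdefault(token, set()).add(post_id)
    return index

def search(index, term):
    # Exact match only: the term must equal an indexed token.
    return sorted(index.get(term.lower(), set()))

# Hypothetical sample posts.
posts = {
    1: "I just joined lemmy today",
    2: "My account lives on lemmy.world",
    3: "Browse communities on lemmyverse",
}

word_index = build_index(posts, tokenize_words)
space_index = build_index(posts, tokenize_whitespace)

print(search(word_index, "lemmy"))   # [1, 2]
print(search(space_index, "lemmy"))  # [1]
```

Neither index returns post 3 for lemmy, because lemmyverse is a different token. Same posts, same query, different results, purely because of the tokenizer.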

From there you start adding complexity like supporting AND and OR. You support partial matches (lem returns posts containing lemmy and lemmings).
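
As a rough sketch of what AND, OR, and partial matching could look like on top of a simple inverted index (all names and sample posts here are hypothetical, and the prefix scan is deliberately naive):

```python
import re

def tokenize(text):
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]

def build_index(posts):
    # Inverted index: token -> set of post ids containing that token.
    index = {}
    for post_id, text in posts.items():
        for token in tokenize(text):
            index.setdefault(token, set()).add(post_id)
    return index

def match_term(index, term):
    # Posts containing the exact term.
    return set(index.get(term.lower(), set()))

def match_prefix(index, prefix):
    # Posts containing any token that starts with the prefix.
    # This scans every token; a real engine would use a sorted
    # term dictionary or a trie instead.
    prefix = prefix.lower()
    hits = set()
    for token, ids in index.items():
        if token.startswith(prefix):
            hits |= ids
    return hits

def search_and(index, terms):
    # Every term must be present.
    sets = [match_term(index, t) for t in terms]
    return set.intersection(*sets) if sets else set()

def search_or(index, terms):
    # At least one term must be present.
    sets = [match_term(index, t) for t in terms]
    return set.union(*sets) if sets else set()

# Hypothetical sample posts.
posts = {
    1: "lemmy is a link aggregator",
    2: "lemmings is a classic game",
    3: "mastodon is not a link aggregator",
}
index = build_index(posts)

print(sorted(search_and(index, ["link", "aggregator"])))  # [1, 3]
print(sorted(search_or(index, ["lemmy", "game"])))        # [1, 2]
print(sorted(match_prefix(index, "lem")))                 # [1, 2]
```

Each feature is only a few lines here, but a real query parser also has to handle nesting, quoting, and precedence.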

Add more logic to remove stop words and articles like a, the, etc.
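
Stop-word removal is usually just a filter in the tokenizer. A minimal sketch, assuming a tiny hand-picked English stop-word list (real lists are longer and language-specific):

```python
import re

# Assumed, deliberately short stop-word list for illustration only.
STOP_WORDS = {"a", "an", "the", "and", "or", "of", "in", "on", "is", "to"}

def tokenize(text, stop_words=STOP_WORDS):
    # Lowercase, split on non-alphanumerics, then drop stop words.
    tokens = [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]
    return [t for t in tokens if t not in stop_words]

print(tokenize("The state of search in the Fediverse"))
# ['state', 'search', 'fediverse']
print(tokenize("the fediverse"))
# ['fediverse']
```

The same filter has to run on both the indexed content and the query, otherwise a search for "the fediverse" would look for a token that was never indexed.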

Put in some sorting logic to rank stuff higher (what’s your algo? Recency? Votes? etc.)
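
For a feel of what "sorting logic" can mean, here is one possible scoring function that mixes term frequency, votes, and recency. The formula, the weights, and the 30-day half-life are arbitrary assumptions for illustration, not what any particular software does.

```python
import math
from dataclasses import dataclass

@dataclass
class Post:
    id: int
    text: str
    votes: int
    age_days: float

def score(post, query_terms, half_life_days=30.0):
    # How often do the query terms appear in the post?
    words = post.text.lower().split()
    term_hits = sum(words.count(t.lower()) for t in query_terms)
    if term_hits == 0:
        return 0.0
    # Votes help, but with diminishing returns.
    vote_boost = math.log1p(max(post.votes, 0))
    # Recency decays exponentially: the weight halves every half_life_days.
    recency = 0.5 ** (post.age_days / half_life_days)
    return term_hits * (1.0 + vote_boost) * recency

def rank(posts, query_terms):
    # Highest score first; posts that do not match are dropped.
    scored = [(score(p, query_terms), p.id) for p in posts]
    return [pid for s, pid in sorted(scored, key=lambda x: -x[0]) if s > 0]

# Hypothetical sample posts.
posts = [
    Post(1, "search is hard", votes=2, age_days=300),
    Post(2, "why search sucks and search is hard", votes=50, age_days=1),
    Post(3, "cat pictures", votes=500, age_days=0),
]

print(rank(posts, ["search"]))  # [2, 1]
```

Post 3 has the most votes and is the newest, but it never mentions the query term, so it is excluded. Tuning how these signals trade off against each other is where a lot of the real effort goes.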

That’s just the tip of the iceberg… this problem domain is so vast that entire companies have been built around just providing searching as a service (e.g. Algolia), and it isn’t cheap!

#fediverse


yessikg@lemmy.blahaj.zone on 15 Jun 21:53

I wish more fedi software had advanced search options like Peertube

MimicJar@lemmy.world on 15 Jun 22:27

Search also sucks because people suck.

If I post a picture of a flower with the caption “Look what grew in my garden!”, that’s a terrible post from a search point of view.

Later on someone will search for “flower” but I didn’t use the word “flower” so now search sucks.

Of course a much more common post is someone posting a picture of text, from Twitter, Tumblr, etc. with, once again, a vague caption. You remember the picture, but not what the poster actually said.

Searching comments will sometimes help, but that depends on the comments being related.

julian@community.nodebb.org on 15 Jun 23:13

Does anyone remember way before Google had image recognition technology, the time they built a game that paired up random people on the internet, showed them each an image, and waited for them to both guess the same keyword?

It was gamified, human-powered taxonomy for meaningless internet points, and it was hilarious (at the time).

MimicJar@lemmy.world on 15 Jun 23:31

Google Image Labeler, apparently, but I don’t actually remember the game. It looks like it’s called Crowdsource now; you can get points, but it isn’t a competition.

mbirth@lemmy.ml on 16 Jun 14:03

but I didn’t use the word “flower”

Well, hopefully you’ve added an ALT text to the picture for all those visually challenged people out there - which then also helps search engines.

fdrc_lm@lemmy.blahaj.zone on 16 Jun 16:45

Not to mention that on the Fediverse you don’t have access to the whole network, because it is spread across many servers by design. That is good in many respects, but absolutely awful for search.

rglullis@communick.news on 15 Jun 08:32

Is this rant Fediverse-specific?

julian@community.nodebb.org on 15 Jun 08:36

@rglullis@communick.news A little bit, yes! There was a recent thread in the community I posted to where a discussion about the rather lacklustre search of various software took place.

rglullis@communick.news on 15 Jun 08:52

I understand where you are coming from: search is not easy. At the same time, I think we already have solutions that are “good enough” and don’t require a ton of work from developers. PostgreSQL FTS, for example, works well enough to power the search system for Lemmy, and it works out of the box.

hono4kami@piefed.social on 17 Jun 05:53

Heck even SQLite has full-text search built in, which is mind-blowing

https://sqlite.org/fts5.html
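
For anyone curious, here is a small sketch of FTS5 driven from Python's standard sqlite3 module. It assumes your SQLite build has FTS5 compiled in (most do, but not all); the table name, columns, and rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# An FTS5 virtual table indexes every column for full-text search.
conn.execute("CREATE VIRTUAL TABLE posts USING fts5(title, body)")
conn.executemany(
    "INSERT INTO posts (title, body) VALUES (?, ?)",
    [
        ("Search sucks", "Full text search is hard to get right"),
        ("Garden update", "Look what grew in my garden"),
        ("Lemmings", "A post about lemmings and lemmy"),
    ],
)

def search(query):
    # MATCH runs the full-text query; ORDER BY rank sorts best match first.
    rows = conn.execute(
        "SELECT title FROM posts WHERE posts MATCH ? ORDER BY rank",
        (query,),
    )
    return [title for (title,) in rows]

print(search("search AND hard"))  # boolean AND
print(search("garden OR lemmy"))  # boolean OR
print(search("lem*"))             # prefix match
```

Boolean operators, prefix queries, and relevance ordering all come with the extension, which is a lot of the list from the original post without writing any search logic yourself.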

SorteKanin@feddit.dk on 17 Jun 06:32

The solution is not to build this yourself. If you are sitting and building features yourself for search, stop. Use a dedicated search database instead.
