I'm interested in the Google feature that shows "Did you mean xxxxxxx" when you type in a misspelled word. How is it done? :confused:
Click here.
EDIT: I'm not sure exactly how it's done, that's as close as I got to finding it. Just thought the thread deserved at least one response. x_x
I'm new here so take this for what it's worth...
Ask Google how they do it... if it's a trade secret, all they'll say is no, right?
As they say: it never hurts to ask! :D ...well, sometimes it does.
You would need a huge index of all of the terms. When the user types in a query, you try to match it against the index. If it matches, you probably don't need a "Were you searching for: xxxx?" prompt. If it doesn't match, you'd have to find similar words, perhaps using regular expressions. That gives you a list to select from, but you don't want to show the user every similar word, only the ones they were most likely searching for. So you'd weigh the similar terms against each other and return the one with the highest relevance.
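Here's a rough sketch of that idea in Python, just to make it concrete. It's only one way to do the "find similar words with regular expressions" step: it builds a pattern that matches any word a single letter change away from the query (one letter swapped, dropped, or added). The names `one_edit_pattern` and `similar_terms` are made up for this example, and a real search engine would surely need something far more scalable than scanning the whole index.

```python
import re

def one_edit_pattern(query):
    """Regex for words one substitution, deletion, or insertion away from query."""
    q = query.lower()
    alternatives = []
    for i in range(len(q)):
        # ".?" in place of q[i]: that letter was swapped for another or left out
        alternatives.append(re.escape(q[:i]) + ".?" + re.escape(q[i + 1:]))
    for i in range(len(q) + 1):
        # "." squeezed in at position i: the word has one extra letter
        alternatives.append(re.escape(q[:i]) + "." + re.escape(q[i:]))
    return re.compile("|".join(alternatives))

def similar_terms(query, index):
    """Return None on an exact match (no suggestion needed), else the near misses."""
    lowered = {term.lower(): term for term in index}
    if query.lower() in lowered:
        return None
    pattern = one_edit_pattern(query)
    return [term for key, term in lowered.items() if pattern.fullmatch(key)]

index = ["Apple", "Car", "Lap", "Led", "Lead", "Leak", "Leap", "Lean",
         "Learn", "Long", "Mountain", "Orange", "Zebra"]
print(similar_terms("Leen", index))
print(similar_terms("Lean", index))
```

Note that this only catches single-letter mistakes. Anything two or more changes away from the query is missed, so you'd have to loosen the pattern (or use a different similarity measure) to catch those.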
User Query: Leen
Database: Apple, Car, Lap, Led, Lead, Leak, Leap, Lean, Learn, Long, Mountain, Orange, Zebra
Match All Letters: None
Match 3 Letters: Apple, Car, Lap, Led, Lead, Leak, Leap, Lean, Learn, Long, Mountain, Orange, Zebra
Match 2 Letters: Apple, Car, Lap, Led, Lead, Leak, Leap, Lean, Learn, Long, Mountain, Orange, Zebra
Match 1 Letter: Apple, Car, Lap, Led, Lead, Leak, Leap, Lean, Learn, Long, Mountain, Orange, Zebra
We have 2 very similar words that the user could have meant, Lean and Learn (assuming their spelling was somewhere in the ballpark). Maybe some algorithm could take the similar words, ignore the vowels, and see which one matches best (assuming vowels are the letters most often misplaced or mispronounced). Or perhaps we could see how many results "Lean" and "Learn" each return in our queries, and suggest the one with the most "hits" (which assumes the user wants the most popular result).
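Both of those tie-breakers are easy to sketch in Python. This is just an illustration, not how Google does it: `skeleton` and `pick_suggestion` are names I made up, and the hit counts are invented placeholder numbers, not real search statistics. It prefers a candidate whose consonants line up with the query's, and falls back on the hit count otherwise.

```python
VOWELS = set("aeiou")

def skeleton(word):
    """Lowercase the word and drop its vowels, e.g. 'Leen' -> 'ln'."""
    return "".join(ch for ch in word.lower() if ch not in VOWELS)

def pick_suggestion(query, candidates, hits):
    """Pick one candidate: consonant match first, then the most hits."""
    if not candidates:
        return None
    target = skeleton(query)
    # The key is a (bool, int) pair, so a consonant match always beats a
    # non-match, and the hit count only decides between equals.
    return max(candidates, key=lambda w: (skeleton(w) == target, hits.get(w, 0)))

# Placeholder hit counts, invented purely for the example.
hits = {"Lean": 1200, "Learn": 5400}

print(pick_suggestion("Leen", ["Lean", "Learn"], hits))
```

With the two ideas combined this way, the vowel trick wins whenever it applies, which may or may not be what you want. Swapping the order of the pair in the key would make popularity the main criterion instead.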
I'm not sure how they work, but those are just a few ideas that would seem fairly straightforward in principle. As for how to program this and scale it the way Google does, I don't want to think about it :p