Tuesday, October 25, 2005

What do search engines solve?

The obvious answer to this question is "Search engines solve the problem of finding information." This makes sense to most people, but from a technical standpoint it is difficult to evaluate. What information is the user trying to find? Arguably every page on the web contains information, and most users can reach some web page without a search engine, so by this definition a search engine would not be necessary. In fact, the search engine itself is a web page, so by finding the search engine the user has already accomplished the goal.

Most people will quickly recognize that the average web surfer doesn't want just any information; the surfer is looking for one particular piece or type of information. Let us assume that the surfer is looking for the answer to some question. This won't always be the case, as some people simply browse the web hoping to find something interesting, but for the purposes of designing a search engine it's reasonable to assume that the user is looking for something specific.

So, search engines help the user answer a particular question by finding a page on the web that answers it. The next logical question is: which search engine is the best? We could rank a search engine simply by whether the user's search succeeded: did they ever find the answer to the question they were asking? For a single search this gives only a yes or no, but we could look at the percentage of successful searches over a long period of time to decide which search engine is the best.
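
As a rough sketch of that percentage, suppose each search is logged as an (engine, found) pair. The log format, the engine names, and the function name `success_rate` are all made up for illustration; no real data is implied.

```python
from collections import defaultdict

def success_rate(sessions):
    """Return the fraction of successful searches per engine.

    `sessions` is an iterable of (engine, found) pairs, where `found`
    is True if the user eventually found the answer. This log format
    is an assumption made for this sketch.
    """
    totals = defaultdict(int)
    successes = defaultdict(int)
    for engine, found in sessions:
        totals[engine] += 1
        if found:
            successes[engine] += 1
    return {engine: successes[engine] / totals[engine] for engine in totals}

# Hypothetical log: engine_a succeeds in 3 of 4 searches, engine_b in 1 of 2.
sessions = [
    ("engine_a", True), ("engine_a", False),
    ("engine_a", True), ("engine_a", True),
    ("engine_b", True), ("engine_b", False),
]
rates = success_rate(sessions)
print(rates)
```

With this made-up log, engine_a scores 0.75 and engine_b scores 0.5.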

Whether the user eventually finds what they are looking for is not all we should consider in ranking a search engine. Users could probably find the information eventually without ever using a search engine, so by this criterion alone a search engine offers no improvement. Instead we want to measure how well a search engine helps a user answer the question. A logical measure of this is the time it takes to find the answer. Time isn't terribly reliable, though, because some users read more slowly or get distracted. A better way would be to count the number of hyperlinks followed or pages visited to find the answer. We could refer to this as the path length.

Measuring the average path length to find the answer to a question seems to give a good metric for evaluating a particular search engine. It gives a way to test "how good are the search results?", which is otherwise a very vague question and hard to answer quantitatively. So, returning to our original question, "what do search engines solve?", it looks like we can say that search engines attempt to minimize the path length to the answer to a question.
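
Here is a minimal sketch of that metric, assuming each session is recorded as the list of pages the user visited plus a flag saying whether the answer was found. The session format and page names are hypothetical. Counting only successful sessions is my own assumption too, since failed searches have no path to an answer and would need separate handling (the success rate, for instance).

```python
from statistics import mean

def path_length(pages_visited):
    """Number of hyperlinks followed: every page after the starting page."""
    return max(len(pages_visited) - 1, 0)

def mean_path_length(sessions):
    """Average path length over the sessions where the answer was found.

    `sessions` is a list of (pages_visited, found) pairs. Returns None
    if no session succeeded, since the average is undefined in that case.
    """
    lengths = [path_length(pages) for pages, found in sessions if found]
    return mean(lengths) if lengths else None

# Hypothetical sessions for one engine; the last one never finds an answer.
sessions = [
    (["search", "results", "answer"], True),           # 2 links followed
    (["search", "results", "page1", "answer"], True),  # 3 links followed
    (["search", "results"], False),                    # excluded from the mean
]
average = mean_path_length(sessions)
print(average)
```

Under these assumptions, a lower average means a better engine, and two engines can be compared by running the same questions through both.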

This is useful to me because it gives me a target for my project and a way to objectively compare my results against other search engines. The solution is now testable, and the metric also lets me answer more precisely questions like "what emerges?" when using a swarm intelligence approach.

Notice that the problem of minimizing path length to an answer has other solutions besides search engines. This framing puts the focus on the user experience rather than on the quality of the search engine itself, which is definitely where the focus should be. It also means that I am no longer limited to a search engine to solve this problem; I can apply swarm intelligence in another way to optimize the user experience.

It seems like I should design an experiment to test existing search engines and gather some data against which to compare the project I eventually produce.

2 Comments:

Anonymous Anonymous said...

This comment has been removed by a blog administrator.

4:14 PM  
Blogger blinks said...

Yay for interestingness. Makes me remember an old idea I had... hmm...

6:21 PM  
