How to choose a search engine or Web directory

There are an estimated 2.7 billion Web pages, and the number is growing by approximately 5 million per day (NEC Research Institute study, October 2000). How do we get to them? The best way is through Web subject directories and Web search engines. No single search engine is able to search every page on the Web. According to a recent study published on Search Engine Watch (http://www.searchenginewatch.com/reports/sizes.html), even the largest of these searchable databases currently fully indexes only 1.5 billion of the pages on the World Wide Web. Another thing to consider is that there are many Web search engines and directories out there, and they are not necessarily looking at the same pages. In other words, there is said to be little overlap between the different search resources. So, more often than not, if we search several different databases, we will be able to cover a much larger portion of the Web.
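To see why searching several databases helps when they overlap so little, here is a small sketch in Python. The page numbers, the three tools and the 12-page "Web" are all made up for illustration; they are not taken from the studies cited above.

```python
# Hypothetical page IDs indexed by three made-up search tools.
engine_a = {1, 2, 3, 4, 5}
engine_b = {4, 5, 6, 7}
directory_c = {7, 8, 9}

TOTAL_PAGES = 12  # assumption: pretend the whole Web has only 12 pages


def combined_coverage(indexes, total_pages):
    """Return the fraction of the Web reached by searching every index."""
    seen = set().union(*indexes)  # pages found by at least one tool
    return len(seen) / total_pages


# Coverage of the single largest tool versus all three together.
best_single = max(len(engine_a), len(engine_b), len(directory_c)) / TOTAL_PAGES
together = combined_coverage([engine_a, engine_b, directory_c], TOTAL_PAGES)

print(f"Largest single tool: {best_single:.0%}")
print(f"All three together:  {together:.0%}")
```

Because the three made-up indexes share only a couple of pages, searching all of them reaches noticeably more of the toy Web than the largest one does alone.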

Knowing that there is so much information (good and bad) out there, and that so little of it is cataloged or indexed for us, we forge ahead looking for the best way to get at the information that is accessible to us. The two main ways to do this are through Web subject directories and Web search engines. Most of you are probably familiar with searching the Web in some way, shape or form. But how do we maximize the good "hits"? Knowing how search engines and Web directories work can help us choose a good one for our research needs.


Subject Directories

WHEN TO USE A SUBJECT DIRECTORY:

Use a general-interest subject directory when you are searching a large topic. Subject directories cover less territory than search engines, but they have good, solid sources in their databases. You should be able to use a directory much as you would use an encyclopedia. (Disclaimer: we don't advocate using the Web as a whole as an encyclopedia, but the analogy should be helpful in understanding the "general" information searches that directories are useful for.) Under the heading of Dogs in an encyclopedia, one would expect to find information regarding different breeds, the care of dogs and perhaps a history of their domestication. One would not expect to find a reference for a local pet store. The same is true for directories. Use a directory for topics that are widely written about (e.g. Weapons in Schools or Michael Jordan).

DON’T USE A DIRECTORY FOR:

Searches for small companies, specific products (e.g. "Garden-Made lotion by Davies Gate", unless you find a directory devoted to Cosmetics or Beauty), or topics that are highly specialized.

Most subject directories are compiled by professional or academic organizations. They are often created and maintained by experts to support the needs of researchers. However, some of them (like Yahoo!) index pages based on page owners' recommendations, which can be a bit risky when you are searching for quality information. Here are a few popular ones:


Search Engines

WHEN TO USE A SEARCH ENGINE:

Use a search engine when you are searching for very specific information on a topic. Search engines allow you to enter very specific queries into their databases and retrieve Web sites that closely match your needs, especially if you take the time to search correctly with individual engines and search using a variety of terms. Now is the time to enter a search for your local pet store to see if they have a page on the Web, but make sure you know the name of it, as well as the town it's in!

DON'T USE A SEARCH ENGINE FOR:

...a very general subject! For example, searching for dogs, volcanoes, adoption, cartoons, etc. would yield thousands of hits, but probably none of them would give you basic information on the subject.

So what makes a search engine different from a subject directory? Search engines are sophisticated computer programs that "read" Web pages and report things about their content back to a main database. Some of them, called spiders, "crawl" through the Web, tracing one hotlink to another. They look at words near the top of the Web page and use them to categorize the page. If words relevant to the content of the page are near the top, then we're in luck! If not, then we are going to get a bad "hit," as it is called, and not reach helpful information. This is the way a basic, first-generation search engine works.
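The crawling and categorizing described above can be sketched in a few lines of Python. This is only a toy under stated assumptions: the three-page "Web" in PAGES is invented, and the helper names (top_words, crawl) and the choice of the first five words are ours, not how any real engine works.

```python
from collections import deque

# A tiny made-up "Web": each page has some text and hotlinks to other pages.
PAGES = {
    "home": {"text": "Dogs and puppies care guide for new owners",
             "links": ["breeds", "store"]},
    "breeds": {"text": "Dog breeds list with history of domestication",
               "links": ["home"]},
    "store": {"text": "Welcome to our site! Click here to enter. We sell dog food",
              "links": []},
}


def top_words(text, how_many=5):
    """Return the first few words of a page, lowercased, punctuation stripped."""
    return [w.strip(".,!?").lower() for w in text.split()[:how_many]]


def crawl(start):
    """Follow hotlinks from page to page and index each page by its top words."""
    index = {}                # word -> list of pages filed under that word
    queue = deque([start])
    visited = set()
    while queue:
        page = queue.popleft()
        if page in visited:   # do not read the same page twice
            continue
        visited.add(page)
        for word in top_words(PAGES[page]["text"]):
            index.setdefault(word, []).append(page)
        queue.extend(PAGES[page]["links"])
    return index


index = crawl("home")
print(index.get("dog"))      # pages filed under "dog"
print(index.get("welcome"))  # pages filed under "welcome"
```

Notice that the toy pet store page does mention dog food, but not near the top, so this sketch files it under "welcome" rather than "dog". That is the kind of mismatch that produces bad hits.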

Some well-known first-generation search engines are:

One way that programmers are trying to work around the problem of bad hits is to add a little human element to the equation: you! Second-generation search engines take a look at what searchers are doing and use that information to rank hits for future users. For instance, if I search for Oatmeal and, out of all of the choices I am given, I choose to spend the most time on the Quaker Oats site, the search engine would keep track of my "vote." With enough votes, a search for Oatmeal would offer the Quaker Oats site at the top of the list, perhaps even before the "Everything you ever wanted to know about Oatmeal" site!
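Here is a rough Python sketch of the "voting" idea. We assume, purely for illustration, that a vote is the number of seconds a searcher spends on a site; the VoteRanker class and the .example addresses are hypothetical and do not describe any particular engine.

```python
from collections import defaultdict


class VoteRanker:
    """Toy ranker: sites where searchers linger longest rise to the top."""

    def __init__(self):
        # query -> site -> total seconds spent by all searchers (assumed signal)
        self.votes = defaultdict(lambda: defaultdict(float))

    def record_visit(self, query, site, seconds):
        """Count time spent on a site as a 'vote' for that query."""
        self.votes[query.lower()][site] += seconds

    def rank(self, query, candidates):
        """Order candidate sites by accumulated votes, highest first."""
        scores = self.votes[query.lower()]
        return sorted(candidates, key=lambda s: scores.get(s, 0.0), reverse=True)


ranker = VoteRanker()
candidates = ["everything-about-oatmeal.example", "quakeroats.example"]

# Three made-up visits by earlier searchers.
ranker.record_visit("Oatmeal", "quakeroats.example", 120)
ranker.record_visit("Oatmeal", "everything-about-oatmeal.example", 15)
ranker.record_visit("oatmeal", "quakeroats.example", 90)

ranked = ranker.rank("Oatmeal", candidates)
print(ranked)
```

With these made-up visits, the Quaker Oats stand-in collects the most time and moves ahead of the other site for the next person who searches for Oatmeal.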

Second-generation search engines:

One thing to note about search engine results is that they often have a "cached" link. When the engine found the Website, it recorded a copy of the page as it was at that exact time. So, if the site has since changed (e.g. a newspaper's front-page story), the copy the engine collected can still be viewed by clicking on the "cached" link.

There is one more type of search engine to look at: the metasearch engine. It sends your search out to several search engines at once. It retrieves only a certain number of hits; otherwise we would have a very unwieldy set of Websites to examine.
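A metasearch engine can be pictured with the short Python sketch below. The two "engines" are stand-in functions that return fixed, made-up addresses, and the limit of two hits per engine is our assumption; real metasearch engines decide for themselves how many hits to keep and how to combine them.

```python
def engine_one(query):
    """Stand-in for a real search engine; returns made-up hits."""
    return ["a.example", "b.example", "c.example", "d.example"]


def engine_two(query):
    """A second stand-in engine with partly different made-up hits."""
    return ["b.example", "e.example", "a.example"]


def metasearch(query, engines, hits_per_engine=2):
    """Send one query to every engine and merge a limited number of hits."""
    merged = []
    seen = set()
    for engine in engines:
        # Keep only the first few hits from each engine (assumed limit).
        for url in engine(query)[:hits_per_engine]:
            if url not in seen:  # skip sites another engine already returned
                seen.add(url)
                merged.append(url)
    return merged


results = metasearch("oatmeal", [engine_one, engine_two])
print(results)
```

Capping the hits from each engine and dropping repeats is what keeps the combined list short enough to examine.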

Some popular metasearch engines are:

Other Resources:

Types of search engines and how to use them:

 


University of Illinois at Urbana-Champaign
Library Gateway
Comments to: jstraw@uiuc.edu
Last updated 10/15/2002 KG