Seek and Ye Shall Find


By: John Michael Pierobon

What is a search engine? Are all search engines the same? Which one should I use? Why do the results of search engines differ? And how do I use one effectively? This article answers these questions.

A search engine is a program that catalogs and indexes Internet content. It allows users to type in a search phrase and have the search engine return a ranked list of Web sites found in its catalog.

There are four kinds of search engines: human-compiled, hybrid, machine-compiled, and meta.

A "crawler" is a "robot", and a "robot" is a "spider", and a "spider" is a "crawler". These are three terms for the same thing. A "crawler" is a computer program that automatically traverses, or crawls, through a Web site recursively, retrieving all documents that are referenced. A machine-compiled search engine uses a "spider" to add Web sites to its catalog.
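The recursive traversal described above can be sketched in a few lines of Python. This is only an illustration, not how any particular search engine works: the SITE dictionary is a made-up stand-in for a real Web site, and the function name crawl is hypothetical. A real "crawler" would fetch pages over the network and extract the links from the HTML.

```python
# Hypothetical in-memory "Web site": each page maps to the pages it links to.
SITE = {
    "/index.html": ["/about.html", "/articles.html"],
    "/about.html": ["/index.html"],
    "/articles.html": ["/articles/search.html", "/about.html"],
    "/articles/search.html": [],
}


def crawl(page, links, catalog=None):
    """Recursively follow links from `page`, recording each page once."""
    if catalog is None:
        catalog = []
    if page in catalog:
        # Already visited; stop here so that link cycles do not loop forever.
        return catalog
    catalog.append(page)
    for target in links.get(page, []):
        crawl(target, links, catalog)
    return catalog


print(crawl("/index.html", SITE))
# ['/index.html', '/about.html', '/articles.html', '/articles/search.html']
```

Note the check for pages already visited. Web pages link back to each other all the time, so without it the "crawler" would never finish.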

AltaVista is an example of a machine-compiled search engine. It has been awarded more search-related patents than any other company in the world.

Human-compiled search engines, as their name implies, require the intervention of human beings to build the catalogs of Web sites. Web sites are read and reviewed by editors, and are placed into specific categories. About.com is an example of a human-compiled search engine.

A hybrid search engine combines a machine-compiled and a human-compiled search engine. It offers the best of both worlds: the broad coverage of a "crawler" and the judgment of human editors. Yahoo started out as a human-compiled search engine, but has since become a hybrid search engine.

A meta search engine searches other search engines and returns the top results from those search engines. Rather than searching only AltaVista and getting a very large number of entries, a meta search engine will search several different search engines and return the top ten or twenty entries from each of them. Metacrawler is an example of a meta search engine.
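As a rough sketch of the idea, the Python below merges the top entries from several engines. Everything in it is assumed for illustration: engine_a and engine_b are stand-ins that return canned results instead of querying a real search engine, and dropping duplicate entries is my own assumption about what a meta search engine might do.

```python
def engine_a(query):
    # Stand-in for a real search engine; returns a fixed ranked list.
    return ["a.example/1", "shared.example", "a.example/2"]


def engine_b(query):
    # A second stand-in engine with partly overlapping results.
    return ["shared.example", "b.example/1"]


def meta_search(query, engines, top_n=10):
    """Collect the top_n entries from each engine, skipping duplicates."""
    merged = []
    seen = set()
    for name, search in engines.items():
        for url in search(query)[:top_n]:
            if url not in seen:
                seen.add(url)
                merged.append((name, url))
    return merged


results = meta_search("roosevelt", {"A": engine_a, "B": engine_b}, top_n=2)
print(results)
# [('A', 'a.example/1'), ('A', 'shared.example'), ('B', 'b.example/1')]
```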

Each search engine has a help section that explains how to construct your search phrase to drill deep and find exactly what you are looking for. Boolean searches help narrow down a search by using operators such as "AND", "OR", "NEAR", and "NOT". Most search engines treat words in double quotes as phrases, and interpret plus signs "+" as mandatory inclusions and minus signs "-" as exclusions. Many search engines allow for wildcards, which are helpful if one is not sure of the spelling. Most search engines also allow for fuzzy searches, which tolerate spelling mistakes by the user. For example, if a user types in "Rosevetl" the search engine would return entries for "Roosevelt".
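To make the plus sign, minus sign, and double-quote conventions concrete, here is a minimal Python sketch. It is not the syntax of any particular search engine: the function names parse_query, matches, and fuzzy_correct are hypothetical, matching is done by plain substring comparison, and the fuzzy search simply borrows get_close_matches from Python's standard difflib module.

```python
import difflib
import re

# A token is an optional + or - sign followed by a "quoted phrase" or a word.
TOKEN = re.compile(r'([+-]?)(?:"([^"]+)"|(\S+))')


def parse_query(query):
    """Split a query into required, excluded, and optional terms."""
    required, excluded, optional = [], [], []
    for sign, phrase, word in TOKEN.findall(query):
        term = (phrase or word).lower()
        if sign == "+":
            required.append(term)
        elif sign == "-":
            excluded.append(term)
        else:
            optional.append(term)
    return required, excluded, optional


def matches(text, query):
    """True if text has every required term and none of the excluded ones."""
    required, excluded, _optional = parse_query(query)
    text = text.lower()
    return (all(term in text for term in required)
            and not any(term in text for term in excluded))


def fuzzy_correct(word, vocabulary):
    """Return the closest known word, or the word itself if none is close."""
    close = difflib.get_close_matches(word.lower(), vocabulary, n=1)
    return close[0] if close else word


query = '+"new deal" -taxes roosevelt'
print(parse_query(query))   # (['new deal'], ['taxes'], ['roosevelt'])
print(matches("Roosevelt and the New Deal", query))   # True
print(matches("New Deal taxes explained", query))     # False
print(fuzzy_correct("Rosevetl", ["roosevelt", "rosemary", "revolt"]))  # roosevelt
```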

Most search engines now allow you to exclude material that may not be suitable for children from the results page.

Different search engines treat keywords differently, and their results differ for several reasons. Some search engines catalog more pages than others, or run their "crawler" more frequently. They give different weight to the position and frequency of the keywords in the documents. Their definitions of "NEAR" differ: "NEAR" could mean within five words in one search engine, while in another it could mean within twenty words. Some rely more heavily on the title of the document, while others use the meta tags which appear in the head of the HTML document.
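The effect of different "NEAR" definitions is easy to see in a small Python sketch. The function near and its window parameter are hypothetical names chosen for this example; actual search engines do not publish their implementations.

```python
import re


def near(text, first, second, window):
    """True if `first` and `second` occur within `window` words of each other."""
    words = re.findall(r"\w+", text.lower())
    first_positions = [i for i, w in enumerate(words) if w == first.lower()]
    second_positions = [i for i, w in enumerate(words) if w == second.lower()]
    return any(abs(i - j) <= window
               for i in first_positions
               for j in second_positions)


text = "The president signed the bill after a long debate in Congress"

# The two keywords are nine words apart in this sentence.
print(near(text, "president", "congress", window=5))    # False
print(near(text, "president", "congress", window=20))   # True
```

The same document matches the same query in one search engine and not in the other, simply because the two define "NEAR" differently.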

I have mentioned some of the most popular search engines, but sometimes one needs to look at other search engines for specialized searches. Many search engines have carved out a niche market for themselves. For example, Anzwers specializes in Australian and New Zealand Web sites. Deja is famous for its ability to search newsgroups.

To make informed decisions one must gain knowledge. Knowledge is derived from information. Information is structured raw data which can be easily retrieved thanks to the Internet. The Internet is full of information. So, where does one start?

To search the Web effectively one should begin with the end in mind. Start by formulating a query. Take advantage of the advanced features of your search engine, such as surrounding keywords in double quotes and using operators such as "AND", "OR", and "NOT". To improve your search results, opt for rarer, more specific words: "musicals" instead of "music". Use wildcards.

Try the obvious domain names. For example, if you are looking for a specific Microsoft product, try www.microsoft.com.

If you get stuck, use another search engine.

When you see a promising site, bookmark it. You can always clean up your bookmarks later. Do not lose the information you found.

Additional information on search engines may be found by visiting www.searchenginewatch.com. It is a Web site that contains lots of comparative information about search engines.

John Michael Pierobon is an Internet consultant based in Fort Lauderdale.
John Michael may be reached by sending electronic mail to pierobon@pierobon.org



© 2000 - 2006 John Michael Pierobon