Tuesday, February 19, 2008

General Search Strategies


  • Most search engines employ the principles of Boolean logic in the formulation of search queries. See Boolean Searching on the Internet for detailed information about search strategy. If you take the time to understand the basics of Boolean logic, you will have a better chance of search success.

  • Search engines tend to have a default Boolean logic. This means that the space between multiple search terms defaults to either OR logic or AND logic. This has become a de facto standard. It is imperative that you know which logical operator is the default. Nowadays, the default logic tends to be AND, but you should always check the site's Help file to make sure.
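
To see why the default operator matters, here is a minimal Python sketch. It assumes a toy in-memory collection of three made-up documents, and the `search` function is hypothetical rather than any engine's actual implementation. The same two-word query returns different results depending on whether the space is read as AND or as OR.

```python
# Toy collection: document id -> text (made-up data for illustration).
DOCS = {
    1: "bears enter hibernation in winter",
    2: "polar bears hunt seals",
    3: "hibernation lowers body temperature",
}

def search(query, default="AND"):
    """Return sorted ids of documents matching the query under the given default logic."""
    terms = query.lower().split()
    hits = []
    for doc_id, text in DOCS.items():
        words = set(text.lower().split())
        matched = [term in words for term in terms]
        # AND requires every term to be present; OR requires at least one.
        ok = all(matched) if default == "AND" else any(matched)
        if ok:
            hits.append(doc_id)
    return sorted(hits)

print(search("hibernation bears", default="AND"))  # [1]
print(search("hibernation bears", default="OR"))   # [1, 2, 3]
```

With AND as the default, only the document containing both words comes back. With OR, every document containing either word is returned, which is why the result set grows.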

  • Another de facto standard is the requirement to enclose phrases in quotation marks, e.g., "death penalty".

  • If proximity operators (e.g., NEAR) are available, use them rather than specifying an AND relationship between your keywords. This ensures that your search terms appear near each other in the full-text document. The closer together your terms appear, the more likely the document is to be relevant. Google does proximity searching by default. See Boolean Searching on the Internet for a list of more sites that offer proximity searching.
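
The idea behind a proximity operator can be sketched in a few lines of Python. This is only an illustration: the `near` function and its `max_distance` window are assumptions, and real engines define NEAR in their own ways (check each site's Help file).

```python
def near(text, term_a, term_b, max_distance=10):
    """Return True if term_a and term_b occur within max_distance words of each other."""
    words = text.lower().split()
    pos_a = [i for i, w in enumerate(words) if w == term_a.lower()]
    pos_b = [i for i, w in enumerate(words) if w == term_b.lower()]
    # Compare every occurrence of one term against every occurrence of the other.
    return any(abs(a - b) <= max_distance for a in pos_a for b in pos_b)

close = "the death penalty debate continues"
far = "death rates fell while the court reviewed an unrelated tax penalty"

print(near(close, "death", "penalty", max_distance=3))  # True
print(near(far, "death", "penalty", max_distance=3))    # False
```

Both sample sentences would match a plain AND search on death and penalty, but only the first one passes the proximity test.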

  • Field searching is another extremely important way of limiting your search results in large search engines that contain millions of full-text files. For example,

TITLE:slavery

in a search engine such as AltaVista will bring you more relevant hits than merely searching on the keyword slavery.

  • To enhance subject searches, try the URL field to narrow your results. Because of the way URLs are constructed, the URL field is a good place to search for certain subject terms.

Anatomy of a URL

This is a URL on the CNN home page:

http://www.cnn.com/feedback/comments.html

This URL is typical of addresses hosted in domains in the United States. Its structure is as follows:

  1. Protocol: http
  2. Host computer name: www
  3. Second-level domain name: cnn
  4. Top-level domain name: com
  5. Directory name: feedback
  6. File name: comments.html

The directory name and file name often contain subject terms. These can be searched with the URL field.

For example:

URL:slavery

will give you more relevant results than the keyword slavery by searching for this term as a directory name or a file name.
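
The breakdown above can be reproduced with Python's standard `urllib.parse` module. This is a minimal sketch: the `url_parts` helper is hypothetical, and it assumes a three-part host name like the CNN example and treats the last path segment as the file name.

```python
from urllib.parse import urlparse

def url_parts(url):
    """Split a URL into the parts named in the anatomy list."""
    parsed = urlparse(url)
    labels = parsed.hostname.split(".")                 # e.g. ['www', 'cnn', 'com']
    segments = [s for s in parsed.path.split("/") if s]  # e.g. ['feedback', 'comments.html']
    return {
        "protocol": parsed.scheme,
        "host": labels[0],
        "second_level_domain": labels[-2],
        "top_level_domain": labels[-1],
        "directories": segments[:-1],
        "file_name": segments[-1] if segments else "",
    }

parts = url_parts("http://www.cnn.com/feedback/comments.html")
print(parts["protocol"])             # http
print(parts["second_level_domain"])  # cnn
print(parts["directories"])          # ['feedback']
print(parts["file_name"])            # comments.html
```

The directories and file name are the pieces most likely to carry subject terms, which is what a URL field search takes advantage of.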

  • To find a home page when you know the location or sponsor of the information, use the SITE field. In this case, you search on the top-level and second-level domain names together, and then use AND logic to add subject terms to your search.

Examples of sites:

mit.edu
nasa.gov
microsoft.com

For example, if you are searching for information about spacewalks conducted by NASA, go to AltaVista and try something like this:

+site:nasa.gov +spacewalks

This search will limit your results to files at the NASA Web site.
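
Conceptually, a site restriction keeps only pages whose host name belongs to the given domain. The Python sketch below illustrates that idea. The `on_site` function and the sample URLs are made up for the example, and this is not a description of how AltaVista implements the SITE field.

```python
from urllib.parse import urlparse

def on_site(url, site):
    """True if the URL's host is the site itself or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    site = site.lower()
    return host == site or host.endswith("." + site)

# Hypothetical result URLs, for illustration only.
urls = [
    "http://www.nasa.gov/spacewalks.html",
    "http://spaceflight.nasa.gov/station/",
    "http://www.example.com/nasa.gov/spacewalks.html",
]

matches = [u for u in urls if on_site(u, "nasa.gov")]
print(matches)
```

The third URL mentions nasa.gov only in its path, so it is filtered out. Only the host name counts toward the site match.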

  • Beware of searching on three-letter top-level domains to narrow your search. Do NOT try to search for the URL edu or com. There are too many pages in these domains for the search engine to handle. On the other hand, searching for the URL gov may be more successful because there are far fewer of these pages. Still, all searches on top-level domains should be used with caution.

Keep in mind that there are a few search services that specialize in retrieving Web pages from individual top-level domains.

Use these specialty engines when you wish to limit your results to these domains, as your results are more likely to be accurate and comprehensive.

  • Limiting a search by a two-letter country code, also a top-level domain, might be a viable option. Take a look at this list of ISO 3166 Internet country codes.

    Quick Tip!

    Best Bet Search Syntax

    • Place the plus sign ( + ) in front of all words you wish to retrieve

    +hibernation +bears

    • Place a phrase within quotation marks

    "freedom of the press"

    • Putting it all together:

    +"drug policy" +"United States"
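
If you build queries in a script, the Best Bet syntax is easy to generate. Here is a small Python sketch. The `best_bet_query` helper is hypothetical, and it assumes any term containing a space should be treated as a phrase.

```python
def best_bet_query(*terms):
    """Prefix each term with +, quoting any term that contains a space."""
    parts = []
    for term in terms:
        term = term.strip()
        if " " in term:
            term = f'"{term}"'  # multi-word terms become quoted phrases
        parts.append("+" + term)
    return " ".join(parts)

print(best_bet_query("hibernation", "bears"))          # +hibernation +bears
print(best_bet_query("drug policy", "United States"))  # +"drug policy" +"United States"
```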
