Martin Felton Library

Web Searching


This is a workshop I did a number of YEARS ago.  Some of the information may be out of date, but the basic rules and tips for web searching remain the same.


  1.     Introduction

·    Disclaimer:  I will be giving a lot of information and tools for searching the Internet and determining the usefulness of the resources found there; we will have little time for actual practice.  Also, because search engines and tools change rapidly and links become outdated or disappear, some of the information here will no longer be useful tomorrow.  The best we can hope for is to develop some basic strategies and understanding for research on the Internet.

2.     The Problems with Doing Research on the Internet

·       Time Consuming: Sometimes even the simplest searches turn out to take hours, especially if one starts to wander.

·       Ambiguous:  Thousands of “hits” make it difficult to determine just where to go, and often you never find exactly what you are looking for.

·       Validity: Everybody and his brother can have a site on the Internet; just whom do you trust?

·       Reflection: The old research rules have changed, and with pages available at the click of a mouse, there is very little time for reflection.

3.     Researching on The Internet

·       Search Methods and Tools.

·        A search tool is a computer program that performs searches; a search method is the way a search tool requests and retrieves information.

·        There are basically four types of search tools:

·        A Directory Search or Subject Search: Information is arranged in hierarchical order (e.g. Yahoo)

·         A Search Engine: Searches are done using keywords (e.g. HotBot)

·        A Directory with a Search Engine: Both of the above methods are combined to achieve more specific searches as you go further down the hierarchy (e.g. Yahoo)

·        A Multi-Engine Search:  A search tool that sends a query to a number of different search engines simultaneously.  There are also a number of third-party programs that do the same thing (e.g. Copernicus)

·        IF TIME: Please perform your search using each of the engines described above and see the differences between search engines.
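To make the idea of a multi-engine search more concrete, here is a minimal sketch in Python.  The two "engines" are made-up stand-ins that return fixed lists of example addresses; a real multi-engine tool would contact real search sites and would merge and rank results in its own way.  All names here (engine_a, engine_b, multi_engine_search) are hypothetical.

```python
from typing import Callable, Dict, List

# Hypothetical stand-ins for real search engines.
# Each one returns a ranked list of addresses for a query.
def engine_a(query: str) -> List[str]:
    return ["http://example.org/habitats", "http://example.org/deserts"]

def engine_b(query: str) -> List[str]:
    return ["http://example.org/deserts", "http://example.org/wetlands"]

def multi_engine_search(
    query: str, engines: Dict[str, Callable[[str], List[str]]]
) -> List[dict]:
    """Send one query to every engine and merge the hits, removing duplicates."""
    merged: Dict[str, dict] = {}
    for name, search in engines.items():
        for rank, url in enumerate(search(query), start=1):
            entry = merged.setdefault(
                url, {"url": url, "found_by": [], "best_rank": rank}
            )
            entry["found_by"].append(name)
            entry["best_rank"] = min(entry["best_rank"], rank)
    # Pages found by more engines come first; ties go to the better rank.
    return sorted(
        merged.values(), key=lambda e: (-len(e["found_by"]), e["best_rank"])
    )

results = multi_engine_search("animal habitats", {"A": engine_a, "B": engine_b})
for hit in results:
    print(hit["url"], "found by", ", ".join(hit["found_by"]))
```

Listing a page higher when several engines agree on it is only one possible way to combine results; it is an assumption of this sketch, not a description of how any particular tool works.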

·       Keyword Search Operators.

·        For most of us, a simple keyword is the way we start a search; however, if you can use search operators in your search, you may have much more success.

·         IF TIME: Perform your searches again using specific operators and see the differences in the results.
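As a rough illustration of what "using search operators" looks like, the sketch below builds the same query in several forms: as plain keywords, as a quoted phrase, with plus/minus signs, and with a Boolean AND.  The helper names are hypothetical, and the exact syntax varies from one search tool to another, so check the help file of the site you use (and Appendix One) before relying on any of these forms.

```python
def phrase(words: str) -> str:
    """Wrap words in quote marks so they are searched as one phrase."""
    return '"' + words + '"'

def require(term: str) -> str:
    """Plus sign: the term must appear in the results."""
    return "+" + term

def exclude(term: str) -> str:
    """Minus sign: the term must not appear in the results."""
    return "-" + term

def boolean_and(*terms: str) -> str:
    """Boolean AND: every term must appear."""
    return " AND ".join(terms)

simple = "tile roof"                                   # two separate keywords
as_phrase = phrase("tile roof")                        # one phrase
plus_minus = " ".join([require("roof"), exclude("slate")])
boolean = boolean_and(phrase("tile roof"), "repair")

for query in (simple, as_phrase, plus_minus, boolean):
    print(query)
```

Typing each of these forms into the same search tool is a quick way to see how much the operators change the number and quality of hits.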

·       Keyword Searches.

·        See Appendix One for a list of search sites and the search operators they support.

·        Also, please check the site you are using for their help file or FAQ (Frequently Asked Questions); many sites have operators and criteria of their own.

·        Many people will stick with one site only, just to get to know that site and the way it works.

·       Planning and Conducting a Search.

·        In order to choose from the wide variety of search engines out there, you may wish to check out Choosing the Best Search Engine for Your Purpose first.

·        Limiting the Number of Hits

·        The hardest part of any search is trying to evaluate the number of “hits” you get after your search.  In order to limit these hits, please keep in mind the following:

·         Avoid making your queries too general; the more specific, the better.

·        Compose your query using the appropriate language for the search engine; check the help file.

·        Use quotation marks around keywords and phrases.

·        Try using a multi-engine search tool.

·        Try using a subject search tool.

·        Try using the advanced search fields offered by a number of search sites (e.g. Power Search for Excite, Super Search for HotBot, etc.)

·       Hints.

1.          To speed up your searches, bookmark your favorite search tools for future use.  Also, bookmark useful sites during your search, so that you can later find your way back to them.  This technique also eliminates typing errors when the addresses or URLs are long and complicated.

2.          There are times when a search tool will not connect to a Web site for one of several reasons:

·       You may have misspelled a word or erred in the address, not uncommon errors.  A careful check will detect the mistake.

·       You may have difficulty in accessing the site because of the high activity there.  In such instances, avoid, if possible, the hours when the peak use periods of the US time zones overlap.

·       At times, the search tool itself may be disabled or undergoing changes, and you will need to wait until it is operating again.

·       The site has been discontinued, but not yet removed as a link.

3.          Search tools are constantly changing as they expand their scope and improve their performance.  Use the Help Section of your frequently used search tools to remain current as to their use.

4.          An extraordinary number of hits often results because the query directs the search to individual words rather than to related words as in a phrase or title.  For example, if a query asks for American customs rather than "American customs", then the responses will be for the words American and customs separately, in addition to the coupled words.  Quote marks narrow the search by coupling related words.  Other operators act similarly in limiting searches to the intended meaning.

5.          Because each search tool has its own method and criteria for indexing and supplying information, its database content and retrieval method will tend to be unique.  Therefore, the responses to a particular query may vary greatly from search tool to search tool.  For any one query, you will considerably improve your chances of finding the information you want by using several search tools.

6.          A search tool may provide different responses at different times to the exact same query, since its database and retrieval criteria change frequently.  Different responses occur more frequently in keyword searches than subject searches.

7.          During a search you will sometimes find long articles that you prefer not to read or print at the moment.  You can defer action by selecting the text, copying it to the Clipboard and then pasting it into a word processing window.  Later you can read the articles and decide which parts, if any, you wish to keep for future reference.

8.          Some Web sites and browsers may give you the option of eliminating graphics.  If your computer is slow to download pages, you can speed up your search by using search tools that have minimal graphics.  You can assess this factor by noting how long it takes to download the Home Page.

9.          Knowledge of how information is indexed can be helpful in selecting an appropriate search engine for a query.  There are three methods used in the indexing of a Web site database.

·       Full text index: A database index that is said to include all terms and URLs.  In practice each search tool uses a filter to remove words it considers unnecessary.

·       Keyword Index: A database index that is based on the location and frequency of words and phrases.  However, if a name or term is mentioned only once or twice in the Web site, it may not be included in its index. Keyword indexing is the most used and fastest growing indexing method.

·       Person [Human] Index: This index is created by individuals who review Web sites and select the most appropriate words and phrases to describe their content.  It provides a directory that is high in relevance, based on cataloging methods similar to those used by libraries.  Unlike the above two indexing methods, which employ robots, it has the value of being reviewed.

10.       When constructing a query, avoid using general terms except where modified by a specific one.  Otherwise, you will get an enormous number of hits of little or no relevance.  For example, roof alone is too broad, but "tile roof" as a phrase is acceptable.

11.       There are many ways of finding information on the Internet other than by the use of the WWW.  These include WAIS, Archie, Veronica, Gopher and ftp, all of which preceded the WWW but have been largely overshadowed by it.  For the beginner, it is better to master the Web first, so as not to dilute your efforts.

12.       The author of a publication starts out by creating a Web site and then submits the Web site to the Webmasters of both directories and search engines.  Directory reviewers assess submittals for suitability, and if accepted, index their contents.  Search engines automatically spider all Web sites on the Internet, indexing new ones according to their particular guidelines.  Both directories and search engines periodically update previous submittals, using their established procedures.

13.       There continues to be a huge proliferation of Web sites because the Internet provides a simple and cost-effective way to publish and attain worldwide exposure.  Because search engines spider their input without review, the searcher needs to be careful about the validity, accuracy and authority of their references.  Directories which are reviewed have some advantage in this respect. In any case, wherever you can, consider the reputation of the author, source of the information and date of publication.

·        IF TIME: Please perform your search again and evaluate your success (or lack thereof) using the information above.
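The effect of quote marks described in Hint 4 can be imitated with a very small sketch.  The three "pages" below are invented for the example, and the two matching functions are simplified assumptions about how a keyword search and a phrase search behave; real search tools index and rank pages in far more elaborate ways.

```python
import re
from typing import Dict, List

# Hypothetical mini "web": page name -> page text.
PAGES: Dict[str, str] = {
    "page1": "American customs and holidays explained",
    "page2": "Customs forms for American travellers",
    "page3": "Customs duties in Colombia",
}

def words(text: str) -> List[str]:
    """Split text into lowercase words, ignoring punctuation."""
    return re.findall(r"[a-z]+", text.lower())

def match_any_word(query: str, pages: Dict[str, str]) -> List[str]:
    """Unquoted query: a page is a hit if it contains any of the words."""
    wanted = set(words(query))
    return [name for name, text in pages.items() if wanted & set(words(text))]

def match_phrase(query: str, pages: Dict[str, str]) -> List[str]:
    """Quoted query: the words must appear together, in the same order."""
    target = words(query)
    n = len(target)
    hits = []
    for name, text in pages.items():
        tokens = words(text)
        if any(tokens[i:i + n] == target for i in range(len(tokens) - n + 1)):
            hits.append(name)
    return hits

print(match_any_word("American customs", PAGES))  # every page mentions a word
print(match_phrase("American customs", PAGES))    # only the exact phrase
```

In this toy example the unquoted query returns all three pages, while the phrase query returns only the one page where the two words sit side by side.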

4.     Determining the Validity of Resources and Using Bibliographic Documentation

·       There are a couple of Interesting Sites To See What to Look For in a Web Page…

1. How To Critically Analyze Information Resources

·       And What To Look Out for…

1.  The Good, The Bad, and The Ugly

Credits: How To Search the Web: A Tutorial for Beginners and Non-Experts  

APPENDIX ONE

Preferred Keyword Search Tools and Their Operators

 

                 Operators
Search Tool      Boolean   Plus/Minus   Quote Marks   Stemming   Case Sensitive
All-In-One*      x
AltaVista        x         x            x             *
Dogpile          x                      x             o
Encyc. Brit.
Excite           x         x            x             x          x
HotBot           x         x            x             o          x
Infoseek         o         x            x             x          x
LookSmart
MetaFind         x                      x
Magellan
Mamma            x         x            x                        x
MetaCrawler                x            x
Northern Light   o         x            x
OneKey
SavvySearch      o         x            x                        o
Yahoo            x         x            x             *          x

Table Symbols: [x] means supports, [o] means excludes, [*] means a wild card capability.
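If you want to pick a search tool by the operators it supports, the table can be written down as data and filtered.  The sketch below copies the table's symbols into one short string per tool, using "." for a blank cell.  Treating both [x] and [*] as "supports" is an assumption made for this example, and the names COLUMNS, TABLE and tools_supporting are hypothetical.

```python
COLUMNS = ("boolean", "plus_minus", "quotes", "stemming", "case_sensitive")

# One character per column, in the order of COLUMNS.
# "x", "o" and "*" are the table's own symbols; "." marks a blank cell.
TABLE = {
    "All-In-One":     "x....",
    "AltaVista":      "xxx*.",
    "Dogpile":        "x.xo.",
    "Encyc. Brit.":   ".....",
    "Excite":         "xxxxx",
    "HotBot":         "xxxox",
    "Infoseek":       "oxxxx",
    "LookSmart":      ".....",
    "MetaFind":       "x.x..",
    "Magellan":       ".....",
    "Mamma":          "xxx.x",
    "MetaCrawler":    ".xx..",
    "Northern Light": "oxx..",
    "OneKey":         ".....",
    "SavvySearch":    "oxx.o",
    "Yahoo":          "xxx*x",
}

def tools_supporting(*operators: str) -> list:
    """Return the tools marked x or * for every requested operator."""
    positions = [COLUMNS.index(op) for op in operators]
    return [
        tool for tool, row in TABLE.items()
        if all(row[i] in ("x", "*") for i in positions)
    ]

print(tools_supporting("boolean", "quotes"))
print(tools_supporting("stemming"))
```

Since the table itself is old, the answers are only as current as the table; the point of the sketch is the lookup, not the data.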

APPENDIX TWO

Bookmark - A page on the Netscape Browser that lists URLs or Web addresses.  Bookmarks serve as links for easy access to Web addresses. MS Explorer’s equivalent is called Favorites.  To bookmark a Web page on your screen, click Bookmark on the bar, and when it is displayed, click Add Bookmark. The link is then added to the bottom of the Bookmark Listing.

Boolean Search - A keyword search that uses Boolean Operators for obtaining a precise definition of a query.

Browsing - In the WWW browsing refers to a directory search. In popular use, browsing, or surfing, is casually looking for information on the Internet.

Browser - A computer program used to connect to Web sites on the World Wide Web and access information.

Concept Search - A query that implies a term’s broader meaning, rather than its literal meaning.

Data - Information such as text, numbers, images and sound contained in a form that can be processed on a computer.  

Database - Stored information at a Search Tool’s Web site.  For search engines, a robot is used to keep the database current by an automated procedure called spidering.  For directories, the database is kept current through reviews conducted by qualified people.

Directory Search - A hierarchical search that starts with a general heading and proceeds through selection of increasingly more specific headings or subjects. It provides a means of focusing more closely on the object of the search.  It is also referred to as subject search, directory guide or directory tree.

False Drops  - Documents that are retrieved but are not relevant to the user’s interest.

Fields - Components of a Web page such as a title, URL, summary, text and images often displayed by a search engine to help narrow a search.  

Full-Text Indexing - A database index that includes all terms and URLs.  In practice, each search tool uses a filter to remove words it considers unnecessary.

Hierarchical - A ranking of subjects or things from the most general to the most specific.

Hits - A list of links or references to documents that are returned in response to a query, also called matches or matching queries.

Home Page  - The first page of a search tool’s Web site.

Hypertext Link - A highlighted word or image [shown in color] on a Web page that when clicked connects or links to another location with related information. Links provide an easy way to move about the Internet.

Index or catalog - A file that designates the location of specific data in a search engine’s database. 

Internet - The Internet, with a large I, refers to a worldwide system of linked computer networks that serve as a communication system. With a small i, internet means a group of interconnected local networks.

Keyword - A term that a computer can recognize and use as the basis for executing a search.

Keyword Search - A search that utilizes terms that define the user’s interest.

Link - More accurately hypertext link. It is a connection between two Web pages or sites that have related information. For example, highlighted data such as text and graphics at one Web site when clicked provide related information residing at another Web site.

Location Box, Also Address Box - A designated place within a browser for an address [URL].  It is the starting point for accessing a Web site.

Multi-Engine Search - A search that uses a number of search engines in parallel to provide a response to a query.

Operator - A rule or a specific instruction used in composing a query.

Phrase Search - A search that uses a string of adjacent, related words enclosed in quote marks as the query.

Popular Items - A search category created to cover frequently sought subjects and services. Search tools list Popular Items on their Home Page.

Precision - A standard measure of information retrieval, defined as the number of relevant documents obtained divided by the total number of documents retrieved.

Proximity - Proximity is how closely words appear together within a document. In this context, adjacency or phrase usually means that words must appear exactly in the order specified with no intervening words.

Query - A search request. A combination of words and symbols that defines the information that the user is seeking.  Queries are used to direct search tools to appropriate Web sites to obtain information.

Query By Example - Use of an example to solicit more information like it.

Ranking - A means of listing hits in the order of their relevancy. It is usually determined by some selection of the number, location and frequency of the term in the document being searched.

Relevance - The usefulness of a response to a query.

Robot - The software for indexing and updating Web sites. It operates by scanning documents on the Internet via a network of links. A robot is also known as a spider, crawler and indexer.

Search Box - A place within a search engine’s Web site to enter a query. It should not be confused with the browser’s location box or address box, where a URL is entered.

Search Engine - A host computer that serves a Web site and provides information from within its own sites and via links with other Web sites.  This is accomplished by using the keywords of a query to match index terms in the search engine’s database.

Search Tool - A computer program which conducts a search on the World Wide Web.

Site - The location of a page on the Internet. In WWW, it is called a Web site and identified by its URL.

Spider - To spider is the process of scanning Web sites to add new pages and to update existing ones. A spider is the same as a robot.

Stemming - The use of a stem [i.e. root] of a word to search for words that are derived from it. For example, "child" would retrieve information on child, children, childhood, childless and so on.

Term  - A single word or combination of words used in a query.

Truncation - See Stemming.

Uniform Resource Locator [URL] - The Internet designation of a Web address.

Web Page - The address of a Web site. It can also refer to a page within a Web site.  When Web pages are part of the same document, they are also collectively known as a Web site.

Web Site - In search use, it is a specific address or URL on the WWW. In function, it is a computer system that is set up to distribute documents stored in its database. Web sites range in size from as little as one page to a vast number of pages, such as those of a search engine’s database or a full text book.

World Wide Web [WWW] or the Web - A global computer communication system that uses the Internet to transmit data [i.e. text, numbers, images and sound].
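The glossary entries for Precision and False Drops can be worked through with a small calculation.  The sketch below is only an illustration: the document names are made up, and deciding which documents are "relevant" is a judgment the searcher makes, not something the computer knows.

```python
from typing import Iterable, List

def precision(retrieved: Iterable[str], relevant: Iterable[str]) -> float:
    """Relevant documents obtained divided by total documents retrieved."""
    retrieved = list(retrieved)
    if not retrieved:
        return 0.0  # nothing retrieved, so there is nothing to measure
    relevant = set(relevant)
    useful = sum(1 for doc in retrieved if doc in relevant)
    return useful / len(retrieved)

def false_drops(retrieved: Iterable[str], relevant: Iterable[str]) -> List[str]:
    """Documents that were retrieved but are not relevant."""
    relevant = set(relevant)
    return [doc for doc in retrieved if doc not in relevant]

# Hypothetical search: four hits, two of which the searcher finds useful.
retrieved = ["a", "b", "c", "d"]
relevant = {"a", "c", "e"}

print(precision(retrieved, relevant))    # 2 useful hits out of 4 retrieved
print(false_drops(retrieved, relevant))  # the hits that were not useful
```

A more specific query or a phrase search usually raises precision by cutting down the false drops.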

 

 
 

Web page designed and updated by Thomas David Rompf, Head of Information Services.  Last updated 12/24/2009

Colegio Bolivar, Calle 5, #122-21, via a Pance, Apartado Aereo 26300
Cali, Colombia, South America