Searchable Web Pages

Policy:

Department of Commerce (DOC) Web sites must assist the public in finding and using government information. To this end, DOC Web sites are subject to the following requirements:

Requirement 1: For All Department of Commerce Web pages

  1. Use HTML/XHTML or XML <title> tags to describe the content of Web pages.  All DOC Web pages must contain a unique page title in the head section that specifically relates to the contents of that page.

    Example:  <title>DOC Web Policy: Searchable Web Pages - DOC Web Advisory Group</title>

    Implementation Deadline:  June 30, 2006
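
    The uniqueness requirement above can be spot-checked before pages are published. The following is a minimal sketch, not an official DOC tool: it assumes the page sources are already loaded into a dictionary keyed by page path, and the names extract_title and find_title_problems are hypothetical helpers introduced only for illustration.

```python
from collections import defaultdict
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text inside the first <title> element of a page."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_title(html_text):
    """Return the page title with whitespace normalized, or "" if none."""
    parser = TitleParser()
    parser.feed(html_text)
    parser.close()
    return " ".join(parser.title.split())


def find_title_problems(pages):
    """pages maps a page path (hypothetical) to its HTML source.

    Returns a tuple: (paths with no title text,
                      {title: [paths]} for titles used on more than one page).
    """
    by_title = defaultdict(list)
    missing = []
    for path, html_text in pages.items():
        title = extract_title(html_text)
        if title:
            by_title[title].append(path)
        else:
            missing.append(path)
    duplicates = {t: paths for t, paths in by_title.items() if len(paths) > 1}
    return missing, duplicates
```

    A check like this only finds missing or repeated titles. Whether a title specifically relates to the contents of its page still needs human review.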

Requirements 2 and 3: For all DOC Major Web Sites:

  1. Provide a search function.  Major Web Sites must include a search function. This may be in the form of a search box or a link to a search page.

  2. Sensitive Information: Agencies must ensure that sensitive or restricted information, or personally identifiable information (such as social security numbers), cannot be retrieved using a search engine.

    Implementation Deadline:  December 31, 2005

  3. Use Standard Metadata.  As provided in OMB guidance, organizations should follow the recommendation of the Interagency Committee on Government Information (see Webcontent.gov - Use Standard Metadata) and use metadata syntax consistent with the Dublin Core Metadata standards posted at http://www.dublincore.org.

    At a minimum, the following six meta tags, following Dublin Core format, are required:
        • Title - This tag is different from the HTML/XHTML or XML title tag, but the same title text should be used.

          Example: <meta name="DC.title" content="Home page of NOAA's National Weather Service" />

        • Description - A brief description of the contents and purpose of the individual page.

          Example: <meta name="DC.description" content="NWS Home page." />

        • Creator - The content owner; this should be the name of the organization.

          Example:  <meta name="DC.creator" content="US Department of Commerce, NOAA, National Weather Service" />

        • Date Created - The original creation date of the page in ISO8601 format (YYYY-MM-DD).

          Example: <meta name="DC.date.created" scheme="ISO8601" content="2001-01-01" />

        • Date Reviewed - The date the page contents were last reviewed in ISO8601 format (YYYY-MM-DD).

          Example: <meta name="DC.date.reviewed" scheme="ISO8601" content="2005-09-22" />

        • Language - Declares to users the natural language of the document being indexed. Search engines that index Web sites based on language often read this tag to determine which language(s) are supported. This tag is particularly useful for non-English and multiple-language Web sites. If the content is in more than one language, the element may be repeated.

          Example: <meta name="DC.language" scheme="DCTERMS.RFC1766" content="EN-US" />
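
    The six required tags listed above lend themselves to an automated check. The following is a minimal sketch, assuming the page source is available as a string; the helper name missing_dc_tags is hypothetical, and the check confirms only that each tag is present with non-empty content, not that the content is accurate or that dates are in ISO8601 format.

```python
from html.parser import HTMLParser

# The six meta tag names required by this policy.
REQUIRED_DC_NAMES = (
    "DC.title",
    "DC.description",
    "DC.creator",
    "DC.date.created",
    "DC.date.reviewed",
    "DC.language",
)


class MetaCollector(HTMLParser):
    """Collects name/content pairs from <meta> elements."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <meta ... />.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content:
            # Names are compared case-insensitively (an assumption of this sketch).
            self.meta.setdefault(name.lower(), []).append(content)


def missing_dc_tags(html_text):
    """Return the required Dublin Core tag names that the page lacks."""
    collector = MetaCollector()
    collector.feed(html_text)
    collector.close()
    return [name for name in REQUIRED_DC_NAMES if name.lower() not in collector.meta]
```

    An empty list means all six tags were found. Because values are stored as lists, a page that repeats DC.language for multiple languages passes the check as well.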

Additional Metadata:  Organizations should include subject and keyword metadata if it is helpful for improving search relevancy and for content classification. If organizations do choose to use additional metadata, they should choose from Dublin Core standards, where possible. 

Robot Exclusion Protocol:  In those instances where organizations determine that sites should not be indexed or that indexing should be limited, they may use the Robot Exclusion Protocol (see Resources below).

Example:  <meta name="ROBOTS" content="NOINDEX, NOFOLLOW" /> [This tag instructs robots not to index the Web page or follow the links on it]
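
The meta tag above applies to a single page. The Robot Exclusion Protocol also allows site-wide rules in a robots.txt file, and it can help to confirm how a compliant robot would read those rules before relying on them. The following is a minimal sketch using Python's standard urllib.robotparser module; the robots.txt content, the /internal/ path, and the example.gov host are hypothetical and are not taken from any DOC site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: ask all robots to stay out of /internal/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /internal/
"""


def is_crawlable(robots_txt, url, user_agent="*"):
    """Return True if the given robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://www.example.gov/policy/search.html",
        "https://www.example.gov/internal/budget.html",
    ):
        print(url, is_crawlable(ROBOTS_TXT, url))
```

Robot exclusion is advisory: well-behaved robots honor it, but it does not block access. It is therefore not a substitute for the access controls needed to protect sensitive or restricted information.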

Scope:

Requirement 1: All Department of Commerce Web sites, including intranet sites not available to the public.

Requirements 2 and 3: All Department of Commerce Major Web Sites.

Purpose:

The purpose of this policy is to make public information on DOC Web sites easy to find and to ensure that DOC Web sites comply with policies for federal government Web sites.

Exceptions:

None.

Deadline for Implementation:

    • Requirement #1: December 31, 2005
    • Requirement #2: June 30, 2006
    • Requirement #3: December 31, 2007

Discussion:

Search Function: The search function should, to the extent practical, permit searching all files intended for public use on the Web site, display search results in order of relevancy to search criteria, and provide response times equivalent to industry best practice.

Page Titles: Including HTML/XHTML or XML titles makes Web pages easier to find and use because this text is used as the page title in a crawler-based search engine, as well as the title in bookmarks and browser reverse bars.

Metadata: Metadata provides a standardized system to classify and label Web resources. Metadata improves search relevancy, provides information about who created the information and when it was created, supports Web site maintenance and administration, helps create data-driven pages, and allows information to be tracked and assembled government-wide. Quality metadata helps the public locate government information more efficiently when using search engines.

Organizations are encouraged to include metadata on as many pages as feasible within resource constraints. Metadata should be applied so that both high-level pages and pages embedded deep within a Web site can be accessed.

Resources

The Department of Commerce does not endorse any of the Web site links noted below. They are merely examples of resources which may be helpful to Web developers attempting to make their Web sites easier to find. Because these links are to sites outside the control of the Department of Commerce, you should review the Privacy Policy of each site you visit, as its information collection practices may differ from ours. This Resources list is not intended to be exhaustive.

Helping Search Engines Index Your Site
http://www.w3.org/TR/html401/appendix/notes.html#h-B.4

Web Robots and Robot Exclusion Protocol
http://www.robotstxt.org/wc/robots.html

<head> and <title> tags
http://www.w3.org/TR/html401/struct/global.html#h-7.4

Metadata
http://www.usa.gov/webcontent/managing_content/organizing/metadata.shtml

Information on Dublin Core standards: 
http://dublincore.org/documents/dcmi-terms/

Tool to create Dublin Core meta tags by URL
http://www.ukoln.ac.uk/metadata/dcdot/

Robots <meta> tag
http://www.w3.org/TR/html401/appendix/notes.html#h-B.4.1.2
http://www.robotstxt.org/wc/exclusion.html#robotstxt

WebContent.gov – Web Managers Advisory Council site
http://www.usa.gov/webcontent/index.shtml

Department of Commerce Web Advisory Council (WAC)
U.S. Department of Commerce

Send questions and comments about this page to WAC@doc.gov
Page last updated October 12, 2010