Microsoft Office Online
About Rules That Include or Exclude Content
 

You can create rules that include content in, or exclude content from, the content index. There are two kinds of rule: site restrictions and site path rules. A site restriction is the main rule for a site and defines the overall behavior for that site. Site path rules apply to specific parts of the site. To show or hide the site path rules for a site, click the plus sign (+) or minus sign (-) next to its site restriction. For example, a site restriction might apply to the whole site example.microsoft.com, while the site path rules for that site apply to http://example.microsoft.com/ and http://example.microsoft.com/*. If no site path rule applies to a path in a site, the site restriction applies.
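
The fallback from site path rules to the site restriction can be sketched in Python as follows. This is an illustration only: the class names, the resolve function, the "/private/*" path, the first-match ordering, and the use of fnmatch-style wildcards are assumptions made for the example, not part of the product.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase
from typing import List


@dataclass
class SitePathRule:
    pattern: str   # wildcard pattern for part of the site (hypothetical model)
    include: bool  # True = include in the index, False = exclude


@dataclass
class SiteRestriction:
    site: str      # for example "example.microsoft.com"
    include: bool  # overall rule for the whole site
    path_rules: List[SitePathRule] = field(default_factory=list)


def resolve(restriction: SiteRestriction, url: str) -> bool:
    """Return True if the URL should be included in the index."""
    # Assumption: the first site path rule that matches wins.
    for rule in restriction.path_rules:
        if fnmatchcase(url.lower(), rule.pattern.lower()):
            return rule.include
    # No site path rule applies, so the site restriction decides.
    return restriction.include


restriction = SiteRestriction(
    site="example.microsoft.com",
    include=True,
    path_rules=[
        # Hypothetical path rule that excludes one area of the site.
        SitePathRule("http://example.microsoft.com/private/*", include=False),
    ],
)

print(resolve(restriction, "http://example.microsoft.com/private/a.htm"))
print(resolve(restriction, "http://example.microsoft.com/public/a.htm"))
```

In this sketch, the first address is decided by the site path rule and the second falls through to the site restriction.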

You can use site restrictions and site path rules to do the following:

  • Override the settings for the default content access account when crawling a specific site or path.
  • Specify the granularity for crawling lists.
  • Allow crawling of sites whose addresses pass parameters, that is, addresses that include a question mark (?).
  • Allow sites to be traversed for links without content being added into the index.
  • Exclude an area from the index completely.
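
One way to picture how these options combine for a single address is the Python sketch below. The RuleOptions fields, the CrawlAction values, and the decide function are hypothetical names chosen for the example. The order in which the options are checked is also an assumption. List granularity is left out because its possible values are not described here.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class CrawlAction(Enum):
    INDEX = "index"            # crawl and add content to the index
    LINKS_ONLY = "links_only"  # follow links, but do not index content
    SKIP = "skip"              # do not crawl at all


@dataclass
class RuleOptions:
    exclude: bool = False             # exclude the area completely
    links_only: bool = False          # traverse for links without indexing
    crawl_complex_urls: bool = False  # allow addresses that contain "?"
    account: Optional[str] = None     # overrides the default content access account


def decide(options: RuleOptions, url: str) -> CrawlAction:
    """Return the action a crawler might take for one address."""
    if options.exclude:
        return CrawlAction.SKIP
    # Assumption: parameterized addresses are skipped unless the rule allows them.
    if "?" in url and not options.crawl_complex_urls:
        return CrawlAction.SKIP
    if options.links_only:
        return CrawlAction.LINKS_ONLY
    return CrawlAction.INDEX
```

For example, decide(RuleOptions(links_only=True), "http://server1/a.htm") returns CrawlAction.LINKS_ONLY in this model.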

Rules can use general expressions and wildcards, as shown in the following examples:

  • "http://server1/folder*" applies to all Web resources that have a URL that starts with "http://server1/folder"
  • "http://server?web*" applies to resources such as "http://serveraweb2/file.htm" and "http://serverbweb3/file.htm"
  • "*/*.doc" applies to every Microsoft Word document encountered
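
The wildcard examples above can be tried out with Python's standard fnmatch module, where "*" matches any run of characters and "?" matches exactly one character. This is only an approximation: the rule_applies function is a hypothetical helper, case-insensitive matching is an assumption, and fnmatch also gives "[" a special meaning that the product's rules may not share.

```python
from fnmatch import fnmatchcase


def rule_applies(pattern: str, url: str) -> bool:
    """Return True if the wildcard pattern matches the whole URL."""
    # Assumption: matching ignores case, so both sides are lowercased.
    # "*" matches any run of characters (including "/"); "?" matches exactly one.
    return fnmatchcase(url.lower(), pattern.lower())


examples = [
    ("http://server1/folder*", "http://server1/folder/page.htm"),
    ("http://server?web*", "http://serveraweb2/file.htm"),
    ("http://server?web*", "http://serverbweb3/file.htm"),
    ("*/*.doc", "http://server1/docs/report.doc"),
]

for pattern, url in examples:
    print(pattern, url, rule_applies(pattern, url))
```

Each pair in the examples list matches under these assumptions. An address such as "http://serverweb/file.htm" does not match "http://server?web*", because "?" must consume one character before "web".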

Note  Depending on how a content source was added, site path rules must be entered in a particular form for the update to succeed:

  • If the content source is \\server_name\Folder1\Folder2 and the site path rule is \\server_name\Folder1\Folder2\*, the update fails.
  • If the content source is \\server_name\Folder1\Folder2 and the site path rule is \\server_name\Folder1\*, the update succeeds.
  • If the content source is \\server_name\Folder1\Folder2\ and the site path rule is \\server_name\Folder1\Folder2\*, the update succeeds.
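
One reading that is consistent with all three cases in this note is that the site path rule, treated as a wildcard pattern, must match the content source address itself, including any trailing backslash. This is an inference from the examples, not documented behavior, and the rule_covers_source function below is a hypothetical helper written only to check that reading.

```python
from fnmatch import fnmatchcase


def rule_covers_source(content_source: str, site_path_rule: str) -> bool:
    """Hypothetical check: does the rule pattern match the content source address?"""
    # fnmatchcase treats "\" as a literal character, so UNC paths work unchanged.
    return fnmatchcase(content_source, site_path_rule)


# (content source, site path rule) pairs taken from the note.
# A raw string cannot end in a backslash, so the trailing one is appended.
cases = [
    (r"\\server_name\Folder1\Folder2", r"\\server_name\Folder1\Folder2\*"),
    (r"\\server_name\Folder1\Folder2", r"\\server_name\Folder1\*"),
    (r"\\server_name\Folder1\Folder2" + "\\", r"\\server_name\Folder1\Folder2\*"),
]

for source, rule in cases:
    outcome = "succeeds" if rule_covers_source(source, rule) else "fails"
    print(f"{source} with {rule}: update {outcome}")
```

Under this reading the first pair fails because the content source has no trailing backslash for the rule's final "\" to match.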

Document shortcuts are subject to the same site and path restrictions as other documents and content sources in the portal site. If a user adds a document shortcut to the portal site, Microsoft Office SharePoint Portal Server 2003 updates that shortcut in the same way as other content sources. If site or file type restrictions prohibit the inclusion of a shortcut in the index, SharePoint Portal Server does not include content from that document shortcut in the index.

Changes to these rules take effect only when the next update runs. If you change rules while an update is in progress, the changes apply to any content covered by the rule that has not yet been crawled.

Note  By default, crawling ASPX pages is disabled.

Related Topics

Adding a Rule That Includes or Excludes Content
Editing a Rule that Includes or Excludes Content
Deleting a Rule That Includes or Excludes Content
Moving a Rule that Includes or Excludes Content