Specifying the File Size That SharePoint Portal Server 2003 Can Crawl

By Emily Schroeder, Microsoft Corporation

Introduction

Every Microsoft Office SharePoint Portal Server 2003 portal site includes content indexes that allow users to search for content available from that portal site. Part of the process of including content in an index is filtering, in which SharePoint Portal Server crawls each file and extracts its text and any properties defined in it.

By default, SharePoint Portal Server can crawl and filter a file with a size of up to 16 megabytes (MB). When a file exceeds this limit, SharePoint Portal Server enters a warning in the gatherer log and does not crawl the file again unless the file changes. To change the limit of 16 MB, you must modify the registry entry MaxDownloadSize.

In addition, a registry entry called MaxGrowFactor enables you to change the ratio between the file size and the filter output. This parameter provides an upper bound on the amount of text filtered from a file, based on the file size. For example, if MaxGrowFactor is 4 and the file size is 16 MB, the filter cannot produce more than 64 MB of text from the file. This parameter applies if the filter is decompressing the file.
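The arithmetic in the example above can be sketched in a few lines of Python. This is only an illustration of the bound, not how SharePoint Portal Server implements it. The function names are hypothetical, and treating 1 MB as 1,048,576 bytes and the limit as inclusive are assumptions.

```python
MB = 1024 * 1024  # assumption: 1 MB is counted as 1,048,576 bytes


def max_filter_output_bytes(file_size_bytes: int, max_grow_factor: int = 4) -> int:
    """Upper bound on filtered text: file size multiplied by MaxGrowFactor."""
    return file_size_bytes * max_grow_factor


def exceeds_grow_limit(file_size_bytes: int, filtered_bytes: int,
                       max_grow_factor: int = 4) -> bool:
    """True if the filter output is larger than the bound allows.

    Assumption: output exactly equal to the bound is still accepted.
    """
    return filtered_bytes > max_filter_output_bytes(file_size_bytes, max_grow_factor)


if __name__ == "__main__":
    # The article's example: a 16 MB file with MaxGrowFactor set to 4.
    bound = max_filter_output_bytes(16 * MB, 4)
    print(bound // MB, "MB of filtered text at most")
```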

If either MaxDownloadSize or MaxGrowFactor is exceeded, you might see the following entries in the gatherer log:

  • The filtering process was stopped because its memory quota was exceeded.
  • The document was only partially crawled because it is too large, or the filtered size exceeded the MaxGrowFactor limit. The document was truncated and some words or terms used in the document may not be available.

Specify the maximum file size

By default, SharePoint Portal Server 2003 can crawl and filter a file that is up to 16 MB in size. You can change this limit by editing the MaxDownloadSize entry in the registry.

 Caution   Incorrectly editing the registry may severely damage your system. Before making changes to the registry, you should back up any valued data on the computer.

  1. On the taskbar, click Start, and then click Run.
  2. Type regedit, and then click OK.
  3. In Registry Editor, navigate to HKEY_LOCAL_MACHINE\Software\Microsoft\SPSSearch\Gathering Manager.
  4. In the details pane, right-click MaxDownloadSize, and then click Modify.
  5. In the Edit DWORD Value dialog box, in the Value data box, type the maximum file size that can be crawled. Ensure that Base is specified as Decimal.
  6. Click OK.
  7. Close Registry Editor.
  8. Restart the server.
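If the same change has to be made on several servers, one possible alternative to editing by hand is a .reg file that Registry Editor can import. The Python sketch below only generates the text of such a file. The function name build_reg_file and the value 32 are hypothetical. The sketch also assumes that the entry is expressed in megabytes, to match the 16 MB default, which this article does not state explicitly. The same caution about backing up the registry applies.

```python
# Registry path taken from step 3 of the procedure above.
KEY_PATH = r"HKEY_LOCAL_MACHINE\Software\Microsoft\SPSSearch\Gathering Manager"


def build_reg_file(entry_name: str, value: int) -> str:
    """Return .reg file text that sets one DWORD entry under KEY_PATH."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("a DWORD must be between 0 and 4294967295")
    # .reg files store DWORD data as eight hexadecimal digits,
    # so decimal 32 is written as 00000020.
    return (
        "REGEDIT4\n"
        "\n"
        f"[{KEY_PATH}]\n"
        f'"{entry_name}"=dword:{value:08x}\n'
    )


if __name__ == "__main__":
    # 32 is a hypothetical new limit, not a recommended value.
    print(build_reg_file("MaxDownloadSize", 32))
```

Note that the file holds the value in hexadecimal, whereas the dialog box in step 5 takes it in decimal. The server still needs to be restarted after the file is imported.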

Specify the maximum grow factor

By default, SharePoint Portal Server 2003 can produce filtered text of up to 4 times the size of the file, which can occur if the filter is decompressing the file. You can change this limit by editing the MaxGrowFactor entry in the registry.

 Caution   Incorrectly editing the registry may severely damage your system. Before making changes to the registry, you should back up any valued data on the computer.

  1. On the taskbar, click Start, and then click Run.
  2. Type regedit, and then click OK.
  3. In Registry Editor, navigate to HKEY_LOCAL_MACHINE\Software\Microsoft\SPSSearch\Gathering Manager.
  4. In the details pane, right-click MaxGrowFactor, and then click Modify.
  5. In the Edit DWORD Value dialog box, in the Value data box, type the number for the maximum ratio between the size of the file and the filter output. Ensure that Base is specified as Decimal.
  6. Click OK.
  7. Close Registry Editor.
  8. Restart the server.
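To confirm the values after the restart, both entries can be read back from the registry. The following is a minimal sketch using the winreg module from the Python standard library. It assumes Python is installed on the server and runs under an account that can read the key. The function name read_gatherer_limits is hypothetical.

```python
import sys

# Subkey of HKEY_LOCAL_MACHINE, taken from step 3 of the procedures above.
KEY_PATH = r"Software\Microsoft\SPSSearch\Gathering Manager"
ENTRIES = ("MaxDownloadSize", "MaxGrowFactor")


def read_gatherer_limits() -> dict:
    """Read both crawl limits from the registry (Windows only)."""
    if sys.platform != "win32":
        raise OSError("the registry is only available on Windows")
    import winreg  # imported here so the module still loads elsewhere

    limits = {}
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, KEY_PATH) as key:
        for name in ENTRIES:
            value, value_type = winreg.QueryValueEx(key, name)
            if value_type != winreg.REG_DWORD:
                raise TypeError(f"{name} is not a DWORD entry")
            limits[name] = value
    return limits


if __name__ == "__main__" and sys.platform == "win32":
    for name, value in read_gatherer_limits().items():
        print(f"{name} = {value}")
```

The sketch only reads the entries. Changing them is left to the procedures above, so that the caution and the restart step are not bypassed.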