By Emily Schroeder, Microsoft Corporation
Introduction
Every Microsoft Office SharePoint Portal Server 2003
portal site includes content indexes that allow users to search for
content available from that portal site. Part of the process of
including content in the index is filtering, in which SharePoint
Portal Server crawls a file and extracts its text and any
properties defined in it.
By default, SharePoint Portal Server can crawl and filter a file
of up to 16 megabytes (MB). When a file exceeds this limit,
SharePoint Portal Server writes a warning to the gatherer log and
does not crawl the file again unless the file changes. To change
the 16 MB limit, you must modify the registry entry
MaxDownloadSize.
In addition, a registry entry called MaxGrowFactor
sets the maximum ratio of filter output to file size. This
parameter places an upper bound on the amount of text filtered from
a file, based on the file size. For example, if MaxGrowFactor is 4
and the file size is 16 MB, the filter cannot produce more than
64 MB of text from the file. The limit matters mainly when the
filter decompresses the file, because the extracted text can then
be larger than the file itself.
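The arithmetic behind this bound can be sketched in a few lines of Python. This is only an illustration of the ratio described above; the function name and parameters are hypothetical and are not part of SharePoint Portal Server.

```python
def max_filtered_text_mb(file_size_mb, max_grow_factor=4):
    """Return the upper bound, in MB, on text filtered from a file.

    Assumes the bound is simply file size multiplied by MaxGrowFactor,
    as in the 16 MB x 4 = 64 MB example in the text.
    """
    return file_size_mb * max_grow_factor


# A 16 MB file with the default factor of 4 is capped at 64 MB of text.
print(max_filtered_text_mb(16))
```

Raising MaxGrowFactor raises this cap proportionally for every file, regardless of its size.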
If either MaxDownloadSize or MaxGrowFactor is
exceeded, you might see the following entries in the gatherer
log:
- The filtering process was stopped because its memory quota was
exceeded.
- The document was only partially crawled because it is too
large, or the filtered size exceeded the MaxGrowFactor limit. The
document was truncated and some words or terms used in the document
may not be available.
Specify the maximum file size
By default, SharePoint Portal Server 2003 can crawl and filter a
file that is up to 16 MB in size. You can change this limit by
editing the MaxDownloadSize entry in the registry.
Caution: Incorrectly editing the registry may severely damage your
system. Before making changes to the registry, you should back up
any valued data on the computer.
- On the taskbar, click Start, and then click
Run.
- Type regedit, and then click OK.
- In Registry Editor, navigate to
HKEY_LOCAL_MACHINE\Software\Microsoft\SPSSearch\Gathering
Manager.
- In the details pane, right-click MaxDownloadSize, and
then click Modify.
- In the Edit DWORD Value dialog box, in the Value
data box, type the maximum size, in megabytes, of a file that can
be crawled (the default is 16). Ensure that Base is specified as
Decimal.
- Click OK.
- Close Registry Editor.
- Restart the server.
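For administrators who prefer to review a change before applying it, the same edit can be expressed as a .reg file. The Python sketch below only builds the text of such a file and prints it; it does not touch the registry. The helper name `reg_file_text` and the example value of 32 are hypothetical, and the sketch assumes the value is expressed in megabytes, consistent with the 16 MB default.

```python
# Registry key named in the procedure above.
KEY_PATH = r"HKEY_LOCAL_MACHINE\Software\Microsoft\SPSSearch\Gathering Manager"


def reg_file_text(entry_name, decimal_value):
    """Build .reg file text that sets a DWORD entry under KEY_PATH.

    The .reg format stores DWORD values as eight hexadecimal digits,
    so the decimal value typed in Registry Editor is converted here.
    """
    if not 0 <= decimal_value <= 0xFFFFFFFF:
        raise ValueError("a DWORD must be between 0 and 4294967295")
    return (
        "Windows Registry Editor Version 5.00\n"
        "\n"
        f"[{KEY_PATH}]\n"
        f'"{entry_name}"=dword:{decimal_value:08x}\n'
    )


# Hypothetical example: raise the limit from 16 MB to 32 MB.
print(reg_file_text("MaxDownloadSize", 32))
```

The same caution applies as for manual edits: back up valued data first, and restart the server after the change is imported.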
Specify the maximum grow factor
By default, the text that SharePoint Portal Server 2003 produces
from a file can be up to 4 times the file size. Output larger than
the file can occur if the filter is decompressing the file. You can
change this limit by editing the MaxGrowFactor entry in the
registry.
Caution: Incorrectly editing the registry may severely damage your
system. Before making changes to the registry, you should back up
any valued data on the computer.
- On the taskbar, click Start, and then click
Run.
- Type regedit, and then click OK.
- In Registry Editor, navigate to
HKEY_LOCAL_MACHINE\Software\Microsoft\SPSSearch\Gathering
Manager.
- In the details pane, right-click MaxGrowFactor, and then
click Modify.
- In the Edit DWORD Value dialog box, in the Value
data box, type the maximum ratio of filter output to file size
(the default is 4). Ensure that Base is specified as Decimal.
- Click OK.
- Close Registry Editor.
- Restart the server.
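When choosing a new value, it can help to estimate the smallest grow factor that would let a known file be filtered in full. The sketch below assumes the bound is filter output ≤ file size × MaxGrowFactor, as in the 16 MB example in the introduction, and that the entry holds a whole number because it is a DWORD. The function name and the sample sizes are hypothetical.

```python
def min_grow_factor(file_size, filtered_text_size):
    """Return the smallest whole-number factor that covers the output.

    Both sizes must use the same unit (for example, bytes or MB).
    Integer ceiling division avoids floating-point rounding.
    """
    if file_size <= 0:
        raise ValueError("file_size must be positive")
    needed = -(-filtered_text_size // file_size)  # ceiling division
    return max(1, needed)


# Hypothetical case: a 2 MB compressed file that expands to 11 MB of text.
print(min_grow_factor(2, 11))
```

A result above the current MaxGrowFactor value suggests that the file's text would be truncated under the current setting.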