SharePoint Portal Server 2003 depends on several key hardware
resources to ensure optimal performance. In general, the most
important resource for responding to increased load is CPU
capacity.
Note: Insufficient RAM, hard disk capacity, or network throughput
can result in servers falling short of the ideal performance that
their CPU capacity suggests.
Plan for hardware that delivers the CPU capacity and supporting
resources that satisfy your requirements based on the information
in this paper.
You can evaluate the throughput capacity of portal sites in many
different ways. It is important to understand those throughput
characteristics — Web page throughput, search throughput, and
index throughput — that have the most significant impact on the
performance of a portal site deployment. The following sections
provide detailed recommendations for several different deployment
scenarios.
Web Page Throughput
Web page throughput is a metric that can be hard to predict.
Usage that drives Web page throughput can vary greatly from hour to
hour and from day to day. SharePoint Portal Server 2003 is designed
to provide a high-performance solution that can accommodate
dramatically varying throughput needs. Conservative recommendations
for capacity planning assume that, on average, the portal site
deployment runs at 10 percent of total capacity. This enables
the deployment to successfully respond to unusual high-demand
periods.
There are many models and formulas for estimating the number of
pages per second required to support a given number of users.
However, it is not always clear what the term "number of users"
means for an organization. It is common to refer to the number of
users that could potentially use the portal site as the "number of
named users."
It is also common to refer to the number of users that may
actively use the portal site as the "number of simultaneous users."
It is extremely difficult to make a reliable prediction of the
number of simultaneous users, or the number of pages per second
required to support them.
To help determine the required throughput of a portal site
solution, organizations can use the following formula:
$$
\text{required throughput (pages per second)} = \frac{\text{number of users} \times \text{percent of active users per day} \times \text{number of operations per active user per day} \times \text{peak factor}}{360{,}000 \times \text{number of hours per day}}
$$
The following table explains the variables used in the
formula.
| Term | Definition |
| --- | --- |
| Number of users | The total number of users that may have access to the solution |
| Percent of active users per day | The percentage of the total number of users who might use the portal site solution during any particular day. Typically, this figure is approximately 25 percent, but it may vary from 10 to 75 percent. |
| Number of operations per active user per day | The number of operations that a typical user does on the portal site during a typical day. An operation is an action such as viewing the home page, searching, retrieving documents, etc. Typically, this number is approximately 10, but it may vary depending on the organization. |
| Peak factor | An approximate number that estimates the extent to which the portal site throughput exceeds the average throughput. This number typically ranges from 5 to 10. |
| Number of hours per day | The number of hours during which most activity occurs. This number typically ranges between 6 and 14 hours. |
The number 360,000 is determined by:
100 (for percent conversion) x 60 (number of minutes in an
hour) x 60 (number of seconds in a minute)
You can use these quantitative descriptions of a portal site
deployment to estimate the required peak throughput. For example,
consider a company with 10,000 users, of which 40 percent are
active on any given day, each performing an average of 20
operations. With a peak factor of 6 and 12 hours per day during
which most activity occurs, the company needs a throughput of
approximately 11 pages per second:
$$
\frac{10{,}000 \times 40 \times 20 \times 6}{360{,}000 \times 12} \approx 11.11
$$
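The formula and worked example above can be sketched as a small Python function. This is only an illustrative sketch: the function name and parameter names are hypothetical and not part of any SharePoint tool; the arithmetic follows the formula as stated in this paper.

```python
def peak_pages_per_second(users, percent_active, ops_per_user,
                          peak_factor, hours_per_day):
    """Estimate the required peak throughput in pages per second.

    percent_active is a percentage (for example 40, not 0.40), which is
    why the divisor includes the factor of 100.
    """
    # 360,000 = 100 (percent conversion) x 60 (minutes per hour)
    #           x 60 (seconds per minute)
    numerator = users * percent_active * ops_per_user * peak_factor
    return numerator / (360_000 * hours_per_day)


# Worked example from the text: 10,000 users, 40 percent active,
# 20 operations each, peak factor 6, 12 active hours per day.
required = peak_pages_per_second(10_000, 40, 20, 6, 12)
print(f"{required:.2f} pages per second")  # prints 11.11 pages per second
```

Passing the percentage as a whole number keeps the code aligned with the 360,000 constant used in the formula.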
Search Throughput
Use the total Web page throughput to estimate the number of
content searches executed per second. Conservative recommendations
for capacity planning assume that 10 percent of all Web pages
viewed result in content searches.
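As a rough sketch, the 10 percent planning assumption can be applied directly to a page throughput estimate. The function name and the default parameter below are hypothetical; the input of 11.11 pages per second is the figure from the worked example earlier in this paper.

```python
# Planning assumption from the text: 10 percent of all Web pages viewed
# result in content searches.
SEARCH_FRACTION = 0.10


def searches_per_second(pages_per_second, search_fraction=SEARCH_FRACTION):
    """Estimate content searches per second from Web page throughput."""
    return pages_per_second * search_fraction


# 11.11 pages per second is the result of the earlier worked example.
estimate = searches_per_second(11.11)
print(f"{estimate:.2f} searches per second")
```

An organization that knows its users search more or less often can pass a different `search_fraction` instead of the conservative default.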
Index Throughput
The rate at which content changes across your organization
determines the rate at which to update the content index. In
general, assume that 10 percent of the entire corpus must be
updated in the index every 24 hours. While it is extremely rare for
10 percent of content to actually change every 24 hours, this
recommendation allows the portal site to complete both large
additions of content and strategic full updates of the index in a
timely fashion.
The index throughput affects the performance of both the search
and alerts mechanisms on the portal site, since these are dependent
on an up-to-date index.
Recommendation
The index throughput should be capable of updating 10 percent of
the entire corpus in the index every 24 hours.
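The recommendation can be turned into a sustained indexing rate with a short calculation. The sketch below is illustrative only: the function name is hypothetical, and expressing the rate in documents per second is an assumption, since this paper does not prescribe a unit. The corpus of 1 million documents matches the storage example used later in this paper.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds in 24 hours


def required_index_rate(corpus_documents, daily_fraction=0.10):
    """Return the documents per second the index must be able to update.

    daily_fraction is the share of the corpus to update every 24 hours
    (10 percent, per the recommendation in the text).
    """
    return corpus_documents * daily_fraction / SECONDS_PER_DAY


# Assumed example: a corpus of 1 million documents.
rate = required_index_rate(1_000_000)
print(f"{rate:.2f} documents per second")
```

Comparing this figure with the measured crawl rate of a test deployment gives a quick check on whether the index server can keep up.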
High Availability
As a central source for important business information and
applications, portal site deployments are frequently an important
resource for an organization and can be classified as "mission
critical." Many organizations choose to deploy a server farm to
ensure portal site availability regardless of actual throughput
requirements. Organizations typically deploy a server farm to
ensure high availability rather than high Web page throughput.
Recommendation
If your organization classifies the portal site as a critical
resource, consider deploying a server farm.
Storage
SharePoint Portal Server 2003 stores data in SQL Server and
full-text indexes in the file systems on the search and index
management servers. In general, the most important characteristics
for determining the amount of storage space required are the total
size of the documents stored on the portal site and the total size
of the documents included in the portal site index. The following
table illustrates the storage requirements for the server roles in
a SharePoint Portal Server solution.
| Server role | Required storage |
| --- | --- |
| Database | 200% of the total size of all documents stored on the portal site |
| Index | 60% of the total size of all documents stored on the portal site. The index size is about 30 percent of the size of all documents in its catalogs. Because a copy (a snapshot of the content indexes) is always present, the required space doubles to 60 percent. |
| Search | Number of index servers x 60% of the total size of all documents stored on the portal site. When the index propagates from the index server to the search server, there are two copies of each index (each 30 percent of the total document size) on the search server for each index server. |
For example, a portal site that stores 1 million documents with
an average document size of 100 kilobytes (KB) stores 100 GB of
document data and, thus, requires 200 GB of storage space on the
database server.
Adding new portal sites or team sites does not in itself consume
much disk space. Each new portal site (without content) consumes
approximately 20 megabytes (MB) of disk space (in the database),
whereas a new site, personal site, or portal site area (without
content) consumes less than 200 KB of disk space (in the
database).
Recommendation
Use the preceding table to compute your storage needs. Multiply
the results by a factor of 1.5 to 3 to accommodate future
growth.
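The storage table and the growth recommendation can be combined in a short calculation. This is a minimal sketch under stated assumptions: the function and parameter names are hypothetical, and it assumes that all stored documents are also indexed, so the same document size is used for every server role.

```python
def storage_requirements_gb(document_gb, index_servers=1, growth_factor=1.0):
    """Estimate storage per server role in GB, following the preceding table.

    Assumption: the indexed documents are the same as the stored documents.
    growth_factor is the multiplier for future growth (1.5 to 3 is the
    range recommended in the text).
    """
    base = {
        # Database: 200% of the total document size
        "database": document_gb * 200 / 100,
        # Index: 30% index plus a 30% snapshot copy
        "index": document_gb * 60 / 100,
        # Search: two 30% copies of each index, per index server
        "search": index_servers * document_gb * 60 / 100,
    }
    return {role: size * growth_factor for role, size in base.items()}


# Example from the text: 1 million documents averaging 100 KB is 100 GB.
# One index server and a growth factor of 1.5 are assumed here.
plan = storage_requirements_gb(100, index_servers=1, growth_factor=1.5)
for role, size in plan.items():
    print(f"{role}: {size:.0f} GB")
```

With a growth factor of 1.0 the database figure is 200 GB, which matches the example in the text.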
Other Recommendations
- For more than one portal site, use shared services. This
results in a lower memory footprint, and requires no additional
servers.
- Try to consolidate to a few portal sites, typically along
geographic or organizational boundaries.
- Use areas for navigation and hierarchical security. In many
cases, you can use areas to store divisional information rather
than creating separate portal sites for divisions.