10 Reasons Why SharePoint Performance Can Slow
In recent months, we’ve seen customers who are
doing major migrations of documents—for example, from file servers—to
SharePoint. Far too often, one of the complaints I hear voiced in these
scenarios is how slow the upload can be, particularly when uploading huge
numbers of documents. I thought it would be worth starting a discussion of
lessons learned related to performance of mass uploads. This is a fairly
targeted business scenario—most organizations don’t do this too often—but it
also brings up some key points to consider about performance in other
SharePoint scenarios.
The following are among the factors that can
cause performance of mass uploads to suffer:
· The recovery model for the content database (a SQL Server setting) is set to Full by default. While this is certainly the appropriate setting for a production environment, as it allows recovery using transaction logs, it can slow the content loading of a migration. Set the recovery model to Simple, which causes the contents of the transaction log to be truncated each time a checkpoint is issued for the database (see the first sketch after this list). Just remember two things: first, set it back to Full when you're finished; second, this model means that the database recovery point can only be as recent as the last database backup, so you'll probably want to back up before your migration—and there are many good reasons for that anyway.
· Search indexing, if it kicks in, consumes resources that you might need on your WFEs and SQL servers for processing the migration of files. Make sure that search crawls are scheduled appropriately—or paused—while you do your mass upload (a scripted way to do this is sketched after the list).
· Anti-virus software, if it is scanning every document that is uploaded, or is scanning the database or BLOB store directly, can slow things down tremendously. Assuming that your documents were scanned when they were uploaded to their original location, you probably don't need to incur that penalty when simply moving those documents into SharePoint.
· BLOB storage can affect performance—for better or worse. As you know, I've done a lot of writing and speaking about BLOB storage and content database scalability. BLOBs (binary large objects) are the binary, unstructured chunks of data—the documents themselves—as stored in SQL in the AllDocStreams table of your content database. You can externalize BLOBs using EBS or RBS, which means you store BLOBs in a location other than your content database, and the database keeps only a pointer to each document. Externalizing BLOBs reduces the writes to your database: by default, when you upload a document, it gets written to the transaction log first and then committed to the database—two writes for every document. So there is conceptually a performance benefit to externalizing. But it really depends on the performance of the storage tier to which you move BLOBs, and on the performance of the EBS or RBS provider (the software that manages the communication between EBS/RBS, which are Microsoft APIs, and your BLOB storage platform). For example, if you're externalizing BLOBs to cloud storage—Amazon or Rackspace, say—performance will likely suffer. But if you're externalizing to a high-performance storage tier, performance can definitely improve for this mass-upload scenario.
· Database size and growth settings. The default size and growth settings for SQL databases are really not appropriate for most SharePoint databases, particularly those that will contain BLOBs. Set the initial size of your content database to something that represents the size of the data you're going to upload, and consider the space that metadata will take as well. That way, SQL doesn't have to "grow" the database while you upload—the space is already there (the first sketch after this list shows one way to pre-size the data file). As a side note, size and growth settings affect performance as your environment scales—there are some great blog posts on the "interwebs" to help you determine an appropriate setting, but I recommend an initial size that represents your expected content (including metadata and BLOBs, if stored in SQL) over the first few months of your service, and a growth setting of 10% of that size. But be smart about it—there are a lot of variables in that calculation, and they all depend on your usage patterns.
· Storage performance, of course, can affect the uploads. Consider creative solutions—like moving the database to which you're uploading to a separate set of spindles, a separate SQL instance, or a separate SQL server during the upload, then moving it to its "final home" after uploading is complete. Keep in mind you might even be able to do the migration in a lab and then bring the content database into production: just detach and reattach the content database (see the Dismount/Mount sketch after this list).
· The web front end (WFE) can be a bottleneck. Consider uploading to a dedicated web front end that is not being hit by users (though it's typically the SQL side that's the bottleneck). You can target your migration using DNS or load balancer settings (a hosts-file example appears after the list).
· The bottleneck might be the connection between the WFE and SQL Server. Use a dedicated high-speed (Gig-E or 10Gig-E) network between the WFE and SQL servers, and use NIC teaming if your adapters support it.
· The client side can also be a bottleneck, as can requests that aren't load balanced. Consider running the migration directly on the WFE or from multiple clients, depending on your infrastructure.
· The source can be the bottleneck. Consider all of the previous issues in light of where the files are coming from. Should you perform the upload from the file server, for example? Or should you first move or copy the files to disks that are local to the WFE to maximize performance of the actual upload? That kind of two-step process (sketched after the list) may help you complete the migration within the time windows of your service level agreements.
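A few of these tips can be scripted. The sketches that follow are minimal examples: the server names (SQL01, PRODSQL01), database names (WSS_Content, WSS_Content_Migration), paths, and sizes are all placeholders, so adjust them for your environment and test outside production first. First, the recovery model and pre-sizing tips, using Invoke-Sqlcmd from the SQL Server PowerShell tools:

# Assumes the SQL Server PowerShell tools (Invoke-Sqlcmd) are loaded.
# SQL01 and WSS_Content are placeholders for your instance and content database.
$server = "SQL01"
$db     = "WSS_Content"

# 1. Drop to the Simple recovery model for the duration of the mass upload.
Invoke-Sqlcmd -ServerInstance $server -Query "ALTER DATABASE [$db] SET RECOVERY SIMPLE;"

# 2. Pre-grow the data file so SQL Server is not auto-growing mid-upload.
#    The logical file name usually matches the database name for SharePoint-created
#    databases (check with sp_helpfile); the size and growth values are illustrative.
Invoke-Sqlcmd -ServerInstance $server -Query @"
ALTER DATABASE [$db]
MODIFY FILE (NAME = N'$db', SIZE = 50GB, FILEGROWTH = 5GB);
"@

# ... run the mass upload ...

# 3. Switch back to Full and take a fresh full backup (the backup path is a placeholder).
Invoke-Sqlcmd -ServerInstance $server -Query "ALTER DATABASE [$db] SET RECOVERY FULL;"
Invoke-Sqlcmd -ServerInstance $server -Query "BACKUP DATABASE [$db] TO DISK = N'E:\Backups\$db.bak';"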
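Pausing search for the migration window can be scripted as well. This sketch uses the SharePoint 2010 search object model from the SharePoint Management Shell; it assumes a single Search service application, and it clears each crawl schedule by setting it to $null, which is the documented approach, but verify the behavior on your build:

# Run from the SharePoint 2010 Management Shell; assumes a single Search
# service application. Clears crawl schedules so no crawl starts mid-migration.
$ssa = Get-SPEnterpriseSearchServiceApplication

foreach ($cs in (Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa)) {
    $cs.FullCrawlSchedule        = $null   # remove the schedules for the migration window
    $cs.IncrementalCrawlSchedule = $null
    $cs.Update()
    # If a crawl is already in flight, $cs.StopCrawl() will end it.
}
# Restore the schedules (and consider a fresh full crawl) once the upload is done.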
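The "migrate in a lab, then bring the content database into production" idea maps onto the Dismount-SPContentDatabase and Mount-SPContentDatabase cmdlets. Again, the database, server, and web application names here are placeholders:

# In the lab farm, once the mass upload into WSS_Content_Migration is complete:
Dismount-SPContentDatabase "WSS_Content_Migration"

# Move the database to the production SQL Server with your normal DBA tooling
# (backup/restore or detach/attach), then attach it to the target web application:
Mount-SPContentDatabase -Name "WSS_Content_Migration" `
                        -DatabaseServer "PRODSQL01" `
                        -WebApplication "http://sharepoint.contoso.com"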
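And for the source and WFE-targeting tips, a staged copy with robocopy plus a hosts-file entry that pins the migration client to one specific web front end (the UNC path, local path, IP address, and host name are all made up for illustration):

# Stage the source documents onto fast disk local to the WFE first.
# /E copies subfolders, /MT:16 uses 16 copy threads, /R and /W limit retry waits.
robocopy "\\fileserver\departments" "D:\MigrationStaging" /E /MT:16 /R:2 /W:5

# Pin this migration client to a single, dedicated WFE by IP, bypassing the load
# balancer (run elevated; the IP address and host name are placeholders).
Add-Content -Path "$env:windir\System32\drivers\etc\hosts" `
            -Value "10.0.0.15`tsharepoint.contoso.com"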
There’s a lot of room for performance
problems—and for creative solutions—when you think about all of the moving
parts (both infrastructure and services) that are at play in a simple mass
upload!
Hopefully this gives you some ideas for this
mass upload scenario—ideas that are also useful points to consider in other
performance scenarios. Of course, there’s a LOT more to say about performance
and SQL performance in particular. I’d like to thank my colleague Randy
Williams for his significant contribution to this newsletter and point you to his great presentation about optimizing SQL for SharePoint.
And I’d like to invite you to comment on this article with YOUR experiences
related to mass-upload performance optimization.
5 Reasons Why You Have SharePoint Performance Issues
As SharePoint user adoption
increases, so does the amount of data that must be stored in SharePoint.
Although rapid adoption indicates effective collaboration, this content
explosion can easily outstrip SharePoint’s basic storage configuration.
This causes an outcry both from end
users, who complain about SharePoint’s slow performance, and SQL Server DBAs,
who protest that SharePoint is taking up too much expensive server space and
processing power. All of this can lead to dissatisfaction with a once-loved
platform.
Here are five storage-related issues
in SharePoint that can kill performance, with tips on how to resolve or prevent
them.
Problem #1: Unstructured data
takeover. The primary
document types stored in SharePoint are PDFs, Microsoft Word and PowerPoint
files, and large Excel spreadsheets. These documents are usually well over a
megabyte.
SharePoint saves all file contents in SQL Server as unstructured data, otherwise known as Binary Large Objects (BLOBs). Having many BLOBs in SQL Server causes several issues. Not only do they take up lots of storage space, they also use server resources.
Because a BLOB is unstructured data,
any time a user accesses a file in SharePoint, the BLOB has to be reassembled
before it can be delivered back to the user – taking extra processing power and
time.
Solution: Move BLOBs out of SQL Server and into a
secondary storage location – specifically, a higher density storage array that
is reasonably fast, like a file share or network attached storage (NAS).
Problem #2: An avalanche of large
media. Organizations
today use a variety of large files such as videos, images, and PowerPoint
presentations, but storing them in SharePoint can lead to performance issues because SQL Server isn't optimized to
house them.
Media files, especially, cause
issues for users because they are so large and need to be retrieved fairly
quickly. For example, a video file may have to stream at a certain rate, and
applications won't return control until the file is fully loaded. As more of
this type of content is stored in SharePoint, it amplifies the likelihood that
users will experience browser timeout, slow Web server performance, and upload
and recall failures.
Solution: For organizations that make SharePoint “the
place” for all content large and small, use third-party tools specifically
designed to facilitate the externalization of large media storage and
organization. This will encourage user adoption and still allow you to maintain
the performance that users demand.
Problem #3: Old and unused files
hogging valuable SQL Server storage. As data ages, it usually loses its value and
usefulness, so it’s not uncommon for the majority of SharePoint content to go
completely unused for long periods of time. In fact, an estimated 60 to 80 percent of content in SharePoint is either unused or used only sparingly during its lifespan.
Many organizations waste space by applying the same storage treatment for this
old, unused data as they do for new, active content, quickly degrading both SQL
Server and SharePoint performance.
Solution: Move less active and relevant
SharePoint data to less expensive storage, while still keeping it available to
end users via SharePoint. In the interface, it helps to move these older files
to different parts of the information architecture, to minimize navigational
and search clutter. Similarly, we can “unclutter” the storage back end.
A third-party tool that provides
tiered storage will enable you to easily move each piece of SharePoint data
through its life cycle to various repositories, such as direct attached
storage, a file share, or even the cloud. With tiered storage, you can keep
your most active and relevant data close at hand, while moving the rest to less
expensive and possibly slower storage, based on the particular needs of your
data set.
Problem #4: Lack of scalability. As SharePoint content grows, its supporting
hardware can become underpowered if growth rates weren't accurately forecasted.
Organizations unable to invest in new hardware need to find alternatives that
enable them to use best practices and keep SharePoint performance optimal.
Microsoft guidance suggests limiting content databases to 200GB maximum unless
disk subsystems are tuned for high input/output performance. In addition, huge
content databases are cumbersome for backup and restore operations.
Solution: Offload BLOBs to the file system –
thus reducing the size of the content database. Again, tiered storage will give
you maximum flexibility, so as SharePoint data grows, you can direct it to the
proper storage location, either for pure long-term storage or zippy immediate
use.
It also lets you spread the storage
load across a wider pool of storage devices. This approach keeps SharePoint
performance high and preserves your investment in existing hardware by
prolonging its useful life in lieu of buying expensive hardware. It’s simpler
to invest in optimizing a smaller SQL Server storage core than a full
multi-terabyte storage footprint, including archives.
Problem #5: Not leveraging Microsoft's data externalization features. Microsoft's
recommended externalization options are Remote BLOB Storage (RBS), a SQL Server
API that enables SharePoint 2010 to store BLOBs in locations outside the
content databases, and External BLOB Storage (EBS), a SharePoint API introduced
in SharePoint 2007 SP1 and continued in SharePoint 2010.
Many organizations haven't yet explored these externalization capabilities and are missing out on significant storage and related performance benefits. Native EBS and RBS, however, require frequent T-SQL command-line administration and lack flexibility.
Solution: Use a third-party tool that works with
Microsoft’s supported APIs, RBS, and EBS, and gives administrators an intuitive
interface through SharePoint’s native Central Administration to set the scope,
rules and location for data externalization.
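To give a sense of what that command-line administration looks like, here is the outline Microsoft documents for enabling an RBS provider (such as the SQL Server FILESTREAM provider) on a SharePoint 2010 content database, shown only as a sketch; the database name is a placeholder, and the provider must already be installed on SQL Server and on every web server:

# Run from the SharePoint 2010 Management Shell after the RBS provider has been
# installed on SQL Server and on every web server; WSS_Content is a placeholder.
$cdb  = Get-SPContentDatabase "WSS_Content"
$rbss = $cdb.RemoteBlobStorageSettings

$rbss.Installed()                                    # should return True
$rbss.Enable()                                       # turn RBS on for this database
$rbss.SetActiveProviderName($rbss.GetProviderNames()[0])
$rbss.ActiveProviderName                             # confirm which provider is active

Scoping rules, size thresholds, and moving existing BLOBs out of the database are exactly the pieces this low-level interface doesn't give you, which is where the tools described above come in.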
In each of these five problem areas,
you can see that offloading the SharePoint data to more efficient external
storage is clearly the answer. Microsoft’s native options, EBS and RBS, only
add to the complexity of managing SharePoint storage, however, so the best
option to improve SharePoint performance and reduce costs is to select a
third-party tool that integrates cleanly into SharePoint’s Central
Administration. This would enable administrators to take advantage of EBS and
RBS, choosing the data they want to externalize by setting the scope and rules
for externalization and selecting where they want the data to be stored.
Chris McNulty is a strategic product manager and evangelist for SharePoint Solutions at Quest Software. He is an MCTS and MCSE and a member of the Microsoft Solutions Advocate and MVTSP programs. A frequent speaker at events around the world, he's the author of the SharePoint 2010 Consultant's Handbook and other books, and writes at SharePoint For All and the KnowPoint blog.