Lars Fastrup, Independent Consultant
SharePoint 2010 Search…What’s Next for SharePoint Pros
Sponsored by BA-Insight www.ba-insight.net March 2010
SharePoint 2010 Search - What’s next for IT-Pros
SharePoint 2010 Enterprise Search includes many relevant improvements for IT
professionals. The biggest change is without doubt the new and more scalable
deployment architecture. Beyond that, SharePoint 2010 offers many other useful
changes, including an improved administration dashboard, built-in administration
reports for monitoring the performance of the search engine over time, and
complete PowerShell support enabling scriptable administration. Last but not
least, SharePoint 2010 also ships with an improved connector framework allowing
IT-Pros to easily configure indexing of remote repositories. The following
sections introduce more technical details on these improvements for IT
professionals.
New Scale-Out Architecture
The search engine in Microsoft Office SharePoint Server 2007 (MOSS 2007)
suffered from a number of scalability problems, all of which SharePoint 2010
addresses with a new scale-out architecture. The scalability issues found in
MOSS 2007 and resolved in SharePoint 2010 include the following:
· High query latency and slow crawls when the search index grows to millions of
items. The official limit of 50 million items per index does not perform well
in practice.
· Non-redundant index server role, making it a single point of failure and a
performance bottleneck with respect to crawl speed.
· Non-redundant property database, making it a single point of failure and a
performance bottleneck with respect to crawl speed as well as query latency.
The SharePoint 2010 search engine introduces a new and highly componentized
deployment architecture to resolve these scalability issues. The components that
IT-Pros must learn how to deploy are: one Administration Component, one
Administration Database, and one or more each of the Query Component, Crawl
Component, Property Database and Crawl Database. This componentization of the
search engine offers the following features and benefits:
· Index Partitioning, enabling a search index to be partitioned across multiple
query servers, which in turn work in parallel on each query. This enables
deployment architectures with sub-second query latency up to about 100 million
indexed items.
· Index Mirroring, enabling query failover by cross-mirroring the search index
on the query servers (passive mirroring) or mirroring it to a parallel set of
query servers (active mirroring).
· Multiple Stateless Crawlers, offering improved crawl performance and high
availability of crawls. Stateless refers to the fact that the crawlers are
redundant and do not keep a copy of the index on the server, as was the case
with the index server in MOSS 2007. Consequently, crawlers have a low disk space
requirement.
· Multiple Crawl Databases for improved crawl performance. Supports native SQL
mirroring for failover.
· Multiple Property Databases for improved query performance. Supports native
SQL mirroring for failover.
Figure 1 shows a sample deployment with a partitioned and mirrored search index,
multiple property databases, multiple crawlers and multiple crawl databases.
Figure 1: Sample SharePoint 2010 Search deployment.
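To make the idea of index partitioning more concrete, the following is a minimal conceptual sketch in Python. It is not how SharePoint 2010 implements partitioning; the class names, the CRC32-based document placement and the thread pool are all assumptions chosen for illustration. It only shows the general pattern described above: each query server holds a slice of the index, every query is sent to all partitions in parallel, and the partial results are merged.

```python
from concurrent.futures import ThreadPoolExecutor
import zlib


class QueryServer:
    """Hypothetical query server holding one index partition (term -> doc IDs)."""

    def __init__(self):
        self.postings = {}

    def add(self, doc_id, text):
        for term in text.lower().split():
            self.postings.setdefault(term, set()).add(doc_id)

    def search(self, term):
        return self.postings.get(term.lower(), set())


class PartitionedIndex:
    """Spreads documents over several query servers and fans queries out."""

    def __init__(self, partition_count):
        self.servers = [QueryServer() for _ in range(partition_count)]

    def _partition_for(self, doc_id):
        # Stable hash so a document always lands in the same partition.
        return zlib.crc32(doc_id.encode("utf-8")) % len(self.servers)

    def add(self, doc_id, text):
        self.servers[self._partition_for(doc_id)].add(doc_id, text)

    def search(self, term):
        # Query every partition in parallel, then merge the partial results.
        with ThreadPoolExecutor(max_workers=len(self.servers)) as pool:
            partial = list(pool.map(lambda server: server.search(term), self.servers))
        return sorted(set().union(*partial))


index = PartitionedIndex(partition_count=3)
index.add("doc-1", "SharePoint search architecture")
index.add("doc-2", "Crawl database sizing")
index.add("doc-3", "Search topology planning")
hits = index.search("search")
```

Because each partition only scans its own slice, adding partitions shrinks the work done per server for a given query, which is the intuition behind the latency benefit claimed for partitioned deployments.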

Improved Administration Experience
The consolidated administration dashboard introduced with the MOSS 2007
infrastructure update has been carried along and improved in SharePoint 2010.
Hence, search administrators who know the MOSS 2007 dashboard will find the new
administration experience very familiar. The dashboard provides IT
administrators with a quick overview of the state of the search engine and easy
access to its configuration. Significant improvements include:
· Topology editor for adding, updating and removing search components in a
deployment.
· Support for managing custom content sources directly in the Web UI (required
custom code in MOSS 2007).
· Support for regular expressions in Crawl Rules.
· Ability to prioritize Content Sources.
· Improved Web analytics reports for monitoring search usage.
· New administration reports to monitor the performance of query components and
crawl components in a deployment.
· Web part based dashboard page allowing for easy customization with custom Web
parts.
· Advanced monitoring through Microsoft System Center Operations Manager (SCOM).
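As an illustration of what regular expressions in Crawl Rules make possible, here is a small conceptual sketch in Python. It does not use SharePoint's API or its regular expression dialect; the `CrawlRule` type, the `should_crawl` helper, the intranet URLs and the first-matching-rule-wins evaluation order are all assumptions made for the example.

```python
import re
from dataclasses import dataclass


@dataclass
class CrawlRule:
    pattern: str   # regular expression matched against the full URL
    include: bool  # True = crawl matching URLs, False = skip them


def should_crawl(url, rules, default=True):
    """Return the decision of the first rule whose pattern matches the URL."""
    for rule in rules:
        if re.fullmatch(rule.pattern, url, flags=re.IGNORECASE):
            return rule.include
    # No rule matched: fall back to the default decision.
    return default


# Hypothetical rule set: skip temp/backup files, crawl only the HR and
# Finance sites, and exclude everything else on the intranet host.
rules = [
    CrawlRule(r"https?://intranet/.*\.(tmp|bak)", include=False),
    CrawlRule(r"https?://intranet/sites/(hr|finance)/.*", include=True),
    CrawlRule(r"https?://intranet/.*", include=False),
]
```

With plain wildcard rules, the second rule would need one entry per site; a single pattern with an alternation covers both, which is the kind of consolidation regular expressions allow.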
The screenshot in Figure 2 illustrates the look and feel of the dashboard, and
Figure 3 shows a sample report on the crawl rate over time.
Figure 2: Consolidated Administration Dashboard

Figure 3: Report on crawl rate over time

PowerShell Support
Say goodbye to STSADM and hello to Microsoft PowerShell. Virtually every
administrative operation in SharePoint 2010 is now scriptable through a rich
palette of PowerShell Cmdlets[1]. Enterprise Search is no exception: it ships
with 100+ PowerShell Cmdlets enabling scripted administration of search
artifacts like:
· Search Service Application
· Crawl, Query and Database components
· Content sources
· Crawl rules
· Crawled metadata properties
· Managed metadata properties
· Search scopes
· Ranking model
· And much more…
Executing PowerShell commands is easy: simply log in to the server, launch the
SharePoint 2010 Management Shell from the Windows Start menu, and type the
PowerShell command or script to execute. Figure 4 below shows how to add a new
Content Source for crawling a file share using the Cmdlet named
New-SPEnterpriseSearchCrawlContentSource.
Figure 4: Sample command executed from the SharePoint 2010 Management Shell

To list all SharePoint 2010 Cmdlets, type:
Get-Command -PSSnapin "Microsoft.SharePoint.PowerShell" | Format-Table Name
To view the usage of a Cmdlet, type:
Help <Name of Cmdlet>
To view the detailed usage of a Cmdlet, type:
Help <Name of Cmdlet> -Full
Improved Connector Framework
The SharePoint 2010 Enterprise Search Engine also ships with a new connector
framework leveraging the new Business Connectivity Services (BCS)[2] to index
external content. Together with improved tool support in SharePoint Designer
2010, the framework enables administrators to configure the indexing of external
content through the following generic connectors:
· Database connector
· Windows Communication Foundation (WCF) / Web Services connector
· .NET connector with callouts to custom code
Developers can additionally develop custom connectors in managed code (.NET) to
efficiently index any custom repository not supported by the BCS. The connector
framework supports indexing of structured content (rows and columns) and
unstructured content (documents) along with security descriptors (ACLs) on each
item. The latter enables automatic security trimming of search results at query
time. This is a big improvement over the Business Data Catalog (BDC) in MOSS
2007, which can only index structured data without associated security
descriptors.
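The effect of indexing security descriptors alongside content can be sketched as follows. This is a conceptual Python illustration only, not the BCS or search engine API: the `IndexedItem` type, the `search` function and the principal names are hypothetical, and a real ACL carries far more detail than a set of allowed principals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexedItem:
    title: str
    text: str
    allowed: frozenset  # principals (users or groups) granted read access


def search(index, term, user_principals):
    """Return titles of matching items that the user is allowed to read."""
    principals = set(user_principals)
    return [
        item.title
        for item in index
        # Keep an item only if it matches the term AND the user holds at
        # least one principal listed in the ACL stored with the item.
        if term.lower() in item.text.lower() and item.allowed & principals
    ]


index = [
    IndexedItem("Salary review", "confidential salary data", frozenset({"hr-team"})),
    IndexedItem("Salary FAQ", "public salary bands", frozenset({"all-staff"})),
]

staff_hits = search(index, "salary", {"alice", "all-staff"})
hr_hits = search(index, "salary", {"bob", "hr-team", "all-staff"})
```

Because the access information is captured at crawl time, the trimming decision at query time needs no round trip to the source repository, which is why storing ACLs in the index matters for external content.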
These improvements over the BDC eliminate the need to develop complex Protocol
Handlers to index documents and security information from custom repositories.
However, the Protocol Handler connectivity framework is still present and is
used by SharePoint 2010 to index SharePoint content, file shares, Web sites and
people profiles. The new connector framework, in turn, is used when indexing
content from Lotus Notes™, Exchange Public Folders and Documentum™.
Figure 5 outlines the overall architecture of the new connector framework.
Figure 5: Connector Framework Architecture
