Limiting Search Crawling to a subsite
Meet Bhimani

January 09, 2012

I had an interesting challenge: I was asked to limit search crawling to a single subsite. The underlying issue was that a great deal of the security in this farm was implemented via Audiences, which are not a secure method of locking down content. Audiences control whether documents and items are shown to users, but they don’t prevent a user from actually accessing them. Search Content Sources, meanwhile, expect nice and simple Web Application URLs to crawl. So how best to restrict crawling to a subsite?

The simple answer is to set up the Content Source to crawl the whole Web Application, but use Crawl Rules to exclude everything else. Only two rules are needed (a PowerShell sketch follows the list):

  1. Include: List the site to include, such as “http://SharePoint/sites/site1/site2/*.*”
    Note the wildcard at the end, which ensures all sub-content is crawled. Because this is the first crawl rule, it takes precedence over the next one. Don’t forget the *.*: testing a crawl rule that ends in just * will appear to capture all content, but at crawl time only *.* will capture content that has a file extension.
  2. Exclude: List everything else: http://*.*
    This excludes anything not captured by the first rule.
  3. If you have a content source that includes people (sps3://SharePoint), be sure to use a wildcard on the protocol as well.
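These rules can also be created from the SharePoint Management Shell. Here is a minimal sketch, assuming a Search Service Application named “Search Service Application” (swap in your own) and the example URLs from the list above; lower Priority values are evaluated first:

```powershell
# Minimal sketch: create the include/exclude crawl rules via PowerShell.
# "Search Service Application" and the URLs below are assumptions; adjust to your farm.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

$ssa = Get-SPEnterpriseSearchServiceApplication -Identity "Search Service Application"

# Rule 1: include the subsite and everything under it (evaluated first).
New-SPEnterpriseSearchCrawlRule -SearchApplication $ssa `
    -Path "http://SharePoint/sites/site1/site2/*.*" `
    -Type InclusionRule -Priority 0

# Rule 2: exclude everything else in the Web Application.
New-SPEnterpriseSearchCrawlRule -SearchApplication $ssa `
    -Path "http://*.*" `
    -Type ExclusionRule -Priority 1
```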

Voila!

2 thoughts on “Limiting Search Crawling to a subsite”

  1. For me, a host header wasn’t required. Please do check the sequence of the rules: crawl rules are evaluated in the order you specify. On the right-hand side you can change the sequence; then, at the top, try a sample URL to see how the rules evaluate.
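To verify the evaluation order from PowerShell rather than the UI, here is a minimal sketch. It assumes the same Search Service Application name as above, and that the returned rule objects expose Priority, Type, and Path properties:

```powershell
# Minimal sketch: list crawl rules in evaluation order.
# Assumes the Search Service Application is named "Search Service Application"
# and that rule objects expose Priority, Type, and Path (an assumption).
$ssa = Get-SPEnterpriseSearchServiceApplication -Identity "Search Service Application"

Get-SPEnterpriseSearchCrawlRule -SearchApplication $ssa |
    Sort-Object Priority |
    Format-Table Priority, Type, Path -AutoSize
```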
