SharePoint guy/gal’s worst nightmare: the deadliest booboos

Tuesday 19 May 2009 - Filed under Tech Notes

As with many other jobs, implementing or maintaining SharePoint requires an exceptional degree of attention to detail because, well, stuff happens. If an error occurs or a mistake is made inside a testing or staging environment before things go to production, it’s a good lesson. If that lesson is captured, documented, and learned, it becomes a priceless asset. I have hand-picked some of the deadliest SharePoint mistakes which I have directly and indirectly experienced from an implementer’s point of view.

Booboo #1. Oops, I nuked Central Administration!

In smaller SharePoint deployments where there is only one Web front-end server which doubles as an application server, there is no need to turn on services related to document conversions. It’s a good idea to switch them off, not only to save processing power but also to prevent errors from occurring. In Central Administration > Operations > Services on Server, click on the “Stop” links next to the two document conversion services. Sounds simple enough, right? Sometimes, though, it may be necessary to switch to the “Custom” view to list all of the available services. And sometimes, you may click on “Stop” next to the “Central Administration” service by mistake because it’s right above document conversions. When you do, SharePoint doesn’t ask for confirmation and you cannot go back to the same screen. The result: Central Administration is instantly nuked. Without a functioning Central Administration site, it’s SharePoint no more. No point tinkering with IIS because it doesn’t help. All this drama unfolds in a split second, literally.

[Image: Services on server]

Lord, have mercy: Thankfully, there is a way out, I mean, back into SharePoint. Open a Terminal Services (Remote Desktop) session to the server and run the SharePoint Products and Technologies Configuration Wizard from the Start menu. It reconstructs Central Administration in the process. Be sure to provide the same port number it was on so that everything remains the same and, if you’re lucky, nobody will notice that anything happened. The downside is that it takes at least a few minutes to finish, which means it could cause significant downtime in a busy production environment, in which case… Christ, have mercy.

Booboo #2. Oops, search gets no results!

The most painful troubleshooting job I go through while setting up SharePoint servers is working out why content crawl jobs are failing. No crawling, no search index. No search index, no search results. No search, no point using SharePoint. Unfortunately, there is no fixed recipe for this disaster. The root cause may reside in database permissions, service accounts, an IIS screw-up, NTFS security on the “hosts” file, environmental factors to do with the Active Directory domain, “local loopback”, recently installed updates, or any combination of these or other things.

While it’s not always possible to prevent this right from the beginning, it is a seriously career-limiting move not to check regularly that your search is functioning as expected. Examine the application events in Event Viewer. Also load the Search Administration page in the SSP (Shared Services Provider) and check the number of successes, number of errors, and duration of each crawl job. Watch out for full crawls that took only a few seconds to complete… it’s never a good sign. TROUBLESHOOT IT AT ALL COSTS.

Update (26 May 2009): There is growing speculation, at least in my lab, that this is a side-effect of upgrading to SP2. Microsoft has documented some of the symptoms and workarounds here and here, but still does not say that there is a direct correlation between the indexing issue and SP2. I need to get to the bottom of this; in the meantime, I would advise against patching production SharePoint farms beyond the Uber Packages of February 2009 Cumulative Update (12.0.0.6341).

Update (20 July 2009): After numerous deployments in different environments, I have concluded that Service Pack 2 (12.0.0.6421) is not the cause of the various problems mentioned above. As of this writing, the recommended patch level for production is Service Pack 2 + April 2009 Cumulative Update (12.0.0.6504).
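The regular check described above (successes, errors, and duration of each crawl job) can be written down as a simple rule of thumb. What follows is only a minimal sketch in Python: the `CrawlRecord` structure, its field names, and the 60-second threshold are all hypothetical, and the figures would have to be copied by hand from the crawl log, since nothing here talks to SharePoint itself.

```python
from dataclasses import dataclass


@dataclass
class CrawlRecord:
    """One row of crawl history, transcribed by hand (hypothetical structure)."""
    content_source: str
    crawl_type: str          # "full" or "incremental"
    successes: int
    errors: int
    duration_seconds: float


def suspicious_crawls(records, min_full_crawl_seconds=60.0):
    """Return (record, reason) pairs for crawls that deserve a closer look.

    The threshold is an assumption for illustration; tune it to what a
    healthy full crawl takes in your own environment.
    """
    findings = []
    for rec in records:
        if rec.successes == 0:
            findings.append((rec, "no items crawled successfully"))
        elif rec.crawl_type == "full" and rec.duration_seconds < min_full_crawl_seconds:
            findings.append((rec, "full crawl finished suspiciously fast"))
        elif rec.errors > rec.successes:
            findings.append((rec, "more errors than successes"))
    return findings


# Made-up sample figures, purely to show the shape of the check.
sample = [
    CrawlRecord("Local sites", "full", 1200, 3, 5400.0),
    CrawlRecord("Local sites", "full", 2, 0, 4.0),
    CrawlRecord("File shares", "incremental", 0, 17, 30.0),
]

for record, reason in suspicious_crawls(sample):
    print(record.content_source, record.crawl_type, "->", reason)
```

A flagged crawl is not a diagnosis, only a prompt to open Event Viewer and the crawl log and start digging.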

Booboo #3. Oops, the content database is empty!

It’s good to be future-proof and smart about capacity requirements. Projecting the volume of SharePoint content over time with reasonable accuracy is not easy, but it must be done one way or another. Often, the outcome of this exercise is a decision to create (or restore) each site collection in its own content database. Having multiple databases under a single SharePoint Web application is a quantum leap from indiscriminately shoving everything into a single default content database, which is known to have a supported limit of 100GB in size. Have a look at Central Administration > Application Management > Content Databases. There you can set up multiple content databases designed for individual site collections (or groups of them) that belong to the same Web application. The tricky part is making sure that the site collection you subsequently create goes into the content database you built for it. That is because SharePoint uses its own not-so-easy-to-understand logic to select a content database for a newly created site collection. That makes the process more of an art than a science, although it can eventually be expressed as an algorithm.

Each SharePoint content database has these properties:

  • Name. The name is typically prefixed with “WSS_Content”. Any new database you create gets a name like “WSS_Content_4435253d41864f8583ef0203fa1fd89b” where the last part is a randomly-generated identifier. Make sure you change that gibberish into a succinct, meaningful name before you hit the OK button.
  • Database status. It’s either “Started” or “Stopped”, also known as “Ready” and “Offline”, respectively. “Started” or “Ready” means the database is able to house more site collections if instructed. “Stopped” or “Offline” means the database will not accept any more new site collections. That’s all there is to it; the status does not affect the operation of existing site collections in any way.
  • Current number of sites. This shows the number of site collections that are already in the database.
  • Site level warning. When the total number of site collections in the database reaches this value, a warning will be generated so the administrator knows about it. The default value is 9000.
  • Maximum number of sites. This is the maximum number of site collections that can be created in the database. The default is 15000. It must be strictly greater than the “site level warning” value.

SharePoint bases its decision on these properties. Again, the challenge is to put the right site collection(s) into the right content database. Let me illustrate this with a real-life-like situation. Suppose you currently have these content databases set up under a particular Web application:

Name                                          Status   Curr. sites  Max. sites
WSS_Content                                   Started  1            1
WSS_Content_2                                 Stopped  0            15000
WSS_Content_4435253d41864f8583ef0203fa1fd89b  Started  0            15000
WSS_Content_Asset_Management                  Started  0            15000
WSS_Content_dac26d1dc9f044df93b08dd2df8a1e22  Started  1            15000
WSS_Content_ECM                               Started  0            5
WSS_Content_Main                              Stopped  0            1
WSS_Content_Projects                          Started  2            2
WSS_Content_QA                                Started  1            1001

Question: If you were to create a site collection titled “Asset_Management” whose URL is “/am”, which content database would SharePoint place it in?

Answer: The one whose gibberish name ends with “89b”. Here’s why:

  • First, all stopped (offline) databases are eliminated.
  • Next, all remaining databases that have reached their caps (max. sites) are eliminated.
  • Next, SharePoint attempts to match the URL (not the title) of the site collection against the names of the databases. Here, no match is found because “Asset_Management” happens to be the title of the site collection, not the URL.
  • Next, the database(s) with the greatest number of available slots are selected. Here, two of the remaining databases have 15000 available slots.
  • Finally, SharePoint picks the one that comes first alphabetically.
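The elimination steps above can be sketched in a few lines of Python. This is a model of the behaviour as I observed it, not SharePoint’s actual code. The `ContentDatabase` class and `pick_database` function are hypothetical names, and two details are assumptions the walkthrough does not pin down: a URL “matches” only when the database is named “WSS_Content_” plus the last URL segment, and the alphabetical comparison ignores case.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ContentDatabase:
    name: str
    started: bool          # True = "Started"/"Ready", False = "Stopped"/"Offline"
    current_sites: int
    max_sites: int

    @property
    def free_slots(self) -> int:
        return self.max_sites - self.current_sites


def pick_database(databases: List[ContentDatabase], site_url: str) -> Optional[ContentDatabase]:
    """Model of the observed selection order; returns None if nothing is eligible."""
    # Steps 1 and 2: drop stopped databases and databases that have hit their cap.
    candidates = [db for db in databases if db.started and db.free_slots > 0]
    if not candidates:
        return None

    # Step 3: try to match the URL (not the title) against database names.
    # ASSUMPTION: exact match on "WSS_Content_<last URL segment>", ignoring case.
    key = site_url.rstrip("/").rsplit("/", 1)[-1].lower()
    if key:
        matches = [db for db in candidates if db.name.lower() == "wss_content_" + key]
        if matches:
            candidates = matches

    # Step 4: keep only the databases with the most available slots.
    most_free = max(db.free_slots for db in candidates)
    candidates = [db for db in candidates if db.free_slots == most_free]

    # Step 5: first one alphabetically (ASSUMPTION: case-insensitive ordering).
    return min(candidates, key=lambda db: db.name.lower())


# The databases from the table above.
EXAMPLE_DATABASES = [
    ContentDatabase("WSS_Content", True, 1, 1),
    ContentDatabase("WSS_Content_2", False, 0, 15000),
    ContentDatabase("WSS_Content_4435253d41864f8583ef0203fa1fd89b", True, 0, 15000),
    ContentDatabase("WSS_Content_Asset_Management", True, 0, 15000),
    ContentDatabase("WSS_Content_dac26d1dc9f044df93b08dd2df8a1e22", True, 1, 15000),
    ContentDatabase("WSS_Content_ECM", True, 0, 5),
    ContentDatabase("WSS_Content_Main", False, 0, 1),
    ContentDatabase("WSS_Content_Projects", True, 2, 2),
    ContentDatabase("WSS_Content_QA", True, 1, 1001),
]

print(pick_database(EXAMPLE_DATABASES, "/am").name)
# prints "WSS_Content_4435253d41864f8583ef0203fa1fd89b"
```

Playing with the numbers in a model like this before touching Central Administration is a cheap way to confirm which statuses and caps need adjusting so that the next site collection lands where you want it.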

By learning this behaviour, you can always force SharePoint to pick the right content database for a specific site collection. If you are mathematically enthused, you can even batch-restore multiple site collections in one go without having to adjust the statuses or numbers manually during the process and still have all of your site collections correctly placed in their designated databases.

So, what’s the deadly booboo other than getting the names and numbers wrong in the first place? It’s when you create a site collection and don’t check back on the list of content databases to make sure it has indeed gone into the right database. If you fail to spot your mistake early on, the consequences may be catastrophic. So how would you recover from it? Do an STSADM backup of the site collection, delete it from the database, and then do an STSADM restore of it to a new database.

Update (20 July 2009): There is actually another way to specify the target content database. Use the STSADM command to create the site collection in a brand new database (stsadm -o createsiteinnewdb) and set the databasename switch to the name of the new database. This works when you want to create a new content database and a site collection in one go; it doesn’t work for content databases that already exist.
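For reference, the recovery sequence and the createsiteinnewdb alternative can be laid out as STSADM command lines. The Python sketch below only builds and prints them; it does not run anything. The function names, the URL, the file path, the account, and the email address are all hypothetical placeholders, and only the most basic switches are shown, so confirm them with stsadm -help for each operation before relying on them.

```python
def move_site_collection_commands(site_url, backup_file):
    """Backup, delete, restore: the recovery sequence for a misplaced site collection.

    NOTE: the restore lands wherever SharePoint's selection logic puts it, so
    the content database statuses and caps must be arranged before step 3.
    """
    return [
        ["stsadm", "-o", "backup", "-url", site_url, "-filename", backup_file],
        ["stsadm", "-o", "deletesite", "-url", site_url],
        ["stsadm", "-o", "restore", "-url", site_url, "-filename", backup_file],
    ]


def create_site_in_new_db_command(site_url, owner_login, owner_email, database_name):
    """Create a site collection together with a brand new content database."""
    return [
        "stsadm", "-o", "createsiteinnewdb",
        "-url", site_url,
        "-ownerlogin", owner_login,
        "-owneremail", owner_email,
        "-databasename", database_name,
    ]


# Hypothetical values, for illustration only.
for cmd in move_site_collection_commands("http://portal.example.local/am", r"C:\backup\am.bak"):
    print(" ".join(cmd))

print(" ".join(create_site_in_new_db_command(
    "http://portal.example.local/am",
    r"EXAMPLE\spadmin",
    "spadmin@example.local",
    "WSS_Content_Asset_Management",
)))
```

Printing the commands first and reading them back is deliberate: the delete step is exactly the kind of thing you do not want to fire off by mistake.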

Booboo #4. Oops, where did I/we/they back up the entire farm again?

The granddaddy of all deadly SharePoint booboos, in my opinion, is not having the necessary know-how and/or resources to reconstruct a SharePoint farm when it needs to be rebuilt from scratch for whatever reason. Things might change in SharePoint Server 2010, but there is no such thing as real-time geographic fail-over in SharePoint Server 2007. Different disaster recovery mechanisms are required for the different parts and components of a SharePoint farm. Content can be backed up in certain ways, but custom-developed elements and the various layers of configuration need to be backed up in different ways and must be accounted for. The best practice is to practise replicating the content and configuration of a working SharePoint farm onto an offline environment, and to document the findings and caveats in the process.

2009-05-19  »  JK
