How we deployed SharePoint WSP Solutions without downtime


 

Introduction

As your SharePoint environment grows and more solutions are built on the platform it becomes critical to the business. Often it gets to the level that the business is very reluctant to allow SharePoint to be unavailable. This makes it difficult to plan application deployments which generally cause downtime.

Recently, I have been working on an environment which was in use globally and therefore the window for taking down the SharePoint farm is very small.

With that in mind I thought I’d share the process that I now use to deploy SharePoint WSP solutions without any downtime to the SharePoint farm. There are going to be times where you cannot avoid downtime, certain SharePoint object model calls cause all the server to recycle their application pools.

In order for deployments to be executed with no downtime there are a few requirements. There are infrastructure requirements as well as following a process.

 

Farm Configuration Requirements

In order to deploy SharePoint solutions there must be at least two SharePoint Servers configured as Web Front Ends. These web front ends are running the Web Application Service instance.

The next step is to ensure that these servers are being load balanced, either using software load balancing such as Windows Network Load Balancing (NLB) or using hardware load balancers such as the F5 Network’s BIG IP.

The F5 Network’s BIG IP configuration guide can be found here.

Without either the load balancers or the two web front ends then you are not going to be able to prevent downtime.

Regarding the load balancers, they must be configured so that when a server is down then no traffic is directed to that server.

At one client this required a bit more thought. The client used the F5 Network BIG IP load balancers unfortunately the load was not balanced as Round Robin balancing was being used.

Also the method to detect whether a server was down did not work. The way that the load balancers tested that a server was available was by using a low level call into IIS looking for an HTTP status of 200. This meant that as long as IIS up then the server was up and it would receive requests,  even if SharePoint was failing for some reason.

This was fixed by using a cURL script which allows Windows Authentication to be used to call into SharePoint. The script would then look for a particular text string in the page. As long as this was found then the server is up and available.

The last tweak to the load balancers were configured to use Observed dynamic balancing. The following article by Don MacVitte gives a great overview of the different types of load balancing.

Observed load balancing, balances the load by using a number of metrics, this snippet from the article explains in more detail:-

Observed: The Observed method uses a combination of the logic used in the Least Connections and Fastest algorithms to load balance connections to servers being load-balanced. With this method, servers are ranked based on a combination of the number of current connections and the response time. Servers that have a better balance of fewest connections and fastest response time receive a greater proportion of the connections. This Application Delivery Controller method is rarely available in a simple load balancer., including the response time from each server.

 

Deployment Process

SharePoint Deployment Without Downtime Diagram Process

With these infrastructure requirements in place it is possible to deploy solutions without downtime.

For SharePoint 2010 deployments, PowerShell is the tool of choice when deploying SharePoint Solutions.

Once you have the farm configured correctly then you will need to deploy the solutions through the SharePoint 2010 Management Shell.

So generally I login onto the server through Remote Desktop, copy over the solution file(s) into a folder such as c:\install\[solutionname]\[solutionversion].

Once the files are copied over:

  • Start SharePoint 2010 Management Shell
  • Change Directory to where the SharePoint solution (.wsp) are found.
    • cd c:\install\[solutionname]\1.0\

    The approach that I take is to use PowerShell variables where I can. This helps reduce the amount of typing and there are less mistakes made.

  • Create a variable to the current folder
  • $curdir = gl;
  • Create a variable for the solution’s filename
  • $solutionname = "MySolution.wsp”
  • Add the solution
  • Add-SPSolution –LiteralPath $curdir\$solutionname

Next is the important part, the solution has been added to the Configuration database and the next step is to deploy the solution to the servers. To ensure that there is no downtime, we will ensure that the deployment only occurs on one machine at a time.

Depending on the type of solution being deployed there are a few additional attributes that you may have to specify to the Install-SPSolution command.

However the most important attribute we need to remember is the –Local attribute this will only ensure that the solution deployment will only occur on the local server.

Other attributes include:-

  • WebApplication – this will deploy the solution to a specific Sharepoint Web Application
  • GacDeployment – this option allows a solution which contains .NET assemblies to be installed in to the GAC to be installed.
  • CasPolicies – this option allows a solution which contains code access security policies for the assemblies / web parts to be deployed.

Once the Install-SPSolution command has been run then the solution deployment success needs to be checked. This can also be achieved using PowerShell.

   $checkSolution = Get-SPSolution $solutionname;   $checkSolution.LastOperationalDetails;  

When this has been executed an output such as the following will be displayed for the local server.

sharepoint_solution_lastoperationdetails

The process now has to wait until the SPSolution.LastOperationDetails call returns back that the deployment is successful.

Once the deployment has completed, I now restart IIS and the SharePoint Timer Job Service using the following PowerShell.

 Restart-Service sptimerv4;   iisreset;   

The installation of the SharePoint solutions will have caused IIS to restart and therefore the server will not have responded to the load balancer. Provided that the load balancer has been configured correctly the server should no longer be processing requests to clients and therefore there is no loss of service.

Depending on configuration of the load balancers, the time it takes for a server to start responding to requests will based on a setting called ramp up time.

The slow ramp up time is the time the load balancer will wait after the server has come back online before it starts sending all requests to the server. This gives the server time to get back on its feet and processing those requests.

Once the server is back online, the SharePoint solution process can be repeated for each of the other SharePoint Servers.

 

Deployment Gotchas

Although this approach will for most cases work there are a few gotchas that you need to watch out for. In some of these occasions all SharePoint WFEs will have their application pools restarted.

Currently these are:-

  • SPWebConfigModification – its a good practice to apply changes to the web.config files for an IIS Web Application using the SharePoint SPWebConfigModification object. This will ensure that any settings required in the web.config for your solution will be applied. This reduces the likelihood of an issue with the solution due to down to missing configuration. Also if new servers are added to the farm then their web.config files wil be setup correctly. This will save you lots of time if you ever have to do a disaster recovery exercise!
  • SharePoint Field Controls – if your solution includes a new custom SharePoint field. Then this will not be available until all SharePoint servers have been updated. This can be more of a problem when for example you are using something like Telerik’s RadEditor control and are performing an upgrade. When these type of deployments are being made then its best to inform the users that the application will not be available. At least the rest of the SharePoint farm is up and running!

Note: As more issues are found then this section will be updated.

Deployment Script

The final deployment script is as follows:-

  cd c:\install\[solutionname]\[versionnumber];   $curdir=gl;   $solutionname=”SolutionName.wsp”;   Add-SPSolution –LiteralPath $curdir\$solutionname;   Install-SPSolution –Local –Identity $solutionname –GACDeployment –CASPolicies;   $checksolution = Get-SPSolution $solutionname;   $checksolution.LastOperationalDetails;   

 

Finally

I hope that you find this useful and would love to hear about your experience and approach to deployment of SharePoint solutions.

Save typing when Manipulating SharePoint Solution Files in PowerShell


 

Introduction

 

One of the things that annoyed me with PowerShell and SharePoint are the following commands:-

  • Add-SPSolution
  • Update-SPSolution
  • Actually anything that requires a –LiteralPath parameter

Don’t get me wrong I think that using PowerShell to admin SharePoint is fantastic but these commands bug me because they require a full path to work correctly. Unfortunately you don’t seem to be able to use the full stop (or period) to represent the current directory.

So you cannot do something like this:-

Add-SPSolution .\mysharepoint.wsp;

Instead you need to do this:-

Add-SPSolution c:\thisisareallylongpath\release1\section2\mysharepoint.wsp

Solution

Anyway, that got me thinking surely there is a better way and I can save myself some typing.

The solution is to use the command Get-Location.

So instead you can do this:-

$curdir = Get-Location;
Add-SPSolution–LiteralPath $curdir”\mysharepoint.wsp”;

Actually there is a PowerShell Alias called gl which is a shortcut for the Get-Location command.

So the shortest version in terms of saving your typing fingers is this:-

$curdir = gl;
Add-SPSolution –LiteralPath $curdir”\mysharepoint.wsp”;

Really hope that helps save you some time, also love to hear any other alternatives, there is bound to be an even better way of doing this.

SharePoint Utility to manage Content Type and List Event Receivers


Introduction

Recently I have been building a solution which makes use of Content Type Event Receivers. The event receivers were being deployed using declarative xml for example:-

<ContentType ID=”[Removed]” Name=”My Content Type” Group=”My Content Types” Version=”0″>
<DocumentTemplate TargetName=”/_layouts/CreatePage.aspx” />
<XmlDocuments>
<XmlDocument NamespaceURI=”http://schemas.microsoft.com/sharepoint/events“>
<Receivers xmlns:spe=”http://schemas.microsoft.com/sharepoint/events“>
<Receiver>
<Name>MyEventHandler</Name>
<Type>ItemUpdated</Type>
<SequenceNumber>10000</SequenceNumber>
<Assembly>MyAssembly, Version=1.0.0.0, Culture=neutral, PublicKeyToken=000000000000</Assembly>
<Class>MyAssembly.EventReceivers.ContentTypeEventReceiver</Class>
<Data></Data>
<Filter></Filter>
</Receiver>
……

Now this normally works great, except when you make a mistake. Its quite well known that upgrading Content Types that have been defined using declarative xml can be tricky.

This is one of the reasons why Microsoft released Feature Upgrades in SharePoint 2010 (if you haven’t seen this then please check out Chris O’Brien’s set of blog posts (http://www.sharepointnutsandbolts.com/2010/06/feature-upgrade-part-1-fundamentals.html) on the subject).

In this case the development was on a SharePoint 2007 environment and I found myself in a situation with deploying Content Type Event Receivers.

It seems that once the Content Type had been deployed and the feature switched on its quite difficult to make certain changes to the content type. This includes changes like the following:-

  • changing a field’s default value
  • changing whether a field is required
  • changing the event receiver definition
    However, unfortunately Gary LaPointe’s stsadm commands would not working correctly for my content type and Gaetan Bouveret’s utility only manages list event receivers but does look very good at that!

Solution

Therefore I started to write my own command line utility. The first version was quickly written to get what I needed to do, done.

I have updated it so that at least its a bit more intuitive but it still only allows you to modify event receivers at a content type or list level.

Please be careful because if you ask it to delete an event receiver then guess what it will delete the event receiver!

As with all these utilities please do not try using this on your production environment without performing some testing on a development environment beforehand.

If you have any comments, problems with the utility please leave a comment and I’ll take a look..

Utility System Requirements:-

  • SharePoint 2007 SP2 (Tested on April 2010 and April 2011 Cumulative Update)
  • Utility must be run from machine which is connected to the SharePoint farm that is being modified.

Usage:-

Console application name: ITSP.EventReceiverConfig.exe

Command Line Arguments:-

  • -url : full url of site that hosts content type or list
  • -mode : [list | add | remove]
  • -targettype : [list | contenttype | assembly]
  • -targetname : [name of list | name of content type | name of assembly]
  • -assembly : [full assembly name including version number etc]
  • -class : [fully qualified class name that is implementing the event receiver]
  • -type : [itemadding | itemadded | itemupdating | itemupdated]
  • -sequence : [number bigger than 1]

The utility with releases, document and source code can be found on Codeplex.

Codeplex Project: http://itspeventreceiver.codeplex.com

Documentation can be found here: http://itspeventreceiver.codeplex.com/documentation

Corrupted List Url Fix: Link to SharePoint List points to /_layouts/listedit.aspx?List=[ListGuid]


Issue

Had an issue at a client where a SharePoint task list got corrupted.

The issue had come about due to the list being copied using Metalogix SharePoint Migration Manager within the same site. The copy of the list had been corrupted so that if you clicked on “View All Site Content”, then clicked on the list name you were taken to the Edit List Settings page (/_layouts/listaspx.aspx?List=[GuidOfList]). Once in the list settings page, clicking on the breadcrumb to take you back to the list would take you to the web application homepage.

Solution

The fix was actually quite straightforward, if you typed in the url of the list correctly, http://sharepointsiteurl/sites/site/lists/My%20List%20Name, you could access the default list view with no problem. So I used a similar method to fix that I have used in this blog post, Manage Content and Structure error.

To fix:-

  • Delete the list using List Settings->Delete this list.
  • Restore the list from the recycle bin

Voila! The link within “View All Site content” is now working as expected.

Synchronization for Shared Services Provider ‘xxxx’ has failed.


Introduction

Recently we had an issue on a client’s SharePoint farm. The issue only occured on one SharePoint web front end and the following message was displayed within the server’s event log:-

Event Type: Warning
Event Source: Office SharePoint Server
Event Category: Office Server Shared Services
Event ID: 5783
Date:  dd/mm/yyyy
Time:  hh:mm:ss
User:  N/A
Computer: XXXXXX
Description:
Synchronization for Shared Services Provider ‘xxxx’ has failed. The operation will be retried.

Reason: Collection was modified; enumeration operation may not execute.

Technical Support Details:
System.InvalidOperationException: Collection was modified; enumeration operation may not execute.
   at System.Collections.Hashtable.HashtableEnumerator.MoveNext()
   at Microsoft.Office.Server.Administration.SharedResourceProvider.SynchronizeConfigurationDatabaseAccess(SharedComponentSecurity security)
   at Microsoft.Office.Server.Administration.SharedResourceProvider.SynchronizeAccessControl(SharedComponentSecurity sharedApplicationSecurity)
   at Microsoft.Office.Server.Administration.SharedResourceProvider.Microsoft.Office.Server.Administration.ISharedComponent.Synchronize()
   at Microsoft.Office.Server.Administration.SharedResourceProviderJob.Execute(Guid targetInstanceId)

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.

Solutions

There were a number of solutions around including this thread, http://social.technet.microsoft.com/Forums/en/sharepointadmin/thread/d2d0e939-2e29-462e-a2f0-6f5e54f6e8fd

The thread talks about running the following commands:-

stsadm -o sync

This will force Sync on all Databases. Wait for some time may be an hour before running following commands.

stsadm -o sync -ListOldDatabases <Days>

This command will list out the databases which are out of sync for number of days specified.
You can also runfollowing command

stsadm -o sync -DeleteOldDatabases <days>

This will Remove Databases which it is unable to sync for number of days specified

stsadm -o sync

However when the command: stsadm -o sync -ListOldDatabases 1 was run then the following message was returned:

“No Databases met the criteria specified”.

Fix

The fix was to restart the Windows SharePoint Timer Job using:-

  • net stop sptimerv3
  • net start sptimerv3

No further error messages were received in the event log.

Unable to create a page based on a Publishing Page Layout


Introduction

This issue reared its head a few weeks ago. One of my clients has a pretty heavily customised Intranet which uses custom site definitions with feature stapling etc.

I have advised them to use separate site collections to help keep their site collections secure but with the main advantage being that they can use separate content databases for each site collection. This is to help ensure that they don’t get large 100Gb databases and issues with SQL Server locking.

Anyway, recently we started seeing an issue where the various contributors to the site (who were not site collection admins) could not create a page using the normal site settings->create page.

They had contribute rights but would get the error message:-

“The list does not exist”

 

Solution

So I started thinking about what the problem could be. The list does not exist message was being displayed before a page has been created and as soon as the user clicked Site Settings->Create a Page.

When you create a page you get a list of page layouts to choose from. So that brought me to checking that the master page gallery was there. I thought maybe someone had deleted the list or a feature had not been activated properly.

The list was there but no permissions were set, ah I thought that will be it.

I added the [Site Name] Members and [Site Name] Owners groups giving the Owners group Contributor permissions and the Members group Reader permissions.

The users could now create page successfully.

The reason that the problem was happening is that as part of the site collection creation process the content managers were being good and cleaning up the unused out of the box publishing SharePoint groups.

These groups include the following:-

  • Approvers
  • Designers
  • Hierarchy Managers
  • Quick Deploy Users
  • Restricted Readers
  • Style Resource Readers

However, this process was removing all the permissions from the master page gallery and hence any users who which were not site collection admins could not access the list.

The next step to this solution is write a Feature Receiver to clean up the SharePoint groups and clean up the permissions, this should help make things easier and ensure that users can start using the site as soon as its setup.

That will have to wait for a further blog post though.

Manage Content and Structure (An Unknown Error Occured)


>

Sometimes you have got to love SharePoint, or more importantly love the developers who developed SharePoint to return an exception message such as “An unknown error occured” when something bad happens.

This message occurred when using the Site Actions->Manage Content and Structure feature.

After some investigation a couple of posts on MSDN were found these talked about checking various xml files which customise the Publishing Editing Toolbar and Site Actions toolbar.

The MSDN post said to check that the files were both checked in and published.

However, this did not fix the problem. After further investigation I could see that some lists had two entries in the View All Site Content page (/_layouts/viewlsts.aspx), both which linked to the same list. So the problem is how can we get rid of one of these links?

The first idea was to backup the list as a list template and then delete the list, however this would change the list id when a new list was created from the template.

Instead I had the idea to use the recycle bin feature to bring the list back.

Deleting and restoring the list from the recycle bin did indeed fix the duplicate list links and also fixed the Manage Content and Structure feature!

Next, is to workout why these list entries are duplicated.