PowerShell: Deleting SharePoint List Items


Introduction

Whilst I love SharePoint Workflows and how versatile they can be, they can generate quite a bit of data. Well mine do as I like to log plenty of information so that the support / admin teams can find out what’s going on with the workflow.

Unfortunately when you log plenty of information this means that the workflow history list can get quite large.

One of the workflows that we built has, over a ten-month period, processed a couple of hundred thousand list items and created about 3 million list items in the workflow history list.

We wanted to clear down this list and so PowerShell came to the rescue.

Solution

We built the following PowerShell script, which takes the following parameters:

  • Url – URL of the web hosting the workflow history list
  • AgeOfItemsToKeepInDays – the number of days of logs that you wish to keep
  • ListName – the display name of the workflow history list
  • NumberOfItemsToDeleteInBatch – the number of items that should be returned in each query.
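
As a rough illustration, the script might be invoked as shown here. This is only a sketch: it assumes the script has been saved as Delete-ListItemsOlderThan.ps1 (a hypothetical file name) and is run from the SharePoint Management Shell on a farm server. The URL, list name and values are placeholders, and the parameter names are the ones declared in the script's param block.

```powershell
# Hypothetical invocation: the file name, URL and values are placeholders.
# Keep the last 90 days of history and query 500 items at a time.
.\Delete-ListItemsOlderThan.ps1 `
	-Url "http://sharepoint" `
	-ListName "Workflow History" `
	-AgeOfItemsToKeepInDays 90 `
	-NumberOfItemsToDeleteInBatch 500
```

Any parameter that is left out falls back to the default declared in the param block.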

The original script looked like this:-

param
(
	[Parameter(Mandatory=$false, HelpMessage='System Host Url')]
	[string]$Url = "http://sharepoint",
	[Parameter(Mandatory=$false, HelpMessage='List Name')]
	[string]$ListName = "Workflow Tasks",
	[Parameter(Mandatory=$false, HelpMessage='Age of items in list to keep (number of days).')]
	[int]$AgeOfItemsToKeepInDays = 365,
	[Parameter(Mandatory=$false, HelpMessage='What size batch should we delete the items in?')]
	[int]$NumberOfItemsToDeleteInBatch = 1000
	
)

$assignmentCollection = Start-SPAssignment -Global;

$rootWeb=Get-SPWeb $Url -AssignmentCollection $assignmentCollection;

$listToProcess = $rootWeb.Lists.TryGetList($ListName);
if($listToProcess -ne $null)
{
	$startTime = [DateTime]::Now;
	$numberOfDaysToDelete = [TimeSpan]::FromDays($AgeOfItemsToKeepInDays);
	$deleteItemsOlderThanDate = [DateTime]::Now.Subtract($numberOfDaysToDelete);
	$isoDeleteItemsOlderThanDate = [Microsoft.SharePoint.Utilities.SPUtility]::CreateISO8601DateTimeFromSystemDateTime($deleteItemsOlderThanDate);
	$numberOfItemsToRetrieve = $NumberOfItemsToDeleteInBatch;
	
	$camlQueryString = [String]::Format("<Where><Leq><FieldRef Name='Modified' /><Value IncludeTimeValue='TRUE' Type='DateTime'>{0}</Value></Leq></Where>", $isoDeleteItemsOlderThanDate);
	$camlQuery = New-Object -TypeName "Microsoft.SharePoint.SPQuery" -ArgumentList @($listToProcess.DefaultView);
	$camlQuery.Query=$camlQueryString;
	$camlQuery.RowLimit=$numberOfItemsToRetrieve;
	
	$deletedItemCount=0;
	
	do
	{
		$camlResults = [Microsoft.SharePoint.SPListItemCollection] $listToProcess.GetItems($camlQuery);
		$itemsCountReturnedByQuery = $camlResults.Count;
		Write-Host "Executed Query and found " $camlResults.Count " Items";
		
		$listItemDataTable = [System.Data.DataTable]$camlResults.GetDataTable();
		foreach($listItemRow in $listItemDataTable.Rows)
		{
			$listItemIdToDelete = $listItemRow["ID"];
			$listItemModifiedDate = $listItemRow["Modified"];
			Write-Host "Deleting Item $listItemIdToDelete - Modified $listItemModifiedDate";
			$listToProcess.Items.DeleteItemById($listItemIdToDelete);
			$deletedItemCount++;
		}
	}
	while($itemsCountReturnedByQuery -gt 0)
	
	$totalSecondsTaken = [DateTime]::Now.Subtract($startTime).TotalSeconds;
	Write-Host -ForegroundColor Green "Processing took $totalSecondsTaken seconds to delete $deletedItemCount Item(s).";
}
else
{
	Write-Host "Cannot find list: " $ListName;
}

Stop-SPAssignment -Global -AssignmentCollection $assignmentCollection;

Write-Host "Finished";

However, whilst this worked OK for a list that was quite small, when we went to use it on the Production environment it performed like a dog. Fortunately the script was run out of hours, so it didn’t impact the environment too much, though the memory that it consumed was quite large (4GB) after deleting only the second item.

There was something seriously wrong with the approach being taken, and after a bit of investigation it was obvious what was going on.

Look at the original script again; it contains this line of code:-

$listToProcess.Items.DeleteItemById($listItemIdToDelete);

Well, it turns out that this call is expensive: every access to the Items property fetches the list's item collection again, and the collection is then updated after DeleteItemById is called. So we made a small modification and the offending line became:-

$listItemToDelete = $listToProcess.GetItemById($listItemIdToDelete);
$listItemToDelete.Delete();

This change meant that the PowerShell session now only consumed 270MB (I say only!) and memory usage did not rise. The deletion of the items was much quicker too, probably by a few thousand percent!
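
If you want to watch for this kind of memory growth yourself, one option is to log the working set of the PowerShell process as the script runs. The following is a minimal sketch and not part of the original script: Get-SessionMemoryMB is a hypothetical helper name, and the loop simply stands in for the deletion loop.

```powershell
# Hypothetical helper: returns the working set of the current PowerShell process in MB.
function Get-SessionMemoryMB
{
	$process = [System.Diagnostics.Process]::GetCurrentProcess();
	try
	{
		return [Math]::Round($process.WorkingSet64 / 1MB, 1);
	}
	finally
	{
		# Release the process handle once the value has been read.
		$process.Dispose();
	}
}

# Stand-in for the deletion loop: report memory usage every 1000 items.
$deletedItemCount = 0;
foreach($itemId in 1..2500)
{
	# ... delete the item here ...
	$deletedItemCount++;
	if($deletedItemCount % 1000 -eq 0)
	{
		Write-Host "Deleted $deletedItemCount item(s), working set $(Get-SessionMemoryMB) MB";
	}
}
```

A figure that keeps climbing between reports is a good hint that something in the loop is holding on to memory.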

Here is the final script for completeness.

param
(
	[Parameter(Mandatory=$false, HelpMessage='System Host Url')]
	[string]$Url = "http://sharepoint",
	[Parameter(Mandatory=$false, HelpMessage='List Name')]
	[string]$ListName = "Workflow Tasks",
	[Parameter(Mandatory=$false, HelpMessage='Age of items in list to keep (number of days).')]
	[int]$AgeOfItemsToKeepInDays = 365,
	[Parameter(Mandatory=$false, HelpMessage='What size batch should we delete the items in?')]
	[int]$NumberOfItemsToDeleteInBatch = 1000
)

$assignmentCollection = Start-SPAssignment -Global;

$rootWeb=Get-SPWeb $Url -AssignmentCollection $assignmentCollection;

$listToProcess = $rootWeb.Lists.TryGetList($ListName);
if($listToProcess -ne $null)
{
	$startTime = [DateTime]::Now;
	$numberOfDaysToDelete = [TimeSpan]::FromDays($AgeOfItemsToKeepInDays);
	$deleteItemsOlderThanDate = [DateTime]::Now.Subtract($numberOfDaysToDelete);
	$isoDeleteItemsOlderThanDate = [Microsoft.SharePoint.Utilities.SPUtility]::CreateISO8601DateTimeFromSystemDateTime($deleteItemsOlderThanDate);
	$numberOfItemsToRetrieve = $NumberOfItemsToDeleteInBatch;

	$camlQueryString = [String]::Format("<Where><Leq><FieldRef Name='Modified' /><Value IncludeTimeValue='TRUE' Type='DateTime'>{0}</Value></Leq></Where>", $isoDeleteItemsOlderThanDate);
	$camlQuery = New-Object -TypeName "Microsoft.SharePoint.SPQuery" -ArgumentList @($listToProcess.DefaultView);
	$camlQuery.Query=$camlQueryString;
	$camlQuery.RowLimit=$numberOfItemsToRetrieve;

	$deletedItemCount=0;

	do
	{
		$camlResults = [Microsoft.SharePoint.SPListItemCollection] $listToProcess.GetItems($camlQuery);
		$itemsCountReturnedByQuery = $camlResults.Count;
		Write-Host "Executed Query and found " $camlResults.Count " Items";

		$listItemDataTable = [System.Data.DataTable]$camlResults.GetDataTable();
		foreach($listItemRow in $listItemDataTable.Rows)
		{
			$listItemIdToDelete = $listItemRow["ID"];
			$listItemModifiedDate = $listItemRow["Modified"];
			Write-Host "Deleting Item $listItemIdToDelete - Modified $listItemModifiedDate";
			$listItemToDelete = $listToProcess.GetItemById($listItemIdToDelete);
			$listItemToDelete.Delete();
			$deletedItemCount++;
		}
	}
	while($itemsCountReturnedByQuery -gt 0)

	$totalSecondsTaken = [DateTime]::Now.Subtract($startTime).TotalSeconds;
	Write-Host -ForegroundColor Green "Processing took $totalSecondsTaken seconds to delete $deletedItemCount Item(s).";
}
else
{
	Write-Host "Cannot find list: " $ListName;
}

Stop-SPAssignment -Global -AssignmentCollection $assignmentCollection;

Write-Host "Finished";

Hope that helps someone who has the same problem. Please let me know if you have an alternative solution!

Links to the scripts:-

Delete-ListItemsOlderThan-Slow.txt

Delete-ListItemsOlderThanV2.txt

PowerShell, SharePoint and Memory Leaks (Start-SPAssignment)


Introduction

Over the last couple of years I have been using PowerShell to do more and more within my SharePoint environments.

Recently I have been writing a script to move a large number of SharePoint Web objects from one site collection to another.

Whilst this was being tested, it was obvious that something wasn’t quite right as I kept seeing the following messages in the ULS logs:-

Potential excessive number of SPRequest objects (60) currently unreleased on thread 10. Ensure that this object or its parent (such as an SPWeb or SPSite) is being properly disposed. This object is holding on to a separate native heap. This object will not be automatically disposed…

 

Now I am used to seeing these “object not disposed” messages in logs when writing code, and the majority are easily fixed. Please see Stefan Goßner’s excellent article on disposing of SPWeb and SPSite objects for more information.

The reason these errors are displayed is that the objects are not being correctly disposed of after they are used. This leads to memory slowly being consumed and not released. Eventually, if enough objects are created, memory pressure is placed on the server and performance is impacted.

With C# code this is easily sorted out by either calling .Dispose() or wrapping the code that uses the disposable object in a using() statement.

For example:-


using (SPWeb web = _site.OpenWeb("http://sharepoint"))
{
	if (web.Exists)
	{
		// do something
	}
}

However PowerShell objects such as the SPWebPipeBind object or SPSitePipeBind object don’t have a .Dispose() function. So how the heck do you stop them from leaking memory?

Well, the secret is to make use of the Start-SPAssignment and Stop-SPAssignment cmdlets.

Start-SPAssignment / Stop-SPAssignment Cmdlets

Whilst using these commands I came across a number of blog posts, but the approach that they took didn’t really work and I would still see the same memory leaks.
The set of cmdlets should be used in the following fashion:-


$assignmentCollection = Start-SPAssignment;

$allWebs = Get-SPWeb -Site http://sharepoint -AssignmentCollection $assignmentCollection

foreach($web in $allWebs)
{
	Enable-SPFeature -Identity "Feature" -Url $web.Url -AssignmentCollection $assignmentCollection;
}

Stop-SPAssignment $assignmentCollection;

Walking through the code you will see that the Start-SPAssignment is used to return an SPAssignmentCollection object. We keep a reference to this SPAssignmentCollection so that we can use it to collect any objects that need disposing.

So the next question is how do we use the SPAssignmentCollection object to collect the objects?

Well, a large number of the SharePoint PowerShell cmdlets have an optional parameter called AssignmentCollection. The variable that we created using Start-SPAssignment should then be passed into these cmdlets.

For example, Get-SPWeb has the following set of parameters (content taken from TechNet):-

  • Identity (Optional) – Specifies the name or full or partial URL of the subsite. If you use a relative path, you must specify the Site parameter. A valid URL in the form http://server_name or a relative path in the form of /SubSites/MySubSite.
  • AssignmentCollection (Optional) – Manages objects for the purpose of proper disposal. Use of objects, such as SPWeb or SPSite, can use large amounts of memory and use of these objects in Windows PowerShell scripts requires proper memory management. Using the SPAssignment object, you can assign objects to a variable and dispose of the objects after they are needed to free up memory. When SPWeb, SPSite, or SPSiteAdministration objects are used, the objects are automatically disposed of if an assignment collection or the Global parameter is not used.
  • Confirm (Optional) – Prompts you for confirmation before executing the command. For more information, type the following command: get-help about_commonparameters
  • Filter (Optional) – Specifies the server-side filter to use for the specified scope. The type must be a valid filter in the form {filterName <operator> "filterValue"}.
  • Limit (Optional) – Limits the maximum number of subsites to return. The default value is 200. To return all sites, enter ALL. The type must be a valid number greater than 0 or ALL.
  • Site (Optional) – Specifies the URL or GUID of the site collection from which to list subsites. The type must be a valid URL, in the form of http://server_name; a GUID, in the form 1234-5678-9807, or an SPSite object.
  • WhatIf (Optional) – Displays a message that describes the effect of the command instead of executing the command. For more information, type the following command: get-help about_commonparameters

When the AssignmentCollection parameter is passed, the objects that are created by that cmdlet are stored in this AssignmentCollection.

When the objects are no longer required then they can be released using:-

Stop-SPAssignment -AssignmentCollection $assignmentCollection

This will free the memory and the associated objects.

SPAssignment Modes of Operation

The SPAssignment cmdlets can be used in a couple of ways:-

  • Simple mode, which is used by passing the –Global switch which seems to start monitoring the objects being created and will ensure that they are all disposed of when you call Stop-SPAssignment.
  • Advanced mode, which is where you create an AssignmentCollection using Start-SPAssignment and then manage the entries added to the collection.

I prefer the advanced mode as you are in control of when objects are being disposed of and also after running your PowerShell script a few times you can see where the appropriate calls to release memory should be placed.
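
The two modes can be sketched side by side as follows. This is illustrative only: it assumes it is run in the SharePoint Management Shell, and http://sharepoint is a placeholder URL.

```powershell
# Simple mode: objects created between the Start and Stop calls are tracked globally.
Start-SPAssignment -Global;
$web = Get-SPWeb "http://sharepoint";
Write-Host $web.Title;
Stop-SPAssignment -Global;      # disposes of the objects tracked since Start-SPAssignment -Global

# Advanced mode: only objects explicitly assigned to the collection are tracked.
$assignmentCollection = Start-SPAssignment;
$web = Get-SPWeb "http://sharepoint" -AssignmentCollection $assignmentCollection;
Write-Host $web.Title;
Stop-SPAssignment $assignmentCollection;      # disposes of the objects held in this collection only
```

In both cases the $web variable should not be used after the Stop-SPAssignment call, as the underlying object has been disposed of.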

When to use Stop-SPAssignment

One of the mistakes that I made was not calling the Stop-SPAssignment at the right time.

For example, say we have a script which is looping through all the Site Collections in a Web Application and then looping through all the Webs in a Site Collection. For each web object an operation is performed such as enabling a feature.

Well, I had put the call to Stop-SPAssignment at the end of the function. Guess what happened?

The PowerShell script would eat loads of RAM. The memory would eventually be freed, but with large web applications the PowerShell script would slow down considerably as the memory allocated started to impact the server.

The fix was relatively simple: move the Stop-SPAssignment call to the end of the loop which processes each site collection. This kept the PowerShell script from consuming too much RAM and it performed well.
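
A minimal sketch of that structure is shown here. It is not the original migration script: the web application URL and the feature name are placeholders, the variable names are my own, and it assumes the SharePoint Management Shell.

```powershell
# Placeholders (assumptions): adjust to your own environment.
$webApplicationUrl = "http://sharepoint";
$featureName = "Feature";

# Outer collection holds the site collection objects for the whole run.
$siteAssignments = Start-SPAssignment;
$allSites = Get-SPSite -WebApplication $webApplicationUrl -Limit All -AssignmentCollection $siteAssignments;

foreach($site in $allSites)
{
	# Inner collection holds the objects created whilst processing one site collection.
	$webAssignments = Start-SPAssignment;

	$allWebs = Get-SPWeb -Site $site -Limit All -AssignmentCollection $webAssignments;
	foreach($web in $allWebs)
	{
		Enable-SPFeature -Identity $featureName -Url $web.Url -AssignmentCollection $webAssignments;
	}

	# Release the webs for this site collection before moving on to the next one.
	Stop-SPAssignment $webAssignments;
}

# Release the site collection objects once everything has been processed.
Stop-SPAssignment $siteAssignments;
```

Using a separate collection per site collection keeps the number of undisposed SPWeb objects bounded by the size of the largest site collection rather than by the size of the whole web application.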

Not too often

I could have moved the Stop-SPAssignment call so that it runs after each web has been processed; however, my testing found that this slowed down the script too much because Stop-SPAssignment was being called so often.

Conclusion

Using the SPAssignment cmdlets is essential to building production-ready PowerShell scripts. They will ensure that the servers in your SharePoint farm are not affected by overzealous PowerShell scripts consuming huge quantities of RAM!