SharePoint Blogs / SharePoint University
SharePoint Blogs and SharePoint University - all in one place!
Need SharePoint Training? Attend a SharePoint Bootcamp!

Please delete cookies related to sharepointblogs.com and sharepointu.com to resolve login issues!

SharePoint Metadata Integrity Checker Utility

The metadata features of SharePoint 2003 and 2007 enable you to customize, sort, group, filter and search documents and lists.  In SharePoint, this equates to adding columns that describe and categorize list items.  SharePoint 2007 takes a more disciplined content management approach with the introduction of site columns and content types. 

Although powerful, there is an integrity flaw in both SharePoint 2003 and 2007 that you should be aware of.  When you add a choice or multi-choice column, you can supply a list of values.  If you delete or modify values in this list, any list items using the old values will not be updated and will still contain the original values.  Those of you with database design experience can see the issue immediately: SharePoint does not maintain referential integrity.  The list item column references a value in a lookup list which may change. 

I'll demonstrate the issue.

Go to a document library and select Modify settings and columns.

Click on Add a new column.

Name the column "Color" and enter 3 choices "White, Brown and Balck."  Notice that I have misspelled "Black."  Although this is a simple example, errors like this occur in the "real world".  In addition, values can and do change.  Imagine you uploaded 100 documents using a given company name and the company goes through a name change.  Without a utility, your only option is to modify the column value and update each documents by hand.  From experience, SharePoint content managers usually just update the column list value without realizing existing list items contain the old value.

After adding the column, add a new document to the list.  When prompted, select the value "Balck" for the color.

You can see the list item contains the misspelled value.

Go back to the document library and edit the new column you added.  Change the value "Balck" to "Black."

Even though you changed the value, the list item still has the old value.

If you edit the list item, notice the color drop down reverts to the default value because it was unable to find "Balck" in the list.  Until the list item is updated, it will contain an invalid value.

How do you deal with this issue?  First, make sure your content managers are aware of the issue.  From now on, use a utility when you need modify or remove list item values that are in use.  You can do it by hand if there are a small number of list items. 

Once the issue is under control, analyze and fix integrity issues.  I developed a sample in C# that will  analyze a SharePoint 2003 environment.  In a later post, I will update it for 2007.

The code starts at a given url and recursively analyzes each list and list item verifying each column contains only valid values.  The end result is a list of invalid list items that you could output to a log file or run nightly and send e-mail notifications when issues are encountered.  This code is not comprehensive, it is only a sample to get you started. 

If want to see if you have any metatdata integrity issues in your environment, copy the .exe in bin\debug directory in the attached zip file to your SharePoint server and run it.  After it runs, open the log file and see how many issues you have. 

Download the full source code here


Posted 07-13-2007 9:34 AM by John Powell, PMP, MCPD, MCSD

Comments

Nadeem wrote re: SharePoint Metadata Integrity Checker Utility
on 07-13-2007 10:43 AM

Just use a lookup column instead of a choice column and the problem is solved.

Contrary to popular belief, you *can* keep and maintain your SharePoint list with the selection choices on a single top level site, even though you might need your drop-down or multi-select column available in other (sub-)sites.  This is because site columns are inherited by sub-sites.  I have seen "solutions" to the supposed "cross-site lookup problem" involving list replication and custom field controls - these are completely unnecessary.

Finally, using a list to maintain selection choices provides a number of things including controllign security for adding/modifying/deleting selection choices, the ability to require approvals for selection choice changes, and audit trails.

Tariq Ayad wrote re: SharePoint Metadata Integrity Checker Utility
on 07-13-2007 11:25 AM

Hi John,

This is interesting, but how does to tool determine what metadata values are valid and what are invalid?

Forinstance what if balack was a valid value?

-Tariq

John Powell, PMP, MCPD, MCSD wrote re: SharePoint Metadata Integrity Checker Utility
on 07-13-2007 11:37 AM

It simply gets the range of valid values for each choice column and then sees if any of the values of each list item are outside of that range.

John Powell, PMP, MCPD, MCSD wrote re: SharePoint Metadata Integrity Checker Utility
on 07-13-2007 11:39 AM

Here is the exact algorithm if that helps...

        //****************************************************************
        /// ValidateField
        /// <summary>
        /// Validates a choice field.
        /// </summary>
        /// <param name="list">The list</param>
        /// <param name="choiceField">The choice field to validate</param>
        //****************************************************************
        private void ValidateField(SPList list, SPFieldChoice choiceField) 
        {
            Trace.WriteLine("Validating Choice Field: " + choiceField.Title);

            // Create a hashtable of valid choices
            Hashtable htValidChoices = new Hashtable();
            foreach(string s in choiceField.Choices) 
            {
                htValidChoicesSleep = s;
            }

            // Enumerate the list items and validate the 
            //  selected choices
            foreach(SPListItem li in list.Items) 
            {
                // Get the list item value for the current field        
                string sVal = li[choiceField.Title] as string;

                // If the value is not null
                if(sVal != null) 
                {
                    string[]sChoices = sVal.Split('#');

                    foreach(string sChoice in sChoices) 
                    {
                        if(!htValidChoices.ContainsKey(sChoice))
                        {
                            Trace.WriteLine(
                                string.Format("Invalid Value: {0}, ID {1}",
                                sChoice,
                                li.ID));

                            // Create a new invalid list item
                            InvalidListItem invalid = new InvalidListItem();

                            // Get the edit Url
                            invalid.EditUrl = GetListItemEditUrl(list, li);
                            invalid.InvalidValues.Add(sVal);

                            _alInvalidItems.Add(invalid);

                        } // end if

                    } // end foreach

                } // end if
    
            } // end foreach

        } // end ValidateField()
John Powell, PMP, MCPD, MCSD wrote re: SharePoint Metadata Integrity Checker Utility
on 07-13-2007 11:51 AM

Nadeem,

Can you share how to do that?  I created a list at the site collection level and then attempted to use it as a lookup list in a sub-site.  The avaiable lists were only scoped to the current site and I could not see any lists from the top site.

Thanks!

Bob Silva wrote re: SharePoint Metadata Integrity Checker Utility
on 07-27-2007 5:10 PM

I believe Nadeem is only referring to a single site-column defined at the topsite level, even though he references it as a list. There is currently no solution for maintaining a list and having that available at a subsite in a drop-down or lookup column.

Bob

Balaji wrote re: SharePoint Metadata Integrity Checker Utility
on 03-07-2008 5:03 AM

I want to add the metadata along with the file and upload it , from a windows application to sharepoint 2007.I am not using the objects(dlls) in sharepoint.With the help of built in web services how can i get this work

Add a Comment

(required)  
(optional)
(required)  
Remember Me?
Need SharePoint Training? Attend a SharePoint Bootcamp!
Posts (c) their respective authors. Everything else (c) 2009 SharePoint Experts, Inc.