SharePoint Blogs / SharePoint University
SharePoint Blogs and SharePoint University - all in one place!
Need SharePoint Training? Attend a SharePoint Bootcamp!

Please delete cookies related to sharepointblogs.com and sharepointu.com to resolve login issues!

MOSS Usage Log Processing File parsing
William's Blog

News

  • Disclaimer
    Information in this weblog is provided "AS IS" with no warranties, and confers no rights. This weblog does not represent the thoughts, intentions, plans or strategies of my employer. It is solely my own personal opinion. Inappropriate comments will be deleted at the authors discretion. All code samples are provided "AS IS" without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and/or fitness for a particular purpose.

    Creative Commons License
    This work is licensed under a Creative Commons Attribution-Share Alike 3.0 United States License.

Parsing STS logs in SharePoint 2003 was fairly well documented. However, that is not the case yet in 2007. There is now some content about log parsing in 2007. It is written in C++, so this should still be useful. Given the number of libraries I've seen on the net referencing this code, I assume that is the case.

First, there is an additional 300 bytes added to the head of each log file. The additional 300 bytes contains the text "Windows SharePoint Services HTTP log file" followed by space characters making up the difference. You can simply ignore these.

Secondly, the header of the each entry appears to have grown as well, including the ordering of the definition.

Header Data (50 bytes)

Field Bytes Type

Unused

11 bytes

?

Site Url (length)

2 bytes

ushort

Web (length)

2 bytes

ushort

Doc (length)

2 bytes

ushort

Unknown

2 bytes

?

Bytes (data)

4 bytes

uint

HTTP Status (data)

2 bytes

ushort

Username (length)

2 bytes

ushort

QueryString (length)

2 bytes

ushort

Referral (length)

2 bytes

ushort

User Agent (length)

2 bytes

ushort

Command / Search Query (length)

2 bytes

ushort

Unused

15 bytes

?

Total

50 bytes

It also appears that the actual data has changed ordering a bit as well:

  1. Site GUID
  2. Timestamp
  3. Site Url
  4. Web
  5. Doc
  6. User
  7. QueryString
  8. Referral
  9. User Agent
  10. New GUID (don't know what this represents... anyone?)
  11. Command / Search Query

I've created a parser in C#:

using System;
using System.Collections.Generic;
using System.Data;
using System.IO;
using System.Text;
using System.Runtime.InteropServices;
using System.ComponentModel;

namespace MOSS2007LogParser
{
    public class Parser
    {
        private const int FILEHEADERLENGTH = 300;
        private const int RECORDHEADERLENGTH = 50;

        private List<LogRecord> _logRecords = new List<LogRecord>();
        private void ReadLogFile(string logFilePath)
        {
            _logRecords.Clear();

            // Open a stream to the log file
            // I recommend using a Memory-mapped file here, as the files can grow quite large.
            // http://www.winterdom.com/dev/dotnet/ shows a fairly good implementation [see: FileMap]
            using (Stream s = File.OpenRead(logFilePath))
            {
                BinaryReader br = new BinaryReader(s);

                // Read the file header
                byte[] fileHeaderData = br.ReadBytes(FILEHEADERLENGTH);

                while (s.Position < s.Length)
                {
                    // Read the record header array
                    byte[] recordHeader = br.ReadBytes(RECORDHEADERLENGTH);
                    LogRecordHeader headerData = new LogRecordHeader(recordHeader);

                    // Get the length of the current record and read the array
                    int recordLength = headerData.RecordLength();
                    byte[] recordData = br.ReadBytes(recordLength);

                    _logRecords.Add(new LogRecord(headerData, recordData));
                }

                br.Close();
                s.Close();
            }
        }

        public DataSet GetLogDataSet(string logFilePath)
        {
            ReadLogFile(logFilePath);

            DataSet dsSTSLogs = new DataSet("dsSTSLogs");
            DataTable dtSTSLogs = new DataTable("dtSTSLogs");

            DataColumn dcTimeStamp = new DataColumn("TimeStamp");
            DataColumn dcSiteGuid = new DataColumn("SiteGUID");
            DataColumn dcSiteUrl = new DataColumn("SiteUrl");
            DataColumn dcWeb = new DataColumn("Web");
            DataColumn dcDocument = new DataColumn("Document");
            DataColumn dcUserName = new DataColumn("UserName");
            DataColumn dcQueryString = new DataColumn("QueryString");
            DataColumn dcReferral = new DataColumn("Referral");
            DataColumn dcUserAgent = new DataColumn("UserAgent");
            DataColumn dcCommand = new DataColumn("Command");
            DataColumn dcHttpStatus = new DataColumn("HttpStatus");
            DataColumn dcBytesSent = new DataColumn("BytesSent");

            dtSTSLogs.Columns.Add(dcTimeStamp);
            dtSTSLogs.Columns.Add(dcSiteGuid);
            dtSTSLogs.Columns.Add(dcSiteUrl);
            dtSTSLogs.Columns.Add(dcWeb);
            dtSTSLogs.Columns.Add(dcDocument);
            dtSTSLogs.Columns.Add(dcUserName);
            dtSTSLogs.Columns.Add(dcQueryString);
            dtSTSLogs.Columns.Add(dcReferral);
            dtSTSLogs.Columns.Add(dcUserAgent);
            dtSTSLogs.Columns.Add(dcCommand);
            dtSTSLogs.Columns.Add(dcHttpStatus);
            dtSTSLogs.Columns.Add(dcBytesSent);

            dsSTSLogs.Tables.Add(dtSTSLogs);

            foreach (LogRecord record in _logRecords)
            {
                DataRow drRecord = dtSTSLogs.NewRow();

                drRecord["TimeStamp"] = record.TimeStamp;
                drRecord["SiteGUID"] = record.SiteGUID;
                drRecord["SiteUrl"] = record.SiteUrl;
                drRecord["Web"] = record.Web;
                drRecord["Document"] = record.Document;
                drRecord["UserName"] = record.UserName;
                drRecord["QueryString"] = record.QueryString;
                drRecord["Referral"] = record.Referral;
                drRecord["UserAgent"] = record.UserAgent;
                drRecord["Command"] = record.Command;
                drRecord["HttpStatus"] = record.Status;
                drRecord["BytesSent"] = record.BytesSent;

                dtSTSLogs.Rows.Add(drRecord);
            }

            return dsSTSLogs;
        }
        public void ConvertToCSVFile(string logFilePath)
        {
            string defaultPath = Path.Combine(
                Path.GetDirectoryName(logFilePath), 
                Path.GetFileNameWithoutExtension(logFilePath)) + ".csv";

            ConvertToCSVFile(logFilePath, defaultPath);
        }

        public void ConvertToCSVFile(string logFilePath, string csvFilePath)
        {
            ReadLogFile(logFilePath);

            string recordLineFormat = "{0},{1},{2},{3},{4},{5},{6},{7},{8},{9},{10},{11}";

            using (StreamWriter sw = File.CreateText(csvFilePath))
            {
                foreach (LogRecord record in _logRecords)
                {
                    sw.WriteLine(String.Format(recordLineFormat,
                        record.TimeStamp,
                        record.SiteGUID,
                        record.SiteUrl,
                        record.Web,
                        record.Document,
                        record.UserName,
                        record.QueryString,
                        record.Referral,
                        record.UserAgent,
                        record.Command,
                        record.BytesSent,
                        record.Status));
                }

                sw.Close();
            }
        }
    }

    public class LogRecordHeader
    {
        public const int STANDARD_GUID_LENGTH = 36;
        public const int STANDARD_TIMESTAMP_LENGTH = 8;

        private byte[] _recordHeader;

        private int _siteUrlLength;
        public int SiteUrlLength
        {
            get { return _siteUrlLength; }
        }

        private int _webLength;
        public int WebLength
        {
            get { return _webLength; }
        }

        private int _docLength;
        public int DocLength
        {
            get { return _docLength; }
        }

        private int _userNameLength;
        public int UserNameLength
        {
            get { return _userNameLength; }
        }

        private long _bytesSent;
        public long BytesSent
        {
            get { return _bytesSent; }
        }

        private int _httpStatus;
        public int HttpStatus
        {
            get { return _httpStatus; }
        }

        private int _queryStringLength;
        public int QueryStringLength
        {
            get { return _queryStringLength; }
        }

        private int _referralLength;
        public int ReferralLength
        {
            get { return _referralLength; }
        }

        private int _userAgentLength;
        public int UserAgentLength
        {
            get { return _userAgentLength; }
        }

        private int _commandLength;
        public int CommandLength
        {
            get { return _commandLength; }
        }

        public LogRecordHeader(byte[] recordHeader)
        {
            this._recordHeader = recordHeader;
            this._siteUrlLength = BitConverter.ToUInt16(recordHeader, SITEURL_OFFSET);
            this._webLength = BitConverter.ToUInt16(recordHeader, WEB_OFFSET);
            this._docLength = BitConverter.ToUInt16(recordHeader, DOC_OFFSET);
            this._bytesSent = BitConverter.ToUInt32(recordHeader, BYTESSENT_OFFSET);
            this._httpStatus = BitConverter.ToUInt16(recordHeader, HTTPSTATUS_OFFSET);
            this._userNameLength = BitConverter.ToUInt16(recordHeader, USERNAME_OFFSET);
            this._queryStringLength = BitConverter.ToUInt16(recordHeader, QUERYSTRING_OFFSET);
            this._referralLength = BitConverter.ToUInt16(recordHeader, REFERRAL_OFFSET);
            this._userAgentLength = BitConverter.ToUInt16(recordHeader, USERAGENT_OFFSET);
            this._commandLength = BitConverter.ToUInt16(recordHeader, COMMAND_OFFSET);
        }

        private const int SITEURL_OFFSET = 12;
        private const int WEB_OFFSET = 14;
        private const int DOC_OFFSET = 16;
        private const int BYTESSENT_OFFSET = 20;
        private const int HTTPSTATUS_OFFSET = 24;
        private const int USERNAME_OFFSET = 26;
        private const int QUERYSTRING_OFFSET = 28;
        private const int REFERRAL_OFFSET = 30;
        private const int USERAGENT_OFFSET = 32;
        private const int COMMAND_OFFSET = 34;

        public int RecordLength()
        {
            return  STANDARD_GUID_LENGTH + 1 +
                    STANDARD_TIMESTAMP_LENGTH + 1 +
                    SiteUrlLength + 1 +
                    WebLength + 1 +
                    DocLength + 1 +
                    UserNameLength + 1 +
                    QueryStringLength + 1 +
                    ReferralLength + 1 +
                    UserAgentLength + 1 +
                    STANDARD_GUID_LENGTH + 1 +
                    CommandLength + 1;
        }
    }

    public class LogRecord
    {
        private string _siteGuid;
        public string SiteGUID
        {
            get { return _siteGuid; }
        }

        private string _timeStamp;
        public string TimeStamp
        {
            get { return _timeStamp; }
        }

        private string _siteUrl;
        public string SiteUrl
        {
            get { return _siteUrl; }
        }

        private string _web;
        public string Web
        {
            get { return _web; }
        }

        private string _document;
        public string Document
        {
            get { return _document; }
        }

        private string _userName;
        public string UserName
        {
            get { return _userName; }
        }

        private string _queryString;
        public string QueryString
        {
            get { return _queryString; }
        }

        private string _referral;
        public string Referral
        {
            get { return _referral; }
        }

        private string _userAgent;
        public string UserAgent
        {
            get { return _userAgent; }
        }

        private string _command;
        public string Command
        {
            get { return _command; }
        }

        private int _status;
        public int Status
        {
            get { return _status; }
        }

        private long _bytesSent;
        public long BytesSent
        {
            get { return _bytesSent; }
        }

        public LogRecord(LogRecordHeader headerData, byte[] recordData)
        {
            int timeStampOffset = LogRecordHeader.STANDARD_GUID_LENGTH + 1;
            int siteUrlOffset = timeStampOffset + LogRecordHeader.STANDARD_TIMESTAMP_LENGTH + 1;
            int webOffset = siteUrlOffset + headerData.SiteUrlLength + 1;
            int documentOffset = webOffset + headerData.WebLength + 1;
            int userNameOffset = documentOffset + headerData.DocLength + 1;
            int queryStringOffset = userNameOffset + headerData.UserNameLength + 1;
            int referralOffset = queryStringOffset + headerData.QueryStringLength + 1;
            int userAgentOffset = referralOffset + headerData.ReferralLength + 1;
            int commandOffset = userAgentOffset + headerData.UserAgentLength + LogRecordHeader.STANDARD_GUID_LENGTH + 2;
            
            Encoding enc = UTF8Encoding.Default;

            _siteGuid = enc.GetString(recordData, 0, LogRecordHeader.STANDARD_GUID_LENGTH);
            _timeStamp = enc.GetString(recordData, timeStampOffset, LogRecordHeader.STANDARD_TIMESTAMP_LENGTH);
            _siteUrl = enc.GetString(recordData, siteUrlOffset, headerData.SiteUrlLength);
            _web = enc.GetString(recordData, webOffset, headerData.WebLength);
            _document = enc.GetString(recordData, documentOffset, headerData.DocLength);
            _userName = enc.GetString(recordData, userNameOffset, headerData.UserNameLength);
            _queryString = enc.GetString(recordData, queryStringOffset, headerData.QueryStringLength);
            _referral = enc.GetString(recordData, referralOffset, headerData.ReferralLength);
            _userAgent = enc.GetString(recordData, userAgentOffset, headerData.UserAgentLength);
            _command = enc.GetString(recordData, commandOffset, headerData.CommandLength);
            _status = headerData.HttpStatus;
            _bytesSent = headerData.BytesSent;
        }
    }
}

Posted 06-22-2007 11:06 PM by wpowell

Comments

SharePointKicks.com wrote MOSS Usage Log Processing File parsing
on 06-24-2007 7:07 PM

You've been kicked (a good thing) - Trackback from SharePointKicks.com

Links (6/24/2007) « Steve’s SharePoint Stuff wrote Links (6/24/2007) &laquo; Steve&#8217;s SharePoint Stuff
on 06-24-2007 7:52 PM

Pingback from  Links (6/24/2007) &laquo; Steve&#8217;s SharePoint Stuff

Sharepoint link love 6-26-2007 at Virtual Generations wrote Sharepoint link love 6-26-2007 at Virtual Generations
on 06-26-2007 5:25 AM

Pingback from  Sharepoint link love 6-26-2007 at  Virtual Generations

Sharepoint 2007 log parser (C#) | Log Parser Watch wrote Sharepoint 2007 log parser (C#) | Log Parser Watch
on 08-12-2007 6:27 AM

Pingback from  Sharepoint 2007 log parser (C#) | Log Parser Watch

SharePoint - notes from the field wrote Usage Analysis Processing logs
on 10-30-2007 6:53 PM

Here is a simple utility I wrote to process the MOSS usage logs (when turned on at Central Administration

Leonid Lyublinski wrote re: MOSS Usage Log Processing File parsing
on 10-30-2007 7:02 PM

A great stuff! I've compiled the code into a small utility that combines logs into 1 CSV.

Noticias externas wrote Usage Analysis Processing logs
on 10-30-2007 7:25 PM

Here is a simple utility I wrote to process the MOSS usage logs (when turned on at Central Administration

MSDN Blog Postings » Usage Analysis Processing logs wrote MSDN Blog Postings &raquo; Usage Analysis Processing logs
on 10-30-2007 7:54 PM

Pingback from  MSDN Blog Postings  &raquo; Usage Analysis Processing logs

abdrasin wrote re: MOSS Usage Log Processing File parsing
on 11-28-2007 10:39 AM

I was using this when I noticed an error.  The Command / Query String is not captured correctly because of the insertion of a Guid between the UserAgent and it.

To fix it locate the following line of code in the LogRecord constructor:

int commandOffset = userAgentOffset + headerData.UserAgentLength + 1;

Change it to:

int commandOffset = userAgentOffset + headerData.UserAgentLength + LogRecordHeader.STANDARD_GUID_LENGTH + 2;

Harry wrote re: MOSS Usage Log Processing File parsing
on 01-10-2008 9:50 AM

Do you create this in a class file in Visual Studio? How do you point to the log file and show where the .csv file we be put? I'm new to this and need more details on making this work.

wpowell wrote re: MOSS Usage Log Processing File parsing
on 01-15-2008 2:07 AM

Harry,

Yes. You can use visual studio... although there are other, less user friendly methods. To use the class, you could create a console application that read 2 parameters. One for the input log file path, and one for the output csv path. Then you would call the ConvertToCSVFile method.

Hope that helps,

William

Craig wrote re: MOSS Usage Log Processing File parsing
on 02-17-2008 9:58 PM

Hey William,

Microsoft have published the details on the log format:

msdn2.microsoft.com/.../bb814929.aspx

Which covers the fields you weren't sure about.

Later'ish

Craig

roadburn’s sharepoint 2007 blog » Blog Archive » MOSS - Sharepoint 2007 Custom Site Usage Report wrote roadburn&#8217;s sharepoint 2007 blog &raquo; Blog Archive &raquo; MOSS - Sharepoint 2007 Custom Site Usage Report
on 02-27-2008 12:08 AM

Pingback from  roadburn&#8217;s sharepoint 2007 blog  &raquo; Blog Archive   &raquo; MOSS - Sharepoint 2007 Custom Site Usage Report

Mike's Blog wrote SharePoint Usage Statistics
on 05-13-2008 9:21 PM

Ran across these posts earlier... might be useful to anyone trying to understand how usage statistics

Edwin Roestenburg wrote re: MOSS Usage Log Processing File parsing
on 05-27-2008 5:07 PM

Great work William!

Your work saved me a great deal of time. I've made a few small changes though to allow for the exclusion of columns. On top of that I've created a small GUI to select the columns, walk a directory tree with log files and either create 1 big CVS of multiple "smaller" ones (1 per log-file). If this would be useful to you or anyone else, then please contact me for a free copy at EdwinRoestenburg@yahoo.com

AndersR wrote re: MOSS Usage Log Processing File parsing
on 06-26-2008 7:46 AM

Great work.

As abdrasin points out though, the last columns of the log file isnt parsed correctly.

Actually the spec states that the column following the user is actually not the querystring of the request, but the querystring of the referer!!

Instead the querystring is in fact what you specify as Command.

This is useful if you (as i did) needed to find out what list item was used when :-)

Again thanx for doing the hard stuff!

wpowell wrote re: MOSS Usage Log Processing File parsing
on 07-16-2008 1:13 PM

Updated as per abdrasin and AndersR.

Thanks!

Shailen Sukul wrote re: MOSS Usage Log Processing File parsing
on 09-03-2008 9:45 PM

Thank you for sharing your knowledge.

This has proved to be very useful!

Eduardo wrote re: MOSS Usage Log Processing File parsing
on 03-30-2009 12:32 PM

Good article,

Do you know how to parse the log file using LogParser.

wpowell wrote re: MOSS Usage Log Processing File parsing
on 04-29-2009 1:27 PM

Hi Eduardo,

I usally convert to CSV and then use LogParser from there.

Hope that helps.

William

Marcelluswalace wrote re: MOSS Usage Log Processing File parsing
on 05-13-2009 4:14 PM

Hi,

is this code valid for MOSSS 2007 I dont think so because you said that the order has changed and you using the former order. Just Cannot understand....it doesnt work with my logs file.

How do I get the number of Hit by user and so on with your parser?

wpowell wrote re: MOSS Usage Log Processing File parsing
on 05-28-2009 12:43 AM

It is valid for MOSS 2007.

See Eduardo's comment on 3-30-09 and my response for one method to get the information you are looking for.

Josh G wrote re: MOSS Usage Log Processing File parsing
on 07-02-2009 1:48 PM

Hi William - Thanks for this code! I was able to use it as a starting point for parsing the usage logs and uploading them to a SQL table.

I added some fields to your LogRecord class to retrieve WebAppID and dates from the log filepath - check it out at sharepoint.gavant.org/.../sharepoint-usage-logs-part-2 if you're interested.

Thanks again--

Josh

Add a Comment

(required)  
(optional)
(required)  
Remember Me?
Need SharePoint Training? Attend a SharePoint Bootcamp!
Posts (c) their respective authors. Everything else (c) 2009 SharePoint Experts, Inc.