owenG
home learn tableau about
divider
SDLC VS 2010 Coded UI Test Coded UI Test and Fitnesse MSBuild 4.0 MSBuild and IronPython MSBuild and IronPython - TFS checkins MSBuild and IronPython - Custom SQL Data








IronPython Example: Custom task for parsing MSBuild Detailed Build Summary information and saving results to SQL Server


The next example for using PythonClassFactory, see code samples in that article for background on the setup and general environment.

MSBuild 4/Visual Studio 2010 makes available a new command line switch, /detailedsummary (or /ds). Further information on what new data is recorded in the log file as a result of the new switch can be found here but I'm only going to concentrate on a subset of that data. Specifically, the inline task below will be combining general information about the current build, e.g. build label & build definition, and combining it with the detailed build summary information that relates to the specific projects being built, including the time expended in building each project.


There are two parts to the initial setup:

1) Workflow

The process template I'm using in this example is a copy of the TFS-provided DefaultTemplate.xaml, with two new insertions. Since the data to be parsed is being generated from the (project) build process itself, I need to wait till that has definitely completed. To be safe, I inserted the new Sequence element toward the very end of the workflow, just before the existing 'Check In Gated Changes for CheckInShelveset Builds' container. Within the new Sequence I added an MSBuild Activity and named it MSBuild_PyProjectBuilds:

MSBuild activity: MSBuild_PyProjectBuilds

The Targets and Project properties of the new Activity get updated to point to the target (PyTarget7) and project (PythonTargets.targets) that will hold the new inline task. Then the CommandLineArguments property is set to pass information regarding the current BuildLabel, BuildDefinition and the location of the log file for the primary MSBuild:

MSBuild_PyProjectBuilds CommandLine property

The build label/definition values listed there will wind up getting stored in the SQL Server table while LogLocation is needed so that we know where the file to be parsed is going to be located.

As a separate step the new '/ds' command needs to be added to the existing CommandLineArguments for the built-in MSBuild Activity that actually builds the projects.


2) SQL Server

Of course the tables to that will hold the build detail information need to exist before they can be updated. I've chosen to create two tables in an owenG_Data database that resides on the same SQL Server as TFS. The first table is named ProjectBuilds and holds information on the build label, the specific project being built, and the two related time values. (For further information on the difference between Exclusive Time and Total Time, see the MSDN blog post MSBuild 4 Detailed Build Summary I linked to previously.) The second table is named BuildList and stores the LabelName, BuildDefinition and (sort of) Time of Build. The BuildList table's LabelName is set as a foreign key into ProjectBuilds:

ProjectBuilds and BuildList tables


divider element

On to the .targets file section:

.targets
  <UsingTask TaskName="PyProjectBuilds"  TaskFactory="PythonClassFactory" 
        AssemblyFile="..\_compiled\PythonClassFactory.dll" >
    <ParameterGroup>
      <inLogFile ParameterType="System.String" Required="true" />
      <inBuildLabel ParameterType="System.String" Required="true" />
      <inBuildDef ParameterType="System.String" Required="true" />
      <outData Output="true" />
    </ParameterGroup>
    <Task>
      <![CDATA[
import re, sys
import clr
import System
clr.AddReference('System.Data')
from System.Data.SqlClient import SqlConnection, SqlParameter, SqlCommand
outData = 'OK'
parseOn = False
data = []
f = open(inLogFile, 'r')
for line in f:
  if 'Build Hierarchy' in line:
    parseOn = True
  if 'Node Utilization' in line:
    parseOn = False
  if parseOn == True:
    timeData = re.findall('\d+.\d\d\d', line)
    if len(timeData) == 2:
      proj = line[line.find('\\'):].strip()
      dataLine = timeData[0], timeData[1], proj
      data.append(dataLine)
f.close()
connString = 'Data Source=TFS01;Initial Catalog=owenG_Data;Integrated Security=SSPI;'
with SqlConnection(connString) as conn:
  conn.Open()
  transaction = conn.BeginTransaction()
  try:
    insertCommand = SqlCommand()
    insertCommand.Connection = conn
    insertCommand.Transaction = transaction
    insertCommand.CommandText = ("INSERT INTO BuildList([LabelName],[BuildDefinition],[TimeOfBuild]) "
      "VALUES ('{0}', '{1}', GetDate())").format(inBuildLabel,inBuildDef)
    numAffected = insertCommand.ExecuteNonQuery()
    for datum in data:
      insertCommand.CommandText = ("INSERT INTO ProjectBuilds([LabelName],[ProjectPath],[ExclusiveTime],"
        "[TotalTime]) VALUES ('{0}', '{1}', {2}, {3})").format(inBuildLabel, datum[2], datum[0], datum[1])
      numAffected = insertCommand.ExecuteNonQuery()
    transaction.Commit()
  except System.Data.SqlClient.SqlException, e:
    transaction.Rollback()
    outData = 'SQL Error: {0}'.format(e)
  except:
    transaction.Rollback()
    outData = 'Unexpected Error: {0}'.format(sys.exc_info()[0])
      ]]>
    </Task>
  </UsingTask>

  <PropertyGroup>
    <BuildLabel>Label7</BuildLabel>
    <BuildDefinition>BuildDef2</BuildDefinition>
    <LogFilePath>C:\oweng\test\msbuild.log</LogFilePath>
  </PropertyGroup>

  <Target Name="PyTarget7">
    <PyProjectBuilds inLogFile="$(LogFilePath)" inBuildLabel="$(BuildLabel)" 
        inBuildDef="$(BuildDefinition)">
      <Output PropertyName="outData" TaskParameter="outData"  />
    </PyProjectBuilds>
    <Message Text="outData: $(outData)" Importance="High"/>
  </Target>

Task name is PyProjectBuilds, with three input parameters and one output, all strings:

  • inLogFile - All three input parameters are essentially set in the .xaml workflow and passed in to MSBuild on the command line via /p, using and intermediary property. LogFilePath is assigned the value of BuildDetail.LogLocation, the full path to where the main MSBuild log file, containing the detailed build summary data, is written. PyProjectBuilds later assigns inLogFile to the value of $(LogFilePath).
  • inBuildLabel - Passed using BuildLabel property, which has been set to BuildDetail.LabelName. Note that the value contains the full build label, including the TeamProject, in the format of BuildDefinition_YYYYMMDD.BuildNo@$/TeamProject.
  • inBuildDef - Holds BuildDetail.BuildDefinition.Name, the name of the build definition used to kick off the build. PyProjectBuilds gets it via local BuildDefinition property.
  • outData - I originally added this parameter when I was thinking about writing the parsed values to a .csv (turns out the Python csv module was written in C and is therefore not available in the current build of IronPython.) Later I set it to store error messages, if present.

I'll look at PyTarget7 next, which is the entry point called by the MSBuild_PyProjectBuilds Activity in the .xaml. As discussed above, there are local MSBuild properties that this build target uses to set the values for the actual inline task. The values for these local properties will always be passed in from the workflow if this example were ever put into production but I've kept the local PropertyGroup definition for each. These provide default values and were helpful when it came to first testing out the task on the local MSBuild command line - I found them easier to remember, and test with, when they were defined in the project instead of passing in from CMD. The Message Task within PyTarget7 will write the outData value to the log file, the path for which I defined in the .xaml, via MSBuild_PyProjectBuilds properties.

Finally we have the PyProjectBuilds task, which holds the guts of the this .targets snippet. It begins with a few imports, for re (regex pattern searching), sys (for reporting on unhandled exceptions), and clr (for accessing .Net framework classes). After that the .Net System.Data is referenced and the necessary SQL classes imported. The first real block of code involves reading the log file and parsing the relevant lines of text. The addition of /ds to the msbuild command line results in a new Build Hierarchy and Node Utilization section being written to the bottom of the log. It is the Build Hierarchy data I care about in this task, because it lists the build time for both the parent .sln solution file and each individual .*proj project. The below represents typical output as generated by building All-In-One Code Framework.sln, skipping most of the Node Utilization section:

...
 Detailed Build Summary
 ======================
  
 ============================== Build Hierarchy (IDs represent configurations) ============================================
 Id           : Exclusive Time   Total Time   Path (Targets)
 --------------------------------------------------------------------------------------------------------------------------
 0              : 0.149s        4.741s       C:\Builds\1\AllInOne\PythonBuildDef1\Sources\AllInOne
                                                         \Main\All-In-One Code Framework.sln () 
 | 1            : 0.440s        0.440s       C:\Builds\1\AllInOne\PythonBuildDef1\Sources\AllInOne
                                                         \Main\MFCRestartManagerSupport\MFCRestartManagerSupport.vcxproj () 
 | 2            : 0.769s        0.769s       C:\Builds\1\AllInOne\PythonBuildDef1\Sources\AllInOne
                                                         \Main\MFCTaskDialog\MFCTaskDialog.vcxproj () 
 | 3            : 0.141s        0.141s       C:\Builds\1\AllInOne\PythonBuildDef1\Sources\AllInOne
                                                         \Main\CSEFForeignKeyAssociation\CSEFForeignKeyAssociation.csproj () 
 | 4            : 0.937s        0.937s       C:\Builds\1\AllInOne\PythonBuildDef1\Sources\AllInOne
                                                         \Main\VBEFForeignKeyAssociation\VBEFForeignKeyAssociation.vbproj () 
 | 5            : 0.048s        0.048s       C:\Builds\1\AllInOne\PythonBuildDef1\Sources\AllInOne
                                                         \Main\CSEFComplexType\CSEFComplexType.csproj () 
 | 6            : 0.531s        0.531s       C:\Builds\1\AllInOne\PythonBuildDef1\Sources\AllInOne
                                                         \Main\VBEFComplexType\VBEFComplexType.vbproj () 
 | 7            : 0.054s        0.054s       C:\Builds\1\AllInOne\PythonBuildDef1\Sources\AllInOne
                                                         \Main\CSEFModelFirst\CSEFModelFirst.csproj () 
 | 8            : 0.647s        0.647s       C:\Builds\1\AllInOne\PythonBuildDef1\Sources\AllInOne
                                                         \Main\VBEFModelFirst\VBEFModelFirst.vbproj () 
 | 9            : 0.066s        0.068s       C:\Builds\1\AllInOne\PythonBuildDef1\Sources\AllInOne
                                                         \Main\ModelingProject1\ModelingProject1.modelproj () 
 | . 11         : 0.002s        0.002s       C:\Builds\1\AllInOne\PythonBuildDef1\Sources\AllInOne
                                                         \Main\CSEFComplexType\CSEFComplexType.csproj (GetNativeManifest) 
 . 10           : 0.953s        0.954s       C:\Builds\1\AllInOne\PythonBuildDef1\Sources\AllInOne
                                                         \Main\CSTestProject1\CSTestProject1.csproj () 
 | . 12         : 0.001s        0.001s       C:\Builds\1\AllInOne\PythonBuildDef1\Sources\AllInOne
                                                         \Main\CSEFModelFirst\CSEFModelFirst.csproj (GetNativeManifest) 
  
 ============================== Node Utilization (IDs represent configurations) ==========================================
 Timestamp:            1        Duration   Cumulative
 -------------------------------------------------------------------------------------------------------------------------
 634065957904847776:   0        0.148s     0.148s ##
  ...
  
 Build succeeded.
    0 Warning(s)
    0 Error(s)

 Time Elapsed 00:00:04.79

The IronPython code opens the log file and reads it from beginning to end, only paying attention the lines that occur after the appearance of 'Build Hierarchy' and before 'Node Utilization' is encountered. For those reviewed lines the text is parsed in order to create a tuple (dataLine holding the data for 'Exclusive Time', 'Total Time', and 'Project Path.' Each of those tuples is added to a Python list named data.

Once data is populated the code moves on to the SQL section, first establishing a connection to the SQL Server (named TFS01 in this example). With the connection open, .BeginTransaction is called in order to set up a transaction such that both inserts will fail if either fails. The first INSERT statement is run once and updates the BuildList table. A typical INSERT here might wind up executing as:

T-SQL
    INSERT INTO BuildList([LabelName],[BuildDefinition],[TimeOfBuild])
    VALUES ('PythonBuildDef1_20100611.3@$/AllInOne', 'PythonBuildDef1', '2010-06-11 09:53:25.380')

And we can see that I cheated in updating the TimeOfBuild column by using a T-SQL Now() instead trying to get a more technically correct value. Then a loop iterates through the data list, running an INSERT like the below for each member:

T-SQL
INSERT INTO ProjectBuilds([LabelName],[ProjectPath],[ExclusiveTime],[TotalTime]) 
VALUES ('PythonBuildDef1_20100611.3@$/AllInOne', 
'\Builds\1\AllInOne\PythonBuildDef1\Sources\AllInOne\Main\CSEFModelFirst\CSEFModelFirst.csproj ()',
 1.876, 1.878) 

In addition to being within a transaction, both INSERT's are nested in a try/except block, the Python equivalent of a .Net Try/Catch statement. A SqlClient.SqlException exception will be specifically captured and logged via outData while anything else will be caught by the generic except:. If there are no problems the transaction will be .Committed; otherwise the transaction is rolled back.

Below is the result from a clean build of All-In-One Code Framework.sln, where the times would naturally be shorter if the 'Clean Workspace' setting within the build definition were instead configured to 'Outputs' or 'None':

2 SQL tables after a run

Assuming that in fact the 'Clean Workspace' value does not change once set, the time elapsed values from the various builds could prove helpful for monitoring purposes. The general build time that is often tossed out for continuous integration builds is that they should take 10 minutes or fewer to complete. If the time begins creeping up, having something like PyProjectBuilds running would enable the running of reports to see if a single project is responsible. Even if a particular project is not to 'blame' as it were, there may be other targets or tasks that it invokes and thereby contributes to increased build times. Other ideas might involve storing further information about the build, by adding additional columns to the table. For example, if there are multiple servers performing the builds then the name of the agent responsible for each build could be recorded also, thereby allowing you to determine in vivo which servers in fact are really faster than others.


THE END.