Paul Selles

Computers and cats

Category Archives: XML

XAML Formatting: Programmatically Tabify XAML and XML Files in C#

With many developers working together, formatting styles can sometimes be an issue. This is most noticeable when working with XAML files, where the biggest culprits are poor tabbing and inconsistent tab characters. Luckily, there is an easy way to standardize XAML formatting: implement a custom check-in policy that programmatically fixes all XAML files before they are checked in.

Using XmlWriter in conjunction with the XmlWriterSettings class, we can easily customize our output [1][2]. The XmlWriterSettings properties give us a lot of control over how the XML or XAML will look; they are documented and worth looking over. I am also using XmlReader and XmlReaderSettings to load the text into an XmlDocument, where the XmlReaderSettings class is used to ensure that we ignore any potentially invalid XML characters in our XAML [3][4][5][6]. I encountered one tricky bit: the XmlReader will break on decimal and hex character references. I solved this by replacing every “&” with “&amp;” before the XmlReader and converting back after the XmlWriter.

using System;
using System.IO;
using System.Text;
using System.Xml;

public static class Tabify
{
    // Tabify XML document
    public static void Xml(string filename)
    {
        DoTabify(filename, false);
    }

    // Tabify XAML document
    public static void Xaml(string filename)
    {
        DoTabify(filename, true);
    }

    // Tabify
    private static void DoTabify(string filename, bool xaml = false)
    {
        // XmlDocument container
        XmlDocument xmlDocument = new XmlDocument();

        // Escape every "&" so that decimal and hex character references are not lost
        string xmlString = File.ReadAllText(filename);
        xmlString = xmlString.Replace("&", "&amp;");

        // Xml Reader settings
        XmlReaderSettings xmlReadSettings = new XmlReaderSettings()
        {
            CheckCharacters = false,     // We have some invalid characters we want to ignore
        };

        // Use XML reader to load content to XmlDocument container
        using (XmlReader xmlReader = XmlReader.Create(new StringReader(xmlString), xmlReadSettings))
        {
            xmlReader.MoveToContent();
            xmlDocument.Load(xmlReader);
        }

        // Customize how our XML will look: we want tabs and UTF8 encoding
        XmlWriterSettings xmlWriterSettings = new XmlWriterSettings()
        {
            Indent = true,                              // Indent elements
            IndentChars = "\t",                         // Indent with tabs
            CheckCharacters = false,                    // Ignore invalid characters
            NewLineChars = Environment.NewLine,         // Set newline character
            NewLineHandling = NewLineHandling.None,     // Leave existing line breaks unchanged
            Encoding = new UTF8Encoding()               // UTF8 encoding
        };

        // We do not want the xml declaration for xaml files
        if (xaml)
            xmlWriterSettings.OmitXmlDeclaration = true;    // For XAML this must be true

        StringBuilder xmlStringBuilder = new StringBuilder();

        // Write the xml to a string using the settings above
        using (XmlWriter xmlWriter = XmlWriter.Create(xmlStringBuilder, xmlWriterSettings))
        {
            xmlDocument.WriteContentTo(xmlWriter);
        }

        // Restore decimal and hex character references, then write the file
        xmlString = xmlStringBuilder.ToString().Replace("&amp;", "&");
        File.WriteAllText(filename, xmlString);
    }
}
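
To see why the “&” replacement matters, here is a minimal in-memory sketch of the same round trip. The class TabifyDemo, the helper TabifyString and the sample markup are made up for illustration; the reader and writer settings are a reduced version of the ones above, so treat it as a sketch rather than the check-in policy itself.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class TabifyDemo
{
    // Hypothetical helper: same steps as DoTabify, but string in, string out.
    public static string TabifyString(string xml)
    {
        // Hide character references such as &#x13; from the parser.
        string escaped = xml.Replace("&", "&amp;");

        var document = new XmlDocument();
        var readerSettings = new XmlReaderSettings { CheckCharacters = false };
        using (XmlReader reader = XmlReader.Create(new StringReader(escaped), readerSettings))
        {
            document.Load(reader);
        }

        var writerSettings = new XmlWriterSettings
        {
            Indent = true,
            IndentChars = "\t",
            CheckCharacters = false,
            OmitXmlDeclaration = true
        };

        var output = new StringBuilder();
        using (XmlWriter writer = XmlWriter.Create(output, writerSettings))
        {
            document.WriteContentTo(writer);
        }

        // Turn &amp; back into & so the original references reappear.
        return output.ToString().Replace("&amp;", "&");
    }

    public static void Main()
    {
        // Made-up markup containing a hex character reference and two named entities.
        string input = "<Grid><TextBlock Text=\"a&#x13;b\"/><Button Content=\"&lt;OK&gt;\"/></Grid>";
        Console.WriteLine(TabifyString(input));
    }
}
```

Both the hex reference and the named entities should come out exactly as they went in, with each child element on its own tab-indented line.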

Paul

References

[1] XmlWriter Class. MSDN Library.

[2] XmlWriterSettings Class. MSDN Library.

[3] XmlReader Class. MSDN Library.

[4] XmlReaderSettings Class. MSDN Library.

[5] XmlDocument Class. MSDN Library.

[6] Parsing Xml with Invalid Characters in C#. Paul Selles


Tfs Build Log: Querying Build Log Data

If you have ever wanted to programmatically parse past or current Tfs Build log data, then this post is for you.

 

Background

My company uses Tfs to implement a Rolling CI Build, which often involves multiple Changesets per build. For the most part this is smooth sailing; however, when the build breaks we want to be able to quickly pinpoint the source and notify the guilty party. With multiple Associated Changesets per build, and multiple files checked in by multiple developers, how do we efficiently pinpoint the source of the break? We also do not want to parse the MSBuild log files, since we are not creating log files for our Rolling CI Builds. Our solution then became querying and parsing the XML build information from the Project Collection SQL Database.

All the build log information is available on the Project Collection Database where the data is recorded one activity at a time so we are able to query all the way up to the current build activity. All this data can be found in the SQL Database, appropriately named Tfs_YourTeamProjectCollection, table Tbl_BuildInformation; where YourTeamProjectCollection is the name of your Team Project Collection. The build activity will appear as independent XML nodes, divided up into 16 different types.

 

Examining Tbl_BuildInformation

As mentioned above, the build information is located in the Tbl_BuildInformation table, and the build activity appears as 16 different types of XML nodes. First, let’s take a look at the available columns:

dbo.tbl_BuildInformation_Columns

and a query result for a single build:

tlb_BuildInformation_query 

Here is what you need to know about the columns:

  • BuildId: Identifies the build. It is typically the last octet of the IBuildDetail.BuildNumber (BuildDefinitionName.1.0.0.12345) or the Uri Fragment in the IBuildDetail.BuildUri (vstfs:///Build/Build/12345) [1][2].
  • NodeId: The Id assigned to this node.
  • ParentId: The NodeId of this node’s parent; this is what gives the build log its hierarchy.
  • NodeType: Identifies what type of node the row is (1 to 16).
  • LastModifiedDate: Self-explanatory.
  • Fields: Contains the XML data that we are interested in.

 

Understanding NodeTypes

Even though there are 16 different NodeTypes, I am interested in only 5 of them. I had some difficulty finding documentation on the NodeTypes, so here are examples of the XML Fields Column for each of the NodeTypes that I use. The structure of all the NodeTypes is the same, and any value can be retrieved with the XPath string "/Fields/Field[@Name='AttributeValueOfInterest'][Value]". For the sake of simplicity, I will only be referencing the ‘AttributeValueOfInterest’ from here on out.
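
As a rough illustration of that lookup in C#, here is a minimal sketch. The class FieldsDemo and the helper GetFieldValue are hypothetical names, and the sketch selects the Value child directly (a trailing /Value step rather than the [Value] predicate) so that the node's text is exactly the value.

```csharp
using System;
using System.Xml;

public static class FieldsDemo
{
    // Hypothetical helper: returns the <Value> text of the named field,
    // or null when the Fields XML has no such field.
    // Assumes fieldName contains no apostrophe, since it is pasted into the XPath.
    public static string GetFieldValue(string fieldsXml, string fieldName)
    {
        var document = new XmlDocument();
        document.LoadXml(fieldsXml);

        XmlNode value = document.SelectSingleNode(
            "/Fields/Field[@Name='" + fieldName + "']/Value");

        return value == null ? null : value.InnerText;
    }

    public static void Main()
    {
        // Shortened version of the NodeType 7 example.
        string fieldsXml =
            "<Fields>" +
            "<Field Name=\"ChangesetId\"><Value>12345</Value></Field>" +
            "<Field Name=\"CheckedInBy\"><Value>Paul Selles</Value></Field>" +
            "</Fields>";

        Console.WriteLine(GetFieldValue(fieldsXml, "CheckedInBy"));  // Paul Selles
        Console.WriteLine(GetFieldValue(fieldsXml, "File") == null); // True
    }
}
```

Returning null for a missing field is a design choice that suits NodeTypes 8 and 10, where some fields only exist for compilation entries.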

NodeType 4: Project or Solution Compilation Results

This NodeType contains the MSBuild results for a Project or Solution component of your Build Targets. It provides the number of build errors and warnings for a specific build, along with the Build Agent Local Path and the Server Path of the Project or Solution being built. Expect to see one entry for each Project or Solution built.

Example Fields Column XML:

<Fields>
  <Field Name="CompilationErrors">
    <Value>0</Value>
  </Field>
  <Field Name="CompilationWarnings">
    <Value>0</Value>
  </Field>
  <Field Name="FinishTime">
    <Value>2013-10-30T20:01:00.3073686Z</Value>
  </Field>
  <Field Name="Flavor">
    <Value>Release</Value>
  </Field>
  <Field Name="LocalPath">
    <Value>C:\Builds\57\Path\To\Your\Project\Or\Solution.csproj</Value>
  </Field>
  <Field Name="Platform">
    <Value>AnyCPU</Value>
  </Field>
  <Field Name="ServerPath">
    <Value>$/Path/To/Your/Project/Or/Solution.csproj</Value>
  </Field>
  <Field Name="StartTime">
    <Value>2013-10-30T20:00:32.4797462Z</Value>
  </Field>
  <Field Name="StaticAnalysisErrors">
    <Value>0</Value>
  </Field>
  <Field Name="StaticAnalysisWarnings">
    <Value>0</Value>
  </Field>
  <Field Name="TargetNames">
    <Value />
  </Field>
</Fields>

NodeType 6: Total Compilation Results

This NodeType contains the MSBuild results for all components of your Build Targets. It provides the total number of build errors and warnings for a specific build.

Example Fields Column XML:

<Fields>
  <Field Name="Platform">
    <Value>Any CPU</Value>
  </Field>
  <Field Name="Flavor">
    <Value>Release</Value>
  </Field>
  <Field Name="TotalCompilationErrors">
    <Value>16</Value>
  </Field>
  <Field Name="TotalCompilationWarnings">
    <Value>9</Value>
  </Field>
  <Field Name="TotalStaticAnalysisErrors">
    <Value>0</Value>
  </Field>
  <Field Name="TotalStaticAnalysisWarnings">
    <Value>0</Value>
  </Field>
</Fields>

NodeType 7: Associated Changeset

This NodeType contains the information for an individual Changeset from the list of Associated Changesets. It contains the ChangesetId, who checked it in, and the check-in comments.

Example Fields Column XML:

<Fields>
  <Field Name="ChangesetId">
    <Value>12345</Value>
  </Field>
  <Field Name="ChangesetUri">
    <Value>vstfs:///VersionControl/Changeset/12345</Value>
  </Field>
  <Field Name="CheckedInBy">
    <Value>Paul Selles</Value>
  </Field>
  <Field Name="Comment">
    <Value>Check-in comments.</Value>
  </Field>
</Fields>

NodeType 8: Build Log Errors

This NodeType contains any errors that are created during any build activity. These can be anything from compilation errors and network errors to user-created errors raised through the WriteBuildError class in the build process template [3]. As an example, I will post the Fields Column XML for a compilation error. It is important to note that for error types other than compilation errors, the File, ServerPath, LineNumber, and EndLineNumber fields are not present.

Example Fields Column XML:

<Fields>
  <Field Name="Code">
    <Value>BC30002</Value>
  </Field>
  <Field Name="EndLineNumber">
    <Value>270</Value>
  </Field>
  <Field Name="ErrorType">
    <Value>Compilation</Value>
  </Field>
  <Field Name="File">
    <Value>C:\Builds\57\Path\To\Your\Project\Or\CodeFile.cs</Value>
  </Field>
  <Field Name="LineNumber">
    <Value>270</Value>
  </Field>
  <Field Name="Message">
    <Value>Type 'EmailSet' is not defined.</Value>
  </Field>
  <Field Name="ServerPath">
    <Value>$/RQ/Path/To/Your/Project/Or/CodeFile.cs;C51077</Value>
  </Field>
  <Field Name="Timestamp">
    <Value>2013-10-30T20:10:34.6821707Z</Value>
  </Field>
</Fields>

NodeType 10: Build Log Warnings

This NodeType contains any warnings that are created during any build activity. It is very similar to NodeType 8, Build Log Errors. The build process template class for user-created warnings is WriteBuildWarning [4]. As with the errors above, the posted example is for the compilation case. It is important to note that for warning types other than compilation warnings, the File, ServerPath, LineNumber, and EndLineNumber fields are not present.

Example Fields Column XML:

<Fields>
  <Field Name="CompilationErrors">
    <Value>0</Value>
  </Field>
  <Field Name="CompilationWarnings">
    <Value>0</Value>
  </Field>
  <Field Name="FinishTime">
    <Value>2013-10-30T20:01:00.3073686Z</Value>
  </Field>
  <Field Name="Flavor">
    <Value>Release</Value>
  </Field>
  <Field Name="LocalPath">
    <Value>C:\Builds\57\Path\To\Your\Project\Or\Solution.csproj</Value>
  </Field>
  <Field Name="Platform">
    <Value>AnyCPU</Value>
  </Field>
  <Field Name="ServerPath">
    <Value>$/Path/To/Your/Project/Or/Solution.csproj</Value>
  </Field>
  <Field Name="StartTime">
    <Value>2013-10-30T20:00:32.4797462Z</Value>
  </Field>
  <Field Name="StaticAnalysisErrors">
    <Value>0</Value>
  </Field>
  <Field Name="StaticAnalysisWarnings">
    <Value>0</Value>
  </Field>
  <Field Name="TargetNames">
    <Value />
  </Field>
</Fields>

 

Querying The Database

In order to get these values, we will query this database with the current BuildId. As mentioned above, this is typically the last octet of the IBuildDetail.BuildNumber (BuildDefinitionName.1.0.0.12345) or the Uri Fragment in the IBuildDetail.BuildUri (vstfs:///Build/Build/12345) [1][2]. Best practice is to confirm the BuildId in the table tbl_Build by looking for the above-mentioned number, 12345, in the BuildUri column. Once we have confirmed that we have the correct BuildId, we can go ahead and get the relevant build information with the following query:

SELECT * FROM [Tfs_YourTeamProjectCollection].[dbo].[Tbl_BuildInformation] WHERE [BuildId] = 12345 AND [NodeType] IN (4,6,7,8,10) -- replace 12345 with your confirmed BuildId
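
The number used to look up the BuildId can be pulled out of either string with plain string handling. This is a minimal sketch: the class BuildIdDemo and its two helpers are hypothetical names, and it only extracts the trailing number; it does not replace the check against tbl_Build.

```csharp
using System;

public static class BuildIdDemo
{
    // Hypothetical helper: the text after the last '/' of a build URI.
    public static string FromBuildUri(string buildUri)
    {
        return buildUri.Substring(buildUri.LastIndexOf('/') + 1);
    }

    // Hypothetical helper: the last dot-separated part of a build number.
    public static string FromBuildNumber(string buildNumber)
    {
        return buildNumber.Substring(buildNumber.LastIndexOf('.') + 1);
    }

    public static void Main()
    {
        Console.WriteLine(FromBuildUri("vstfs:///Build/Build/12345"));          // 12345
        Console.WriteLine(FromBuildNumber("BuildDefinitionName.1.0.0.12345")); // 12345
    }
}
```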

 

References

[1] IBuildDetail.BuildNumber Property. MSDN Library.

[2] IBuildDetail.Uri Property. MSDN Library.

[3] WriteBuildError. MSDN Library.

[4] WriteBuildWarning. MSDN Library.

Powershell Tip #1: Strict Mode XML Parsing Gotcha

I love how easy it is to parse XML with PowerShell, but then I started scripting in Strict Mode and got hung up on a little problem dealing with attributes.

I will revisit a revised cats.xml to use as an example xml file:

<?xml version="1.0" encoding="utf-8"?>
<Cats>
	<Cat Name="Wilson" Type="Tabby">
		<Property Name="Fur" Value="Coarse"/>
		<Property Name="Color" Value="Orange" />
		<Part Name="Paws">
			<Property Name="Claws" Value="Very sharp" />
		</Part>
		<Part Name="Nose">
			<Property Name="Cute" Value="true" />
		</Part>
	</Cat>
	<Cat Name="Winnie" Type="Short hair">
		<Property Name="Fur" Value="Soft"/>
		<Property Name="Color" Value="Black" />
		<Part Name="Paws">
			<Property Name="Polydactyl" Value="true" />
			<Property Name="Claws" Value="Sharp" />
		</Part>
		<Part Name="Nose">
			<Property Name="Cute" Value="true" />
		</Part>
	</Cat>
	<Cat Name="Luna">
		<Property Name="Fur" Value="Soft"/>
		<Property Name="Color" Value="Black" />
		<Part Name="Paws">
			<Property Name="Claws" Value="Trimmed" />
		</Part>
		<Part Name="Nose">
			<Property Name="Cute" Value="true" />
		</Part>
	</Cat>
</Cats>

Let’s make a script that will retrieve the Name of a cat by its Type:

# Get cat name by type
Param (
	[ValidateNotNullOrEmpty()]
	[String]$Type
)
[Xml]$Cats = (Get-Content -Path C:\temp\cats.xml)
($Cats.Cats.Cat | Where {$_.Type -match $Type}).Name

The script is nice and short and does what we expect. If the Type matches, it returns the Name of the cat (otherwise, we get nothing):

CatTypeResults

Now let’s try the script in Strict Mode:

# Get cat name by type
Param (
	[ValidateNotNullOrEmpty()]
	[String]$Type
)
Set-StrictMode -Version Latest
[Xml]$Cats = (Get-Content -Path C:\temp\cats.xml)
($Cats.Cats.Cat | Where {$_.Type -match $Type}).Name

If we try to run the script again we run into problems:

CatTypeResultsStrictMode

What did we do wrong? As it turns out, there are two problems. The first is that the entry for the cat Luna does not have a Type attribute (if we removed Luna and re-ran the script, it would pass). Second, we are expecting PowerShell to interpret what we mean by the property Type. We want Type as an attribute name, but how does PowerShell know that? This is sloppy scripting, but it works until we switch into Strict Mode.

In order to move forward, we need a better idea of the objects that we are playing with. So let’s get the members of $Cats.Cats.

SystemXmlXmlDocumentGetMember

We can see that this is a System.Xml.XmlElement [1]. We could look the class up, or we can scan the list above for a method that will help us clean up our code and test for the attribute Type:

# Get cat name by type
Param (
	[ValidateNotNullOrEmpty()]
	[String]$Type
)
Set-StrictMode -Version Latest
[Xml]$Cats = (Get-Content -Path C:\temp\cats.xml)
$Cat = $Cats.Cats.Cat | Where {$_.GetAttribute('Type') -match $Type}
if ($Cat) { $Cat.GetAttribute('Name') }

Using GetAttribute on the attribute Type solves the first error. We solve our second error (when a non-matching type is entered) by making sure that the object is not null before reading the attribute Name.
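
The reason GetAttribute is safe here is that XmlElement.GetAttribute returns an empty string, rather than failing, when the attribute is missing. Since this is the same .NET class, a minimal C# sketch shows the behaviour; the class name CatAttributeDemo and the trimmed-down XML are made up for illustration.

```csharp
using System;
using System.Xml;

public static class CatAttributeDemo
{
    public static void Main()
    {
        var document = new XmlDocument();
        // Trimmed-down cats.xml: Luna has no Type attribute.
        document.LoadXml(
            "<Cats><Cat Name=\"Wilson\" Type=\"Tabby\" /><Cat Name=\"Luna\" /></Cats>");

        foreach (XmlElement cat in document.DocumentElement.GetElementsByTagName("Cat"))
        {
            // GetAttribute gives "" for a missing attribute; HasAttribute tells the two cases apart.
            Console.WriteLine("{0}: HasAttribute={1}, Type=\"{2}\"",
                cat.GetAttribute("Name"), cat.HasAttribute("Type"), cat.GetAttribute("Type"));
        }
    }
}
```

If an empty Type and a missing Type ever need to be treated differently, HasAttribute is the method to reach for.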

Paul

References

[1] XmlElement Class. MSDN Library

Parsing Xml with Invalid Characters in C#

The Problem

I’ve stumbled upon an interesting predicament. I need to parse some SQL relationships from an automatically generated XML file that contains invalid characters. Here is an example XML file that I will use to highlight the problem that I saw:

<?xml version="1.0" encoding="utf-8"?>
<Cats>
	<Cat Id="1" Type="Tabby">
		<Property Name="Fur" Value="Coarse"/>
		<Property Name="Color" Value="Orange" />
		<Part Name="Paws">
			<Property Name="Claws" Value="Very sharp" />
		</Part>
		<Part Name="Nose">
			<Property Name="Cute" Value="true" />
		</Part>
		<Info>
			I have an invalid character.&#x13;
		</Info>
	</Cat>
	<Cat Id="2" Type="Short hair">
		<Property Name="Fur" Value="Soft"/>
		<Property Name="Color" Value="Black" />
		<Part Name="Paws">
			<Property Name="Polydactyl" Value="true" />
			<Property Name="Claws" Value="Sharp" />
		</Part>
		<Part Name="Nose">
			<Property Name="Cute" Value="true" />
		</Part>
		<Info>
			I don't have an invalid character.
		</Info>
	</Cat>
</Cats>

So above we have a small XML file cataloging my two cats. Within the Info tags you may notice that the first Cat entry has a superfluous character, 0x13; this falls outside of the valid XML character set [1]. The W3C recommendation, however, is no guarantee that every XML file that you encounter will follow the recommendations to a tee.

In C# we can try using the two most common XML parsing libraries System.Xml and System.Xml.Linq to import the XML file to the XmlDocument and XDocument objects using their respective Load functions [2][3]. If we try to do this we can expect to see the following exception:

‘ ‘, hexadecimal value 0x13, is an invalid character. Line 13, position 35.

The Solution

There is a workaround that is made possible with the lightweight disposable XmlReader class and the XmlReaderSettings support class that allows us to customize the behavior of XmlReader [4][5]. The XmlReaderSettings property that interests us the most is the Boolean CheckCharacters. Setting CheckCharacters property to false will let us read the XML document without verifying if the processed text data is within the valid XML character set [6]. The XmlDocument and XDocument objects can now be loaded from the XmlReader incident free:

static XmlDocument ReadXmlDocumentWithInvalidCharacters(string filename)
{
    XmlDocument xmlDocument = new XmlDocument();

    XmlReaderSettings xmlReaderSettings = new XmlReaderSettings { CheckCharacters = false };

    using (XmlReader xmlReader = XmlReader.Create(filename, xmlReaderSettings))
    {
        // Load our XmlDocument
        xmlReader.MoveToContent();
        xmlDocument.Load(xmlReader);
    }

    return xmlDocument;
}

static XDocument ReadXDocumentWithInvalidCharacters(string filename)
{
    XDocument xDocument = null;

    XmlReaderSettings xmlReaderSettings = new XmlReaderSettings { CheckCharacters = false };

    using (XmlReader xmlReader = XmlReader.Create(filename, xmlReaderSettings))
    {
        // Load our XDocument
        xmlReader.MoveToContent();
        xDocument = XDocument.Load(xmlReader);
    }

    return xDocument;
}

Once we load our XML code then we are free to parse it, and since I prefer working with the System.Xml.Linq library, that’s all I will do:

static void PrintXDocument(XDocument xDocument)
{
    foreach (XElement xElement in xDocument.Elements(xDocument.Root.Name).DescendantsAndSelf())
    {
        Console.Write("".PadRight(xElement.Ancestors().Count() * 4) +
            (xElement.HasElements || string.IsNullOrEmpty(xElement.Value) ?
                xElement.Name.LocalName :
                (xElement.Name.LocalName + " \"" + xElement.Value.Trim() + "\"")));

        foreach (XAttribute xAttribute in xElement.Attributes())
            Console.Write(" " + xAttribute.Name.LocalName + "=\"" + xAttribute.Value + "\"");

        Console.WriteLine();
    }

    Console.ReadLine();
}

And the results:

Cats
    Cat Id="1" Type="Tabby"
        Property Name="Fur" Value="Coarse"
        Property Name="Color" Value="Orange"
        Part Name="Paws"
            Property Name="Claws" Value="Very sharp"
        Part Name="Nose"
            Property Name="Cute" Value="true"
        Info "I have an invalid character.‼"
    Cat Id="2" Type="Short hair"
        Property Name="Fur" Value="Soft"
        Property Name="Color" Value="Black"
        Part Name="Paws"
            Property Name="Polydactyl" Value="true"
            Property Name="Claws" Value="Sharp"
        Part Name="Nose"
            Property Name="Cute" Value="true"
        Info "I don't have an invalid character."

We are not out of the woods yet

We are dealing with damaged goods here: that invalid character is still present, so we have to be careful. Notice the ‼ in the output above; that is 0x13.

An example of what can go wrong is evident if we try to print out the contents of our XDocument object:

Console.WriteLine(ReadXDocumentWithInvalidCharacters(filename).ToString());

Normally we will get a printout of the containing XML. In this case we will see the exception we saw above.
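
One way around this, sketched below under the assumption that the writer side behaves like the reader side, is to write the XDocument through an XmlWriter whose XmlWriterSettings also has CheckCharacters set to false. The class LenientPrintDemo and its helpers are hypothetical names, and the document is parsed from a string so the example is self-contained.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Linq;

public static class LenientPrintDemo
{
    // Same idea as ReadXDocumentWithInvalidCharacters, but from a string.
    public static XDocument ParseLenient(string xml)
    {
        var settings = new XmlReaderSettings { CheckCharacters = false };
        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            return XDocument.Load(reader);
        }
    }

    // Hypothetical helper: serialize without the writer's character check.
    public static string ToStringLenient(XDocument document)
    {
        var settings = new XmlWriterSettings
        {
            Indent = true,
            OmitXmlDeclaration = true,
            CheckCharacters = false
        };

        var output = new StringBuilder();
        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            document.WriteTo(writer);
        }
        return output.ToString();
    }

    public static void Main()
    {
        XDocument cats = ParseLenient(
            "<Cats><Cat Id=\"1\"><Info>I have an invalid character.&#x13;</Info></Cat></Cats>");
        Console.WriteLine(ToStringLenient(cats));
    }
}
```

The resulting text still carries the invalid character, so anything that reads it back with default settings will hit the same exception again.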

Paul

 

References

[1] Extensible Markup Language (XML) 1.0 (Fifth Edition). 26 Nov 2008. W3C Recommendation

[2] XmlDocument Class. MSDN Library

[3] XDocument Class. MSDN Library

[4] XmlReader Class. MSDN Library

[5] XmlReaderSettings Class. MSDN Library

[6] XmlReaderSettings.CheckCharacters Property. MSDN Library
