TurboXml 2.0.2

dotnet add package TurboXml --version 2.0.2

TurboXml

<img align="right" width="160px" src="https://raw.githubusercontent.com/xoofx/TurboXml/main/img/TurboXml.png">

TurboXml is a .NET library that provides a lightweight and fast SAX (Simple API for XML) parser based on callbacks.

It is the equivalent of System.Xml.XmlReader, but faster and with no allocations. 🚀

✨ Features

  • Should be slightly faster than System.Xml.XmlReader
  • Zero-allocation XML parser
    • Callbacks receive ReadOnlySpan<char> for the parsed elements.
    • Parses anything from small to very large XML documents without allocating!
  • Optimized with SIMD
    • TurboXml uses SIMD to speed up parsing of large portions of XML documents.
  • Provides the precise source location of the parsed XML elements (to report warnings/errors)
  • Compatible with net8.0+
  • NativeAOT ready

📃 User Guide

TurboXml belongs to the family of SAX parsers, so you need to implement the callbacks defined by IXmlReadHandler.

The interface methods have empty default implementations, so you only need to override the callbacks you are interested in:

using TurboXml; // assumed namespace of the package

var xml = "<?xml version=\"1.0\"?><root enabled=\"true\">Hello World!</root>";
var handler = new MyXmlHandler();
XmlParser.Parse(xml, ref handler);
// Will print:
//
// BeginTag(1:23): root
// Attribute(1:28)-(1:36): enabled="true"
// Content(1:43): Hello World!
// EndTag(1:57): root

struct MyXmlHandler : IXmlReadHandler
{
    public void OnBeginTag(ReadOnlySpan<char> name, int line, int column)
        => Console.WriteLine($"BeginTag({line + 1}:{column + 1}): {name}");

    public void OnEndTagEmpty()
        => Console.WriteLine($"EndTagEmpty");

    public void OnEndTag(ReadOnlySpan<char> name, int line, int column)
        => Console.WriteLine($"EndTag({line + 1}:{column + 1}): {name}");

    public void OnAttribute(ReadOnlySpan<char> name, ReadOnlySpan<char> value, int nameLine, int nameColumn, int valueLine, int valueColumn)
        => Console.WriteLine($"Attribute({nameLine + 1}:{nameColumn + 1})-({valueLine + 1}:{valueColumn + 1}): {name}=\"{value}\"");

    public void OnText(ReadOnlySpan<char> text, int line, int column)
        => Console.WriteLine($"Content({line + 1}:{column + 1}): {text}");
}
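
Because the handler is passed by ref, a struct handler can also accumulate state while parsing. The following is a minimal sketch under stated assumptions: the `TurboXml` namespace is assumed, and `TagCounter` and its fields are hypothetical names used only for illustration. It relies solely on the `OnBeginTag` callback and the `XmlParser.Parse` call shown above.

```csharp
using System;
using TurboXml; // assumed namespace of the package

var xml = "<root><item id=\"1\"/><item id=\"2\"/><note>Hi</note></root>";
var counter = new TagCounter();
XmlParser.Parse(xml, ref counter);
Console.WriteLine($"tags={counter.TagCount}, items={counter.ItemCount}");

// Hypothetical handler: counts all begin tags and the ones named "item".
struct TagCounter : IXmlReadHandler
{
    public int TagCount;
    public int ItemCount;

    public void OnBeginTag(ReadOnlySpan<char> name, int line, int column)
    {
        TagCount++;
        // Compare the span directly, without allocating a string.
        if (name.SequenceEqual("item".AsSpan()))
        {
            ItemCount++;
        }
    }
}
```

The span passed to a callback should be treated as valid only for the duration of that callback. If a name or value has to be kept, copy it (for example with `ToString()`), at the cost of an allocation.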

📊 Benchmarks

The solution contains 2 benchmarks:

  • BenchStream, which parses 240+ MSBuild XML files (targets and props) from the installed .NET 8 SDK (or the latest SDK installed)
  • BenchString that parses the Tiger.svg in memory from a string.

In general, the advantages of TurboXml over System.Xml.XmlReader are:

  • It should be slightly faster, especially when tag names, attributes or content are longer than 8 consecutive characters, thanks to SIMD instructions.
  • It makes almost zero allocations, apart from the internal buffers used to pass data as ReadOnlySpan<char> back to the XML handler.

Stream Results

BenchmarkDotNet v0.13.12, Windows 11 (10.0.22631.3085/23H2/2023Update/SunValley3)
AMD Ryzen 9 7950X, 1 CPU, 32 logical and 16 physical cores
.NET SDK 8.0.101
  [Host]     : .NET 8.0.1 (8.0.123.58001), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  DefaultJob : .NET 8.0.1 (8.0.123.58001), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
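| Method                        | Mean     | Error     | StdDev    | Gen0     | Gen1    | Allocated  |
|-------------------------------|----------|-----------|-----------|----------|---------|------------|
| TurboXml - Stream             | 3.993 ms | 0.0780 ms | 0.0729 ms | -        | -       | 13.18 KB   |
| System.Xml.XmlReader - Stream | 4.163 ms | 0.0386 ms | 0.0361 ms | 328.1250 | 46.8750 | 5415.45 KB |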

String Results

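| Method               | Mean     | Error    | StdDev   | Gen0   | Gen1   | Allocated |
|----------------------|----------|----------|----------|--------|--------|-----------|
| TurboXml             | 52.14 us | 1.040 us | 1.491 us | -      | -      | -         |
| System.Xml.XmlReader | 56.98 us | 0.393 us | 0.348 us | 2.9297 | 0.2441 | 49304 B   |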

🚨 XML Conformance and Known Limitations

This parser follows the Extensible Markup Language (XML) 1.0 (Fifth Edition) specification and should support any valid XML document, except for the known limitations described below:

  • For simplicity of the implementation, this parser does not support DTD, custom entities and XML directives (<!DOCTYPE ...>). If you need these, you should use System.Xml.XmlReader instead.
  • This parser checks that the XML is well formed, including matching begin and end tags, and reports an error if they do not match.
  • This parser does not check for duplicated attributes.
    • It is the responsibility of the XML handler to implement such a check. The rationale is that the check can be performed more efficiently depending on the user scenario (e.g. with bit flags).
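
As an illustration of the bit-flag idea mentioned above, here is a minimal sketch of a duplicate-attribute check that a handler could own. It is not part of TurboXml: `DuplicateAttributeTracker`, `Reset`, `TryMark` and the fixed list of known attribute names are all hypothetical. It also assumes the handler only cares about a small, known set of attributes (at most 32 with a `uint`).

```csharp
using System;

var tracker = new DuplicateAttributeTracker();

tracker.Reset();                                // a handler would call this from OnBeginTag
Console.WriteLine(tracker.TryMark("enabled"));  // True: first occurrence
Console.WriteLine(tracker.TryMark("name"));     // True: first occurrence
Console.WriteLine(tracker.TryMark("enabled"));  // False: duplicate on the same element

// Hypothetical helper: one bit per known attribute name.
struct DuplicateAttributeTracker
{
    // Assumption: a small, fixed set of attribute names is of interest.
    private static readonly string[] KnownNames = { "enabled", "name", "id" };

    private uint _seen;

    // Clears the flags; intended to be called when a new element begins.
    public void Reset() => _seen = 0;

    // Returns false if the attribute was already marked since the last Reset.
    public bool TryMark(ReadOnlySpan<char> name)
    {
        for (int i = 0; i < KnownNames.Length; i++)
        {
            if (name.SequenceEqual(KnownNames[i].AsSpan()))
            {
                uint bit = 1u << i;
                if ((_seen & bit) != 0)
                {
                    return false;
                }
                _seen |= bit;
                return true;
            }
        }
        return true; // unknown attribute names are not tracked in this sketch
    }
}
```

In a handler, `TryMark` would be called from OnAttribute with the attribute name span, and a `false` result reported using the line and column passed to that callback. Since the name arrives as a span, the check itself does not allocate.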

🏗️ Build

You need to install the .NET 8 SDK. Then from the root folder:

$ dotnet build src -c Release

🪪 License

This software is released under the BSD-2-Clause license.

🤗 Author

Alexandre Mutel aka xoofx.

| Version | Downloads | Last updated |
|---------|-----------|--------------|
| 2.0.2   | 123       | 2/12/2024    |
| 2.0.1   | 87        | 2/11/2024    |
| 2.0.0   | 75        | 2/11/2024    |
| 1.0.1   | 88        | 2/8/2024     |
| 1.0.0   | 91        | 2/7/2024     |