Gapotchenko.FX.Runtime.CompilerServices.Intrinsics 2022.1.4

There is a newer version of this package available.
See the version list below for details.
dotnet add package Gapotchenko.FX.Runtime.CompilerServices.Intrinsics --version 2022.1.4                
NuGet\Install-Package Gapotchenko.FX.Runtime.CompilerServices.Intrinsics -Version 2022.1.4                
This command is intended to be used within the Package Manager Console in Visual Studio, as it uses the NuGet module's version of Install-Package.
<PackageReference Include="Gapotchenko.FX.Runtime.CompilerServices.Intrinsics" Version="2022.1.4" />                
For projects that support PackageReference, copy this XML node into the project file to reference the package.
paket add Gapotchenko.FX.Runtime.CompilerServices.Intrinsics --version 2022.1.4                
#r "nuget: Gapotchenko.FX.Runtime.CompilerServices.Intrinsics, 2022.1.4"                
#r directive can be used in F# Interactive and Polyglot Notebooks. Copy this into the interactive tool or source code of the script to reference the package.
// Install Gapotchenko.FX.Runtime.CompilerServices.Intrinsics as a Cake Addin
#addin nuget:?package=Gapotchenko.FX.Runtime.CompilerServices.Intrinsics&version=2022.1.4

// Install Gapotchenko.FX.Runtime.CompilerServices.Intrinsics as a Cake Tool
#tool nuget:?package=Gapotchenko.FX.Runtime.CompilerServices.Intrinsics&version=2022.1.4                

Overview

The Gapotchenko.FX.Runtime.CompilerServices.Intrinsics module lets you define and compile intrinsic functions, which can be used in hardware-accelerated implementations of algorithms.

Example

Suppose we are trying to fix the performance bottleneck in the following algorithm:

class BitOperations
{
    // Returns the base 2 logarithm of a specified number.
    public static int Log2_Trivial(uint value)
    {
        int r = 0;
        while ((value >>= 1) != 0)
            ++r;
        return r;
    }
}

The base 2 logarithm (log2) seems like a trivial operation, but it often becomes a serious bottleneck in path-finding or cryptographic algorithms. We can do better by switching to a table lookup:

class BitOperations
{
    // "Bit Twiddling Hacks" by Sean Eron Anderson:
    // http://graphics.stanford.edu/~seander/bithacks.html

    static readonly int[] m_Log2DeBruijn32 =
    {
         0,  9,  1, 10, 13, 21,  2, 29,
        11, 14, 16, 18, 22, 25,  3, 30,
         8, 12, 20, 28, 15, 17, 24,  7,
        19, 27, 23,  6, 26,  5,  4, 31
    };

    public static int Log2_DeBruijn(uint value)
    {
        // Round down to one less than a power of 2.
        value |= value >> 1;
        value |= value >> 2;
        value |= value >> 4;
        value |= value >> 8;
        value |= value >> 16;

        var index = (value * 0x07C4ACDDU) >> 27;
        return m_Log2DeBruijn32[index];
    }
}
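Before relying on the table lookup, it is worth checking that it agrees with the trivial loop. The following is a minimal sketch of such a check, not part of the package: the Log2Check class and its method names are made up for illustration, and the test values (each power of two and its two neighbors) are just one reasonable choice of boundary cases.

```csharp
using System;

// Hypothetical self-check; class and method names are illustrative only.
static class Log2Check
{
    static readonly int[] Log2DeBruijn32 =
    {
         0,  9,  1, 10, 13, 21,  2, 29,
        11, 14, 16, 18, 22, 25,  3, 30,
         8, 12, 20, 28, 15, 17, 24,  7,
        19, 27, 23,  6, 26,  5,  4, 31
    };

    // Reference implementation: counts how many times the value can be halved.
    public static int Log2Trivial(uint value)
    {
        int r = 0;
        while ((value >>= 1) != 0)
            ++r;
        return r;
    }

    // Table lookup: smear the highest set bit downwards, then hash with the De Bruijn constant.
    public static int Log2DeBruijn(uint value)
    {
        value |= value >> 1;
        value |= value >> 2;
        value |= value >> 4;
        value |= value >> 8;
        value |= value >> 16;
        return Log2DeBruijn32[(value * 0x07C4ACDDU) >> 27];
    }

    // Compares both implementations at every power of two and its neighbors.
    public static bool Agree()
    {
        for (int bit = 0; bit < 32; bit++)
        {
            uint p = 1u << bit;
            foreach (uint v in new[] { p - 1, p, p + 1 })
            {
                if (Log2Trivial(v) != Log2DeBruijn(v))
                    return false;
            }
        }
        return true;
    }

    static void Main() =>
        Console.WriteLine(Agree() ? "Implementations agree" : "Mismatch found");
}
```

Both versions return 0 for an input of 0, so the check can include that value without special handling.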

This is a vast improvement over the previous version, but we can do even better.

Meet the Intel 80386, a 32-bit microprocessor introduced in 1985. It brought the Bit Scan Reverse (BSR) instruction, which returns the index of the most significant set bit of its operand. That is exactly what Log2 computes for a nonzero value, and BSR does it in a small fraction of the cycles.

Chances are your machine runs on a descendant of that influential CPU, be it AMD Ryzen or Intel Core. So how can we use the BSR instruction from .NET?

This is why the Gapotchenko.FX.Runtime.CompilerServices.Intrinsics class exists. It lets you define an intrinsic implementation of a method with MachineCodeIntrinsicAttribute. Let's see how:

using Gapotchenko.FX.Runtime.CompilerServices;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

class BitOperations
{
    // Use static constructor to ensure that intrinsic methods are initialized (compiled) before they can be used
    static BitOperations() => Intrinsics.InitializeType(typeof(BitOperations));

    static readonly int[] m_Log2DeBruijn32 =
    {
         0,  9,  1, 10, 13, 21,  2, 29,
        11, 14, 16, 18, 22, 25,  3, 30,
         8, 12, 20, 28, 15, 17, 24,  7,
        19, 27, 23,  6, 26,  5,  4, 31
    };

    // Define machine code intrinsic for the method
    [MachineCodeIntrinsic(Architecture.X64, 0x0f, 0xbd, 0xc1)]  // BSR EAX, ECX
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int Log2_Intrinsic(uint value)
    {
        value |= value >> 1;
        value |= value >> 2;
        value |= value >> 4;
        value |= value >> 8;
        value |= value >> 16;

        var index = (value * 0x07C4ACDDU) >> 27;
        return m_Log2DeBruijn32[index];
    }
}

The Log2_Intrinsic method carries a custom attribute that supplies the machine code for the BSR EAX, ECX instruction. Machine code is tied to a CPU architecture, and the attribute reflects this by naming the architecture explicitly.
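To see how the three bytes in the attribute correspond to BSR EAX, ECX, it helps to decode them by hand. In the x86 encoding, 0F BD is the opcode of BSR with 32-bit operands, and the third byte is a ModRM byte that selects the destination and source registers. The sketch below decodes only this one register-to-register form; the ModRmDemo class is a hypothetical helper written for illustration and has nothing to do with the package's own API.

```csharp
using System;

// Hypothetical decoder for the single instruction form used in the attribute above.
static class ModRmDemo
{
    // 32-bit general-purpose registers in x86 encoding order.
    static readonly string[] Reg32 =
        { "EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI" };

    public static string DecodeBsr(byte[] code)
    {
        // 0F BD /r is BSR r32, r/m32.
        if (code.Length != 3 || code[0] != 0x0F || code[1] != 0xBD)
            throw new ArgumentException("Expected a 3-byte BSR encoding.", nameof(code));

        int modrm = code[2];
        int mod = modrm >> 6;        // addressing mode; 3 means register-direct
        int reg = (modrm >> 3) & 7;  // destination register
        int rm = modrm & 7;          // source register when mod == 3

        if (mod != 3)
            throw new NotSupportedException("Only register-direct operands are handled in this sketch.");

        return $"BSR {Reg32[reg]}, {Reg32[rm]}";
    }

    static void Main() =>
        Console.WriteLine(DecodeBsr(new byte[] { 0x0F, 0xBD, 0xC1 }));  // prints "BSR EAX, ECX"
}
```

The ModRM byte 0xC1 is 11 000 001 in binary: mode 3 (register-direct), destination register 0 (EAX), source register 1 (ECX).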

Please note that, besides using MachineCodeIntrinsicAttribute to define intrinsic implementations of methods, the BitOperations class should use a static constructor to ensure that the corresponding methods are initialized (compiled) before they are called.

Here are the execution times of all three implementations (lower is better):

| Method         | Mean     | Error     | StdDev    |
|----------------|----------|-----------|-----------|
| Log2_Trivial   | 4.587 ns | 0.0325 ns | 0.0288 ns |
| Log2_DeBruijn  | 1.256 ns | 0.0068 ns | 0.0063 ns |
| Log2_Intrinsic | 1.038 ns | 0.0660 ns | 0.0947 ns |

Log2_Intrinsic is a clear winner: it runs about 4.4 times faster than Log2_Trivial and about 17% faster than Log2_DeBruijn.

The intrinsic compiler may or may not apply the machine code to a method, depending on the current app host environment. When an intrinsic is not applied, the original method implementation is used, which provides a graceful, albeit less performant, fallback.
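One part of the host environment that plainly matters is the CPU architecture of the running process, since the attribute names the architecture its machine code is written for. The sketch below illustrates that single condition using the standard RuntimeInformation API. It is an assumption-laden illustration, not the package's actual decision logic: the FallbackSketch class and IsApplicable helper are hypothetical, and the real intrinsic compiler may consider other factors.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical illustration; this is NOT how Gapotchenko.FX decides internally.
static class FallbackSketch
{
    // True when machine code written for 'target' matches the running process.
    public static bool IsApplicable(Architecture target) =>
        RuntimeInformation.ProcessArchitecture == target;

    static void Main()
    {
        string path = IsApplicable(Architecture.X64)
            ? "machine code intrinsic could apply"
            : "managed fallback would be used";

        Console.WriteLine($"{RuntimeInformation.ProcessArchitecture}: {path}");
    }
}
```

Because the managed method body always remains in place, calling code does not need a check like this; it calls Log2_Intrinsic the same way in either case.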

| Product | Compatible and additional computed target framework versions |
|---------|--------------------------------------------------------------|
| .NET | net5.0 is compatible. net5.0-windows was computed. net6.0 is compatible. net6.0-android was computed. net6.0-ios was computed. net6.0-maccatalyst was computed. net6.0-macos was computed. net6.0-tvos was computed. net6.0-windows was computed. net7.0 is compatible. net7.0-android was computed. net7.0-ios was computed. net7.0-maccatalyst was computed. net7.0-macos was computed. net7.0-tvos was computed. net7.0-windows was computed. net8.0 was computed. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. |
| .NET Core | netcoreapp2.0 is compatible. netcoreapp2.1 is compatible. netcoreapp2.2 was computed. netcoreapp3.0 is compatible. netcoreapp3.1 was computed. |
| .NET Standard | netstandard2.0 is compatible. netstandard2.1 is compatible. |
| .NET Framework | net46 is compatible. net461 was computed. net462 was computed. net463 was computed. net47 was computed. net471 is compatible. net472 is compatible. net48 was computed. net481 was computed. |
| MonoAndroid | monoandroid was computed. |
| MonoMac | monomac was computed. |
| MonoTouch | monotouch was computed. |
| Tizen | tizen40 was computed. tizen60 was computed. |
| Xamarin.iOS | xamarinios was computed. |
| Xamarin.Mac | xamarinmac was computed. |
| Xamarin.TVOS | xamarintvos was computed. |
| Xamarin.WatchOS | xamarinwatchos was computed. |

NuGet packages (1)

Showing the top 1 NuGet package that depends on Gapotchenko.FX.Runtime.CompilerServices.Intrinsics:

Gapotchenko.FX.Numerics: The module provides hardware-accelerated operations for numeric data types.


| Version       | Downloads | Last updated |
|---------------|-----------|--------------|
| 2024.2.5      | 145       | 12/31/2024   |
| 2024.1.3      | 296       | 11/10/2024   |
| 2022.2.7      | 6,180     | 5/1/2022     |
| 2022.2.5      | 1,783     | 5/1/2022     |
| 2022.1.4      | 1,755     | 4/6/2022     |
| 2021.2.21     | 1,087     | 1/21/2022    |
| 2021.2.20     | 987       | 1/17/2022    |
| 2021.1.5      | 743       | 7/6/2021     |
| 2020.2.2-beta | 491       | 11/21/2020   |
| 2020.1.15     | 888       | 11/5/2020    |
| 2020.1.9-beta | 550       | 7/14/2020    |
| 2020.1.8-beta | 543       | 7/14/2020    |
| 2020.1.7-beta | 575       | 7/14/2020    |
| 2020.1.1-beta | 638       | 2/11/2020    |
| 2019.3.7      | 892       | 11/4/2019    |
| 2019.2.20     | 943       | 8/13/2019    |