Acornima 0.9.0

There is a newer version of this package available.
See the version list below for details.
dotnet add package Acornima --version 0.9.0
NuGet\Install-Package Acornima -Version 0.9.0
This command is intended to be used within the Package Manager Console in Visual Studio, as it uses the NuGet module's version of Install-Package.
<PackageReference Include="Acornima" Version="0.9.0" />
For projects that support PackageReference, copy this XML node into the project file to reference the package.
paket add Acornima --version 0.9.0
#r "nuget: Acornima, 0.9.0"
#r directive can be used in F# Interactive and Polyglot Notebooks. Copy this into the interactive tool or source code of the script to reference the package.
// Install Acornima as a Cake Addin
#addin nuget:?package=Acornima&version=0.9.0

// Install Acornima as a Cake Tool
#tool nuget:?package=Acornima&version=0.9.0


Acorn + Esprima = Acornima

This project is an interbreeding of the acornjs and the Esprima.NET parsers, with the intention of creating an even more complete and performant ECMAScript (a.k.a. JavaScript) parser library for .NET by combining the best bits of both.

It should also be mentioned that there is an earlier .NET port of acornjs, AcornSharp, which, though it has been unmaintained for a long time, served as a good starting point. If it weren't for AcornSharp, this project probably would never have started.

Here is what this Frankenstein's monster looks like:

  • The tokenizer is mostly a direct translation of the acornjs tokenizer to C# (with many smaller and bigger performance improvements, partly inspired by Esprima.NET) - apart from the regex validation/conversion logic, which is currently borrowed from Esprima.NET.
  • The parser is ~99% acornjs (also with a bunch of minor improvements) and ~1% Esprima.NET (strict mode detection, public API).
  • Both projects follow the ESTree specification, and so does Acornima. The actual AST implementation is based on that of Esprima.NET, with further minor improvements to the class hierarchy that bring it even closer to the spec and allow encoding a bit more information.
  • The built-in AST visitors and additional utility functionality stem from Esprima.NET as well.

And what good comes out of this mix?

  • A parser which already matches the performance of Esprima.NET, while doing more: it also passes the complete Test262 test suite for ECMAScript 2023.
  • It is also more economical with regard to stack usage, so it can parse ~1.5x deeper structures.
  • More options for fine-tuning parsing.
  • A standalone tokenizer which can deal with most of the ambiguities of the JavaScript grammar (thanks to the clever context tracking solution implemented by acornjs).
  • As the parser tracks variable scopes to detect variable redeclarations, it will be possible to expose this information to the consumer.

AST

Node [x]
 ├─ArrayPattern : IDestructuringPattern [v,s]
 ├─AssignmentPattern : IDestructuringPattern [v,s]
 ├─CatchClause [v,s]
 ├─ClassBody [v,s]
 ├─ClassProperty : IClassElement, IProperty
 │  ├─AccessorProperty : IClassElement, IProperty [v,s]
 │  ├─MethodDefinition : IClassElement, IProperty [v,s]
 │  └─PropertyDefinition : IClassElement, IProperty [v,s]
 ├─Decorator [v,s]
 ├─ImportAttribute [v,s]
 ├─ModuleSpecifier
 │  ├─ExportSpecifier [v,s]
 │  └─ImportDeclarationSpecifier
 │     ├─ImportDefaultSpecifier [v,s]
 │     ├─ImportNamespaceSpecifier [v,s]
 │     └─ImportSpecifier [v,s]
 ├─ObjectPattern : IDestructuringPattern [v,s]
 ├─Program : IVarScope [v]
 │  ├─Module : IVarScope [s]
 │  └─Script : IVarScope [s]
 ├─Property : IProperty [v]
 │  ├─AssignmentProperty : IProperty [s]
 │  └─ObjectProperty : IProperty [s]
 ├─RestElement : IDestructuringPattern [v,s]
 ├─StatementOrExpression
 │  ├─Expression [x]
 │  │  ├─ArrayExpression [v,s]
 │  │  ├─ArrowFunctionExpression : IFunction [v,s]
 │  │  ├─AssignmentExpression [v,s]
 │  │  ├─AwaitExpression [v,s]
 │  │  ├─BinaryExpression [v]
 │  │  │  ├─LogicalExpression [s]
 │  │  │  └─NonLogicalBinaryExpression [s]
 │  │  ├─CallExpression : IChainElement [v,s]
 │  │  ├─ChainExpression [v,s]
 │  │  ├─ClassExpression : IClass [v,s]
 │  │  ├─ConditionalExpression [v,s]
 │  │  ├─FunctionExpression : IFunction [v,s]
 │  │  ├─Identifier : IDestructuringPattern [v,s]
 │  │  ├─ImportExpression [v,s]
 │  │  ├─Literal [v]
 │  │  │  ├─BigIntLiteral [s]
 │  │  │  ├─BooleanLiteral [s]
 │  │  │  ├─NullLiteral [s]
 │  │  │  ├─NumericLiteral [s]
 │  │  │  ├─RegExpLiteral [s]
 │  │  │  └─StringLiteral [s]
 │  │  ├─MemberExpression : IChainElement, IDestructuringPattern [v,s]
 │  │  ├─MetaProperty [v,s]
 │  │  ├─NewExpression [v,s]
 │  │  ├─ObjectExpression [v,s]
 │  │  ├─ParenthesizedExpression [v,s]
 │  │  ├─PrivateIdentifier [v,s]
 │  │  ├─SequenceExpression [v,s]
 │  │  ├─SpreadElement [v,s]
 │  │  ├─Super [v,s]
 │  │  ├─TaggedTemplateExpression [v,s]
 │  │  ├─TemplateLiteral [v,s]
 │  │  ├─ThisExpression [v,s]
 │  │  ├─UnaryExpression [v]
 │  │  │  ├─NonUpdateUnaryExpression [s]
 │  │  │  └─UpdateExpression [s]
 │  │  └─YieldExpression [v,s]
 │  └─Statement [x]
 │     ├─BlockStatement [v]
 │     │  ├─FunctionBody : IVarScope [s]
 │     │  ├─NestedBlockStatement [s]
 │     │  └─StaticBlock : IClassElement, IVarScope [v,s]
 │     ├─BreakStatement [v,s]
 │     ├─ContinueStatement [v,s]
 │     ├─DebuggerStatement [v,s]
 │     ├─Declaration [x]
 │     │  ├─ClassDeclaration : IClass [v,s]
 │     │  ├─FunctionDeclaration : IFunction [v,s]
 │     │  ├─ImportOrExportDeclaration
 │     │  │  ├─ExportDeclaration
 │     │  │  │  ├─ExportAllDeclaration [v,s]
 │     │  │  │  ├─ExportDefaultDeclaration [v,s]
 │     │  │  │  └─ExportNamedDeclaration [v,s]
 │     │  │  └─ImportDeclaration [v,s]
 │     │  └─VariableDeclaration [v,s]
 │     ├─DoWhileStatement [v,s]
 │     ├─EmptyStatement [v,s]
 │     ├─ExpressionStatement [v]
 │     │  ├─Directive [s]
 │     │  └─NonSpecialExpressionStatement [s]
 │     ├─ForInStatement [v,s]
 │     ├─ForOfStatement [v,s]
 │     ├─ForStatement [v,s]
 │     ├─IfStatement [v,s]
 │     ├─LabeledStatement [v,s]
 │     ├─ReturnStatement [v,s]
 │     ├─SwitchStatement [v,s]
 │     ├─ThrowStatement [v,s]
 │     ├─TryStatement [v,s]
 │     ├─WhileStatement [v,s]
 │     └─WithStatement [v,s]
 ├─SwitchCase [v,s]
 ├─TemplateElement [v,s]
 └─VariableDeclarator [v,s]

Legend:

  • v - A visitation method is generated in the visitors for the node type.
  • s - The node class is sealed. (It's beneficial to check for sealed types when possible.)
  • x - The node class can be subclassed. (The AST provides some limited extensibility for special use cases.)

Benchmarks

| Method | Runtime | FileName | Mean | Allocated |
|---|---|---|---|---|
| Acornima-dev | .NET 8.0 | angular-1.2.5 | 10.576 ms | 4062.79 KB |
| Acornima-dev | .NET Framework 4.8 | angular-1.2.5 | 21.935 ms | 4083.74 KB |
| Esprima-v3.0.4 | .NET 8.0 | angular-1.2.5 | 11.214 ms | 3828.11 KB |
| Esprima-v3.0.4 | .NET Framework 4.8 | angular-1.2.5 | 20.684 ms | 3879.54 KB |
| Acornima-dev | .NET 8.0 | backbone-1.1.0 | 1.408 ms | 638.72 KB |
| Acornima-dev | .NET Framework 4.8 | backbone-1.1.0 | 3.226 ms | 642.58 KB |
| Esprima-v3.0.4 | .NET 8.0 | backbone-1.1.0 | 1.465 ms | 613.88 KB |
| Esprima-v3.0.4 | .NET Framework 4.8 | backbone-1.1.0 | 2.917 ms | 620.3 KB |
| Acornima-dev | .NET 8.0 | jquery-1.9.1 | 8.221 ms | 3322.58 KB |
| Acornima-dev | .NET Framework 4.8 | jquery-1.9.1 | 18.009 ms | 3339.42 KB |
| Esprima-v3.0.4 | .NET 8.0 | jquery-1.9.1 | 8.469 ms | 3305.23 KB |
| Esprima-v3.0.4 | .NET Framework 4.8 | jquery-1.9.1 | 16.542 ms | 3355.15 KB |
| Acornima-dev | .NET 8.0 | jquery.mobile-1.4.2 | 14.038 ms | 5499.24 KB |
| Acornima-dev | .NET Framework 4.8 | jquery.mobile-1.4.2 | 29.629 ms | 5530.42 KB |
| Esprima-v3.0.4 | .NET 8.0 | jquery.mobile-1.4.2 | 14.599 ms | 5428.48 KB |
| Esprima-v3.0.4 | .NET Framework 4.8 | jquery.mobile-1.4.2 | 27.261 ms | 5497.48 KB |
| Acornima-dev | .NET 8.0 | mootools-1.4.5 | 6.695 ms | 2812.26 KB |
| Acornima-dev | .NET Framework 4.8 | mootools-1.4.5 | 14.633 ms | 2828 KB |
| Esprima-v3.0.4 | .NET 8.0 | mootools-1.4.5 | 7.034 ms | 2777.83 KB |
| Esprima-v3.0.4 | .NET Framework 4.8 | mootools-1.4.5 | 13.754 ms | 2816.33 KB |
| Acornima-dev | .NET 8.0 | underscore-1.5.2 | 1.158 ms | 541.81 KB |
| Acornima-dev | .NET Framework 4.8 | underscore-1.5.2 | 2.782 ms | 544.51 KB |
| Esprima-v3.0.4 | .NET 8.0 | underscore-1.5.2 | 1.229 ms | 539.42 KB |
| Esprima-v3.0.4 | .NET Framework 4.8 | underscore-1.5.2 | 2.516 ms | 547.18 KB |
| Acornima-dev | .NET 8.0 | yui-3.12.0 | 5.867 ms | 2638.28 KB |
| Acornima-dev | .NET Framework 4.8 | yui-3.12.0 | 13.651 ms | 2655.09 KB |
| Esprima-v3.0.4 | .NET 8.0 | yui-3.12.0 | 6.488 ms | 2585.78 KB |
| Esprima-v3.0.4 | .NET Framework 4.8 | yui-3.12.0 | 12.365 ms | 2624.92 KB |

Known issues and limitations

Regular expressions

The parser can be configured to convert JS regular expression literals to .NET Regex instances (see ParserOptions.RegExpParseMode). However, because of the fundamental differences between the JS and .NET regex engines, in many cases this conversion can't be done perfectly (or, in some cases, at all):

  • Case-insensitive matching won't always yield the same results. Implementing a workaround for this issue would be extremely hard, if not impossible.
  • The JS regex engine assigns numbers to capturing groups sequentially (regardless of the group being named or not named) but .NET uses a different, weird approach: "Captures that use parentheses are numbered automatically from left to right based on the order of the opening parentheses in the regular expression, starting from 1. However, named capture groups are always ordered last, after non-named capture groups." Without some adjustments, this would totally mess up numbered backreferences and replace pattern references. So, as a workaround, the converter wraps all named capturing groups in a non-named capturing group to force .NET to include all the original capturing groups in the resulting match in the expected order. (Of course, this won't prevent named groups from being listed after the numbered ones.) If needed, the original number of groups can be obtained from the returned RegExpParseResult object's ActualRegexGroupCount property.
  • The characters allowed in group names differ between the two regex engines. For example, the group name $group is valid in JS but invalid in .NET. So, as a workaround, the converter encodes the problematic group names to names that are valid in .NET and probably won't collide with other group names present in the pattern. For example, $group is encoded like __utf8_2467726F7570. The original group names can be obtained using the returned RegExpParseResult object's GetRegexGroupName method.
  • Self-referencing capturing groups like /((a+)(\1) ?)+/ may not produce the exact same captures. RegexOptions.ECMAScript is supposed to cover this; however, even the MSDN example doesn't produce the same matches. (As a side note, RegexOptions.ECMAScript is kinda a false promise: it can't even get some basic cases right by itself.)
  • Similarly, repeated nested groups like /((a+)?(b+)?(c))*/ may produce different captures for the groups. (JS has an overwrite behavior while .NET doesn't.)
  • .NET treats forward references like \1(\w) differently than JS, and it's not possible to convert this kind of pattern reliably. (The converter could make some patterns work by rewriting them to something like (?:)(\w) but there are cases where even this wouldn't work.)
  • Unicode mode issues:
    • There could be false positive empty string matches in the middle of surrogate pairs. Patterns as simple as /a?/u will cause this issue when the input string contains astral Unicode chars. There is no out-of-the-box workaround for this issue but it can be mitigated somewhat using a bit of "post-processing", i.e., by filtering out the false positive matches after evaluation like it's done here. Probably there is no way to improve this situation until .NET adds the option to treat the input string as Unicode code points.
    • Support for Unicode property escapes is pretty limited (see explanation). Currently, only General Category expressions are converted. But even this is not perfect as the result will depend on the Unicode version included in the specific .NET runtime which is executing the parser's code.
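The "post-processing" idea from the Unicode mode item above can be illustrated from the JavaScript side. The following TypeScript snippet is only a rough sketch, not the project's own filtering code: the helper names (splitsSurrogatePair, matchSpans) are made up for this illustration, and a JS regex without the u flag is used merely as a stand-in for an engine that matches on UTF-16 code units.

```typescript
interface MatchSpan { index: number; length: number; }

// True when `index` falls between the two halves of a surrogate pair.
function splitsSurrogatePair(input: string, index: number): boolean {
  if (index <= 0 || index >= input.length) return false;
  const before = input.charCodeAt(index - 1);
  const after = input.charCodeAt(index);
  return before >= 0xd800 && before <= 0xdbff && after >= 0xdc00 && after <= 0xdfff;
}

// Collects the position and length of every match of a global regex.
// String.prototype.replace takes care of advancing past empty matches.
function matchSpans(re: RegExp, input: string): MatchSpan[] {
  const spans: MatchSpan[] = [];
  input.replace(re, (match: string, offset: number) => {
    spans.push({ index: offset, length: match.length });
    return match;
  });
  return spans;
}

const astralInput = "\ud83d\ude00"; // U+1F600: one code point, two UTF-16 code units

// Code-unit-based matching (no `u` flag): stand-in for an engine that
// cannot see code points. It reports an empty match inside the pair.
const codeUnitSpans = matchSpans(new RegExp("a?", "g"), astralInput);

// Code-point-based matching (`u` flag): the behavior to reproduce.
const codePointSpans = matchSpans(new RegExp("a?", "gu"), astralInput);

// Post-processing: drop empty matches that start inside a surrogate pair.
const filteredSpans = codeUnitSpans.filter(
  (s) => !(s.length === 0 && splitsSurrogatePair(astralInput, s.index))
);

console.log(codeUnitSpans.map((s) => s.index));  // [0, 1, 2]
console.log(codePointSpans.map((s) => s.index)); // [0, 2]
console.log(filteredSpans.map((s) => s.index));  // [0, 2]
```

The filter only removes empty matches. Non-empty matches that begin or end inside a surrogate pair would need separate handling, which this sketch does not attempt.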

To sum it up, legacy pattern conversion is pretty solid apart from the minor issues listed above. However, support for Unicode mode (flag u) patterns is partial and quirky, while conversion of the upcoming Unicode sets mode (flag v) will be next to impossible - until the .NET regex engine gets some improved Unicode support.
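For reference, the JS-side behavior that the converter has to preserve for capturing groups can be checked with a few lines of TypeScript. This is a sketch under stated assumptions, not Acornima code: execGroups, utf8Bytes and encodeGroupName are hypothetical helpers, and the encoding scheme (the __utf8_ prefix followed by the uppercase hex of the name's UTF-8 bytes) is inferred solely from the $group example above. The supported way to recover original names remains RegExpParseResult.GetRegexGroupName.

```typescript
// Runs a pattern once and returns the match plus all numbered captures.
function execGroups(pattern: string, input: string): (string | undefined)[] {
  const m = new RegExp(pattern).exec(input);
  if (m === null) throw new Error("pattern did not match: " + pattern);
  return Array.prototype.slice.call(m);
}

// 1. JS numbers groups by opening parenthesis, named or not:
//    the named group is \1 and the unnamed one is \2.
const numbered = execGroups("(?<year>\\d{4})-(\\d{2})", "2024-03");

// 2. Captures inside a repeated group are reset on every iteration.
//    The last iteration matches "ac", so group 3 (b+) ends up undefined.
const repeated = execGroups("((a+)?(b+)?(c))*", "aacbbbcac");

// 3. Assumed name encoding: "__utf8_" + uppercase hex of the UTF-8 bytes.
function utf8Bytes(text: string): number[] {
  const bytes: number[] = [];
  for (let i = 0; i < text.length; i++) {
    let cp = text.charCodeAt(i);
    // Combine a surrogate pair into a single code point.
    if (cp >= 0xd800 && cp <= 0xdbff && i + 1 < text.length) {
      const low = text.charCodeAt(i + 1);
      if (low >= 0xdc00 && low <= 0xdfff) {
        cp = 0x10000 + ((cp - 0xd800) << 10) + (low - 0xdc00);
        i++;
      }
    }
    if (cp < 0x80) bytes.push(cp);
    else if (cp < 0x800) bytes.push(0xc0 | (cp >> 6), 0x80 | (cp & 0x3f));
    else if (cp < 0x10000)
      bytes.push(0xe0 | (cp >> 12), 0x80 | ((cp >> 6) & 0x3f), 0x80 | (cp & 0x3f));
    else
      bytes.push(0xf0 | (cp >> 18), 0x80 | ((cp >> 12) & 0x3f),
                 0x80 | ((cp >> 6) & 0x3f), 0x80 | (cp & 0x3f));
  }
  return bytes;
}

function encodeGroupName(name: string): string {
  const hex = utf8Bytes(name)
    .map((b) => (b < 16 ? "0" : "") + b.toString(16).toUpperCase())
    .join("");
  return "__utf8_" + hex;
}

console.log(numbered[1], numbered[2]);  // 2024 03
console.log(repeated);                  // ["aacbbbcac", "ac", "a", undefined, "c"]
console.log(encodeGroupName("$group")); // __utf8_2467726F7570
```

Cases 1 and 2 show plain JavaScript semantics, which is why numbered backreferences and the captures of repeated groups are the sensitive spots when a pattern is handed to an engine with different rules.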

What's missing currently:

  • Implementation of some experimental features (decorators, import attributes)
  • Moving messages into resources and replacing acorn messages with V8 messages
  • AST to JSON conversion
  • AST to JS conversion
  • Support for JSX
  • Porting additional tests from Esprima.NET
  • Porting additional tests from acornjs
  • CI
  • Docs
Product Compatible and additional computed target framework versions.
.NET net5.0 was computed.  net5.0-windows was computed.  net6.0 is compatible.  net6.0-android was computed.  net6.0-ios was computed.  net6.0-maccatalyst was computed.  net6.0-macos was computed.  net6.0-tvos was computed.  net6.0-windows was computed.  net7.0 was computed.  net7.0-android was computed.  net7.0-ios was computed.  net7.0-maccatalyst was computed.  net7.0-macos was computed.  net7.0-tvos was computed.  net7.0-windows was computed.  net8.0 was computed.  net8.0-android was computed.  net8.0-browser was computed.  net8.0-ios was computed.  net8.0-maccatalyst was computed.  net8.0-macos was computed.  net8.0-tvos was computed.  net8.0-windows was computed. 
.NET Core netcoreapp2.0 was computed.  netcoreapp2.1 was computed.  netcoreapp2.2 was computed.  netcoreapp3.0 was computed.  netcoreapp3.1 was computed. 
.NET Standard netstandard2.0 is compatible.  netstandard2.1 is compatible. 
.NET Framework net461 was computed.  net462 is compatible.  net463 was computed.  net47 was computed.  net471 was computed.  net472 was computed.  net48 was computed.  net481 was computed. 
MonoAndroid monoandroid was computed. 
MonoMac monomac was computed. 
MonoTouch monotouch was computed. 
Tizen tizen40 was computed.  tizen60 was computed. 
Xamarin.iOS xamarinios was computed. 
Xamarin.Mac xamarinmac was computed. 
Xamarin.TVOS xamarintvos was computed. 
Xamarin.WatchOS xamarinwatchos was computed. 

NuGet packages (2)

Showing the top 2 NuGet packages that depend on Acornima:

Package Downloads
Karambolo.AspNetCore.Bundling.EcmaScript

ES6 (ECMAScript 2015) module bundling features for the Karambolo.AspNetCore.Bundling library.

Acornima.Extras

Additional features and utilities for the Acornima package.

GitHub repositories

This package is not used by any popular GitHub repositories.

Version Downloads Last updated
1.0.0 2,704 4/7/2024
0.9.3 138 4/3/2024
0.9.2 145 3/31/2024
0.9.1 158 3/29/2024
0.9.0 2,436 3/26/2024