FuzzySharp 1.0.6
See the version list below for details.
dotnet add package FuzzySharp --version 1.0.6
NuGet\Install-Package FuzzySharp -Version 1.0.6
<PackageReference Include="FuzzySharp" Version="1.0.6" />
paket add FuzzySharp --version 1.0.6
#r "nuget: FuzzySharp, 1.0.6"
// Install FuzzySharp as a Cake Addin #addin nuget:?package=FuzzySharp&version=1.0.6 // Install FuzzySharp as a Cake Tool #tool nuget:?package=FuzzySharp&version=1.0.6
FuzzySharp
C# .NET fuzzy string matching implementation of Seat Geek's well known python FuzzyWuzzy algorithm.
Usage
Install-Package FuzzySharp
Simple Ratio
Fuzz.Ratio("mysmilarstring","myawfullysimilarstirng")
72
Fuzz.Ratio("mysmilarstring","mysimilarstring")
97
Partial Ratio
Fuzz.PartialRatio("similar", "somewhresimlrbetweenthisstring")
71
Token Sort Ratio
Fuzz.TokenSortRatio("order words out of"," words out of order")
100
Fuzz.PartialTokenSortRatio("order words out of"," words out of order")
100
Token Set Ratio
Fuzz.TokenSetRatio("fuzzy was a bear", "fuzzy fuzzy fuzzy bear")
100
Fuzz.PartialTokenSetRatio("fuzzy was a bear", "fuzzy fuzzy fuzzy bear")
100
Token Initialism Ratio
Fuzz.TokenInitialismRatio("NASA", "National Aeronautics and Space Administration");
89
Fuzz.TokenInitialismRatio("NASA", "National Aeronautics Space Administration");
100
Fuzz.TokenInitialismRatio("NASA", "National Aeronautics Space Administration, Kennedy Space Center, Cape Canaveral, Florida 32899");
53
Fuzz.PartialTokenInitialismRatio("NASA", "National Aeronautics Space Administration, Kennedy Space Center, Cape Canaveral, Florida 32899");
100
Token Abbreviation Ratio
Fuzz.TokenAbbreviationRatio("bl 420", "Baseline section 420", PreprocessMode.Full);
40
Fuzz.PartialTokenAbbreviationRatio("bl 420", "Baseline section 420", PreprocessMode.Full);
50
Weighted Ratio
Fuzz.WeightedRatio("The quick brown fox jimps ofver the small lazy dog", "the quick brown fox jumps over the small lazy dog")
95
Process
Process.ExtractOne("cowboys", new[] { "Atlanta Falcons", "New York Jets", "New York Giants", "Dallas Cowboys"})
(string: Dallas Cowboys, score: 90, index: 3)
Process.ExtractTop("goolge", new[] { "google", "bing", "facebook", "linkedin", "twitter", "googleplus", "bingnews", "plexoogl" }, limit: 3);
[(string: google, score: 83, index: 0), (string: googleplus, score: 75, index: 5), (string: plexoogl, score: 43, index: 7)]
Process.ExtractAll("goolge", new [] {"google", "bing", "facebook", "linkedin", "twitter", "googleplus", "bingnews", "plexoogl" })
[(string: google, score: 83, index: 0), (string: bing, score: 22, index: 1), (string: facebook, score: 29, index: 2), (string: linkedin, score: 29, index: 3), (string: twitter, score: 15, index: 4), (string: googleplus, score: 75, index: 5), (string: bingnews, score: 29, index: 6), (string: plexoogl, score: 43, index: 7)]
// score cutoff
Process.ExtractAll("goolge", new[] { "google", "bing", "facebook", "linkedin", "twitter", "googleplus", "bingnews", "plexoogl" }, cutoff: 40)
[(string: google, score: 83, index: 0), (string: googleplus, score: 75, index: 5), (string: plexoogl, score: 43, index: 7)]
Process.ExtractSorted("goolge", new [] {"google", "bing", "facebook", "linkedin", "twitter", "googleplus", "bingnews", "plexoogl" })
[(string: google, score: 83, index: 0), (string: googleplus, score: 75, index: 5), (string: plexoogl, score: 43, index: 7), (string: facebook, score: 29, index: 2), (string: linkedin, score: 29, index: 3), (string: bingnews, score: 29, index: 6), (string: bing, score: 22, index: 1), (string: twitter, score: 15, index: 4)]
Extraction will use WeightedRatio
and full process
by default. Override these in the method parameters to use different scorers and processing.
Here we use the Fuzz.Ratio scorer and keep the strings as is, instead of Full Process (which will .ToLowercase() before comparing)
Process.ExtractOne("cowboys", new[] { "Atlanta Falcons", "New York Jets", "New York Giants", "Dallas Cowboys" }, s => s, ScorerCache.Get<DefaultRatioScorer>());
(string: Dallas Cowboys, score: 57, index: 3)
Extraction can operate on objects of similar type. Use the "process" parameter to reduce the object to the string which it should be compared on. In the following example, the object is an array that contains the matchup, the arena, the date, and the time. We are matching on the first (0 index) parameter, the matchup.
var events = new[]
{
new[] { "chicago cubs vs new york mets", "CitiField", "2011-05-11", "8pm" },
new[] { "new york yankees vs boston red sox", "Fenway Park", "2011-05-11", "8pm" },
new[] { "atlanta braves vs pittsburgh pirates", "PNC Park", "2011-05-11", "8pm" },
};
var query = new[] { "new york mets vs chicago cubs", "CitiField", "2017-03-19", "8pm" };
var best = Process.ExtractOne(query, events, strings => strings[0]);
best: (value: { "chicago cubs vs new york mets", "CitiField", "2011-05-11", "8pm" }, score: 95, index: 0)
FuzzySharp in Different Languages
FuzzySharp was written with English in mind, and as such the Default string preprocessor only looks at English alphanumeric characters in the input strings, and will strip all others out. However, the Extract
methods in the Process
class do provide the option to specify your own string preprocessor. If this parameter is omitted, the Default will be used. However if you provide your own, the provided one will be used, so you are free to provide your own criteria for whatever character set you want to admit. For instance, using the parameter (s) => s
will prevent the string from being altered at all before being run through the similarity algorithms.
E.g.,
var query = "strng";
var choices = new [] { "stríng", "stráng", "stréng" };
var results = Process.ExtractAll(query, choices, (s) => s);
The above will run the similarity algorithm on all the choices without stripping out the accented characters.
Using Different Scorers
Scoring strategies are stateless, and as such should be static. However, in order to get them to share all the code they have in common via inheritance, making them static was not possible. Currently one way around having to new up an instance everytime you want to use one is to use the cache. This will ensure only one instance of each scorer ever exists.
var ratio = ScorerCache.Get<DefaultRatioScorer>();
var partialRatio = ScorerCache.Get<PartialRatioScorer>();
var tokenSet = ScorerCache.Get<TokenSetScorer>();
var partialTokenSet = ScorerCache.Get<PartialTokenSetScorer>();
var tokenSort = ScorerCache.Get<TokenSortScorer>();
var partialTokenSort = ScorerCache.Get<PartialTokenSortScorer>();
var tokenAbbreviation = ScorerCache.Get<TokenAbbreviationScorer>();
var partialTokenAbbreviation = ScorerCache.Get<PartialTokenAbbreviationScorer>();
var weighted = ScorerCache.Get<WeightedRatioScorer>();
Credits
- SeatGeek
- Adam Cohen
- David Necas (python-Levenshtein)
- Mikko Ohtamaa (python-Levenshtein)
- Antti Haapala (python-Levenshtein)
- Panayiotis (Java implementation I heavily borrowed from)
Product | Versions Compatible and additional computed target framework versions. |
---|---|
.NET | net5.0 was computed. net5.0-windows was computed. net6.0 was computed. net6.0-android was computed. net6.0-ios was computed. net6.0-maccatalyst was computed. net6.0-macos was computed. net6.0-tvos was computed. net6.0-windows was computed. net7.0 was computed. net7.0-android was computed. net7.0-ios was computed. net7.0-maccatalyst was computed. net7.0-macos was computed. net7.0-tvos was computed. net7.0-windows was computed. net8.0 was computed. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. |
.NET Core | netcoreapp1.0 was computed. netcoreapp1.1 was computed. netcoreapp2.0 is compatible. netcoreapp2.1 is compatible. netcoreapp2.2 was computed. netcoreapp3.0 was computed. netcoreapp3.1 was computed. |
.NET Standard | netstandard1.6 is compatible. netstandard2.0 is compatible. netstandard2.1 is compatible. |
.NET Framework | net45 is compatible. net451 was computed. net452 was computed. net46 is compatible. net461 is compatible. net462 was computed. net463 was computed. net47 is compatible. net471 is compatible. net472 is compatible. net48 was computed. net481 was computed. |
MonoAndroid | monoandroid was computed. |
MonoMac | monomac was computed. |
MonoTouch | monotouch was computed. |
Tizen | tizen30 was computed. tizen40 was computed. tizen60 was computed. |
Xamarin.iOS | xamarinios was computed. |
Xamarin.Mac | xamarinmac was computed. |
Xamarin.TVOS | xamarintvos was computed. |
Xamarin.WatchOS | xamarinwatchos was computed. |
-
.NETCoreApp 2.0
- No dependencies.
-
.NETCoreApp 2.1
- No dependencies.
-
.NETFramework 4.5
- No dependencies.
-
.NETFramework 4.6
- No dependencies.
-
.NETFramework 4.6.1
- No dependencies.
-
.NETFramework 4.7
- No dependencies.
-
.NETFramework 4.7.1
- No dependencies.
-
.NETFramework 4.7.2
- No dependencies.
-
.NETStandard 1.6
- NETStandard.Library (>= 1.6.1)
-
.NETStandard 2.0
- No dependencies.
-
.NETStandard 2.1
- No dependencies.
NuGet packages (20)
Showing the top 5 NuGet packages that depend on FuzzySharp:
Package | Downloads |
---|---|
Remora.Discord.Commands
Glue code for using Remora.Commands with Remora.Discord |
|
BaDaBoomShop
Webshop back-end framework based on the IAM stack |
|
FemDesign.Core
The FEM-Design API package |
|
SoftwaredeveloperDotAt.Infrastructure.Core
Library for base .NET Classes |
|
cbim.mango.server.framework
BIM-STAR平台服务器端主程序所需库 |
GitHub repositories (21)
Showing the top 5 popular GitHub repositories that depend on FuzzySharp:
Repository | Stars |
---|---|
DevToys-app/DevToys
A Swiss Army knife for developers.
|
|
MudBlazor/MudBlazor
Blazor Component Library based on Material design with an emphasis on ease of use. Mainly written in C# with Javascript kept to a bare minimum it empowers .NET developers to easily debug it if needed.
|
|
LykosAI/StabilityMatrix
Multi-Platform Package Manager for Stable Diffusion
|
|
MCCTeam/Minecraft-Console-Client
Lightweight console for Minecraft chat and automated scripts
|
|
SteamAutoCracks/Steam-auto-crack
Steam Game Automatic Cracker
|
Include symbols in package. Support .Net Standard 2.1. Fix bug where empty strings throws error with Abbreviation ratio