FilePrepper 0.4.0


FilePrepper


A .NET CLI tool for preprocessing data files without writing code. It is aimed at ML data preparation, ETL pipelines, and data analysis workflows.

🚀 Quick Start

Installation

# Install as global .NET tool
dotnet tool install -g fileprepper-cli

# Verify installation
fileprepper --version

Basic Usage

# Normalize numeric columns
fileprepper normalize-data --input data.csv --output normalized.csv \
  --columns "Age,Salary,Score" --method MinMax

# Fill missing values
fileprepper fill-missing-values --input data.csv --output filled.csv \
  --columns "Age,Salary" --method Mean

# Filter rows
fileprepper filter-rows --input sales.csv --output filtered.csv \
  --conditions "Revenue:GreaterThan:1000,Region:Equals:North"

# Convert file formats
fileprepper file-format-convert --input data.csv --output data.json \
  --format JSON

# Get help
fileprepper --help
fileprepper <command> --help
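To make the MinMax method concrete: min-max normalization conventionally rescales each value x to (x - min) / (max - min), so the smallest value in a column becomes 0 and the largest becomes 1. The sketch below computes that for one column with awk so the output of normalize-data can be spot-checked by hand. It is an independent illustration, not FilePrepper's implementation; the function name minmax_column is made up for this example, and it assumes a simple CSV with a header row, no quoted fields, and no missing values.

```shell
# minmax_column FILE COLUMN_NAME
# Prints the named column rescaled to [0, 1] using (x - min) / (max - min).
# Illustration only; assumes a plain comma-separated file with a header row.
minmax_column() {
  awk -F',' -v name="$2" '
    NR == 1 {
      # Locate the requested column in the header row.
      for (i = 1; i <= NF; i++) if ($i == name) col = i
      if (!col) { print "column not found: " name > "/dev/stderr"; exit 1 }
      next
    }
    {
      # Remember every value and track the running minimum and maximum.
      v[NR] = $col + 0
      if (NR == 2 || v[NR] < min) min = v[NR]
      if (NR == 2 || v[NR] > max) max = v[NR]
    }
    END {
      # A constant column has max == min; print 0 to avoid dividing by zero.
      for (r = 2; r <= NR; r++)
        print (max == min) ? 0 : (v[r] - min) / (max - min)
    }
  ' "$1"
}
```

For a column holding 20, 30, and 40, `minmax_column data.csv Age` prints 0, 0.5, and 1 on separate lines.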

📦 Supported Formats

Process data in multiple formats:

  • CSV (Comma-Separated Values)
  • TSV (Tab-Separated Values)
  • JSON (JavaScript Object Notation)
  • XML (Extensible Markup Language)
  • Excel (XLSX/XLS files)

🛠️ Available Commands (20+)

Data Transformation

  • normalize-data - Normalize columns (MinMax, ZScore)
  • scale-data - Scale numeric data (StandardScaler, MinMaxScaler, RobustScaler)
  • one-hot-encoding - Convert categorical to binary columns
  • data-type-convert - Convert column data types
  • date-extraction - Extract date features (Year, Month, Day, DayOfWeek)

Data Cleaning

  • fill-missing-values - Fill missing data (Mean, Median, Mode, Forward, Backward, Constant)
  • drop-duplicates - Remove duplicate rows
  • value-replace - Replace values in columns

Column Operations

  • add-columns - Add new calculated columns
  • remove-columns - Delete unwanted columns
  • rename-columns - Rename column headers
  • reorder-columns - Change column order
  • column-interaction - Create interaction features

Data Analysis

  • basic-statistics - Calculate statistics (Mean, Median, StdDev, ZScore)
  • aggregate - Group and aggregate data
  • filter-rows - Filter rows by conditions

Data Organization

  • merge - Combine multiple files (Horizontal/Vertical merge)
  • data-sampling - Sample rows (Random, Stratified, Systematic)
  • file-format-convert - Convert between formats

Feature Engineering

  • create-lag-features - Create time-series lag features

💡 Common Use Cases

Data Cleaning Pipeline

# 1. Remove unnecessary columns
fileprepper remove-columns --input raw.csv --output step1.csv \
  --columns "Debug,TempCol,Notes"

# 2. Drop duplicates
fileprepper drop-duplicates --input step1.csv --output step2.csv \
  --columns "Email" --keep First

# 3. Fill missing values
fileprepper fill-missing-values --input step2.csv --output step3.csv \
  --columns "Age,Salary" --method Mean

# 4. Normalize numeric columns
fileprepper normalize-data --input step3.csv --output clean.csv \
  --columns "Age,Salary,Score" --method MinMax
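The four steps above can be wrapped in one shell function so that a failure in any step stops the run and the intermediate files do not pile up in the working directory. This is a minimal sketch, assuming fileprepper exits with a non-zero status when a command fails (the README does not state this, so verify it before relying on it). The function name clean_csv is hypothetical; the commands and flags are the ones shown above.

```shell
# clean_csv RAW_CSV CLEAN_CSV
# Runs the four cleaning steps in order. The && chain stops at the first
# failing step, and the temporary directory is removed either way.
# Assumption: fileprepper returns a non-zero exit status on failure.
clean_csv() {
  local raw="$1" clean="$2" work status=0
  work="$(mktemp -d)" || return 1

  fileprepper remove-columns --input "$raw" --output "$work/step1.csv" \
      --columns "Debug,TempCol,Notes" &&
  fileprepper drop-duplicates --input "$work/step1.csv" --output "$work/step2.csv" \
      --columns "Email" --keep First &&
  fileprepper fill-missing-values --input "$work/step2.csv" --output "$work/step3.csv" \
      --columns "Age,Salary" --method Mean &&
  fileprepper normalize-data --input "$work/step3.csv" --output "$clean" \
      --columns "Age,Salary,Score" --method MinMax || status=$?

  # Clean up intermediate files, then report the status of the failed step (or 0).
  rm -rf "$work"
  return "$status"
}
```

Usage: `clean_csv raw.csv clean.csv && echo "pipeline finished"`.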

ML Feature Engineering

# Extract date features
fileprepper date-extraction --input orders.csv --output features1.csv \
  --columns "OrderDate" --features Year,Month,DayOfWeek

# Create lag features for time series
fileprepper create-lag-features --input sales.csv --output features2.csv \
  --group-by ProductID --lag-columns Revenue \
  --periods 1,2,3,7 --sort-by Date

# One-hot encode categorical variables
fileprepper one-hot-encoding --input features2.csv --output features3.csv \
  --columns "Category,Region"

# Create interaction features
fileprepper column-interaction --input features3.csv --output final.csv \
  --column-pairs "Price*Quantity,Age*Income"
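For readers new to lag features: a lag-1 feature copies the previous observation of a column, within the same group, onto the current row. The awk sketch below shows the idea on a three-column file. It is an illustration of the concept only, not how create-lag-features is implemented; the function name lag1_by_group and the output column name Revenue_lag1 are assumptions made for this example, and the tool's real output column names may differ. It assumes the header is exactly ProductID,Date,Revenue and that rows are already sorted by Date.

```shell
# lag1_by_group FILE
# Input:  CSV with header "ProductID,Date,Revenue", sorted by Date.
# Output: the same rows plus a Revenue_lag1 column holding the previous
#         Revenue seen for that ProductID (empty on a product's first row).
lag1_by_group() {
  awk -F',' -v OFS=',' '
    NR == 1 { print $0, "Revenue_lag1"; next }
    {
      # prev[] maps each ProductID to the last Revenue seen for it.
      print $0, (($1 in prev) ? prev[$1] : "")
      prev[$1] = $3
    }
  ' "$1"
}
```

Larger periods (2, 3, 7) follow the same pattern, looking further back within each group.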

Format Conversion

# CSV to JSON
fileprepper file-format-convert --input data.csv --output data.json --format JSON

# Excel to CSV
fileprepper file-format-convert --input report.xlsx --output report.csv --format CSV

# CSV to XML
fileprepper file-format-convert --input data.csv --output data.xml --format XML
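Each command above handles one file per invocation. To convert a whole directory, a shell loop can call file-format-convert once per file. The sketch below is one way to do that, using only the flags shown above; the function name convert_csv_dir_to_json is hypothetical, and stopping at the first failure assumes fileprepper exits with a non-zero status on error.

```shell
# convert_csv_dir_to_json SRC_DIR DEST_DIR
# Converts every *.csv file in SRC_DIR to a .json file of the same base
# name in DEST_DIR, stopping at the first conversion that fails.
convert_csv_dir_to_json() {
  local src="$1" dest="$2" file name
  mkdir -p "$dest" || return 1
  for file in "$src"/*.csv; do
    # If the glob matched nothing, the pattern is left unexpanded; skip it.
    [ -e "$file" ] || continue
    name="$(basename "$file" .csv)"
    fileprepper file-format-convert --input "$file" \
        --output "$dest/$name.json" --format JSON || return 1
  done
}
```

Usage: `convert_csv_dir_to_json ./exports ./json`.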

Data Analysis

# Calculate statistics
fileprepper basic-statistics --input data.csv --output stats.csv \
  --columns "Age,Salary,Score" --statistics Mean,Median,StdDev,ZScore

# Aggregate by group
fileprepper aggregate --input sales.csv --output summary.csv \
  --group-by "Region,Category" --agg-columns "Revenue:Sum,Quantity:Mean"

# Sample data
fileprepper data-sampling --input large.csv --output sample.csv \
  --method Random --sample-size 1000

🔧 Programmatic Usage

FilePrepper can also be used as a .NET library:

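dotnet add package FilePrepper

using FilePrepper.Tasks.NormalizeData;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Logging.Abstractions;

// Assumption: the task accepts any ILogger. NullLogger discards all log output;
// swap in a logger from your own LoggerFactory to see messages.
ILogger<NormalizeDataTask> logger = NullLogger<NormalizeDataTask>.Instance;

// Rescale the three target columns into the range [MinValue, MaxValue].
var options = new NormalizeDataOption
{
    InputPath = "data.csv",
    OutputPath = "normalized.csv",
    TargetColumns = new[] { "Age", "Salary", "Score" },
    Method = NormalizationMethod.MinMax,
    MinValue = 0,
    MaxValue = 1
};

// Run the task; ExecuteAsync returns true on success.
var task = new NormalizeDataTask(logger);
var context = new TaskContext(options);
bool success = await task.ExecuteAsync(context);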

See API Reference for detailed programmatic usage.

📖 Documentation

For more documentation, see the docs/ directory.

🎯 Use Cases

  • Machine Learning - Prepare datasets for training (normalization, encoding, feature engineering)
  • Data Analysis - Clean and transform data for analysis
  • ETL Pipelines - Extract, transform, and load data workflows
  • Data Migration - Convert between formats and clean legacy data
  • Automation - Script data processing without custom code

📋 Requirements

  • .NET 9.0 or later
  • Cross-platform - Windows, Linux, macOS
  • No coding required - Everything is available from the command line; the library is optional

🤝 Contributing

Contributions are welcome! Please feel free to submit issues, feature requests, or pull requests.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


Made with ❤️ by iyulab | ML Data Preprocessing Tool - No Coding Required

Version Downloads Last Updated
0.4.8 138 11/16/2025
0.4.7 247 11/14/2025
0.4.5 287 11/13/2025
0.4.3 263 11/10/2025
0.4.0 191 11/3/2025
0.2.3 190 11/3/2025
0.2.2 154 1/17/2025
0.2.1 132 1/16/2025
0.2.0 159 1/11/2025
0.1.1 167 12/16/2024
0.1.0 161 12/6/2024