FilePrepper 0.4.0
FilePrepper
A powerful .NET CLI tool for data preprocessing without coding. Perfect for ML data preparation, ETL pipelines, and data analysis workflows.
🚀 Quick Start
Installation
# Install as global .NET tool
dotnet tool install -g fileprepper-cli
# Verify installation
fileprepper --version
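If `fileprepper --version` reports "command not found" right after installation, the usual cause is that the .NET global tools directory (by default `~/.dotnet/tools` on Linux and macOS) is not on `PATH`. The helper below is a minimal sketch of a pre-flight check for scripts; the function name `require_fileprepper` is made up for this example and is not part of FilePrepper.

```shell
# require_fileprepper
# Returns 0 if a `fileprepper` command is available; otherwise prints an
# install hint to stderr and returns 1. (Hypothetical helper, not shipped
# with FilePrepper.)
require_fileprepper() {
  if command -v fileprepper >/dev/null 2>&1; then
    return 0
  fi
  echo "fileprepper not found on PATH." >&2
  echo "Install it with: dotnet tool install -g fileprepper-cli" >&2
  echo "and make sure the .NET global tools directory is on PATH." >&2
  return 1
}
```

A script can then start with `require_fileprepper || exit 1` before calling any command.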
Basic Usage
# Normalize numeric columns
fileprepper normalize-data --input data.csv --output normalized.csv \
--columns "Age,Salary,Score" --method MinMax
# Fill missing values
fileprepper fill-missing-values --input data.csv --output filled.csv \
--columns "Age,Salary" --method Mean
# Filter rows
fileprepper filter-rows --input sales.csv --output filtered.csv \
--conditions "Revenue:GreaterThan:1000,Region:Equals:North"
# Convert file formats
fileprepper file-format-convert --input data.csv --output data.json \
--format JSON
# Get help
fileprepper --help
fileprepper <command> --help
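For intuition about what `--method MinMax` computes, here is a small awk sketch of the standard min-max formula, `(x - min) / (max - min)`, applied to one column. It is an illustration only, not FilePrepper's implementation: the function name `minmax_column` is invented here, and the sketch assumes a header row, comma separators, no quoted fields, and no missing values. How FilePrepper handles those edge cases may differ.

```shell
# minmax_column FILE COLUMN_NUMBER
# Prints FILE with the given numeric column rescaled to [0, 1].
# A constant column (max == min) is mapped to 0 to avoid division by zero.
minmax_column() {
  awk -F, -v OFS=, -v col="$2" '
    NR == FNR {                       # first pass: find min and max
      if (FNR > 1) {
        v = $col + 0
        if (!seen || v < min) min = v
        if (!seen || v > max) max = v
        seen = 1
      }
      next
    }
    FNR == 1 { print; next }          # second pass: copy the header as-is
    {
      range = max - min
      $col = (range == 0) ? 0 : ($col - min) / range
      print
    }
  ' "$1" "$1"                         # the file is read twice
}
```

For example, `minmax_column data.csv 2` rescales the second column and leaves the others untouched.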
📦 Supported Formats
Process data in multiple formats:
- CSV (Comma-Separated Values)
- TSV (Tab-Separated Values)
- JSON (JavaScript Object Notation)
- XML (Extensible Markup Language)
- Excel (XLSX/XLS files)
🛠️ Available Commands (20+)
Data Transformation
- normalize-data - Normalize columns (MinMax, ZScore)
- scale-data - Scale numeric data (StandardScaler, MinMaxScaler, RobustScaler)
- one-hot-encoding - Convert categorical to binary columns
- data-type-convert - Convert column data types
- date-extraction - Extract date features (Year, Month, Day, DayOfWeek)
Data Cleaning
- fill-missing-values - Fill missing data (Mean, Median, Mode, Forward, Backward, Constant)
- drop-duplicates - Remove duplicate rows
- value-replace - Replace values in columns
Column Operations
- add-columns - Add new calculated columns
- remove-columns - Delete unwanted columns
- rename-columns - Rename column headers
- reorder-columns - Change column order
- column-interaction - Create interaction features
Data Analysis
- basic-statistics - Calculate statistics (Mean, Median, StdDev, ZScore)
- aggregate - Group and aggregate data
- filter-rows - Filter rows by conditions
Data Organization
- merge - Combine multiple files (Horizontal/Vertical merge)
- data-sampling - Sample rows (Random, Stratified, Systematic)
- file-format-convert - Convert between formats
Feature Engineering
- create-lag-features - Create time-series lag features
💡 Common Use Cases
Data Cleaning Pipeline
# 1. Remove unnecessary columns
fileprepper remove-columns --input raw.csv --output step1.csv \
--columns "Debug,TempCol,Notes"
# 2. Drop duplicates
fileprepper drop-duplicates --input step1.csv --output step2.csv \
--columns "Email" --keep First
# 3. Fill missing values
fileprepper fill-missing-values --input step2.csv --output step3.csv \
--columns "Age,Salary" --method Mean
# 4. Normalize numeric columns
fileprepper normalize-data --input step3.csv --output clean.csv \
--columns "Age,Salary,Score" --method MinMax
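The four steps above can be wrapped in one shell function so that a failed step stops the run and the intermediate files do not pile up. This is a sketch under two assumptions that the README does not confirm: that `fileprepper` exits with a non-zero status when a command fails, and that `mktemp -d` is available. The function name `clean_pipeline` is made up for this example; the commands and flags are the ones shown above.

```shell
# clean_pipeline INPUT OUTPUT
# Runs the cleaning steps in order. Intermediate files live in a temporary
# directory that is removed when the function finishes, whether or not a
# step failed. The body is a subshell, so the EXIT trap stays local to it.
clean_pipeline() (
  input=$1
  output=$2
  tmp=$(mktemp -d) || exit 1
  trap 'rm -rf "$tmp"' EXIT

  # Each step runs only if the previous one succeeded.
  fileprepper remove-columns --input "$input" --output "$tmp/step1.csv" \
    --columns "Debug,TempCol,Notes" &&
  fileprepper drop-duplicates --input "$tmp/step1.csv" --output "$tmp/step2.csv" \
    --columns "Email" --keep First &&
  fileprepper fill-missing-values --input "$tmp/step2.csv" --output "$tmp/step3.csv" \
    --columns "Age,Salary" --method Mean &&
  fileprepper normalize-data --input "$tmp/step3.csv" --output "$output" \
    --columns "Age,Salary,Score" --method MinMax
)
```

Usage: `clean_pipeline raw.csv clean.csv`. The return status is that of the first failing step, or 0 if all four succeed.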
ML Feature Engineering
# Extract date features
fileprepper date-extraction --input orders.csv --output features1.csv \
--columns "OrderDate" --features Year,Month,DayOfWeek
# Create lag features for time series
fileprepper create-lag-features --input sales.csv --output features2.csv \
--group-by ProductID --lag-columns Revenue \
--periods 1,2,3,7 --sort-by Date
# One-hot encode categorical variables
fileprepper one-hot-encoding --input features2.csv --output features3.csv \
--columns "Category,Region"
# Create interaction features
fileprepper column-interaction --input features3.csv --output final.csv \
--column-pairs "Price*Quantity,Age*Income"
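To make the idea of a lag feature concrete: a lag-1 feature for `Revenue` grouped by `ProductID` is the previous row's `Revenue` for the same product. The awk sketch below illustrates only that concept and is not how FilePrepper implements `create-lag-features`. The function name `lag1_by_group`, the fixed three-column layout, the output column name `Revenue_lag1`, and the choice to leave the first row of each group empty are all assumptions made for this example.

```shell
# lag1_by_group FILE
# Expects a CSV with the header "ProductID,Date,Revenue", already sorted by
# Date. Appends a Revenue_lag1 column holding the previous Revenue seen for
# the same ProductID, or an empty field when there is no earlier row.
lag1_by_group() {
  awk -F, -v OFS=, '
    NR == 1 { print $0, "Revenue_lag1"; next }
    {
      lag = ($1 in prev) ? prev[$1] : ""   # last value seen for this product
      print $0, lag
      prev[$1] = $3                        # remember it for the next row
    }
  ' "$1"
}
```

Longer lags (such as the periods 2, 3, and 7 used above) follow the same pattern with a per-group history instead of a single remembered value.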
Format Conversion
# CSV to JSON
fileprepper file-format-convert --input data.csv --output data.json --format JSON
# Excel to CSV
fileprepper file-format-convert --input report.xlsx --output report.csv --format CSV
# CSV to XML
fileprepper file-format-convert --input data.csv --output data.xml --format XML
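Because each call converts one file, converting a whole directory takes a loop. The sketch below applies the CSV to JSON command from above to every `.csv` file in a directory. The function name `convert_csv_dir` is made up for this example, and it assumes that `fileprepper` returns a non-zero exit status on failure, which the README does not state.

```shell
# convert_csv_dir DIR
# Converts every *.csv file in DIR to a .json file with the same base name.
# Stops at the first failed conversion and returns 1.
convert_csv_dir() {
  for src in "$1"/*.csv; do
    [ -e "$src" ] || continue          # no matches: the glob stays literal
    fileprepper file-format-convert --input "$src" \
      --output "${src%.csv}.json" --format JSON || return 1
  done
}
```

Usage: `convert_csv_dir ./exports`. The quoting keeps file names with spaces intact.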
Data Analysis
# Calculate statistics
fileprepper basic-statistics --input data.csv --output stats.csv \
--columns "Age,Salary,Score" --statistics Mean,Median,StdDev,ZScore
# Aggregate by group
fileprepper aggregate --input sales.csv --output summary.csv \
--group-by "Region,Category" --agg-columns "Revenue:Sum,Quantity:Mean"
# Sample data
fileprepper data-sampling --input large.csv --output sample.csv \
--method Random --sample-size 1000
🔧 Programmatic Usage
FilePrepper can also be used as a .NET library:
dotnet add package FilePrepper
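using FilePrepper.Tasks.NormalizeData;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Logging.Abstractions;

// A no-op logger keeps the snippet self-contained (assumes the task accepts
// an ILogger<NormalizeDataTask>); in an application, pass your own logger.
var logger = NullLogger<NormalizeDataTask>.Instance;

// Rescale three numeric columns into the range [0, 1].
var options = new NormalizeDataOption
{
    InputPath = "data.csv",
    OutputPath = "normalized.csv",
    TargetColumns = new[] { "Age", "Salary", "Score" },
    Method = NormalizationMethod.MinMax,
    MinValue = 0,
    MaxValue = 1
};

var task = new NormalizeDataTask(logger);
var context = new TaskContext(options);
bool success = await task.ExecuteAsync(context);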
See API Reference for detailed programmatic usage.
📖 Documentation
- Quick Start Guide - Get started in 5 minutes
- CLI Guide - Complete command reference
- Common Scenarios - Real-world use cases
- API Reference - Programmatic usage
- Installation Guide - Detailed installation
For more documentation, see the docs/ directory.
🎯 Use Cases
- Machine Learning - Prepare datasets for training (normalization, encoding, feature engineering)
- Data Analysis - Clean and transform data for analysis
- ETL Pipelines - Extract, transform, and load data workflows
- Data Migration - Convert between formats and clean legacy data
- Automation - Script data processing without custom code
📋 Requirements
- .NET 9.0 or later
- Cross-platform - Windows, Linux, macOS
- No coding required - Command-line only (or use as library)
🤝 Contributing
Contributions are welcome! Please feel free to submit issues, feature requests, or pull requests.
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
🔗 Links
- NuGet Package: https://www.nuget.org/packages/fileprepper-cli
- GitHub Repository: https://github.com/iyulab/FilePrepper
- Issues: https://github.com/iyulab/FilePrepper/issues
- Documentation: docs/
- Changelog: CHANGELOG.md
Made with ❤️ by iyulab | ML Data Preprocessing Tool - No Coding Required
| Product | Versions Compatible and additional computed target framework versions. |
|---|---|
| .NET | net9.0 is compatible. net9.0-android was computed. net9.0-browser was computed. net9.0-ios was computed. net9.0-maccatalyst was computed. net9.0-macos was computed. net9.0-tvos was computed. net9.0-windows was computed. net10.0 was computed. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed. |
- net9.0
  - CsvHelper (>= 33.1.0)
  - EPPlus (>= 8.2.1)
  - ExcelDataReader (>= 3.8.0)
  - ExcelDataReader.DataSet (>= 3.8.0)
  - Microsoft.Extensions.DependencyInjection.Abstractions (>= 9.0.10)
  - Microsoft.Extensions.Logging.Abstractions (>= 9.0.10)
  - Microsoft.Extensions.Options (>= 9.0.10)
  - Scrutor (>= 6.1.0)