CodeVectorsMCP
CodeVectorsMCP is an MCP (Model Context Protocol) server that provides context awareness for large software projects through text embeddings and vector search. It uses OpenAI embeddings and the Qdrant vector database to enable semantic search across codebases.
Features
- Semantic Code Search: Search your codebase using natural language queries
- MCP Protocol Support: Integrates with AI assistants that support the Model Context Protocol
- Real-time Updates: Watch for file changes and automatically update embeddings
- Multi-language Support: Process various programming languages and documentation formats
- Efficient Chunking: Smart file chunking that respects code boundaries
- Batch Processing: Efficient batch embedding generation
Installation
Install as .NET Tool (Recommended)
Install CodeVectors globally from NuGet:
dotnet tool install -g JustinL.MCP.CodeVectorsMCP
Then use it anywhere with:
codevectors --help
Quick Start
Prerequisites
- .NET 8.0 SDK
- Docker (for Qdrant)
- OpenAI API key
Setup
Initialize configuration
codevectors init
This creates a configuration file at ./.codevectors/config.json.
Set up Qdrant
cd docker
./setup-qdrant.sh
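If you prefer not to use the helper script, a roughly equivalent setup is to start Qdrant directly with Docker (standard Qdrant image and ports; 6334 is the gRPC port that the default ConnectionString targets):
# Run Qdrant, exposing the REST (6333) and gRPC (6334) ports
docker run -d --name qdrant \
  -p 6333:6333 -p 6334:6334 \
  -v "$(pwd)/qdrant_storage:/qdrant/storage" \
  qdrant/qdrant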
Configure OpenAI API key
export OPENAI_API_KEY=your-api-key-here
Quick Start Usage
Initial Setup (One-time per project)
# Navigate to your project directory
cd /path/to/your/project
# Initialize configuration
codevectors init
# Generate initial embeddings for the entire project
codevectors generate .
Normal AI Agent Usage
# Run with no parameters - this is the standard way AI agents use the tool
codevectors
This command:
- Runs in STDIO mode by default (for MCP protocol communication)
- Automatically detects and reindexes any changed files
- Keeps embeddings up to date without manual intervention
- Provides the search_codebase tool to AI agents (see the client-registration sketch below)
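How you register the server depends on your MCP client. As a sketch, a Claude-Desktop-style mcpServers entry might look like this (the file location and exact schema vary by client):
{
  "mcpServers": {
    "codevectors": {
      "command": "codevectors",
      "args": []
    }
  }
}
Whichever client you use, launch the server with your project root as its working directory, since it reads ./.codevectors/config.json from there.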
Additional Commands
# Generate embeddings for specific directories
codevectors generate . --include src/ docs/
# Generate embeddings for specific file types
codevectors generate . --include "*.cs" "*.md"
# Generate only documentation files
codevectors generate . --docs-only
# Query the vector store manually
codevectors query "How does authentication work?"
# Watch for file changes and auto-update embeddings
codevectors watch .
# Test all connections and services
codevectors test
codevectors test --verbose # Show detailed test output
Testing Connections
The test command helps diagnose connectivity issues with all configured services:
# Run connection tests
codevectors test
# Example output:
# Starting CodeVectors connection tests...
#
# ============================================================
# TEST RESULTS SUMMARY
# ============================================================
# ✓ Configuration: Loaded
# ✓ Embedding Service: Connected
# ✓ Vector Store (Qdrant): Connected
#
# Total test time: 523ms
#
# All tests passed!
With verbose output:
codevectors test --verbose
# Shows additional details like:
# - Configuration values
# - Service URLs
# - Embedding dimensions
# - Collection counts
The test command checks:
- Configuration: Validates that your config.json is properly loaded
- Embedding Service: Tests connectivity based on your configured mode:
  - Local: Validates the OpenAI API key and tests embedding generation
  - REST: Checks health endpoint connectivity
  - gRPC: Validates configuration (a full connectivity test requires a running server)
- Vector Store: Connects to Qdrant, creates a test collection, and performs CRUD operations
Troubleshooting: Reset Project Vectors
If embeddings seem inconsistent or you want to start fresh:
# Method 1: Delete the project's vectors (keeps other projects intact)
codevectors delete
# Method 2: Regenerate all embeddings with force flag
codevectors generate . --force
# Method 3: Manual cleanup (if using local Qdrant)
# Stop qdrant, delete the collection data, restart qdrant, then regenerate
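For the manual-cleanup path, Qdrant's REST API (port 6333 by default) can drop a collection directly. The collection names below are the defaults from config.json; note that this removes the entire collection, including any other projects that share it:
# Delete the default collections via Qdrant's REST API
curl -X DELETE http://localhost:6333/collections/code-vectors
curl -X DELETE http://localhost:6333/collections/documentation-vectors
# Then regenerate embeddings
codevectors generate . --force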
Configuration
Configuration Directory and File Location
CodeVectorsMCP stores its configuration in a .codevectors directory in your project's root. The configuration file is .codevectors/config.json.
Default location: When you run codevectors commands, the tool looks for:
your-project/
├── .codevectors/
│ └── config.json
├── src/
├── docs/
└── README.md
This design allows each project to have its own configuration, including different vector store settings, API keys, and project-specific options.
Creating Configuration
First, navigate to your project's root directory, then initialize the configuration:
cd /path/to/your/project
codevectors init
This creates .codevectors/config.json with default settings. You can also specify a custom location:
codevectors init --config /custom/path/config.json
How AI Agents Use CodeVectorsMCP
CodeVectorsMCP is designed to be used by AI agents through the Model Context Protocol (MCP). Here's the typical workflow:
- Project Setup: You set up CodeVectorsMCP once per project by running codevectors init in the project root
- Index Your Code: Generate embeddings with codevectors generate . --project myproject
- AI Agent Integration: Your AI agent runs in the same directory and uses the MCP protocol to search your indexed code
When the AI agent needs code context, it calls the search_codebase tool, which:
- Reads the .codevectors/config.json file in the current directory
- Connects to the configured vector store
- Searches for relevant code using the specified project name
- Returns contextual code snippets to help the AI understand your codebase
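On the wire this is a standard MCP tools/call request over STDIO. A sketch of what a client sends (the tool's exact argument names aren't documented here, so the "query" field is an assumption):
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "search_codebase",
    "arguments": { "query": "How does authentication work?" }
  }
}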
Project Names: Multi-Project Support
The --project parameter is crucial for organizing your code in the vector store:
Why Project Names Matter:
- Isolation: Each project's code is stored separately, preventing cross-contamination
- Multi-Project Support: A single Qdrant instance can serve multiple projects
- Targeted Search: AI agents only search within the specified project's code
- Organization: Clear separation between different codebases or versions
Example Setup:
# Project A
cd /path/to/project-a
codevectors init # Configure project name in config.json
codevectors generate .
# Project B
cd /path/to/project-b
codevectors init # Configure different project name in config.json
codevectors generate .
Both projects can use the same Qdrant vector store, but their code embeddings are kept separate and searchable independently.
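Concretely, the only required difference between the two projects' .codevectors/config.json files is the ProjectName; both can point at the same Qdrant instance:
/path/to/project-a/.codevectors/config.json:
{ "ProjectName": "project-a", "VectorStore": { "ConnectionString": "http://localhost:6334", ... }, ... }
/path/to/project-b/.codevectors/config.json:
{ "ProjectName": "project-b", "VectorStore": { "ConnectionString": "http://localhost:6334", ... }, ... }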
Configuration Examples
Local Configuration (Default Setup)
Example .codevectors/config.json generated by codevectors init:
{
"ProjectName": "your-project-name",
"EmbeddingsService": {
"Mode": "Local"
},
"Embeddings": {
"Model": "text-embedding-3-small",
"ApiKey": "${OPENAI_API_KEY}",
"ChunkSize": 1000,
"Overlap": 200,
"BatchSize": 100,
"MaxRetries": 3,
"BaseDelayMs": 1000,
"MaxTokens": 8191,
"RequestsPerMinute": 30,
"TokensPerMinute": 150000
},
"VectorStore": {
"Type": "qdrant",
"ConnectionString": "http://localhost:6334",
"CodeCollectionName": "code-vectors",
"DocumentationCollectionName": "documentation-vectors",
"VectorSize": 1536,
"DistanceMetric": "Cosine"
},
"FileWatcher": {
"Enabled": true,
"DebounceMs": 60000
},
"FilePatterns": {
"IncludePatterns": [
"**/*.cs", "**/*.ts", "**/*.js", "**/*.py", "**/*.java",
"**/*.cpp", "**/*.c", "**/*.h", "**/*.go", "**/*.rs",
"**/*.rb", "**/*.php", "**/*.swift", "**/*.kt"
],
"ExcludePatterns": [
"**/bin/**", "**/obj/**", "**/node_modules/**",
"**/.git/**", "**/dist/**", "**/build/**",
"**/packages/**", "**/vendor/**", "**/.vs/**",
"**/.idea/**", "**/target/**", "**/__pycache__/**"
]
}
}
Azure OpenAI Configuration
Example configuration using Azure OpenAI Service with API Key:
{
"ProjectName": "your-project-name",
"EmbeddingsService": {
"Mode": "Local"
},
"Embeddings": {
"ServiceType": "AzureOpenAI",
"ApiKey": "${AZURE_OPENAI_KEY}",
"Endpoint": "https://your-resource.openai.azure.com/",
"AzureDeploymentName": "text-embedding-3-small",
"Model": "text-embedding-3-small",
"ChunkSize": 1000,
"Overlap": 200,
"BatchSize": 100,
"MaxRetries": 3,
"BaseDelayMs": 1000,
"MaxTokens": 8191,
"RequestsPerMinute": 30,
"TokensPerMinute": 150000
},
"VectorStore": {
"Type": "qdrant",
"ConnectionString": "http://localhost:6334",
"CodeCollectionName": "code-vectors",
"DocumentationCollectionName": "documentation-vectors",
"VectorSize": 1536,
"DistanceMetric": "Cosine"
},
"FileWatcher": {
"Enabled": true,
"DebounceMs": 60000
},
"FilePatterns": {
"IncludePatterns": [
"**/*.cs", "**/*.ts", "**/*.js", "**/*.py"
],
"ExcludePatterns": [
"**/bin/**", "**/obj/**", "**/node_modules/**", "**/.git/**"
]
}
}
Example using Azure AD authentication (no API key required):
{
"ProjectName": "your-project-name",
"EmbeddingsService": {
"Mode": "Local"
},
"Embeddings": {
"ServiceType": "AzureOpenAI",
"UseAzureIdentity": true,
"Endpoint": "https://your-resource.openai.azure.com/",
"AzureDeploymentName": "text-embedding-3-small",
"Model": "text-embedding-3-small",
"ChunkSize": 1000,
"Overlap": 200
}
}
Notes for Azure OpenAI:
- ServiceType must be set to "AzureOpenAI"
- AzureDeploymentName is required and should match your deployment name in Azure
- Endpoint is required and should be your Azure OpenAI resource endpoint
- Use either ApiKey or set UseAzureIdentity to true for Azure AD authentication
- The Model field is still used for caching, but the deployment name is used for API calls
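When UseAzureIdentity is true, credentials presumably resolve through the standard Azure Identity chain (an assumption about the implementation, based on how Azure Identity-based .NET clients typically behave). Locally that usually means signing in first, or exporting service-principal variables:
# Interactive sign-in
az login
# Or use a service principal via the standard Azure Identity environment variables
export AZURE_TENANT_ID=your-tenant-id
export AZURE_CLIENT_ID=your-client-id
export AZURE_CLIENT_SECRET=your-client-secret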
REST API Embeddings Configuration
Example configuration using a remote REST embedding service:
{
"ProjectName": "your-project-name",
"EmbeddingsService": {
"Mode": "Rest",
"Url": "https://your-embedding-service.com"
},
"Embeddings": {
"ChunkSize": 1000,
"Overlap": 200,
"BatchSize": 100,
"MaxRetries": 3,
"BaseDelayMs": 1000
},
"VectorStore": {
"Type": "qdrant",
"ConnectionString": "http://localhost:6334",
"CodeCollectionName": "code-vectors",
"DocumentationCollectionName": "documentation-vectors",
"VectorSize": 1536,
"DistanceMetric": "Cosine"
},
"FileWatcher": {
"Enabled": true,
"DebounceMs": 60000
},
"FilePatterns": {
"IncludePatterns": [
"**/*.cs", "**/*.ts", "**/*.js", "**/*.py"
],
"ExcludePatterns": [
"**/bin/**", "**/obj/**", "**/node_modules/**", "**/.git/**"
]
}
}
gRPC Embeddings Configuration
Example configuration using a gRPC embedding service:
{
"ProjectName": "your-project-name",
"EmbeddingsService": {
"Mode": "Grpc",
"Url": "http://localhost:5001"
},
"Embeddings": {
"ChunkSize": 1000,
"Overlap": 200,
"BatchSize": 100,
"MaxRetries": 3,
"BaseDelayMs": 1000
},
"VectorStore": {
"Type": "qdrant",
"ConnectionString": "http://localhost:6334",
"CodeCollectionName": "code-vectors",
"DocumentationCollectionName": "documentation-vectors",
"VectorSize": 1536,
"DistanceMetric": "Cosine"
},
"FileWatcher": {
"Enabled": true,
"DebounceMs": 60000
},
"FilePatterns": {
"IncludePatterns": [
"**/*.cs", "**/*.ts", "**/*.js", "**/*.py"
],
"ExcludePatterns": [
"**/bin/**", "**/obj/**", "**/node_modules/**", "**/.git/**"
]
}
}
Environment Variable Configuration
For security, use environment variables for sensitive data:
{
"ProjectName": "${PROJECT_NAME}",
"EmbeddingsService": {
"Mode": "Local"
},
"Embeddings": {
"Model": "text-embedding-3-small",
"ApiKey": "${OPENAI_API_KEY}",
"ChunkSize": 1000
},
"VectorStore": {
"Type": "qdrant",
"ConnectionString": "${QDRANT_URL}",
"CodeCollectionName": "code-vectors",
"DocumentationCollectionName": "documentation-vectors"
}
}
Then set environment variables:
export PROJECT_NAME=my-project
export OPENAI_API_KEY=sk-your-actual-key-here
export QDRANT_URL=http://localhost:6334
Configuration Options Reference
- ProjectName: Project identifier used for vector store isolation
- EmbeddingsService:
  - Mode: "Local" (OpenAI API), "Rest" (REST API), or "Grpc" (gRPC service)
  - Url: Service URL for Rest/Grpc modes (e.g., "http://localhost:5000" for REST, "http://localhost:5001" for gRPC)
- Embeddings:
  - ServiceType: Type of embedding service when Mode is "Local" ("OpenAI" or "AzureOpenAI"; default: "OpenAI")
  - Model: OpenAI model name (only used in Local mode; default: text-embedding-3-small)
  - ApiKey: API key for the embedding service (only used in Local mode)
  - Endpoint: Optional custom endpoint URL (only used in Local mode; required for Azure OpenAI)
  - AzureDeploymentName: Azure OpenAI deployment name (required when ServiceType is "AzureOpenAI")
  - UseAzureIdentity: Use Azure AD authentication instead of an API key (Azure OpenAI only; default: false)
  - ChunkSize: Text chunk size for processing (default: 1000)
  - Overlap: Overlap between chunks in characters (default: 200)
  - BatchSize: Number of texts to process in one batch (default: 100)
  - MaxRetries: Maximum retry attempts for failed requests (default: 3)
  - BaseDelayMs: Base delay for exponential backoff (default: 1000)
  - MaxTokens: Maximum tokens per request for OpenAI (default: 8191)
  - RequestsPerMinute: Rate limit for requests (default: 30)
  - TokensPerMinute: Rate limit for tokens (default: 150000)
- VectorStore:
  - Type: Vector store type (default: "qdrant")
  - ConnectionString: Qdrant server URL (default: http://localhost:6334)
  - CodeCollectionName: Collection for code embeddings
  - DocumentationCollectionName: Collection for documentation
  - VectorSize: Embedding dimension (default: 1536)
  - DistanceMetric: Distance metric for similarity (default: "Cosine")
- FileWatcher:
  - Enabled: Enable file watching (default: true)
  - DebounceMs: Debounce delay for file changes in milliseconds (default: 60000)
- FilePatterns:
  - IncludePatterns: File patterns to include when indexing
  - ExcludePatterns: File patterns to exclude (node_modules, .git, etc.)
Using Custom Configuration Locations
For advanced scenarios, you can specify a different config file:
codevectors generate . --config /custom/path/config.json
For Developers
Building from Source
If you want to build and run CodeVectorsMCP from source:
Clone the repository
git clone https://github.com/jmlewis1/CodeVectorsMCP.git
cd CodeVectorsMCP
Build the project
dotnet build
Run commands using dotnet run
dotnet run --project src/JustinL.MCP.CodeVectors -- [command] [options]
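For example, to run the connection tests directly from source:
dotnet run --project src/JustinL.MCP.CodeVectors -- test --verbose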
Development and Debugging
HTTP Server Mode
For debugging or testing, you can run the HTTP server instead of STDIO mode:
# Run HTTP server for easier debugging and testing
codevectors --http
The HTTP server provides:
- Test UI at http://localhost:5000/test.html
- REST API endpoints for manual testing
- Easier debugging since logs go to console instead of files
REST API Server with Test UI
You can also run the standalone REST API server:
# Default port 5000
dotnet run --project src/JustinL.MCP.CodeVectors.Server
# Custom port
ASPNETCORE_URLS="http://localhost:8080" dotnet run --project src/JustinL.MCP.CodeVectors.Server
The REST API server provides:
- Test UI: http://localhost:5000/test.html - Interactive web interface for testing queries
- API endpoints:
  - POST /api/test/search - Search the vector store (see the curl sketch below)
  - GET /api/test/history - Get search history
  - DELETE /api/test/history - Clear search history
- Health check: GET /health - Server health status
Using the Test UI
The Test UI provides an easy way to test vector search without using an AI agent:
- Open http://localhost:5000/test.html in your browser
- Select which collection to search: All Collections, Code Only, or Documentation Only
- Enter a search query (e.g., "How does authentication work?")
- Click Search or press Enter
- View the results with file paths, scores, code snippets, and type indicators (CODE/DOCS)
See docs/TEST_UI.md for detailed documentation.
Architecture
The solution consists of several projects:
- JustinL.MCP.CodeVectors: Main CLI and MCP server
- JustinL.MCP.CodeVectors.Core: Core interfaces and models
- JustinL.MCP.CodeVectors.Server: REST API server for remote operations
- JustinL.MCP.CodeVectors.VectorStore: Qdrant integration
License
MIT License