JustinL.MCP.CodeVectorsMCP 1.0.9

dotnet tool install --global JustinL.MCP.CodeVectorsMCP --version 1.0.9

This package contains a .NET tool you can call from the shell/command line.

CodeVectorsMCP

CodeVectorsMCP is an MCP (Model Context Protocol) server that provides context awareness for large software projects through text embeddings and vector search. It uses OpenAI embeddings and Qdrant vector database to enable semantic search across codebases.

Features

  • Semantic Code Search: Search your codebase using natural language queries
  • MCP Protocol Support: Integrates with AI assistants that support the Model Context Protocol
  • Real-time Updates: Watch for file changes and automatically update embeddings
  • Multi-language Support: Process various programming languages and documentation formats
  • Efficient Chunking: Smart file chunking that respects code boundaries
  • Batch Processing: Efficient batch embedding generation

Installation

Install CodeVectors globally from NuGet:

dotnet tool install -g JustinL.MCP.CodeVectorsMCP

Then use it anywhere with:

codevectors --help

Quick Start

Prerequisites

  • .NET 8.0 SDK
  • Docker (for Qdrant)
  • OpenAI API key

Setup

  1. Initialize configuration

    codevectors init
    

    This creates a configuration file at ./.codevectors/config.json

  2. Set up Qdrant

    cd docker
    ./setup-qdrant.sh
    
  3. Configure OpenAI API key

    export OPENAI_API_KEY=your-api-key-here
    

Quick Start Usage

Initial Setup (One-time per project)

# Navigate to your project directory
cd /path/to/your/project

# Initialize configuration
codevectors init

# Generate initial embeddings for the entire project
codevectors generate .

Normal AI Agent Usage

# Run with no parameters - this is the standard way AI agents use the tool
codevectors

This command:

  • Runs in STDIO mode by default (for MCP protocol communication)
  • Automatically detects and reindexes any changed files
  • Keeps embeddings up to date without manual intervention
  • Provides the search_codebase tool to AI agents
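MCP-capable clients are typically pointed at a server binary through a configuration entry. The snippet below follows the common `mcpServers` convention used by several MCP clients; the exact file location and schema depend on your client, so treat this as an illustrative sketch rather than an excerpt from this package's documentation:

```json
{
  "mcpServers": {
    "codevectors": {
      "command": "codevectors",
      "args": []
    }
  }
}
```

With an entry like this, the client launches codevectors in its default STDIO mode and gains access to the search_codebase tool.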

Additional Commands

# Generate embeddings for specific directories
codevectors generate . --include src/ docs/

# Generate embeddings for specific file types
codevectors generate . --include "*.cs" "*.md"

# Generate only documentation files
codevectors generate . --docs-only

# Query the vector store manually
codevectors query "How does authentication work?"

# Watch for file changes and auto-update embeddings
codevectors watch .

# Test all connections and services
codevectors test
codevectors test --verbose  # Show detailed test output

Testing Connections

The test command helps diagnose connectivity issues with all configured services:

# Run connection tests
codevectors test

# Example output:
# Starting CodeVectors connection tests...
# 
# ============================================================
# TEST RESULTS SUMMARY
# ============================================================
# ✓ Configuration: Loaded
# ✓ Embedding Service: Connected
# ✓ Vector Store (Qdrant): Connected
# 
# Total test time: 523ms
# 
# All tests passed!

With verbose output:

codevectors test --verbose

# Shows additional details like:
# - Configuration values
# - Service URLs
# - Embedding dimensions
# - Collection counts

The test command checks:

  • Configuration: Validates that your config.json is properly loaded
  • Embedding Service: Tests connectivity based on your configured mode:
    • Local: Validates OpenAI API key and tests embedding generation
    • REST: Checks health endpoint connectivity
    • gRPC: Validates configuration (full connectivity test requires server)
  • Vector Store: Connects to Qdrant, creates a test collection, and performs CRUD operations

Troubleshooting: Reset Project Vectors

If embeddings seem inconsistent or you want to start fresh:

# Method 1: Delete the project's vectors (keeps other projects intact)
codevectors delete

# Method 2: Regenerate all embeddings with force flag
codevectors generate . --force

# Method 3: Manual cleanup (if using local Qdrant)
# Stop qdrant, delete the collection data, restart qdrant, then regenerate

Configuration

Configuration Directory and File Location

CodeVectorsMCP stores its configuration in a .codevectors directory located in your project's root directory. The configuration file is .codevectors/config.json.

Default location: When you run codevectors commands, the tool looks for:

your-project/
├── .codevectors/
│   └── config.json
├── src/
├── docs/
└── README.md

This design allows each project to have its own configuration, including different vector store settings, API keys, and project-specific options.

Creating Configuration

First, navigate to your project's root directory, then initialize the configuration:

cd /path/to/your/project
codevectors init

This creates .codevectors/config.json with default settings. You can also specify a custom location:

codevectors init --config /custom/path/config.json

How AI Agents Use CodeVectorsMCP

CodeVectorsMCP is designed to be used by AI agents through the Model Context Protocol (MCP). Here's the typical workflow:

  1. Project Setup: You set up CodeVectorsMCP once per project by running codevectors init in the project root
  2. Index Your Code: Generate embeddings with codevectors generate . --project myproject
  3. AI Agent Integration: Your AI agent runs in the same directory and uses the MCP protocol to search your indexed code

When the AI agent needs code context, it calls the search_codebase tool, which:

  • Reads the .codevectors/config.json file in the current directory
  • Connects to the configured vector store
  • Searches for relevant code using the specified project name
  • Returns contextual code snippets to help the AI understand your codebase

Project Names: Multi-Project Support

The --project parameter is crucial for organizing your code in the vector store:

Why Project Names Matter:

  • Isolation: Each project's code is stored separately, preventing cross-contamination
  • Multi-Project Support: A single Qdrant instance can serve multiple projects
  • Targeted Search: AI agents only search within the specified project's code
  • Organization: Clear separation between different codebases or versions

Example Setup:

# Project A
cd /path/to/project-a
codevectors init  # Configure project name in config.json
codevectors generate .

# Project B  
cd /path/to/project-b
codevectors init  # Configure different project name in config.json
codevectors generate .

Both projects can use the same Qdrant vector store, but their code embeddings are kept separate and searchable independently.
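Conceptually, this isolation amounts to filtering stored vectors by a project identifier in their payload. A Qdrant-style filter for such a search might look like the fragment below (the `project` payload field name is an assumption for illustration; the tool's actual payload schema may differ):

```json
{
  "filter": {
    "must": [
      { "key": "project", "match": { "value": "project-a" } }
    ]
  }
}
```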

Configuration Examples

Local Configuration (Default Setup)

Example .codevectors/config.json generated by codevectors init:

{
  "ProjectName": "your-project-name",
  "EmbeddingsService": {
    "Mode": "Local"
  },
  "Embeddings": {
    "Model": "text-embedding-3-small",
    "ApiKey": "${OPENAI_API_KEY}",
    "ChunkSize": 1000,
    "Overlap": 200,
    "BatchSize": 100,
    "MaxRetries": 3,
    "BaseDelayMs": 1000,
    "MaxTokens": 8191,
    "RequestsPerMinute": 30,
    "TokensPerMinute": 150000
  },
  "VectorStore": {
    "Type": "qdrant",
    "ConnectionString": "http://localhost:6334",
    "CodeCollectionName": "code-vectors",
    "DocumentationCollectionName": "documentation-vectors",
    "VectorSize": 1536,
    "DistanceMetric": "Cosine"
  },
  "FileWatcher": {
    "Enabled": true,
    "DebounceMs": 60000
  },
  "FilePatterns": {
    "IncludePatterns": [
      "**/*.cs", "**/*.ts", "**/*.js", "**/*.py", "**/*.java",
      "**/*.cpp", "**/*.c", "**/*.h", "**/*.go", "**/*.rs",
      "**/*.rb", "**/*.php", "**/*.swift", "**/*.kt"
    ],
    "ExcludePatterns": [
      "**/bin/**", "**/obj/**", "**/node_modules/**",
      "**/.git/**", "**/dist/**", "**/build/**",
      "**/packages/**", "**/vendor/**", "**/.vs/**",
      "**/.idea/**", "**/target/**", "**/__pycache__/**"
    ]
  }
}
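The "Cosine" DistanceMetric means search results are ranked by the angle between embedding vectors rather than their magnitude. A minimal stdlib sketch of cosine similarity, independent of any CodeVectors internals:

```python
import math

def cosine_similarity(a, b):
    # Dot product divided by the product of magnitudes; 1.0 = same direction.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # identical direction -> 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # orthogonal -> 0.0
```

The same formula generalizes to the 1536-dimensional vectors implied by VectorSize above.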

Azure OpenAI Configuration

Example configuration using Azure OpenAI Service with API Key:

{
  "ProjectName": "your-project-name",
  "EmbeddingsService": {
    "Mode": "Local"
  },
  "Embeddings": {
    "ServiceType": "AzureOpenAI",
    "ApiKey": "${AZURE_OPENAI_KEY}",
    "Endpoint": "https://your-resource.openai.azure.com/",
    "AzureDeploymentName": "text-embedding-3-small",
    "Model": "text-embedding-3-small",
    "ChunkSize": 1000,
    "Overlap": 200,
    "BatchSize": 100,
    "MaxRetries": 3,
    "BaseDelayMs": 1000,
    "MaxTokens": 8191,
    "RequestsPerMinute": 30,
    "TokensPerMinute": 150000
  },
  "VectorStore": {
    "Type": "qdrant",
    "ConnectionString": "http://localhost:6334",
    "CodeCollectionName": "code-vectors",
    "DocumentationCollectionName": "documentation-vectors",
    "VectorSize": 1536,
    "DistanceMetric": "Cosine"
  },
  "FileWatcher": {
    "Enabled": true,
    "DebounceMs": 60000
  },
  "FilePatterns": {
    "IncludePatterns": [
      "**/*.cs", "**/*.ts", "**/*.js", "**/*.py"
    ],
    "ExcludePatterns": [
      "**/bin/**", "**/obj/**", "**/node_modules/**", "**/.git/**"
    ]
  }
}

Example using Azure AD authentication (no API key required):

{
  "ProjectName": "your-project-name",
  "EmbeddingsService": {
    "Mode": "Local"
  },
  "Embeddings": {
    "ServiceType": "AzureOpenAI",
    "UseAzureIdentity": true,
    "Endpoint": "https://your-resource.openai.azure.com/",
    "AzureDeploymentName": "text-embedding-3-small",
    "Model": "text-embedding-3-small",
    "ChunkSize": 1000,
    "Overlap": 200
  }
}

Notes for Azure OpenAI:

  • ServiceType must be set to "AzureOpenAI"
  • AzureDeploymentName is required and should match your deployment name in Azure
  • Endpoint is required and should be your Azure OpenAI resource endpoint
  • Use either ApiKey or set UseAzureIdentity to true for Azure AD authentication
  • The Model field is still used for caching but deployment name is used for API calls

REST API Embeddings Configuration

Example configuration using a remote REST embedding service:

{
  "ProjectName": "your-project-name",
  "EmbeddingsService": {
    "Mode": "Rest",
    "Url": "https://your-embedding-service.com"
  },
  "Embeddings": {
    "ChunkSize": 1000,
    "Overlap": 200,
    "BatchSize": 100,
    "MaxRetries": 3,
    "BaseDelayMs": 1000
  },
  "VectorStore": {
    "Type": "qdrant",
    "ConnectionString": "http://localhost:6334",
    "CodeCollectionName": "code-vectors",
    "DocumentationCollectionName": "documentation-vectors",
    "VectorSize": 1536,
    "DistanceMetric": "Cosine"
  },
  "FileWatcher": {
    "Enabled": true,
    "DebounceMs": 60000
  },
  "FilePatterns": {
    "IncludePatterns": [
      "**/*.cs", "**/*.ts", "**/*.js", "**/*.py"
    ],
    "ExcludePatterns": [
      "**/bin/**", "**/obj/**", "**/node_modules/**", "**/.git/**"
    ]
  }
}

gRPC Embeddings Configuration

Example configuration using a gRPC embedding service:

{
  "ProjectName": "your-project-name",
  "EmbeddingsService": {
    "Mode": "Grpc",
    "Url": "http://localhost:5001"
  },
  "Embeddings": {
    "ChunkSize": 1000,
    "Overlap": 200,
    "BatchSize": 100,
    "MaxRetries": 3,
    "BaseDelayMs": 1000
  },
  "VectorStore": {
    "Type": "qdrant",
    "ConnectionString": "http://localhost:6334",
    "CodeCollectionName": "code-vectors",
    "DocumentationCollectionName": "documentation-vectors",
    "VectorSize": 1536,
    "DistanceMetric": "Cosine"
  },
  "FileWatcher": {
    "Enabled": true,
    "DebounceMs": 60000
  },
  "FilePatterns": {
    "IncludePatterns": [
      "**/*.cs", "**/*.ts", "**/*.js", "**/*.py"
    ],
    "ExcludePatterns": [
      "**/bin/**", "**/obj/**", "**/node_modules/**", "**/.git/**"
    ]
  }
}

Environment Variable Configuration

For security, use environment variables for sensitive data:

{
  "ProjectName": "${PROJECT_NAME}",
  "EmbeddingsService": {
    "Mode": "Local"
  },
  "Embeddings": {
    "Model": "text-embedding-3-small",
    "ApiKey": "${OPENAI_API_KEY}",
    "ChunkSize": 1000
  },
  "VectorStore": {
    "Type": "qdrant",
    "ConnectionString": "${QDRANT_URL}",
    "CodeCollectionName": "code-vectors",
    "DocumentationCollectionName": "documentation-vectors"
  }
}

Then set environment variables:

export PROJECT_NAME=my-project
export OPENAI_API_KEY=sk-your-actual-key-here
export QDRANT_URL=http://localhost:6334
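The ${...} placeholders suggest simple environment-variable substitution when the config is loaded. A sketch of that expansion (the tool's actual mechanism may differ, e.g. in how unset variables are handled):

```python
import os
import re

def expand_env(text):
    # Replace ${NAME} with the NAME environment variable's value,
    # leaving the placeholder intact if the variable is unset.
    return re.sub(r"\$\{(\w+)\}",
                  lambda m: os.environ.get(m.group(1), m.group(0)),
                  text)

os.environ["OPENAI_API_KEY"] = "sk-demo"
print(expand_env('"ApiKey": "${OPENAI_API_KEY}"'))  # -> "ApiKey": "sk-demo"
```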

Configuration Options Reference

  • ProjectName: Project identifier used for vector store isolation
  • EmbeddingsService:
    • Mode: "Local" (OpenAI API), "Rest" (REST API), or "Grpc" (gRPC service)
    • Url: Service URL for Rest/Grpc modes (e.g., "http://localhost:5000" for REST, "http://localhost:5001" for gRPC)
  • Embeddings:
    • ServiceType: Type of embedding service when Mode is "Local" ("OpenAI" or "AzureOpenAI", default: "OpenAI")
    • Model: OpenAI model name (only used in Local mode, default: text-embedding-3-small)
    • ApiKey: API key for the embedding service (only used in Local mode)
    • Endpoint: Optional custom endpoint URL (only used in Local mode, required for Azure OpenAI)
    • AzureDeploymentName: Azure OpenAI deployment name (required when ServiceType is "AzureOpenAI")
    • UseAzureIdentity: Use Azure AD authentication instead of API key (only for Azure OpenAI, default: false)
    • ChunkSize: Text chunk size for processing (default: 1000)
    • Overlap: Overlap between chunks in characters (default: 200)
    • BatchSize: Number of texts to process in one batch (default: 100)
    • MaxRetries: Maximum retry attempts for failed requests (default: 3)
    • BaseDelayMs: Base delay for exponential backoff (default: 1000)
    • MaxTokens: Maximum tokens per request for OpenAI (default: 8191)
    • RequestsPerMinute: Rate limit for requests (default: 30)
    • TokensPerMinute: Rate limit for tokens (default: 150000)
  • VectorStore:
    • Type: Vector store type (default: "qdrant")
    • ConnectionString: Qdrant server URL (default: http://localhost:6334)
    • CodeCollectionName: Collection for code embeddings
    • DocumentationCollectionName: Collection for documentation
    • VectorSize: Embedding dimension (default: 1536)
    • DistanceMetric: Distance metric for similarity (default: "Cosine")
  • FileWatcher:
    • Enabled: Enable file watching (default: true)
    • DebounceMs: Debounce delay for file changes (default: 60000)
  • FilePatterns:
    • IncludePatterns: File patterns to include when indexing
    • ExcludePatterns: File patterns to exclude (node_modules, .git, etc.)
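The interaction between ChunkSize and Overlap can be illustrated with a simple character-based chunker. CodeVectorsMCP's real chunker additionally respects code boundaries, so this is only a sketch of the windowing arithmetic:

```python
def chunk_text(text, chunk_size=1000, overlap=200):
    # Each chunk starts (chunk_size - overlap) characters after the previous
    # one, so consecutive chunks share `overlap` characters of context.
    step = chunk_size - overlap
    return [text[i:i + chunk_size]
            for i in range(0, max(len(text) - overlap, 1), step)]

chunks = chunk_text("x" * 2500, chunk_size=1000, overlap=200)
print(len(chunks))      # chunks start at 0, 800, 1600 -> 3
print(len(chunks[1]))   # 1000
```

With the defaults, a 2,500-character file yields three chunks, each sharing 200 characters with its neighbor so that no code context is lost at chunk borders.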

Using Custom Configuration Locations

For advanced scenarios, you can specify a different config file:

codevectors generate . --config /custom/path/config.json

For Developers

Building from Source

If you want to build and run CodeVectorsMCP from source:

  1. Clone the repository

    git clone https://github.com/jmlewis1/CodeVectorsMCP.git
    cd CodeVectorsMCP
    
  2. Build the project

    dotnet build
    
  3. Run commands using dotnet run

    dotnet run --project src/JustinL.MCP.CodeVectors -- [command] [options]
    

Development and Debugging

HTTP Server Mode

For debugging or testing, you can run the HTTP server instead of STDIO mode:

# Run HTTP server for easier debugging and testing
codevectors --http

The HTTP server provides:

  • Test UI at http://localhost:5000/test.html
  • REST API endpoints for manual testing
  • Easier debugging since logs go to console instead of files

REST API Server with Test UI

You can also run the standalone REST API server:

# Default port 5000
dotnet run --project src/JustinL.MCP.CodeVectors.Server

# Custom port
ASPNETCORE_URLS="http://localhost:8080" dotnet run --project src/JustinL.MCP.CodeVectors.Server

The REST API server provides:

  • Test UI: http://localhost:5000/test.html - Interactive web interface for testing queries
  • API endpoints:
    • POST /api/test/search - Search the vector store
    • GET /api/test/history - Get search history
    • DELETE /api/test/history - Clear search history
  • Health check: GET /health - Server health status
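The search endpoint can also be exercised from a script. The sketch below only builds the request with the stdlib; the body field names ("query", "limit") are assumptions, so check them against the server's actual schema before relying on them:

```python
import json
from urllib import request

def build_search_request(query, base_url="http://localhost:5000"):
    # Field names ("query", "limit") are assumptions, not confirmed schema.
    payload = json.dumps({"query": query, "limit": 5}).encode("utf-8")
    return request.Request(
        f"{base_url}/api/test/search",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_search_request("How does authentication work?")
print(req.full_url)      # http://localhost:5000/api/test/search
print(req.get_method())  # POST
```

Sending it with `urllib.request.urlopen(req)` requires the REST API server from the previous section to be running.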

Using the Test UI

The Test UI provides an easy way to test vector search without using an AI agent:

  1. Open http://localhost:5000/test.html in your browser
  2. Select which collection to search: All Collections, Code Only, or Documentation Only
  3. Enter a search query (e.g., "How does authentication work?")
  4. Click Search or press Enter
  5. View the results with file paths, scores, code snippets, and type indicators (CODE/DOCS)

See docs/TEST_UI.md for detailed documentation.

Architecture

The solution consists of several projects:

  • JustinL.MCP.CodeVectors: Main CLI and MCP server
  • JustinL.MCP.CodeVectors.Core: Core interfaces and models
  • JustinL.MCP.CodeVectors.Server: REST API server for remote operations
  • JustinL.MCP.CodeVectors.VectorStore: Qdrant integration

License

MIT License

Target framework compatibility: .NET net8.0 is compatible. net9.0, net10.0, and the platform-specific variants (android, browser, ios, maccatalyst, macos, tvos, windows) were computed as compatible.

This package has no dependencies.

Version Downloads Last Updated
1.0.9 535 7/22/2025
1.0.8 153 7/14/2025