CodeVectorsMCP
CodeVectorsMCP is an MCP (Model Context Protocol) server that provides context awareness for large software projects through text embeddings and vector search. It uses OpenAI embeddings and the Qdrant vector database to enable semantic search across codebases.
Features
- Semantic Code Search: Search your codebase using natural language queries
- MCP Protocol Support: Integrates with AI assistants that support the Model Context Protocol
- Real-time Updates: Watch for file changes and automatically update embeddings
- Multi-language Support: Process various programming languages and documentation formats
- Efficient Chunking: Smart file chunking that respects code boundaries
- Batch Processing: Efficient batch embedding generation
Installation
Install as .NET Tool (Recommended)
Install CodeVectors globally from NuGet:
dotnet tool install -g JustinL.MCP.CodeVectorsMCP
Then use it anywhere with:
codevectors --help
Quick Start
Prerequisites
- .NET 8.0 SDK
- Docker (for Qdrant)
- OpenAI API key
Setup
Initialize configuration
codevectors init
This creates a configuration file at ./.codevectors/config.json.
Set up Qdrant
cd docker
./setup-qdrant.sh
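If you prefer not to use the helper script, a roughly equivalent setup is to start Qdrant directly with Docker (standard Qdrant image and ports; 6334 is the gRPC port that the default ConnectionString targets):
# Run Qdrant, exposing the REST (6333) and gRPC (6334) ports
docker run -d --name qdrant \
  -p 6333:6333 -p 6334:6334 \
  -v "$(pwd)/qdrant_storage:/qdrant/storage" \
  qdrant/qdrant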
Configure OpenAI API key
export OPENAI_API_KEY=your-api-key-here
Quick Start Usage
Initial Setup (One-time per project)
# Navigate to your project directory
cd /path/to/your/project
# Initialize configuration
codevectors init
# Generate initial embeddings for the entire project
codevectors generate .
Normal AI Agent Usage
# Run with no parameters - this is the standard way AI agents use the tool
codevectors
This command:
- Runs in STDIO mode by default (for MCP protocol communication)
- Automatically detects and reindexes any changed files
- Keeps embeddings up to date without manual intervention
- Provides the search_codebase tool to AI agents (see the client-registration sketch below)
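How you register the server depends on your MCP client. As a sketch, a Claude-Desktop-style mcpServers entry might look like this (the file location and exact schema vary by client):
{
  "mcpServers": {
    "codevectors": {
      "command": "codevectors",
      "args": []
    }
  }
}
Whichever client you use, launch the server with your project root as its working directory, since it reads ./.codevectors/config.json from there.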
Additional Commands
# Generate embeddings for specific directories
codevectors generate . --include src/ docs/
# Generate embeddings for specific file types
codevectors generate . --include "*.cs" "*.md"
# Generate only documentation files
codevectors generate . --docs-only
# Query the vector store manually
codevectors query "How does authentication work?"
# Watch for file changes and auto-update embeddings
codevectors watch .
# Test all connections and services
codevectors test
codevectors test --verbose # Show detailed test output
Testing Connections
The test command helps diagnose connectivity issues with all configured services:
# Run connection tests
codevectors test
# Example output:
# Starting CodeVectors connection tests...
#
# ============================================================
# TEST RESULTS SUMMARY
# ============================================================
# ✓ Configuration: Loaded
# ✓ Embedding Service: Connected
# ✓ Vector Store (Qdrant): Connected
#
# Total test time: 523ms
#
# All tests passed!
With verbose output:
codevectors test --verbose
# Shows additional details like:
# - Configuration values
# - Service URLs
# - Embedding dimensions
# - Collection counts
The test command checks:
- Configuration: Validates that your config.json is properly loaded
- Embedding Service: Tests connectivity based on your configured mode:
  - Local: Validates the OpenAI API key and tests embedding generation
  - REST: Checks health endpoint connectivity
  - gRPC: Validates configuration (a full connectivity test requires a running server)
- Vector Store: Connects to Qdrant, creates a test collection, and performs CRUD operations
Troubleshooting: Reset Project Vectors
If embeddings seem inconsistent or you want to start fresh:
# Method 1: Delete the project's vectors (keeps other projects intact)
codevectors delete
# Method 2: Regenerate all embeddings with force flag
codevectors generate . --force
# Method 3: Manual cleanup (if using local Qdrant)
# Stop qdrant, delete the collection data, restart qdrant, then regenerate
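For the manual-cleanup path, Qdrant's REST API (port 6333 by default) can drop a collection directly. The collection names below are the defaults from config.json; note that this removes the entire collection, including any other projects that share it:
# Delete the default collections via Qdrant's REST API
curl -X DELETE http://localhost:6333/collections/code-vectors
curl -X DELETE http://localhost:6333/collections/documentation-vectors
# Then regenerate embeddings
codevectors generate . --force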
Configuration
Configuration Directory and File Location
CodeVectorsMCP stores its configuration in a .codevectors directory in your project's root. The configuration file is .codevectors/config.json.
Default location: When you run codevectors commands, the tool looks for:
your-project/
├── .codevectors/
│ └── config.json
├── src/
├── docs/
└── README.md
This design allows each project to have its own configuration, including different vector store settings, API keys, and project-specific options.
Creating Configuration
First, navigate to your project's root directory, then initialize the configuration:
cd /path/to/your/project
codevectors init
This creates .codevectors/config.json with default settings. You can also specify a custom location:
codevectors init --config /custom/path/config.json
How AI Agents Use CodeVectorsMCP
CodeVectorsMCP is designed to be used by AI agents through the Model Context Protocol (MCP). Here's the typical workflow:
- Project Setup: You set up CodeVectorsMCP once per project by running codevectors init in the project root
- Index Your Code: Generate embeddings with codevectors generate . --project myproject
- AI Agent Integration: Your AI agent runs in the same directory and uses the MCP protocol to search your indexed code
When the AI agent needs code context, it calls the search_codebase tool, which:
- Reads the .codevectors/config.json file in the current directory
- Connects to the configured vector store
- Searches for relevant code using the specified project name
- Returns contextual code snippets to help the AI understand your codebase
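On the wire this is a standard MCP tools/call request over STDIO. A sketch of what a client sends (the tool's exact argument names aren't documented here, so the "query" field is an assumption):
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "search_codebase",
    "arguments": { "query": "How does authentication work?" }
  }
}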
Project Names: Multi-Project Support
The --project parameter is crucial for organizing your code in the vector store:
Why Project Names Matter:
- Isolation: Each project's code is stored separately, preventing cross-contamination
- Multi-Project Support: A single Qdrant instance can serve multiple projects
- Targeted Search: AI agents only search within the specified project's code
- Organization: Clear separation between different codebases or versions
Example Setup:
# Project A
cd /path/to/project-a
codevectors init # Configure project name in config.json
codevectors generate .
# Project B
cd /path/to/project-b
codevectors init # Configure different project name in config.json
codevectors generate .
Both projects can use the same Qdrant vector store, but their code embeddings are kept separate and searchable independently.
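Concretely, the only required difference between the two projects' .codevectors/config.json files is the ProjectName; both can point at the same Qdrant instance:
/path/to/project-a/.codevectors/config.json:
{ "ProjectName": "project-a", "VectorStore": { "ConnectionString": "http://localhost:6334", ... }, ... }
/path/to/project-b/.codevectors/config.json:
{ "ProjectName": "project-b", "VectorStore": { "ConnectionString": "http://localhost:6334", ... }, ... }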
Configuration Examples
Local Configuration (Default Setup)
Example .codevectors/config.json generated by codevectors init:
{
"ProjectName": "your-project-name",
"EmbeddingsService": {
"Mode": "Local"
},
"Embeddings": {
"Model": "text-embedding-3-small",
"ApiKey": "${OPENAI_API_KEY}",
"ChunkSize": 1000,
"Overlap": 200,
"BatchSize": 100,
"MaxRetries": 3,
"BaseDelayMs": 1000,
"MaxTokens": 8191,
"RequestsPerMinute": 30,
"TokensPerMinute": 150000
},
"VectorStore": {
"Type": "qdrant",
"ConnectionString": "http://localhost:6334",
"CodeCollectionName": "code-vectors",
"DocumentationCollectionName": "documentation-vectors",
"VectorSize": 1536,
"DistanceMetric": "Cosine"
},
"FileWatcher": {
"Enabled": true,
"DebounceMs": 60000
},
"FilePatterns": {
"IncludePatterns": [
"**/*.cs", "**/*.ts", "**/*.js", "**/*.py", "**/*.java",
"**/*.cpp", "**/*.c", "**/*.h", "**/*.go", "**/*.rs",
"**/*.rb", "**/*.php", "**/*.swift", "**/*.kt"
],
"ExcludePatterns": [
"**/bin/**", "**/obj/**", "**/node_modules/**",
"**/.git/**", "**/dist/**", "**/build/**",
"**/packages/**", "**/vendor/**", "**/.vs/**",
"**/.idea/**", "**/target/**", "**/__pycache__/**"
]
}
}
Azure OpenAI Configuration
Example configuration using Azure OpenAI Service with API Key:
{
"ProjectName": "your-project-name",
"EmbeddingsService": {
"Mode": "Local"
},
"Embeddings": {
"ServiceType": "AzureOpenAI",
"ApiKey": "${AZURE_OPENAI_KEY}",
"Endpoint": "https://your-resource.openai.azure.com/",
"AzureDeploymentName": "text-embedding-3-small",
"Model": "text-embedding-3-small",
"ChunkSize": 1000,
"Overlap": 200,
"BatchSize": 100,
"MaxRetries": 3,
"BaseDelayMs": 1000,
"MaxTokens": 8191,
"RequestsPerMinute": 30,
"TokensPerMinute": 150000
},
"VectorStore": {
"Type": "qdrant",
"ConnectionString": "http://localhost:6334",
"CodeCollectionName": "code-vectors",
"DocumentationCollectionName": "documentation-vectors",
"VectorSize": 1536,
"DistanceMetric": "Cosine"
},
"FileWatcher": {
"Enabled": true,
"DebounceMs": 60000
},
"FilePatterns": {
"IncludePatterns": [
"**/*.cs", "**/*.ts", "**/*.js", "**/*.py"
],
"ExcludePatterns": [
"**/bin/**", "**/obj/**", "**/node_modules/**", "**/.git/**"
]
}
}
Example using Azure AD authentication (no API key required):
{
"ProjectName": "your-project-name",
"EmbeddingsService": {
"Mode": "Local"
},
"Embeddings": {
"ServiceType": "AzureOpenAI",
"UseAzureIdentity": true,
"Endpoint": "https://your-resource.openai.azure.com/",
"AzureDeploymentName": "text-embedding-3-small",
"Model": "text-embedding-3-small",
"ChunkSize": 1000,
"Overlap": 200
}
}
Notes for Azure OpenAI:
- ServiceType must be set to "AzureOpenAI"
- AzureDeploymentName is required and should match your deployment name in Azure
- Endpoint is required and should be your Azure OpenAI resource endpoint
- Use either ApiKey or set UseAzureIdentity to true for Azure AD authentication
- The Model field is still used for caching, but the deployment name is used for API calls
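When UseAzureIdentity is true, credentials presumably resolve through the standard Azure Identity chain (an assumption about the implementation, based on how Azure Identity-based .NET clients typically behave). Locally that usually means signing in first, or exporting service-principal variables:
# Interactive sign-in
az login
# Or use a service principal via the standard Azure Identity environment variables
export AZURE_TENANT_ID=your-tenant-id
export AZURE_CLIENT_ID=your-client-id
export AZURE_CLIENT_SECRET=your-client-secret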
REST API Embeddings Configuration
Example configuration using a remote REST embedding service:
{
"ProjectName": "your-project-name",
"EmbeddingsService": {
"Mode": "Rest",
"Url": "https://your-embedding-service.com"
},
"Embeddings": {
"ChunkSize": 1000,
"Overlap": 200,
"BatchSize": 100,
"MaxRetries": 3,
"BaseDelayMs": 1000
},
"VectorStore": {
"Type": "qdrant",
"ConnectionString": "http://localhost:6334",
"CodeCollectionName": "code-vectors",
"DocumentationCollectionName": "documentation-vectors",
"VectorSize": 1536,
"DistanceMetric": "Cosine"
},
"FileWatcher": {
"Enabled": true,
"DebounceMs": 60000
},
"FilePatterns": {
"IncludePatterns": [
"**/*.cs", "**/*.ts", "**/*.js", "**/*.py"
],
"ExcludePatterns": [
"**/bin/**", "**/obj/**", "**/node_modules/**", "**/.git/**"
]
}
}
gRPC Embeddings Configuration
Example configuration using a gRPC embedding service:
{
"ProjectName": "your-project-name",
"EmbeddingsService": {
"Mode": "Grpc",
"Url": "http://localhost:5001"
},
"Embeddings": {
"ChunkSize": 1000,
"Overlap": 200,
"BatchSize": 100,
"MaxRetries": 3,
"BaseDelayMs": 1000
},
"VectorStore": {
"Type": "qdrant",
"ConnectionString": "http://localhost:6334",
"CodeCollectionName": "code-vectors",
"DocumentationCollectionName": "documentation-vectors",
"VectorSize": 1536,
"DistanceMetric": "Cosine"
},
"FileWatcher": {
"Enabled": true,
"DebounceMs": 60000
},
"FilePatterns": {
"IncludePatterns": [
"**/*.cs", "**/*.ts", "**/*.js", "**/*.py"
],
"ExcludePatterns": [
"**/bin/**", "**/obj/**", "**/node_modules/**", "**/.git/**"
]
}
}
Environment Variable Configuration
For security, use environment variables for sensitive data:
{
"ProjectName": "${PROJECT_NAME}",
"EmbeddingsService": {
"Mode": "Local"
},
"Embeddings": {
"Model": "text-embedding-3-small",
"ApiKey": "${OPENAI_API_KEY}",
"ChunkSize": 1000
},
"VectorStore": {
"Type": "qdrant",
"ConnectionString": "${QDRANT_URL}",
"CodeCollectionName": "code-vectors",
"DocumentationCollectionName": "documentation-vectors"
}
}
Then set environment variables:
export PROJECT_NAME=my-project
export OPENAI_API_KEY=sk-your-actual-key-here
export QDRANT_URL=http://localhost:6334
Configuration Options Reference
- ProjectName: Project identifier used for vector store isolation
- EmbeddingsService:
  - Mode: "Local" (OpenAI API), "Rest" (REST API), or "Grpc" (gRPC service)
  - Url: Service URL for Rest/Grpc modes (e.g., "http://localhost:5000" for REST, "http://localhost:5001" for gRPC)
- Embeddings:
  - ServiceType: Type of embedding service when Mode is "Local" ("OpenAI" or "AzureOpenAI"; default: "OpenAI")
  - Model: OpenAI model name (only used in Local mode; default: text-embedding-3-small)
  - ApiKey: API key for the embedding service (only used in Local mode)
  - Endpoint: Optional custom endpoint URL (only used in Local mode; required for Azure OpenAI)
  - AzureDeploymentName: Azure OpenAI deployment name (required when ServiceType is "AzureOpenAI")
  - UseAzureIdentity: Use Azure AD authentication instead of an API key (Azure OpenAI only; default: false)
  - ChunkSize: Text chunk size for processing (default: 1000)
  - Overlap: Overlap between chunks in characters (default: 200)
  - BatchSize: Number of texts to process in one batch (default: 100)
  - MaxRetries: Maximum retry attempts for failed requests (default: 3)
  - BaseDelayMs: Base delay for exponential backoff (default: 1000)
  - MaxTokens: Maximum tokens per request for OpenAI (default: 8191)
  - RequestsPerMinute: Rate limit for requests (default: 30)
  - TokensPerMinute: Rate limit for tokens (default: 150000)
- VectorStore:
  - Type: Vector store type (default: "qdrant")
  - ConnectionString: Qdrant server URL (default: http://localhost:6334)
  - CodeCollectionName: Collection for code embeddings
  - DocumentationCollectionName: Collection for documentation
  - VectorSize: Embedding dimension (default: 1536)
  - DistanceMetric: Distance metric for similarity (default: "Cosine")
- FileWatcher:
  - Enabled: Enable file watching (default: true)
  - DebounceMs: Debounce delay for file changes in milliseconds (default: 60000)
- FilePatterns:
  - IncludePatterns: File patterns to include when indexing
  - ExcludePatterns: File patterns to exclude (node_modules, .git, etc.)
Using Custom Configuration Locations
For advanced scenarios, you can specify a different config file:
codevectors generate . --config /custom/path/config.json
For Developers
Building from Source
If you want to build and run CodeVectorsMCP from source:
Clone the repository
git clone https://github.com/jmlewis1/CodeVectorsMCP.git
cd CodeVectorsMCP
Build the project
dotnet build
Run commands using dotnet run
dotnet run --project src/JustinL.MCP.CodeVectors -- [command] [options]
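For example, to run the connection tests directly from source:
dotnet run --project src/JustinL.MCP.CodeVectors -- test --verbose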
Development and Debugging
HTTP Server Mode
For debugging or testing, you can run the HTTP server instead of STDIO mode:
# Run HTTP server for easier debugging and testing
codevectors --http
The HTTP server provides:
- Test UI at http://localhost:5000/test.html
- REST API endpoints for manual testing
- Easier debugging since logs go to console instead of files
REST API Server with Test UI
You can also run the standalone REST API server:
# Default port 5000
dotnet run --project src/JustinL.MCP.CodeVectors.Server
# Custom port
ASPNETCORE_URLS="http://localhost:8080" dotnet run --project src/JustinL.MCP.CodeVectors.Server
The REST API server provides:
- Test UI: http://localhost:5000/test.html - Interactive web interface for testing queries
- API endpoints:
  - POST /api/test/search - Search the vector store (see the curl sketch below)
  - GET /api/test/history - Get search history
  - DELETE /api/test/history - Clear search history
- Health check: GET /health - Server health status
Using the Test UI
The Test UI provides an easy way to test vector search without using an AI agent:
- Open http://localhost:5000/test.html in your browser
- Select which collection to search: All Collections, Code Only, or Documentation Only
- Enter a search query (e.g., "How does authentication work?")
- Click Search or press Enter
- View the results with file paths, scores, code snippets, and type indicators (CODE/DOCS)
See docs/TEST_UI.md for detailed documentation.
Architecture
The solution consists of several projects:
- JustinL.MCP.CodeVectors: Main CLI and MCP server
- JustinL.MCP.CodeVectors.Core: Core interfaces and models
- JustinL.MCP.CodeVectors.Server: REST API server for remote operations
- JustinL.MCP.CodeVectors.VectorStore: Qdrant integration
License
MIT License