Initial commit: Gitea code search with MeiliSearch + MCP

Go indexer (full re-index + webhook), MeiliSearch integration,
MCP server exposing gitea_search tool for LLM agents.
K8s manifests for MeiliSearch + indexer CronJob.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Raymond Scott Pert
2026-04-08 01:27:42 +00:00
commit 61574855bf
11 changed files with 1318 additions and 0 deletions

README.md Normal file

@@ -0,0 +1,251 @@
# gitea-search
Full-text code search across all Gitea repositories, exposed as an MCP tool for Claude Code.
It indexes file content from a Gitea instance into MeiliSearch and provides two interfaces: a CLI for indexing and searching, and an MCP server (stdio JSON-RPC) that Claude Code can call as a tool.
## Architecture
```
                +-----------------+
                | Gitea Instance  |
                |   (33 repos)    |
                +--------+--------+
                         |
          +--------------+--------------+
          |                             |
 git clone --depth 1              push webhook
          |                             |
          v                             v
+-------------------+        +---------------------+
|   indexer full    |        |   indexer webhook   |
|   (CronJob, 4h)   |        | (Deployment, :8080) |
+---------+---------+        +----------+----------+
          |                             |
          +--------------+--------------+
                         |
                         v
               +-------------------+
               |    MeiliSearch    |
               |   (PVC-backed)    |
               +---------+---------+
                         |
                         v
               +-------------------+
               |    mcp-server     |
               | (stdio JSON-RPC)  |
               +-------------------+
                         ^
                         |
               +-------------------+
               |    Claude Code    |
               |   (MCP client)    |
               +-------------------+
```
## Components
| Binary | Purpose |
|--------|---------|
| `indexer full` | Clone all repos, extract files, push to MeiliSearch |
| `indexer repo <owner/name>` | Re-index a single repo |
| `indexer webhook` | HTTP server (:8080) for Gitea push webhooks |
| `indexer search <query>` | CLI search for testing |
| `mcp-server` | MCP stdio server exposing `gitea_search` tool |
## Quick Start
### Prerequisites
- Go 1.22+
- MeiliSearch instance (v1.6+)
- Gitea instance with API token
- git (for cloning repos)
### Build
```sh
go build -o indexer ./cmd/indexer
go build -o mcp-server ./cmd/mcp-server
```
### Run a full index
```sh
export GITEA_TOKEN=your-token-here
export MEILI_URL=http://localhost:7700
./indexer full
```
### Test search
```sh
./indexer search "wireguard config" --type=conf --limit=5
```
### Run MCP server
```sh
export MEILI_URL=http://localhost:7700
./mcp-server
```
## Configuration
All configuration is done via environment variables:
| Variable | Default | Description |
|----------|---------|-------------|
| `GITEA_URL` | `https://gitea.rspworks.tech` | Gitea instance URL |
| `GITEA_TOKEN` | *(required)* | Gitea API token |
| `MEILI_URL` | `http://localhost:7700` | MeiliSearch URL |
| `MEILI_KEY` | *(empty)* | MeiliSearch master key |
| `INDEX_NAME` | `gitea-code` | MeiliSearch index name |
| `WEBHOOK_SECRET` | *(empty)* | HMAC secret for Gitea webhook validation |
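As an illustration of how the indexer might read the table above, here is a minimal sketch that applies the documented defaults and rejects a missing `GITEA_TOKEN`. The `Config` struct, `envOr`, and `LoadConfig` are hypothetical names, not taken from this repository; the lookup function is passed in only so the logic can be exercised without touching the real environment.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// Config mirrors the environment variables in the table above.
// The type and its field names are hypothetical.
type Config struct {
	GiteaURL      string
	GiteaToken    string
	MeiliURL      string
	MeiliKey      string
	IndexName     string
	WebhookSecret string
}

// envOr returns the value of key, or def when the variable is unset or empty.
func envOr(getenv func(string) string, key, def string) string {
	if v := getenv(key); v != "" {
		return v
	}
	return def
}

// LoadConfig reads settings through getenv (normally os.Getenv) and
// applies the documented defaults. GITEA_TOKEN has no default.
func LoadConfig(getenv func(string) string) (Config, error) {
	cfg := Config{
		GiteaURL:      envOr(getenv, "GITEA_URL", "https://gitea.rspworks.tech"),
		GiteaToken:    getenv("GITEA_TOKEN"),
		MeiliURL:      envOr(getenv, "MEILI_URL", "http://localhost:7700"),
		MeiliKey:      getenv("MEILI_KEY"),
		IndexName:     envOr(getenv, "INDEX_NAME", "gitea-code"),
		WebhookSecret: getenv("WEBHOOK_SECRET"),
	}
	if cfg.GiteaToken == "" {
		return Config{}, errors.New("GITEA_TOKEN is required")
	}
	return cfg, nil
}

func main() {
	cfg, err := LoadConfig(os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, "config error:", err)
		os.Exit(1)
	}
	fmt.Println("indexing", cfg.GiteaURL, "into", cfg.IndexName, "at", cfg.MeiliURL)
}
```

The MCP server only talks to MeiliSearch, so a real implementation would likely skip the token check there.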
## MCP Integration with Claude Code
### Option 1: Local binary
Add to `~/.claude/claude_code_config.json`:
```json
{
"mcpServers": {
"gitea-search": {
"command": "/path/to/mcp-server",
"env": {
"MEILI_URL": "http://meilisearch.gitea-search.svc.cluster.local:7700",
"MEILI_KEY": "your-master-key"
}
}
}
}
```
### Option 2: Via Docker
```json
{
"mcpServers": {
"gitea-search": {
"command": "docker",
"args": [
"run", "--rm", "-i",
"-e", "MEILI_URL=http://host.docker.internal:7700",
"gitea.rspworks.tech/rpert/gitea-search:mcp-server"
]
}
}
}
```
### Tool usage
Once configured, Claude Code can call the `gitea_search` tool:
```
Search for "wireguard" across all repos
Search for "backup" in repo "rpert/infra-ssh" with filetype "sh"
```
Tool parameters:
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `query` | string | yes | Search terms |
| `repo` | string | no | Filter by repo full name (e.g., `rpert/infra-ssh`) |
| `filetype` | string | no | Filter by extension (e.g., `go`, `md`, `yaml`) |
| `limit` | integer | no | Max results (default: 10) |
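For reference, the sketch below builds the kind of JSON-RPC message an MCP client sends to invoke the tool with the parameters in the table above. The `tools/call` method and the `name`/`arguments` layout follow the MCP specification; the Go type names are hypothetical, and a real client would first complete the MCP `initialize` handshake.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SearchArgs matches the tool parameters in the table above.
// Optional fields are omitted from the JSON when left at their zero value.
type SearchArgs struct {
	Query    string `json:"query"`
	Repo     string `json:"repo,omitempty"`
	Filetype string `json:"filetype,omitempty"`
	Limit    int    `json:"limit,omitempty"`
}

// callParams is the params object of an MCP tools/call request.
type callParams struct {
	Name      string     `json:"name"`
	Arguments SearchArgs `json:"arguments"`
}

// rpcRequest is a JSON-RPC 2.0 request envelope.
type rpcRequest struct {
	JSONRPC string     `json:"jsonrpc"`
	ID      int        `json:"id"`
	Method  string     `json:"method"`
	Params  callParams `json:"params"`
}

// BuildToolCall encodes a tools/call request for the gitea_search tool.
func BuildToolCall(id int, args SearchArgs) ([]byte, error) {
	return json.Marshal(rpcRequest{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "tools/call",
		Params:  callParams{Name: "gitea_search", Arguments: args},
	})
}

func main() {
	msg, err := BuildToolCall(1, SearchArgs{Query: "backup", Repo: "rpert/infra-ssh", Filetype: "sh"})
	if err != nil {
		panic(err)
	}
	// Over the stdio transport, each message is written as one line.
	fmt.Println(string(msg))
}
```

Because `limit` is left unset here, it is dropped from the request and the server-side default of 10 would apply.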
## MeiliSearch Document Schema
```json
{
"id": "sha256(repo+branch+path)",
"repo": "rpert/infra-ssh",
"branch": "main",
"path": "docs/mail-setup.md",
"filename": "mail-setup.md",
"extension": "md",
"content": "file content (up to 50KB)",
"language": "markdown",
"updated_at": 1712534400
}
```
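The schema above could be represented in Go roughly as follows. This is a sketch under assumptions: the `Document` type and the helper names are hypothetical, the ID is taken to be the hex-encoded SHA-256 of the three strings concatenated without a separator (the README only says `sha256(repo+branch+path)`), and language detection is left out because the mapping is not described here.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"path"
	"strings"
)

// Document mirrors the JSON schema shown above.
type Document struct {
	ID        string `json:"id"`
	Repo      string `json:"repo"`
	Branch    string `json:"branch"`
	Path      string `json:"path"`
	Filename  string `json:"filename"`
	Extension string `json:"extension"`
	Content   string `json:"content"`
	Language  string `json:"language"`
	UpdatedAt int64  `json:"updated_at"`
}

// DocumentID derives a stable ID from repo, branch, and path.
// Hex output keeps the ID within the characters MeiliSearch accepts
// for primary keys (letters, digits, hyphens, underscores).
// Assumption: plain concatenation with no separator.
func DocumentID(repo, branch, filePath string) string {
	sum := sha256.Sum256([]byte(repo + branch + filePath))
	return hex.EncodeToString(sum[:])
}

// NewDocument fills the derived fields (id, filename, extension) from the path.
// Language is left empty in this sketch.
func NewDocument(repo, branch, filePath, content string, updatedAt int64) Document {
	return Document{
		ID:        DocumentID(repo, branch, filePath),
		Repo:      repo,
		Branch:    branch,
		Path:      filePath,
		Filename:  path.Base(filePath),
		Extension: strings.TrimPrefix(path.Ext(filePath), "."),
		Content:   content,
		UpdatedAt: updatedAt,
	}
}

func main() {
	doc := NewDocument("rpert/infra-ssh", "main", "docs/mail-setup.md", "# Mail setup", 1712534400)
	fmt.Println(doc.ID, doc.Filename, doc.Extension)
}
```

Because the ID depends only on repo, branch, and path, re-indexing an unchanged path overwrites the existing document instead of creating a duplicate.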
Searchable: `content`, `path`, `filename`, `repo`
Filterable: `repo`, `extension`, `branch`
Displayed: all fields except `content` (snippets returned via highlighting)
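Given the filterable attributes above, the `repo` and `filetype` tool parameters can be turned into a MeiliSearch filter expression. The sketch below only builds the filter string; `BuildFilter` and `quote` are hypothetical helpers, and how this repository actually constructs its filter is not shown in the README.

```go
package main

import (
	"fmt"
	"strings"
)

// quote wraps a value in double quotes for a MeiliSearch filter,
// escaping backslashes and embedded double quotes.
func quote(v string) string {
	r := strings.NewReplacer(`\`, `\\`, `"`, `\"`)
	return `"` + r.Replace(v) + `"`
}

// BuildFilter combines the optional repo and filetype constraints with AND.
// An empty result means "no filter".
func BuildFilter(repo, filetype string) string {
	var parts []string
	if repo != "" {
		parts = append(parts, "repo = "+quote(repo))
	}
	if filetype != "" {
		// Accept both "go" and ".go"; the index stores the bare extension.
		ext := strings.TrimPrefix(filetype, ".")
		parts = append(parts, "extension = "+quote(ext))
	}
	return strings.Join(parts, " AND ")
}

func main() {
	fmt.Println(BuildFilter("rpert/infra-ssh", "sh"))
	// prints: repo = "rpert/infra-ssh" AND extension = "sh"
}
```

The resulting string would go in the `filter` field of a MeiliSearch search request, alongside the query and limit.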
## K8s Deployment
### 1. Create namespace and secrets
```sh
kubectl apply -f k8s/namespace.yaml
# Generate a real master key
MEILI_KEY=$(openssl rand -base64 32)
kubectl -n gitea-search create secret generic meilisearch-secret \
--from-literal=master-key="$MEILI_KEY" \
--dry-run=client -o yaml | kubectl apply -f -
kubectl -n gitea-search create secret generic indexer-secret \
--from-literal=gitea-token="your-gitea-token" \
--from-literal=webhook-secret="your-webhook-secret" \
--dry-run=client -o yaml | kubectl apply -f -
```
### 2. Deploy MeiliSearch
```sh
kubectl apply -f k8s/meilisearch.yaml
```
### 3. Build and push container images
```sh
# Build indexer image
docker build --target indexer -t gitea.rspworks.tech/rpert/gitea-search:latest .
docker push gitea.rspworks.tech/rpert/gitea-search:latest
# Build MCP server image
docker build --target mcp-server -t gitea.rspworks.tech/rpert/gitea-search:mcp-server .
docker push gitea.rspworks.tech/rpert/gitea-search:mcp-server
```
### 4. Deploy indexer CronJob and webhook server
```sh
kubectl apply -f k8s/indexer-cronjob.yaml
```
### 5. Trigger initial index
```sh
kubectl -n gitea-search create job --from=cronjob/gitea-indexer gitea-indexer-initial
kubectl -n gitea-search logs -f job/gitea-indexer-initial
```
### 6. Configure Gitea webhook (optional)
In Gitea, go to Site Administration > Webhooks > Add Webhook:
- URL: `http://indexer-webhook.gitea-search.svc.cluster.local:8080/webhook`
- Content Type: `application/json`
- Secret: same as `WEBHOOK_SECRET`
- Events: Push only
## Indexing Details
- Clones each repo with `git clone --depth 1` (shallow, fast)
- Walks all files, skipping: `.git/`, `node_modules/`, `vendor/`, `__pycache__/`, binary files, lock files, images, archives
- Files larger than 50KB are skipped
- Binary detection: checks first 512 bytes for null bytes
- Full reindex clears the index first, then re-populates
- Webhook reindex deletes only the affected repo's documents, then re-indexes that repo
## License
MIT