# gitea-search

Full-text code search across all Gitea repositories, exposed as an MCP tool for Claude Code.

Indexes file content from a Gitea instance into MeiliSearch. Provides two interfaces: a CLI for indexing/searching and an MCP server (stdio JSON-RPC) that Claude Code can call as a tool.

## Architecture

```
               +-----------------+
               | Gitea Instance  |
               |   (33 repos)    |
               +--------+--------+
                        |
         +--------------+--------------+
         |                             |
git clone --depth 1              push webhook
         |                             |
         v                             v
+-------------------+      +-------------------+
|   indexer full    |      |  indexer webhook  |
|   (CronJob, 4h)   |      |(Deployment, :8080)|
+--------+----------+      +--------+----------+
         |                          |
         +----------+---------------+
                    |
                    v
           +-------------------+
           |    MeiliSearch    |
           |   (PVC-backed)    |
           +--------+----------+
                    |
                    v
           +-------------------+
           |    mcp-server     |
           | (stdio JSON-RPC)  |
           +-------------------+
                    ^
                    |
           +-------------------+
           |    Claude Code    |
           |   (MCP client)    |
           +-------------------+
```

## Components

| Command | Purpose |
|---------|---------|
| `indexer full` | Clone all repos, extract files, push to MeiliSearch |
| `indexer repo <owner/name>` | Re-index a single repo |
| `indexer webhook` | HTTP server (:8080) for Gitea push webhooks |
| `indexer search <query>` | CLI search for testing |
| `mcp-server` | MCP stdio server exposing `gitea_search` tool |

## Quick Start

### Prerequisites

- Go 1.22+
- MeiliSearch instance (v1.6+)
- Gitea instance with API token
- git (for cloning repos)

### Build

```sh
go build -o indexer ./cmd/indexer
go build -o mcp-server ./cmd/mcp-server
```

### Run a full index

```sh
export GITEA_TOKEN=your-token-here
export MEILI_URL=http://localhost:7700
./indexer full
```

### Test search

```sh
./indexer search "wireguard config" --type=conf --limit=5
```

### Run MCP server

```sh
export MEILI_URL=http://localhost:7700
./mcp-server
```

## Configuration

All configuration is via environment variables:

| Variable | Default | Description |
|----------|---------|-------------|
| `GITEA_URL` | `https://gitea.rspworks.tech` | Gitea instance URL |
| `GITEA_TOKEN` | *(required)* | Gitea API token |
| `MEILI_URL` | `http://localhost:7700` | MeiliSearch URL |
| `MEILI_KEY` | *(empty)* | MeiliSearch master key |
| `INDEX_NAME` | `gitea-code` | MeiliSearch index name |
| `WEBHOOK_SECRET` | *(empty)* | HMAC secret for Gitea webhook validation |

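For readers who want to see how the table maps to code, here is a minimal sketch of loading these variables with their documented defaults. The `Config` struct, `envOr`, and `LoadConfig` are hypothetical names for illustration and may not match the actual source. The sketch treats `GITEA_TOKEN` as required everywhere, although the Quick Start suggests `mcp-server` only needs `MEILI_URL`.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// Config mirrors the environment variables in the table above.
// The struct and its field names are hypothetical.
type Config struct {
	GiteaURL      string
	GiteaToken    string
	MeiliURL      string
	MeiliKey      string
	IndexName     string
	WebhookSecret string
}

// envOr returns the value of key, or def when the variable is unset or empty.
func envOr(key, def string) string {
	if v := os.Getenv(key); v != "" {
		return v
	}
	return def
}

// LoadConfig reads the documented variables and applies the documented defaults.
func LoadConfig() (Config, error) {
	cfg := Config{
		GiteaURL:      envOr("GITEA_URL", "https://gitea.rspworks.tech"),
		GiteaToken:    os.Getenv("GITEA_TOKEN"),
		MeiliURL:      envOr("MEILI_URL", "http://localhost:7700"),
		MeiliKey:      os.Getenv("MEILI_KEY"),
		IndexName:     envOr("INDEX_NAME", "gitea-code"),
		WebhookSecret: os.Getenv("WEBHOOK_SECRET"),
	}
	if cfg.GiteaToken == "" {
		return Config{}, errors.New("GITEA_TOKEN is required")
	}
	return cfg, nil
}

func main() {
	cfg, err := LoadConfig()
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Printf("indexing %s into %s (index %q)\n", cfg.GiteaURL, cfg.MeiliURL, cfg.IndexName)
}
```
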
## MCP Integration with Claude Code

### Option 1: Local binary

Add to `~/.claude/claude_code_config.json`:

```json
{
  "mcpServers": {
    "gitea-search": {
      "command": "/path/to/mcp-server",
      "env": {
        "MEILI_URL": "http://meilisearch.gitea-search.svc.cluster.local:7700",
        "MEILI_KEY": "your-master-key"
      }
    }
  }
}
```

### Option 2: Via Docker

```json
{
  "mcpServers": {
    "gitea-search": {
      "command": "docker",
      "args": [
        "run", "--rm", "-i",
        "-e", "MEILI_URL=http://host.docker.internal:7700",
        "gitea.rspworks.tech/rpert/gitea-search:mcp-server"
      ]
    }
  }
}
```

### Tool usage

Once configured, Claude Code can call the `gitea_search` tool:

```
Search for "wireguard" across all repos
Search for "backup" in repo "rpert/infra-ssh" with filetype "sh"
```

Tool parameters:

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `query` | string | yes | Search terms |
| `repo` | string | no | Filter by repo full name (e.g., `rpert/infra-ssh`) |
| `filetype` | string | no | Filter by extension (e.g., `go`, `md`, `yaml`) |
| `limit` | integer | no | Max results (default: 10) |

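On the wire, an MCP client invokes a tool with a JSON-RPC `tools/call` request. The sketch below builds such a request for `gitea_search` using the parameters from the table. The Go type and function names (`SearchArgs`, `BuildSearchCall`) are hypothetical, and a real session would first go through the MCP `initialize` handshake, which is omitted here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SearchArgs mirrors the tool parameters in the table above.
// Optional fields are omitted from the JSON when left at their zero value.
type SearchArgs struct {
	Query    string `json:"query"`
	Repo     string `json:"repo,omitempty"`
	Filetype string `json:"filetype,omitempty"`
	Limit    int    `json:"limit,omitempty"`
}

// callParams is the params object of a tools/call request.
type callParams struct {
	Name      string     `json:"name"`
	Arguments SearchArgs `json:"arguments"`
}

// rpcRequest is a JSON-RPC 2.0 request envelope.
type rpcRequest struct {
	JSONRPC string     `json:"jsonrpc"`
	ID      int        `json:"id"`
	Method  string     `json:"method"`
	Params  callParams `json:"params"`
}

// BuildSearchCall encodes a tools/call request for the gitea_search tool.
func BuildSearchCall(id int, args SearchArgs) ([]byte, error) {
	return json.Marshal(rpcRequest{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "tools/call",
		Params:  callParams{Name: "gitea_search", Arguments: args},
	})
}

func main() {
	msg, err := BuildSearchCall(1, SearchArgs{
		Query:    "backup",
		Repo:     "rpert/infra-ssh",
		Filetype: "sh",
	})
	if err != nil {
		panic(err)
	}
	// One JSON message per line, as a stdio client would write it to the server's stdin.
	fmt.Println(string(msg))
}
```

Piping a line like this into `./mcp-server` by hand can be a quick way to debug the server without Claude Code in the loop, assuming the server accepts newline-delimited JSON on stdin.
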
## MeiliSearch Document Schema

```json
{
  "id": "sha256(repo+branch+path)",
  "repo": "rpert/infra-ssh",
  "branch": "main",
  "path": "docs/mail-setup.md",
  "filename": "mail-setup.md",
  "extension": "md",
  "content": "file content (up to 50KB)",
  "language": "markdown",
  "updated_at": 1712534400
}
```

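The `id` field is described as a SHA-256 over repo, branch, and path. A minimal sketch of deriving such an ID follows. How the three fields are joined is not specified above, so the NUL separator is an assumption made here to keep `("a", "bc")` and `("ab", "c")` from colliding. `DocumentID` is a hypothetical name.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// DocumentID derives a stable document ID from repo, branch and path.
// Assumption: the fields are joined with a NUL byte before hashing; the
// real indexer may concatenate them differently.
func DocumentID(repo, branch, path string) string {
	sum := sha256.Sum256([]byte(repo + "\x00" + branch + "\x00" + path))
	// Hex output contains only [0-9a-f], which fits MeiliSearch's
	// restrictions on primary key values.
	return hex.EncodeToString(sum[:])
}

func main() {
	fmt.Println(DocumentID("rpert/infra-ssh", "main", "docs/mail-setup.md"))
}
```

Because the ID depends only on repo, branch, and path, re-indexing the same file overwrites the existing document instead of creating a duplicate.
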
- Searchable: `content`, `path`, `filename`, `repo`
- Filterable: `repo`, `extension`, `branch`
- Displayed: all fields except `content` (snippets returned via highlighting)

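The filterable attributes are what the `repo` and `filetype` tool parameters would presumably translate into. The sketch below builds a MeiliSearch filter expression from them. `BuildFilter` and `quoteValue` are hypothetical names, and the quote escaping is a simplification that handles only embedded double quotes.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteValue wraps v in double quotes and backslash-escapes any embedded
// double quote. Other special characters are not handled in this sketch.
func quoteValue(v string) string {
	return `"` + strings.ReplaceAll(v, `"`, `\"`) + `"`
}

// BuildFilter turns the optional repo and filetype parameters into a
// filter expression over the `repo` and `extension` attributes.
// An empty result means "no filter".
func BuildFilter(repo, filetype string) string {
	var clauses []string
	if repo != "" {
		clauses = append(clauses, "repo = "+quoteValue(repo))
	}
	if filetype != "" {
		// Accept both "sh" and ".sh"; the schema stores the extension without a dot.
		ext := strings.TrimPrefix(filetype, ".")
		clauses = append(clauses, "extension = "+quoteValue(ext))
	}
	return strings.Join(clauses, " AND ")
}

func main() {
	fmt.Println(BuildFilter("rpert/infra-ssh", "sh"))
	// prints: repo = "rpert/infra-ssh" AND extension = "sh"
}
```
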
## K8s Deployment

### 1. Create namespace and secrets

```sh
kubectl apply -f k8s/namespace.yaml

# Generate a real master key
MEILI_KEY=$(openssl rand -base64 32)

kubectl -n gitea-search create secret generic meilisearch-secret \
  --from-literal=master-key="$MEILI_KEY" \
  --dry-run=client -o yaml | kubectl apply -f -

kubectl -n gitea-search create secret generic indexer-secret \
  --from-literal=gitea-token="your-gitea-token" \
  --from-literal=webhook-secret="your-webhook-secret" \
  --dry-run=client -o yaml | kubectl apply -f -
```

### 2. Deploy MeiliSearch

```sh
kubectl apply -f k8s/meilisearch.yaml
```

### 3. Build and push container images

```sh
# Build indexer image
docker build --target indexer -t gitea.rspworks.tech/rpert/gitea-search:latest .
docker push gitea.rspworks.tech/rpert/gitea-search:latest

# Build MCP server image
docker build --target mcp-server -t gitea.rspworks.tech/rpert/gitea-search:mcp-server .
docker push gitea.rspworks.tech/rpert/gitea-search:mcp-server
```

### 4. Deploy indexer CronJob and webhook server

```sh
kubectl apply -f k8s/indexer-cronjob.yaml
```

### 5. Trigger initial index

```sh
kubectl -n gitea-search create job --from=cronjob/gitea-indexer gitea-indexer-initial
kubectl -n gitea-search logs -f job/gitea-indexer-initial
```

### 6. Configure Gitea webhook (optional)

In Gitea, go to Site Administration > Webhooks > Add Webhook:

- URL: `http://indexer-webhook.gitea-search.svc.cluster.local:8080/webhook`
- Content Type: `application/json`
- Secret: same as `WEBHOOK_SECRET`
- Events: Push only

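When a secret is set, Gitea signs each delivery with an HMAC-SHA256 of the request body and sends the hex digest in the `X-Gitea-Signature` header. A minimal sketch of the check a webhook handler could perform is shown below. `ValidSignature` is a hypothetical name, and the actual `indexer webhook` handler may differ.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// ValidSignature reports whether signatureHex is the hex-encoded
// HMAC-SHA256 of body under secret. The comparison is constant-time.
func ValidSignature(secret, body []byte, signatureHex string) bool {
	got, err := hex.DecodeString(signatureHex)
	if err != nil {
		return false // not valid hex, so it cannot be a valid signature
	}
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return hmac.Equal(got, mac.Sum(nil))
}

func main() {
	secret := []byte("your-webhook-secret")
	body := []byte(`{"ref":"refs/heads/main"}`)

	// Compute the signature the way the sender would, then verify it.
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	sig := hex.EncodeToString(mac.Sum(nil))

	fmt.Println(ValidSignature(secret, body, sig))        // true
	fmt.Println(ValidSignature(secret, body, "deadbeef")) // false
}
```

In an HTTP handler, the body has to be read in full before it is parsed, since the signature covers the raw bytes rather than the decoded JSON.
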
## Indexing Details

- Clones each repo with `git clone --depth 1` (shallow, fast)
- Walks all files, skipping: `.git/`, `node_modules/`, `vendor/`, `__pycache__/`, binary files, lock files, images, archives
- Files >50KB are skipped
- Binary detection: checks first 512 bytes for null bytes
- Full reindex clears the index first, then re-populates
- Webhook reindex deletes only the affected repo's documents, then re-indexes that repo

## License

MIT