From b106eed92dc41c803d9be34d5fcdf63e96f51411 Mon Sep 17 00:00:00 2001
From: Rolando Santamaria Maso
Date: Sun, 7 Jun 2026 15:55:40 +0200
Subject: [PATCH 1/2] feat(vision): add vision tool using MiniCPM-V 4.6 (1.3B) via llama-mtmd-cli
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Adds a `vision` built-in tool that analyses images and videos locally using
MiniCPM-V 4.6 — a 1.3B multimodal model running via llama.cpp's llama-mtmd-cli,
with no cloud API required.

## What's new

**Tool (`vision`)**
- Accepts images (JPEG, PNG, GIF, WebP, BMP) and videos (MP4, MOV, AVI, MKV, WebM)
- Videos: ffprobe reads duration, ffmpeg extracts N evenly-spaced frames,
  all frames sent as a multi-image call to the model (configurable via
  `video_frames`, default 8)
- Security: O_NOFOLLOW open (symlink protection), danger.CheckOperation
  classification, all output wrapped in wrapUntrusted() with provenance tag
- Setup instructions in every error path (missing binary, missing model,
  missing mmproj, missing ffmpeg)

**Docker (`docker/Dockerfile`)**
- New `minicpm` multi-stage build: downloads pre-built llama-mtmd-cli
  (llama.cpp b9549) for amd64/arm64 from the official GitHub release, then
  fetches MiniCPM-V-4_6-Q4_K_M.gguf (529 MB) and mmproj-model-f16.gguf (1.1 GB)
  from HuggingFace into /usr/local/share/minicpm-v/models/
- Overridable via --build-arg MINICPM_QUANT=Q8_0 and LLAMA_VERSION
- Runtime stage copies binary + models; no new runtime deps (libstdc++6
  already present for whisper)

**Config (`internal/config/loader.go`)**
- New VisionConfig struct: ModelsDir, BinaryPath, VideoFrames
- Wired into FileConfig, ResolvedConfig, resolveVision(), mergeFile()

**Tests**
- 13 tests in cmd/odek/vision_tool_test.go: empty path, invalid JSON,
  file not found, symlink rejected, missing binary, missing model,
  missing mmproj, mock happy-path image (4 extensions), custom prompt,
  mock happy-path video (with mock ffprobe+ffmpeg via PATH override),
  missing ffmpeg fallback, schema shape
- 3 tests in internal/config/vision_test.go: resolveVision defaults,
  zero-frames backfill, custom values round-trip

**Docs**
- docs/CHEATSHEET.md: new Image & Video Understanding section with config
  snippet and field reference
- docs/SECURITY.md: vision added to untrusted-content table, always-untrusted
  list, and skills provenance gate paragraph
- docs/CONFIG.md + docs/TELEGRAM.md: smart-previews bullet updated
- docker/README.md: new Image & video understanding (out of the box) section
- README.md: vision added to external-content ingestion list

Co-Authored-By: Claude Sonnet 4.6
---
 README.md                            |   2 +-
 cmd/odek/injection_hardening_test.go |   2 +-
 cmd/odek/main.go                     |   7 +-
 cmd/odek/main_test.go                |   2 +-
 cmd/odek/mcp.go                      |   2 +-
 cmd/odek/repl.go                     |   2 +-
 cmd/odek/schedule.go                 |   2 +-
 cmd/odek/serve.go                    |   2 +-
 cmd/odek/subagent.go                 |   2 +-
 cmd/odek/subagent_contract_test.go   |   6 +-
 cmd/odek/telegram.go                 |   2 +-
 cmd/odek/vision_tool.go              | 320 +++++++++++++++++++++++++
 cmd/odek/vision_tool_test.go         | 338 +++++++++++++++++++++++++++
 docker/Dockerfile                    |  43 ++++
 docker/README.md                     |  19 ++
 docs/CHEATSHEET.md                   |  19 ++
 docs/CONFIG.md                       |   2 +-
 docs/SECURITY.md                     |   5 +-
 docs/TELEGRAM.md                     |   2 +-
 internal/config/loader.go            |  39 ++++
 internal/config/vision_test.go       |  40 ++++
 21 files changed, 839 insertions(+), 19 deletions(-)
 create mode 100644 cmd/odek/vision_tool.go
 create mode 100644 cmd/odek/vision_tool_test.go
 create mode 100644 internal/config/vision_test.go

diff --git a/README.md b/README.md
index ca74f31..b93a577 100644
--- a/README.md
+++ b/README.md
@@ -36,7 +36,7 @@ odek is not a framework. It's a **runtime** — the smallest possible surface ar
 Every session can run in an isolated Docker container: no network, no host mounts beyond the working directory, zero capabilities, destroyed on exit. `odek serve` enables the sandbox **by default**; `odek run` keeps it opt-in but warns when running unsandboxed. `--ctx` files are auto-injected into the container at `/workspace/`. Full security model in [docs/SANDBOXING.md](docs/SANDBOXING.md).
 
 ### 🛡️ Prompt-Injection-Aware
-External content the agent ingests (`browser`, `read_file`, `shell`, `search_files`, `multi_grep`, `transcribe`, `session_search`, MCP tools) is wrapped in per-call nonce'd `` boundaries so the model can distinguish data from instructions. Redirect hops are re-classified (`browser`/`http_batch`), MCP tool descriptions are scanned for injection at registration, and the MCP error channel is wrapped too. The danger classifier resists 8 known shell-evasion tricks (`$()`, backticks, `$IFS`, `command`/`exec`, `\rm`, basenamed absolute paths). Approvers engage friction mode after 3 same-class approvals in 60 s. Memory episodes from tainted sessions are stored but never auto-replayed. Skill auto-save tracks provenance and pins untrusted suggestions for explicit `odek skill promote`. `odek audit ` surfaces every ingest + per-turn divergence heuristic. Full threat model in [docs/SECURITY.md](docs/SECURITY.md).
+External content the agent ingests (`browser`, `read_file`, `shell`, `search_files`, `multi_grep`, `transcribe`, `vision`, `session_search`, MCP tools) is wrapped in per-call nonce'd `` boundaries so the model can distinguish data from instructions. Redirect hops are re-classified (`browser`/`http_batch`), MCP tool descriptions are scanned for injection at registration, and the MCP error channel is wrapped too. The danger classifier resists 8 known shell-evasion tricks (`$()`, backticks, `$IFS`, `command`/`exec`, `\rm`, basenamed absolute paths). Approvers engage friction mode after 3 same-class approvals in 60 s. Memory episodes from tainted sessions are stored but never auto-replayed. Skill auto-save tracks provenance and pins untrusted suggestions for explicit `odek skill promote`. `odek audit ` surfaces every ingest + per-turn divergence heuristic. Full threat model in [docs/SECURITY.md](docs/SECURITY.md).
 
 ### 🧩 Sub-Agent Delegation
 Parallel OS-process sub-agents via `delegate_tasks`. True isolation — each sub-agent is a fresh `odek subagent` process with its own config, tools, and termination timeout. Up to 8 concurrent workers. [docs/SUBAGENTS.md](docs/SUBAGENTS.md)
diff --git a/cmd/odek/injection_hardening_test.go b/cmd/odek/injection_hardening_test.go
index b5c459e..d19a220 100644
--- a/cmd/odek/injection_hardening_test.go
+++ b/cmd/odek/injection_hardening_test.go
@@ -244,7 +244,7 @@ func TestBuiltinTools_SessionSearchWrappedAsUntrusted(t *testing.T) {
 	store, cleanup := seedSessionStore(t)
 	defer cleanup()
 
-	tools := builtinTools(danger.DangerousConfig{}, nil, nil, 4, "", config.TranscriptionConfig{}, store)
+	tools := builtinTools(danger.DangerousConfig{}, nil, nil, 4, "", config.TranscriptionConfig{}, config.VisionConfig{}, store)
 
 	var ss odek.Tool
 	for _, tool := range tools {
diff --git a/cmd/odek/main.go b/cmd/odek/main.go
index 85c335a..4830d09 100644
--- a/cmd/odek/main.go
+++ b/cmd/odek/main.go
@@ -779,7 +779,7 @@ func run(args []string) error {
 
 	// Sandbox setup
 	var sandboxCleanup func() error
-	tools := builtinTools(resolved.Dangerous, sm, nil, resolved.MaxConcurrency, resolved.APIKey, resolved.Transcription, nil)
+	tools := builtinTools(resolved.Dangerous, sm, nil, resolved.MaxConcurrency, resolved.APIKey, resolved.Transcription, resolved.Vision, nil)
 
 	// MCP server tools
 	var mcpCleanup func()
@@ -1054,7 +1054,7 @@ func setupSandbox(tools []odek.Tool, cfg sandboxConfig) (containerName string, c
 	return containerName, cleanup, nil
 }
 
-func builtinTools(dc danger.DangerousConfig, sm *skills.SkillManager, approver danger.Approver, maxConcurrency int, apiKey string, tc config.TranscriptionConfig, store *session.Store) []odek.Tool {
+func builtinTools(dc danger.DangerousConfig, sm *skills.SkillManager, approver danger.Approver, maxConcurrency int, apiKey string, tc config.TranscriptionConfig, vc config.VisionConfig, store *session.Store) []odek.Tool {
 	tools := []odek.Tool{
 		&shellTool{
 			dangerousConfig: dc,
@@ -1089,6 +1089,7 @@ func builtinTools(dc danger.DangerousConfig, sm *skills.SkillManager, approver d
 		&trTool{dangerousConfig: dc},
 		&wordCountTool{dangerousConfig: dc},
 		newTranscribeTool(dc, tc),
+		newVisionTool(dc, vc),
 		// session_search returns content from arbitrary past sessions —
 		// including sessions that ingested untrusted content. That path
 		// otherwise bypasses the memory taint gate and the audit log, so
@@ -1598,7 +1599,7 @@ func continueCmd(args []string) error {
 			"./.odek/skills",
 		)
 	}
-	tools := builtinTools(resolved.Dangerous, sm, nil, resolved.MaxConcurrency, resolved.APIKey, resolved.Transcription, store)
+	tools := builtinTools(resolved.Dangerous, sm, nil, resolved.MaxConcurrency, resolved.APIKey, resolved.Transcription, resolved.Vision, store)
 	var sandboxCleanup func() error
 
 	// MCP server tools
diff --git a/cmd/odek/main_test.go b/cmd/odek/main_test.go
index 67b8d62..619d412 100644
--- a/cmd/odek/main_test.go
+++ b/cmd/odek/main_test.go
@@ -203,7 +203,7 @@ func TestRun_NoAPIKey(t *testing.T) {
 }
 
 func TestBuiltinTools(t *testing.T) {
-	tools := builtinTools(danger.DangerousConfig{}, nil, nil, 3, "", config.TranscriptionConfig{}, nil)
+	tools := builtinTools(danger.DangerousConfig{}, nil, nil, 3, "", config.TranscriptionConfig{}, config.VisionConfig{}, nil)
 	if len(tools) == 0 {
 		t.Fatal("builtinTools() returned empty slice")
 	}
diff --git a/cmd/odek/mcp.go b/cmd/odek/mcp.go
index 993f260..55f604f 100644
--- a/cmd/odek/mcp.go
+++ b/cmd/odek/mcp.go
@@ -73,7 +73,7 @@ Flags:
 	}
 
 	// Build tools
-	toolSet := builtinTools(resolved.Dangerous, sm, nil, resolved.MaxConcurrency, resolved.APIKey, config.TranscriptionConfig{}, nil)
+	toolSet := builtinTools(resolved.Dangerous, sm, nil, resolved.MaxConcurrency, resolved.APIKey, config.TranscriptionConfig{}, config.VisionConfig{}, nil)
 
 	// MCP server tools — connect and discover before sandbox
 	var mcpCleanup func()
diff --git a/cmd/odek/repl.go b/cmd/odek/repl.go
index 4ae42d5..bd2fcae 100644
--- a/cmd/odek/repl.go
+++ b/cmd/odek/repl.go
@@ -77,7 +77,7 @@ func replCmd(args []string) error {
 			"./.odek/skills",
 		)
 	}
-	tools := builtinTools(resolved.Dangerous, sm, nil, resolved.MaxConcurrency, resolved.APIKey, config.TranscriptionConfig{}, nil)
+	tools := builtinTools(resolved.Dangerous, sm, nil, resolved.MaxConcurrency, resolved.APIKey, config.TranscriptionConfig{}, config.VisionConfig{}, nil)
 	var sandboxCleanup func() error
 
 	// MCP server tools
diff --git a/cmd/odek/schedule.go b/cmd/odek/schedule.go
index 0e21111..2858370 100644
--- a/cmd/odek/schedule.go
+++ b/cmd/odek/schedule.go
@@ -570,7 +570,7 @@ func runTaskHeadless(ctx context.Context, resolved config.ResolvedConfig, system
 		resolved.Dangerous.NonInteractive = &deny
 	}
 
-	tools := builtinTools(resolved.Dangerous, nil, nil, resolved.MaxConcurrency, resolved.APIKey, resolved.Transcription, nil)
+	tools := builtinTools(resolved.Dangerous, nil, nil, resolved.MaxConcurrency, resolved.APIKey, resolved.Transcription, resolved.Vision, nil)
 	tools = append(tools, mcpTools...)
 
 	// Capture cumulative token usage from the final iteration so the Runner
diff --git a/cmd/odek/serve.go b/cmd/odek/serve.go
index 676bde2..b37a06d 100644
--- a/cmd/odek/serve.go
+++ b/cmd/odek/serve.go
@@ -267,7 +267,7 @@ func newServeAgent(resolved config.ResolvedConfig, system string, sendFn func(v
 	approver := newWSApprover(sendFn)
 	resolved.Dangerous.Approver = approver
 
-	tools := builtinTools(resolved.Dangerous, sm, approver, resolved.MaxConcurrency, resolved.APIKey, config.TranscriptionConfig{}, nil)
+	tools := builtinTools(resolved.Dangerous, sm, approver, resolved.MaxConcurrency, resolved.APIKey, config.TranscriptionConfig{}, config.VisionConfig{}, nil)
 
 	// Find the delegateTasksTool to wire up sub-agent log streaming
 	var subagentTool *delegateTasksTool
diff --git a/cmd/odek/subagent.go b/cmd/odek/subagent.go
index dea620f..b4f8275 100644
--- a/cmd/odek/subagent.go
+++ b/cmd/odek/subagent.go
@@ -291,7 +291,7 @@ func subagentCmd(args []string) error {
 			"./.odek/skills",
 		)
 	}
-	tools := builtinTools(resolved.Dangerous, sm, nil, resolved.MaxConcurrency, resolved.APIKey, config.TranscriptionConfig{}, nil)
+	tools := builtinTools(resolved.Dangerous, sm, nil, resolved.MaxConcurrency, resolved.APIKey, config.TranscriptionConfig{}, config.VisionConfig{}, nil)
 	var sandboxCleanup func() error
 
 	// MCP server tools
diff --git a/cmd/odek/subagent_contract_test.go b/cmd/odek/subagent_contract_test.go
index f5e24b0..ffb9221 100644
--- a/cmd/odek/subagent_contract_test.go
+++ b/cmd/odek/subagent_contract_test.go
@@ -320,7 +320,7 @@ func TestSubagent_ExitCodeThree(t *testing.T) {
 // ── 4. delegate_tasks Tool Schema ───────────────────────────────────
 
 func TestDelegateTasksTool_Exists(t *testing.T) {
-	tools := builtinTools(danger.DangerousConfig{}, nil, nil, 3, "", config.TranscriptionConfig{}, nil)
+	tools := builtinTools(danger.DangerousConfig{}, nil, nil, 3, "", config.TranscriptionConfig{}, config.VisionConfig{}, nil)
 	if len(tools) == 0 {
 		t.Fatal("builtinTools() returned empty slice")
 	}
@@ -338,7 +338,7 @@ func TestDelegateTasksTool_Exists(t *testing.T) {
 }
 
 func TestDelegateTasksTool_HasSchema(t *testing.T) {
-	tools := builtinTools(danger.DangerousConfig{}, nil, nil, 3, "", config.TranscriptionConfig{}, nil)
+	tools := builtinTools(danger.DangerousConfig{}, nil, nil, 3, "", config.TranscriptionConfig{}, config.VisionConfig{}, nil)
 
 	var tool odek.Tool
 	for _, t2 := range tools {
@@ -432,7 +432,7 @@ func TestDelegateTasksTool_HasSchema(t *testing.T) {
 }
 
 func TestDelegateTasksTool_Description(t *testing.T) {
-	tools := builtinTools(danger.DangerousConfig{}, nil, nil, 3, "", config.TranscriptionConfig{}, nil)
+	tools := builtinTools(danger.DangerousConfig{}, nil, nil, 3, "", config.TranscriptionConfig{}, config.VisionConfig{}, nil)
 
 	var tool odek.Tool
 	for _, t2 := range tools {
diff --git a/cmd/odek/telegram.go b/cmd/odek/telegram.go
index a7dd0fd..d88bfac 100644
--- a/cmd/odek/telegram.go
+++ b/cmd/odek/telegram.go
@@ -1078,7 +1078,7 @@ func handleChatMessage(
 	}
 
 	// Build the agent with Telegram approver.
-	tools := builtinTools(resolved.Dangerous, nil, approver, resolved.MaxConcurrency, resolved.APIKey, resolved.Transcription, sessionManager.Store)
+	tools := builtinTools(resolved.Dangerous, nil, approver, resolved.MaxConcurrency, resolved.APIKey, resolved.Transcription, resolved.Vision, sessionManager.Store)
 
 	modelLabel := odek.ProfileLabel(resolved.Model)
 	if modelLabel == "" {
diff --git a/cmd/odek/vision_tool.go b/cmd/odek/vision_tool.go
new file mode 100644
index 0000000..d9f57dd
--- /dev/null
+++ b/cmd/odek/vision_tool.go
@@ -0,0 +1,320 @@
+package main
+
+import (
+	"encoding/json"
+	"fmt"
+	"os"
+	"os/exec"
+	"path/filepath"
+	"strconv"
+	"strings"
+	"syscall"
+
+	"github.com/BackendStack21/odek"
+	"github.com/BackendStack21/odek/internal/config"
+	"github.com/BackendStack21/odek/internal/danger"
+)
+
+var videoExts = map[string]bool{
+	".mp4": true, ".mov": true, ".avi": true, ".mkv": true,
+	".webm": true, ".m4v": true, ".flv": true, ".wmv": true,
+}
+
+// llamaMtmdBinary locates the llama-mtmd-cli binary.
+// Priority: cfg.BinaryPath > PATH search.
+func llamaMtmdBinary(cfg config.VisionConfig) (string, error) {
+	if cfg.BinaryPath != "" {
+		if _, err := os.Stat(cfg.BinaryPath); err == nil {
+			return cfg.BinaryPath, nil
+		}
+		return "", fmt.Errorf("llama-mtmd-cli not found at configured path %q", cfg.BinaryPath)
+	}
+	if path, err := exec.LookPath("llama-mtmd-cli"); err == nil {
+		return path, nil
+	}
+	return "", fmt.Errorf(`llama-mtmd-cli not found on PATH.
+
+The vision tool requires llama.cpp's multimodal CLI (build b9549+).
+
+To install manually:
+  git clone --depth 1 --branch b9549 https://github.com/ggerganov/llama.cpp
+  cd llama.cpp
+  cmake -B build -DCMAKE_BUILD_TYPE=Release -DGGML_NATIVE=OFF -DLLAMA_CURL=OFF
+  cmake --build build -j$(nproc) --target llama-mtmd-cli
+  install build/bin/llama-mtmd-cli /usr/local/bin/
+
+Or set binary_path in the vision config.`)
+}
+
+// visionModelPaths resolves the model.gguf and mmproj.gguf paths.
+// Priority: cfg.ModelsDir > Docker image path > ~/.odek/minicpm-v/models.
+func visionModelPaths(cfg config.VisionConfig) (modelPath, mmprojPath string, err error) {
+	dir := cfg.ModelsDir
+	if dir == "" {
+		// Docker image baked path (see docker/Dockerfile minicpm stage)
+		const dockerPath = "/usr/local/share/minicpm-v/models"
+		if _, statErr := os.Stat(filepath.Join(dockerPath, "model.gguf")); statErr == nil {
+			dir = dockerPath
+		} else {
+			home, homeErr := os.UserHomeDir()
+			if homeErr != nil {
+				return "", "", fmt.Errorf("cannot determine home directory: %v", homeErr)
+			}
+			dir = filepath.Join(home, ".odek", "minicpm-v", "models")
+		}
+	}
+
+	mp := filepath.Join(dir, "model.gguf")
+	mmp := filepath.Join(dir, "mmproj.gguf")
+
+	if _, err := os.Stat(mp); err != nil {
+		return "", "", fmt.Errorf(`MiniCPM-V model not found at %q.
+
+Download and install:
+  mkdir -p %s
+  cd %s
+  curl -LO "https://huggingface.co/openbmb/MiniCPM-V-4_6-gguf/resolve/main/MiniCPM-V-4_6-Q4_K_M.gguf"
+  mv MiniCPM-V-4_6-Q4_K_M.gguf model.gguf
+  curl -LO "https://huggingface.co/openbmb/MiniCPM-V-4_6-gguf/resolve/main/mmproj-model-f16.gguf"
+  mv mmproj-model-f16.gguf mmproj.gguf
+
+Or set models_dir in the vision config.`, mp, dir, dir)
+	}
+	if _, err := os.Stat(mmp); err != nil {
+		return "", "", fmt.Errorf("MiniCPM-V projector not found at %q — download mmproj-model-f16.gguf to %s and rename to mmproj.gguf", mmp, dir)
+	}
+	return mp, mmp, nil
+}
+
+// extractVideoFrames samples n evenly-spaced frames from videoPath into a
+// temporary directory. Returns paths to the JPEG frame files; caller must
+// remove the directory (filepath.Dir of the first path).
+func extractVideoFrames(videoPath string, n int) ([]string, error) {
+	if _, err := exec.LookPath("ffmpeg"); err != nil {
+		return nil, fmt.Errorf("ffmpeg not found — required for video frame extraction")
+	}
+	if _, err := exec.LookPath("ffprobe"); err != nil {
+		return nil, fmt.Errorf("ffprobe not found — required to read video duration")
+	}
+
+	// Get duration with ffprobe
+	out, err := exec.Command("ffprobe",
+		"-v", "error",
+		"-show_entries", "format=duration",
+		"-of", "csv=p=0",
+		videoPath,
+	).Output()
+	if err != nil {
+		return nil, fmt.Errorf("ffprobe failed: %v", err)
+	}
+	var duration float64
+	fmt.Sscanf(strings.TrimSpace(string(out)), "%f", &duration)
+	if duration <= 0 {
+		duration = 60
+	}
+
+	tmpDir, err := os.MkdirTemp("", "odek-vision-*")
+	if err != nil {
+		return nil, fmt.Errorf("cannot create temp dir: %v", err)
+	}
+
+	// Extract frames at evenly-spaced timestamps, avoiding the very start/end
+	interval := duration / float64(n+1)
+	var frames []string
+	for i := 1; i <= n; i++ {
+		ts := interval * float64(i)
+		out := filepath.Join(tmpDir, fmt.Sprintf("frame_%02d.jpg", i))
+		cmd := exec.Command("ffmpeg",
+			"-ss", fmt.Sprintf("%.3f", ts),
+			"-i", videoPath,
+			"-frames:v", "1",
+			"-q:v", "2",
+			"-y",
+			out,
+		)
+		if cmd.Run() == nil {
+			frames = append(frames, out)
+		}
+	}
+
+	if len(frames) == 0 {
+		os.RemoveAll(tmpDir)
+		return nil, fmt.Errorf("no frames could be extracted from %q", videoPath)
+	}
+	return frames, nil
+}
+
+// runLlamaMtmd calls llama-mtmd-cli in single-turn mode with one or more images
+// and returns the trimmed stdout response.
+func runLlamaMtmd(binary, modelPath, mmprojPath, prompt string, imagePaths []string) (string, error) {
+	args := []string{
+		"-m", modelPath,
+		"--mmproj", mmprojPath,
+		"-c", "4096",
+		"--temp", "0.7",
+		"--top-p", "0.8",
+		"--top-k", "100",
+		"--repeat-penalty", "1.05",
+		"-n", strconv.Itoa(1024),
+		"-p", prompt,
+	}
+	for _, img := range imagePaths {
+		args = append(args, "--image", img)
+	}
+
+	cmd := exec.Command(binary, args...)
+	output, err := cmd.Output()
+	if err != nil {
+		if exitErr, ok := err.(*exec.ExitError); ok {
+			return "", fmt.Errorf("llama-mtmd-cli failed (exit %d): %s",
+				exitErr.ExitCode(), strings.TrimSpace(string(exitErr.Stderr)))
+		}
+		return "", fmt.Errorf("llama-mtmd-cli failed: %v", err)
+	}
+	return strings.TrimSpace(string(output)), nil
+}
+
+// ═════════════════════════════════════════════════════════════════════════
+// vision Tool
+// ═════════════════════════════════════════════════════════════════════════
+
+type visionTool struct {
+	dangerousConfig danger.DangerousConfig
+	visionCfg       config.VisionConfig
+}
+
+func newVisionTool(dc danger.DangerousConfig, vc config.VisionConfig) *visionTool {
+	return &visionTool{dangerousConfig: dc, visionCfg: vc}
+}
+
+func (t *visionTool) Name() string { return "vision" }
+func (t *visionTool) Description() string {
+	return `Analyze an image or video file using MiniCPM-V 4.6, a local 1.3B multimodal model (llama-mtmd-cli). Images are described directly; videos are sampled into evenly-spaced frames and analyzed together. Supports JPEG, PNG, GIF, WebP, BMP for images and MP4, MOV, AVI, MKV, WebM for video. Requires llama-mtmd-cli and MiniCPM-V 4.6 model files (bundled in the Docker image).`
+}
+
+type visionArgs struct {
+	Path   string `json:"path"`
+	Prompt string `json:"prompt,omitempty"`
+}
+
+type visionResult struct {
+	Description string `json:"description"`
+	Model       string `json:"model"`
+	Type        string `json:"type"` // "image" or "video"
+	Frames      int    `json:"frames,omitempty"`
+	Error       string `json:"error,omitempty"`
+}
+
+func (t *visionTool) Schema() any {
+	return map[string]any{
+		"type": "object",
+		"properties": map[string]any{
+			"path": map[string]any{
+				"type":        "string",
+				"description": "Path to an image (JPEG, PNG, GIF, WebP, BMP) or video file (MP4, MOV, AVI, MKV, WebM).",
+			},
+			"prompt": map[string]any{
+				"type":        "string",
+				"description": `Instruction or question for the model. Default: "Describe this in detail."`,
+			},
+		},
+		"required": []string{"path"},
+	}
+}
+
+func (t *visionTool) Call(argsJSON string) (result string, err error) {
+	defer func() {
+		if r := recover(); r != nil {
+			err = fmt.Errorf("vision: panic: %v", r)
+			result = `{"error":"internal error"}`
+		}
+	}()
+
+	var args visionArgs
+	if err := json.Unmarshal([]byte(argsJSON), &args); err != nil {
+		return jsonError("invalid arguments: " + err.Error())
+	}
+	if args.Path == "" {
+		return jsonError("path is required")
+	}
+	prompt := args.Prompt
+	if prompt == "" {
+		prompt = "Describe this in detail."
+	}
+
+	// Security: classify the file path
+	if err := t.dangerousConfig.CheckOperation(danger.ToolOperation{
+		Name: "vision", Resource: args.Path, Risk: danger.ClassifyPath(args.Path),
+	}, nil); err != nil {
+		return jsonError(err.Error())
+	}
+
+	// Check file exists (O_NOFOLLOW prevents symlink attacks)
+	f, err := os.OpenFile(args.Path, os.O_RDONLY|syscall.O_NOFOLLOW, 0)
+	if err != nil {
+		return jsonResult(visionResult{
+			Error: fmt.Sprintf("cannot open file %q: %v", args.Path, err),
+		})
+	}
+	f.Close()
+
+	binary, err := llamaMtmdBinary(t.visionCfg)
+	if err != nil {
+		return jsonResult(visionResult{Error: err.Error()})
+	}
+	modelPath, mmprojPath, err := visionModelPaths(t.visionCfg)
+	if err != nil {
+		return jsonResult(visionResult{Error: err.Error()})
+	}
+
+	ext := strings.ToLower(filepath.Ext(args.Path))
+	source := "vision:" + args.Path
+
+	if videoExts[ext] {
+		return t.analyzeVideo(binary, modelPath, mmprojPath, args.Path, prompt, source)
+	}
+	return t.analyzeImage(binary, modelPath, mmprojPath, args.Path, prompt, source)
+}
+
+func (t *visionTool) analyzeImage(binary, modelPath, mmprojPath, imgPath, prompt, source string) (string, error) {
+	desc, err := runLlamaMtmd(binary, modelPath, mmprojPath, prompt, []string{imgPath})
+	if err != nil {
+		return jsonResult(visionResult{Error: err.Error()})
+	}
+	return jsonResult(visionResult{
+		Description: wrapUntrusted(source, desc),
+		Model:       "minicpm-v-4.6",
+		Type:        "image",
+	})
+}
+
+func (t *visionTool) analyzeVideo(binary, modelPath, mmprojPath, videoPath, prompt, source string) (string, error) {
+	n := t.visionCfg.VideoFrames
+	if n <= 0 {
+		n = 8
+	}
+
+	frames, err := extractVideoFrames(videoPath, n)
+	if err != nil {
+		return jsonResult(visionResult{Error: err.Error()})
+	}
+	defer os.RemoveAll(filepath.Dir(frames[0]))
+
+	videoPrompt := fmt.Sprintf(
+		"These are %d frames sampled evenly from a video. %s",
+		len(frames), prompt,
+	)
+	desc, err := runLlamaMtmd(binary, modelPath, mmprojPath, videoPrompt, frames)
+	if err != nil {
+		return jsonResult(visionResult{Error: err.Error()})
+	}
+	return jsonResult(visionResult{
+		Description: wrapUntrusted(source, desc),
+		Model:       "minicpm-v-4.6",
+		Type:        "video",
+		Frames:      len(frames),
+	})
+}
+
+// Ensure visionTool implements odek.Tool
+var _ odek.Tool = (*visionTool)(nil)
diff --git a/cmd/odek/vision_tool_test.go b/cmd/odek/vision_tool_test.go
new file mode 100644
index 0000000..024f5d8
--- /dev/null
+++ b/cmd/odek/vision_tool_test.go
@@ -0,0 +1,338 @@
+package main
+
+import (
+	"encoding/json"
+	"fmt"
+	"os"
+	"path/filepath"
+	"strings"
+	"testing"
+
+	"github.com/BackendStack21/odek/internal/config"
+	"github.com/BackendStack21/odek/internal/danger"
+)
+
+// ── Helpers ──────────────────────────────────────────────────────────────
+
+// createMockLlamaMtmd creates a shell script that mimics llama-mtmd-cli
+// in single-turn mode: it ignores all flags and writes a fixed description
+// to stdout. This lets tests exercise the full tool.Call() path without a
+// real model or GPU.
+func createMockLlamaMtmd(t *testing.T) string {
+	t.Helper()
+	dir := t.TempDir()
+	script := `#!/bin/sh
+echo 'A vivid test scene with colorful objects arranged neatly. The foreground shows a simple geometric shape and the background is uniformly lit.'
+`
+	path := filepath.Join(dir, "llama-mtmd-cli")
+	if err := os.WriteFile(path, []byte(script), 0755); err != nil {
+		t.Fatalf("createMockLlamaMtmd: %v", err)
+	}
+	return path
+}
+
+// createMockFftools creates mock ffprobe and ffmpeg binaries in a temp dir
+// and returns the dir. Prepend it to PATH so extractVideoFrames uses them.
+// +// ffprobe: outputs a fixed duration (10.0 seconds) +// ffmpeg: creates a minimal JPEG stub at the last argument path +func createMockFftools(t *testing.T) string { + t.Helper() + dir := t.TempDir() + + ffprobe := `#!/bin/sh +echo '10.000000' +` + // ffmpeg mock: the output path is always the last argument. + // "shift $(($# - 1))" leaves $1 as the last original arg. + ffmpeg := `#!/bin/sh +shift $(($# - 1)) +mkdir -p "$(dirname "$1")" +printf '\xff\xd8\xff\xe0' > "$1" +` + for name, body := range map[string]string{"ffprobe": ffprobe, "ffmpeg": ffmpeg} { + p := filepath.Join(dir, name) + if err := os.WriteFile(p, []byte(body), 0755); err != nil { + t.Fatalf("createMockFftools: %v", err) + } + } + return dir +} + +// fakeModelsDir creates a directory with stub model.gguf and mmproj.gguf files. +func fakeModelsDir(t *testing.T) string { + t.Helper() + dir := t.TempDir() + os.WriteFile(filepath.Join(dir, "model.gguf"), []byte("fake model"), 0644) + os.WriteFile(filepath.Join(dir, "mmproj.gguf"), []byte("fake mmproj"), 0644) + return dir +} + +// fakeImageFile writes a minimal file with a supported image extension. +func fakeImageFile(t *testing.T, ext string) string { + t.Helper() + path := filepath.Join(t.TempDir(), "test"+ext) + os.WriteFile(path, []byte("fake image data"), 0644) + return path +} + +// fakeVideoFile writes a minimal file with a supported video extension. +func fakeVideoFile(t *testing.T) string { + t.Helper() + path := filepath.Join(t.TempDir(), "test.mp4") + os.WriteFile(path, []byte("fake video data"), 0644) + return path +} + +// decodeVisionResult is a convenience wrapper for parsing tool output. 
+func decodeVisionResult(t *testing.T, raw string) visionResult { + t.Helper() + var r visionResult + if err := json.Unmarshal([]byte(raw), &r); err != nil { + t.Fatalf("decodeVisionResult: unmarshal failed: %v\nraw: %s", err, raw) + } + return r +} + +// ── Unit Tests ──────────────────────────────────────────────────────────── + +func TestVision_EmptyPath(t *testing.T) { + tool := newVisionTool(danger.DangerousConfig{}, config.VisionConfig{}) + result, err := tool.Call(`{"path":""}`) + if err != nil { + t.Fatalf("unexpected error: %v", err) + } + var r struct{ Error string `json:"error"` } + json.Unmarshal([]byte(result), &r) + if !strings.Contains(r.Error, "required") { + t.Errorf("expected 'required' in error, got: %s", r.Error) + } +} + +func TestVision_InvalidJSON(t *testing.T) { + tool := newVisionTool(danger.DangerousConfig{}, config.VisionConfig{}) + result, err := tool.Call(`{bad json}`) + if err != nil { + return // error return is also acceptable + } + var r struct{ Error string `json:"error"` } + json.Unmarshal([]byte(result), &r) + if !strings.Contains(r.Error, "invalid") { + t.Errorf("expected 'invalid' in error, got: %s", r.Error) + } +} + +func TestVision_FileNotFound(t *testing.T) { + tool := newVisionTool(danger.DangerousConfig{}, config.VisionConfig{}) + result, err := tool.Call(`{"path":"/nonexistent/image.jpg"}`) + if err != nil { + t.Fatalf("unexpected error: %v", err) + } + r := decodeVisionResult(t, result) + if !strings.Contains(r.Error, "cannot open") { + t.Errorf("expected 'cannot open' in error, got: %s", r.Error) + } +} + +func TestVision_SymlinkRejected(t *testing.T) { + dir := t.TempDir() + target := filepath.Join(dir, "real.png") + os.WriteFile(target, []byte("data"), 0644) + link := filepath.Join(dir, "link.jpg") + os.Symlink(target, link) + + tool := newVisionTool(danger.DangerousConfig{}, config.VisionConfig{}) + result, err := tool.Call(fmt.Sprintf(`{"path":"%s"}`, link)) + if err != nil { + t.Fatalf("unexpected error: %v", err) 
+ } + r := decodeVisionResult(t, result) + if r.Error == "" { + t.Error("expected an error for symlink path, got none") + } +} + +func TestVision_MissingBinary(t *testing.T) { + tool := newVisionTool(danger.DangerousConfig{}, config.VisionConfig{ + BinaryPath: "/nonexistent/llama-mtmd-cli", + }) + path := fakeImageFile(t, ".jpg") + result, err := tool.Call(fmt.Sprintf(`{"path":"%s"}`, path)) + if err != nil { + t.Fatalf("unexpected error: %v", err) + } + r := decodeVisionResult(t, result) + if !strings.Contains(r.Error, "llama") && !strings.Contains(r.Error, "not found") { + t.Errorf("expected error mentioning 'llama' or 'not found', got: %s", r.Error) + } +} + +func TestVision_MissingModel(t *testing.T) { + tool := newVisionTool(danger.DangerousConfig{}, config.VisionConfig{ + BinaryPath: createMockLlamaMtmd(t), + ModelsDir: t.TempDir(), // empty — no model.gguf + }) + path := fakeImageFile(t, ".jpg") + result, err := tool.Call(fmt.Sprintf(`{"path":"%s"}`, path)) + if err != nil { + t.Fatalf("unexpected error: %v", err) + } + r := decodeVisionResult(t, result) + if !strings.Contains(r.Error, "model") && !strings.Contains(r.Error, "not found") { + t.Errorf("expected error about missing model, got: %s", r.Error) + } +} + +func TestVision_MissingMmproj(t *testing.T) { + dir := t.TempDir() + os.WriteFile(filepath.Join(dir, "model.gguf"), []byte("fake"), 0644) + // mmproj.gguf intentionally absent + + tool := newVisionTool(danger.DangerousConfig{}, config.VisionConfig{ + BinaryPath: createMockLlamaMtmd(t), + ModelsDir: dir, + }) + path := fakeImageFile(t, ".jpg") + result, err := tool.Call(fmt.Sprintf(`{"path":"%s"}`, path)) + if err != nil { + t.Fatalf("unexpected error: %v", err) + } + r := decodeVisionResult(t, result) + if !strings.Contains(r.Error, "mmproj") && !strings.Contains(r.Error, "projector") { + t.Errorf("expected error about missing mmproj, got: %s", r.Error) + } +} + +// ── Mock Happy-Path Tests ───────────────────────────────────────────────── + +// 
TestVision_MockHappyPath_Image exercises the full image analysis flow +// with a mock llama-mtmd-cli binary — no GPU or real model required. +func TestVision_MockHappyPath_Image(t *testing.T) { + tool := newVisionTool(danger.DangerousConfig{}, config.VisionConfig{ + BinaryPath: createMockLlamaMtmd(t), + ModelsDir: fakeModelsDir(t), + }) + + for _, ext := range []string{".jpg", ".jpeg", ".png", ".webp"} { + t.Run("ext="+ext, func(t *testing.T) { + path := fakeImageFile(t, ext) + result, err := tool.Call(fmt.Sprintf(`{"path":"%s"}`, path)) + if err != nil { + t.Fatalf("unexpected error: %v", err) + } + r := decodeVisionResult(t, result) + if r.Error != "" { + t.Fatalf("expected success, got error: %s", r.Error) + } + if r.Type != "image" { + t.Errorf("type = %q, want 'image'", r.Type) + } + if r.Model != "minicpm-v-4.6" { + t.Errorf("model = %q, want 'minicpm-v-4.6'", r.Model) + } + if r.Description == "" { + t.Error("description is empty") + } + if !strings.Contains(r.Description, "test scene") { + t.Errorf("description = %q, expected mock output containing 'test scene'", r.Description) + } + }) + } +} + +// TestVision_MockHappyPath_CustomPrompt verifies that a custom prompt is +// accepted (the mock binary ignores it, but the tool must not error). +func TestVision_MockHappyPath_CustomPrompt(t *testing.T) { + tool := newVisionTool(danger.DangerousConfig{}, config.VisionConfig{ + BinaryPath: createMockLlamaMtmd(t), + ModelsDir: fakeModelsDir(t), + }) + path := fakeImageFile(t, ".png") + result, err := tool.Call(fmt.Sprintf(`{"path":"%s","prompt":"What text is visible?"}`, path)) + if err != nil { + t.Fatalf("unexpected error: %v", err) + } + r := decodeVisionResult(t, result) + if r.Error != "" { + t.Fatalf("expected success, got error: %s", r.Error) + } + if r.Type != "image" { + t.Errorf("type = %q, want 'image'", r.Type) + } +} + +// TestVision_MockHappyPath_Video exercises the full video analysis flow +// with mock ffprobe, ffmpeg, and llama-mtmd-cli. 
It verifies frame extraction
+// and the multi-image prompt path end-to-end.
+func TestVision_MockHappyPath_Video(t *testing.T) {
+	ffDir := createMockFftools(t)
+	// Prepend mock tool dir to PATH so exec.LookPath finds them first.
+	orig := os.Getenv("PATH")
+	t.Setenv("PATH", ffDir+string(os.PathListSeparator)+orig)
+
+	tool := newVisionTool(danger.DangerousConfig{}, config.VisionConfig{
+		BinaryPath:  createMockLlamaMtmd(t),
+		ModelsDir:   fakeModelsDir(t),
+		VideoFrames: 3,
+	})
+
+	videoPath := fakeVideoFile(t)
+	result, err := tool.Call(fmt.Sprintf(`{"path":"%s"}`, videoPath))
+	if err != nil {
+		t.Fatalf("unexpected error: %v", err)
+	}
+	r := decodeVisionResult(t, result)
+	if r.Error != "" {
+		t.Fatalf("expected success, got error: %s", r.Error)
+	}
+	if r.Type != "video" {
+		t.Errorf("type = %q, want 'video'", r.Type)
+	}
+	if r.Model != "minicpm-v-4.6" {
+		t.Errorf("model = %q, want 'minicpm-v-4.6'", r.Model)
+	}
+	if r.Frames == 0 {
+		t.Error("frames count is 0, expected > 0")
+	}
+	if r.Description == "" {
+		t.Error("description is empty")
+	}
+}
+
+// TestVision_VideoFallsBackOnMissingFfmpeg verifies that a missing ffmpeg
+// produces a clear, actionable error rather than a panic.
+func TestVision_VideoFallsBackOnMissingFfmpeg(t *testing.T) {
+	// Point PATH at an empty dir so neither ffmpeg nor ffprobe are found.
+	t.Setenv("PATH", t.TempDir())
+
+	tool := newVisionTool(danger.DangerousConfig{}, config.VisionConfig{
+		BinaryPath:  createMockLlamaMtmd(t),
+		ModelsDir:   fakeModelsDir(t),
+		VideoFrames: 4,
+	})
+	videoPath := fakeVideoFile(t)
+	result, err := tool.Call(fmt.Sprintf(`{"path":"%s"}`, videoPath))
+	if err != nil {
+		t.Fatalf("unexpected error: %v", err)
+	}
+	r := decodeVisionResult(t, result)
+	if !strings.Contains(r.Error, "ffmpeg") && !strings.Contains(r.Error, "ffprobe") {
+		t.Errorf("expected error mentioning ffmpeg/ffprobe, got: %s", r.Error)
+	}
+}
+
+// TestVision_SchemaShape verifies the JSON schema contains the required keys.
+func TestVision_SchemaShape(t *testing.T) {
+	tool := newVisionTool(danger.DangerousConfig{}, config.VisionConfig{})
+	schema := tool.Schema()
+	b, err := json.Marshal(schema)
+	if err != nil {
+		t.Fatalf("marshal schema: %v", err)
+	}
+	s := string(b)
+	for _, want := range []string{`"path"`, `"prompt"`, `"required"`} {
+		if !strings.Contains(s, want) {
+			t.Errorf("schema missing %q; schema: %s", want, s)
+		}
+	}
+}
diff --git a/docker/Dockerfile b/docker/Dockerfile
index 4de3ace..11e2e31 100644
--- a/docker/Dockerfile
+++ b/docker/Dockerfile
@@ -54,6 +54,44 @@ RUN mkdir -p /models \
     && curl -fsSL -o "/models/ggml-${WHISPER_MODEL}.bin" \
        "https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-${WHISPER_MODEL}.bin"
 
+# ---- minicpm-v stage ----
+# Downloads a pre-built llama-mtmd-cli binary and fetches MiniCPM-V 4.6 model
+# files (main GGUF + vision projector) so the `vision` tool works out of the
+# box with zero host setup. Pre-built binaries from the official llama.cpp
+# release avoid a multi-minute C++ compile inside Docker.
+#
+# Size added to the runtime image: ~530 MB model + ~1.1 GB mmproj ≈ 1.6 GB.
+# To use a different quantization level: --build-arg MINICPM_QUANT=Q8_0
+# Available quants: Q4_0 (501 MB) | Q4_K_M (529 MB, default) | Q8_0 (812 MB)
+# LLAMA_VERSION pins the llama.cpp release — bump it deliberately.
+FROM debian:bookworm-slim AS minicpm
+ARG MINICPM_QUANT=Q4_K_M
+ARG LLAMA_VERSION=b9549
+RUN apt-get update && apt-get install -y --no-install-recommends \
+    curl ca-certificates \
+    && rm -rf /var/lib/apt/lists/*
+# Map dpkg architecture to the llama.cpp release asset name suffix
+# (amd64 → x64, arm64 → arm64) and extract only llama-mtmd-cli.
+RUN set -ex \
+    && LLAMA_ARCH=$(dpkg --print-architecture | sed 's/amd64/x64/') \
+    && curl -fsSL \
+       "https://github.com/ggerganov/llama.cpp/releases/download/${LLAMA_VERSION}/llama-${LLAMA_VERSION}-bin-ubuntu-${LLAMA_ARCH}.tar.gz" \
+       -o /tmp/llama.tar.gz \
+    && mkdir -p /tmp/llama-bin \
+    && tar -xzf /tmp/llama.tar.gz -C /tmp/llama-bin \
+    && find /tmp/llama-bin -name "llama-mtmd-cli" -exec install -m 755 {} /usr/local/bin/llama-mtmd-cli \; \
+    && rm -rf /tmp/llama.tar.gz /tmp/llama-bin
+# Fetch the model GGUF and vision projector into a fixed image path (NOT under
+# ~/.odek — bind-mount profiles would hide files baked there). The runtime
+# config points vision.models_dir at this path, or the tool auto-detects it.
+RUN mkdir -p /usr/local/share/minicpm-v/models \
+    && curl -fsSL \
+       "https://huggingface.co/openbmb/MiniCPM-V-4_6-gguf/resolve/main/MiniCPM-V-4_6-${MINICPM_QUANT}.gguf" \
+       -o /usr/local/share/minicpm-v/models/model.gguf \
+    && curl -fsSL \
+       "https://huggingface.co/openbmb/MiniCPM-V-4_6-gguf/resolve/main/mmproj-model-f16.gguf" \
+       -o /usr/local/share/minicpm-v/models/mmproj.gguf
+
 # ---- runtime stage ----
 FROM debian:bookworm-slim
 # Tooling the agent commonly needs inside the sandbox container.
@@ -82,6 +120,11 @@ RUN curl -fsSL https://cli.github.com/packages/githubcli-archive-keyring.gpg \
 COPY --from=whisper /whisper/build/bin/whisper-cli /usr/local/bin/whisper-cli
 COPY --from=whisper /models/ /usr/local/share/whisper/models/
 
+# Bundle llama-mtmd-cli + MiniCPM-V 4.6 from the minicpm stage so `vision`
+# works with zero setup. The tool auto-detects /usr/local/share/minicpm-v/models.
+COPY --from=minicpm /usr/local/bin/llama-mtmd-cli /usr/local/bin/llama-mtmd-cli
+COPY --from=minicpm /usr/local/share/minicpm-v/models/ /usr/local/share/minicpm-v/models/
+
 # ── Adding extra dependencies the agent can use ──────────────────────────
 # The agent runs shell commands INSIDE this image, so any runtime or CLI it
 # should call must be installed here.
Trim or extend to taste, then rebuild
diff --git a/docker/README.md b/docker/README.md
index e6d367f..a4922d9 100644
--- a/docker/README.md
+++ b/docker/README.md
@@ -150,6 +150,25 @@ auto-transcription work with zero setup. No host install, no first-run download.
 `--build-arg WHISPER_MODEL=base` (or `small` / `medium`) and bump the `model`
 field in the config to match.
 
+## Image & video understanding (out of the box)
+
+The image **bundles `llama-mtmd-cli` (llama.cpp b9549) and MiniCPM-V 4.6**
+(1.3B multimodal model) so the `vision` tool works with zero setup — no cloud
+API, no host install, no first-run download.
+
+- The model GGUF (`Q4_K_M`, ~529 MB) and vision projector (`mmproj`, ~1.1 GB)
+  ship at `/usr/local/share/minicpm-v/models/`. They live outside `~/.odek` so
+  Telegram bind-mounts cannot shadow them.
+- Send the agent an image path → `vision` describes it locally using the
+  bundled 1.3B model. Video files (MP4, MOV, AVI, MKV, WebM) are sampled into
+  frames via `ffmpeg` and analysed together in one multi-image call.
+- Want a higher-quality quantization? Rebuild with
+  `--build-arg MINICPM_QUANT=Q8_0` (812 MB model, better accuracy at the cost
+  of ~300 MB extra image size). Available quants: `Q4_0` (501 MB), `Q4_K_M`
+  (529 MB, default), `Q8_0` (812 MB).
+- To point at models installed on the host instead, set `vision.models_dir` in
+  config to the directory containing `model.gguf` and `mmproj.gguf`.
+
 ## Verify the profiles differ
 
 - **Restricted**: ask it to `rm -rf` everything in `/workspace` → denied, never runs.
diff --git a/docs/CHEATSHEET.md b/docs/CHEATSHEET.md
index 3bcc6dd..752a4bb 100644
--- a/docs/CHEATSHEET.md
+++ b/docs/CHEATSHEET.md
@@ -85,6 +85,25 @@ Priority: `~/.odek/config.json` ← `./odek.json` ← `ODEK_*` env ← CLI flags
 
 Settings: `model` (tiny/base/small/medium), `language` (ISO code, empty=auto), `auto_transcribe` (Telegram voice → text), `models_dir` (model directory), `binary_path` (whisper binary path).
+### Image & Video Understanding
+- **`vision`** tool uses local MiniCPM-V 4.6 (1.3B) via `llama-mtmd-cli` — no cloud APIs
+- Accepts images (JPEG, PNG, GIF, WebP, BMP) and videos (MP4, MOV, AVI, MKV, WebM)
+- Videos are sampled into evenly-spaced frames with ffmpeg; all frames analysed in one call
+- Model files: `model.gguf` (~529 MB, Q4\_K\_M) + `mmproj.gguf` (~1.1 GB) — bundled in the Docker image at `/usr/local/share/minicpm-v/models/`
+- Configure via `vision` section in config:
+
+```json
+{
+  "vision": {
+    "models_dir": "~/.odek/minicpm-v/models",
+    "binary_path": "/usr/local/bin/llama-mtmd-cli",
+    "video_frames": 8
+  }
+}
+```
+
+Settings: `models_dir` (dir with `model.gguf` + `mmproj.gguf`), `binary_path` (llama-mtmd-cli path), `video_frames` (frames to sample from video, default 8).
+
 ## Memory System Architecture
 
 ### Three Tiers
diff --git a/docs/CONFIG.md b/docs/CONFIG.md
index 8887f3f..03f973e 100644
--- a/docs/CONFIG.md
+++ b/docs/CONFIG.md
@@ -434,7 +434,7 @@ The progress system is an evolving single message that gets edited in-place (sim
 ```
 
 Key behaviors:
-- **Smart previews** — instead of showing raw JSON args, the system extracts meaningful context: filename for file tools, the command text for shell, URL for browser, query text for memory/search tools, audio filename for transcribe
+- **Smart previews** — instead of showing raw JSON args, the system extracts meaningful context: filename for file tools, the command text for shell, URL for browser, query text for memory/search tools, audio filename for transcribe, file path for vision
 - **Edit throttling** — edits are rate-limited to one every 1.5 seconds to avoid hitting Telegram's flood control limits.
Rapid tool chains don't produce 429 errors
 - **Tool dedup** — when the same tool runs consecutively (common with parallel batch tools like `batch_read`), identical lines are collapsed into a `(×N)` counter instead of repeating N times
 - **Flood control fallback** — if an edit message fails with "flood" or "retry after", the system automatically switches to sending new messages instead of editing. This prevents the bot from becoming unresponsive under heavy load
diff --git a/docs/SECURITY.md b/docs/SECURITY.md
index 755ccf3..9c66f83 100644
--- a/docs/SECURITY.md
+++ b/docs/SECURITY.md
@@ -58,6 +58,7 @@ Tools that wrap:
 | `search_files`, `multi_grep` | `:` per match |
 | `shell` | `$ ` |
 | `transcribe` | `transcribe: