DevForge

PROJECT INFO

Stage: Idea
Priority: medium
Created: Mar 22
Updated: Mar 22

About This Project

An AI control plane that evolves from passive monitoring to active orchestration of multiple AI coding agents. It implements a Self-Report API with 6 endpoints (heartbeat, deliverable, discovery, workflow, upgrade, review) through which agents proactively communicate their status. It also features a Survival Engine that keeps an agent running around the clock, tmux-based multi-agent session management, and a web dashboard built with React, Vite, and Tailwind CSS for real-time agent visibility.

README

Evolve

The Control Plane for Autonomous AI Agents
Self-managing. Self-learning. Self-evolving.

Evolve is not another agent framework. It's a control system — it doesn't care how your agent writes code. It cares whether your agent is working, working correctly, learning from mistakes, and getting better over time.

(Screenshot: Evolve Dashboard)

Why Evolve?

When you run Claude/GPT autonomously 24/7, three problems emerge:

| Problem | Traditional Fix | Evolve's Approach |
| --- | --- | --- |
| No idea what Agent is doing | Tail logs, watch terminal | Agent proactively reports (Self-Report API) |
| Agent repeats the same mistakes | Remind it manually every time | Auto-extract lessons → inject into prompt (Knowledge Hub) |
| Agent does something dumb | Discover it after the fact | A second AI reviews in real-time (Supervisor Agent) |

Features

1. Agent Self-Report API

Traditional monitoring watches agents from the outside. Evolve flips this: the agent must report its own status.

# Agent says: "I'm coding, 40% done"
curl -X POST "$MYAGENT_URL/api/agent/heartbeat" \
  -H 'Content-Type: application/json' \
  -d '{"activity":"coding","description":"implementing auth module","progress_pct":40}'

# Agent says: "I discovered something important"
curl -X POST "$MYAGENT_URL/api/agent/discovery" \
  -H 'Content-Type: application/json' \
  -d '{"title":"Rate limit found","content":"Max 3 posts/day on XHS","priority":"high"}'

# Agent says: "Here's what I learned today"
curl -X POST "$MYAGENT_URL/api/agent/review" \
  -H 'Content-Type: application/json' \
  -d '{"accomplished":["API integration"],"learned":["Use MD5 not SHA256 for signatures"]}'

6 reporting endpoints: Heartbeat | Deliverable | Discovery | Workflow | Upgrade Proposal | Review

No report = work doesn't exist. This rule is baked into the agent's prompt.

2. Knowledge Hub — Agents That Get Smarter

Most agent frameworks start from zero every conversation. Evolve solves this with a closed-loop learning system:

Agent makes a mistake (review.learned: "pkill -f crashes the system")
        │
        ▼ Real-time ingestion
   Doubao auto-evaluates: score 10/10 (critical lesson)
        │
        ▼ Layered storage
   ┌─────────────────────────────────────────────┐
   │  Permanent (≥8)  Core lessons, never expire  │
   │  Recent (5-7)    Useful but temporal, 30d TTL │
   │  Task            Matched to current plans     │
   └─────────────────────────────────────────────┘
        │
        ▼ Injected into prompt on next startup
   Agent never uses pkill -f again

Key design decisions:

  • Refinement, not storage. A secondary LLM scores each lesson (1-10), distills it to one sentence, and tags it
  • Three-layer injection. Only the most relevant knowledge enters the prompt — not everything
  • Auto-expiry. Low-score knowledge expires after 30 days, keeping the knowledge base lean
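The score thresholds and the 30-day TTL described above could be applied roughly as follows. This is a minimal sketch, not Evolve's actual code: the `Lesson` dataclass and the functions `layer_for_score` and `is_expired` are hypothetical names, and routing every score below 8 (including scores under 5) into the expiring "recent" layer is an assumption, since the README only states that low-score knowledge expires after 30 days.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

PERMANENT_MIN_SCORE = 8          # "Permanent (≥8)" in the diagram
RECENT_TTL = timedelta(days=30)  # "30d TTL" in the diagram


@dataclass
class Lesson:
    """Hypothetical record for one refined lesson."""
    text: str
    score: int            # 1-10, assigned by the secondary LLM
    created_at: datetime


def layer_for_score(score: int) -> str:
    """Map a 1-10 score to a storage layer."""
    if not 1 <= score <= 10:
        raise ValueError(f"score must be 1-10, got {score}")
    # Assumption: everything below the permanent threshold lands in the
    # expiring "recent" layer, including scores under 5.
    return "permanent" if score >= PERMANENT_MIN_SCORE else "recent"


def is_expired(lesson: Lesson, now: datetime) -> bool:
    """Permanent lessons never expire; recent ones expire after the TTL."""
    if layer_for_score(lesson.score) == "permanent":
        return False
    return now - lesson.created_at > RECENT_TTL
```

The task layer is left out here because it is matched against current plans rather than derived from the score.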

3. Supervisor Agent — AI Reviewing AI

(Screenshot: Supervisor Reports)

One agent works. Another agent reviews its work.

Click "Analyze" → Evolve reads the survival engine's full JSONL conversation log → Python extracts key actions (tool calls, decisions, commands) → compresses to ~6000 chars → sends to Doubao for analysis:

  • Was each decision reasonable?
  • Any repeated operations, idle loops, or wasted effort?
  • Did it follow instructions?
  • Efficiency score + improvement suggestions

Analysis cost stays low because Doubao, not Claude, handles it.
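The extract-and-compress step of this pipeline might look like the sketch below. It is illustrative only: the record shape (a JSON object with `type` and `content` fields), the type names in `KEY_TYPES`, and the function names are all assumptions, not the format Evolve or Claude Code actually uses. Keeping the most recent actions when the budget is exceeded is also an assumed policy.

```python
import json

MAX_CHARS = 6000  # the README's "~6000 chars" budget
KEY_TYPES = {"tool_call", "decision", "command"}  # hypothetical type names


def extract_key_actions(jsonl_text: str) -> list[str]:
    """Return one summary line per key action, skipping malformed lines."""
    actions = []
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate a partially written trailing line
        if isinstance(record, dict) and record.get("type") in KEY_TYPES:
            actions.append(f'{record["type"]}: {record.get("content", "")}')
    return actions


def compress(actions: list[str], max_chars: int = MAX_CHARS) -> str:
    """Keep the most recent actions that fit within max_chars.

    An action longer than the whole budget is dropped rather than truncated.
    """
    kept = []
    used = 0
    for action in reversed(actions):
        cost = len(action) + 1  # +1 for the joining newline
        if used + cost > max_chars:
            break
        kept.append(action)
        used += cost
    return "\n".join(reversed(kept))
```

The compressed string would then be sent to the analysis model together with the review questions listed above.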

4. Survival Engine — 24/7 Persistent Agent

Not a one-shot script. A continuously alive agent:

  • Watchdog: Health check every 10s, auto-revival on hang
  • Heartbeat detection: 5min no heartbeat → gentle nudge, 15min → context-aware intervention
  • Crash recovery: --resume restart with knowledge injection, seamless continuation
  • Web terminal: Operate the tmux session directly from your browser
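The heartbeat escalation rule can be expressed as a small pure function. This is a sketch under stated assumptions: `watchdog_action` and its return values (`"ok"`, `"nudge"`, `"intervene"`) are hypothetical names; only the 5-minute and 15-minute thresholds come from the list above.

```python
from datetime import datetime, timedelta

NUDGE_AFTER = timedelta(minutes=5)       # 5 min without a heartbeat: gentle nudge
INTERVENE_AFTER = timedelta(minutes=15)  # 15 min: context-aware intervention


def watchdog_action(last_heartbeat: datetime, now: datetime) -> str:
    """Decide what the watchdog should do based on heartbeat silence."""
    silence = now - last_heartbeat
    if silence >= INTERVENE_AFTER:
        return "intervene"
    if silence >= NUDGE_AFTER:
        return "nudge"
    return "ok"
```

A watchdog loop could call this on every 10-second health check and act only when the result changes, so a silent agent is nudged once rather than on every tick.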

5. Skills-First Workflow

The agent is required to use structured skills before writing code:

New project? → /brainstorming → /writing-plans → /executing-plans
Bug found?   → /systematic-debugging
Done?        → /verification-before-completion

Before starting any task, the agent must:

  1. Run /skills to check available capabilities
  2. Search for better tools, MCP servers, or skills that could help
  3. Install and register new tools via the Discovery API

No cowboy coding. Design first, then execute.

6. Extensions Management

Full visibility into what your agent has installed:

(Screenshot: Extensions Manager)

  • Skills scanner — discovers skills from global (~/.claude/skills/), plugins, and workspace projects
  • MCP servers — shows all connected Model Context Protocol servers
  • Plugins — lists installed Claude Code plugins with enable/disable status
  • Tagging system — auto-inferred tags (AI, Web, Tools, Data, etc.) with manual override
  • Source tracking — marks which extensions were installed by the survival engine vs. manually

7. Capability Controls

Toggle agent permissions at runtime from the Dashboard — no restart needed:

(Screenshot: Capability Controls)

| Capability | Status | Meaning |
| --- | --- | --- |
| Browser access | Allowed | Agent can use Chrome for research |
| Git push | Allowed | Agent can push code to GitHub |
| Spend money | Blocked | Agent cannot purchase paid services |
| Install packages | Blocked | Agent cannot pip install / npm install |
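One way to model runtime toggles like these is an in-memory registry that is consulted before each guarded action. The sketch below is hypothetical: `Capability`, `CapabilityRegistry`, and its methods are illustrative names, and a real implementation would presumably persist the state rather than hold it in a dict.

```python
from enum import Enum


class Capability(Enum):
    BROWSER_ACCESS = "browser_access"
    GIT_PUSH = "git_push"
    SPEND_MONEY = "spend_money"
    INSTALL_PACKAGES = "install_packages"


class CapabilityRegistry:
    """In-memory toggle store (hypothetical); changes apply without a restart."""

    def __init__(self) -> None:
        # Defaults mirror the statuses listed in the table above.
        self._allowed = {
            Capability.BROWSER_ACCESS: True,
            Capability.GIT_PUSH: True,
            Capability.SPEND_MONEY: False,
            Capability.INSTALL_PACKAGES: False,
        }

    def set_allowed(self, cap: Capability, allowed: bool) -> None:
        """Flip a capability on or off at runtime."""
        self._allowed[cap] = allowed

    def is_allowed(self, cap: Capability) -> bool:
        return self._allowed[cap]

    def require(self, cap: Capability) -> None:
        """Raise PermissionError if the capability is currently blocked."""
        if not self._allowed[cap]:
            raise PermissionError(f"capability blocked: {cap.value}")
```

Calling `require(...)` at the point of use means a toggle takes effect on the agent's next attempt.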

8. Scheduled Tasks

The agent creates scheduled tasks via API. Evolve executes them automatically:

curl -X POST "$MYAGENT_URL/api/scheduled-tasks" \
  -H 'Content-Type: application/json' \
  -d '{"name":"Daily publish","cron_expr":"0 9 * * *","command":"/path/to/script.sh"}'

croniter-based scheduler with stdout/stderr capture, timeout handling, and run history.
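The execution side (output capture plus timeout handling) can be sketched with the standard library alone. This is not Evolve's scheduler: `RunResult` and `run_task` are hypothetical names, the command is passed as an argument list rather than a shell string, and the cron matching done by croniter is left out.

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunResult:
    exit_code: Optional[int]  # None when the command timed out
    stdout: str
    stderr: str
    timed_out: bool


def _text(value) -> str:
    """Normalize captured output, which may be None, bytes, or str."""
    if value is None:
        return ""
    if isinstance(value, bytes):
        return value.decode(errors="replace")
    return value


def run_task(command: list[str], timeout_s: float) -> RunResult:
    """Run one scheduled command, capturing output and enforcing a timeout."""
    try:
        proc = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired as exc:
        # subprocess.run kills the child before raising TimeoutExpired.
        return RunResult(None, _text(exc.stdout), _text(exc.stderr), True)
    return RunResult(proc.returncode, proc.stdout, proc.stderr, False)


if __name__ == "__main__":
    result = run_task([sys.executable, "-c", "print('hello')"], timeout_s=10)
    print(result.exit_code, result.stdout.strip())
```

Each `RunResult` could be stored as one row of run history.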

9. Internationalization

Full i18n support with Chinese and English — 450+ translation keys. Switch languages from the sidebar.


Architecture

┌──────────────────────────────────────────────────────────────┐
│                        Web Dashboard                         │
│  Dashboard│Survival│Knowledge│Supervisor│Workflows│Extensions │
└────────────────────────────┬─────────────────────────────────┘
                             │ REST API
┌────────────────────────────┴─────────────────────────────────┐
│                     Evolve Server (FastAPI)                   │
│                                                              │
│  ┌─────────────┐  ┌──────────────┐  ┌─────────────────────┐ │
│  │ Self-Report  │  │  Knowledge   │  │   Supervisor Agent  │ │
│  │   6 APIs     │  │   Engine     │  │   (JSONL + Doubao)  │ │
│  └──────┬──────┘  └──────┬───────┘  └──────────┬──────────┘ │
│         │                │                      │            │
│  ┌──────┴────────────────┴──────────────────────┴──────────┐ │
│  │              SQLite (knowledge + extensions + ...)       │ │
│  └─────────────────────────────────────────────────────────┘ │
│                                                              │
│  ┌──────────────────────┐  ┌──────────────────────────────┐ │
│  │   Survival Engine    │  │     Cron Scheduler           │ │
│  │  (tmux + watchdog)   │  │  (croniter + shell exec)     │ │
│  └──────────┬───────────┘  └──────────────────────────────┘ │
└─────────────┼────────────────────────────────────────────────┘
              │ tmux
     ┌────────┴────────┐
     │  Claude Agent   │  ← Runs continuously, self-reports, self-decides
     └─────────────────┘

Data Flow: The Learning Loop

Agent works ──→ Calls Self-Report API
                        │
            ┌───────────┼───────────┐
            ▼           ▼           ▼
        heartbeat   deliverable  review.learned
                                    │
                              ▼ Doubao refines
                         knowledge_base
                              │
                    ▼ Injected on next startup
                      Agent got smarter
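The final "injected on next startup" step amounts to assembling a prompt from the stored layers. A minimal sketch, assuming a hypothetical `build_prompt` helper: the priority order (permanent, then task-matched, then recent), the `max_lessons` cap, and the "Lessons learned:" heading are illustrative choices, not details taken from the project.

```python
def build_prompt(
    identity: str,
    permanent: list[str],
    task: list[str],
    recent: list[str],
    max_lessons: int = 20,
) -> str:
    """Append de-duplicated lessons to the identity prompt, highest priority first."""
    selected: list[str] = []
    seen: set[str] = set()
    # Assumed priority: permanent lessons, then task-matched, then recent.
    for layer in (permanent, task, recent):
        for lesson in layer:
            if len(selected) >= max_lessons:
                break
            if lesson not in seen:
                seen.add(lesson)
                selected.append(lesson)
    if not selected:
        return identity
    bullets = "\n".join(f"- {lesson}" for lesson in selected)
    return f"{identity}\n\nLessons learned:\n{bullets}"
```

Capping the list keeps the prompt small, in line with the stated goal that only the most relevant knowledge enters the prompt.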

Tech Stack

| Layer | Technology |
| --- | --- |
| Backend | Python 3.12+ / FastAPI / SQLite / aiosqlite |
| Frontend | React 19 + TypeScript + Vite + Tailwind CSS |
| Terminal | xterm.js + tmux |
| Agent Runtime | Claude Code (Survival Engine) |
| Analysis LLM | Doubao (knowledge refinement / supervisor analysis) |
| Notifications | Feishu Bot (optional) |
| i18n | react-i18next (zh-CN / en) |

API Reference

Self-Report API (Agent → Evolve)

| Endpoint | Method | Purpose |
| --- | --- | --- |
| /api/agent/heartbeat | POST | Activity heartbeat |
| /api/agent/deliverable | POST | Report deliverable |
| /api/agent/discovery | POST | Report discovery → auto-enters knowledge base |
| /api/agent/review | POST | Work review → learned auto-enters knowledge base |
| /api/agent/workflow | POST | Create reusable workflow |
| /api/agent/upgrade | POST | Submit capability upgrade proposal |
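For agents that report from Python rather than curl, a request for any of these endpoints can be built with the standard library. This is a hedged sketch: `build_report_request` is a hypothetical helper, the payload fields are copied from the README's heartbeat example, and the server's exact request schema is not documented here.

```python
import json
import urllib.request

SELF_REPORT_ENDPOINTS = {
    "heartbeat", "deliverable", "discovery", "review", "workflow", "upgrade",
}


def build_report_request(
    base_url: str, endpoint: str, payload: dict
) -> urllib.request.Request:
    """Build a JSON POST request for one Self-Report endpoint."""
    if endpoint not in SELF_REPORT_ENDPOINTS:
        raise ValueError(f"unknown self-report endpoint: {endpoint}")
    url = f"{base_url.rstrip('/')}/api/agent/{endpoint}"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Sending it is then a matter of `urllib.request.urlopen(req, timeout=5)` against a running server, for example one started on `http://localhost:3818`.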

Management API (Dashboard → Evolve)

| Endpoint | Method | Purpose |
| --- | --- | --- |
| /api/knowledge | GET/POST | Knowledge base CRUD |
| /api/knowledge/{id}/promote | POST | Promote to permanent layer |
| /api/extensions | GET | List skills, MCPs, plugins |
| /api/extensions/sync | POST | Scan filesystem and sync to DB |
| /api/projects/scan | GET | Scan workspace projects |
| /api/supervisor/analyze | POST | Trigger supervisor analysis |
| /api/scheduled-tasks | GET/POST | Scheduled task management |
| /api/agent/prompt | GET/PUT | Edit survival engine prompt |

Quick Start

git clone https://github.com/xmqywx/Evolve.git
cd Evolve

# Backend
python -m venv .venv
.venv/bin/pip install -r requirements.txt

# Frontend
cd web && npm install && npm run build && cd ..

# Configure
cp config.yaml.example config.yaml
# Edit config.yaml with your API keys

# Run
.venv/bin/python run.py
# Visit http://localhost:3818

Project Structure

myagent/
├── server.py          FastAPI server + all API endpoints
├── survival.py        Survival engine (tmux watchdog + dynamic prompt injection)
├── knowledge.py       Knowledge hub (ingest → refine → store → inject)
├── supervisor.py      Supervisor agent (JSONL extraction + Doubao analysis)
├── extensions.py      Extensions scanner (skills, MCPs, plugins)
├── cron_scheduler.py  Scheduled task scheduler (croniter + asyncio)
├── db.py              SQLite data layer (20+ tables)
├── doubao.py          Doubao API client
├── scanner.py         Claude session scanner + JSONL parser
├── feishu.py          Feishu notifications
└── config.py          Pydantic config models

web/src/
├── i18n/              Internationalization (zh-CN, en)
├── pages/
│   ├── Dashboard.tsx      Control panel (heartbeats, deliverables, discoveries)
│   ├── Survival.tsx       Survival engine terminal (web terminal + watchdog toggle)
│   ├── Knowledge.tsx      Knowledge base (filter, add, promote, delete)
│   ├── Supervisor.tsx     Supervisor reports (JSONL analysis)
│   ├── Extensions.tsx     Extensions manager (skills, MCPs, plugins with tags)
│   ├── Output.tsx         Deliverables + project overview
│   ├── Sessions.tsx       Multi-session management
│   ├── ScheduledTasks.tsx Scheduled task management
│   ├── Workflows.tsx      Workflow / skill library
│   ├── Capabilities.tsx   Capability toggle panel
│   ├── PromptEditor.tsx   Live prompt editor
│   └── ...
└── components/
    ├── IconSidebar.tsx    Expandable icon sidebar
    └── Layout.tsx         App layout

The Harness Engineering Paradigm

Evolve is built on a concept we call Harness Engineering — the discipline of building infrastructure that wraps, constrains, and amplifies AI models. Instead of improving the model itself, you improve the system around it.

Traditional: Better Model → Better Results
Harness Eng: Same Model + Better Harness → Dramatically Better Results

Evolve implements five types of harness:

| Harness | What it does | Evolve component |
| --- | --- | --- |
| Prompt Harness | Dynamically assembles the optimal prompt with context, knowledge, and constraints | Identity Prompt + Knowledge Injection |
| Output Harness | Captures, validates, and routes agent outputs to the right systems | Self-Report API + Knowledge Hub |
| Constraint Harness | Enforces boundaries and permissions at runtime | Capability Controls + Forbidden Operations |
| Runtime Harness | Keeps the agent alive, detects failures, recovers state | Survival Engine + Watchdog + Crash Recovery |
| Observation Harness | Monitors agent behavior and generates insights | Supervisor Agent + JSONL Analysis |

Why this matters: The model (Claude, GPT, etc.) is a commodity. The harness is your competitive advantage. Two teams using the same model will get wildly different results based on their harness quality.

Future directions:

  • Knowledge Distillation — aggregate noisy discoveries into weekly intelligence briefings, not raw lists
  • Intent Marketplace — decompose high-level goals into tradeable sub-intents that multiple agents can bid on

Comparison

| | Evolve | Hermes Agent | AutoGPT | CrewAI |
| --- | --- | --- | --- | --- |
| Purpose | Agent control plane | Agent framework | Autonomous agent | Multi-agent orchestration |
| Self-reporting | 6 APIs | None | None | None |
| Knowledge loop | Auto-refine + inject | Manual skill files | None | None |
| AI reviews AI | Supervisor agent | None | None | None |
| Skills-first workflow | Enforced | None | None | None |
| Runtime capability control | Dashboard toggles | None | None | None |
| 24/7 watchdog | Heartbeat + nudge | None | None | None |
| Extension management | Scan + tag + track | None | None | None |
| Web dashboard | Full-featured | CLI only | Web UI | None |
| i18n | zh-CN + en | None | en only | en only |

License

MIT

STATS

Commits: 92
Open Issues: 3
Progress: 0%

RELEASE

Latest: idea

Mar 22

LABELS

Idea

RESOURCES

GitHub
Docs (coming soon)

CONTRIBUTORS

1 contributor