feat: auto-project-detection + project stages

Three changes:

1. ABB-Space registered as a lead project with stage=lead in Trusted
   Project State. Projects now have lifecycle awareness
   (lead/proposition vs active contract vs completed).

2. Extraction no longer drops unregistered project tags. When the LLM
   extractor sees a conversation about a project not in the registry,
   it keeps the model's tag on the candidate instead of falling back
   to empty. This enables auto-detection of new projects/leads from
   organic conversations. The nightly pipeline surfaces these
   candidates for triage, where the operator sees "hey, there's a new
   project called X" and can decide whether to register it.

3. Extraction prompt updated to tell the model: "If the conversation
   discusses a project NOT in the known list, still tag it — the
   system will auto-detect it." This removes the artificial ceiling
   that prevented new project discovery.

Updated Case D test: unregistered + unscoped now keeps the model's tag
instead of dropping to empty.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
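
For reviewers, the tag-selection rule described in change 2 can be sketched in isolation. This is a minimal illustration, not the code in extractor_llm.py. The function name `choose_project_tag` and its `resolve` parameter are hypothetical. `resolve` stands in for `resolve_project_name`, whose behavior is not shown in this commit, so an identity function is assumed as the default.

```python
from __future__ import annotations

from typing import Callable, Iterable


def choose_project_tag(
    model_project: str,
    registered_ids: Iterable[str],
    resolve: Callable[[str], str] = lambda name: name,
) -> tuple[str, bool]:
    """Return (project_tag, is_unregistered) for one extracted candidate.

    Hypothetical helper for illustration only. `resolve` is an assumed
    stand-in for the real alias resolution step.
    """
    model_project = (model_project or "").strip()
    if not model_project:
        # The model gave no project tag, so there is nothing to keep.
        return "", False

    resolved = resolve(model_project)
    if resolved in set(registered_ids):
        # Known project: use the canonical registered id.
        return resolved, False

    # Unregistered project: keep the model's own tag so triage can
    # surface it as a possible new project or lead.
    return model_project, True


if __name__ == "__main__":
    known = {"p04-gigabit", "atocore", "abb-space"}
    print(choose_project_tag("atocore", known))
    print(choose_project_tag("new-lead-x", known))
    print(choose_project_tag("", known))
```

The second element of the returned tuple plays the role of the `unregistered_project_detected` log line: it marks candidates the operator may want to review during triage.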
2026-04-13 17:16:04 -04:00
parent 3f18ba3b35
commit c57617f611
3 changed files with 22 additions and 9 deletions
--- a/src/atocore/memory/extractor_llm.py
+++ b/src/atocore/memory/extractor_llm.py
@@ -74,7 +74,7 @@ _SYSTEM_PROMPT = """You extract durable memory candidates from LLM conversation

 AtoCore stores two kinds of knowledge:

-A. PROJECT-SPECIFIC: applied decisions, constraints, and architecture for a named project (p04-gigabit, p05-interferometer, p06-polisher, atomizer-v2, atocore). These stay scoped to one project.
+A. PROJECT-SPECIFIC: applied decisions, constraints, and architecture for a named project. Known projects include p04-gigabit, p05-interferometer, p06-polisher, atomizer-v2, atocore, abb-space. If the conversation discusses a project NOT in this list, still tag it with the project name you identify — the system will auto-detect it as a new project or lead.

 B. DOMAIN KNOWLEDGE: generalizable engineering insight that was EARNED through project work and is reusable across projects. Tag these with a domain instead of a project.

@@ -291,9 +291,20 @@ def _parse_candidates(raw_output: str, interaction: Interaction) -> list[MemoryC

                registered_ids = {p.project_id for p in load_project_registry()}
                resolved = resolve_project_name(model_project)
-                project = resolved if resolved in registered_ids else ""
+                if resolved in registered_ids:
+                    project = resolved
+                else:
+                    # Unregistered project — keep the model's tag so
+                    # auto-triage / the operator can see it and decide
+                    # whether to register it as a new project or lead.
+                    project = model_project
+                    log.info(
+                        "unregistered_project_detected",
+                        model_project=model_project,
+                        interaction_id=interaction.id,
+                    )
            except Exception:
-                project = ""
+                project = model_project if model_project else ""
        else:
            project = ""
        domain = str(item.get("domain") or "").strip().lower()