Add engineering architecture docs

2026-04-06 12:45:28 -04:00
parent 8f74cab0e6
commit af01dd3e70
5 changed files with 483 additions and 12 deletions
--- a/README.md
+++ b/README.md
@@ -38,13 +38,13 @@ python scripts/ingest_folder.py --path /path/to/notes

 ## Architecture

-```
+```text
 FastAPI (port 8100)
-  ├── Ingestion: markdown → parse → chunk → embed → store
-  ├── Retrieval: query → embed → vector search → rank
-  ├── Context Builder: retrieve → boost → budget → format
-  ├── SQLite (documents, chunks, memories, projects, interactions)
-  └── ChromaDB (vector embeddings)
+  |- Ingestion: markdown -> parse -> chunk -> embed -> store
+  |- Retrieval: query -> embed -> vector search -> rank
+  |- Context Builder: retrieve -> boost -> budget -> format
+  |- SQLite (documents, chunks, memories, projects, interactions)
+  '- ChromaDB (vector embeddings)
 ```

 ## Configuration
@@ -65,3 +65,11 @@ Set via environment variables (prefix `ATOCORE_`):
 pip install -e ".[dev]"
 pytest
 ```
+
+## Architecture Notes
+
+Implementation-facing architecture notes live under `docs/architecture/`.
+
+Current additions:
+- `docs/architecture/engineering-knowledge-hybrid-architecture.md`
+- `docs/architecture/engineering-ontology-v1.md`
--- a/docs/architecture/engineering-knowledge-hybrid-architecture.md
+++ b/docs/architecture/engineering-knowledge-hybrid-architecture.md
@@ -0,0 +1,205 @@
+# Engineering Knowledge Hybrid Architecture
+
+## Purpose
+
+This note defines how **AtoCore** can evolve into the machine foundation for a **living engineering project knowledge system** while remaining aligned with core AtoCore philosophy.
+
+AtoCore remains:
+- the trust engine
+- the memory/context engine
+- the retrieval/context assembly layer
+- the runtime-facing augmentation layer
+
+It does **not** become a generic wiki app or a PLM clone.
+
+## Core Architectural Thesis
+
+AtoCore should act as the **machine truth / context / memory substrate** for project knowledge systems.
+
+That substrate can then support:
+- engineering knowledge accumulation
+- human-readable mirrors
+- OpenClaw augmentation
+- future engineering copilots
+- project traceability across design, analysis, manufacturing, and operations
+
+## Layer Model
+
+### Layer 0 — Raw Artifact Layer
+Examples:
+- CAD exports
+- FEM exports
+- videos / transcripts
+- screenshots
+- PDFs
+- source code
+- spreadsheets
+- reports
+- test data
+
+### Layer 1 — AtoCore Core Machine Layer
+Canonical machine substrate.
+
+Contains:
+- source registry
+- source chunks
+- embeddings / vector retrieval
+- structured memory
+- trusted project state
+- entity and relationship stores
+- provenance and confidence metadata
+- interactions / retrieval logs / context packs
+
+### Layer 2 — Engineering Knowledge Layer
+Domain-specific project model built on top of AtoCore.
+
+Represents typed engineering objects such as:
+- Project
+- System
+- Subsystem
+- Component
+- Interface
+- Requirement
+- Constraint
+- Assumption
+- Decision
+- Material
+- Parameter
+- Equation
+- Analysis Model
+- Result
+- Validation Claim
+- Manufacturing Process
+- Test
+- Software Module
+- Vendor
+- Artifact
+
+### Layer 3 — Human Mirror
+Derived human-readable support surface.
+
+Examples:
+- project overview
+- current state
+- subsystem pages
+- component pages
+- decision log
+- validation summary
+- timeline
+- open questions / risks
+
+This layer is **derived** from structured state and approved synthesis. It is not canonical machine truth.
+
+### Layer 4 — Runtime / Clients
+Consumers such as:
+- OpenClaw
+- CLI tools
+- dashboards
+- future IDE integrations
+- engineering copilots
+- reporting systems
+- Atomizer / optimization tooling
+
+## Non-Negotiable Rule
+
+**Human-readable pages are support artifacts. They are not the primary machine truth layer.**
+
+Runtime trust order should remain:
+1. trusted current project state
+2. validated structured records
+3. selected reviewed synthesis
+4. retrieved source evidence
+5. historical / low-confidence material
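An illustrative sketch (not code from this commit) of how the runtime trust order above could drive context assembly. `TrustTier`, `ContextItem`, `order_for_context`, and the `score` field are hypothetical names; the only thing taken from the note is the five-level ordering.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import IntEnum


class TrustTier(IntEnum):
    """Lower value means higher trust, mirroring the runtime trust order."""

    TRUSTED_PROJECT_STATE = 1
    VALIDATED_RECORD = 2
    REVIEWED_SYNTHESIS = 3
    SOURCE_EVIDENCE = 4
    HISTORICAL = 5


@dataclass(frozen=True)
class ContextItem:
    text: str
    tier: TrustTier
    score: float  # hypothetical retrieval relevance, higher is better


def order_for_context(items: list[ContextItem]) -> list[ContextItem]:
    """Sort by trust tier first, then by relevance within the same tier."""
    return sorted(items, key=lambda item: (item.tier, -item.score))


items = [
    ContextItem("old meeting note", TrustTier.HISTORICAL, 0.95),
    ContextItem("retrieved PDF excerpt", TrustTier.SOURCE_EVIDENCE, 0.90),
    ContextItem("current pad material", TrustTier.TRUSTED_PROJECT_STATE, 0.40),
]
ordered = order_for_context(items)
```

In this sketch a highly relevant historical note still ranks below a less relevant trusted-state entry, so trust tier outranks relevance score.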
+
+## Responsibilities
+
+### AtoCore core owns
+- memory CRUD
+- trusted project state CRUD
+- retrieval orchestration
+- context assembly
+- provenance
+- confidence / status
+- conflict flags
+- runtime APIs
+
+### Engineering Knowledge Layer owns
+- engineering object taxonomy
+- engineering relationships
+- domain adapters
+- project-specific interpretation logic
+- design / analysis / manufacturing / operations linkage
+
+### Human Mirror owns
+- readability
+- navigation
+- overview pages
+- subsystem summaries
+- decision digests
+- human inspection / audit comfort
+
+## Update Model
+
+New artifacts should not directly overwrite trusted state.
+
+Recommended update flow:
+1. ingest source
+2. parse / chunk / register artifact
+3. extract candidate objects / claims / relationships
+4. compare against current trusted state
+5. flag conflicts or supersessions
+6. promote updates only under explicit rules
+7. regenerate affected human-readable pages
+8. log history and provenance
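Steps 4 to 6 and step 8 of the flow above can be sketched as follows. This is a minimal illustration under stated assumptions, not code from this commit: `Candidate`, `TrustedState`, the flat key scheme, and the `approved` flag standing in for "explicit rules" are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Candidate:
    key: str         # hypothetical flat key, e.g. "pad_material"
    value: str
    source_ref: str  # provenance pointer to the ingested artifact


@dataclass
class TrustedState:
    values: dict[str, str] = field(default_factory=dict)
    # (key, value, source_ref) entries, appended on every promotion
    history: list[tuple[str, str, str]] = field(default_factory=list)

    def compare(self, cand: Candidate) -> str:
        """Steps 4-5: classify a candidate against current trusted state."""
        if cand.key not in self.values:
            return "new"
        if self.values[cand.key] == cand.value:
            return "unchanged"
        return "conflict"

    def promote(self, cand: Candidate, approved: bool) -> bool:
        """Step 6: write only with explicit approval. Step 8: log provenance."""
        if not approved:
            return False
        self.values[cand.key] = cand.value
        self.history.append((cand.key, cand.value, cand.source_ref))
        return True


state = TrustedState(values={"pad_material": "PTFE"})
cand = Candidate("pad_material", "GF-PTFE", "session:gen-004")
verdict = state.compare(cand)        # conflict is flagged, nothing is overwritten
state.promote(cand, approved=False)  # no-op without approval
value_before_approval = state.values["pad_material"]
state.promote(cand, approved=True)
```

Here ingestion can never change trusted state on its own; only an explicit promotion call does, and every promotion leaves a provenance entry.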
+
+## Integration with Existing Knowledge Base System
+
+The existing engineering Knowledge Base project can be treated as the first major domain adapter.
+
+Bridge targets include:
+- KB-CAD component and architecture pages
+- KB-FEM models / results / validation pages
+- generation history
+- images / transcripts / session captures
+
+AtoCore should absorb the structured value of that system, not replace it with plain retrieval.
+
+## Suggested First Implementation Scope
+
+1. stabilize current AtoCore core behavior
+2. define engineering ontology v1
+3. add minimal entity / relationship support
+4. create a Knowledge Base bridge for existing project structures
+5. generate Human Mirror v1 pages:
+   - overview
+   - current state
+   - decision log
+   - subsystem summary
+6. add engineering-aware context assembly for OpenClaw
+
+## Why This Is Aligned With AtoCore Philosophy
+
+This architecture preserves the original core ideas:
+- owned memory layer
+- owned context assembly
+- machine-human separation
+- provenance and trust clarity
+- portability across runtimes
+- robustness before sophistication
+
+## Long-Range Outcome
+
+AtoCore can become the substrate for a **knowledge twin** of an engineering project:
+- structure
+- intent
+- rationale
+- validation
+- manufacturing impact
+- operational behavior
+- change history
+- evidence traceability
+
+That is significantly more powerful than any of:
+- a generic wiki
+- plain document RAG
+- an assistant with only chat memory
--- a/docs/architecture/engineering-ontology-v1.md
+++ b/docs/architecture/engineering-ontology-v1.md
@@ -0,0 +1,250 @@
+# Engineering Ontology V1
+
+## Purpose
+
+Define the first practical engineering ontology that can sit on top of AtoCore and represent a real engineering project as structured knowledge.
+
+This ontology is intended to be:
+- useful to machines
+- inspectable by humans through derived views
+- aligned with AtoCore trust / provenance rules
+- expandable across mechanical, FEM, electrical, software, manufacturing, and operations
+
+## Goal
+
+Represent a project as a **system of objects and relationships**, not as a pile of notes.
+
+The ontology should support queries such as:
+- what is this subsystem?
+- what requirements does this component satisfy?
+- what result validates this claim?
+- what changed recently?
+- what interfaces are affected by a design change?
+- what is active vs superseded?
+
+## Object Families
+
+### Project structure
+- Project
+- System
+- Subsystem
+- Assembly
+- Component
+- Interface
+
+### Intent / design logic
+- Requirement
+- Constraint
+- Assumption
+- Decision
+- Rationale
+- Risk
+- Issue
+- Open Question
+- Change Request
+
+### Physical / technical definition
+- Material
+- Parameter
+- Equation
+- Configuration
+- Geometry Artifact
+- CAD Artifact
+- Tolerance
+- Operating Mode
+
+### Analysis / validation
+- Analysis Model
+- Load Case
+- Boundary Condition
+- Solver Setup
+- Result
+- Validation Claim
+- Test
+- Correlation Record
+
+### Manufacturing / delivery
+- Manufacturing Process
+- Vendor
+- BOM Item
+- Part Number
+- Assembly Procedure
+- Inspection Step
+- Cost Driver
+
+### Software / controls / electrical
+- Software Module
+- Control Function
+- State Machine
+- Signal
+- Sensor
+- Actuator
+- Electrical Interface
+- Firmware Artifact
+
+### Evidence / provenance
+- Source Document
+- Transcript Segment
+- Image / Screenshot
+- Session
+- Report
+- External Reference
+- Generated Summary
+
+## Minimum Viable V1 Scope
+
+Initial implementation should start with:
+- Project
+- Subsystem
+- Component
+- Requirement
+- Constraint
+- Decision
+- Material
+- Parameter
+- Analysis Model
+- Result
+- Validation Claim
+- Artifact
+
+This is enough to represent meaningful project state without trying to model everything immediately.
+
+## Core Relationship Types
+
+### Structural
+- `CONTAINS`
+- `PART_OF`
+- `INTERFACES_WITH`
+
+### Intent / logic
+- `SATISFIES`
+- `CONSTRAINED_BY`
+- `BASED_ON_ASSUMPTION`
+- `AFFECTED_BY_DECISION`
+- `SUPERSEDES`
+
+### Validation
+- `ANALYZED_BY`
+- `VALIDATED_BY`
+- `SUPPORTS`
+- `CONFLICTS_WITH`
+- `DEPENDS_ON`
+
+### Artifact / provenance
+- `DESCRIBED_BY`
+- `UPDATED_BY_SESSION`
+- `EVIDENCED_BY`
+- `SUMMARIZED_IN`
+
+## Example Statements
+
+- `Subsystem:Lateral Support CONTAINS Component:Pivot Pin`
+- `Component:Pivot Pin CONSTRAINED_BY Requirement:low lateral friction`
+- `Subsystem:Lateral Support AFFECTED_BY_DECISION Decision:Use GF-PTFE pad`
+- `Subsystem:Reference Frame ANALYZED_BY AnalysisModel:M1 static model`
+- `Result:deflection case 03 SUPPORTS ValidationClaim:vertical stiffness acceptable`
+- `Component:Reference Frame DESCRIBED_BY Artifact:NX assembly`
+- `Component:Vertical Support UPDATED_BY_SESSION Session:gen-004`
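Statements of the form `Type:Name RELATION Type:Name` can be parsed into typed triples and checked against the relationship types listed earlier. The sketch below is illustrative and not code from this commit: `Triple`, `parse_statement`, and the regular expression are hypothetical, and it assumes names never contain a standalone all-caps token followed by `Type:`.

```python
import re
from typing import NamedTuple

# Relationship types copied from the "Core Relationship Types" section.
RELATIONS = frozenset({
    "CONTAINS", "PART_OF", "INTERFACES_WITH",
    "SATISFIES", "CONSTRAINED_BY", "BASED_ON_ASSUMPTION",
    "AFFECTED_BY_DECISION", "SUPERSEDES",
    "ANALYZED_BY", "VALIDATED_BY", "SUPPORTS", "CONFLICTS_WITH", "DEPENDS_ON",
    "DESCRIBED_BY", "UPDATED_BY_SESSION", "EVIDENCED_BY", "SUMMARIZED_IN",
})


class Triple(NamedTuple):
    subject_type: str
    subject_name: str
    relation: str
    object_type: str
    object_name: str


# Names may contain spaces; the relation is the all-caps token before "Type:".
_PATTERN = re.compile(r"^(\w+):(.+?) ([A-Z_]+) (\w+):(.+)$")


def parse_statement(text: str) -> Triple:
    """Parse one statement and reject relationship types that are not listed."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a 'Type:Name RELATION Type:Name' statement: {text!r}")
    triple = Triple(*match.groups())
    if triple.relation not in RELATIONS:
        raise ValueError(f"unknown relationship type: {triple.relation}")
    return triple


triple = parse_statement("Subsystem:Lateral Support CONTAINS Component:Pivot Pin")
```

Validating against a closed set of relationship names keeps ad hoc verbs out of the registry.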
+
+## Shared Required Fields
+
+Every major object should support fields equivalent to:
+- `id`
+- `type`
+- `name`
+- `project_id`
+- `status`
+- `confidence`
+- `source_refs`
+- `created_at`
+- `updated_at`
+- `notes` (optional)
+
+## Suggested Status Lifecycle
+
+For objects and claims:
+- `candidate`
+- `active`
+- `superseded`
+- `invalid`
+- `needs_review`
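The shared fields and status values above map naturally onto a small typed record. The following is a sketch under stated assumptions rather than code from this commit: `EngineeringObject` and `Status` are hypothetical names, and the defaults (new objects start as `candidate`, confidence on a 0.0 to 1.0 scale, UTC timestamps) are illustrative choices the ontology does not specify.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class Status(str, Enum):
    CANDIDATE = "candidate"
    ACTIVE = "active"
    SUPERSEDED = "superseded"
    INVALID = "invalid"
    NEEDS_REVIEW = "needs_review"


def _now() -> datetime:
    return datetime.now(timezone.utc)


@dataclass
class EngineeringObject:
    id: str
    type: str                            # e.g. "Component", "Requirement"
    name: str
    project_id: str
    status: Status = Status.CANDIDATE    # assumption: objects start untrusted
    confidence: float = 0.0              # assumption: 0.0-1.0 scale
    source_refs: list[str] = field(default_factory=list)
    created_at: datetime = field(default_factory=_now)
    updated_at: datetime = field(default_factory=_now)
    notes: Optional[str] = None          # the only optional field in the list


pin = EngineeringObject(id="c-1", type="Component", name="Pivot Pin", project_id="p-1")
```

Subclassing `str` in `Status` lets the values compare equal to the plain strings listed above, which is convenient if statuses are stored as text columns.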
+
+## Trust Rules
+
+1. An object may exist before it becomes trusted.
+2. A generated markdown summary is not canonical truth by default.
+3. If evidence conflicts, prefer:
+   1. trusted current project state
+   2. validated structured records
+   3. reviewed derived synthesis
+   4. raw evidence
+   5. historical notes
+4. Conflicts should be surfaced, not silently blended.
+
+## Mapping to the Existing Knowledge Base System
+
+### KB-CAD can map to
+- System
+- Subsystem
+- Component
+- Material
+- Decision
+- Constraint
+- Artifact
+
+### KB-FEM can map to
+- Analysis Model
+- Load Case
+- Boundary Condition
+- Result
+- Validation Claim
+- Correlation Record
+
+### Session generations can map to
+- Session
+- Generated Summary
+- object update history
+- provenance events
+
+## Human Mirror Possibilities
+
+Once the ontology exists, AtoCore can generate pages such as:
+- project overview
+- subsystem page
+- component page
+- decision log
+- validation summary
+- requirement trace page
+
+These should remain **derived representations** of structured state.
+
+## Recommended V1 Deliverables
+
+1. minimal typed object registry
+2. minimal typed relationship registry
+3. evidence-linking support
+4. practical query support for:
+   - component summary
+   - subsystem current state
+   - requirement coverage
+   - result-to-claim mapping
+   - decision history
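As one example of the practical queries listed above, requirement coverage can be answered directly from a relationship registry. This is a minimal in-memory sketch, not code from this commit: the triples, the `Requirement:vertical stiffness` entry, and `requirement_coverage` are hypothetical, and a real registry would presumably sit in SQLite rather than in a Python list.

```python
from __future__ import annotations

# (subject, relation, object) triples; the identifiers are illustrative only.
RELATIONSHIPS = [
    ("Component:Pivot Pin", "SATISFIES", "Requirement:low lateral friction"),
    ("Component:Pad", "SATISFIES", "Requirement:low lateral friction"),
    ("Subsystem:Lateral Support", "CONTAINS", "Component:Pivot Pin"),
]
REQUIREMENTS = [
    "Requirement:low lateral friction",
    "Requirement:vertical stiffness",
]


def requirement_coverage(
    requirements: list[str],
    relationships: list[tuple[str, str, str]],
) -> dict[str, list[str]]:
    """Map each requirement to the objects that claim to satisfy it."""
    coverage: dict[str, list[str]] = {req: [] for req in requirements}
    for subject, relation, obj in relationships:
        if relation == "SATISFIES" and obj in coverage:
            coverage[obj].append(subject)
    return coverage


coverage = requirement_coverage(REQUIREMENTS, RELATIONSHIPS)
uncovered = [req for req, satisfied_by in coverage.items() if not satisfied_by]
```

Requirements with an empty list are the coverage gaps, which is the kind of answer a pile of notes cannot give without re-reading every document.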
+
+## What Not To Do In V1
+
+- do not model every engineering concept immediately
+- do not build a giant graph with no practical queries
+- do not collapse structured objects back into only markdown
+- do not let generated prose outrank structured truth
+- do not auto-promote trusted state too aggressively
+
+## Summary
+
+Ontology V1 should be:
+- small enough to implement
+- rich enough to be useful
+- aligned with AtoCore trust philosophy
+- capable of absorbing the existing engineering Knowledge Base work
+
+The first goal is not to model everything.
+The first goal is to represent enough of a real project that AtoCore can reason over structure, not just notes.
--- a/docs/current-state.md
+++ b/docs/current-state.md
@@ -45,6 +45,9 @@ now includes a first curated ingestion batch for the active projects.
  - approved registration
  - safe update of existing project registrations
  - refresh
+- implementation-facing architecture notes for:
+  - engineering knowledge hybrid architecture
+  - engineering ontology v1
 - env-driven storage and deployment paths
 - Dalidou Docker deployment foundation
 - initial AtoCore self-knowledge corpus ingested on Dalidou
@@ -208,10 +211,11 @@ This separation is healthy:
 ## Immediate Next Focus

 1. Use the new T420-side AtoCore skill and registration flow in real OpenClaw workflows
-2. Tighten retrieval quality for the newly seeded active projects
-3. Define the first broader AtoVault/AtoDrive ingestion batches
-4. Add backup/export strategy for Dalidou machine state
-5. Only later consider deeper automatic OpenClaw integration or write-back
+2. Keep the new engineering-knowledge architecture docs as implementation guidance while avoiding premature schema work
+3. Tighten retrieval quality for the newly seeded active projects
+4. Define the first broader AtoVault/AtoDrive ingestion batches
+5. Add backup/export strategy for Dalidou machine state
+6. Only later consider deeper automatic OpenClaw integration or write-back

 ## Guiding Constraints

--- a/docs/next-steps.md
+++ b/docs/next-steps.md
@@ -33,11 +33,15 @@ AtoCore now has:
   - foundation now exists via project registry + per-project refresh API
   - registration policy + template + proposal + approved registration are now
     the normal path for new projects
-5. Define backup and export procedures for Dalidou
+5. Integrate the new engineering architecture docs into active planning, not immediate schema code
+   - keep `docs/architecture/engineering-knowledge-hybrid-architecture.md` as the target layer model
+   - keep `docs/architecture/engineering-ontology-v1.md` as the V1 structured-domain target
+   - do not start entity/relationship persistence until the ingestion, retrieval, registry, and backup baseline is stable and routine
+6. Define backup and export procedures for Dalidou
   - exercise the new SQLite + registry snapshot path on Dalidou
   - Chroma backup or rebuild policy
   - retention and restore validation
-6. Keep deeper automatic runtime integration deferred until the read-only model
+7. Keep deeper automatic runtime integration deferred until the read-only model
   has proven value

 ## Trusted State Status