ATOCore

Author	SHA1	Message	Date
Anto01	3f23ca1bc6	feat: signal-aggressive extraction + auto vault refresh in nightly cron. Extraction prompt rewritten for signal-aggressive mode. The old prompt rewarded silence ("durable insight only, empty is correct"), which caused quiet failures: real project signal (Schott quotes arriving, stakeholder events, blockers) was dropped as "not architectural enough". New prompt explicitly lists what to emit: (1) project activity (mentions with context: quote received, blocker, action item); (2) decisions and choices (architectural commitments, vendor selection); (3) durable engineering insight (earned knowledge, generalizable); (4) stakeholder and vendor events (emails sent, meetings scheduled); (5) preferences and adaptations (how Antoine works). Philosophy shift: "capture more signal, let triage filter noise" replaces "extract only durable architectural facts". Auto-triage already rejects noise well, so moving the filter downstream gives us visibility into weak signals without polluting active memory. Added 'episodic' to the candidate types list to support stakeholder events with a timestamp feel. LLM_EXTRACTOR_VERSION bumped to llm-0.4.0. Also: cron-backup.sh now runs POST /ingest/sources before extraction so new PKM files flow in automatically. Fail-open, non-blocking. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 10:24:50 -04:00
Anto01	cd0fd390a8	fix: host-side LLM extraction (claude CLI not in container). The claude CLI is installed on the Dalidou HOST but not inside the Docker container. The /admin/extract-batch API endpoint with mode=llm silently returned 0 candidates because shutil.which('claude') was None inside the container. Fix: extraction runs host-side via deploy/dalidou/batch-extract.sh, which calls scripts/batch_llm_extract_live.py with the host's PYTHONPATH pointing at the repo's src/. The script: (1) fetches interactions from the API (GET /interactions?since=...); (2) runs extract_candidates_llm() locally (host has claude CLI); (3) POSTs candidates back to the API (POST /memory, status=candidate); (4) tracks last-run timestamp via project state. The cron now calls the host-side script instead of the container API endpoint for LLM mode. Rule-mode extraction in the container still works via /admin/extract-batch. The API endpoint retains the mode=llm option for environments where claude IS inside the container (future Docker image with claude CLI, or a different deployment model). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 10:55:22 -04:00
Anto01	c67bec095c	feat: nightly batch extraction in cron-backup.sh (Day 2). Step 4 added to the daily cron: POST /admin/extract-batch with mode=llm, persist=true, limit=50. Runs after backup + cleanup + rsync. Fail-open: extraction failure never blocks the backup. Gated on ATOCORE_EXTRACT_BATCH=true (defaults to true). The endpoint uses the last_extract_batch_run timestamp from project state to auto-resume, so the cron doesn't need to track state. curl --max-time 600 gives the LLM extractor up to 10 minutes for the batch. The worst case (50 interactions × ~20s each = ~17 min) exceeds that limit, but most interactions will be no-ops if already extracted. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 10:51:13 -04:00
Anto01	7bf83bf46a	chore: mark cron-backup.sh executable. deploy.sh sync-checkout was landing the file without an exec bit, so the cron run hit 'Permission denied' until chmod +x was applied manually on Dalidou. Persist the exec bit in the git index so future deploys don't regress. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-11 12:22:20 -04:00
Anto01	9366ba7879	feat: length-aware reinforcement + batch triage CLI + off-host backup. Changes: (1) Reinforcement matcher now handles paragraph-length memories via a dual-mode threshold: short memories keep the 70% overlap rule, long memories (>15 stems) require 12 absolute overlaps AND a 35% overlap fraction so organic paraphrase can still reinforce. Diagnosis: every active memory stayed at reference_count=0 because 40-token project summaries never hit 70% overlap on real responses. (2) scripts/atocore_client.py gains batch-extract (fan out /interactions/{id}/extract over recent interactions) and triage (interactive promote/reject walker for the candidate queue), matching the Phase 9 reflection-loop review flow without pulling extraction into the capture hot path. (3) deploy/dalidou/cron-backup.sh adds an optional off-host rsync step gated on ATOCORE_BACKUP_RSYNC, fail-open when the target is offline so a laptop being off at 03:00 UTC never marks the local backup as failed. (4) docs/next-steps.md records the retrieval-quality sweep: project state surfaces, chunks are on-topic but broad, active memories never reach the pack (reflection loop has no retrieval outlet yet). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-11 11:20:03 -04:00
Anto01	0b1742770a	feat: cleanup endpoint, auto-extraction on capture, daily cron script. Changes: (1) POST /admin/backup/cleanup: retention cleanup via API (dry-run by default); (2) record_interaction() accepts extract=True to auto-extract candidate memories from response text using the Phase 9C rule-based extractor; (3) POST /interactions accepts extract field to enable extraction on capture; (4) deploy/dalidou/cron-backup.sh: daily backup + cleanup for cron. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-11 10:28:32 -04:00

6 Commits
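
Illustration of the dual-mode reinforcement threshold described in commit 9366ba7879. This is only a minimal sketch, not the repository's code. The four numeric constants come from the commit message; everything else is an assumption made for illustration: the function names `stems` and `should_reinforce`, the use of a lowercase word set in place of a real stemmer, measuring the fraction against the memory's stems, and treating the thresholds as inclusive.

```python
import re

# Thresholds quoted in the commit message.
SHORT_FRACTION = 0.70    # short memories keep the 70% overlap rule
LONG_STEM_COUNT = 15     # memories with more than 15 stems count as long
LONG_MIN_OVERLAP = 12    # long memories need 12 absolute overlaps ...
LONG_FRACTION = 0.35     # ... AND a 35% overlap fraction


def stems(text):
    """Hypothetical stand-in for a stemmer: the set of lowercase words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def should_reinforce(memory, response):
    """Return True if the response overlaps the memory enough to reinforce it.

    Assumption: the overlap fraction is measured against the memory's stems,
    and both thresholds are inclusive (>=).
    """
    memory_stems = stems(memory)
    if not memory_stems:
        return False
    overlap = len(memory_stems & stems(response))
    fraction = overlap / len(memory_stems)
    if len(memory_stems) > LONG_STEM_COUNT:
        # Long memory: both the absolute and the fractional test must pass.
        return overlap >= LONG_MIN_OVERLAP and fraction >= LONG_FRACTION
    # Short memory: the original 70% rule.
    return fraction >= SHORT_FRACTION
```

Under these assumptions, a 40-stem memory with 16 matching stems (40% overlap) would be reinforced, whereas it would fail a flat 70% rule, which matches the diagnosis given in the commit message.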