docs/architecture/project-identity-canonicalization.md

# Project Identity Canonicalization

## Why this document exists

AtoCore identifies projects by name in many places: trusted state
rows, memories, captured interactions, query/context API parameters,
extractor candidates, and future engineering entities. Without an
explicit rule, every callsite would have to remember to canonicalize
project names through the registry, and the recent codex review
caught exactly the bug class that follows when one of them forgets.

The fix landed in `fb6298a` and works correctly today. This document
exists to make the rule **explicit and discoverable** so the
engineering layer V1 implementation, future entity write paths, and
any new agent integration don't reintroduce the same fragmentation
when nobody is looking.

## The contract

> **Every read/write that takes a project name MUST canonicalize it
> through `resolve_project_name()` before the value crosses a service
> boundary.**

The boundary is wherever a project name becomes a database row, a
query filter, an attribute on a stored object, or a key for any
lookup. The canonicalization happens **once**, at that boundary,
before the underlying storage primitive is called.

Symbolically:

```
HTTP layer (raw user input)
    ↓
service entry point
    ↓
project_name = resolve_project_name(project_name)   ← ONLY canonical from this point
    ↓
storage / queries / further service calls
```

The rule is intentionally simple. There's no per-call exception,
no "trust me, the caller already canonicalized it" shortcut, no
opt-out flag. Every service-layer entry point applies the helper
the moment it receives a project name from outside the service.

## The helper

```python
# src/atocore/projects/registry.py

def resolve_project_name(name: str | None) -> str:
    """Canonicalize a project name through the registry.

    Returns the canonical project_id if the input matches any
    registered project's id or alias. Returns the input unchanged
    when it's empty or not in the registry — the second case keeps
    backwards compatibility with hand-curated state, memories, and
    interactions that predate the registry, or for projects that
    are intentionally not registered.
    """
    if not name:
        return name or ""
    project = get_registered_project(name)
    if project is not None:
        return project.project_id
    return name
```

Three behaviors worth keeping in mind:

1. **Empty / None input → empty string output.** Callers don't have
   to pre-check; passing `""` or `None` to a query filter still
   works as "no project scope".
2. **Registered alias → canonical project_id.** The helper does the
   case-insensitive lookup and returns the matching project's
   canonical id, `project.project_id`
   (e.g. `"p05" → "p05-interferometer"`). Passing the canonical id
   itself returns it unchanged.
3. **Unregistered name → input unchanged.** This is the
   backwards-compatibility path. Hand-curated state, memories, or
   interactions created under a name that isn't in the registry
   keep working. The retrieval is then "best effort" — the raw
   string is used as the SQL key, which still finds the row that
   was stored under the same raw string. This path exists so the
   engineering layer V1 doesn't have to also be a data migration.
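
The three behaviors can be exercised with a minimal, self-contained sketch. The `RegisteredProject` class and the in-memory `_REGISTRY` list are hypothetical stand-ins for the real registry (which is loaded from a JSON file); only the control flow of `resolve_project_name` is taken from the helper above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RegisteredProject:
    """Hypothetical stand-in for a registry record."""
    project_id: str
    aliases: tuple = ()


# Hypothetical in-memory registry; the real one is read from a JSON file.
_REGISTRY = [RegisteredProject("p05-interferometer", ("p05", "interferometer"))]


def get_registered_project(name: str) -> Optional[RegisteredProject]:
    """Case-insensitive match against each project's id and aliases."""
    wanted = name.lower()
    for project in _REGISTRY:
        known = (project.project_id, *project.aliases)
        if wanted in {n.lower() for n in known}:
            return project
    return None


def resolve_project_name(name: Optional[str]) -> str:
    """Same control flow as the helper shown above."""
    if not name:
        return name or ""
    project = get_registered_project(name)
    if project is not None:
        return project.project_id
    return name


results = {
    "none": resolve_project_name(None),              # behavior 1
    "empty": resolve_project_name(""),               # behavior 1
    "alias": resolve_project_name("P05"),            # behavior 2, case-insensitive
    "canonical": resolve_project_name("p05-interferometer"),
    "unregistered": resolve_project_name("legacy-notes"),  # behavior 3
}
print(results)
```

The alias and the canonical id both come back as `"p05-interferometer"`, while `"legacy-notes"` passes through untouched.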

## Where the helper is currently called

As of `fb6298a`, the helper is invoked at exactly these eight
service-layer entry points:

| Module | Function | What gets canonicalized |
|---|---|---|
| `src/atocore/context/builder.py` | `build_context` | the `project_hint` parameter, before the trusted state lookup |
| `src/atocore/context/project_state.py` | `set_state` | `project_name`, before `ensure_project()` |
| `src/atocore/context/project_state.py` | `get_state` | `project_name`, before the SQL lookup |
| `src/atocore/context/project_state.py` | `invalidate_state` | `project_name`, before the SQL lookup |
| `src/atocore/interactions/service.py` | `record_interaction` | `project`, before insert |
| `src/atocore/interactions/service.py` | `list_interactions` | `project` filter parameter, before WHERE clause |
| `src/atocore/memory/service.py` | `create_memory` | `project`, before insert |
| `src/atocore/memory/service.py` | `get_memories` | `project` filter parameter, before WHERE clause |

Every one of those is the **first** thing the function does after
input validation. There is no path through any of those eight
functions where a project name reaches storage without passing
through `resolve_project_name`.
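
To make the "before insert" and "before WHERE clause" entries concrete, here is a rough sketch of a write path and a read path sharing the same rule. The table layout, the `_ALIASES` dict, and the function bodies are assumptions for illustration and are not the actual `memory/service.py` code.

```python
import sqlite3
from typing import Optional

# Hypothetical alias map standing in for the registry lookup.
_ALIASES = {"p05": "p05-interferometer"}


def resolve_project_name(name: Optional[str]) -> str:
    """Simplified stand-in for the real helper."""
    if not name:
        return ""
    return _ALIASES.get(name.lower(), name)


def create_memory(conn, content: str, project: Optional[str] = None) -> None:
    project = resolve_project_name(project)  # canonicalize before insert
    conn.execute(
        "INSERT INTO memories (content, project) VALUES (?, ?)",
        (content, project),
    )


def get_memories(conn, project: Optional[str] = None) -> list:
    project = resolve_project_name(project)  # canonicalize before WHERE
    if project:
        rows = conn.execute(
            "SELECT content FROM memories WHERE project = ? ORDER BY id",
            (project,),
        )
    else:
        # Empty string means "no project scope".
        rows = conn.execute("SELECT content FROM memories ORDER BY id")
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE memories (id INTEGER PRIMARY KEY, content TEXT, project TEXT)"
)
create_memory(conn, "note written via alias", project="p05")
create_memory(conn, "note written via canonical id", project="p05-interferometer")

by_alias = get_memories(conn, "p05")
by_canonical = get_memories(conn, "p05-interferometer")
unscoped = get_memories(conn, None)
```

Because both sides canonicalize, the two rows land under the same key and either spelling retrieves both of them.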

## Where the helper is NOT called (and why that's correct)

These places intentionally do not canonicalize:

1. **`update_memory`'s project field.** The API does not allow
   changing a memory's project after creation, so there's no
   project to canonicalize. The function only updates `content`,
   `confidence`, and `status`.
2. **The retriever's `_project_match_boost` substring matcher.** It
   already calls `get_registered_project` internally to expand the
   hint into the candidate set (canonical id + all aliases + last
   path segments). It accepts the raw hint by design.
3. **`_rank_chunks`'s secondary substring boost in
   `builder.py`.** Still uses the raw hint. This is a multiplicative
   factor on top of correct retrieval, not a filter, so it cannot
   drop relevant chunks. Tracked as a future cleanup but not
   critical.
4. **Direct SQL queries for the projects table itself** (e.g.
   `ensure_project`'s lookup). These are intentional case-insensitive
   raw lookups against the column the canonical id is stored in.
   `set_state` already canonicalized before reaching `ensure_project`,
   so the value passed is the canonical id by definition.
5. **Hand-authored project names that aren't in the registry.**
   The helper returns those unchanged. This is the backwards-compat
   path mentioned above; it is *not* a violation of the rule, it's
   the rule applied to a name with no registry record.
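
Item 3's argument, that a multiplicative boost cannot drop chunks the way a filter would, can be illustrated with a small sketch. The function name, the `factor` value, and the example paths are hypothetical; the real `_rank_chunks` logic is not reproduced here.

```python
def boost_by_hint(chunks, hint, factor=1.5):
    """Multiply the score of chunks whose source contains the hint.

    Every chunk is kept; a non-matching hint only leaves scores unchanged.
    """
    needle = (hint or "").lower()
    boosted = []
    for source, score in chunks:
        if needle and needle in source.lower():
            score *= factor
        boosted.append((source, score))
    return sorted(boosted, key=lambda item: item[1], reverse=True)


chunks = [
    ("vault/other-project/notes.md", 0.6),
    ("vault/p05-interferometer/optics.md", 0.5),
]
with_match = boost_by_hint(chunks, "p05")
without_match = boost_by_hint(chunks, "no-such-project")
```

With a matching hint the interferometer chunk moves to the top; with a non-matching hint the original ranking is returned and nothing is lost. The worst case for an un-canonicalized hint is a missed boost, not a missing result.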

## Why this is the trust hierarchy in action

The whole point of AtoCore is the trust hierarchy from the operating
model:

1. Trusted Project State (Layer 3) is the most authoritative layer
2. Memories (active) are second
3. Source chunks (raw retrieved content) are last

If a caller passes the alias `p05` and Layer 3 was written under
`p05-interferometer`, and the lookup fails to find the canonical
row, **the trust hierarchy collapses**. The most-authoritative
layer is silently invisible to the caller. The system would still
return *something* — namely, lower-trust retrieved chunks — and the
human would never know they got a degraded answer.

The canonicalization helper is what makes the trust hierarchy
**dependable**. Layer 3 is supposed to win every time. To win it
has to be findable. To be findable, the lookup key has to match
how the row was stored. And the only way to guarantee that match
across every entry point is to canonicalize at every boundary.
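
The failure mode is easy to reproduce in miniature. In the sketch below, `TRUSTED_STATE`, `ALIASES`, and this toy `build_context` are hypothetical; the `canonicalize` flag exists only to contrast the two behaviors and is not an opt-out the real code offers.

```python
from typing import Optional

# Hypothetical data: Layer 3 state stored under the canonical id.
TRUSTED_STATE = {"p05-interferometer": ["[trusted] current design decision"]}
ALIASES = {"p05": "p05-interferometer"}


def resolve_project_name(name: Optional[str]) -> str:
    """Simplified stand-in for the real helper."""
    if not name:
        return ""
    return ALIASES.get(name.lower(), name)


def build_context(project_hint, chunks, canonicalize=True):
    """Put trusted state first, then lower-trust retrieved chunks."""
    if canonicalize:
        key = resolve_project_name(project_hint)
    else:
        key = project_hint or ""
    trusted = TRUSTED_STATE.get(key, [])
    return trusted + list(chunks)


chunks = ["[chunk] retrieved paragraph"]
degraded = build_context("p05", chunks, canonicalize=False)
correct = build_context("p05", chunks)
```

`degraded` contains only the retrieved chunk and raises no error, which is exactly the silent downgrade described above; `correct` leads with the trusted entry.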

## The rule for new entry points

When you add a new service-layer function that takes a project name,
follow this checklist:

1. **Does the function read or write a row keyed by project?** If
   yes, you must call `resolve_project_name`. If no (e.g. it only
   takes `project` as a label for logging), you may skip the
   canonicalization but you should add a comment explaining why.
2. **Where does the canonicalization go?** As the first statement
   after input validation. Not later, not "before storage", not
   "in the helper that does the actual write". As the first
   statement, so any subsequent service call inside the function
   sees the canonical value.
3. **Add a regression test that uses an alias.** Use the
   `project_registry` fixture from `tests/conftest.py` to set up
   a temp registry with at least one project + aliases, then
   verify the new function works when called with the alias and
   when called with the canonical id.
4. **If the function can be called with `None` or empty string,
   verify that path too.** The helper handles it correctly but
   the function-under-test might not.

## How the `project_registry` test fixture works

`tests/conftest.py::project_registry` returns a callable that
takes one or more `(project_id, [aliases])` tuples (or just a bare
`project_id` string), writes them into a temp registry file,
points `ATOCORE_PROJECT_REGISTRY_PATH` at it, and reloads
`config.settings`. Use it like:

```python
def test_my_new_thing_canonicalizes(project_registry):
    project_registry(("p05-interferometer", ["p05", "interferometer"]))

    # ... call your service function with "p05" ...
    # ... assert it works the same as if you'd passed "p05-interferometer" ...
```

The fixture is reused by all 12 alias-canonicalization regression
tests added in `fb6298a`. Following the same pattern for new
features is the cheapest way to keep the contract intact.
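
For readers who have not opened `tests/conftest.py`, the core of such a fixture might look roughly like the sketch below. The JSON layout (`{"projects": [{"id": ..., "aliases": [...]}]}`), the file name, and the `write_registry` helper are assumptions; only the environment variable name comes from the description above.

```python
import json
import os
import tempfile
from pathlib import Path


def write_registry(directory: Path, *entries) -> Path:
    """Write a temp registry file and point the env var at it.

    Each entry is either a bare project_id string or a
    (project_id, [aliases]) tuple. The JSON shape is hypothetical.
    """
    projects = []
    for entry in entries:
        if isinstance(entry, str):
            project_id, aliases = entry, []
        else:
            project_id, aliases = entry
        projects.append({"id": project_id, "aliases": list(aliases)})

    path = directory / "project-registry.json"
    path.write_text(json.dumps({"projects": projects}, indent=2), encoding="utf-8")
    # A real pytest fixture would use monkeypatch.setenv so the change is
    # undone after the test, and would then reload config.settings.
    os.environ["ATOCORE_PROJECT_REGISTRY_PATH"] = str(path)
    return path


with tempfile.TemporaryDirectory() as tmp:
    registry_path = write_registry(
        Path(tmp),
        ("p05-interferometer", ["p05", "interferometer"]),
        "p06",
    )
    loaded = json.loads(registry_path.read_text(encoding="utf-8"))
```

In an actual fixture the directory would typically come from pytest's `tmp_path`, and the inner function would be what the fixture returns.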

## What this rule does NOT cover

1. **Alias creation / management.** This document is about reading
   and writing project-keyed data. Adding new projects or new
   aliases is the registry's own write path
   (`POST /projects/register`, `PUT /projects/{name}`), which
   already enforces collision detection and atomic file writes.
2. **Registry hot-reloading.** The helper calls
   `load_project_registry()` on every invocation, which reads the
   JSON file each time. There is no in-process cache. If the
   registry file changes, the next call sees the new contents.
   Performance is fine for the current registry size but if it
   becomes a bottleneck, add a versioned cache here, not at every
   call site.
3. **Cross-project deduplication.** If two different projects in
   the registry happen to share an alias, the registry's collision
   detection blocks the second one at registration time, so this
   case can't arise in practice. The helper does not handle it
   defensively.
4. **Time-bounded canonicalization.** A project's canonical id is
   stable. Aliases can be added or removed via
   `PUT /projects/{name}`, but the canonical `id` field never
   changes after registration. So a row written today under the
   canonical id will always remain findable under that id, even
   if the alias set evolves.
5. **Migration of legacy data.** Rows in the live Dalidou DB that
   were written under a name the registry does not know still work
   via the unregistered-name fallback path. Rows written under a
   *registered* alias before the canonicalization landed are a
   different case: lookups now resolve the alias to the canonical
   id, so, going by the helper's logic, such rows would no longer
   match until they are re-keyed. Nothing is migrated automatically.
   A future migration script could walk the DB and re-key any rows
   whose `project` field matches a known alias to the canonical id;
   tracked as an open follow-up below.
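
The re-keying idea in item 5 could be sketched as follows. This is not the planned migration script: the table allow-list, the `project` column name on every table, and the `rekey_aliases` function are assumptions, and a real run would take the alias map from the registry.

```python
import sqlite3

# Hypothetical allow-list; table names are interpolated into SQL below,
# so they must never come from user input.
ALLOWED_TABLES = {"memories", "interactions"}


def rekey_aliases(conn, table, alias_to_id, dry_run=True):
    """Count rows stored under a known alias; rewrite them unless dry_run."""
    if table not in ALLOWED_TABLES:
        raise ValueError(f"unexpected table: {table}")
    matched = 0
    for alias, canonical in alias_to_id.items():
        (count,) = conn.execute(
            f"SELECT COUNT(*) FROM {table} WHERE lower(project) = ?",
            (alias.lower(),),
        ).fetchone()
        matched += count
        if count and not dry_run:
            conn.execute(
                f"UPDATE {table} SET project = ? WHERE lower(project) = ?",
                (canonical, alias.lower()),
            )
    if not dry_run:
        conn.commit()
    return matched


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE memories (id INTEGER PRIMARY KEY, project TEXT)")
conn.executemany(
    "INSERT INTO memories (project) VALUES (?)",
    [("p05",), ("P05",), ("p05-interferometer",), ("unregistered-name",)],
)
aliases = {"p05": "p05-interferometer", "interferometer": "p05-interferometer"}

would_change = rekey_aliases(conn, "memories", aliases, dry_run=True)
changed = rekey_aliases(conn, "memories", aliases, dry_run=False)
projects = [row[0] for row in conn.execute("SELECT project FROM memories ORDER BY id")]
```

The dry run reports how many rows would move without touching them, and unregistered names are left alone in both modes.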

## What this enables for the engineering layer V1

When the engineering layer ships per `engineering-v1-acceptance.md`,
it adds at least these new project-keyed surfaces:

- `entities` table with a `project_id` column
- `relationships` table that joins entities, indirectly project-keyed
- `conflicts` table with a `project` column
- `mirror_regeneration_failures` table with a `project` column
- new endpoints: `POST /entities/...`, `POST /ingest/kb-cad/export`,
  `POST /ingest/kb-fem/export`, `GET /mirror/{project}/...`,
  `GET /conflicts?project=...`

**Every one of those write/read paths needs to call
`resolve_project_name` at its service-layer entry point**, following
the same pattern as the eight existing call sites listed above. The
implementation sprint should:

1. Apply the helper at each new service entry point as the first
   statement after input validation
2. Add a regression test using the `project_registry` fixture that
   exercises an alias against each new entry point
3. Treat any new service function that takes a project name without
   calling `resolve_project_name` as a code review failure

The pattern is simple enough to follow without thinking, which is
exactly the property we want for a contract that has to hold
across many independent additions.
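
Step 3 could be partly automated. The sketch below is one possible review aid, not an existing AtoCore tool: it uses Python's `ast` module to list functions that accept a project-like parameter but never call `resolve_project_name`. The parameter names it looks for are assumptions, and it does not check that the call is the first statement after validation.

```python
import ast

# Hypothetical set of parameter names that indicate a project name.
PROJECT_PARAMS = {"project", "project_name", "project_hint"}


def functions_missing_canonicalization(source: str) -> list:
    """Return names of functions that take a project parameter but
    contain no call to resolve_project_name anywhere in their body."""
    missing = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        args = node.args
        params = {a.arg for a in args.posonlyargs + args.args + args.kwonlyargs}
        if not params & PROJECT_PARAMS:
            continue
        called = set()
        for inner in ast.walk(node):
            if isinstance(inner, ast.Call):
                func = inner.func
                if isinstance(func, ast.Name):
                    called.add(func.id)        # resolve_project_name(...)
                elif isinstance(func, ast.Attribute):
                    called.add(func.attr)      # registry.resolve_project_name(...)
        if "resolve_project_name" not in called:
            missing.append(node.name)
    return missing


SAMPLE = '''
def good(project):
    project = resolve_project_name(project)
    return project

def bad(project_name):
    return project_name

def unrelated(value):
    return value
'''

flagged = functions_missing_canonicalization(SAMPLE)
print(flagged)
```

A check like this will flag functions that legitimately skip canonicalization (for example, logging-only labels), so it is a prompt for the reviewer and cannot replace the regression tests.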

## Open follow-ups

These are things the canonicalization story still has open. None
are blockers, but they're the rough edges to be aware of.

1. **Legacy alias data migration.** If the live Dalidou DB has any
   rows written under a registered alias before `fb6298a` landed,
   the unregistered-name fallback does not cover them: the alias
   now resolves to the canonical id, so canonicalized lookups
   would miss those rows (only names absent from the registry pass
   through unchanged). A small migration script could walk
   `memories`, `interactions`, `project_state`, and `projects`,
   find any names that match a registry alias, and re-key them to
   the canonical id. Worth doing once before the engineering layer
   V1 lands. Estimated cost: ~30 LOC + a dry-run mode + a one-time
   run.
2. **Registry file caching.** `load_project_registry()` reads the
   JSON file on every `resolve_project_name` call. With ~5
   projects this is fine; with 50+ it would warrant a versioned
   cache (cache key = file mtime + size). Defer until measured.
3. **Case sensitivity audit.** The helper uses
   `get_registered_project`, which lowercases for comparison. The
   stored canonical id keeps its original casing. No bug is known
   today (the full suite passes), but passing tests only cover the
   cases they exercise, so this is worth re-confirming when the
   engineering layer adds entity-side storage.
4. **`_rank_chunks`'s secondary substring boost.** Mentioned
   earlier; still uses the raw hint. Replace it with the same
   helper-driven approach the retriever uses, OR delete it as
   redundant once we confirm the retriever's primary boost is
   sufficient.
5. **Documentation discoverability.** This doc lives under
   `docs/architecture/`. The contract is also restated in the
   docstring of `resolve_project_name` and referenced from each
   call site's comment. That redundancy is intentional — the
   contract is too easy to forget to live in only one place.
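
The versioned cache from item 2 (cache key = file mtime + size) might look like the sketch below. `load_registry_cached` and the module-level `_cache` dict are hypothetical names, and the JSON content is a placeholder; the existing `load_project_registry()` is not shown or modified here.

```python
import json
import tempfile
from pathlib import Path

# path -> ((mtime_ns, size), parsed registry)
_cache = {}


def load_registry_cached(path: Path) -> dict:
    """Re-read the registry file only when its mtime or size changes."""
    stat = path.stat()
    version = (stat.st_mtime_ns, stat.st_size)
    cached = _cache.get(path)
    if cached is not None and cached[0] == version:
        return cached[1]
    data = json.loads(path.read_text(encoding="utf-8"))
    _cache[path] = (version, data)
    return data


with tempfile.TemporaryDirectory() as tmp:
    registry = Path(tmp) / "registry.json"
    registry.write_text('{"projects": []}', encoding="utf-8")

    first = load_registry_cached(registry)
    second = load_registry_cached(registry)   # served from the cache

    # Rewriting the file changes its size, which invalidates the entry.
    registry.write_text('{"projects": [{"id": "p05-interferometer"}]}', encoding="utf-8")
    third = load_registry_cached(registry)
```

Keeping this inside the loader means call sites stay unchanged, which matches the earlier advice to add the cache in one place and not at every caller. One known limitation: an edit that preserves both size and mtime would go unnoticed.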

## Quick reference card

Copy-pasteable for new service functions:

```python
from atocore.projects.registry import resolve_project_name


def my_new_service_entry_point(
    project_name: str,
    other_args: ...,
) -> ...:
    # Validate inputs first
    if not project_name:
        raise ValueError("project_name is required")

    # Canonicalize through the registry as the first thing after
    # validation. Every subsequent operation in this function uses
    # the canonical id, so storage and queries are guaranteed
    # consistent across alias and canonical-id callers.
    project_name = resolve_project_name(project_name)

    # ... rest of the function ...
```

## TL;DR

- One helper, one rule: `resolve_project_name` at every service-layer
  entry point that takes a project name
- Currently called in 8 places across builder, project_state,
  interactions, and memory; all 8 listed in this doc
- Backwards-compat path returns unregistered names unchanged so
  legacy data still works without a migration
- The trust hierarchy depends on this helper being applied
  everywhere — Layer 3 trusted state has to be findable for it to
  win the trust battle
- Use the `project_registry` test fixture to add regression tests
  for any new service function that takes a project name
- The engineering layer V1 implementation must follow the same
  pattern at every new service entry point
- Open follow-ups: legacy data migration, registry caching,
  redundant substring boost cleanup
docs(arch): project-identity-canonicalization contract

Codifies the helper-at-every-service-boundary rule that fb6298a
implemented across the eight current callsites. The contract is
intentionally simple but easy to forget, so it lives in its own doc
that the engineering layer V1 implementation sprint can read before
adding new project-keyed entity surfaces.

docs/architecture/project-identity-canonicalization.md
------------------------------------------------------
- The contract: every read/write that takes a project name MUST call
  resolve_project_name() before the value crosses a service boundary;
  canonicalization happens once, at the first statement after input
  validation, never later
- The helper API: resolve_project_name(name) returns the canonical
  project_id for registered names, the input unchanged for empty or
  unregistered names (the second case is the backwards-compat path
  for hand-curated state predating the registry)
- Full table of the 8 current callsites: builder.build_context,
  project_state.set_state/get_state/invalidate_state,
  interactions.record_interaction/list_interactions,
  memory.create_memory/get_memories
- Where the helper is intentionally NOT called and why: legacy
  ensure_project lookup, retriever's own _project_match_boost (which
  already calls get_registered_project), _rank_chunks secondary
  substring boost (multiplicative not filter, can't drop relevant
  chunks), update_memory (no project field update), unregistered
  names (the rule applied to a name with no record)
- Why this is the trust hierarchy in action: Layer 3 trusted state
  has to be findable to win the trust battle; an un-canonicalized
  lookup silently makes Layer 3 invisible and the system falls
  through to lower-trust retrieved chunks with no signal to the human
- The 4-step rule for new entry points: identify project-keyed
  reads/writes, place the call as the first statement after
  validation, add a regression test using the project_registry
  fixture, verify None/empty paths
- How the project_registry fixture works with a copy-pasteable
  example
- What the rule does NOT cover: alias creation (registry's own write
  path), registry hot-reloading (no in-process cache by design),
  cross-project dedup (collision detection at registration),
  time-bounded canonicalization (canonical id is stable forever),
  legacy data migration (open follow-up)
- Engineering layer V1 implications: every new service entry point in
  the entities/relationships/conflicts/mirror modules must apply the
  helper at the first statement after validation; treated as code
  review failure if missing
- Open follow-ups: legacy data migration script (~30 LOC), registry
  file caching when projects scale beyond ~50, case sensitivity audit
  when entity-side storage lands, _rank_chunks cleanup, documentation
  discoverability (intentional redundancy between this doc, the
  helper docstring, and per-callsite comments)
- Quick reference card: copy-pasteable template for new service
  functions

master-plan-status.md updated
-----------------------------
- New doc added to the engineering-layer planning sprint listing
- Marked as required reading before V1 implementation begins
- Note that V1 must apply the contract at every new service-layer
  entry point

Pure doc work, no code changes. Full suite stays at 174 passing
because no source changed.

2026-04-07 19:32:31 -04:00
