fix: critical bugs and hardening from validation audit
- Fix infinite loop in chunker _hard_split when overlap >= max_size
- Fix tag filter false positives by quoting tag values in ChromaDB query
- Fix score boost semantics (additive → multiplicative) to stay within 0-1 range
- Add error handling and type hints to all API routes
- Update README with proper project documentation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@@ -137,6 +137,10 @@ def _split_by_paragraphs(

 def _hard_split(text: str, max_size: int, overlap: int) -> list[str]:
     """Hard split text at max_size with overlap."""
+    # Prevent infinite loop: overlap must be less than max_size
+    if overlap >= max_size:
+        overlap = max_size // 4
+
     chunks = []
     start = 0
     while start < len(text):
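
The hunk stops at the `while` line, so the loop body is not visible here. As a rough illustration of why the guard matters, the following is a minimal, self-contained sketch of a sliding-window splitter. The function name `hard_split`, the `step` variable, the `max_size <= 0` check, and the early `break` are assumptions made for this example and are not taken from the repository. The key point is that the window advances by `max_size - overlap`, which is zero or negative (and so never terminates) unless `overlap < max_size`.

```python
def hard_split(text: str, max_size: int, overlap: int) -> list[str]:
    """Hypothetical stand-in for _hard_split; the real loop body is not shown in the diff."""
    if max_size <= 0:
        # Assumption: reject non-positive sizes, which the guard below cannot repair.
        raise ValueError("max_size must be positive")

    # Same guard as in the commit: overlap must be smaller than max_size.
    if overlap >= max_size:
        overlap = max_size // 4
    overlap = max(overlap, 0)  # Assumption: treat negative overlap as zero.

    step = max_size - overlap  # At least 1 after the guard, so the loop advances.
    chunks: list[str] = []
    start = 0
    while start < len(text):
        end = start + max_size
        chunks.append(text[start:end])
        if end >= len(text):
            break  # The last window already reached the end of the text.
        start += step
    return chunks


if __name__ == "__main__":
    print(hard_split("abcdefghij", 4, 1))
    # An overlap equal to max_size is reduced to max_size // 4 instead of looping forever.
    print(hard_split("abcdefghij", 4, 4))
```

Without the guard, a call such as `hard_split("abcdefghij", 4, 4)` would compute a step of 0 and append the same window indefinitely.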