Commit graph

5 commits

SHA1 Message Date
5505d2b217 Fix compress endpoint to use request.messages correctly 2026-03-22 22:47:49 -04:00
9ad11f5be4 Fix compression endpoint request validation and message schema 2026-03-22 22:47:07 -04:00
b8edf40010 Major refactor: remove semantic cache, add config, auth, improve RAG performance, fix tags JSON 2026-03-22 22:32:44 -04:00
82fd963577 Add token-saving patterns: semantic cache, RAG, compression
- semantic_cache.py: Semantic similarity matching for cache hits
- rag.py: RAG-based context selection with local embeddings
- compression.py: Conversation history summarization
- New endpoints: /cache/semantic-lookup, /cache/semantic-store, /context/rag, /compress
- Uses sentence-transformers (all-MiniLM-L6-v2) - no external API calls
- No vector DB needed - cosine similarity on small datasets is fast enough
- Expected savings: 50-70% token reduction
2026-03-22 21:32:08 -04:00
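
Note on commit 82fd963577: the "no vector DB needed" point can be illustrated with a brute-force cosine-similarity scan over cached entries. This is a minimal sketch, not the code in semantic_cache.py. It assumes embeddings are already computed (for example by all-MiniLM-L6-v2) and stored as plain lists of floats. The names cosine_similarity and semantic_lookup and the 0.85 threshold are hypothetical, chosen only for illustration.

```python
import math
from typing import List, Optional, Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two equal-length vectors.

    Returns 0.0 if either vector is all zeros, to avoid dividing by zero.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def semantic_lookup(
    query: Vector,
    entries: List[Tuple[Vector, str]],
    threshold: float = 0.85,  # hypothetical cutoff, not taken from the repo
) -> Optional[str]:
    """Linear scan: return the cached response whose embedding is most
    similar to the query, or None if nothing reaches the threshold."""
    best_score = threshold
    best_response = None
    for embedding, response in entries:
        score = cosine_similarity(query, embedding)
        if score >= best_score:
            best_score, best_response = score, response
    return best_response


# Toy 3-dimensional vectors stand in for real sentence embeddings.
cache = [
    ([1.0, 0.0, 0.0], "cached answer A"),
    ([0.0, 1.0, 0.0], "cached answer B"),
]
hit = semantic_lookup([0.9, 0.1, 0.0], cache)   # close to the first entry
miss = semantic_lookup([1.0, 1.0, 0.0], cache)  # equally far from both entries
```

Each lookup costs one pass over all cached entries, which is why this approach only suits small datasets, as the commit message notes.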
7f7699ff94 Initial commit: Skills API with MCP servers
- FastAPI backend with SQLite (ai.db)
- Tables: skills, snippets, conventions, cache, memory
- MCP servers: homelab, gameservers, skills
- Docker Compose setup
- Seed data with 8 skills, 2 conventions, 2 snippets
- Token savings patterns via context bundles and caching
2026-03-22 21:18:23 -04:00