HTTP handlers for the embedding service.
Submodules:
- common: shared input validation and readiness helpers.
- dense: POST /v1/embeddings (OpenAI-compatible dense embeddings).
- sparse: POST /v1/sparse-embeddings (BGE-M3 SPLADE-style sparse embeddings).
- both: POST /v1/embeddings:both (paired dense + sparse output in one pass).
- health: GET /health (readiness + tuning details).
- models: GET /v1/models (fleet discovery).
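The method and path pairs listed above can be read as a small routing table. The following is a minimal sketch using only the standard library; the `Route` enum and the `route` function are hypothetical names introduced for illustration, and the actual service presumably registers these paths through its web framework rather than a hand-written match.

```rust
/// Handler modules named in the submodule list (hypothetical enum for illustration).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Route {
    Dense,
    Sparse,
    Both,
    Health,
    Models,
}

/// Hypothetical dispatcher: maps an HTTP method and an exact path to a handler module.
/// Anything not listed in the module docs yields `None`.
fn route(method: &str, path: &str) -> Option<Route> {
    match (method, path) {
        ("POST", "/v1/embeddings") => Some(Route::Dense),
        ("POST", "/v1/sparse-embeddings") => Some(Route::Sparse),
        ("POST", "/v1/embeddings:both") => Some(Route::Both),
        ("GET", "/health") => Some(Route::Health),
        ("GET", "/v1/models") => Some(Route::Models),
        _ => None,
    }
}

fn main() {
    for (method, path) in [
        ("POST", "/v1/embeddings"),
        ("POST", "/v1/embeddings:both"),
        ("GET", "/health"),
        ("GET", "/v1/embeddings"), // wrong method, so no handler matches
    ] {
        println!("{method} {path} -> {:?}", route(method, path));
    }
}
```

Note that `/v1/embeddings` and `/v1/embeddings:both` are distinct exact paths in this sketch, so the `:both` suffix selects a different handler rather than acting as a parameter.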
Modules
- both 🔒: POST /v1/embeddings:both handler (dense + sparse embeddings in one pass).
- common 🔒: Shared input validation and service-readiness helpers used by all handlers.
- dense 🔒: POST /v1/embeddings handler (OpenAI-compatible dense embeddings).
- health 🔒: GET /health handler (readiness status, worker counts, and tuning diagnostics).
- models 🔒: GET /v1/models handler (OpenAI-compatible fleet discovery endpoint).
- sparse 🔒: POST /v1/sparse-embeddings handler (BGE-M3 SPLADE-style sparse embeddings).
Functions
- both_embeddings: Handles POST /v1/embeddings:both; returns dense and sparse embeddings in one pass.
- dense_embeddings: Handles POST /v1/embeddings; returns dense (float32) embeddings.
- health: Handles GET /health; returns readiness status, worker counts, and tuning diagnostics.
- models: Returns an OpenAI-compatible models list confirming BGE-M3 is resident.
- sparse_embeddings: Handles POST /v1/sparse-embeddings; returns sparse (SPLADE-style) embeddings.
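Because POST /v1/embeddings is described as OpenAI-compatible, a client would plausibly send the `model` and `input` fields used by the OpenAI embeddings API. The sketch below builds such a request by hand with the standard library only. The helper names (`json_escape`, `embeddings_body`, `embeddings_request`), the host `localhost:8080`, and the model id `bge-m3` are assumptions for illustration; this page does not state which fields the handler actually requires.

```rust
/// Escapes a string so it can be placed inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \uXXXX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Builds an OpenAI-style embeddings body: {"model": ..., "input": [...]}.
/// Assumption: the service accepts the same field names as the OpenAI API.
fn embeddings_body(model: &str, inputs: &[&str]) -> String {
    let items: Vec<String> = inputs
        .iter()
        .map(|s| format!("\"{}\"", json_escape(s)))
        .collect();
    format!(
        "{{\"model\":\"{}\",\"input\":[{}]}}",
        json_escape(model),
        items.join(",")
    )
}

/// Wraps the body in a raw HTTP/1.1 request for POST /v1/embeddings.
fn embeddings_request(host: &str, model: &str, inputs: &[&str]) -> String {
    let body = embeddings_body(model, inputs);
    format!(
        "POST /v1/embeddings HTTP/1.1\r\nHost: {host}\r\nContent-Type: application/json\r\nContent-Length: {}\r\nConnection: close\r\n\r\n{body}",
        body.len()
    )
}

fn main() {
    // Host and model id are placeholders, not values taken from the service docs.
    let req = embeddings_request("localhost:8080", "bge-m3", &["hello world"]);
    println!("{req}");
}
```

The resulting string could be written to a `std::net::TcpStream` connected to the service. Under the same assumptions, the sparse and paired endpoints would differ only in the request path.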