pub(super) fn embed_both(
    session: &mut Session,
    tokenizer: &Tokenizer,
    texts: &[String],
    cost_model: &CostModel,
    model_variant: ModelVariant,
) -> Result<(Vec<DualEmbedding>, EmbedStats)>
Produces paired dense + sparse embeddings using one session.run() per chunk.
Both projections are derived from the same forward pass:
- FP32: extracts both sentence_embedding (dense) and token_embeddings (sparse base) from the model’s dual outputs.
- FP16/INT8: extracts dense from the CLS token (position 0) of last_hidden_state, and sparse from the full hidden states of the same tensor. This avoids a second forward pass.
The results are numerically equivalent to calling super::dense::embed_dense and
super::sparse::embed_sparse separately, within floating-point rounding tolerance.
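
The FP16/INT8 path described above can be illustrated with a minimal sketch. It assumes last_hidden_state has already been copied into a flat, row-major f32 buffer of shape [batch, seq_len, hidden]. The helper name split_last_hidden_state and its parameters are hypothetical and are not part of this module; the actual implementation may read the tensor differently, and the step that turns token rows into sparse weights is not shown.

```rust
/// Hypothetical helper (not part of the documented API): splits a flat,
/// row-major `last_hidden_state` buffer of shape [batch, seq_len, hidden]
/// into, per sequence, a dense vector and the per-token rows.
fn split_last_hidden_state(
    data: &[f32],
    batch: usize,
    seq_len: usize,
    hidden: usize,
) -> Vec<(Vec<f32>, Vec<Vec<f32>>)> {
    assert!(seq_len > 0 && hidden > 0, "seq_len and hidden must be non-zero");
    assert_eq!(data.len(), batch * seq_len * hidden, "shape mismatch");

    data.chunks_exact(seq_len * hidden)
        .map(|seq| {
            // Every token row of this sequence: the base for the sparse projection.
            let tokens: Vec<Vec<f32>> =
                seq.chunks_exact(hidden).map(|row| row.to_vec()).collect();
            // Dense embedding: the CLS token at position 0 of the same tensor,
            // so no second forward pass is needed.
            let dense = tokens[0].clone();
            (dense, tokens)
        })
        .collect()
}
```

Each returned pair holds the dense vector first and the token rows second; both are views of the same model output, which is why a single session.run() per chunk suffices.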