feature extraction

bge-small-en-v1.5

A small English dense embedding model from BAAI's BGE (BAAI General Embedding) series. It produces 384-dimensional vectors and is released under the MIT license. Trained with a retrieval-focused strategy, it achieves competitive MTEB retrieval scores for its parameter count. It suits embedding workflows where throughput and cost matter more than peak accuracy.

Use cases

  • Embedding at scale where cost per inference matters
  • Semantic search in memory-constrained edge deployments
  • RAG pipeline embedding for high-volume document corpora
  • Lightweight similarity scoring in microservices
  • Batch embedding of large content repositories
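The semantic search and similarity scoring use cases above both reduce to comparing embedding vectors. The following is a minimal sketch of that comparison step, assuming the embeddings have already been computed elsewhere. The helper names `cosine_similarity` and `top_k` are hypothetical, and the toy 3-dimensional vectors stand in for real 384-dimensional embeddings.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def top_k(
    query_vec: Sequence[float],
    doc_vecs: Sequence[Sequence[float]],
    k: int = 3,
) -> List[Tuple[int, float]]:
    """Return (document index, score) pairs for the k most similar documents."""
    scored = [(i, cosine_similarity(query_vec, d)) for i, d in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy vectors (hypothetical data) standing in for real embeddings.
docs = [[1.0, 0.0, 0.0], [0.7, 0.7, 0.0], [0.0, 0.0, 1.0]]
query = [1.0, 0.1, 0.0]
print(top_k(query, docs, k=2))
```

A brute-force scan like this is adequate for small collections. For high-volume corpora, a vector index would normally replace the linear pass.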

Pros

  • MIT license for broad commercial use
  • 384-dim output supports large vector stores at lower memory cost
  • Competitive MTEB retrieval performance relative to model size
  • Fast CPU inference; ONNX and OpenVINO export supported
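The memory advantage of 384-dimensional output can be estimated with simple arithmetic. The sketch below assumes vectors stored as raw float32 values (4 bytes each) and ignores any index or metadata overhead, which varies by vector store. The function name `raw_vector_bytes` is hypothetical, and the 768 and 1024 dimensions are included only as comparison points for larger embedding sizes.

```python
def raw_vector_bytes(num_vectors: int, dim: int = 384, bytes_per_value: int = 4) -> int:
    """Bytes needed to store the raw vectors, excluding index overhead."""
    return num_vectors * dim * bytes_per_value


def to_gib(num_bytes: int) -> float:
    """Convert bytes to gibibytes (2**30 bytes)."""
    return num_bytes / (1024 ** 3)


# One million float32 vectors at three embedding sizes.
for dim in (384, 768, 1024):
    size = raw_vector_bytes(1_000_000, dim=dim)
    print(f"{dim:>4} dims: {size:,} bytes ({to_gib(size):.2f} GiB)")
```

At one million vectors, 384 dimensions need 1,536,000,000 bytes of raw storage, half of what 768 dimensions need.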

Cons

  • Smaller capacity limits accuracy ceiling on complex semantic distinctions
  • English-only with no multilingual or cross-lingual transfer
  • Falls behind larger BGE-base and BGE-large on out-of-distribution retrieval
  • Supports only one fixed, optional query prefix for asymmetric retrieval, not the per-task instructions of newer BGE models
  • Narrower community adoption than the most widely used models in the sentence-transformers ecosystem

FAQ

What is bge-small-en-v1.5 used for?

It is used for embedding at scale where cost per inference matters, semantic search in memory-constrained edge deployments, RAG pipeline embedding for high-volume document corpora, lightweight similarity scoring in microservices, and batch embedding of large content repositories.

Is bge-small-en-v1.5 free to use?

Yes. bge-small-en-v1.5 is an open-source model published on Hugging Face under the MIT license, which permits commercial use. Check the model card for the current license text.

How do I run bge-small-en-v1.5 locally?

The model can be loaded with sentence-transformers or transformers, and its tags also list ONNX weights and text-embeddings-inference compatibility. See the model card for framework-specific instructions and hardware requirements.
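As one possible starting point, the sketch below loads the model through the sentence-transformers library. It assumes `sentence-transformers` is installed and that the model is fetched from the Hugging Face Hub under the id `BAAI/bge-small-en-v1.5`. The helper names `load_model`, `embed`, and `main` are hypothetical, and the sample sentences are made up.

```python
from typing import List

MODEL_ID = "BAAI/bge-small-en-v1.5"  # Hugging Face Hub id (assumed source of weights)


def load_model(model_id: str = MODEL_ID):
    """Load the model; downloads the weights on first use."""
    # Imported lazily so this module can be imported without the dependency.
    from sentence_transformers import SentenceTransformer

    return SentenceTransformer(model_id)


def embed(model, texts: List[str]):
    """Encode texts into unit-length vectors.

    With normalized vectors, a dot product equals cosine similarity.
    """
    return model.encode(texts, normalize_embeddings=True)


def main() -> None:
    model = load_model()
    vectors = embed(
        model,
        [
            "How do I reset my password?",
            "Steps to recover account access",
        ],
    )
    # Two sentences, 384 dimensions each.
    print(vectors.shape)
    # Dot product of normalized vectors gives their cosine similarity.
    print(float(vectors[0] @ vectors[1]))
```

Calling `main()` triggers the model download on first run and then runs on CPU by default. Hardware requirements and other loading paths are described on the model card.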

Tags

sentence-transformers, pytorch, onnx, safetensors, bert, feature-extraction, sentence-similarity, transformers, mteb, en, arxiv:2401.03462, arxiv:2312.15503, arxiv:2311.13534, arxiv:2310.07554, arxiv:2309.07597, license:mit, model-index, text-embeddings-inference, endpoints_compatible, deploy:azure