← All LLMs
text-embedding-3-large
OpenAI
text
OpenAI embedding model for semantic search.
- Developer
- OpenAI
- Release date
- Jan 25, 2024
- Parameters
- Undisclosed
- Corpus size
- Undisclosed
- License
- Proprietary
- Context window
- 128K tokens
- Modalities
- text
Links
Learn this model
Tutorial tailored to text-embedding-3-large—cost, capabilities, API setup, and production patterns based on this model's specs (not generic copy for every LLM).
Cost & access
text-embedding-3-large is priced per million embedding tokens (or per request), not like chat completion. Compare dimensions and batch APIs on OpenAI's pricing page. With a 128K tokens context window, long PDFs or chat histories increase input tokens quickly—trim history or summarize older turns in production.
Functional understanding
- OpenAI embedding model for semantic search.
- Modalities: text · License: Proprietary · Released 2024-01-25.
- Best-fit workflows for this model:
- • Semantic search, deduplication, and RAG retrieval—text-embedding-3-large outputs vectors, not chat prose.
- • Clustering support tickets, docs, or product catalogs by meaning.
Technical foundation
- OpenAI reports Undisclosed parameters; training data: Undisclosed.
- Context: 128K tokens. Open weights: no.
- text-embedding-3-large is positioned as a embedding model in the OpenAI lineup.
First API call
Use OpenAI's embeddings endpoint; pass text-embedding-3-large as the model name and store vectors in your DB or vector index.
from openai import OpenAI
client = OpenAI()
vec = client.embeddings.create(
model="text-embedding-3-large",
input="Text to embed for text-embedding-3-large",
)
print(len(vec.data[0].embedding), "dimensions")Important technical topics
- Prompting text-embedding-3-large: be explicit about output format. Weak: "Analyze this." Better: "Return JSON with fields id, total, date for OpenAI billing data."
- Temperature: use 0–0.3 for extraction and compliance on text-embedding-3-large; 0.7–1.0 for brainstorming.
- Tokens: text-embedding-3-large bills by tokens (~¾ word each). Undisclosed parameters affect capability; your bill is driven by context length and call volume.
- Context window (128K tokens): everything in one request—system prompt, tools, RAG chunks, and history—must fit. Truncate or summarize when approaching the limit for text-embedding-3-large.
Real enterprise patterns
- Index docs with text-embedding-3-large, then retrieve top-k chunks before calling a chat model.
- Hybrid search: combine keyword (BM25) + embeddings for better recall.
- Version embedding indexes when you change models—dimensions may differ.
- Monitor drift: re-embed when OpenAI ships a new embedding revision.
Production & security
- Secrets: never commit keys for text-embedding-3-large; use vault + per-environment rotation.
- PII: mask before inference; log redacted prompts only.
- Observability: trace id per request; log model=text-embedding-3-large, tokens in/out, latency.
- Rate limits: handle OpenAI 429/5xx with exponential backoff and circuit breakers.
- Guardrails: schema-validate JSON; block disallowed topics; cross-check numbers against source docs.
Mini projects with this model
- Semantic FAQ: embed help-center articles with text-embedding-3-large, answer from nearest neighbors.
- Duplicate ticket detector for support queues.
- Product recommendation by description similarity.
- Eval: NDCG@k on labeled query–doc pairs.
Suggested stack
- Language: Python 3.11+
- Embeddings: text-embedding-3-large (OpenAI)
- Vector DB: Pinecone, Chroma, pgvector, or Weaviate
- Orchestration: LangChain or LlamaIndex for chunking
- UI: Streamlit or Next.js for internal tools
- APIs: FastAPI
Learning path
- Python basics
- HTTP/REST and environment variables
- OpenAI embeddings API for text-embedding-3-large
- Vector database fundamentals
- Chunking strategies
- RAG retrieval evaluation
- Deploy index refresh pipeline