Enterprise retrieval-augmented generation (RAG) remains integral to the current agentic AI craze. Taking advantage of the continued interest in agents, Cohere released the latest version of its embeddings model with a longer context window and broader multimodal support.
Cohere’s Embed 4 builds on the multimodal updates of Embed 3 and adds more capabilities around unstructured data. Thanks to a 128,000-token context window, organizations can generate embeddings for documents of around 200 pages.
“Existing embedding models fail to natively understand complex multimodal business materials, leading companies to develop cumbersome data pre-processing pipelines that only slightly improve accuracy,” Cohere said in a blog post. “Embed 4 solves this problem, allowing enterprises and their employees to efficiently surface insights that are hidden within mountains of unsearchable information.”
Enterprises can deploy Embed 4 on virtual private clouds or on-premises technology stacks for added data security.
Companies can generate embeddings to transform their documents or other data into numerical representations for RAG use cases. Agents can then reference these embeddings to answer prompts.
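As a rough illustration of that flow, the sketch below turns documents into vectors and ranks them against a query by cosine similarity. It does not call Embed 4 or any Cohere API: `toy_embed` is a hypothetical stand-in that only counts words, and the sample documents are invented for the example. A production system would swap in a real embedding model and a vector database.

```python
import math
import re
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: str, documents: list[str], k: int = 1) -> list[str]:
    """Return the k documents whose vectors are closest to the query vector."""
    query_vec = toy_embed(query)
    ranked = sorted(
        documents,
        key=lambda doc: cosine(query_vec, toy_embed(doc)),
        reverse=True,
    )
    return ranked[:k]


# Invented sample corpus, for illustration only.
documents = [
    "Quarterly revenue grew in the investor presentation",
    "Repair guide for replacing the pump seal",
    "Clinical trial report on dosage outcomes",
]

top = retrieve("how do I replace the pump seal", documents, k=1)
print(top)
```

In a RAG setup, the retrieved text would then be passed to a language model or agent as context for its answer.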
Domain-specific knowledge
Embed 4 “excels in regulated industries” like finance, healthcare and manufacturing, the company said. Cohere, which mainly focuses on enterprise AI use cases, said its models consider the security needs of regulated sectors and have a strong understanding of businesses.
The company trained Embed 4 “to be robust against noisy real-world data” in that it remains accurate despite the “imperfections” of enterprise data, such as spelling mistakes and formatting issues.
“It is also performant at searching over scanned documents and handwriting. These formats are common in legal paperwork, insurance invoices, and expense receipts. This capability eliminates the need for complex data preparations or pre-processing pipelines, saving businesses time and operational costs,” Cohere said.
Organizations can use Embed 4 for investor presentations, due diligence files, clinical trial reports, repair guides and product documents.
Like its predecessor, the model supports more than 100 languages.

Agora, a customer of Cohere, used Embed 4 for its AI search engine and found that the model could surface relevant products.
“E-commerce data is complex, containing images and multifaceted text descriptions. Being able to represent our products in a unified embedding makes our search faster and our internal tooling more efficient,” said Param Jaggi, Founder of Agora, in the blog post.
Agent use cases
Cohere argues that models like Embed 4 would improve agentic use cases and claims it can be “the optimal search engine” for agents and AI assistants across an enterprise.
“In addition to strong accuracy across data types, the model delivers enterprise-grade efficiency,” Cohere said. “This allows it to scale to meet the demands of large organizations.”
Cohere added that Embed 4 creates compressed data embeddings to cut high storage costs.
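Cohere's statement does not spell out how the compression works, so the following is only a generic sketch of one common approach, scalar quantization to 8-bit integers, and is not a description of Embed 4's method. The vector values and function names are made up for the example. It shows why compression cuts storage: each dimension shrinks from a 4-byte float to a 1-byte integer, at the cost of a small rounding error.

```python
from array import array


def quantize_int8(vector: list[float]) -> tuple[array, float]:
    """Map floats to signed 8-bit codes using one shared scale factor."""
    scale = max(abs(x) for x in vector) / 127 or 1.0  # avoid a zero scale
    codes = array("b", (round(x / scale) for x in vector))
    return codes, scale


def dequantize(codes: array, scale: float) -> list[float]:
    """Approximately recover the original floats from the 8-bit codes."""
    return [c * scale for c in codes]


# Invented 8-dimensional vector; real embeddings have hundreds or thousands of dimensions.
vector = [0.12, -0.5, 0.33, 0.0, 0.25, -0.07, 0.41, -0.29]

floats = array("f", vector)
float_bytes = len(floats) * floats.itemsize

codes, scale = quantize_int8(vector)
int8_bytes = len(codes) * codes.itemsize

restored = dequantize(codes, scale)
max_error = max(abs(a - b) for a, b in zip(vector, restored))

print(f"float32 storage: {float_bytes} bytes, int8 storage: {int8_bytes} bytes")
print(f"largest rounding error: {max_error:.5f}")
```

The trade-off is precision for space: similarity scores computed on the quantized vectors are close to, but not identical with, those computed on the originals.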
Embeddings and RAG-based searches let an agent reference specific documents when carrying out the tasks a request requires. Many believe this grounding produces more accurate results and reduces the chance that agents respond with incorrect or hallucinated answers.
Other embedding models that Cohere competes against include Qodo’s Qodo-Embed-1-1.5B and models from Voyage AI, which database vendor MongoDB recently acquired.