What happened
Google announced an update to the Gemini API File Search tool on May 5, 2026. Three new features stand out.
First, multimodal search. File Search can now process images and text together using Google's latest Gemini embedding technology. The model understands native image data, which cuts noise from irrelevant documents and improves both speed and accuracy.
Second, custom metadata. Builders can attach their own fields to indexed files for more focused queries.
Third, page-level citations. Each piece of indexed information now ties back to a specific page number in the source. This makes it easier to verify model answers.
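To make the last two features concrete, here is a minimal in-memory sketch of what metadata filtering and page-level citations might look like conceptually. It does not call the Gemini API. The `Chunk` type, its fields, and both helper functions are hypothetical names invented for illustration, and the sample records are made up.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """Hypothetical indexed passage; not a Gemini API type."""
    text: str
    source: str   # file the passage came from
    page: int     # page number within that file
    metadata: dict = field(default_factory=dict)  # builder-defined fields


def filter_by_metadata(chunks, **required):
    """Keep only chunks whose metadata matches every required key/value."""
    return [
        c for c in chunks
        if all(c.metadata.get(key) == value for key, value in required.items())
    ]


def format_citation(chunk):
    """Render a page-level citation for one chunk."""
    return f"{chunk.source}, p. {chunk.page}"


# Made-up sample data for illustration only.
chunks = [
    Chunk("Refunds are issued within 30 days.", "policy.pdf", 4,
          {"department": "legal", "year": 2025}),
    Chunk("Q3 revenue grew year over year.", "report.pdf", 12,
          {"department": "finance", "year": 2025}),
]

legal = filter_by_metadata(chunks, department="legal")
citations = [format_citation(c) for c in legal]
print(citations)
```

The idea is that custom fields narrow the candidate set before any answer is generated, and each surviving passage carries enough provenance (file plus page) for a reader to check the claim by hand. A managed service would do this server-side; the sketch only shows the shape of the data.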
Supported formats include PDF, DOCX, TXT, Excel, CSV, JSON, SQL, Jupyter notebooks, HTML, Markdown, and PNG and JPEG images up to 4K. Storage and query-time embeddings are free. Developers pay only for initial indexing and standard Gemini tokens.
Why it matters
Retrieval-augmented generation, often called RAG, is the most common pattern for grounding AI in private data. Native multimodal RAG removes the need for separate text and image pipelines. Page-level citations also reduce the work needed to fact-check answers in regulated industries.
MintedBrain take
For builders, this update narrows the gap between rolling your own RAG pipeline and using a managed one. If you have been gluing together a vector database, an OCR step, and a citation layer, take a serious look at File Search before adding more glue code.