Multimodal RAG

1. [Multimodal RAG] ColPali: Efficient Document Retrieval with Vision Language Models (2024)

2. [Multimodal RAG] VisRAG: Vision-based Retrieval-augmented Generation on Multi-modality Documents (2024)

3. [Multimodal RAG] RagVL (RagLLaVA) (2024)
