Lists (15)
(ICML 2024) Improve Context Understanding in Multimodal Large Language Models via Multimodal Composition Learning
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
The Hugging Face implementation of the Fine-grained Late-interaction Multi-modal Retriever.
The code used to train and run inference with the ColPali architecture.
The official repository for Retrieval Augmented Visual Question Answering.
pytablewriter is a Python library to write a table in various formats: AsciiDoc / CSV / Elasticsearch / HTML / JavaScript / JSON / LaTeX / LDJSON / LTSV / Markdown / MediaWiki / NumPy / Excel / Pan…
Generative Representational Instruction Tuning
Code for "Are “Hierarchical” Visual Representations Hierarchical?" in NeurIPS Workshop for Symmetry and Geometry in Neural Representations.
A lightweight open-source package to fine-tune embedding models.
[ICLR 2024] Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment
Mixture-of-Experts for Large Vision-Language Models
Implementation of PALI3 from the paper "PaLI-3 Vision Language Models: Smaller, Faster, Stronger".
Zero-shot Document Ranking with Large Language Models.
Official repository of ICCV 2021 - Image Retrieval on Real-life Images with Pre-trained Vision-and-Language Models
RankLLM is a Python toolkit for reproducible information retrieval research using rerankers, with a focus on listwise reranking.
(AAAI 2024) BLIVA: A Simple Multimodal LLM for Better Handling of Text-rich Visual Questions
Conversational Recommender System (CRS) paper list.
E5-V: Universal Embeddings with Multimodal Large Language Models
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Collection of Composed Image Retrieval (CIR) papers.
[ICLR 2023] This is the code repo for our ICLR'23 paper "Universal Vision-Language Dense Retrieval: Learning A Unified Representation Space for Multi-Modal Retrieval".
[ACL 2024] This is the code repo for our ACL'24 paper "MARVEL: Unlocking the Multi-Modal Capability of Dense Retrieval via Visual Module Plugin".
History-Aware Conversational Dense Retrieval. Codebase for a paper accepted to ACL 2024 Findings.