cross-modal

🪩 Create Disco Diffusion artworks in one line

creative-ai disco-diffusion cross-modal dalle generative-art multimodal diffusion prompts midjourney imgen clip-guided-diffusion latent-diffusion stable-diffusion

Python

3841

248

2 years ago

docarray / docarray

Represent, send, store and search multimodal data

docarray data-structures multimodal cross-modal neural-search deep-learning nested-data qdrant weaviate nearest-neighbor-search protobuf elasticsearch multi-modal semantic-search machine-learning PyTorch FastAPI pydantic

Python

3043

233

2 days ago

shaoxiongji / knowledge-graphs

A collection of research on knowledge graphs

knowledge-graph representation-learning relation-extraction reasoning natural-language-processing ner commonsense cross-modal question-answering dialogue-systems information-retrieval survey Bukkit

JavaScript

1732

295

3 years ago

krantiparida / awesome-audio-visual

A curated list of different papers and datasets in various areas of audio-visual processing

Awesome Lists cross-modal Localization (l10n) source-separation

715

1 year ago

kuanghuei / SCAN

PyTorch source code for "Stacked Cross Attention for Image-Text Matching" (ECCV 2018)

cross-modal image-captioning neural-networks deep-learning PyTorch computer-vision

Python

560

115

2 years ago

towhee-io / examples

Analyze unstructured data with Towhee: reverse image search, reverse video search, audio classification, question-answering systems, molecular search, and more.

audio-classification cross-modal embeddings image-classification machine-learning natural-language-processing

Jupyter Notebook

486

118

1 year ago

haihuangcode / CMG

The official implementation of Achieving Cross Modal Generalization with Multimodal Unified Representation (NeurIPS '23)

cross-modal multimodal pretrained-models

Python

217

4 months ago

yisun98 / SOLC

Remote Sensing SAR-Optical Land-use Classification in PyTorch: high-resolution remote sensing semantic segmentation / land-cover segmentation / land-cover classification

PyTorch remote-sensing segmentation deeplabv3 cross-modal multi-modal

Python

214

1 year ago

JizhiziLi / RIM

[CVPR 2023] Referring Image Matting

cross-modal image-matting image-segmentation multimodal matting

207

2 years ago

DRSY / MoTIS

[NAACL 2022] Mobile Text-to-Image search powered by multimodal semantic representation models (e.g., OpenAI's CLIP)

iOS artificial-intelligence image-search clip vector-search knn lsh semantic-search knowledge-distillation retrieval cross-modal naacl

Swift

124

2 years ago

QizhiPei / BioT5

BioT5 (EMNLP 2023) and BioT5+ (ACL 2024 Findings)

Bioinformatics computational-biology cross-modal machine-learning natural-language-processing nlp-applications

Python

111

7 months ago

Zengyi-Qin / Weakly-Supervised-3D-Object-Detection

Weakly Supervised 3D Object Detection from Point Clouds (VS3D), ACM MM 2020

3d-object-detection kitti Point cloud cross-modal transfer-learning Tensorflow lidar monocular stereo unsupervised-learning

Jupyter Notebook

106

2 years ago

qcraftai / distill-bev

DistillBEV: Boosting Multi-Camera 3D Object Detection with Cross-Modal Knowledge Distillation (ICCV 2023)

3d-object-detection knowledge-distillation lidar nuscenes Point cloud self-driving autonomous-driving distillation cross-modal multi-modal

Python

1 year ago

yangli18 / VLTVG

Improving Visual Grounding with Visual-Linguistic Verification and Iterative Reasoning, CVPR 2022

vision-language cross-modal

Python

2 years ago

rohitrango / objects-that-sound

Unofficial implementation of Google DeepMind's paper `Objects that Sound`

machine-learning deep-learning embeddings deep-neural-networks deepmind cross-modal

Python

7 years ago

kywen1119 / DSRAN

Code for journal paper "Learning Dual Semantic Relations with Graph Attention for Image-Text Matching", TCSVT, 2020.

PyTorch image-text-matching cross-modal computer-vision

Python

2 years ago

marslanm / Multimodality-Representation-Learning

This repository provides a comprehensive collection of research papers on multimodal representation learning, all of which are cited and discussed in the recently accepted survey: https://dl.acm.org/doi/abs/10.1145/3617833

cross-modal multimodal-datasets multimodal-deep-learning multimodal-pre-trained-model transformer-models vision-language-pretraining

2 years ago

Paranioar / UniPT

[CVPR2024] The code of "UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory"

cross-modal parameter-efficient-learning parameter-efficient-tuning transfer-learning parameter-efficient-fine-tuning

Python

6 months ago

Eaphan / UPIDet

Unleash the Potential of Image Branch for Cross-modal 3D Object Detection [NeurIPS2023]

3d-object-detection cross-modal multi-modal

Python

1 year ago

GT-RIPL / Xmodal-Ctx

Official PyTorch implementation of our CVPR 2022 paper: Beyond a Pre-Trained Object Detector: Cross-Modal Textual and Visual Context for Image Captioning

clip cross-modal image-captioning vision-and-language

Python

2 years ago