multimodal-pre-trained-model

Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022

document-ai eccv-2022 multimodal-pre-trained-model OCR natural-language-processing machine-vision

Python

6569

538

1 year ago

jpWang / LiLT

Official PyTorch implementation of LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding (ACL 2022)

natural-language-processing document-ai document-analysis document-understanding information-extraction multimodal-pre-trained-model

Python

356

3 years ago

marslanm / Multimodality-Representation-Learning

This repository provides a comprehensive collection of research papers on multimodal representation learning, all of which are cited and discussed in the recently accepted survey (https://dl.acm.org/doi/abs/10.1145/3617833).

cross-modal multimodal-datasets multimodal-deep-learning multimodal-pre-trained-model transformer-models vision-language-pretraining

4 months ago

Shulk97 / daniel

This repository contains the implementation of DANIEL (a fast Document Attention Network for Information Extraction and Labeling of handwritten documents).

machine-vision document-ai multimodal-pre-trained-model natural-language-processing OCR

Python

1 month ago