#layoutlm

Python
21093
2 months ago

This repository contains demos I made with the Transformers library by HuggingFace.

Jupyter Notebook
10742
3 months ago

The MERIT Dataset is a fully synthetic, labeled dataset created for training and benchmarking LLMs on Visually Rich Document Understanding tasks. It is also designed to help detect biases and improve interpretability in LLMs, an area we are actively working on. This repository is actively maintained, and new features are continuously being added.

Python
9
2 months ago

Create metadata from a resume

Python
4
3 years ago

Exploring LayoutLM for Smart OCR Capabilities

3
4 years ago

All-in-one package for document (image, PDF) classification. Unified interface for Google OCR and Tesseract. Train, evaluate, and run inference using fastText, small language models (NER), small vision-language models (LayoutLM), and LLMs.

Python
3
4 months ago

Prototypical Networks for Information Extraction in Visual Documents. Weights can be found at https://drive.google.com/file/d/1Zrp7QaZIf0H_FFRx_LhB0uZTqDUSis2H/view?usp=sharing.

Jupyter Notebook
0
2 years ago

Fine-tuning LayoutLMv3 on the SROIE dataset to build a receipt understanding model

Jupyter Notebook
0
6 months ago