image2text

GLM-4.5V and GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning

Python
1481
2 days ago

TexTeller converts images to LaTeX formulas (image2latex, LaTeX OCR) with high accuracy and strong generalization, which lets it cover most usage scenarios.

Python
580
6 days ago

📋 Python wrapper to grab text from images and save it as text files using the Tesseract engine

Python
410
6 days ago

Some computer vision papers I have read, covering image-to-text generation, weakly supervised segmentation, and more

125
5 years ago

Various nodes for ComfyUI

Python
41
3 months ago

Vim commands to use Mathpix from your screen

Shell
41
1 year ago

CNN encoder and RNN decoder (Bahdanau attention) for image captioning (image-to-text) on the MS-COCO dataset.

Jupyter Notebook
35
6 years ago

Deep Extreme Cut (http://www.vision.ee.ethz.ch/~cvlsegmentation/dextr): a tool for automatic object segmentation from extreme points.

Python
25
5 years ago

A collection of scripts to "help" you with your programming exams and assignments.

Python
17
2 years ago

Civitai Stable Diffusion 337k Dataset: a dataset of AI-generated images

Python
10
7 months ago

Python tool that takes 1..n images, tries to rotate them based on their text, extracts the text, and stores the 1..n images in a PDF.

Python
6
3 years ago

[INLG2023] The High-Level (HL) dataset is a Vision and Language (V&L) resource aligning object-centric descriptions from COCO with high-level descriptions crowdsourced along 3 axes: scene, action, rationale.

6
2 years ago

[AINL 2023] IMAD: IMage Augmented multi-modal Dialogue

Python
4
2 years ago

🎞 Video editor with description generation for MTS TrueTech Hack

Jupyter Notebook
4
2 years ago