image-to-text

thiagoalessio / tesseract-ocr-for-php

A wrapper to work with Tesseract OCR inside PHP.

OCR tesseract PHP text-recognition image-to-text

PHP

2994

552

5 months ago

lucidrains / CoCa-pytorch

Implementation of CoCa, Contrastive Captioners are Image-Text Foundation Models, in Pytorch

artificial-intelligence attention-mechanism contrastive-learning deep-learning multimodal transformers image-to-text

Python

1165

2 years ago

killkimno / MORT

MORT translator project - Real-time game translator with OCR

OCR auto-translation translation translate game game-translation tesseract-ocr image-to-text

1161

1 month ago

PaddlePaddle / PaddleMIX

Paddle Multimodal Integration and eXploration, supporting mainstream multimodal tasks, including end-to-end large-scale multimodal pretrained models and a diffusion model toolbox. Equipped with high performance and flexibility.

aigc stable-diffusion clip image-to-text text-to-image controlnet multimodal text-to-video dit llava sora qwen2-vl minicpm-v

Python

687

218

13 days ago

Flame-Code-VLM / Flame-Code-VLM

Flame is an open-source multimodal AI system designed to translate UI design mockups into high-quality React code. It leverages vision-language modeling, automated data synthesis, and structured training workflows to bridge the gap between design and front-end development.

code-generation frontend-development vision-language-model artificial-intelligence deep-learning frontend multimodal Open Source React vlm deepseek design-to-code front-end image-to-text large-language-model Vue.js

Python

538

5 months ago

zapolnoch / node-tesseract-ocr

A Node.js wrapper for the Tesseract OCR API

tesseract OCR text-recognition image-to-text

JavaScript

312

2 years ago

google / imageinwords

Data release for the ImageInWords (IIW) paper.

evaluation image-captioning image-to-text dataset dataset-generation

JavaScript

216

9 months ago

Yushi-Hu / tifa

TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering

image-to-text large-language-models text-to-image visual-question-answering

Python

171

1 year ago

NormXU / nougat-latex-ocr

Codebase for fine-tuning / evaluating nougat-based image2latex generation models

image-to-text

Python

156

1 year ago

yardstick17 / image_text_reader

This module extracts text from images using the Tesseract OCR engine. Text in images is often blurry or unevenly sized, so the image is pre-processed before recognition: the module first finds bounding boxes for the text regions and then normalizes them to 300 dpi, a resolution suitable for the OCR engine.

OCR image-to-text tesseract-ocr

Python

147

6 years ago

shoryasethia / markdrop

A Python package for converting PDFs to Markdown while extracting images and tables, and for generating text descriptions of the extracted tables and images using several LLM clients, among other functionality. Markdrop is available on PyPI.

Open Source pypi-package image-to-text large-language-model pdf-to-markdown pdf-to-text table-to-text agents

Python

142

1 month ago

nateshmbhat / card-scanner-flutter

A Flutter package for fast, accurate, and secure credit card and debit card scanning

Flutter Dart machine-learning artificial-intelligence credit-card image-processing image-to-text

Swift

126

114

7 months ago

NanoNets / ocr-python

OCR library to extract text & tables from PDF files and images. Convert any image or PDF to CSV / TXT / JSON / Searchable PDF.

OCR tesseract pdf Python pdf-to-json pdf-to-text image-to-text

Jupyter Notebook

112

3 years ago

mshdabiola / NotePad

Notepad is a multi-module Jetpack Compose note-taking app with a sketch pad, voice recorder, and image capture

Android Actions Jetpack Compose Kotlin image-to-text room-persistence-library

Kotlin

112

8 days ago

MIMICLab / L-Verse

L-Verse: Bidirectional Generation Between Image and Text

deep-learning PyTorch pytorch-lightning vq-vae transformer image-to-text text-to-image image-captioning

Python

108

5 months ago

BEPb / image_to_ascii

Everything is very simple: you either download a picture file or specify its link when running a Python script, and as output you get a text file. You can also immediately preview the result of the conversion on the command line.

cmd image-to-text conversion convert converter Python Script

Python

105

2 years ago