florence-2

Streamline the fine-tuning process for multimodal models: PaliGemma 2, Florence-2, and Qwen2.5-VL

captioning fine-tuning florence-2 multimodal objectdetection paligemma phi-3-vision transformers vision-and-language vqa qwen2-vl

Python

2631

217

5 days ago

jhc13 / taggui

Tag manager and captioner for image datasets

image-captioning pyside6 stable-diffusion llava cogvlm florence-2

Python

1136

4 months ago

D-Ogi / WatermarkRemover-AI

AI-Powered Watermark Remover using Florence-2 and LaMA Models: A Python application leveraging state-of-the-art deep learning models to effectively remove watermarks from images with a user-friendly PyQt6 interface.

florence-2 lama-cleaner watermark-remover dataset-creation inpainting

Python

707

121

2 months ago

autodistill / autodistill-grounded-sam-2

Use Segment Anything 2, grounded with Florence-2, to auto-label data for use in training vision models.

florence-2

Python

132

1 year ago

Ravi-Teja-konda / Surveillance_Video_Summarizer

VLM-driven tool that processes surveillance videos, extracts frames, and generates insightful annotations using a fine-tuned Florence-2 vision-language model. Includes a Gradio-based interface for querying and analyzing video footage.

artificial-intelligence ChatGPT florence-2 gpt-4 gradio gradio-python-llm huggingface summarization Video vision-and-language vlm

Python

124

4 months ago

anyantudre / Florence-2-Vision-Language-Model

Florence-2 is a novel vision foundation model with a unified, prompt-based representation for a variety of computer vision and vision-language tasks.

computer-vision deep-learning florence-2 huggingface vision-language vision-language-model vision-transformer vision-transformer-models

Jupyter Notebook

1 year ago

Damarcreative / rem-wm

Watermark remover tool that leverages the capabilities of Microsoft Florence and Lama Cleaner models.

florence-2 lama-cleaner watermark

Python

8 months ago

retkowsky / florence-2

Florence-2

Azure florence-2

Jupyter Notebook

8 months ago

autodistill / autodistill-florence-2

Use Florence 2 to auto-label data for use in training fine-tuned object detection models.

florence-2 object-detection zero-shot-object-detection

Python

1 year ago

sayedmohamedscu / Vision-language-models-VLM

Vision-language model (VLM) fine-tuning notebooks and use cases (MedGemma, PaliGemma, Florence, ...)

colab-notebook computer-vision finetuning multimodal paligemma vlm florence-2 lora Medical imaging qlora

Jupyter Notebook

15 days ago

fireicewolf / wd-llm-caption-cli

A Python-based CLI tool for captioning images with WD series, Joy-caption-pre-alpha, Meta Llama 3.2 Vision Instruct, and Qwen2 VL Instruct models.

qwen2-vl florence-2

Python

10 days ago

Iteranya / AktivaAI

Local LLM Discord Bot

artificial-intelligence discord-bot florence-2 llama multimodal roleplay chatbot

Python

3 months ago

jacobmarks / fiftyone_florence2_plugin

Run SOTA Vision-Language Model Florence-2 on your data!

computer-vision florence-2 machine-learning transformer vision-language-model

Jupyter Notebook

6 months ago

mithunparab / text2segment_video

Simple video summarization using Text-to-Segment Anything (Florence2 + SAM2). This project provides a video processing tool that uses Florence2 and SAM2 to detect and segment specific objects or activities in a video based on textual descriptions.

florence-2 optical-flow raft segment-anything

Python

7 months ago

nguyennpa412 / simple-multimodal-ai

Simple Gradio application integrated with Hugging Face multimodal models to support a visual question answering chatbot and other features

computer-vision Docker gradio text-to-speech visual-question-answering vlm large-language-model mllm florence-2

Python

1 year ago

PRITHIVSAKTHIUR / Florence-2-Image-Caption

This application utilizes the powerful Florence-2 vision-language model from Microsoft to generate comprehensive captions for images. The model is capable of understanding visual content and expressing it in natural language.

florence-2 gradio huggingface image-captioning image-processing pillow timm torch transformers vision-language-model

Python

2 months ago

Rm1n90 / Florence2Onnx

ONNX deployment for the Florence-2 visual multimodal model

florence-2 onnx onnxruntime inference

Python

8 months ago

sitammeur / TextSnap

TextSnap: Demo for Florence 2 model used in OCR tasks to extract and visualize text from images.

artificial-intelligence florence-2 gradio gradio-interface huggingface-spaces huggingface-transformers optical-character-recognition vision-language-model Python

Python

6 months ago

regiellis / ecko-cli

ecko-cli is a simple CLI tool that streamlines the process of processing images in a directory, generating captions, and saving them as text files. Additionally, it provides functionalities to create a JSONL file from images in the directory you specify. Images will be captioned using the Microsoft Florence-2-large model and ONNX

artificial-intelligence command-line-interface florence-2 generative-ai huggingface-transformers image-classification image-processing onnxruntime

Python

2 months ago

jkawamoto / mcp-florence2

An MCP server for processing images using Florence-2

florence-2 mcp-server Python

Python

16 days ago