florence2
MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX.
A Rust library integrated with ONNXRuntime, providing a collection of Computer Vision and Vision-Language models.
Unofficial repository for building Florence-2 in Microsoft Azure.
This repository provides an AI-driven solution for removing objects from videos using text prompts. It integrates SAM2, Florence2, and ProPainter to remove objects precisely and seamlessly. Describe the objects to remove (e.g., "man, car, cap, basket"), and the pipeline handles the rest.
This project applies a modular generative AI pipeline to perform virtual staging on empty room images. It synthesizes realistic, high-quality interior furnishings while rigorously preserving the original room’s geometry, structure, and spatial consistency.
Sample: Object Detection over a Video Stream using Microsoft's Florence-2 Model
SmolVLM 🐙: Ready-to-run SmolVLM2 Docker image with web UI and HTTP API for image-to-text and text-to-text tasks; offline-capable, low GPU needs (>=4GB VRAM).