pixtral

MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX.

llava LLM MLX vision-transformer apple-silicon idefics local-ai paligemma vision-framework vision-language-model florence2 molmo pixtral

Python

1673

180

1 day ago

MaxiDonkey / DelphiMistralAI

The DelphiMistralAI wrapper brings Mistral’s text, vision, and audio models and agentic Conversations to Delphi, with chat, embeddings, Codestral code generation, fine-tuning, batching, moderation, async/await helpers, and live request monitoring.

API chatbot ChatGPT delphi gpt mistral mistral-7b mistral-api mistralai mixtral-8x7b fine-tuning finetuning agents vision pixtral

Pascal

16 days ago

2U1 / Pixtral-Finetune

An open-source implementation for fine-tuning Pixtral by MistralAI.

chatbot mistral multimodal pixtral vision-language-model

Python

8 months ago

2dameneko / ide-cap-chan

ide-cap-chan is a utility for batch image captioning in natural language using various vision-language (VL) models.

AI batch captioning LLM machine-learning Python vlm gpu llava mistral molmo Nvidia pixtral qwen

Python

5 months ago

Pavansomisetty21 / Visual-Question-Answering-Pixtral_Vision_Finetuning_Unsloth

Fine-tunes the Pixtral-12B-2409 model with Unsloth for visual question answering (NLP task).

fine-tuning large-language-models LLM NLP pixtral unsloth vision-language-model visual-question-answering vlm

Jupyter Notebook

9 months ago

eniompw / Mistral

Examples of using the Mistral API via Python REST calls and curl.

mistral mistral-api mistralai pixtral vision

Python

1 year ago

Ojasva-Goyal / Pixtral-Fine-Tuning

LoRA-based fine-tuning pipeline for Pixtral-12B on multimodal instruction tasks using Hugging Face and vision-language templates.

huggingface instruction-tuning llava lora mistral multimodal pixtral

Python

4 months ago

munas-git / LookOutAI

LookOutAI is a tool for spotting a target person across images or video clips using facial recognition. It leverages NLP and LLMs to describe appearances and includes features to censor or highlight the target’s face.

LLM NLP pixtral text-mining

Python

10 months ago

Madhur-1 / RevealVLLMSafetyEval

RevealVLLMSafetyEval is a comprehensive pipeline for evaluating Vision-Language Models (VLMs) on their compliance with harm-related policies. It automates the creation of adversarial multi-turn datasets and the evaluation of model responses, supporting responsible AI development and red-teaming efforts.

gpt-4o llama3 llava phi3 pixtral qwen2 qwen2-vl red-teaming responsible-ai vllm

Python

5 months ago