gpt4vision

Control Any Computer Using LLMs.

gpt large-language-model machine-learning macOS openai Python automation assistant assistant-computer-control gpt4 gpt4v gpt4vision Linux pyautogui pyinstaller self-driving self-driving-software Windows

Python

2454

252

7 months ago

soulteary / amazing-openai-api

Convert different model APIs into the OpenAI API format out of the box.

azure-openai azure-openai-api gemini-pro google-gemini openai openai-api gpt4v gpt4vision

160

2 years ago

kyegomez / Lets-Verify-Step-by-Step

"Improving Mathematical Reasoning with Process Supervision" by OpenAI

artificial-intelligence finetuning gpt4 gpt4-api gpt4vision llama machine-learning

Python

111

5 days ago

TIGER-AI-Lab / VIEScore

Visual Instruction-guided Explainable Metric. Code for "Towards Explainable Metrics for Conditional Image Synthesis Evaluation" (ACL 2024)

machine-vision image-editing image-generation gpt4vision visual-question-answering

Python

1 year ago

Azure-Samples / rag-as-a-service-with-vision

This repository offers a Python framework for a retrieval-augmented generation (RAG) pipeline using text and images from MHTML documents, leveraging Azure AI and OpenAI services. It includes ingestion and enrichment flows, a RAG with Vision pipeline, and evaluation tools.

azure-ai-search cosmosdb gpt-4o gpt4v gpt4vision large-language-model openai rag vision

Python

10 months ago

mostafasadeghi97 / auto-pilot-computer

This is a tool that uses GPT4 Vision to operate your computer

automation gpt4vision openai vision Rust

Rust

2 years ago

mapluisch / GPT-4-Vision-for-HoloLens

Capture images with HoloLens and receive descriptive responses from OpenAI's GPT-4V(ision).

gpt-4 gpt-4-vision-preview hololens openai openai-api Unity gpt-4-vision gpt4vision

ShaderLab

2 years ago

afonso07 / ruskin

Your own personal Ruskin.

elevenlabs FastAPI gpt gpt4 Next openai React vision gpt4vision gpt-4v

TypeScript

2 years ago

Envedity / DAIA

Digital Artificial Intelligence Agent

agi artificial-intelligence ai-agent large-language-model llm-agent machine-learning gpt4v gpt4vision

Python

2 years ago

HaimOzer123 / Automated-Construction-Site-Inspector

An IoT-based construction site inspector built on a Raspberry Pi 4 that autonomously navigates and inspects construction sites. The system uses two DC motors for line-following and a servo-mounted ultrasonic sensor for real-time obstacle detection.

automation gpt4vision Python raspberry-pi yolo

Python

1 year ago

yunwoong7 / VisionQuery-GPT-4v

VisionQuery GPT-4v is a tool that combines screenshot-based queries with OpenAI's GPT-4V. Users capture their screen, ask questions about it, and receive answers from the model.

classification face-recognition gpt4v gpt4vision image-recognition object-detection OCR openai-api Python

Jupyter Notebook

2 years ago

nectariferous / gpt4all-webui

A web-based user interface for GPT4All, set up to be hosted on GitHub Pages, that lets users interact with the model through a browser. It uses Flask for the backend and HTML/CSS/JavaScript for the frontend.

ChatGPT gpt4 gpt4-api gpt4free gpt4vision Python webui

Python

1 year ago