gpt4vision
Control Any Computer Using LLMs.
Convert different model APIs into the OpenAI API format out of the box.
"Improving Mathematical Reasoning with Process Supervision" by OpenAI
Visual Instruction-guided Explainable Metric. Code for "Towards Explainable Metrics for Conditional Image Synthesis Evaluation" (ACL 2024)
A tool that uses GPT-4 Vision to operate your computer.
This repository offers a Python framework for a retrieval-augmented generation (RAG) pipeline using text and images from MHTML documents, leveraging Azure AI and OpenAI services. It includes ingestion and enrichment flows, a RAG with Vision pipeline, and evaluation tools.
Capture images with HoloLens and receive descriptive responses from OpenAI's GPT-4V(ision).
Your own personal Ruskin.
Developed an IoT-based construction site inspector using a Raspberry Pi 4 that autonomously navigates and inspects construction sites. The system features two DC motors for line-following and a servo-mounted ultrasonic sensor for real-time obstacle detection.
VisionQuery GPT-4v is a cutting-edge tool that combines screenshot-based queries with OpenAI's GPT-4. It enables users to capture screens, ask questions, and receive insightful answers from GPT-4v, revolutionizing digital interaction and understanding.
A web-based user interface for GPT4All, set up to be hosted on GitHub Pages so that users can interact with the model through a browser. It uses Flask for the backend and modern HTML/CSS/JavaScript for the frontend.
An AI-powered camera on the web.