prompt-testing

Test your prompts, agents, and RAGs. AI Red teaming, pentesting, and vulnerability scanning for LLMs. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command line and CI/CD integration.

llm prompt-engineering prompts llmops prompt-testing Testing rag evaluation evaluation-framework llm-eval llm-evaluation llm-evaluation-framework continuous-integration CI/CD pentesting red-teaming vulnerability-scanners

TypeScript

8047

658

3 hours ago

msoedov / agentic_security

Agentic LLM Vulnerability Scanner / AI red teaming kit 🧪

llm-security ai-red-team llm-evaluation llm-evaluation-framework prompt-testing agent-framework

Python

1617

251

5 days ago

babelcloud / LLM-RGB

LLM Reasoning and Generation Benchmark. Evaluate LLMs in complex scenarios systematically.

benchmark llm prompt prompt-engineering prompt-testing

TypeScript

163

3 months ago

aralyekta / prompttester

Test, compare, and optimize your AI prompts in minutes

llm-evaluation llm-tools prompt-testing

JavaScript

6 days ago

prompt-foundry / typescript-sdk

The prompt engineering, prompt management, and prompt evaluation tool for TypeScript, JavaScript, and NodeJS.

prompt-engineering prompt-management prompt-testing TypeScript llm-eval llm-evaluation open-ai gpt gpt-3 gpt-4 llm llm-ops llmops

TypeScript

1 year ago

calibrtr / llm-prompt-test

LLM Prompt Test helps you test Large Language Model (LLM) prompts to ensure they consistently meet your expectations.

large-language-models llm prompt prompt-engineering prompt-testing prompts Testing Test automation Test-driven development

TypeScript

1 year ago

yukinagae / genkitx-promptfoo

Community Plugin for Genkit to use Promptfoo

ai evaluation evaluation-framework Firebase genkit llm llm-eval llm-evaluation llm-evaluation-framework llmops plugin prompt prompt-testing prompts Testing

TypeScript

8 months ago

yukinagae / promptfoo-sample

Sample project demonstrating how to use Promptfoo, a test framework for evaluating the output of generative AI models

evaluation evaluation-framework llm llm-eval llm-evaluation llm-evaluation-framework llmops prompt-testing prompts Testing

1 year ago

jairerazodev / prompt-testing

prompt-testing

3 years ago

abdullahkhalid00 / prompt-db

A collection of prompts that I use on a day-to-day basis for work and leisure.

ChatGPT jinja2 Markdown prompt-engineering prompt-testing prompts text

1 year ago

yukinagae / genkit-promptfoo-sample

Sample implementation demonstrating how to use Firebase Genkit with Promptfoo

evaluation evaluation-framework genkit llm llm-eval llm-evaluation llm-evaluation-framework llmops prompt-testing prompts Testing

TypeScript

1 year ago

Sigmakib2 / openai-prompt-testing-playground

A dynamic and interactive playground for testing and refining prompts with OpenAI's language models. Includes customizable inputs for prompts, advanced model settings, and live response streaming for seamless experimentation.

ai ChatGPT openai playground prompt prompt-engineering prompt-testing

HTML

7 months ago

ashleysally00 / promptfoo-quickstart-guide

Quickstart guide for using PromptFoo to evaluate LLM prompts via CLI or Colab.

cli-tool colab llm openai prompt-engineering prompt-testing

18 days ago

radoslaw-sz / maia

A pytest-based framework for testing multi-AI-agent (mAIa) systems. It provides a flexible and extensible platform for creating and running complex multi-agent simulations and capturing the results.

agents ai framework llm Python Testing ai-testing-tool prompt-engineering prompt-testing

Python

13 days ago

snowz123 / team-agents

🐙 Team Agents unifies 82 AI specialists to solve challenges with intelligent chat, a requirements analyst, and document upload. A futuristic, modular platform.

agent agent-simulations agents Azure ChatGPT generative-ai langchain llm llm-evaluation llm-evaluation-framework llm-security multi-agents prompt-testing

Python

6 days ago