hallucinations
Leaderboard Comparing LLM Performance at Producing Hallucinations when Summarizing Short Documents
List of papers on hallucination detection in LLMs.
✨✨Woodpecker: Hallucination Correction for Multimodal Large Language Models
Official implementation for the paper "DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models" (a minimal sketch of the idea follows this list)
A curated list of trustworthy deep learning papers. Updated daily.
Protect your AI agents and GenAI apps with confidence
Hallucinations (Confabulations) Document-Based Benchmark for RAG. Includes human-verified questions and answers.
[ACL 2024] User-friendly evaluation framework: Eval Suite and benchmarks such as UHGEval, HaluEval, and HalluQA.
Attack method that induces hallucinations in LLMs.
Code for ACL 2024 paper "TruthX: Alleviating Hallucinations by Editing Large Language Models in Truthful Space"
Framework for testing vulnerabilities of large language models (LLMs).
Dataset and evaluation script for "Evaluating Hallucinations in Chinese Large Language Models"
Code for the EMNLP 2024 paper "Detecting and Mitigating Contextual Hallucinations in Large Language Models Using Only Attention Maps"
Initiative to evaluate and rank the most popular LLMs across common task types based on their propensity to hallucinate.
mPLUG-HalOwl: Multimodal Hallucination Evaluation and Mitigation
[ICML 2024] Official implementation for "HALC: Object Hallucination Reduction via Adaptive Focal-Contrast Decoding"
An easy-to-use hallucination detection framework for LLMs.
Repository for the paper "Cognitive Mirage: A Review of Hallucinations in Large Language Models"
Official repo for SAC3: Reliable Hallucination Detection in Black-Box Language Models via Semantic-aware Cross-check Consistency
[EMNLP 2023] Beyond Factuality: A Comprehensive Evaluation of Large Language Models as Knowledge Generators
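The DoLa entry above contrasts the final layer's next-token distribution with that of an earlier ("premature") layer, favoring tokens whose probability grows as information flows through the network. Below is a minimal single-step sketch of that idea, assuming a GPT-2 checkpoint for brevity; the fixed premature-layer index and the `alpha` threshold are illustrative assumptions, not the paper's configuration (DoLa selects the premature layer dynamically per step).

```python
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Small model chosen so the sketch runs anywhere; the paper uses larger LLMs.
tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

ids = tok("The capital of France is", return_tensors="pt").input_ids

with torch.no_grad():
    out = model(ids, output_hidden_states=True)

# Final-layer ("mature") next-token log-probabilities.
final_logp = torch.log_softmax(out.logits[:, -1], dim=-1)

# "Premature" layer: project an intermediate hidden state through the final
# layer norm and the shared unembedding head (an early exit). Layer 4 is an
# arbitrary choice for this sketch.
early_h = model.transformer.ln_f(out.hidden_states[4][:, -1])
early_logp = torch.log_softmax(model.lm_head(early_h), dim=-1)

# Adaptive plausibility constraint: keep only tokens whose final-layer
# probability is within a factor alpha of the most likely token.
alpha = 0.1
cutoff = final_logp.max(dim=-1, keepdim=True).values + math.log(alpha)

# Contrast the two views: score what the later layers add.
contrast = final_logp - early_logp
contrast[final_logp < cutoff] = -float("inf")

next_token = contrast.argmax(dim=-1)
print(tok.decode(next_token))
```

Recent Hugging Face transformers releases also ship DoLa as a built-in decoding option (a `dola_layers` argument to `generate()`), which avoids hand-rolling the contrast step.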