Repository navigation

#

jailbreaking

[CCS'24] A dataset consists of 15,140 ChatGPT prompts from Reddit, Discord, websites, and open-source datasets (including 1,405 jailbreak prompts).

Jupyter Notebook
3369
9 个月前

A powerful tool for automated LLM fuzzing. It is designed to help developers and security researchers identify and mitigate potential jailbreaks in their LLM APIs.

Jupyter Notebook
776
3 个月前

Open Source iOS 15 - iOS 15.6 Jailbreak Project

C
247
3 年前

Frida script to bypass the iOS application Jailbreak Detection

JavaScript
77
7 年前

Does Refusal Training in LLMs Generalize to the Past Tense? [ICLR 2025]

Python
74
8 个月前

Materials for the course Principles of AI: LLMs at UPenn (Stat 9911, Spring 2025). LLM architectures, training paradigms (pre- and post-training, alignment), test-time computation, reasoning, safety and robustness (jailbreaking, oversight, uncertainty), representations, interpretability (circuits), etc.

42
4 个月前
Python
37
3 年前

An extensive prompt to make a friendly persona from a chatbot-like model like ChatGPT

30
2 年前

Security Kit is a lightweight framework that helps to achieve a security layer

Swift
23
2 年前

SecurityKit is a lightweight, easy-to-use Swift library that helps protect iOS apps according to the OWASP MASVS standard, chapter v8, providing an advanced security and anti-tampering layer.

Swift
21
7 个月前

APT distribution repository for jailbroken iOS devices (rootless / rootful).

JavaScript
18
23 天前

During the Development of Suave7 and it's Predecessors, we've created a lot of Icons and UI-Images and we would like to share them with you. The Theme Developer Kit contains nearly 5.600 Icons, more than 380 Photoshop-Templates and 100 Pixelmator-Documents. With this Package you can customize every App from the App Store …

15
2 个月前

Customizable Dark Mode Extension for iOS 13+

Logos
15
5 年前

TurboFuzzLLM: Turbocharging Mutation-based Fuzzing for Effectively Jailbreaking Large Language Models in Practice

Python
14
2 个月前

Source code for bypass tweaks hosted under https://github.com/hekatos/repo. Licensed under 0BSD except submodules

Logos
11
4 年前

This repository contains the code for the paper "Tricking LLMs into Disobedience: Formalizing, Analyzing, and Detecting Jailbreaks" by Abhinav Rao, Sachin Vashishta*, Atharva Naik*, Somak Aditya, and Monojit Choudhury, accepted at LREC-CoLING 2024

Jupyter Notebook
8
1 年前

"ChatGPT Evil Confidant Mode" delves into a controversial and unethical use of AI, highlighting how specific prompts can generate harmful and malicious responses from ChatGPT.

6
1 年前