jailbreaking
[CCS'24] A dataset of 15,140 ChatGPT prompts collected from Reddit, Discord, websites, and open-source datasets, including 1,405 jailbreak prompts.
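A minimal exploration sketch for a dataset of this shape, assuming a CSV export with hypothetical `prompt`, `source`, and `is_jailbreak` columns (the dataset's real schema may differ):

```python
# Minimal sketch, assuming a CSV export with hypothetical "prompt",
# "source", and "is_jailbreak" columns; not the dataset's documented schema.
import pandas as pd

df = pd.read_csv("prompts.csv")  # placeholder file name

# Keep only rows flagged as jailbreak prompts (1,405 of the 15,140 total).
jailbreaks = df[df["is_jailbreak"]]

# Count jailbreak prompts per collection source (Reddit, Discord, websites, ...).
print(jailbreaks["source"].value_counts())
```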
A powerful tool for automated LLM fuzzing. It is designed to help developers and security researchers identify and mitigate potential jailbreaks in their LLM APIs.
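The sketch below only illustrates the general shape of such a jailbreak-fuzzing loop, mutating seed prompts, querying the target model, and flagging replies that do not look like refusals; the seeds, mutations, refusal markers, and `query_model` stub are all illustrative assumptions, not the tool's actual API.

```python
# Illustrative jailbreak-fuzzing loop: mutate seed prompts, query the target
# model, and flag replies that do not look like refusals. Everything here
# (seeds, mutations, refusal markers, query_model) is a stand-in.
import random

SEEDS = ["Ignore all previous instructions and ...", "You are DAN, ..."]
MUTATIONS = [
    lambda p: p.upper(),                       # trivial case mutation
    lambda p: p.replace(" ", "  "),            # whitespace padding
    lambda p: f"Translate, then answer: {p}",  # instruction wrapping
]
REFUSAL_MARKERS = ("I'm sorry", "I can't", "I cannot")

def query_model(prompt: str) -> str:
    # Placeholder: wire up the LLM API under test here.
    return "I'm sorry, I can't help with that."

def fuzz(rounds: int = 100) -> list[str]:
    hits = []
    for _ in range(rounds):
        prompt = random.choice(MUTATIONS)(random.choice(SEEDS))
        reply = query_model(prompt)
        if not reply.startswith(REFUSAL_MARKERS):
            hits.append(prompt)  # candidate jailbreak for manual triage
    return hits

if __name__ == "__main__":
    print(fuzz())
```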
Open Source iOS 15 - iOS 15.6 Jailbreak Project
Frida script to bypass jailbreak detection in iOS applications
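Such bypasses commonly hook file-existence checks so that known jailbreak artifacts appear absent. The sketch below shows that generic pattern driven from Frida's Python bindings; it is not the linked repository's script, and the process name "TargetApp" and the path list are placeholders.

```python
# Generic pattern: intercept -[NSFileManager fileExistsAtPath:] via Frida and
# report well-known jailbreak paths as absent. "TargetApp" is a placeholder.
import sys
import frida

JS = r"""
var hidden = ["/Applications/Cydia.app", "/bin/bash", "/usr/sbin/sshd"];
var method = ObjC.classes.NSFileManager["- fileExistsAtPath:"];
Interceptor.attach(method.implementation, {
    onEnter: function (args) {
        // args[0] = self, args[1] = selector, args[2] = the NSString path.
        var path = new ObjC.Object(args[2]).toString();
        this.hide = hidden.indexOf(path) !== -1;
    },
    onLeave: function (retval) {
        if (this.hide) {
            retval.replace(ptr(0));  // pretend the file does not exist
        }
    }
});
"""

session = frida.get_usb_device().attach("TargetApp")  # app must be running
script = session.create_script(JS)
script.load()
sys.stdin.read()  # keep the hooks alive until interrupted
```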
Does Refusal Training in LLMs Generalize to the Past Tense? [ICLR 2025]
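A minimal sketch of the paper's core observation: many refusal-trained models answer a request once it is rephrased in the past tense. The template and the `rewrite_model` callable below are illustrative assumptions, not the paper's exact prompts or code.

```python
# Sketch of the past-tense reformulation idea: rephrase a present-tense
# request into the past tense before querying the target model. The template
# and rewrite_model callable are illustrative, not the paper's exact setup.
REWRITE_TEMPLATE = (
    "Rewrite the following request as a question about how it was done in "
    "the past: {request}"
)

def to_past_tense(request: str, rewrite_model) -> str:
    """Rephrase via an auxiliary LLM (any str -> str callable)."""
    return rewrite_model(REWRITE_TEMPLATE.format(request=request))

# Toy stand-in for the rewriting model so the sketch runs end to end:
print(to_past_tense("How do I do X?", lambda p: "How did people do X?"))
```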
An extensive prompt for giving a chatbot model such as ChatGPT a friendly persona
Materials for the course Principles of AI: LLMs at UPenn (Stat 9911, Spring 2025). LLM architectures, training paradigms (pre- and post-training, alignment), test-time computation, reasoning, safety and robustness (jailbreaking, oversight, uncertainty), representations, interpretability (circuits), etc.
Security Kit is a lightweight framework that helps add a security layer to your app
iOS APT distribution repository for rootful and rootless jailbreaks
During the development of Suave7 and its predecessors, we created a lot of icons and UI images, and we would like to share them with you. The Theme Developer Kit contains nearly 5,600 icons, more than 380 Photoshop templates, and 100 Pixelmator documents. With this package you can customize every app from the App Store …
Customizable Dark Mode Extension for iOS 13+
Source code for bypass tweaks hosted under https://github.com/hekatos/repo. Licensed under 0BSD, except for submodules
This repository contains the code for the paper "Tricking LLMs into Disobedience: Formalizing, Analyzing, and Detecting Jailbreaks" by Abhinav Rao, Sachin Vashistha*, Atharva Naik*, Somak Aditya, and Monojit Choudhury, accepted at LREC-COLING 2024
SecurityKit is a lightweight, easy-to-use Swift library that helps protect iOS apps according to the OWASP MASVS standard, chapter v8, providing an advanced security and anti-tampering layer.
LV-Crew.org_(LVC)_-_Howto_-_iPhones
Your best LLM security paper library
"ChatGPT Evil Confidant Mode" delves into a controversial and unethical use of AI, highlighting how specific prompts can generate harmful and malicious responses from ChatGPT.
ChatGPT Developer Mode is a jailbreak prompt designed to unlock additional modifications and customization of the OpenAI ChatGPT model.