Posts

GPT Prompt Attack

I came upon https://gpa.43z.one today. It's a GPT-flavored capture the flag. The idea is, given a prompt containing a secret, convince the LM to leak the prompt against prior instructions it's been given. It's cool way to develop intuition for how to prompt and steer LMs. I managed to complete all...