First tokens: The Achilles’ heel of LLMs
The assistant prefill feature offered by many LLM APIs lets the caller supply the first tokens of the model's response, which the model then continues. Convenient as it is, this feature can leave models vulnerable to bypasses of their safety alignment (commonly known as jailbreaking). This article builds on prior research to investigate the practical aspects of prefill security.
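For readers unfamiliar with the mechanism, here is a minimal sketch of what prefill looks like in practice. It uses the Anthropic Python SDK purely as an illustration (the model name and prompt are placeholders; other providers expose equivalent mechanisms):

```python
import anthropic

client = anthropic.Anthropic()  # assumes ANTHROPIC_API_KEY is set in the environment

response = client.messages.create(
    model="claude-3-5-sonnet-20241022",  # illustrative model name
    max_tokens=256,
    messages=[
        {"role": "user", "content": "Explain what assistant prefill is."},
        # The trailing assistant message is the prefill: the model does not
        # generate these tokens, it continues the response from them.
        {"role": "assistant", "content": "Assistant prefill is"},
    ],
)

# The returned text picks up exactly where the prefill left off.
print(response.content[0].text)
```

Because the model treats the prefilled tokens as words it has already said, it tends to stay consistent with them, and that is precisely the property an attacker can exploit.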