Adversarial Prompt Expert
Reinforce Labs, Inc. • Remote
Posted: February 9, 2026
Job Description
We are looking for a creative "breaker" to join our team as an Adversarial Prompt Expert. In this role, you won’t just be using LLMs—you’ll be stress-testing their boundaries, bypassing their safeguards, and helping us build safer, more robust intelligence.
This is an asynchronous, remote position designed for self-starters who thrive in the gray areas between code, linguistics, and security.
Work Details
Red Teaming: Design and execute complex jailbreak attempts to identify vulnerabilities in state-of-the-art models.
Bias Discovery: Use your background in linguistics or social sciences to surface "hidden" biases or harms that standard automated filters miss.
Model Evaluation: Systematically rank LLM outputs to determine where safety guardrails are failing or succeeding.
Knowledge Loop: Document your "attack vectors" clearly to help our engineering teams patch vulnerabilities.
Who You Are
Heavy LLM usage: hands-on experience with multiple models (open- and closed-source), and comfort experimenting across systems and platforms.
You have a "hacker mindset." You enjoy the puzzle of finding edge cases and can think of ten different ways to ask a forbidden question.
You can turn a chaotic afternoon of prompt-hacking into a clean, actionable report.
You understand the weight of this work. You can handle sensitive or "dark" content professionally and stay within ethical boundaries.
Qualifications & Skill Requirements
Proven ability to navigate complex model restrictions using creative evasion techniques.
Background in offensive security or red teaming is a major plus.
You don't give up when a model says "I cannot fulfill this request." You find a new angle.