Prompt Engineer (LLM Systems, Evals & Safety)

webook.com, Jordan


No Relocation

Posted: January 12, 2026

Job Description

Do you want to love what you do at work? Do you want to make a difference, have an impact, and transform people's lives? Do you want to work with a team that believes in disrupting the normal, boring, and average?

If yes, then this is the job you are looking for. webook.com is Saudi Arabia's #1 event ticketing and experience booking platform in terms of technology, features, agility, and revenue, serving some of the largest mega events in the Kingdom and surpassing 2 billion in sales.

Role Overview

Design high-quality prompts, system instructions, and tooling that make our LLM features accurate, safe, and cost-effective. You’ll own evaluation, prompt versioning, and continuous improvement.

Key Responsibilities

  • Author, refactor, and chain prompts (system/tool/policy) for varied tasks.
  • Create offline/online evaluation harnesses (rubrics, golden sets, metrics).
  • Build prompt libraries with versioning, A/B testing, and telemetry.
  • Reduce hallucinations via verification, constrained decoding, and tool use.
  • Implement safety: jailbreak/prompt-injection tests, content policy checks, PII handling.
  • Partner with engineers to integrate prompts into production features.

Requirements

  • Demonstrated prompt design across multiple task types and models.
  • Experience building eval datasets and automated scoring (e.g., accuracy, faithfulness, utility, cost/latency).
  • Familiarity with retrieval-augmented generation concepts and tool/function calling.
  • Strong scripting (Python/TypeScript) for data prep, evals, and analysis.
  • Clear writing; ability to translate business goals into measurable prompt specs.

Nice-to-Haves

  • Experience with LangChain/LLM orchestration, vector stores, and rerankers.
  • Knowledge of safety tooling and red-teaming techniques.
  • Experience with experimentation platforms (feature flags, A/B tests) and analytics.
