Staff AI & Data Engineer | ESProfiler Careers

About the Role

You will operate at the intersection of Data Engineering, Data Science, and modern AI/ML systems, taking ownership of initiatives that directly shape product and business outcomes. We need someone with genuine breadth, as comfortable designing scalable data pipelines as building agentic AI architectures, and with the curiosity to keep pace with a space that is moving faster than almost any other in software engineering.

You will bring deep technical expertise across the full AI/ML stack alongside the leadership qualities to mentor colleagues, challenge assumptions, and drive a culture of engineering excellence. Crucially, you will not just set direction: you will get your hands dirty and build the systems too.

What You Will Be Doing

This role spans two interconnected disciplines. We are looking for strong coverage across both.

Data Engineering & Data Science

Design, build, and maintain robust data pipelines and ingestion frameworks that feed our AI/ML systems with clean, reliable data.
Own data quality initiatives end-to-end: crawlers to audit acquired sites (broken links, 404s, redirect chains), validation frameworks, and alerting.
Identify the right tool for the job. Not everything is solved by an LLM: sometimes traditional ML is the right tool, and that's okay, since it will be faster and cheaper to run in the longer term.
Evolve our data architecture: champion the appropriate pattern for the job, whether that's a graph database, a vector store, or more standard SQL/NoSQL structures, whilst ensuring scalability and long-term maintainability.
Establish and promote good data modelling practices across the organisation — schema design, query optimisation, and a sensible approach to data governance.
Work across a range of storage paradigms: SQL (PostgreSQL, MySQL), vector databases (pgvector and equivalents), and graph databases (Neo4j or similar).

AI/ML Engineering & Science

Design and ship agentic AI pipelines and multi-agent reasoning systems that solve real business problems — content review, classification, enrichment, and beyond.
Lead the evaluation and adoption of emerging AI/ML tooling: Vertex AI, Google ADK, AWS SageMaker, Azure ML, and next-generation LLM frameworks.
Establish LLMOps practices: formal evaluation pipelines, regression testing, and quality baselines so we always know whether our AI systems are improving or declining.
Identify where Large Language Models can be replaced by leaner, more cost-effective traditional ML models — and deliver those replacements.
Build NLP-powered systems, including classifiers, semantic search, and potentially fine-tuned or custom-trained models where the use case justifies it.
Drive the auto-generation of marketing content and other AI-powered product features, working closely with Product to turn ideas into production systems.
Bring your own ideas to the table. If you see an opportunity we have not spotted, we want to hear it — and we will give you the space and support to explore it.

Leadership & Cross-Cutting Responsibilities

Mentor and collaborate with engineers across the team, raising the collective bar for AI/ML quality, reproducibility, and best practice.
Run tech-sharing sessions; keep the team current on fast-moving developments in the AI/data space.
Contribute to hiring: interview, assess, and help build the team you want to work in.

Skills & Experience

We do not expect any candidate to tick every box. We are looking for breadth and intellectual curiosity alongside genuine depth in most of these areas. If you are excited by the challenge, please apply.

Requirements (Must-Have)

7+ years of professional experience across Data Engineering, Data Science, or Machine Learning roles — with meaningful exposure to both the data and AI/ML sides of that spectrum.
Hands-on experience designing and shipping agentic AI systems and multi-agent architectures (LangChain, LangGraph, AutoGen, Google ADK, or similar frameworks).
Strong working knowledge of Large Language Models: prompt engineering, evaluation, and responsible deployment in production.
Experience with Cloud ML platforms — at least one of Vertex AI (GCP), SageMaker (AWS), or Azure Machine Learning.
Expertise in Python for data processing, model training, and API development.
Solid understanding of classical ML and NLP: ability to identify when a simpler model outperforms an LLM in production and to deliver that alternative.
Relational databases: PostgreSQL, MySQL, or equivalent — schema design, query optimisation, and data modelling.
Vector databases: practical production experience with pgvector, Pinecone, Weaviate, or similar for semantic search and RAG pipelines.
Graph databases: Neo4j or equivalent; experience modelling domain knowledge as a graph.
Demonstrated technical leadership — not necessarily formal line management, but clear ownership of complex technical workstreams and influence over engineering decisions.
Strong communication skills; comfortable translating technical concepts for non-technical stakeholders.

Nice-to-Have (Bonus Points)

GraphQL API design and implementation.
MLOps / LLMOps tooling: MLflow, Weights & Biases, Evidently AI, or similar for experiment tracking and model monitoring.
Experience training or fine-tuning your own models (transformer-based or otherwise) from scratch or from pre-trained checkpoints.
NLP specialism: named entity recognition, text classification, semantic similarity, topic modelling, or conversational AI.
Data orchestration tools: Airflow, Prefect, Dagster.
Experience working with Knowledge Graphs or ontologies in a production environment.
Published work, open-source contributions, or a track record of writing or speaking about AI/ML topics.