Job ID: r20468
What You'll Do:
At Criteo, the Platform Core group builds the foundational infrastructure powering our global advertising platform. We design and operate large-scale, resilient systems supporting real-time decision-making and data processing across thousands of services.
As we expand our distributed computing and ML infrastructure capabilities, we are building a new team focused on GPU platforms and high-performance model serving technologies.
As a Site Reliability Engineer in the GPU team, you will help design, operate, and scale the infrastructure powering machine learning training and inference workloads.
You will work on technologies such as:
Ray on Kubernetes
Build and operate scalable Ray clusters running on Kubernetes.
Develop reliable self-service distributed computing platforms for ML workloads.
Improve provisioning, observability, reliability, and operational efficiency of Ray-as-a-service environments.
NVIDIA Triton Inference Server
Operate and optimize large-scale inference platforms using Triton.
Improve latency, throughput, scalability, and GPU utilization for deep learning inference workloads.
You will collaborate closely with ML engineers, data scientists, and infrastructure teams to deliver reliable, production-grade ML platforms that accelerate innovation across Criteo.
Who You Are:
5+ years of experience in backend engineering, Site Reliability Engineering, or platform engineering roles focused on distributed systems.
Strong experience with Kubernetes, including workload scheduling, dynamic provisioning, and custom controllers/operators.
Hands-on experience running or optimizing GPU-based workloads in production, ideally for ML training or inference systems.
Strong software engineering skills in C#, Python, Go, or similar languages, with a focus on building reliable distributed systems.
Experience building or operating production-grade infrastructure with strong requirements around performance, scalability, and reliability.
Strong interest in automation, observability, and designing systems that scale efficiently under high load.
Bonus Points
Experience with distributed ML frameworks such as Ray or similar systems.
Familiarity with inference serving stacks such as NVIDIA Triton or TensorRT.
Experience with GPU scheduling, resource management, or multi-tenant GPU platforms.
Exposure to cloud-native GPU orchestration (GKE, EKS, or on-prem Kubernetes GPU clusters).
We acknowledge that many candidates may not meet every single role requirement listed above. If your experience looks a little different from our requirements but you believe that you can still bring value to the role, we’d love to see your application!
Who We Are:
We’re Criteo, the Commerce Intelligence Platform. Criteo helps businesses turn shopper signals into commerce outcomes while delivering more relevant experiences for shoppers. We use proprietary commerce intelligence and AI decisioning to drive relevance for shoppers and performance for businesses.
At Criteo, our culture is as unique as it is diverse. From our offices across the globe or from the comfort of home, our 3,600 Criteos collaborate to build an open, impactful, and forward-thinking environment.
We foster a workplace where everyone is valued, and employment decisions are based solely on skills, qualifications, and business needs—never on non-job-related factors or legally protected characteristics.
What We Offer:
🏢 Ways of working – Our hybrid model blends working from home with in-office experiences, making space for both.
📈 Grow with us – Learning, mentorship & career development programs.
💪 Your wellbeing matters – Health benefits, wellness perks & mental health support.
🤝 A team that cares – Diverse, inclusive, and globally connected.
💸 Fair pay & perks – Attractive salary, with performance-based rewards and family-friendly policies, plus the potential for equity depending on role and level.
Additional benefits may vary depending on the country where you work and the nature of your employment with Criteo.