Site Reliability Engineer - PaaS

2 days ago

Algolia

Paris, FR · Full-time · €55,000 – €75,000

About this role

At Algolia, we’re proud to be a pioneer and market leader in AI Search, empowering 17,000+ businesses to deliver blazing-fast, predictive search and browse experiences at internet scale. Every week, we power over 30 billion search requests.

The Site Reliability Engineer position within the Platform as a Service team offers a dynamic opportunity for a professional with foundational experience in maintaining and optimizing scalable infrastructure. The role concentrates on CI/CD, observability, and application hosting.

The Platform as a Service (PaaS) team is dedicated to empowering development teams by creating toolchains, guidelines, and standards. Our focus is on enabling seamless automation and CI/CD, comprehensive observability, and unwavering reliability in a secured cloud-native environment.

As a member of the Platform as a Service team, you will play a key role in supporting the reliability and scalability of Algolia’s Search Products. You will contribute to planning for the upcoming quarter and be accountable for delivering on it, solving problems independently with minimal reliance on managers and senior team members.

Requirements

Basic to intermediate knowledge of programming languages such as Golang or Python, with an understanding of software craftsmanship. Familiarity with Ruby is a plus.
Experience in setting up and managing CI/CD pipelines and Kubernetes-based architectures.
Exposure to operating distributed systems and understanding their challenges at a basic level.
Familiarity with public cloud providers such as Microsoft Azure, AWS, or GCP.
Ability to independently identify and solve problems, demonstrating initiative and minimal reliance on senior team members.
Strong communication and organizational skills to effectively collaborate across teams.

Responsibilities

Assist in the implementation and maintenance of a scalable CI/CD toolchain, contributing to the overall efficiency and reliability of development processes.
Support the development and deployment of observability standards and solutions, providing teams with actionable insights to enhance system reliability.
Help maintain and optimize Kubernetes-based architecture and cloud services, enhancing fault tolerance and resource utilization.
Operate components or features, ensuring proper monitoring and alerting are in place, and assisting in the transition from legacy systems.
Work collaboratively with team members to identify and solve problems, reducing dependence on senior staff for guidance.
Contribute to establishing engineering processes and best practices to ensure high-quality, reliable, and scalable systems.