
Staff Engineer, Site Reliability

LearnUpon · Dublin 7, Ireland · 2 months ago
Engineering
Dublin

Description

Work Mode: Flex 1+ days per week in our Dublin office

Department: Engineering

 

About the Company 

LearnUpon partners with over 1,600 organisations globally to unlock the potential of employees, customers & members through learning that’s easy, scalable and focused on results.

Read more about life at LearnUpon here

 

About the Team

Our Engineering organisation is dedicated to building robust, scalable infrastructure that handles world-scale platform demands. As part of the Site Reliability Engineering (SRE) team, we focus on system architecture, absolute performance, and technical innovation. Operating with high ownership and technical expertise, we are responsible for the scale-out of the LearnUpon infrastructure, championing internal self-service tooling, and embedding a culture of observability and shared operational responsibility across all engineering squads.

 

About the Opportunity

As a Staff Site Reliability Engineer, you will be a principal technical leader and a key catalyst for our infrastructure's evolution. In this role, you will take ownership of our core platform resilience, driving the strategy to build out an advanced, cost-effective observability function spanning metrics, logs, and transaction tracking. This opportunity requires a strategic thinker who can design cross-team SLO/SLI frameworks, navigate complex distributed system requirements, and mentor talent to ensure LearnUpon scales efficiently to support our ambitious global goals.

In addition, you’ll be responsible for:

  • Infrastructure Optimisation: Identify opportunities to improve and scale our infrastructure for performance, observability, maintainability, and cost by creating innovative solutions.
  • Observability Function Strategy: Lead our efforts to build an observability function that incorporates application metrics, application transaction tracking, and event log management.
  • Resilience & Scaling: Drive the processes to maintain resilient, scalable, and cost-effective infrastructure while working with other Engineering teams to provide solutions that meet their ongoing requirements.
  • Tooling & Self-Service: Build tools focused on measuring, monitoring, and alerting, with an eye towards self-service in order to promote Engineers’ ownership of observability.
  • Operational Agility & Support: React quickly to changing customer and business needs and actively participate in the team's on-call rota. 
  • Team Up-Leveling: Mentor junior talent and effectively communicate complex technical ideas to both technical and non-technical peers.

 

Skills & Experience 

Must-Haves

  • 7+ years of experience in a software or Ops role.
  • 5+ years of cloud engineering experience, with at least 2 years of experience with AWS.
  • Experience deploying microservice environments using containerisation technologies such as Kubernetes and Docker.
  • Experience designing and implementing observability tech stacks, championing their benefits to Engineering teams, and managing the associated cost analysis of metrics gathering, effort, and tooling.
  • Ability to architect SLO/SLI implementations that balance the needs of different teams.
  • Experience building and supporting large-scale distributed systems that back a consumer app or website with associated requirements of performance, security, and disaster recovery.
  • Experience implementing IaC (e.g., CloudFormation, Terraform), automation tooling (e.g., Puppet, Ansible), and CI/CD (e.g., Jenkins, Travis CI, GitLab).
  • Experience using AI tools to streamline tasks and improve efficiencies.

 

Nice-to-Haves

  • Experience with database scaling would be a strong plus.
  • Certification in AWS, any PaaS, and/or related technologies.

*If you don’t tick every box but believe this role is a mutually good fit, please don’t hesitate to apply. We’d love to hear from you.

 

Why choose LearnUpon?

From comprehensive rewards and generous time off to meaningful investment in your growth and development, LearnUpon gives you the support, trust, and opportunity to do the most impactful work of your career.

Learn more here

 

Hiring Process

  • Qualified applicants may be invited to an initial screening call with a member of our Talent Acquisition (TA) team.
  • Successful candidates will be invited to a series of practical interviews.
  • Finally, candidates will have an interview with our CTO.
  • Successful candidates will be contacted with an offer to join our team.

 

Note: At LearnUpon, we utilise AI to enhance the speed and quality of our screening and assessment practices, but our hiring decisions are always human. 

 

If you need any accommodations during the hiring process, please reach out to us at peopleops@learnupon.com.

 

LearnUpon is an Equal Opportunities Employer. 

We do not discriminate on the basis of gender, marital status, family status, age, disability, sexual orientation, race, religion, membership of the Traveller community, or any other legally protected status.

 

Check out our Careers site and Instagram to learn more about working at LearnUpon.

 

By submitting your application, you agree to LearnUpon's Privacy Policy





