
Staff Site Reliability Engineer
Description
Datavant is the data collaboration platform trusted for healthcare. Guided by our mission to make the world’s health data secure, accessible and actionable, we provide critical data solutions for organizations across the healthcare ecosystem - including providers, health plans, researchers, and life sciences companies. From fulfilling a single patient’s request for their medical records to powering the AI revolution in healthcare, Datavanters are building the future of how data is connected and used to improve health.
By joining Datavant today, you’re stepping onto a driven and highly collaborative team that is passionate about creating transformative change in healthcare.
What We’re Looking For:
We’re seeking Cloud Platform Embedded Site Reliability Engineers (SREs) who thrive at the intersection of infrastructure, reliability engineering, and enablement. You are passionate about building resilient, scalable systems and enabling product teams to deliver with confidence. You bring strong technical depth in cloud-native environments and understand how to balance long-term engineering rigor with the pragmatism needed to support fast-moving teams.
The ideal candidate is fluent in modern cloud architectures, containers, CI/CD automation, observability practices, and Infrastructure-as-Code. You know how to diagnose complex production issues, optimize system performance, and design platforms that minimize operational burden. You’re comfortable working hands-on within product engineering teams—pairing with developers, advising on architectural decisions, and leading initiatives that improve reliability, scalability, security and all the other “ilities”.
You are an advocate for operational excellence and continuous improvement. You think in systems, measure what matters, and build guardrails rather than gates. You communicate clearly, influence without authority, and enjoy building tools, patterns, and shared services that raise the engineering bar across the organization.
Most importantly, you bring a collaborative mindset. You listen deeply, understand real-world constraints, and help teams adopt best practices that make their services more robust and their developers more productive.
What You Will Do:
- Lead cross-functional initiatives spanning every aspect of the SDLC and PDLC to modernize, secure, and standardize how we operate.
- Partner directly with product engineering teams to improve reliability, scalability, and operational maturity against organizational platform and security standards.
- Guide teams in designing resilient architectures, defining SLIs/SLOs, and managing error budgets.
- Lead troubleshooting of complex production issues and support incident response, root-cause analysis, and postmortem processes.
- Automate infrastructure and operational workflows to reduce toil and accelerate delivery.
- Enhance observability through improved logging, metrics, tracing, and performance monitoring.
- Develop and deploy reusable tools, templates, and platform components that streamline engineering workflows.
- Influence and establish reliability-focused best practices across the organization.
- Serve as a technical advisor to engineering teams, helping them adopt sound architectural and operational patterns.
What You Need to Succeed:
- 5+ years of experience working in at least one programming language (e.g., Python or Go).
- Experience operating cloud-native applications on platforms such as AWS, GCP, or Azure.
- 5+ years of experience with Kubernetes and container orchestration principles.
- 5+ years of experience with Terraform.
- Hands-on experience with observability systems (metrics, logging, tracing) and diagnosing distributed systems issues.
- Demonstrated ability to lead or support incident response and drive high-quality postmortems.
- Strong collaboration skills with the ability to influence without authority and partner effectively with developers.
- Clear communication skills for explaining reliability tradeoffs to both technical and non-technical audiences.
- A mindset grounded in systems thinking, operational excellence, and continuous improvement.
- Curiosity, adaptability, and the judgment to balance innovation with pragmatic engineering choices.
We are committed to building a diverse team of Datavanters who are all responsible for stewarding a high-performance culture in which all Datavanters belong and thrive.
We are proud to be an Equal Employment Opportunity employer. Datavant is committed to working with and providing reasonable accommodations to individuals with physical and mental disabilities. If you need an accommodation while seeking employment, please request it here by selecting the ‘Interview Accommodation Request’ category. You will need your requisition ID when submitting your request; you can find instructions for locating it here. Requests for reasonable accommodations will be reviewed on a case-by-case basis.
For more information about how we collect and use your data, please review our Privacy Policy.
