SRE jobs

61 open SRE roles across the observability ecosystem.

G
Senior Field Engineer | Germany | Remote
Grafana Labs Germany (Remote) ◆ Remote
Remote · SRE · Backend · Prometheus · Grafana
today
E
Senior SRE - Platform (Managed Kubernetes Infrastructure)
Design, build, and scale multi-cloud platform for hosted and serverless services, ensuring reliability and automating system engineering efforts.
Elastic Canada senior
Metrics · Incident Response · SRE · Backend · Prometheus
today
I
Platform Engineer
Design, maintain, and scale infrastructure for incident response platform
incident.io London ◆ Remote mid £110k – £200k
Incident Response · Remote · SRE · Platform · DevRel
3d ago
S
Staff Technical Program Manager
Technical program manager for platform org, driving strategic execution and partnering with senior engineering leaders.
Sentry San Francisco, California ◆ Remote staff $200k – $240k
Metrics · Incident Response · Remote · SRE
3d ago
N
Manager, Software Engineering (Fullstack Team)
New Relic Bangalore, India
SRE · Backend · Frontend · Engineering Management
3d ago
H
Resident Architect- LATAM
Honeycomb Remote - Brazil ◆ Remote
Remote · SRE · OpenTelemetry
4d ago
D
Senior Software Engineer - Observability Visibility
Datadog New York, New York, USA
SRE
6d ago
C
Staff Professional Services Consultant
Partner with IT and Security teams to implement telemetry infrastructure for large enterprises.
Cribl Remote - United States ◆ Remote staff
Remote · SRE · Grafana · Kubernetes
1w ago
N
Software Engineer - Fullstack
New Relic Hyderabad, India
SRE · Backend · Frontend · Data
1w ago
H
Senior Site Reliability Engineer
Honeycomb Remote - United Kingdom ◆ Remote
Incident Response · Remote · SRE · Backend · DevRel
1w ago
E
Site Reliability Engineer (Hosted Infra) - Platform
Design and implement large-scale system automation, optimize host reliability, and strengthen observability across multiple cloud providers
Elastic United States senior
Incident Response · SRE · Platform · Prometheus · Kubernetes
1w ago
D
Staff Applied Scientist - Dashboards
Datadog New York, New York, USA
Metrics · Logs · Tracing · SRE
1w ago
H
Senior Site Reliability Engineer
Honeycomb Remote - Ireland ◆ Remote
Incident Response · Remote · SRE · Backend · DevRel
2w ago
D
Staff Applied Scientist - Agentic Interfaces
Datadog New York, New York, USA
Metrics · Tracing · SRE
2w ago
G
Staff Backend Engineer - Grafana Enterprise | Canada | Remote
Grafana Labs Canada (Remote) ◆ Remote
Remote · SRE · Backend · Prometheus · Grafana
3w ago
G
Staff Backend Engineer - Grafana Enterprise | US | Remote
Grafana Labs United States (Remote) ◆ Remote
Remote · SRE · Backend · Prometheus · Grafana
3w ago
C
Senior Software Engineer - Postgres
Design and build backend services for Postgres database clusters in ClickHouse cloud platform
ClickHouse Canada senior
Incident Response · SRE · Backend · DevRel · Kubernetes
4w ago
C
Senior Software Engineer - Postgres
Design and build backend services for Postgres database clusters in ClickHouse
ClickHouse India (remote) ◆ Remote senior
Incident Response · Remote · SRE · Backend · DevRel
4w ago
C
Senior Software Engineer - Postgres
Design and build backend services for ClickHouse Cloud database clusters
ClickHouse United States (remote) ◆ Remote senior
Incident Response · Remote · SRE · Backend · DevRel
4w ago
C
Senior Software Engineer - Postgres
Design and build backend services for Postgres database clusters in ClickHouse
ClickHouse United Kingdom senior
Incident Response · SRE · Backend · DevRel · Kubernetes
4w ago
D
Manager II, Engineering - Site Reliability Engineering
Datadog France, Remote; Ireland, Remote ◆ Remote
Remote · SRE · Backend · Engineering Management
4w ago
D
Manager II, Engineering - Site Reliability Engineering
Datadog Paris, France
SRE · Backend · Engineering Management
4w ago
C
Senior Site Reliability Engineer
Design and deploy observability infrastructure for large-scale enterprise customers
Cribl Remote - Poland ◆ Remote senior
APM · Incident Response · Remote · SRE · Prometheus
1mo ago
N
Senior Technical Success Manager
New Relic Tokyo, Japan
SRE
1mo ago
N
Lead Fullstack Engineer - Backend focused
New Relic Hyderabad, India
Logs · SRE · Backend · Frontend
1mo ago
D
Senior Software Engineer - Bits AI SRE
Datadog New York, New York, USA
SRE · Backend · Kubernetes
1mo ago
S
Staff Site Reliability Engineer
Design and execute reliability roadmap for planet-scale observability and security products
Sumo Logic Bangalore, Karnataka, India ◆ Hybrid staff
Incident Response · SRE · Kubernetes
1mo ago
C
Senior Consulting Engineer - Singapore
Provide consultative support and engineering assistance to enterprise customers
ClickHouse Singapore (Remote) ◆ Remote senior
Incident Response · Remote · SRE · Backend · Kubernetes
1mo ago
S
Senior Site Reliability Engineer I
Design, deploy, and operate scalable systems, improve reliability and velocity, and drive root cause analysis.
Sumo Logic San Jose, Costa Rica ◆ Remote senior
Logs · Incident Response · Remote · SRE · Kubernetes
2mo ago
S
Site Reliability Engineer I
Cloud native SRE for observability and security products, improving operational excellence and feature velocity
Sumo Logic San Jose, Costa Rica ◆ Remote junior
Logs · Incident Response · Remote · SRE · Kubernetes
2mo ago
C
Staff Professional Services Consultant
Partner with IT and Security teams to implement telemetry infrastructure for large enterprises.
Cribl Remote - Netherlands ◆ Remote staff
Remote · SRE · Grafana · Kubernetes
2mo ago
C
Database Reliability Engineer - Core Team
Design and implement reliability processes for ClickHouse Core, collaborating with multiple teams to ensure high availability and performance.
ClickHouse United States (Remote) ◆ Remote senior
Metrics · Incident Response · Remote · SRE · ClickHouse
2mo ago
C
Database Reliability Engineer - Core Team
Design and implement reliability processes for ClickHouse Core, collaborating with multiple teams to ensure high availability, scalability, and performance.
ClickHouse Australia (remote) ◆ Remote senior
Metrics · Incident Response · Remote · SRE · ClickHouse
2mo ago
C
Database Reliability Engineer - Core Team
Design and implement reliability processes for ClickHouse Core, collaborating with multiple teams to ensure high availability and performance.
ClickHouse United Kingdom (remote) ◆ Remote senior
Metrics · Incident Response · Remote · SRE · ClickHouse
2mo ago
C
Database Reliability Engineer - Core Team
Design and implement reliability processes for ClickHouse Core, collaborating with multiple teams to ensure high availability, scalability, and performance.
ClickHouse Germany (remote) ◆ Remote senior
Metrics · Incident Response · Remote · SRE · ClickHouse
2mo ago
C
Database Reliability Engineer - Core Team
Design and implement reliability processes for ClickHouse Core, collaborating with multiple teams to ensure high availability and performance.
ClickHouse Netherlands (remote) ◆ Remote senior
Metrics · Incident Response · Remote · SRE · ClickHouse
2mo ago
D
Senior Product Manager - Network Path
Datadog New York, New York, USA ◆ Hybrid
Metrics · APM · SRE
2mo ago
D
Senior Developer Advocate - Data Observability
Datadog California, USA, Remote; Colorado, USA, Remote; Illinois, USA, Remote; New York, USA, Remote; Washington, USA, Remote ◆ Remote
Remote · SRE · DevRel · Data
2mo ago
S
Observability Success Architect
Technical post-sales role for observability platform, driving customer adoption and expansion
SigNoz United States senior
Metrics · Logs · Tracing · APM · SRE
3mo ago
C
Senior Site Reliability Engineer- Remote
Design and implement scalable, secure, highly available distributed systems for cloud infrastructure, ensuring reliability, availability, and performance.
ClickHouse Australia (Remote) ◆ Remote senior
Incident Response · Remote · SRE · Backend · Kubernetes
3mo ago
C
Senior Site Reliability Engineer- Remote
Design and implement scalable, secure, highly available distributed systems for cloud infrastructure, ensuring reliability, availability, and performance.
ClickHouse Singapore (Remote) ◆ Remote senior
Incident Response · Remote · SRE · Backend · Kubernetes
3mo ago
C
Senior Site Reliability Engineer- Remote
Design and implement scalable, secure, highly available distributed systems for cloud infrastructure, ensuring reliability and performance.
ClickHouse Canada (remote) ◆ Remote senior
Incident Response · Remote · SRE · Backend · Kubernetes
3mo ago
C
Senior Site Reliability Engineer- Remote
Design and implement scalable, secure, highly available distributed systems for cloud infrastructure, ensuring reliability, availability, and performance.
ClickHouse United States (remote) ◆ Remote senior
Incident Response · Remote · SRE · Backend · Kubernetes
3mo ago
S
Senior Site Reliability Engineer I
Design and implement reliability roadmap for product area, optimize operations and security, and improve developer velocity
Sumo Logic Noida, Uttar Pradesh, India ◆ Hybrid senior
Incident Response · SRE · Kubernetes
3mo ago
S
DevOps Engineer - Support
DevOps Engineer for open-source observability platform, troubleshooting complex distributed systems, architecting solutions for observability at scale.
SigNoz India senior
Tracing · SRE · Platform · Backend · OpenTelemetry
3mo ago
D
Manager I, Engineering - CodeGen
Datadog New York, New York, USA
Incident Response · SRE · Platform
4mo ago
S
Staff Site Reliability Engineer
Design and execute reliability roadmap for observability and security products, collaborate with multiple teams to optimize operations and improve developer experience.
Sumo Logic Noida, Uttar Pradesh, India ◆ Hybrid staff
Incident Response · SRE · Kubernetes
4mo ago
N
Site Reliability / DevOps Engineer (Cloud)
Develop CI/CD pipelines, automation tools, and infrastructure for cloud offerings and open-source project.
Netdata Remote job ◆ Remote senior
Incident Response · Remote · SRE · Kubernetes
5mo ago
I
Product Engineer (Mobile)
Build mobile app for incident response platform using React Native
incident.io London ◆ Remote mid $110k – $165k
Metrics · Logs · Incident Response · Remote · SRE
5mo ago
I
Product Engineer
Build and maintain observability features for incident response platform
incident.io London ◆ Remote senior £110k – £200k
Logs · Incident Response · Remote · SRE · Frontend
5mo ago
I
Security Engineer
Design and implement application security features, collaborate with product teams to ensure secure software development lifecycle.
incident.io London ◆ Remote mid £110k – £200k
Incident Response · Remote · SRE · Frontend · Security
5mo ago
C
Senior Infrastructure Engineer - Postgres
Design and implement automation, observability, and operational rigor for a global data platform
ClickHouse India (remote) ◆ Remote senior
Metrics · Tracing · Incident Response · Remote · SRE
7mo ago
C
Senior Infrastructure Engineer - Postgres
Design and implement automation, observability, and operational rigor for a global data platform
ClickHouse United States (remote) ◆ Remote senior
Metrics · Tracing · Incident Response · Remote · SRE
7mo ago
C
Cloud Software Engineer - Observability Platform
Design and operate a high-throughput telemetry platform for real-time analytics and observability
ClickHouse United States (remote) ◆ Remote senior
Incident Response · Remote · SRE · Backend · Data
7mo ago
C
Cloud Software Engineer - Observability Platform
Design and operate a telemetry platform for real-time analytics and observability, handling trillions of events per day.
ClickHouse Canada (remote) ◆ Remote senior
Incident Response · Remote · SRE · Backend · Data
7mo ago
S
Forward Deployed Engineer
Design and implement observability solutions, write technical documentation, and contribute to customer codebases.
SigNoz India ◆ Remote senior
Remote · SRE · Platform · OpenTelemetry · Kubernetes
9mo ago
D
AI Research Engineer - Datadog AI Research (DAIR)
Datadog Paris, France
Metrics · Logs · Tracing · Incident Response · SRE
9mo ago
D
AI Research Engineer - Datadog AI Research (DAIR)
Datadog New York, New York, USA
Metrics · Logs · Tracing · Incident Response · SRE
10mo ago
D
Senior Product Manager - Database AI Optimization
Datadog New York, New York, USA
APM · SRE
11mo ago
D
AI Research Scientist – Datadog AI Research (DAIR)
Datadog Paris, France
Metrics · Logs · Tracing · Incident Response · SRE
14mo ago
D
AI Research Scientist - Datadog AI Research (DAIR)
Datadog New York, New York, USA
Metrics · Logs · Tracing · Incident Response · SRE
17mo ago