Job Description
Responsibilities
As a Sr. Site Reliability Engineer, you will be the guardian of our platform’s reliability and performance, ensuring that millions of hospitality transactions flow seamlessly across the globe. You will architect and implement scalable AWS cloud solutions, maintain and support heavily loaded Kubernetes (EKS) clusters, support the CI/CD process with ArgoCD and GitOps, automate platform deployments with Terraform, and develop and continuously improve product observability and monitoring systems using Grafana, Prometheus, Datadog, and CloudWatch. You will also participate in incident management and root‑cause analysis, optimize system performance, collaborate with development teams on monitoring best practices, and work with security teams to uphold security best practices and support the infrastructure.
What You Bring to the Team
- Design and implement a reliable and scalable AWS architecture to meet the organization’s needs.
- Maintain and...