Location
Toronto
Posted
June 01, 2026
Commute
Local Area
Job Description
Join a leading team as a Site Reliability Engineering Lead, focusing on APM and advanced observability. Your skills in troubleshooting complex applications will drive system reliability.
In this role, you'll need hands-on expertise with Dynatrace or similar tools to analyze and instrument applications. Your deep understanding of observability fundamentals will help you diagnose issues using metrics, logs, and traces. Proficiency in Python and Node.js is required for backend development.
Key Responsibilities:
• Instrument distributed applications for performance insights
• Analyze logs, metrics, and traces to resolve application failures
• Design effective dashboards that enhance user experience
• Advocate for Google SRE principles in operational practices
• Work collaboratively to enhance system reliability
Requirements:
• Strong programming skills in Python and Node.js
• Proficiency with enterprise observability tools
• Solid troubleshoot...