Director DevOps and SRE
United States Digital Space LLC
Job Description
Job Overview
We are seeking a Director to lead the Group Functions DevOps and Site Reliability organization. The Director of DevOps and Site Reliability Engineering is accountable for the end‑to‑end delivery, reliability, and operation of the Group Functions Data Office platform and AI enablement ecosystem. This role ensures platforms and AI services are scalable, reliable, secure, and cost‑efficient through automation, reliability engineering, and modern DevOps practices.
You will collaborate closely with leaders across business, technology, governance, and risk to ensure our platform is optimized for accelerated delivery and automated site reliability while meeting the highest standards of security, compliance, and operational excellence.
Position Responsibilities
The ideal candidate is a highly engaged and experienced engineer who will lead DevOps and SRE teams, is comfortable in a fast‑paced, multi‑functional delivery model motivated by value generation, and is hands‑on in details while building consensus with senior partners.
Lead, build, transform, and scale a high‑performing, unified organization comprising DevOps, Site Reliability Engineers, Platform Engineering, and a Global Operations team.
Establish clear accountability across L1, L2, and L3 support tiers, embedding operational ownership within engineering delivery teams. Ensure platforms are production‑ready, resilient, and cost‑optimized. Implement and mature Site Reliability Engineering practices, including error handling and automated remediation.
Own production reliability, incident response, and service continuity across global platforms and support teams. Drive proactive reliability improvements through observability, automation, and resilience engineering. Establish automation as the default approach and continuously improve platform performance, automation, developer experience, and overall efficiency.
Ensure robust platform automation, observability, SRE practices, and proactive incident detection. Reduce manual intervention through self‑healing systems, infrastructure as code, and automated workflows. Deliver measurable reductions in operational costs.
Champion compliance, security, and governance, proactively addressing regulatory requirements and operational risk. Ensure platforms and operations meet security, risk, and compliance architecture requirements as well as regulatory requirements, including disaster recovery and business continuity planning. Manage operational risk associated with scaling AI workloads and complex distributed systems.
Provide transparent reporting on reliability, cost, security, sensitive data accessibility and usage, and operational performance. Partner with architecture, security, risk, data engineering, AI, product, and Group Functions Data Delivery leaders to align platform roadmaps with business priorities, accelerate adoption of platform capabilities, and foster reusable component development and automation.
Required Qualifications
Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field. 8‑10 years of experience in technology leadership roles, with at least 3 years in enterprise platform delivery. Strong cloud platform engineering, distributed systems, and automation expertise, with proven thought leadership and delivery experience building model CI/CD practices. Hands‑on leadership of engineering teams, leaning in to ensure the team is using the latest DevOps and SRE practices.
Demonstrated leadership, communication, and collaboration skills, with experience leading global engineering DevOps or SRE teams, including tiered support models. Advanced proficiency in Java (required) and Python (required); additional experience in other programming languages is an asset. Previous experience with and knowledge of implementing GitHub Actions.
Benefits and Career Growth
Empowerment to learn and grow in the career you want. Recognition and support in a flexible environment where well‑being and inclusion are more than words. Part of a global team supporting you in shaping the future you want to see.
The company offers eligible employees a wide array of customizable benefits, including health, dental, mental health, vision, short‑ and long‑term disability, life and AD&D insurance coverage, adoption/surrogacy and wellness benefits, employee/family assistance plans, various retirement savings plans (including pension and a global share ownership plan with employer matching contributions), and financial education and counseling resources. The generous paid time off program in Canada includes holidays, vacation, personal, and sick days, and the full range of statutory leaves of absence.
Equal Opportunity Employer
At the company, we embrace our diversity.
We strive to attract, develop and retain a workforce that is as diverse as the customers we serve and to foster an inclusive work environment that embraces the strength of cultures and individuals. We are committed to fair recruitment, retention, advancement and compensation, and we administer all of our practices and programs without discrimination on the basis of race, ancestry, place of origin, colour, ethnic origin, citizenship, religion or religious beliefs, creed, sex (including pregnancy and pregnancy‑related conditions), sexual orientation, genetic characteristics, veteran status, gender identity, gender expression, age, marital status, family status, disability, or any other ground protected by applicable law.
Salary & Working Arrangement
Hybrid work arrangement in Toronto, Ontario.
Salary range: CAD 113,260.00 – 210,340.00. Employees also have the opportunity to participate in incentive programs and earn incentive compensation tied to business and individual performance. Salary will vary depending on local market conditions, geography and relevant job‑related factors such as knowledge, skills, qualifications, experience, and education.