Site Reliability Engineer (SRE) / Infrastructure Engineer - iVedha - Canada

Site Reliability Engineer (SRE) / Infrastructure Engineer

iVedha

Expires on: 09 Jul 2025

Canada

Job description & requirements

About the Role


We are seeking an experienced Site Reliability Engineer (SRE) / Infrastructure Engineer to join our Platform Engineering team. This role requires a hands-on technologist with deep expertise in cloud infrastructure, Kubernetes, DevOps, and SRE practices to ensure the performance, availability, scalability, and security of mission-critical platforms.


Key Responsibilities


  • Design, implement, and maintain highly available, scalable, and secure infrastructure across AWS, Azure, and GCP.
  • Build and automate CI/CD pipelines using Azure DevOps, Jenkins, Ansible Tower, and Terraform.
  • Manage containerized applications using Kubernetes, Docker, AKS, EKS, and GKE.
  • Develop and enforce SRE best practices including monitoring, incident response, capacity planning, and reliability automation.
  • Implement Infrastructure as Code (IaC) using Terraform, Bicep, ARM templates, and CloudFormation.
  • Collaborate with development, QA, and security teams to integrate DevSecOps pipelines.
  • Use observability tools (e.g., ELK, Kibana) for metrics, logging, and alerting.
  • Manage machine identity and key lifecycle with Venafi, TLS, and PKI-based automation.
  • Lead root cause analysis and provide reliable fixes for complex infrastructure issues.
  • Participate in architectural reviews, security audits, and disaster recovery planning.


Qualifications


Must-Have:

  • 10+ years in infrastructure, DevOps, or SRE roles within enterprise-grade environments.
  • Proven experience with AWS, Azure, and GCP cloud services.
  • Hands-on expertise in Kubernetes (AKS/EKS/GKE), Helm, and Docker.
  • Strong scripting skills in Python, Bash, and PowerShell.
  • Experience with Terraform and Ansible.
  • Familiarity with CI/CD tools: Jenkins, Azure DevOps, Octopus, GitHub Actions.
  • In-depth knowledge of Linux, Windows Server, and hybrid cloud environments.
  • Solid understanding of networking, load balancing (NGINX, F5, ELB), and firewalls.
  • Knowledge of security best practices and tools (e.g., IAM, TLS, PKI, SIEM, WAF, DAST/SAST).


Nice-to-Have:

  • Experience with Apache Airflow, Snowflake, and big data pipelines.
  • Familiarity with SRE maturity models and with service level indicators, objectives, and agreements (SLIs, SLOs, SLAs).

