Site Reliability Engineer (SRE) - Ellwood Consulting - Melbourne, Australia

Site Reliability Engineer (SRE)

Ellwood Consulting

4 days ago

Expires on: 05 Jul 2025

Melbourne, Australia

Job description & requirements

Are you passionate about building robust, scalable systems that keep critical services running smoothly? We’re looking for a Site Reliability Engineer to be a key force behind our infrastructure, ensuring peak performance, stability, and efficiency.

In this role, you'll bridge the gap between development and operations, designing resilient architectures, driving automation, and embedding a culture of reliability across the engineering team. Your work will directly impact the user experience and uptime of our most essential services.

  • Salary up to MYR 14,000
  • Working hours: 5pm - 3am (Afternoon Shift) | 8am - 5pm (Day Shift)
  • Flexi Hybrid

What You’ll Be Doing:

  • Design and implement high-availability, scalable system architectures that can handle production-grade workloads.
  • Develop tools and automation to streamline operations, reduce manual tasks, and improve response times.
  • Define and monitor Service Level Objectives (SLOs) and Service Level Indicators (SLIs) to measure and improve system reliability.
  • Conduct detailed post-incident reviews and lead root cause analyses to prevent recurrence.
  • Collaborate across engineering, QA, and infrastructure teams to build and maintain resilient systems.
  • Diagnose and resolve complex issues across databases, networks, and deployment pipelines—including Kubernetes and VMs.
  • Ensure adherence to Service Level Agreements (SLAs) by proactively managing incidents and performance.
  • Continuously tune and optimize systems for performance, scalability, and reliability.
  • Document processes, incident resolutions, and system designs to enable knowledge sharing and operational transparency.

What You’ll Bring:

  • Strong programming skills in Python, Golang, Java, or similar languages—especially for automation and tooling.
  • Experience designing and operating distributed systems at scale.
  • Deep understanding of SRE and DevOps principles, including observability, reliability, and incident response.
  • Hands-on experience in cloud environments such as AWS, Azure, or Google Cloud.
  • Proficiency in Linux system administration, performance tuning, and troubleshooting.
  • Solid grasp of networking concepts and infrastructure troubleshooting.
  • A proactive mindset, excellent problem-solving skills, and the ability to drive improvements autonomously.
  • Comfortable working independently while collaborating in cross-functional environments.
  • Fluency in Mandarin (written and spoken) is preferred but not required, for communication with clients in the China market.

Bonus Points For:

  • Experience with monitoring tools like Prometheus, Grafana, Datadog, or similar.
  • Familiarity with CI/CD pipelines, Infrastructure as Code (IaC) (e.g., Terraform), and containerization tools (Docker, Kubernetes).
  • Knowledge of automation and scripting for system tasks (e.g., Bash, Python).
  • A strong understanding of DevOps culture and its best practices.

  • Seniority level: Mid-Senior level
  • Employment type: Full-time
  • Job function: Information Technology
  • Industries: Technology, Information and Media

Location: Melbourne, Victoria, Australia
