Dynatrace Engineer [T500-25876]
Talent500
Job Description
Talent500 is hiring for one of its clients.
Who are we:
Core Insurance Platforms (CIP) is Zurich’s global capability responsible for building, running, and evolving core insurance technology. We set a unified, scalable operating model—covering governance, standards, architecture, service delivery, and reuse—so our business units can deliver at speed and scale.
CIP is the strategic steward of Zurich’s Guidewire ecosystem, aligning platform roadmaps to business strategy while driving stability, modernization, reduced supplier dependency, and long-term cost efficiency.
The India delivery center is one of our global delivery and capability hubs. We bring together experts in AI, engineering, analysis, quality, and architecture to deliver product & process solutions, application run services, change and transformation initiatives, and centralized platform services across both on-prem and Guidewire Cloud environments. Our teams operate from multiple global delivery centers, supporting Zurich’s business units worldwide.
Dynatrace Engineer – Enterprise Command Center (ECC)
ECC is responsible for overseeing Zurich’s IT Change Management processes and coordinating the response to all major incidents (Sev 3WL and above). This includes gathering the appropriate technical SMEs, driving triage and troubleshooting, managing incident communications, identifying root causes, and approving corrective actions. This is supported by ECC Technologies, which provide deep observability across Zurich’s hybrid IT environment.
We are seeking an experienced Dynatrace Engineer with strong automation, cloud, and observability expertise. The role focuses on building scalable monitoring solutions, automating observability platforms, and enabling deep visibility across hybrid and multi-cloud environments. You will work closely with infrastructure, cloud, and application teams to standardize, mature, and scale observability practices.
Role Overview:
The Dynatrace Engineer focuses on automation, cloud engineering, and observability using Dynatrace across Azure and AWS environments. The role involves building automation, enabling cloud observability, configuring Dynatrace monitoring, performing root cause analysis, and collaborating with application and cloud teams to standardize and scale observability practices.
The role requires the ability to work independently in a globally distributed environment, take ownership of observability initiatives end to end, and contribute quickly while improving platform maturity and effectiveness.
Key Responsibilities:
Automation & Tooling:
Develop and maintain automation scripts using Python for:
- Monitoring onboarding
- Validation checks
- Service discovery
- API integrations
- Data processing
- Automate Infrastructure as Code using Terraform, including modular deployments and multi-cloud provisioning
- Implement configuration automation and server orchestration using Ansible
- Automate Dynatrace OneAgent installation, tagging rules, dashboards, alerting profiles, management zones, and platform configuration
- Identify manual observability processes and propose scalable, automation-based solutions.
Cloud Engineering (Azure & AWS):
- Deploy and manage cloud resources across Azure and AWS using Terraform
- Enable cloud observability for compute, networking, serverless, container workloads, logs, and metrics
- Work with cloud teams to ensure visibility into critical services and hybrid environments
- Support observability for both cloud-native and on-prem components within hybrid architectures.
Observability & Monitoring:
Configure Dynatrace capabilities, including:
- Automated service onboarding
- Tagging strategies and naming conventions
- Management Zones and custom dashboards
- Custom Metrics API, log ingestion, and alerting pipelines
- Distributed tracing analysis
- Deploy, configure, and manage ActiveGates, including scaling, secure communication patterns, and hybrid/on-prem integration
- Perform root cause analysis and performance troubleshooting using Dynatrace dashboards, service flows, extensions, and logs
- Collaborate with application owners to improve observability practices and drive adoption
- Contribute to observability standards, templates, and best practices
- Ensure monitoring consistency across regions, business units, and environments
- Support governance around observability, including access management and configuration standards.
Technical Skills:
Automation & Programming:
- Strong expertise in Python for automation, API integration, file handling, retries, exception handling, and logging
- Hands-on experience building Infrastructure as Code using Terraform in enterprise environments
- Proficiency with Ansible for configuration management, provisioning, orchestration, and reusable playbook design.
Cloud:
Strong understanding of Azure and AWS, including:
- Virtual machines, networking, and load balancers
- IAM concepts and secure access models
- Monitoring tools such as CloudWatch and Azure Monitor
- Serverless and container-based workloads.
Observability / APM:
Deep hands-on experience with Dynatrace, including:
- OneAgent deployment on Windows and Linux
- ActiveGate deployment, architecture, and scaling
- Management Zones, tagging rules, and naming standards
- Problem notifications, integrations, and alert routing
- Custom dashboards, metrics, extensions, and log ingestion
- Distributed tracing and service flow analysis
- Experience with Dynatrace access management, user groups, permissions, and governance models
- Strong understanding of the observability pillars (metrics, logs, and traces) and of SLIs and SLOs.
Nice to Have:
- Experience with Kubernetes, Docker, AKS, and EKS.
- Experience with CI/CD tools such as Azure DevOps, GitHub Actions, or Jenkins.
- Experience with ITSM tools including ServiceNow (Incident, Change, CMDB).
- Scripting knowledge in Bash or PowerShell.
Experience Required:
- Minimum 5–6 years of relevant experience in observability and cloud engineering.