Cloud Engineer - Platform Engineering
Lightspeed DMS
Job Description
Company Overview:
Lightspeed is the leading provider of cloud-based software for dealerships, serving the Powersport, Marine, RV, Trailer, and Golf Car industries and adding hundreds of dealerships to the Lightspeed community each year.

Lightspeed's Dealer Management Solution (DMS) enables dealerships to optimize their end-to-end business operations, including Sales, Parts, Service, Rentals, Payments, Accounting, and Customer Relationship Management (CRM). When implemented into their daily operations, Lightspeed helps dealers increase their profitability by selling more units, service, and parts, all while creating a more streamlined experience for customers.

Lightspeed is the most complete and integrated DMS in the industry, with over 500 integrations with Original Equipment Manufacturers (OEMs), aftermarket parts and accessory distributors, and dozens of other software tools that a dealership may use to run its business. Uniquely designed by dealers for dealers, and refined over the past four decades, Lightspeed empowers more than 4,500 dealers across North America with the tools and technology they need to manage their dealerships.

We're seeking a Cloud Engineer to join our platform engineering team as an embedded member of development squads.
In this role, you'll work directly alongside application developers, enabling their success through self-service infrastructure platforms, GitOps automation, and AI-powered tooling. You'll manage infrastructure supporting 4,000+ EKS workloads in AWS while building the next generation of developer-focused platforms that dramatically increase velocity.

This isn't a traditional infrastructure role. You'll be a strategic partner to developers, guiding cloud architecture decisions, building automation that eliminates toil, and leveraging AI to accelerate both infrastructure and application delivery.

What you'll do:

Platform Development & Developer Enablement
- Build and maintain self-service infrastructure platforms that empower developers to provision and manage their own AWS resources
- Work embedded within development squads as their cloud infrastructure guide and strategic partner
- Design and implement reusable Terraform modules that enable consistent, scalable infrastructure patterns
- Develop AI-powered automation tools that increase team velocity and reduce manual operational overhead
- Create developer-friendly abstractions over complex infrastructure using GitOps workflows

Infrastructure as Code & GitOps
- Implement GitOps practices using ArgoCD and Helm for automated deployments across EKS clusters
- Build and maintain modular Terraform infrastructure supporting EKS, RDS PostgreSQL, Aurora PostgreSQL, and core AWS services
- Leverage AI tools to accelerate Terraform module development, code review, and documentation
- Design Terraform module libraries that balance flexibility with guardrails
- Establish CI/CD pipelines that enable developers to safely deploy infrastructure changes

Cloud Architecture & Collaboration
- Partner with developers to design cloud-native architectures optimized for AWS
- Guide infrastructure decisions within development squads, translating application requirements into scalable AWS solutions
- Collaborate with Cloud Architects and Security teams on infrastructure patterns and best practices
- Implement highly available, scalable, and self-healing systems

Automation & Velocity
- Develop Python and Bash automation that eliminates repetitive operational tasks
- Strategically apply AI tools (code generation, infrastructure optimization) to accelerate delivery
- Build monitoring, logging, and observability solutions that provide developers insight into their infrastructure
- Continuously improve speed, efficiency, and scalability through automation-first thinking

What you should have:

Technical Skills
- 5+ years in Cloud/Platform Engineering/SRE/DevOps roles with production AWS experience
- Strong GitOps mindset with hands-on experience using ArgoCD, FluxCD, or similar tools
- Kubernetes expertise (4+ years) managing EKS clusters in production environments
- PostgreSQL experience with RDS PostgreSQL and Aurora PostgreSQL in production
- Experience using AI coding assistants (GitHub Copilot, Claude, Cursor, or similar) for infrastructure code development
- Advanced Terraform skills including:
  - Designing and implementing modular, reusable Terraform code
  - Building Terraform module libraries
  - Managing complex AWS infrastructure as code
  - Understanding of Terraform best practices and state management
- Deep AWS knowledge: VPC, IAM, EKS, RDS, Aurora, S3, Security Groups, networking
- Proficient in Helm chart development and management
- Advanced Python scripting for automation and tooling (beyond basic scripts)
- Bash/Shell scripting proficiency
- CI/CD pipeline experience (GitLab CI preferred)
- Linux administration background (4+ years)

Essential Mindset & Approach
- Self-service platform thinking: instinct to build tools that enable others
- Cloud-native design mindset: architecting for AWS from the ground up
- Automation-first approach: proven track record of eliminating toil through tooling
- Systems thinking: understanding how components interact in complex distributed systems

Soft Skills (Critical)
- Exceptional communication with developers: ability to translate technical concepts and collaborate effectively
- Embedded squad experience: comfortable working directly within development teams as their infrastructure guide
- Excellent follow-through: strong task completion orientation and reliability
- Collaborative mindset: team-first approach, not siloed thinking
- Strategic guidance ability: can help developers make informed AWS infrastructure decisions
- Experience working in Agile development environments

Preferred Qualifications
- Experience building developer self-service platforms or internal developer platforms (IDP)
- Hands-on experience with AI coding assistants for Terraform/Python development, AI-driven infrastructure analysis, or LLM-based automation workflows
- AWS Well-Architected Framework knowledge
- Experience with observability platforms (Datadog, Prometheus, Grafana)
- Background in platform engineering or SRE disciplines
- Understanding of building and managing large-scale distributed systems
- Experience with messaging technologies (AmazonMQ, RabbitMQ, Kafka)
- Incident management and root cause analysis experience
- Prior experience as an embedded infrastructure engineer within development teams

What Makes This Role Different
You'll be a force multiplier for development teams, building platforms and automation that let developers move fast while maintaining reliability and security.
Your success is measured by developer velocity, reduced operational toil, and the quality of self-service tools you create.

Work Requirements
- Hybrid or Remote (Utah preferred, open to other locations)
- Occasional availability during maintenance windows
- On-call rotation support for production infrastructure

We're looking for engineers who are excited about empowering developers, obsessed with automation, and think in terms of platforms. If you love building tools that make others more productive, we want to talk to you.

Inclusion and Diversity at Lightspeed:
At Lightspeed, we celebrate the uniqueness of every individual and encourage diverse perspectives. We believe that inclusion drives innovation and fosters meaningful connections. We are committed to building an environment where everyone feels valued and empowered to make an impact.

Equal Employment Opportunity Statement:
Lightspeed is an Equal Opportunity Employer and is dedicated to building a diverse and inclusive workforce. All qualified applicants will be considered for employment without regard to race, color, creed, ancestry, national origin, gender, sexual orientation, gender identity, gender expression, marital status, religion, age, disability, veteran status, or any other protected category.

Important Note:
Applicants must be authorized to work in the U.S.

Ready to apply?
Take the next step in your career: apply today and join a team where your skills will make an impact!