Job Description:
We are looking for a highly skilled and motivated Data Engineer to join our team. The ideal candidate will be responsible for expanding and optimising our data pipeline architecture and for improving data flow and collection for cross-functional teams. The Data Engineer will work closely with software developers, data analysts, and data scientists on various data initiatives, ensuring that an optimal data delivery architecture is applied consistently across ongoing projects.
Key Responsibilities:
- Design, implement, and maintain scalable, robust data pipelines that ingest, process, and transform large datasets from multiple sources.
- Develop and optimise ETL (Extract, Transform, Load) processes to load data into the data warehouse, ensuring high data quality and reliability.
- Build and maintain database systems that support business intelligence and analytics functions.
- Collaborate with data scientists, analysts, and software engineers to understand data requirements and translate them into scalable, high-performance data architectures.
- Monitor and troubleshoot data pipeline performance, ensuring minimal data downtime and latency.
- Implement data governance practices, including data security, privacy, and quality, ensuring compliance with company policies and industry regulations.
- Develop tools to monitor data quality and ensure the accuracy and integrity of data across systems.
- Participate in the design and implementation of the company’s data strategy, influencing best practices and technology adoption.
Qualifications:
Education & Experience:
- Bachelor’s or Master’s degree in Computer Science, Information Systems, Engineering, or a related field.
- 3–5+ years of experience as a Data Engineer or in a similar role.
- Experience working with cloud-based data solutions such as AWS, Azure, or Google Cloud.
Technical Skills:
- Strong proficiency in SQL, including advanced query optimisation techniques.
- Proficiency in Python, Go, or Scala for data processing and automation tasks.
- Experience with big data frameworks such as Hadoop, Spark, Kafka, or Flink.
- Experience with relational databases (e.g., MySQL, PostgreSQL) and NoSQL databases (e.g., MongoDB, Cassandra).
- Hands-on experience with data warehousing technologies such as Snowflake, Amazon Redshift, Google BigQuery, or similar.
- Knowledge of workflow orchestration tools such as Apache Airflow, Luigi, or similar.
- Familiarity with data lake architectures, data streaming, and real-time data pipelines.
- Experience with ETL tools such as Talend, Apache NiFi, or custom ETL frameworks.