2567 - Java Data Engineer

EXL

5 months ago

India

Job description & requirements

Job Summary

We are seeking a skilled Data Engineer with strong expertise in Java and big data technologies to design, develop, and maintain scalable batch data pipelines. The ideal candidate will have hands-on experience with modern data lakehouse architectures, cloud-native data platforms, and automation tools to support high-performance analytics and data processing workloads.


Experience: 5-8 years

Must Haves

  • Bachelor's or Master's degree in Computer Science, Engineering, or a related technical field.
  • Strong proficiency in Java programming with solid understanding of object-oriented design principles.
  • Proven experience designing and building ETL/ELT pipelines and frameworks.
  • Excellent command of SQL and familiarity with relational database management systems.
  • Hands-on experience with big data technologies such as Apache Spark, Hadoop, and Kafka or equivalent streaming and batch processing frameworks.
  • Knowledge of cloud data platforms, preferably AWS services (Glue, EMR, Lambda) and Snowflake.
  • Experience with data modeling, schema design, and concepts of data warehousing.
  • Understanding of distributed computing, parallel processing, and performance tuning in big data environments.
  • Strong analytical, problem-solving, and debugging skills.
  • Excellent communication and teamwork skills with experience working in Agile environments.


Nice to Have

  • Experience with containerization and orchestration technologies such as Docker and Kubernetes.
  • Familiarity with workflow orchestration tools like Apache Airflow.
  • Basic scripting skills in languages like Python or Bash for automation tasks.
  • Exposure to DevOps best practices and building robust CI/CD pipelines.
  • Prior experience managing data security, governance, and compliance in cloud environments.


Responsibilities:

  • Design, develop, and optimize scalable batch data pipelines using Java and Apache Spark to handle large volumes of structured and semi-structured data.
  • Utilize Apache Iceberg to manage data lakehouse environments, supporting advanced features such as schema evolution and time travel for data versioning and auditing.
  • Build and maintain reliable data ingestion and transformation workflows using AWS Glue, EMR, and Lambda services to ensure seamless data flow and integration.
  • Integrate with Snowflake as the cloud data warehouse to enable efficient data storage, querying, and analytics workloads.
  • Collaborate closely with DevOps and infrastructure teams to automate deployment, testing, and monitoring of data workflows using CI/CD tools like Jenkins.
  • Develop and manage CI/CD pipelines for Spark/Java applications, ensuring automated testing and smooth releases in a cloud environment.
  • Monitor and continuously optimize the performance, reliability, and cost-efficiency of data pipelines running on cloud-native platforms.
  • Implement and enforce data security, compliance, and governance policies in line with organizational standards.
  • Troubleshoot and resolve complex issues related to distributed data processing and integration.
  • Work collaboratively within Agile teams to deliver high-quality data engineering solutions aligned with business requirements.
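The batch extract-transform-load work described above can be sketched in plain Java. This is a conceptual illustration only, not part of the role description: it uses an in-memory list in place of a real Spark/S3 source, and the `Event` record and field names are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class BatchEtlSketch {
    // Hypothetical input record type, for illustration only.
    record Event(String userId, String type, long bytes) {}

    // Extract: a real pipeline would read from S3/Kafka via Spark or Glue;
    // an in-memory list keeps this sketch self-contained.
    static List<Event> extract() {
        return List.of(
            new Event("u1", "click", 120),
            new Event("u2", "view", 300),
            new Event("u1", "click", 80),
            new Event("u3", "view", 0));
    }

    // Transform: filter out empty records and aggregate bytes per user,
    // mirroring the filter/group-by stages of a typical batch job.
    static Map<String, Long> run() {
        return extract().stream()
            .filter(e -> e.bytes() > 0)                        // drop empty rows
            .collect(Collectors.groupingBy(Event::userId,
                     Collectors.summingLong(Event::bytes)));   // sum per key
    }

    public static void main(String[] args) {
        // Load step would write the aggregate to a warehouse table;
        // here we just print it.
        System.out.println(run());
    }
}
```

In a Spark-based pipeline the same filter/group-by logic would be expressed over a `Dataset<Row>` rather than a `java.util.stream.Stream`, but the shape of the transformation is the same.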

Location:

India
