PySpark
Tata Consultancy Services
Job Description
Location: PAN India
Experience: 8 to 10 Years
Key Skills: PySpark, Python
Must-have Skills:
Implementing data ingestion pipelines from different types of data sources, such as databases, S3, and files.
Experience in building ETL/data warehouse transformation processes.
Experience working with structured and unstructured data.
Developing Big Data and non-Big Data cloud-based enterprise solutions in PySpark, SparkSQL, and related frameworks/libraries.
Developing scalable, reusable, self-service frameworks for data ingestion and processing.
Integrating end-to-end data pipelines that take data from the data source to target data repositories, ensuring the quality and consistency of data.
Processing performance analysis and optimization.
Bringing best practices in the following areas: Design & Analysis, Automation (Pipelining, IaC), Testing, Monitoring, Documentation.
Good to have (Knowledge):
Experience in cloud-based solutions.
Knowledge of data management principles.