Responsible for designing, developing, and optimizing big data solutions on the Databricks platform. This role involves close collaboration with data engineers, data scientists, and stakeholders to ensure efficient data processing, data quality, and performance.
Key Responsibilities
- Develop and implement advanced data solutions on Databricks using Python, Scala, or SQL.
- Optimize ETL processes and data pipelines for performance and reliability.
- Integrate Databricks notebooks with tools such as Azure Data Factory and Synapse Analytics.
- Implement secret scopes and secure data access using the Databricks CLI and Azure Service Principals (see the sketch after this list).
- Collaborate with cross-functional teams to understand data requirements and deliver scalable solutions.
- Conduct performance tuning and troubleshoot data workflows.
- Document processes and maintain high standards of code quality and maintainability.
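For illustration only, the sketch below shows the flavor of work these responsibilities describe: reading a storage credential from a Databricks secret scope and writing a cleansed, partitioned Delta table. It assumes a Databricks notebook environment (where `spark` and `dbutils` are predefined); the scope `etl-scope`, key `storage-key`, storage account `mystorageacct`, and all paths are hypothetical placeholders, and the scope itself would be created beforehand with the Databricks CLI (`databricks secrets create-scope`; exact flags vary by CLI version).

```python
# Illustrative sketch; all names (scope, key, account, paths) are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("example-etl").getOrCreate()

# Fetch a storage credential from a Databricks secret scope.
# `dbutils` is provided by the Databricks notebook runtime.
storage_key = dbutils.secrets.get(scope="etl-scope", key="storage-key")
spark.conf.set("fs.azure.account.key.mystorageacct.dfs.core.windows.net",
               storage_key)

# Simple ETL: read raw JSON events, cleanse, and write a partitioned Delta table.
raw = spark.read.json("abfss://raw@mystorageacct.dfs.core.windows.net/events/")
clean = (
    raw.dropDuplicates(["event_id"])                 # data cleansing
       .filter(F.col("event_ts").isNotNull())
       .withColumn("event_date", F.to_date("event_ts"))
)
(clean.repartition("event_date")                     # fewer small files per partition
      .write.format("delta")
      .mode("overwrite")
      .partitionBy("event_date")
      .save("abfss://curated@mystorageacct.dfs.core.windows.net/events/"))
```

Partitioning by date with an explicit repartition before the write is a common first lever for the performance-tuning work mentioned above.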
Qualifications
- 5+ years of experience in data engineering or similar roles.
- Strong proficiency in Databricks, Apache Spark, and cloud platforms (Azure).
- Experience with data modeling, data cleansing, and warehousing at scale (1TB+).
- Familiarity with Agile delivery paradigms and software engineering best practices.
- Knowledge of core Python libraries (NumPy, Pandas, Matplotlib) is a plus.
Skills
- Databricks, Apache Spark
- Python, SQL, Scala
- Azure Data Factory, Synapse Analytics
- ETL, Data Modeling, Performance Tuning
- Cloud platforms: Azure
- CI/CD and Git integration workflows
Must-have and good-to-have skills: Databricks, Apache Spark, Azure Data Factory, Synapse Analytics
Number of years of experience needed: 3-5
Budget / target bill rate per hour: Max CL9 Bill Rate
No. of hours per week: 45
Shift Schedule: EMEA
Work Setup: Hybrid
Days required in the office per week: 2 (can be discussed)
Project Location (during onsite): UT3