We are seeking an experienced Full Stack Data Engineer with 5–6 years of industry
experience. The ideal candidate will have a proven track record of working on live projects,
preferably within the manufacturing or energy sectors. The candidate will play a key role in developing
and maintaining scalable data solutions using Databricks and related technologies.
Key Responsibilities:
- Develop and deploy end-to-end data pipelines and solutions on Databricks, integrating with various data sources and systems.
- Collaborate with cross-functional teams to understand the data and deliver effective BI solutions.
- Implement data ingestion, transformation, and processing workflows using Spark (PySpark/Scala), SQL, and Databricks notebooks.
- Develop and maintain data models and ETL/ELT processes, ensuring high performance, reliability, scalability, and data quality.
- Build and maintain APIs and data services to support analytics, reporting, and application integration.
- Ensure data quality, integrity, and security across all stages of the data lifecycle.
- Monitor, troubleshoot, and optimize pipeline performance in a cloud-based environment.
- Write clean, modular, and well-documented Python/Scala/SQL/PySpark code.
- Integrate data from various sources, including APIs, relational and non-relational databases, IoT devices, and external data providers.
- Ensure adherence to data governance, security, and compliance policies.
Required Skills and Experience:
- Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
- 5–6 years of hands-on experience in data engineering, with a strong focus on Databricks and Apache Spark.
- Strong programming skills in Python/PySpark and/or Scala, with a deep understanding of Apache Spark.
- Experience with Azure Databricks.
- Strong SQL skills for data manipulation, analysis, and performance tuning.
- Strong understanding of data structures and algorithms, with the ability to apply them to optimize code and implement efficient solutions.
- Strong understanding of data architecture, data modeling, ETL/ELT processes, and data warehousing concepts.
- Experience building and maintaining ETL/ELT pipelines in production environments.
- Familiarity with Delta Lake, Unity Catalog, or similar technologies.
- Experience working with structured and unstructured data, including JSON, Parquet, Avro, and time-series data.
- Familiarity with CI/CD pipelines and tools such as Azure DevOps, version control (Git), and DevOps practices for data engineering.
- Excellent problem-solving skills, attention to detail, and the ability to work independently or as part of a team.
- Strong communication skills to interact with technical and non-technical stakeholders.
Additional Requirements:
- Experience with Delta Lake and Databricks Workflows.
- Exposure to real-time data processing and streaming technologies (Kafka, Spark Streaming).
- Exposure to Databricks Genie for data analysis and reporting.
- Knowledge of data governance, security, and compliance best practices.