Job description:
Develop, test, and maintain big data solutions using Apache Spark.
Collaborate with data engineers, data scientists, and other stakeholders to understand data requirements and translate them into scalable data processing workflows.
Optimize Spark jobs for performance and scalability.
Write high-quality, clean, and maintainable code.
Monitor and troubleshoot production data pipelines.
Implement data quality checks and ensure data integrity.
Stay up to date with advancements in big data technologies.
Requirements:
Proven experience as a Spark Developer or in a similar role.
Strong knowledge of Apache Spark and its ecosystem (e.g., Spark SQL, the DataFrame API, Structured Streaming, MLlib).
Experience with big data processing frameworks such as Hadoop.
Proficiency in programming languages such as Scala, Java, or Python.
Experience with data storage and retrieval technologies such as HDFS, HBase, Cassandra, or similar.
Familiarity with data integration tools and ETL processes.
Understanding of distributed computing principles.
Skills:
Strong problem-solving and analytical skills.
Excellent communication and collaboration abilities.
Ability to work in a fast-paced, agile environment.
Knowledge of performance tuning and optimization techniques for Spark.
Experience with cloud platforms (AWS, Azure, Google Cloud) is a plus.
Familiarity with streaming technologies such as Apache Kafka, Apache Flink, or similar is a plus.