We are looking for a Data Engineer Associate to join our Big Data & Analytics team in Madrid / Barcelona. You will be responsible for designing, building and maintaining big data architectures and data pipelines.
What we offer
- Interact with senior stakeholders on a regular basis to drive their business towards impactful change.
- Become the go-to person for end-to-end data handling, management and analytics processes.
- Work with Data Scientists to take data through its entire lifecycle: acquisition, exploration, cleaning, integration, analysis, interpretation and visualization.
- Become part of a fast-growing international and diverse team.
What you will do
- Design, build and maintain big data architectures and data pipelines.
- Define and implement software solutions for scalability and maintainability.
- Solve problems, break down issues, form hypotheses and develop actionable solutions to support the data engineering and data science teams.
- Keep up to date with the latest industry knowledge and news to maintain and improve enterprise and business platforms.
- Develop and implement CI/CD pipeline automation solutions.
- Support and improve agile culture in your team.
- Work closely with other teams to better understand end-user requirements.
What you’ll bring
- Relevant knowledge in defining and implementing data architectures optimized for performance, accuracy and maintainability.
- Relevant knowledge in defining, implementing and maintaining ETL processes.
- Good knowledge of implementing data cleaning and data transformation pipelines in a Spark environment with Python.
- Relevant expertise with SQL and Apache Spark environments.
- Solid grasp of OLAP architectures.
- (Optional) Previous experience with the AWS EMR platform.
- Technical skills in cloud-based software architectures and in deploying and managing cloud solutions.
- Ability to collaborate and follow best practices in Git- and DevOps-based projects.
- Solid grasp of the basics of CI/CD (continuous integration, deployment automation).
- Comfortable with schema-on-read databases (e.g. Redis).
- Comfortable with Kubernetes design and operations (Kubernetes object definitions and deployment, Helm charts).