Cabify

Data Engineer

Sept. 8, 2022

Anywhere

About the position

In Data Engineering, we operate dozens of services and pipelines, as well as our in-house machine learning platform. We are a hands-on team: we manage our own infrastructure, including our Kubernetes clusters.

This is an excellent opportunity to work at a company with a highly technological product that generates hundreds of thousands of events per second. That vast sea of data is not only stored and organized but also consumed to improve all aspects of the operation: pricing, dispatching, marketing, governance, and many others.

Cabify is the right size for you to have a real impact on the product: you will build and improve the platform that provides trusted data at scale. And you will do it as part of a team of very experienced data engineers who help each other grow technically and professionally.

You will:
Design and develop end-to-end data solutions and modern data architectures for Cabify products and teams (data lake, data warehouse, streaming ingestion…).
Evolve and maintain Lykeion, a Machine Learning platform developed together with the Data Science team that covers the whole lifecycle of models and features. It includes a feature store, which allows other teams inside Cabify to make better decisions based on data, and a prediction platform to serve ML models at scale.
Manage and evolve the data platform.
Continuously identify, evaluate and implement new tools and approaches to maximize development speed and cost efficiency.
Provide the company with data discoverability and governance.
Collaborate with other technical teams to define, execute and release new services and features.
Design and maintain complex APIs that expose data at scale and help other teams make better decisions.
Extract data from internal and external sources to empower our Data Analytics team.

What we’re looking for

We are looking for software engineers with a deep interest in data.
When we think about the ideal candidate, we look for:
At least 2 years of experience coding and delivering complex software projects, following best practices.
Experience with different programming languages (we work with Python, Scala, and Go, but you don’t need to know all of them).
Experience with different storage technologies (file-based, relational, columnar, document-based, NoSQL, key-value...).
Experience with cloud infrastructures (GCP, AWS, Azure or others).

Given our current stack of tools and technologies, we will value experience with any of the following:
Distributed systems (message brokers, batch/stream processing, APIs, etc.).
Modern data processing stacks, like Apache Beam, Apache Flink, Akka Streams, Hadoop or Spark.
Message delivery for stream processing: Kafka, RabbitMQ, NATS, MQTT.
Orchestration tools such as Airflow, Luigi, or Dagster.
Machine Learning, especially its production lifecycle / MLOps (features, models, training & evaluation processes, productionizing).
Automation/IaC tools (Terraform, Puppet, Ansible…).
Google Cloud Big Data products (Pub/Sub, Dataflow, Bigtable, BigQuery…).
Kubernetes.

The good stuff

We’re a company full of happy, motivated people and we never want that to change. Here are more reasons why it rocks to be part of our family.
Salary conditions: 35k - 48k
Very competitive stock options plan.
Remote position, or on-site/hybrid position at our Madrid HQ.
22 vacation days + 2 extra days for the Christmas period.
Recharge day: in addition to the vacation days just mentioned, the third Friday of every month is also a day off!
Personal development programs based on our career paths, and a budget for training.
Free rides for Cabify staff.
Flexible compensation plan: Restaurant Tickets, Transport Tickets, healthcare and childcare.
Regular team events.
All the equipment you need (you only have to bring your talent).
A pet room in the office, so you don’t have to leave your furry friend at home.
And last but not least… free coffee and fruit!

Join us!
