Rapsodo Inc. is a sports analytics company that uses computer vision and machine learning to help all athletes maximize their performance. Our proprietary technology applications range from helping PGA Tour golfers optimize their launch conditions to allowing MLB pitchers to increase the efficiency of their breaking balls. Current partners include all 30 MLB teams, MLB, USA Baseball, Golf Digest, PGA of America, and over 1000 NCAA athletic departments.
Responsibilities:
- Lead the design, development, and maintenance of our comprehensive data warehouse architecture, integrating Google BigQuery, Kafka, GCP Pub/Sub, and other relevant technologies.
- Collaborate closely with business units to gather requirements and translate them into effective and scalable data solutions.
- Develop and optimize ETL processes that move data from diverse sources into our data warehouse, ensuring data quality and accuracy.
- Implement and manage real-time data streaming pipelines using Kafka and GCP Pub/Sub to enable rapid data ingestion and processing.
- Work alongside data scientists and analysts to provide them with clean, structured data for analysis and reporting purposes.
- Design and implement data governance strategies to ensure data security, compliance, and privacy.
- Monitor and troubleshoot data pipelines, identifying and resolving performance bottlenecks and data quality issues.
- Stay current with emerging technologies and trends in data engineering, proposing innovative solutions to enhance our data infrastructure.
Qualifications:
- Bachelor's or higher degree in Computer Science, Data Engineering, or a related field.
- Extensive experience as a Data Engineer, specializing in Google BigQuery, Kafka, GCP Pub/Sub, and related technologies.
- Strong knowledge of data warehouse architecture, ETL processes, and data integration methodologies.
- Proficiency in SQL and experience optimizing complex queries for performance.
- Solid understanding of event-driven architecture and real-time data streaming using Kafka and GCP Pub/Sub.
- Familiarity with cloud-based solutions, particularly Google Cloud Platform (GCP).
- Experience in designing and implementing data governance and security measures.
- Strong problem-solving skills and the ability to troubleshoot and resolve complex data-related issues.
- Excellent communication skills to collaborate effectively with technical and non-technical stakeholders.
- Leadership experience or the ability to guide junior team members is a plus.
- Relevant certifications in GCP, Google BigQuery, and Kafka are highly desirable.
If you think you have what it takes and look forward to working independently as well as contributing to an innovative, passionate, and driven environment, apply now!