Coupang

Staff II, Data Infra Engineer

Sep 08, 2023

Seattle, USA

About us 

The Data Infrastructure team in the Data Platform organization operates Hadoop clusters, data warehouses, and orchestration platforms across the entire business domain. The infrastructure managed by the team supports data processing for Coupang services, including log analysis, recommendations, price comparison, A/B testing, search indexing, advertising, and Rocket Delivery. We are confident that we play a pivotal role in delivering the best e-commerce experience to Coupang's customers.

The Data Infrastructure team simultaneously operates hundreds of Hadoop clusters in cloud environments and has the infrastructure and know-how to scale reliably to thousands of nodes. To this end, we use open-source technologies such as Hadoop, Spark, Hive, Presto, Airflow, Oozie, Docker, Kubernetes, Ansible, Terraform, and Packer, with Java and Python as our development languages.

Our vision is to provide modern self-service tools that enhance engineering productivity, make it easy for anyone to use our data, and build a platform that is consistently scalable and reliable. We are looking for a software engineer who will create the most robust platform services at a global level, beyond Korea!

Responsibilities: 

With a deep understanding of big data technology and strong development skills, the Staff Back-end Engineer performs the following tasks:

  • Provide a roadmap and vision for the scalable and robust growth of the Data Infrastructure team

  • Collaborate with stakeholders and lead engineers on key mission-critical projects 

  • Lead the design and deployment of big data infrastructure architectures

  • Maintain Hadoop and data warehouse infrastructure

  • Establish data warehouse governance and improve cluster operational efficiency

  • Develop self-service tools to improve cluster operational efficiency and user experience

  • As a Backend Engineer, participate in business/technical improvement projects from a data platform perspective 

  • Research modern data engineering technologies and develop products around them

 

Requirements: 

  • Bachelor's and/or master's degree in computer science or an equivalent field

  • Proficiency in at least one of Java, Scala, or Python

  • More than 10 years of experience in designing, developing, and maintaining large software infrastructures 

  • Expertise in distributed systems such as Hadoop and Spark

  • More than 3 years of experience developing and operating enterprise data warehouse platforms (Redshift, Netezza, Greenplum, Exadata, Teradata)

  • Great communication skills and enthusiasm for sharing your experience and learnings with colleagues

  • A drive to automate manual or repetitive tasks rather than keep doing them by hand

 

Preferred: 

  • Experience in designing and developing data pipelines in cloud environments such as AWS and GCP 
  • Experience in leading projects and initiatives with complex scope 
  • Strong SQL-writing skills and competency in OLTP and batch SQL tuning 
  • Strong experience with Redshift 
  • Experience with Snowflake and Databricks
