Data Scientist

Nov 14, 2023

New York

About Cybersyn

Cybersyn is a new DaaS (data-as-a-service) company backed by Sequoia, Coatue, and Snowflake. Our mission is to make the world's economic data transparent to governments, businesses, and entrepreneurs, and to enable a new generation of decision makers. We acquire unique data assets (companies, licenses, data rights, consumer dividends) and build derived products on top of them, focusing on measuring what consumers and businesses are spending money on. You can think of Cybersyn as a cross between an investment firm and a technology company focused on data. The reward is great: if we are successful, we will disrupt the traditional market intelligence space, an industry worth hundreds of billions of dollars, and build SimCity for the real world.

We have already released a fair number of public datasets that we have cleaned, restructured, and made joinable on the Snowflake Marketplace.

  • See our current data here.

  • Demo our data on our Streamlit App here.

About the role:

Cybersyn is looking for a Data Scientist to take on the challenges that arise in modernizing the world of economic data. You will be joining an incredibly talented team of fast-moving, product-oriented data scientists and engineers developing novel solutions to complex statistical problems and building out our data product vision.

What you will do:

  • Build derived data products that answer some of the most complex and interesting questions about the economy; in practice, this means:

    • Prototyping and implementing data processing pipelines and statistical models in Python/SQL/R that will ultimately contribute to our technical vision

    • Leveraging SQL, Python, dbt, and orchestration tools (e.g., Dagster)

    • Working closely with engineers, analytics engineers, and product managers to execute our roadmap

  • Report to the Head of Data Science and assist them in executing our data product vision.

Who you are:

  • A commercially minded data scientist who can balance technical rigor with fast execution and actionable results.

  • At least two years of hands-on experience developing statistical models and data pipelines to make sense of imperfect data. Read more on our thesis here and here.

  • Proven track record of executing practical research projects from beginning to end.

  • Prior experience with alternative, third-party data is strongly preferred.

  • Prior experience in any of the following fields is a plus: sampling and inference methods, panel data analysis, Bayesian data analysis, time series modeling, data normalization, numerical analysis

  • Experience in Python/R and SQL is required; experience with cloud data warehouses (Snowflake, BigQuery, Redshift, etc.) is ideal

    • You should have a good sense of what “clean code” looks like, with experience reviewing pull requests and developing coding standards

    • Prior experience working with big data is strongly preferred.

  • Experience with dbt, AWS, and GitHub is all very useful, but not strictly required

What you get out of it:

  • The ability to shape Cybersyn’s initial product and technology decisions, plus ownership of statistical methodologies and libraries

  • Access to some of the most interesting economic data in the world, including real-time spending, transaction, and clickstream data from both third-party and first-party sources. Much of our data is not available to any other third parties

  • A fast-moving culture with lots of responsibility and autonomy from day 1
