Senior Data Platform Engineer

As a Data Platform Engineer, you will build our next-generation data platform and accompanying services. Our data pipelines are growing rapidly and currently move several terabytes of data per day from production databases and external providers into our data warehouse. We build foundational self-service systems that let end users create ETL flows and consume data in both batch and streaming fashion for machine learning, fraud prevention, A/B testing, and analytics.

Requirements

  • Exhibit our core cultural values: positive energy, clear communication, efficient execution, continuous learning
  • Experience building (data) backend systems at scale with parallel/distributed compute
  • Experience building microservices
  • Experience with Python and/or Java/Scala
  • Knowledge of SQL
  • A data-oriented mindset

Preferred (not required)

  • Computer Science or related engineering degree
  • Deep knowledge of Apache Airflow, Spark, Flink, Snowflake, Hadoop, Hive, Kafka/Kinesis

Responsibilities:

  • Data ingestion pipeline: Build our next-generation streaming ingestion pipeline for scale (10x data), speed (<1 minute of lag), and ease of use (<1 hour to add a new source). Read from a variety of upstream systems (MongoDB, Postgres, DynamoDB, MySQL, APIs) in both batch and streaming fashion, including change data capture for databases. Today we do this with Apache Airflow, Hadoop, Spark and a pure Kotlin service.
  • Self-service transformation engine: Build and maintain our self-service tooling that allows anybody at Coinbase to transform complex JSON and create dimensional models. Specific challenges include supporting Type 2 slowly changing dimensions, end-to-end testability, validation/monitoring/alerting, and fast incremental execution. Today we do this with Apache Airflow.
  • Data quality & transparency: Build a variety of tools and systems that help ensure the end-to-end flow of data is correct, efficient, and easily discoverable. This includes running and reporting on quality checks, tracking and displaying how data flows through different tables, detecting similar tables, reporting on table usage, and removing stale tables.
  • Anomaly detection: Build a comprehensive anomaly detection service that allows anybody at Coinbase to quickly set up notifications in order to detect process breakage.
  • Security: Build a security layer that authorizes data access at the row/column level. Build a logging and auditing system to surface suspicious data access patterns.

What to send:

  • A resume that describes scalable systems you’ve built

Coinbase

Buy and sell digital currency.

Technology we use

JavaScript
Python
Java
SQL
Go
Swift
Ruby
TypeScript
MySQL
PostgreSQL
MongoDB
React
Rails
Docker
Node.js