Data Engineer (Remote)

Job Description

Wirecutter is seeking a Data Engineer to help build the infrastructure, data architecture, and pipelines that power our business. Data Engineers operate within a distributed, agile, cross-functional squad. The data squad has an organization-wide impact by providing the data that informs user experience, product, editorial, growth, and financial decisions at Wirecutter. The squad is responsible for the ETL processes, architecture, storage, reliability, accuracy, monitoring, and infrastructure surrounding our internal data and analytics.

About You:

  • You have 3+ years of experience in software or data engineering, including scaling large data sets.

  • You can design and optimize queries, data sets, and data pipelines to organize, collect, and standardize data that generates insights and addresses reporting needs.

  • You understand the challenges of reliable data replication, optimizing for a data warehouse, and maintaining the integrity of a data lake.

  • You have experience reliably integrating and handling data from multiple APIs.

  • You have experience building ETL pipelines at scale on a major cloud provider (e.g., AWS, or GCP with Cloud Composer, Kubernetes, etc.).

  • You are thoughtful, clear, and persuasive in writing and in person.

  • You have strong problem-solving skills and critical thinking abilities.

  • You have experience listening to analysts and other business users, and can translate their needs into actionable tasks.

  • You are excited to play a pivotal role in Wirecutter’s mission, innovation, and growth.

  • You are passionate and enthusiastic about what you do.

  • You have experience with version control, shell scripting, the Unix filesystem, and automating deployments.

  • Ideally, you have production experience with Python and Apache Airflow.

  • Ideally, you have experience with BI tools and managing data sets for BI tools.

  • Ideally, you have a basic understanding of statistics and sampling.

  • Ideally, you have experience working with Google Tag Manager and analytics data sets.

  • Ideally, you’ve worked as a member of a distributed team.

Our data engineering tech stack consists of:

  • Python and Apache Airflow for ETL pipelines

  • PostgreSQL database in AWS RDS and an S3 data lake

  • BigQuery for analytics data

  • Looker as our BI tool
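To give a concrete flavor of this stack, here is a minimal, hypothetical sketch of the kind of transform step an Airflow task in one of our Python ETL pipelines might perform. The record schema, field names, and source API shape are illustrative assumptions, not Wirecutter's actual data model.

```python
from datetime import datetime, timezone

def transform(raw_rows):
    """Standardize raw analytics rows into a consistent schema.

    Assumed (hypothetical) input shape: each row has string fields
    'Page', 'Clicks', and an ISO-8601 'Timestamp'. Output rows use
    lowercase keys and typed values, ready to load into a warehouse.
    """
    cleaned = []
    for row in raw_rows:
        cleaned.append({
            "page": row["Page"].strip().lower(),
            "clicks": int(row["Clicks"]),
            # Normalize all timestamps to UTC before loading.
            "ts": datetime.fromisoformat(row["Timestamp"]).astimezone(timezone.utc),
        })
    return cleaned

# Example input as it might arrive from an analytics API (assumed shape).
raw = [{"Page": " /Reviews/Best-Laptops ", "Clicks": "42",
        "Timestamp": "2023-05-01T12:00:00+00:00"}]
rows = transform(raw)
```

In practice, a function like this would be wired into an Airflow DAG as a single task, with separate extract and load tasks on either side of it.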

In this role, you will:

  • Help drive the optimization, testing, and tooling to improve data quality.

  • Write, debug, and test complex ETL processes for new or existing data pipelines.

  • Write and maintain database design and architecture documentation.

  • Maintain an understanding of Wirecutter’s data platforms as well as the data platforms of our parent company, The New York Times.

  • Uncover dependencies and leverage features and tools from both the Wirecutter and New York Times data platforms.

  • Collaborate with your squad leaders and stakeholders on scoping, planning, prioritizing, executing, and rolling out complex technical projects that provide the foundation for generating insights and meeting reporting needs.

  • Create new data models that are appropriately scalable, standardized, performant, and reliable.

  • Evolve our current data models from production services into formats readily consumable by all downstream data consumers.

  • Support and maintain the integrity and security of our internal data.

  • Provide insight into changing database storage and utilization requirements.

  • Recommend solutions that best align with our product and business goals, as well as the quality, reliability, and secure storage and replication of our data.

  • Improve our development workflow and infrastructure.

  • Share knowledge and problem solving with other members of your squad and the engineering team.

  • Contribute to engineering initiatives and culture as a member of Wirecutter’s engineering team.


Wirecutter is the best product recommendation service in the U.S.
