Data Engineer – Onsite (Brooklyn, NY, USA)

Location: New York (In-person, Dumbo, Brooklyn)
Employment Type: Full-Time
Department: Engineering
Compensation: $180,000 – $280,000 + Equity

About Sunset

Sunset is building the data infrastructure layer for real-world AI training. The company partners with frontier AI labs to transform messy, multi-modal enterprise data into high-quality training datasets. The data is sourced from hundreds of venture-backed startups that have gone through wind-down processes.

Backed by investors such as Floodgate, Afore Capital, and Hustle Fund, Sunset is a fast-growing, in-person team based in Dumbo, Brooklyn. The mission is ambitious: turn fragmented, real-world data into structured intelligence that powers the next generation of AI systems.

Job Overview

As a Data Engineer at Sunset, you will own the systems that convert raw, unstructured, and often chaotic enterprise data into structured, high-value training data. A core challenge of this role lies in entity resolution and de-identification across diverse data sources and formats.

You won’t just process data. You’ll reconstruct relationships, map complex entity linkages, and help model the structure of real-world business interactions hidden inside disconnected datasets.

What You’ll Work On

You’ll take full ownership of complex problems from day one. In your first 90 days, you may:

  • Extend entity resolution systems to support new data types such as audio transcripts, design files, and embedded references in PDF documents
  • Build coreference resolution across Slack messages, email threads, and project management tools (e.g., Linear) so references like “me,” “him,” and named entities resolve correctly
  • Design de-identification systems that replace sensitive information (PII) with consistent pseudonyms while preserving relationships across datasets
  • Develop scalable ingestion pipelines for unfamiliar and evolving data formats
  • Tackle ambiguous data challenges where structure must be inferred, not provided
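
To make the "consistent pseudonyms" idea above concrete, here is a minimal sketch, assuming a keyed hash (HMAC-SHA256) is an acceptable way to derive pseudonyms. The function name `pseudonymize`, the `SECRET_KEY` value, and the `PERSON_` label format are hypothetical and for illustration only; nothing here describes Sunset's actual implementation.

```python
import hashlib
import hmac

# Hypothetical key for illustration; a real system would load this from a secrets store.
SECRET_KEY = b"example-key-not-for-production"


def pseudonymize(value: str, kind: str, key: bytes = SECRET_KEY) -> str:
    """Map a sensitive value to a stable pseudonym.

    The same (kind, value) pair always yields the same pseudonym, so links
    between records in different datasets survive de-identification.
    """
    # Normalize case and whitespace so trivial variants map to the same pseudonym.
    normalized = " ".join(value.lower().split())
    message = f"{kind}:{normalized}".encode("utf-8")
    digest = hmac.new(key, message, hashlib.sha256).hexdigest()
    return f"{kind.upper()}_{digest[:10]}"


# The same person appearing in a Slack export and an email thread.
slack_author = pseudonymize("Jane  Doe", "person")
email_sender = pseudonymize("jane doe", "person")
other_person = pseudonymize("John Roe", "person")
```

Because the mapping is keyed and deterministic, `slack_author` and `email_sender` come out identical while `other_person` differs. Matching variants such as nicknames or misspellings is an entity resolution problem that this sketch does not attempt.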

What We’re Looking For

  • Strong product-minded engineer with experience building and shipping data pipelines at scale
  • Advanced Python skills and familiarity with NER (Named Entity Recognition), record linkage, and coreference resolution
  • Comfortable working in ambiguous environments without detailed specifications
  • Someone who prefers end-to-end ownership over narrowly defined tasks
  • Deep integration of AI tools into your workflow and problem-solving approach

This Role May Not Be a Fit If

  • You prefer remote or hybrid work (this is fully in-person, 5 days/week in Brooklyn)
  • You are primarily focused on theoretical or research-heavy ML work
  • You prefer long planning cycles or narrowly scoped responsibilities

Tech Stack

Python, PostgreSQL, Redis, AWS
(Tools are selected based on problem fit rather than strict standardization.)

Compensation & Benefits

  • $180K – $280K base salary + meaningful equity
  • Fully covered medical, dental, and vision insurance
  • Unlimited PTO
  • $500 in-office setup stipend

Hiring Process

  1. Intro Chat (20 min): Mutual fit and expectations
  2. Technical Session (1 hour): Collaborative problem-solving exercise
  3. Onsite (2–3 hours): System design, product deep dive, and team interviews
  4. References → Offer

