Data Engineer – Onsite (Brooklyn, NY, USA)

Location: New York (In-person, Dumbo, Brooklyn)
Employment Type: Full-Time
Department: Engineering
Compensation: $180,000 – $280,000 + Equity

About Sunset

Sunset is building the data infrastructure layer for real-world AI training. The company partners with frontier AI labs to transform messy, multi-modal enterprise data into high-quality training datasets. The data is sourced from hundreds of venture-backed startups that have gone through wind-down processes.

Backed by investors such as Floodgate, Afore Capital, and Hustle Fund, Sunset is a fast-growing, in-person team based in Dumbo, Brooklyn. The mission is ambitious: turn fragmented, real-world data into structured intelligence that powers the next generation of AI systems.

Job Overview

As a Data Engineer at Sunset, you will own the systems that convert raw, unstructured, and often chaotic enterprise data into structured, high-value training data. A core challenge of this role lies in entity resolution and de-identification across diverse data sources and formats.

You won’t just process data—you’ll reconstruct relationships, map complex entity linkages, and help model the structure of real-world business interactions hidden inside disconnected datasets.

What You’ll Work On

You’ll take full ownership of complex problems from day one. In your first 90 days, you may:

Extend entity resolution systems to support new data types such as audio transcripts, design files, and embedded references in PDF documents
Build coreference resolution across Slack messages, email threads, and project management tools (e.g., Linear) so references like “me,” “him,” and named entities resolve correctly
Design de-identification systems that replace sensitive information (PII) with consistent pseudonyms while preserving relationships across datasets
Develop scalable ingestion pipelines for unfamiliar and evolving data formats
Tackle ambiguous data challenges where structure must be inferred, not provided
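
To illustrate what "consistent pseudonyms" in the de-identification item above can mean in practice, here is a minimal Python sketch. It is not Sunset's implementation: the key, the regex, and the function names (`pseudonym`, `deidentify`) are hypothetical, and keyed hashing (HMAC) is just one possible technique. The point is that the same identifier maps to the same pseudonym in every source, so relationships across datasets survive de-identification.

```python
import hashlib
import hmac
import re

# Hypothetical key for illustration only; a real system would load it from a secrets store.
SECRET_KEY = b"example-key-not-for-production"

# Deliberately simple email pattern (an assumption, not a complete PII detector).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def pseudonym(value: str, kind: str = "person") -> str:
    """Map an identifier to a stable pseudonym via keyed hashing (HMAC-SHA256)."""
    normalized = value.strip().lower()  # normalize so case/whitespace variants collide
    digest = hmac.new(SECRET_KEY, normalized.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{kind}_{digest[:10]}"


def deidentify(text: str) -> str:
    """Replace every email address in the text with its pseudonym."""
    return EMAIL_RE.sub(lambda m: pseudonym(m.group(0), "email"), text)


# The same person appears in two different sources with different casing.
slack_msg = "Ping Jane.Doe@example.com about the invoice."
email_body = "From: jane.doe@example.com\nInvoice attached."

print(deidentify(slack_msg))
print(deidentify(email_body))
```

Both outputs carry the same `email_…` token, so a downstream entity-resolution step can still link the Slack message to the email thread without seeing the original address.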

What We’re Looking For

Strong product-minded engineer with experience building and shipping data pipelines at scale
Advanced Python skills and familiarity with NER (named entity recognition), record linkage, and coreference resolution
Comfortable working in ambiguous environments without detailed specifications
Someone who prefers end-to-end ownership over narrowly defined tasks
Deep integration of AI tools into your workflow and problem-solving approach

This Role May Not Be a Fit If

You prefer remote or hybrid work (this is fully in-person, 5 days/week in Brooklyn)
You are primarily focused on theoretical or research-heavy ML work
You prefer long planning cycles or narrowly scoped responsibilities

Tech Stack

Python, PostgreSQL, Redis, AWS
(Tools are selected based on problem fit rather than strict standardization.)

Compensation & Benefits

$180K – $280K base salary + meaningful equity
Fully covered medical, dental, and vision insurance
Unlimited PTO
$500 in-office setup stipend

Hiring Process

Intro Chat (20 min): Mutual fit and expectations
Technical Session (1 hour): Collaborative problem-solving exercise
Onsite (2–3 hours): System design, product deep dive, and team interviews
References → Offer

How to Apply?

If you are interested in this job, click here to apply now.
