Senior Data Engineer

Cobalt

Data Science

New York, NY, USA

USD 140k-180k / year + Equity

Posted on May 5, 2026

Location

New York City

Employment Type

Full time

Location Type

On-site

Department

Engineering

Compensation

  • $140K – $180K • Offers Equity

Company Description

We have an internet for money, but we still can't tell real companies apart from fake ones. Cobalt ID is building the business identity infrastructure for the financial internet. While others focus on consumers, we separate real companies from synthetic ones.

With AI accelerating fraud rings and shell companies at global scale, distinguishing a legitimate business from a sophisticated fraudster is now one of the hardest problems in fintech. We’re mapping 100M+ businesses and counting to expose hidden financial crime networks and ensure real businesses are never left out of the financial ecosystem.

Role Description

Our knowledge graph is built by fusing data from hundreds of messy, heterogeneous sources - some of which are exclusively ours to access.

As a Senior Data Engineer, you'll own the data layer that makes everything else possible. You'll build the ingestion pipelines, entity resolution systems, and data quality infrastructure that connect raw source data to a unified view of every entity in our graph, in a way that's fast, accurate, and explainable for compliance.

The problems here are specific and mostly unsolved by the industry. Similar infrastructure powers leading social media platforms, search engines, and data fusion platforms, but hasn't yet been applied to this problem. If you're energized by turning chaos into structure at massive scale, this role is for you.

This is a full-time on-site role for a Senior Data Engineer located in New York, NY.

What you'll do:

  • Design and build production data pipelines that ingest, normalize, and link data from hundreds of heterogeneous sources

  • Build and maintain data quality infrastructure: monitoring, validation, deduplication, and freshness tracking across millions of data points

  • Develop the ingestion and processing layer for unstructured and semi-structured data, including document parsing and extraction from inconsistent sources

  • Instrument and monitor pipeline health, data coverage, and entity resolution accuracy as the system scales

  • Ship to production constantly - we're a small team and everything you build matters

  • Collaborate directly with founders and customers to shape what we build next

Base Qualifications

  • 4+ years building production data pipelines and infrastructure (we care more about skill and impact than years alone)

  • Experience with large-scale data processing. You've built ETL/ELT systems that handled messy, real-world data at meaningful volume

  • Hands-on experience with entity resolution, record linkage, or data deduplication. You understand the algorithmic and practical challenges of matching records across noisy sources

  • Strong fundamentals in data modeling and pipeline orchestration

  • Comfort with ambiguity and fast iteration in an early-stage environment

  • You care about data quality as a first-class engineering problem, not an afterthought

  • You want to be close to the problem and the customer, not siloed from product decisions

Preferred Qualifications

  • Experience ingesting and normalizing data across unstructured / semi-structured sources

  • Background in knowledge graph construction, graph databases, or large-scale entity graph systems

  • Experience with NLP or LLM-based approaches to entity resolution or document extraction

  • Background in fraud detection, identity systems, ads ranking, recommendation systems, or other domains that require profiling and linking entities at scale

  • Familiarity with data infrastructure on cloud platforms at production scale

The Team:

We’re a small, tight-knit, and highly technical team. Our founders and early team members come from Waymo, Google, Meta, Brex, and Virtu Financial. We value technical depth and curiosity, low ego, and fast execution.

We’ve partnered with investors who understand the plumbing of the global financial system. We recently raised a round led by Nyca Partners, with participation from operators who built the modern fintech stack at Ramp, Plaid, and Brex.
