About Us

Data. Technology. Finance. Research. Talent.

Sitting at the intersection of technology and finance, we are an innovative investment fund with over $1B in assets under management on behalf of our investors, many of whom are non-profits, educational endowments, and pension funds.

We combine machine learning and data science, software engineering, and rigorous scientific investigation to build credit portfolios that produce strong and consistent yields across business cycles.

Our firm comprises 30+ professionals working in San Francisco (HQ) and remote locations. We are passionate, hard-working, relentlessly resourceful, impact-focused individuals. We deeply value intellectual curiosity, creative idea generation, empathy, and close collaboration. We have (currently virtual) coffee hours, game nights, and team get-togethers. Company events are inclusive and fun, with expeditions to food trucks, Michelin-starred local restaurants, and annual retreats featuring hiking and cooking.

The Role

As a Senior Data Engineer, you will be Theorem's subject-matter expert in data engineering.

A unique aspect of this position is that you will work alongside every person on our team: quantitative researchers who are our data scientists, and colleagues in finance and operations, investor relations and sales, capital markets and partnerships, as well as the firm's leadership.

Your job is to develop the systems and data pipelines that enable the shared data asset informing every decision of the firm.

What You’ll Do

  • Partner with stakeholders across Theorem to understand needs and data-driven workflows
  • Work with leadership to set data-related priorities for the firm
  • Design and develop new or improved capabilities for Theorem’s data infrastructure
  • Build complex data pipelines with very high demands for correctness and robustness
  • Develop robust software to support new and existing data projects and initiatives across the firm

Technically, you will own:

  • The creation of an end-to-end reporting pipeline that allows for the comparison of predicted whole loan asset performance against realized performance
  • Deprecation and turndown of legacy data pipelines & reporting systems
  • Deployment of Apache Spark and related distributed data processing infrastructure
  • Increased availability and reduced data staleness, driven by clear ETL ownership and automated alerting strategies
  • Integration of system operational data sources into the Theorem data warehouse
  • Quantifying and tracking data quality across all pipelines, and building technology to systematize improvements
  • Normalizing, standardizing, and cataloging all data used across the firm with an eye towards end-user discovery and accessibility
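To give a flavor of the data-quality and staleness tracking described above, here is a minimal illustrative sketch in Python; the table name, thresholds, and `QualityReport` structure are hypothetical, not Theorem's actual tooling:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

@dataclass
class QualityReport:
    """Per-table data-quality summary (illustrative only)."""
    table: str
    row_count: int
    null_fraction: float   # fraction of null values in key columns
    last_loaded: datetime  # timezone-aware load timestamp

    def is_stale(self, max_age: timedelta) -> bool:
        # A table is "stale" if its latest load is older than the SLA window.
        return datetime.now(timezone.utc) - self.last_loaded > max_age

    def passes(self, min_rows: int = 1, max_null_fraction: float = 0.05) -> bool:
        # Completeness/correctness gate: enough rows, few enough nulls.
        return self.row_count >= min_rows and self.null_fraction <= max_null_fraction

# Example: a recently loaded table with low null rates passes both checks;
# a failing check would trigger an alert to the pipeline's owner.
report = QualityReport(
    table="loan_performance",  # hypothetical table name
    row_count=120_000,
    null_fraction=0.01,
    last_loaded=datetime.now(timezone.utc) - timedelta(hours=2),
)
print(report.passes())                       # True
print(report.is_stale(timedelta(hours=24)))  # False
```

In practice, checks like these would run on a schedule per pipeline, with results logged and thresholds tied to the alerting strategy each data owner defines.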

What We’re Looking For

  • 5+ years of Data Engineering experience
  • Ability to partner with semi-technical / non-technical colleagues, from data scientists to C-level executives, and transform business requirements into technical solutions
  • Experience building automated reports, dashboards, & visualizations of curated data, e.g. SQL, Jupyter, Tableau, Looker
  • Experience assessing, implementing, and monitoring data validation and quality (correctness, completeness, availability, etc.)
  • Fluent in SQL and Python
  • Deep expertise in relational data modeling, schema design, and normalization
  • Experienced with containerized environments, e.g. Kubernetes, Docker

Expertise in a few of the following areas:

  • Distributed computation/query frameworks, e.g. Apache Spark, Databricks, Presto
  • Distributed columnar data warehouses, e.g. AWS Redshift, Google BigQuery, Snowflake
  • ETL and data cataloging frameworks, e.g. Hive/AWS Glue, Fivetran, dbt
  • DAG/workflow management tools, e.g. Argo, Airflow
  • Streams/queues/event sourcing, e.g. Kafka, AWS MSK

Characteristics to Thrive

  • Hardworking and gritty
  • Ethical, intellectually honest, and transparent
  • High attention to detail
  • Proactive communicator
  • Enjoys working in small, high-impact teams
  • Seeks end-to-end ownership of outcomes
  • Has a bias for action and moves fast to solve problems
  • Welcomes and adapts behavior to feedback
  • Collaborative and team success-oriented

Our Commitment

We foster an environment that welcomes professionals with a diversity of backgrounds and ideas. We value professionals who are thoughtful, innovative, tenacious, and mission-driven. Every member of the team has a major impact on the company’s success with visible contributions to the business. We encourage and reward a growth, learning, and solutions-seeking mindset. We offer a competitive salary and opportunity for equity ownership, generous benefits, and an inclusive and collaborative work environment. If you’re excited by the above, we strongly encourage you to apply.

Tagged as: AI, C++, Data, Databricks, Docker, ETL, Go, Hive, Kafka, Kubernetes, Machine Learning, Python, R, Redshift, Spark, SQL, Tableau

