Scaling PostgreSQL at Thumbtack: Load Balancing And Health Checks

By Marco Almeida, Site Reliability Engineer at Thumbtack.


Introduction

Running PostgreSQL on a single primary node is simple and convenient. There is a single source of truth, one instance to handle all reads and writes, one target for all clients to connect to, and only a single configuration file to maintain. However, such a setup rarely lasts forever. As traffic increases, so does the number of concurrent reads and writes, the read/write ratio may become too high for one node to serve, and a fast and reliable recovery plan needs to exist. The list goes on.

No single approach solves all possible scaling challenges, but there are quite a few options for scaling PostgreSQL depending on the requirements. When the read/write ratio is high enough, there is a fairly straightforward scaling strategy: set up secondary PostgreSQL nodes (replicas) that stream data from the primary node (master), and split SQL traffic by sending all writes (INSERT, UPDATE, DELETE, and upserts) to the single master node and all reads (SELECT) to the replicas. There can be many replicas, so this strategy scales better the higher the read/write ratio is. Replicas are also valuable for disaster recovery, since one of them can be promoted to master in the event of a failure.
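As a rough illustration of the split described above, the following Python sketch sends anything that is not a plain SELECT to the master and rotates reads across the replicas. The Router class and the node names are hypothetical, and a real router would also have to handle cases this one ignores, such as SELECT ... FOR UPDATE or reads issued inside a write transaction.

```python
import itertools


class Router:
    """Hypothetical router: writes go to the master, reads rotate over replicas."""

    def __init__(self, master, replicas):
        self.master = master
        # Fall back to the master when no replicas are configured.
        self._replicas = itertools.cycle(replicas or [master])

    def target_for(self, sql):
        words = sql.split()
        first_word = words[0].upper() if words else ""
        if first_word == "SELECT":
            return next(self._replicas)
        # Anything that is not a plain SELECT is treated as a write,
        # which is the safe default for statements we do not recognize.
        return self.master
```

Treating every unrecognized statement as a write costs some master capacity but never sends a write to a read-only node.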

Context

In 2014, Thumbtack was running PostgreSQL 9.1 on two servers: a basic master-slave setup leveraging PostgreSQL’s built-in streaming replication. Our infrastructure consisted of a few dozen physical machines on SoftLayer running RHEL 5, and we were using HAProxy with Keepalived for load balancing. The future, already being planned for, would be powered by EC2 instances on AWS, running Debian 7 behind Elastic Load Balancers.

As traffic grew, we knew we would need to scale out PostgreSQL further. Thumbtack’s SQL traffic was (and still is) quite read-intensive, with fewer than 3% of all queries being executed on the master node. This was good news, as it meant we could scale out by sending SELECT statements to a cluster of read-only replicas and leaving the master alone to process DML commands.

To implement this properly, we needed:

  • an arbitrary number of read-only replicas behind a load balancer;
  • a load balancer that was not itself a single point of failure;
  • a way of performing health checks on each server, executed from the load balancer, so that nodes would be taken out of rotation when they failed and put back automatically once they recovered;
  • support for both the SoftLayer and AWS environments during the transition period.

Replication, high-availability, and load-balancing

We knew what we wanted the infrastructure to look like from a high-level perspective and had the tools available to implement almost all of it on both providers (Fig. 1).

Figure 1: Thumbtack Postgres architecture

One critical detail, however, was far from being a solved problem: health checks.

A basic ping on port 5432 was not enough. Performance and replication lag were (and still are!) very important factors for us: if a given replica lags behind by more than N seconds (where N varies with the database and the cluster we’re connecting to), we prefer not to use it until it recovers, as it would otherwise serve stale reads.

Custom health checks

Not having found an open source tool that implemented sufficiently powerful health checks for PostgreSQL, we decided to write our own. These were the requirements:

  1. Work equally well in both environments: RHEL 5/HAProxy on SoftLayer and Debian 7/ELBs on AWS
  2. Check basic TCP connectivity, on an arbitrary port, with a configurable timeout
  3. Check server availability by running a test query with a time limit. If a server is under load, it may respond to TCP but be unable to process a simple query (SELECT 1). We need to distinguish between these two scenarios, and potentially take different actions
  4. Check replication lag (time elapsed since the last transaction was replayed)
  5. Support custom health checks in the form of SQL queries — extensible and future-proof
  6. Low memory footprint — avoid “stealing” memory from PostgreSQL
  7. Minimal list of external dependencies
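Requirements 2 and 3 can be illustrated with a minimal Python sketch (an illustration only, not the actual implementation). The names tcp_check and classify, and the three state labels, are assumptions made for this example.

```python
import socket


def tcp_check(host, port, timeout_seconds):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def classify(tcp_ok, query_ok):
    """Separate 'cannot connect' from 'connects but cannot answer a query'."""
    if not tcp_ok:
        return "unreachable"   # hypothetical label: nothing listening or network issue
    if not query_ok:
        return "overloaded"    # hypothetical label: TCP works, test query failed or timed out
    return "healthy"
```

Keeping the two results separate is what allows different actions for each scenario, for example alerting on an unreachable node while simply waiting out an overloaded one.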

A web service exposing a simple HTTP endpoint would work in any environment and could easily test TCP connectivity. Running a simple query and testing replication lag are just special cases of running arbitrary SQL queries as health checks, so we focused on that general mechanism and implemented the others as syntactic sugar.
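One way to picture this design is the Python sketch below, where the simple-query and replication-lag checks are thin wrappers around a generic SQL check. SqlCheck, run_checks, and the run_query callback are hypothetical names. The lag query relies on pg_last_xact_replay_timestamp(), a built-in PostgreSQL function that returns NULL when no transactions have been replayed.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SqlCheck:
    name: str
    query: str
    passes: Callable[[object], bool]  # judges the single value the query returns


def simple_query_check():
    """Sugar: the server can answer a trivial query."""
    return SqlCheck("select_1", "SELECT 1", lambda value: value == 1)


def replication_lag_check(max_lag_seconds):
    """Sugar: seconds since the last replayed transaction are within the limit."""
    query = "SELECT extract(epoch FROM now() - pg_last_xact_replay_timestamp())"
    return SqlCheck(
        "replication_lag",
        query,
        lambda lag: lag is not None and lag <= max_lag_seconds,
    )


def run_checks(checks, run_query):
    """Return the names of failed checks.

    run_query(sql) is supplied by the caller; it returns the query's single
    value and raises on errors or timeouts.
    """
    failed = []
    for check in checks:
        try:
            ok = check.passes(run_query(check.query))
        except Exception:
            ok = False  # an error or timeout counts as a failed check
        if not ok:
            failed.append(check.name)
    return failed
```

A cluster with special needs would then add its own SqlCheck entries next to the two built-in ones.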

Programming languages

One important decision for delivering a platform-independent solution with a low memory footprint and minimal dependencies was the choice of programming language. We considered a few, from Python (there was already a reasonably large Python code base at Thumbtack) to Go (we were taking our first steps with it) and even Rust (too immature at the time).

We ended up writing it in C. It was easy to meet all requirements with only one external dependency (for the web server), there were no challenges running it on any of the Linux distributions we maintained, and it was arguably the implementation with the smallest memory footprint among the choices above.

The final result

We named the project pgDoctor and made it publicly available on our GitHub repository. It uses libmicrohttpd to implement a very simple web service that listens on port 8071, logs to the local7 syslog facility (configurable), and provides a reasonably rich set of configuration parameters. The behavior is quite simple: an HTTP GET request to port 8071 returns 200 if all checks pass, 500 otherwise. All errors are logged.
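The request/response contract described above can be sketched in Python with the standard library's http.server. pgDoctor itself is written in C, so this only mirrors the behavior, not the code. The names make_handler and serve and the response bodies are assumptions, and syslog logging is left out.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def make_handler(all_checks_pass):
    """Build a handler class; all_checks_pass() -> bool is supplied by the caller."""

    class HealthHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            # 200 when every check passes, 500 otherwise.
            status = 200 if all_checks_pass() else 500
            body = b"OK\n" if status == 200 else b"FAILED\n"
            self.send_response(status)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # per-request logging is silenced in this sketch

    return HealthHandler


def serve(all_checks_pass, port=8071):
    """Serve health checks on the given port until interrupted."""
    HTTPServer(("", port), make_handler(all_checks_pass)).serve_forever()
```

Because the result is carried entirely by the status code, any load balancer that supports HTTP health checks can consume it without parsing a response body.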

pgDoctor has been running flawlessly on all our PostgreSQL replicas for roughly 3 years now, having gone through two major upgrades (9.1 → 9.4 → 9.6). As of now, there are 18 streaming replicas, all running pgDoctor alongside PostgreSQL, and distributed among 4 clusters. Each cluster supports different use cases and requires slightly different health checks.

PostgreSQL replicas are sometimes taken out of rotation. The most common reasons are temporary high replication lag or some transient issue with the underlying EC2 instance. As expected, they are added back to the cluster without any intervention once normality is restored and the health checks succeed.

Figure 2 shows a diagram of (a downsized version of) our production environment:

  • Three availability zones;
  • One master node and two hot-standby instances on different availability zones;
  • Three clusters of read-only replicas, streaming from the master, each with its own load balancer;
  • Several clients, on all availability zones, reading from one or more clusters and writing to the master.

Figure 2: Thumbtack Postgres architecture

Does this sound interesting? There is a lot more to be done. Join Thumbtack and help us build, scale, and operate a high reliability service!


Originally posted on Thumbtack Engineering

Actual offered salaries will vary and will be based on various factors, such as calibrated job level, qualifications, skills, competencies, and proficiency for the role.</span></p> <div>For candidates living in all other US locations, the expected salary range for this role is currently $170,000 - $215,000. Actual offered salaries will vary and will be based on various factors, such as calibrated job level, qualifications, skills, competencies, and proficiency for the role.</div> <p><span style="font-weight: 400;">#LI-Remote</span></p><div class="content-conclusion"><div class="p-rich_text_section"><strong data-stringify-type="bold">Benefits &amp; Perks</strong></div> <ul class="p-rich_text_list p-rich_text_list__bullet" data-stringify-type="unordered-list" data-indent="0" data-border="0"> <li style="font-weight: 400;">Virtual-first working model coupled with in-person events</li> <li style="font-weight: 400;">20 company-wide holidays including a week-long end-of-year company shutdown</li> <li data-stringify-indent="0" data-stringify-border="0">Libraries (optional use collaboration &amp; connection hubs)<strong data-stringify-type="bold"> </strong><span style="font-weight: 400;">in San Francisco and Salt Lake City&nbsp;</span><span style="font-weight: 400;">&nbsp;</span></li> <li style="font-weight: 400;"><span style="font-weight: 400;">WiFi reimbursements&nbsp;</span></li> <li style="font-weight: 400;"><span style="font-weight: 400;">Cell phone reimbursements (North America)&nbsp;</span></li> <li style="font-weight: 400;"><span style="font-weight: 400;">Employee Assistance Program for mental health and well-being&nbsp;</span></li> </ul> <p><span style="font-weight: 400;"><strong>Learn More About Us </strong></span></p> <ul> <li><a href="https://medium.com/life-thumbtack"><span style="font-weight: 400;">Life @ Thumbtack Blog </span></a><span style="font-weight: 400;">&nbsp;&nbsp;</span></li> <li style="font-weight: 400;"><a 
href="https://www.youtube.com/watch?v=vkdNjVKtpB4"><span style="font-weight: 400;">How Thumbtack is embracing virtual work</span></a><span style="font-weight: 400;"> &nbsp;</span></li> <li style="font-weight: 400;"><a href="https://www.linkedin.com/company/thumbtack-inc.">Follow us on LinkedIn </a>&nbsp;</li> <li style="font-weight: 400;"><a href="https://www.youtube.com/watch?v=khIN2FCdfnM&amp;list=PLPz1npTzydS577c1B-vduYBWoF0cRPKMv">Meet the pros who inspire us</a></li> </ul> <p><span style="font-weight: 400;">Thumbtack embraces diversity. We are proud to be an equal opportunity workplace and do not discriminate on the basis of sex, race, color, age, pregnancy, sexual orientation, gender identity or expression, religion, national origin, ancestry, citizenship, marital status, military or veteran status, genetic information, disability status, or any other characteristic protected by federal, provincial, state, or local law. We also will consider for employment qualified applicants with arrest and conviction records, consistent with applicable law.&nbsp;</span></p> <p><span style="font-weight: 400;">Thumbtack is committed to working with and providing reasonable accommodation to individuals with disabilities. 
If you would like to request a reasonable accommodation for a medical condition or disability during any part of the application process, please contact: </span><a href="mailto:recruitingops@thumbtack.com"><span style="font-weight: 400;">recruitingops@thumbtack.com</span></a><span style="font-weight: 400;">.&nbsp;</span></p> <p><span style="font-weight: 400;">If you are a California resident, please review information regarding your rights under California privacy laws contained in Thumbtack’s Privacy policy available at </span><a href="https://www.thumbtack.com/privacy/"><span style="font-weight: 400;">https://www.thumbtack.com/privacy/</span></a><span style="font-weight: 400;"> .</span></p> <h5 id="b81c" class="ht hu dv hv b eu hw hx hy ex hz ia ib ic id ie if ig ih ii ij ik il im in io dn es"></h5></div>