99% to 99.9% SLO: High Performance Kubernetes Control Plane at Pinterest

Pinterest
Pinterest's profile on StackShare is not actively maintained, so the information here may be out of date.

By Shunyao Li | Software Engineer, Cloud Runtime


Over the past three years, the Cloud Runtime team’s journey has gone from “Why Kubernetes?” to “How to scale?”. There is no doubt that the Kubernetes-based compute platform has achieved huge success at Pinterest. It supports big data processing, machine learning, distributed training, workflow engines, CI/CD, and internal tools, backing up every engineer at Pinterest.

Why Control Plane Latency Matters

As more and more business-critical workloads onboard onto Kubernetes, it is increasingly important to have a high-performance control plane that efficiently orchestrates every workload. Critical workloads such as content model training and ads reporting pipelines are delayed if it takes too long to translate user workloads into Kubernetes-native pods.

To measure control plane performance, we introduced top-line business metrics in the form of a Service Level Indicator and Objective (SLI/SLO) in early 2021. We measure the control plane SLI by reconcile latency, defined as the time from when a user change is received to when it propagates out of the control plane. For example, one of the reconcile latency measurements for batch jobs is the delay between workload creation and Pod creation.
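As a rough illustration of how a reconcile latency measurement could feed an SLI, here is a minimal sketch. It assumes the SLI is the fraction of reconciles that complete within a latency threshold; the post does not state the exact formula or threshold, so `threshold_s` and all names below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Reconcile:
    received_at: float    # seconds: when the user change was received
    propagated_at: float  # seconds: when the change propagated out of the control plane

    @property
    def latency(self) -> float:
        return self.propagated_at - self.received_at


def sli(reconciles, threshold_s):
    """Fraction of reconciles within threshold_s (an assumed definition, not the post's)."""
    if not reconciles:
        return 1.0  # no traffic: treat the window as healthy
    good = sum(1 for r in reconciles if r.latency <= threshold_s)
    return good / len(reconciles)


def meets_slo(sli_value, slo):
    """True when the measured SLI is at or above the objective, e.g. slo=0.999."""
    return sli_value >= slo
```

Under this assumed definition, a 99.9% SLO tolerates at most one slow reconcile per thousand, compared with ten per thousand at 99%.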

The initial SLO was set to 99%. At the time of writing this post, we are proudly serving a control plane SLO of 99.9%. This post is about how we improved the control plane to achieve high performance.

Control Plane in a Nutshell

The control plane is the nerve center of the Kubernetes platform and is responsible for workload orchestration. It listens to changes from the Kubernetes API, compares the desired state of resources with their actual status, and takes action to make the actual status match the desired state (reconciliation). Workload orchestration also includes making scheduling decisions about where to place workloads.

The Kubernetes control plane consists of a set of resource controllers. Our resource controllers are written in the controller framework, which has an informer-reflector-cache architecture. Informers use the List-Watch mechanism to fetch and monitor resource changes from the Kubernetes API. The reflector updates the cache with resource changes and dispatches events for handling. The cache stores resource objects and serves List and Get calls. The controller framework follows the producer-consumer pattern: the event handler is the producer and is responsible for queuing reconcile requests, while the controller worker pool is the consumer that pulls items from the workqueue and runs the reconciliation logic.
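The producer-consumer pattern described above can be sketched in a few lines. This is a simplified stand-in written with the Python standard library, not the actual controller framework; `WorkQueue` and `run_workers` are hypothetical names, and real workers run until the controller stops rather than exiting when idle.

```python
import queue
import threading


class WorkQueue:
    """Tiny stand-in for a controller workqueue: keys already waiting are de-duplicated."""

    def __init__(self):
        self._queue = queue.Queue()
        self._pending = set()
        self._lock = threading.Lock()

    def add(self, key):
        """Producer side: called by the event handler with an object key."""
        with self._lock:
            if key in self._pending:
                return  # already waiting; one reconcile will see the latest state
            self._pending.add(key)
        self._queue.put(key)

    def get(self, timeout):
        """Consumer side: block up to timeout seconds for the next key."""
        key = self._queue.get(timeout=timeout)
        with self._lock:
            self._pending.discard(key)
        return key


def run_workers(wq, reconcile, num_workers, idle_timeout=0.2):
    """Start a worker pool; each worker pulls keys and reconciles until the queue stays empty."""

    def worker():
        while True:
            try:
                key = wq.get(timeout=idle_timeout)
            except queue.Empty:
                return  # sketch only: a real worker would keep waiting
            reconcile(key)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

Queuing keys rather than full objects means a worker always reconciles against the latest cached state, which is why duplicate keys can be collapsed safely.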

Figure 1: Kubernetes Controller Framework

Challenge 1: Worker Pool Efficiency

The controller worker pool is where the actual status to desired status reconciliation occurs. We leveraged the metrics provided by the workqueue package to gain a deep insight into the worker pool efficiency. These metrics are:

  • Work duration: how long it takes to process an item from workqueue
  • Queue duration: how long an item stays in workqueue before being processed
  • Enqueue rate: how often an item gets enqueued
  • Retry rate: how often an item gets retried
  • Queue depth: current depth of workqueue
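To make the metrics above concrete, the following sketch instruments a toy single-worker queue. It is an illustration only and does not use the workqueue package; `InstrumentedQueue` and its fields are hypothetical names, and retry tracking is omitted for brevity.

```python
import time
from collections import deque


class InstrumentedQueue:
    """Toy FIFO queue that records depth, enqueue count, queue duration and work duration."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._items = deque()        # (key, enqueued_at)
        self.adds = 0                # total enqueues; enqueue rate = adds / elapsed time
        self.queue_durations = []    # seconds each item waited before processing
        self.work_durations = []     # seconds each item took to process

    def depth(self):
        return len(self._items)

    def add(self, key):
        self.adds += 1
        self._items.append((key, self._clock()))

    def process_next(self, reconcile):
        key, enqueued_at = self._items.popleft()
        started = self._clock()
        self.queue_durations.append(started - enqueued_at)
        reconcile(key)
        self.work_durations.append(self._clock() - started)
```

With one worker, an item's queue duration includes the work duration of everything ahead of it, which is the head-of-line effect discussed next.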

Among these metrics, queue depth drew our attention because its spikes correlate strongly with control plane performance degradation. Spikes in queue depth indicate head-of-line blocking, which usually happens when a large number of irrelevant items are enqueued in a short period of time. Items that really need to be reconciled then wait in the queue longer, causing SLI dips.

Figure 2: Correlation between control plane queue depth spikes and control plane instant SLI dips.

To resolve the head-of-line blocking, we categorize informer events and handle them with different priorities. User-triggered events have high priority and need to be reconciled immediately, e.g., Create events triggered by users creating workloads or Update events triggered by users updating the labels of workloads. On the other hand, some system-triggered events are low priority, e.g., a Create event during informer initialization, or an Update event during informer periodic resync. They don’t affect our SLI and are not as time-sensitive as user-triggered events, so they can be delayed to keep them from piling up in the queue and blocking urgent events. The following sections describe how to identify and delay these system-triggered events.

Create Events During Informer Initialization

Each time we update the controller, the informer initializes its List-Watch mechanism by issuing a List call to the API server. It then stores the returned results in its cache and triggers a Create event for each result, which causes a spike in queue depth. The solution is to delay any subsequent Create events for existing objects: an object cannot be created twice by the user, so any subsequent Create event for it must come from informer initialization.

Figure 3: Control plane queue depth spikes to 10k during an informer initialization, resulting in a dip in control plane instant SLI.

Update Events During Informer Periodic Resync

Periodically, the informer goes over all items remaining in its cache, triggering an Update event for each item. These events are enqueued at the same time and result in a queue depth spike. As shown in Figure 2, the queue depth spike aligns with the informer periodic resync interval we configured.

Update events triggered by periodic resync are easy to identify: the old and new objects are always identical because both come from the informer cache. The solution is to delay Update events whose old and new objects are deep equal. The delay is randomized, so resync requests are scattered over a period of time and queue depth spikes are smoothed out.
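The two rules from this and the previous subsection can be sketched as a classifier plus a delaying queue. This is a minimal sketch under stated assumptions: objects are plain dicts, the hypothetical `known_keys` set stands in for whatever check recognizes an already-existing object (the post does not say how this is detected), the same randomized delay is applied to both cases, and `max_delay_s` is an arbitrary value.

```python
import heapq
import random


def enqueue_delay(event_type, old_obj, new_obj, key, known_keys,
                  max_delay_s=60.0, rng=random):
    """Return seconds to wait before enqueuing this event; 0.0 means enqueue now."""
    if event_type == "create" and key in known_keys:
        # Create for an object that already exists: replay from informer initialization.
        return rng.uniform(0.0, max_delay_s)
    if event_type == "update" and old_obj == new_obj:
        # Deep-equal old and new objects: periodic resync, nothing changed.
        return rng.uniform(0.0, max_delay_s)
    return 0.0  # user-triggered change: reconcile immediately


class DelayingQueue:
    """Keys become visible only once their ready time has passed."""

    def __init__(self):
        self._heap = []  # (ready_at, sequence, key); sequence keeps ordering stable
        self._seq = 0

    def add_after(self, key, delay_s, now):
        heapq.heappush(self._heap, (now + delay_s, self._seq, key))
        self._seq += 1

    def pop_ready(self, now):
        """Return every key whose delay has elapsed, earliest first."""
        ready = []
        while self._heap and self._heap[0][0] <= now:
            ready.append(heapq.heappop(self._heap)[2])
        return ready
```

Because each delayed event draws its own random delay, a resync that touches thousands of objects turns into a steady trickle instead of a single burst, while user-triggered events skip the delay entirely.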

Result

The above optimizations solved the head-of-line blocking problem caused by inefficient worker pools. As a result, there are no longer recurring spikes in control plane queue depth. The average queue depth during informer periodic resync has been reduced by 97%, from 1k to 30. The instant SLI dips caused by the control plane queue depth spikes have been eliminated.

Figure 4: Improvement on workqueue efficiency

Challenge 2: Leadership Switch

Only the leader in the controller fleet does the actual reconciliation work, and leadership switches happen frequently during deployments or controller pod evictions. A prolonged leadership switch can have a considerable negative impact on the control plane instant SLI.

Figure 5: Control plane leadership switches result in instant SLI dips.

Leader Election Mechanisms

There are two common leader election mechanisms for the Kubernetes control plane.

  • Leader-with-lease: the leader pod periodically renews a lease and gives up leadership when it cannot renew the lease. Kubernetes native components, including cluster-autoscaler, kube-controller-manager, and kube-scheduler, use the leader-with-lease mechanism in client-go.
  • Leader-for-life: the leader pod only gives up leadership when it is deleted and its dependent configmap is garbage collected. The configmap is the source of truth for leadership, so it is impossible to have two leaders at the same time (a.k.a. split brain). All resource controllers in our control plane use the leader-for-life election mechanism from the operator framework to ensure we have at most one leader at a time.

In this post, we focus on the optimization of the leader-for-life approach to reduce control plane leadership switch time and improve control plane performance.
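The leader-for-life idea can be illustrated with an in-memory model. This is a sketch only: `FakeCluster` is a hypothetical stand-in for the API server, not a Kubernetes client, and it models just two behaviors described above, an atomic create of the lock configmap and garbage collection of that configmap once its owner pod is gone.

```python
class FakeCluster:
    """In-memory stand-in for the API server (hypothetical; for illustration only)."""

    def __init__(self, pods):
        self.pods = set(pods)
        self.lock_owner = None  # pod that owns the lock configmap, if one exists

    def create_lock(self, pod):
        """Atomic create: fails if the lock configmap already exists."""
        if self.lock_owner is not None:
            return False
        self.lock_owner = pod
        return True

    def delete_pod(self, pod):
        self.pods.discard(pod)

    def garbage_collect(self):
        """Remove the lock configmap once its owner pod no longer exists."""
        if self.lock_owner is not None and self.lock_owner not in self.pods:
            self.lock_owner = None


def try_become_leader(cluster, pod):
    """One election attempt; a candidate retries with backoff until this returns True."""
    return cluster.create_lock(pod)
```

Note that deleting the leader pod alone does not free the lock in this model. A standby can only win after garbage collection runs, which is why the leaderless window depends on components outside the controller.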

Monitoring

To monitor the leadership switch time, we implemented fine-grained leadership switch metrics with the following phases:

  • Leaderless: when there is no leader
  • Leader ramp-up: the time from a controller pod becoming leader to its first reconciliation. The new leader pod cannot begin to reconcile as soon as it becomes the leader; instead, it must wait until all relevant informers are synchronized.
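The two phases can be derived from three timestamps per switch. The sketch below assumes those timestamps are available; the field names are hypothetical and not taken from our metrics implementation.

```python
from dataclasses import dataclass


@dataclass
class LeadershipSwitch:
    old_leader_gone_at: float     # seconds: previous leader stopped leading
    new_leader_elected_at: float  # seconds: a standby pod became leader
    first_reconcile_at: float     # seconds: new leader ran its first reconciliation

    @property
    def leaderless_s(self) -> float:
        """Time with no leader at all."""
        return self.new_leader_elected_at - self.old_leader_gone_at

    @property
    def ramp_up_s(self) -> float:
        """Time the new leader spent waiting for informers to synchronize."""
        return self.first_reconcile_at - self.new_leader_elected_at

    @property
    def total_s(self) -> float:
        return self.leaderless_s + self.ramp_up_s
```

Splitting the total this way shows which phase dominates a slow switch, so each phase can be optimized separately.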

Figure 6: Diagram of the leadership switch procedure

Figure 7: Control plane leadership switch monitored by the proposed leadership switch metrics

As shown in Figure 7, the control plane leadership switch usually takes more than one minute to complete, which is unacceptable for a high-performance control plane. We proposed the following solutions to reduce the leadership switch time.

Reduce Leaderless Time

The leader-for-life package hardcoded the exponential backoff interval between attempts to become a leader, starting at 1s and growing to a maximum of 16s. When a container requires some time to initialize, the backoff always reaches the 16s maximum. We made the backoff interval configurable and reduced it to fit our situation. We also contributed our solution back to the operator framework community.
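A configurable capped exponential backoff might look like the sketch below. The 1s start and 16s cap come from the description above; the doubling factor and the parameter names are assumptions made for illustration.

```python
from itertools import islice


def backoff_intervals(initial_s=1.0, max_s=16.0, factor=2.0):
    """Yield wait times between leadership attempts: grows by factor, capped at max_s."""
    delay = initial_s
    while True:
        yield min(delay, max_s)
        delay = min(delay * factor, max_s)


def first_intervals(n, **kwargs):
    """Convenience helper: the first n intervals as a list."""
    return list(islice(backoff_intervals(**kwargs), n))
```

With the defaults, `first_intervals(6)` gives 1, 2, 4, 8, 16, 16 seconds, so a standby that misses the lock release by a moment can wait up to 16s before trying again. Lowering `max_s` shrinks that worst case at the cost of more frequent attempts against the API server.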

Reduce Leader Ramp-up Time

During the leader ramp-up time, each resource informer in each cluster initiates a List call to the API server and synchronizes its cache with the returned results. The leader will only start reconciliation when all informer caches are synchronized.

Preload Informer Cache

One way to reduce the leader ramp-up time is to have standby controller pods preload their informer caches. In other words, initializing the informer caches is no longer exclusive to the leader but happens in every controller pod upon its creation. Note that registering event handlers is still exclusive to the leader; otherwise we would suffer from split brain.

Use Readiness Probe to Ensure Graceful Rolling Upgrade

The informer cache preload procedure runs in the background and does not block a standby pod from becoming the leader. To enforce the blocking, we define an HTTP GET readiness probe that periodically checks whether all informer caches are synchronized. With a rolling upgrade strategy, the old leader pod is killed only after the new standby pod is ready, which ensures the new pod is always warmed up when it becomes the leader.
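The readiness check could be served by a small HTTP endpoint like the one sketched here. This is a standard-library illustration rather than our controller code; the `/readyz` path, the port, and the `get_sync_state` callback (returning a dict of informer name to synced flag) are hypothetical.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def readiness_status(informers_synced):
    """Return 200 only when every informer cache reports synced; otherwise 503."""
    if informers_synced and all(informers_synced.values()):
        return 200
    return 503  # also covers the case where no informer has registered yet


def make_handler(get_sync_state, path="/readyz"):
    """Build a request handler that answers the probe from the current sync state."""

    class ReadinessHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            status = readiness_status(get_sync_state()) if self.path == path else 404
            self.send_response(status)
            self.end_headers()

        def log_message(self, *args):
            pass  # keep periodic probe traffic out of the logs

    return ReadinessHandler


def serve(get_sync_state, port=8080):
    """Blocking call; run it in a background thread next to the informers."""
    HTTPServer(("", port), make_handler(get_sync_state)).serve_forever()
```

Returning 503 until every cache is synced keeps the pod out of the Ready state, so the rolling upgrade does not proceed to terminate the old leader early.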

Result

Table 1: Improvement on control plane leadership switch monitored by the proposed leadership switch metrics (4 observations before and after)

Table 1 shows the improvement in the control plane leadership switch. The average leadership switch time decreased from 64s to 10s, an 85% improvement.

What’s Next

With these efforts, we revamped the control plane performance and raised its SLO from 99% to 99.9%. This is a huge milestone for the Kubernetes-based compute platform, demonstrating unprecedented reliability and availability. We are working on achieving higher SLOs and have identified the following areas where the control plane performance can be further improved.

  • Proactive leadership handover: The leadership handover in leader-for-life is passive because it depends on external components observing the leader's deletion and releasing the resource lock. The time spent on garbage collection accounts for 50% of our current leadership handover time. In a proactive handover, the leader intentionally releases its lock when it receives SIGTERM, before exiting. This will significantly reduce the leadership switch time.
  • Reconcile Quality of Service (QoS): In this post, we present our optimization of worker pool efficiency in terms of delayed enqueue vs. immediate enqueue. For future work, we want to introduce reconcile QoS and workqueue tiering (for example, creating different queues for different tiers of workloads to ensure that high tiers are not interfered with and blocked).
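As an illustration of the proactive handover idea, the sketch below releases a lock from a SIGTERM handler. It describes future work, so this is only one way it might look: `LeaderLock` is a hypothetical in-memory stand-in, whereas a real lock would live in the API server.

```python
import signal
import sys


class LeaderLock:
    """In-memory stand-in for the leadership lock (hypothetical; for illustration only)."""

    def __init__(self, holder=None):
        self.holder = holder

    def release(self, pod):
        """Release the lock only if pod currently holds it."""
        if self.holder == pod:
            self.holder = None
            return True
        return False


def make_sigterm_handler(lock, pod, exit_fn=sys.exit):
    """Build a handler that gives up leadership explicitly, then exits."""

    def handler(signum, frame):
        lock.release(pod)  # do not wait for garbage collection to free the lock
        exit_fn(0)

    return handler


def install(lock, pod):
    """Register the handler; must be called from the main thread."""
    signal.signal(signal.SIGTERM, make_sigterm_handler(lock, pod))
```

Releasing the lock inside the termination path lets a standby acquire leadership as soon as its next attempt runs, instead of waiting for the old pod's dependents to be garbage collected.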

Acknowledgement

Shout out to Suli Xu and Harry Zhang for their great contributions in building a high-performance control plane to support business needs. Special thanks to June Liu, Anson Qian, Haniel Martino, Ming Zong, Quentin Miao, Robson Braga and Martin Stankard for their feedback and support.
