CambridgeRecruiter Since 2001
the smart solution for Cambridge jobs

Technical Lead, Site Reliability Engineering

Company: HubSpot
Location: Cambridge
Posted on: May 3, 2021

Job Description:

About the team

The HubSpot Product team is made up of over 700 engineers, designers, product managers, and researchers. We're passionate about building tools that help small and medium-sized businesses market, sell, and serve their customers - and ultimately, grow better.

Those tools end up in the HubSpot application platform, which itself is made up of thousands of services, workers, and jobs spanning over 170 teams and thousands of repos. Our teams work autonomously to deploy these systems across a common infrastructure, up to 3,000 times a day. As we've grown to serve over 75,000 customers in 100 countries, reliability and stability have become just as important as speed and time to market. And as we've opened up our APIs, our product has moved to the core of many of our customers' and partners' businesses.

In 2019, we built an SRE team to help our product teams focus on delivering highly available and dependable products. This team is off to a great start: evangelizing, building tools, and embedding in product teams. We are looking to grow this team by hiring engineers with an interest in reliability and scale. This is an opportunity to work on hard problems across a variety of domains with an experienced team of software engineers.

What you'll do

* Help product and infrastructure teams hold retroactive root cause analysis meetings, focusing on identifying remediations using a blameless process similar to the 5 whys methodology
* Embed on product and infrastructure teams directly to build more reliable, scalable software
* Conceive, design, and build infrastructure tooling that improves reliability across the entire product surface area, dealing with massive distributed scale
* Evangelize best practices around reliability engineering
* Proactively identify risks and advocate for engineering process, tooling, or work streams that reduce that risk in a customer-centric way

What we're looking for

* Experience with SRE culture, improving reliability with automation, chaos testing, and process improvement
* Experience designing and operating distributed systems and cloud infrastructure at scale
* Experience working collaboratively with other engineering teams
* Interest or experience implementing and iterating on process to improve outcomes with minimal disruption to team culture
* Interest or experience working across multiple stakeholders to drive effective change

Keywords: HubSpot, Cambridge, Technical Lead, Site Reliability Engineering, Other, Cambridge, Massachusetts
