Senior Manager of Reliability Engineering

Remote

Upwork ($UPWK) is the world’s work marketplace. We help connect companies large and small with top independent talent from around the world. Simply put, our mission is to create economic opportunities so people have better lives.

Every year, more than $2 billion of work is done through Upwork by skilled independent professionals who want the freedom of working anytime, anywhere.

Cloud Engineering & Operations (CLEO) is at the core of the technology engine that enables the Upwork platform. We engineer and operate all infrastructure (IAAS), platform services (PAAS), automation, and tooling, and we drive modern Service Engineering (DevOps, Ops Eng, Ops Analytics) to successfully deliver & manage our end-to-end offering.

This is an opportunity to bring your expertise and experience in leading more than traditional “SRE”: driving modern failure & chaos engineering, curated via data-driven analytics, to focus first on the experiences that are most important to the business and then on the best techniques to make the enabling technology resilient. “The tail shall not wag the dog!”

This is a highly technical role that needs a leader who is inquisitive by nature, uses data to answer questions, solves problems as an engineer and is detailed in execution.

 

 Your Responsibilities:

  • Lead the team to continually assess & ensure that all technology areas within Cloud Engineering are operating on measurably resilient, scalable and reliable infrastructure & platform services and solutions.
  • Apply engineering leadership and deep knowledge of infrastructure and software development at scale to lead the operation, adoption, and evolution of these services, particularly where they are immature and require modernization or optimization.
  • Develop, mentor and train other Reliability Engineers as we “drop the S” and focus on the reliability of user experiences (vs. a Site) and the tooling, troubleshooting techniques and processes required to deliver and manage infrastructure, platform & application services at scale in Production.

What it takes to catch our eye:

  • You’re not just a senior infrastructure/platform engineering leader. Every team within Cloud Engineering already has a focus on the most modern ways to make their domain tech resilient, highly available, etc. The leader of Reliability Engineering has maintained that ground-up infrastructure knowledge and, from there, has developed expertise and acquired experience in modern, cloud-based (IAAS, PAAS) reliability & failure engineering.
  • Solid understanding of and experience with modern Chaos/Failure Engineering techniques and tools (Simian Army, Chaos Monkey, Gremlin, etc.).
  • Strong experience in a cloud environment (AWS, Azure, GCP) and with cloud data infrastructure, plus the ability to make recommendations when a cloud or vendor-managed service can be utilized.
  • You have a high degree of autonomy and drive. You demonstrate a customer-first mentality and take full ownership of the work. You can navigate professionally and collaboratively through ambiguity to deliver outcomes.
  • Ability to write code in at least one language (Python, Ruby preferred). You are comfortable reviewing both functional implementation and tests.
  • Deep experience with 24/7/365, modern cloud (multi-zone/region) monitoring and First-Response support for availability & performance SLAs.

 

 Come change how the world works.

At Upwork, you’ll shape talent solutions for how the world works today. We are a remote-first organization working together to create exciting remote work opportunities for a global community of professionals. While we have physical offices in San Francisco and Chicago, currently we also support hiring of corporate full-time employees in 15 states in the United States. Please speak with a member of our recruitment team to determine whether you are located in a state in which we are hiring corporate full-time employees.

Our vibrant culture is built on shared values and our mission to create economic opportunities so that people have better lives. We foster amazing teams, put our community first, and have a bias toward action. We encourage everyone to bring their whole selves to work and grow together through development opportunities, mentorship, and employee resource groups. Oh yeah, we’ve also got amazing benefits.

Check out our Life at Upwork page to learn more about the employee experience.   

Upwork is proudly committed to recruiting and retaining a diverse and inclusive workforce. As an Equal Opportunity Employer, we never discriminate based on race, religion, color, national origin, gender (including pregnancy, childbirth, or related medical condition), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, or other applicable legally protected characteristics.

