At Stack Overflow, our mission is to serve developers. Whether we’re helping developers get answers to their questions or find new jobs, we build products that make millions of developers’ lives better every day. Our goal is to create a community and a company where every developer feels welcome to learn, share their knowledge, and build their careers.
We partner with businesses to help them understand, hire, engage, and enable the world’s developers. Our products and services are focused on developer marketing, technical recruiting, market research, and enterprise knowledge sharing. Our clientele includes Google, Microsoft, Bloomberg, and many other Fortune 500 names.
As our Director of Reliability Engineering, you’ll lead the organization responsible for reliable operation, engineering, and evolution of Stack Overflow’s technology infrastructure supporting our products, platforms, and services. You’ll partner closely with a variety of teams in our product-led organization to guide platform and product engineering to build fast, reliable, and durable production systems, adopting industry best practices and leveraging automation across the stack.
What you’ll do:
- Own, innovate, and create programs, software, and analytics that drive improvements to the availability, latency, and efficiency of all of Stack Overflow’s products, platforms, and services
- Drive infrastructure reliability and automation strategy across our product ecosystem
- Work in close partnership with our product-led engineering teams to build fast, reliable, and durable production systems
- Lead the teams supporting our datacenter and cloud infrastructure through growth and a full transition to the cloud
- Monitor and improve platform performance, using industry best practices and new technologies to creatively solve problems
- Manage, lead, retain, and grow a team of Reliability Engineers
What you’ll need to have:
- 5+ years in a similar role with experience in both on-premises and cloud-hosted infrastructure, tooling, and SaaS products
- 15+ years in infrastructure, tooling, and reliability engineering
- Ideally, experience with Azure, .NET applications, and cloud migrations
- An understanding of Stack Overflow and other Stack Exchange network sites
- Ability to make decisions at high velocity
- A passion for helping teams work more productively as well as experience with change management
What you’ll get in return:
- Competitive base salary
- 20 days paid vacation
- Flexible hours
- Stock options
- Completely free health insurance (no copay, no premiums)
- Great office w/ espresso bar, games, and free daily lunches
- Gym membership reimbursement
- Transportation reimbursement
If you want to work remotely… We’ll reimburse you up to $2,000 to set up a great home office.
If you want to work in our office… You’ll be in our office in New York and enjoy additional benefits like free lunch every day and all the espresso you can drink.
We’re a remote-friendly team. Whether you work remotely or work out of our office (re-opening June 2021 at the earliest due to COVID-19), you’ll be part of a remote work culture that emphasizes online communication (Slack, GitHub, Hangouts, Zoom, Stack Overflow for Teams).
Employment is conditioned upon successful completion of a background check and upon having the appropriate legal right to work.
Diverse teams build better products
Legally, we need you to know this: Stack Exchange, Inc. does not discriminate in employment matters on the basis of race, color, religion, gender, national origin, age, military service eligibility, veteran status, sexual orientation, marital status, disability, or any other protected class. We support workplace diversity.
But we want to add this: We strongly believe that diversity of experience contributes to a broader collective perspective that will consistently lead to a better company and better products. We are working hard to increase the diversity of our team wherever we can and we actively encourage everyone to consider becoming a part of it.