If you are passionate about creating scalable, highly reliable software systems and work well in a fast-paced atmosphere, this might be a great role for you! Welcome to Bigleaf. We keep businesses connected to the cloud. Wonder how that works? Check out our videos or visit our homepage: https://www.bigleaf.net/how-it-works/

Bigleaf is growing quickly and we have big plans. We’re looking for a Site Reliability Engineer to join our team. We’re a well-funded, fast-growing startup, so your day-to-day tasks will vary quite a bit and morph over time, but here’s an overview:

What you’ll be doing:

As a Site Reliability Engineer at Bigleaf, you will work as part of our engineering team to improve the reliability and availability of our infrastructure, both internal and customer-facing, both hardware and software. You will also work with our network-operations team to identify and resolve issues, reduce human toil, and broadly make our system more robust and resilient. SRE is a new role at Bigleaf; you will help us define the role and space.


  • Improve the reliability, availability, and observability of Bigleaf infrastructure, including both software and hardware systems
  • Reduce human toil through better automation and tooling, identifying or building those tools and automation as necessary
  • Act as escalation point and on-call support for issues raised by our network-operations team
  • Encourage automation and minimize manual systems work. Champion efforts that bring long-term value and robustness to the Bigleaf system
  • Collaborate with our engineering and network-operations teams to build tools for better incident response, drive incident resolution, and improve post-mortems
  • Increase observability of, and visibility into, Bigleaf systems and customer incidents


Qualifications:

  • Bachelor’s degree in Computer Science, a related technical field involving software/systems engineering, or equivalent practical experience
  • Prior experience in DevOps or Site Reliability Engineering roles
  • Familiarity with infrastructure monitoring tools such as Prometheus, Nagios, or Datadog
  • Familiarity with infrastructure-as-code tools such as Terraform or Puppet
  • Experience with designing, analyzing, and troubleshooting large-scale, public-facing, distributed systems
  • Ability to work in a fast-paced environment, supporting multiple concurrent projects
  • Excellent written and verbal communication skills in a multi-team, collaborative environment

Highlights of our Benefits:

  • Medical (We pay 100% for all levels of coverage)
  • Dental & Vision (We pay 100% for all levels of coverage)
  • Life insurance, long-term disability
  • 401k with dollar-for-dollar safe harbor match
  • Stock options plan
  • Generous parental leave (6 weeks)
  • 4 weeks of PTO plus 1 week of Wellness Time per year
  • 13 Company Holidays
  • Wellness Reimbursement
  • Technology/Remote Work Stipend
  • Monthly credit for GrubHub
  • Full-company offsite events twice per year!

This is a very exciting period of growth for our team. We appreciate you taking the time to carefully read through this ad. Our vision is to bring peace into the lives of our customers through advanced technology and excellent service. If you would like to join us in this role, please send along a resume and a cover letter describing how you meet the qualifications above and why you’re interested. Benefits include medical, dental, vision, life insurance, long-term disability, 401k with match, and a stock options plan.

We’re building a team in addition to a product, and we value and seek inclusion and diversity in that team. We are an equal opportunity employer. We believe diversity makes our teams stronger, so we encourage you to apply even if you don’t meet the exact qualifications for this role.