Continual is a Series A startup building the missing AI layer for the modern data stack. Our mission is to unlock the transformational power of machine learning and AI for every organization. We do this by delivering an operational AI platform designed natively for cloud data warehouses that empowers modern data and analytics teams to deliver production-grade machine learning solutions without operational burden. Our customers use Continual to help better understand their customers, operate more efficiently, and power innovative new products and services. You can learn more about Continual at https://continual.ai.
We offer competitive benefits, a collaborative work environment, flexible working arrangements, and rapid learning and growth opportunities. We’re a small team that cares deeply about our colleagues, customers, and mission. We embrace diversity and are committed to building a team that represents a variety of backgrounds, perspectives, and skills.
About this Role:
As a Software Engineer at Continual you will be responsible for designing and building scalable infrastructure that is easy to manage and enables an incredible user experience for our customers. You will be building the next generation of AI infrastructure that can handle petabytes of data and makes AI accessible to everyone.
You will work closely with AI/ML Engineers to build software to handle streaming data, manage ML data with ease, and scale dynamic clusters of training jobs and low latency prediction servers. A good candidate for this role understands the strengths and weaknesses of existing technologies but fully embraces the future.
This role will also be responsible for the cloud infrastructure our product relies upon. You will work as part of a team but will have ownership in these areas. This includes the security aspects of external and internal communication and networking as well as the scalability and reliability of our infrastructure. These components include multiple Kubernetes clusters, hosted databases, and other cloud services. You should:
- Understand reliability architecture including failover, disaster recovery and autoscaling
- Be familiar with DevOps concepts and continuous build and deployment processes
- Be able to debug complex systems and problems across the stack including areas such as networking, performance and memory management
- Have experience implementing scalable production systems on GCP or other cloud providers
- Have experience with Infrastructure as Code and CI/CD systems
- Enjoy learning new complex systems and solving interesting problems as they come up. Nobody is an expert at everything, but you are curious about learning what’s needed to solve a problem or implement a solution
- BS, M.Sc., or PhD in Computer Science or equivalent experience
- Fluency in one or more languages such as Python, Go, Java, Scala, or C++.
- Experience building production systems that power business critical applications
- Familiarity with cloud and container technologies such as AWS, GCP, Azure, Kubernetes, Docker, Istio, and gRPC.
- Strong problem solving and analytical skills
- Excellent communication skills, both written and verbal, and a collaborative mindset
- Familiarity with machine learning is not required, but this is a great opportunity for those who want to learn more about ML.
- Comfort architecting systems from scratch and growing engineering teams around them.
- Experience building infrastructure for large-scale data management, analytics, cluster scheduling, stream processing, or machine learning.
- Knowledge of or interest in machine learning / deep learning, analytics, and/or statistics