About Labelbox
Labelbox’s mission is to build the best products for humans to advance artificial intelligence.
We are building software infrastructure for industrial data science teams to do data labeling for the training of neural networks. When we build software, we take for granted the existence of collaborative tools to write and debug code. The machine learning workflow has no standard tooling for labeling data, storing it, debugging models and then continually improving model accuracy. Enter Labelbox. Our vision is to become the default software for data scientists to manage data and train neural networks in the same way that GitHub or text editors are defaults for software engineers.
Current Labelbox customers include American Family Insurance, Lytx, Airbus, Genius Sports, KeepTruckin and more. Labelbox is venture-backed by A16Z, Gradient Ventures, Kleiner Perkins and First Round Capital and has been featured in TechCrunch, VentureBeat, Fortune, and Forbes.
About the Team
The BI team at Labelbox is working to become a central part of the company's decision-making. We believe that our data will set us apart and help us succeed as a data-driven company. Members of the BI team work to understand and make sense of our data, and partner with product and business teams to help drive direction with it. We like to partner with each other and with stakeholders across the company to get our work done, we constantly think about how we can improve, and we like to come up with new ideas based on data.
About the Role
As a member of this team, you will be responsible for creating clean, scalable and easy-to-consume data models and data pipelines. You will build ETLs that allow the rest of the company to answer their questions in a self-service manner, and allow analysts and data scientists to quickly analyze and prototype new ideas. You will partner with the rest of the team on prototyping those new ideas and building scalable products.
The Impact You’ll Have
By building scalable self-service solutions, you will enable easier and faster decision-making. In addition, you will increase the productivity and accuracy of our BI team, as well as our operations and product teams.

The Problems You’ll Solve

    • Build resilient data pipelines based on internal and external data sources
    • Architect clean and scalable data models to support evolving business requirements
    • Design scalable ETLs for consistent metric definitions
    • Build or evaluate tooling for data accuracy detection and alerting
    • Partner with the rest of the team on prototyping and building scalable products driven by the BI team
    • Partner with teams throughout Labelbox to identify opportunities and build solutions that simplify operations while producing rich and accurate data sets for us to use
    • Constantly identify opportunities for providing self-service tooling to our internal team

About You

    • 5+ years of experience working in business intelligence, analytics, data engineering, or a similar role
    • Ability to write complex SQL queries; Python is a plus
    • Knowledge of relational databases (BigQuery, Redshift, MySQL) and big data structures
    • Experience using and building solutions to support multiple reporting and data user tools (Chartio, Tableau, Looker)
    • Advanced data visualization and SQL skills
We believe that AI has the power to transform every aspect of our lives, from healthcare to agriculture. The exponential impact of artificial intelligence will mean that mammograms can happen quickly and cheaply regardless of how few radiologists there are in the world, and that growers will know the instant disease hits their farm without even being there.
At Labelbox, we’re building a platform to accelerate the development of this future. Rather than requiring companies to create their own expensive and incomplete homegrown tools, we’ve created a training data platform that acts as a central hub for humans to interface with AI. When humans have better ways to input and manage data, machines have better ways to learn.