Scribd is a leading digital library for audiobooks, ebooks, podcasts, magazines, documents, slide presentations, and more. The company uses machine learning to optimize search, make recommendations, and improve new features.
The Challenge
No easy way to track, monitor, or reproduce models
Scribd has several ML models running in production. But according to QP, Scribd’s Senior Engineer, Core Platform, “We didn’t really have a consistent way to keep track of these models—how many models we had in production or what types of models they were.”
Model management was always a bespoke process
Each model was managed differently, which was inefficient and made collaboration difficult. “Product teams want to move as efficiently as possible; they don’t want to waste time trying to figure out platform-level questions like, ‘What should be the standardized way to manage our models?’”
No way to proactively find problems in production
Without a way to systematically track and monitor models, Scribd couldn’t flag performance degradation automatically. Engineers learned of production issues only through manual testing or customer feedback.
Serving models to production was a slow process
“Every time anyone wanted to create a new model, it had to go through an external team, wait for a couple of days and then come back to them,” recalls QP. “That was really painful.”