Data Con LA 2019

Jesse Steinweg Woods

E-Commerce Product Categorization at Scale

Aug 2019
Honey provides a shopping platform that allows users to shop across multiple stores online. In order to allow products to be more easily discoverable, a unified categorization taxonomy of all products across all stores is required. To accomplish this, we utilize deep learning across tens of millions of products. Multiple models are built, with the model scores compared and a final voting scheme that determines whether we can automatically categorize new products and add them to our product catalog. Products with too much disagreement between the models are passed on to crowdsourcing for further validation.

In this talk, the following will be covered:

- Model architectures for each of the models along with how text/image features are processed for training
- The infrastructure utilized to allow automated model re-training and scoring with Google Cloud Services
- The decision-making process behind whether to automatically categorize products or utilize crowdsourcing
- How the unified categorization taxonomy was developed
- Lessons learned from this process and plans for the future
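
To make the voting idea in the abstract concrete, here is a minimal sketch of how per-model scores might be combined to choose between automatic categorization and crowdsourcing. It is an illustration only, not the scheme presented in the talk: the `Prediction` type, the `route_product` function, the model names, and both thresholds (`min_score`, `min_agreement`) are assumptions made up for this example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Prediction:
    model: str       # hypothetical model name, e.g. "text" or "image"
    category: str    # predicted category in the unified taxonomy
    score: float     # model confidence in [0, 1]


def route_product(predictions, min_score=0.8, min_agreement=0.6):
    """Return ("auto", category) or ("crowdsource", None).

    Both thresholds are illustrative assumptions. min_agreement should be
    greater than 0.5 so that at most one category can win the vote.
    """
    # Only models that are confident enough get to vote.
    votes = [p.category for p in predictions if p.score >= min_score]
    if not votes:
        return ("crowdsource", None)

    # Take the most common confident vote and measure its share
    # of all models, including those that abstained.
    category, count = Counter(votes).most_common(1)[0]
    if count / len(predictions) >= min_agreement:
        return ("auto", category)

    # Too much disagreement: send the product to human validation.
    return ("crowdsource", None)


agreeing = [
    Prediction("text", "Shoes", 0.95),
    Prediction("image", "Shoes", 0.90),
    Prediction("title", "Shoes", 0.50),
]
disagreeing = [
    Prediction("text", "Shoes", 0.95),
    Prediction("image", "Bags", 0.90),
    Prediction("title", "Shoes", 0.50),
]
print(route_product(agreeing))
print(route_product(disagreeing))
```

In this sketch a low-confidence model counts as an abstention rather than a vote, so a product is auto-categorized only when a clear majority of all models confidently agree. A production system would likely tune such thresholds per category against labeled data.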
