
Toy Matching in Amazon E-commerce by Matching Learning
Sidharth Rupani
Massachusetts Institute of Technology

Corresponding Author:[email protected]



This project detects categories from toy product names and descriptions. The data set is an Amazon toy data set containing manufacturer-specific models, names, and descriptions of children's toys. The expected outcome includes a labeling scheme and an evaluation of category extraction using a CRF and a bi-LSTM. For annotation, we labeled the data ourselves: each product description is decomposed into a sequence of tokens labeled with BIO encoding, so the output of a learning algorithm on a product description is a sequence of labels. Three different kinds of methods are used for our task. For the first method, we tried several traditional machine learning models such as SVM, logistic regression, and linear SVM for the second task. Linear SVM achieves the highest classification accuracy. The second method is a CRF with hand-crafted features, and the last is a bi-directional LSTM. Note that the data set is small relative to its number of categories, which explains why even the best model reaches relatively low accuracy.
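To make the BIO annotation concrete, the following is a minimal sketch of how a tokenized product description could be turned into a label sequence. The function name `bio_encode`, the span format (start, end, category) with an exclusive end index, the sample description, and the category name `VEHICLE` are all illustrative assumptions, not taken from the project's data or code.

```python
from typing import List, Tuple


def bio_encode(tokens: List[str], spans: List[Tuple[int, int, str]]) -> List[str]:
    """Assign one BIO label per token.

    Each span is (start, end, category) in token indices, end exclusive.
    This span format is an assumption made for illustration.
    """
    # Every token starts outside any category.
    labels = ["O"] * len(tokens)
    for start, end, category in spans:
        # The first token of a span is tagged B- (begin).
        labels[start] = f"B-{category}"
        # The remaining tokens of the span are tagged I- (inside).
        for i in range(start + 1, end):
            labels[i] = f"I-{category}"
    return labels


# Hypothetical product description and annotation.
tokens = "Wooden train set for toddlers".split()
spans = [(1, 3, "VEHICLE")]
labels = bio_encode(tokens, spans)

for token, label in zip(tokens, labels):
    print(f"{token}\t{label}")
```

A sequence model such as a CRF or a bi-LSTM would then be trained to predict a list like `labels` from a list like `tokens`, one label per token.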