Automatically Expanding Taxonomy

Pinterest's Arborist model finds parents for unseen textual nodes using a triplet-style loss, StarSpace embeddings, and shortest-path distances.

Why Taxonomies?

  • map concepts into a hierarchy, usually a tree
  • parents are related but more general or more abstract
  • multiple relation types possible
  • abstraction is information compression
  • simplifies user navigation via categorization

Google Product Taxonomy


Examples:

  • Animals & Pet Supplies > Pet Supplies > Bird Supplies > Bird Cage Accessories > Bird Cage Bird Baths
  • Home & Garden > Kitchen & Dining > Tableware > Serveware Accessories > Punch Bowl Stands
  • Sporting Goods > Outdoor Recreation > Outdoor Games > Platform & Paddle Tennis > Platform & Paddle Tennis Paddles
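
These " > "-delimited paths fully encode the tree, so a navigable structure is easy to rebuild. A minimal sketch in plain Python (the `build_tree` helper is mine, for illustration only):

```python
# Rebuild a taxonomy tree from " > "-delimited category paths
# (paths shortened from the Google Product Taxonomy examples above).
paths = [
    "Animals & Pet Supplies > Pet Supplies > Bird Supplies",
    "Home & Garden > Kitchen & Dining > Tableware",
    "Sporting Goods > Outdoor Recreation > Outdoor Games",
]

def build_tree(paths):
    tree = {}
    for path in paths:
        node = tree
        for part in path.split(" > "):
            node = node.setdefault(part, {})  # descend, creating children as needed
    return tree

print(build_tree(paths))
# {'Animals & Pet Supplies': {'Pet Supplies': {'Bird Supplies': {}}}, ...}
```

Nested dicts suffice here because each path segment has exactly one parent.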

Shopify Taxonomy

Pinterest Taxonomy

  • Interests are organized into the Pinterest taxonomy
  • Advertisers use the taxonomy to target ad campaigns
  • In 2020:
  • Examples:
    • Food & Drink > Drinks > Wines
  • Pinterest uses custom software for curation
  • Pins and Pinners are categorized into the taxonomy to enable recommendations.
    • together called Taste Graph

Other papers on the use of taxonomies

Sherlock recommender

  • the “casualness” of skirts looks visually different from the “casualness” of flip-flops
  • taxonomy sub-trees use different projections of the visual embeddings
  • Visual projections are shared from parents to children via stacking

Taxonomy-Aware Multi-Hop Reasoning Networks for Sequential Recommendation

  • uses taxonomy data as structural knowledge to guide the model's learning
  • learns a preference representation for each taxonomy level, based on the user's overall sequential preference

Pinterest's automatic taxonomy expansion

  • Pinterest motivation:
    • 8 curators added 6000 new taxonomy nodes in a month
    • compared to 100M new pins every month
    • Interests = textual phrases describing concepts (e.g. “mid-century architecture”)
    • textual embeddings available
  • automatically finds the parent for a new node
  • handles multiple relationship types: is-type-of, is-in
  • constructs embeddings for unseen nodes
  • minimizes an upper bound on the shortest-path distance between the predicted and actual taxonomy parents
  • SoTA results and an ablation study
  • Code: available
  • Data: none newly published
  • Name: Arborist

Building the Taxonomy

  • taxonomy expansion is framed as a parent-retrieval problem
  • given the existing taxonomy and a query node
  • rank the true parents high
  • also rank nodes with a short path to a true parent high

Parent Retrieval Examples

  • luxor: africa travel, european travel, asia travel, greece
  • 2nd month baby: baby stage, baby, baby names, preparing for baby
  • depression: mental illness, stress, mental wellbeing, disease
  • ramadan: hosting occasions, holiday, sukkot, middle east and african cuisine
  • minion humor: humor, people humor, character humor, funny

Modeling

Setup

  • each node \( u \) of the dataset has a feature vector \( e_u \in \mathbb{R}^{d} \)
  • choose a natural-number hyperparameter \( k \in \mathbb{N} \)
  • \( k \) linear relatedness matrices \( M_i \in \mathbb{R}^{d \times d} \)
  • each node \( u \) is assigned \( k \) learnable weights \( w_{i, u} \in \mathbb{R} \)
  • define the relatedness score from child \( u \) to candidate parent \( v \):

\( s(u, v) = e_u^\intercal \sum_{i = 1}^{k} w_{i, v} M_i e_v \)
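
A minimal PyTorch sketch of this scorer (the class name `RelatednessScorer` and the dimension sizes are my illustrative choices, not the paper's released code):

```python
import torch
import torch.nn as nn

class RelatednessScorer(nn.Module):
    """s(u, v) = e_u^T * (sum_i w_{i,v} * M_i) * e_v."""

    def __init__(self, num_nodes: int, d: int, k: int):
        super().__init__()
        self.M = nn.Parameter(torch.randn(k, d, d) * 0.01)  # k relatedness matrices M_i
        self.w = nn.Embedding(num_nodes, k)                  # per-node weights w_{i,v}

    def forward(self, e_u, e_v, v_idx):
        # e_u, e_v: (batch, d) feature vectors; v_idx: (batch,) candidate-parent ids
        w_v = self.w(v_idx)                               # (batch, k)
        Mv = torch.einsum("kde,be->bkd", self.M, e_v)     # M_i e_v for every i
        mix = torch.einsum("bk,bkd->bd", w_v, Mv)         # sum_i w_{i,v} M_i e_v
        return (e_u * mix).sum(dim=-1)                    # e_u^T (...)

scorer = RelatednessScorer(num_nodes=1000, d=64, k=4)
scores = scorer(torch.randn(8, 64), torch.randn(8, 64), torch.randint(0, 1000, (8,)))
```

At query time, candidate parents are ranked by this score.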

Training

Loss

  • similar to triplet loss
  • we want \( s(u, v) > s(u, v') + \gamma \) for a true parent \( v \) and a non-parent \( v' \)
  • loss: \( \sum \max(0,\, s(u, v') + \gamma - s(u, v)) \)
  • how do we choose the margin \( \gamma \), and how do we keep the sum over non-parents scalable?

Dynamic Margin

  • instead of a fixed \( \gamma \), the margin grows with the shortest-path distance between the sampled non-parent and the true parent
  • minimizing the loss then minimizes an upper bound on the shortest-path distance between predicted and actual parents (the property claimed above)
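
A sketch of the resulting margin loss, assuming the margin scales linearly with the taxonomy distance `sp_dist` between the sampled non-parent and the true parent (the exact margin function and scaling here are my assumptions, not the paper's definition):

```python
import torch

def margin_loss(s_parent, s_negative, sp_dist, scale=1.0):
    """Triplet-style loss with a dynamic margin: s(u, parent) must beat
    s(u, non-parent) by more when the non-parent is far (in the taxonomy)
    from the true parent."""
    gamma = scale * sp_dist  # dynamic margin (linear scaling assumed)
    return torch.clamp(s_negative + gamma - s_parent, min=0).mean()

# s_parent, s_negative: (batch,) scores from the scorer above;
# sp_dist: (batch,) taxonomy distances between non-parent and true parent
loss = margin_loss(torch.randn(8), torch.randn(8), torch.full((8,), 2.0))
```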

Negative sampling

  • cannot sum over all non-parents
  • easy negative samples slow convergence
  • use embedding-similarity-weighted negative sampling

\( \mathrm{Pr}(v' \mid (u, v)) \propto e_u^\intercal e_{v'} \)
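
A minimal sketch of that sampler (the softmax normalization is my assumption; the paper only states proportionality to the dot product):

```python
import torch

def sample_negatives(e_u, E_candidates, num_samples=5):
    """Pick non-parent candidates with probability increasing in their
    embedding similarity to the child u, so harder negatives dominate."""
    sims = E_candidates @ e_u            # (num_candidates,) dot products
    probs = torch.softmax(sims, dim=0)   # normalize into a distribution
    return torch.multinomial(probs, num_samples, replacement=True)

# 1000 candidate non-parents with 64-dim embeddings
neg_idx = sample_negatives(torch.randn(64), torch.randn(1000, 64))
```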

Evaluation

Datasets: Pinterest, SemEval, and Mammal (as in the results table below).

Metrics:

  • mean reciprocal rank (MRR)
    • = the multiplicative inverse of the true parent's rank in the predicted-parents list
    • if there are multiple true parents, take the reciprocal rank of the highest-ranked one
    • ranges from 0% (worst) to 100% (best); see the sketch after this list
  • Recall@15 = share of queries with a true parent among the top 15 predictions
  • SPDist = shortest-path distance in the taxonomy between the predicted and the true parent (lower is better)
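
A tiny sketch of MRR with multiple true parents (a hypothetical helper, not the paper's evaluation code):

```python
def mrr(ranked_predictions, true_parents):
    """Reciprocal rank (1-indexed) of the highest-ranked true parent."""
    for rank, node in enumerate(ranked_predictions, start=1):
        if node in true_parents:
            return 1.0 / rank
    return 0.0

# the "luxor" example above: true parent found at rank 1 => MRR = 1.0
print(mrr(["africa travel", "european travel"], {"africa travel"}))
```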
             Pinterest                       SemEval                         Mammal
Model        MRR(%)  Recall@15(%)  SPDist   MRR(%)  Recall@15(%)  SPDist    MRR(%)  Recall@15(%)  SPDist
Vec-Concat   41.831  64.671        3.816    20.992  33.155        3.474     14.995  30.726        4.274
Vec-Sum      33.891  62.548        4.124    17.803  27.607        4.047     19.611  38.175        4.186
Vec-Diff     41.185  67.699        3.494    18.514  30.949        4.163     31.386  46.182        3.674
Vec-Prod     42.233  68.743        3.144    17.483  31.083        4.178     32.177  48.976        3.665
CRIM         53.223  79.325        2.393    41.691  62.064        2.743     21.345  52.700        4.080
Arborist     59.044  83.606        2.220    43.373  67.694        2.864     29.354  61.639        3.225



Created on 22 Mar 2021. Updated on 21 May 2022.
Thank you









