Learning to Rank in Vector Spaces and Social Networks (WWW ...

icon

93

pages

icon

English

icon

Documents

Le téléchargement nécessite un accès à la bibliothèque YouScribe Tout savoir sur nos offres

icon

93

pages

icon

English

icon

Documents

Le téléchargement nécessite un accès à la bibliothèque YouScribe Tout savoir sur nos offres

Learning to Rank in
Vector Spaces and Social Networks
(WWW 2007 Tutorial)
Soumen Chakrabarti
IIT Bombay
∼http://www.cse.iitb.ac.in/ soumen
Soumen Chakrabarti Learning to Rank in Vector Spaces and Social Networks (WWW 2007 Tutorial) 1 Motivation: Web search
I User query q, Web pages{v}
I (q,v) can be represented with a rich feature vector
I Text match score with title, anchor text, headings, bold
text, body text, ..., of v as a hypertext document
I Pagerank, topic-specific Pageranks, personalized
Pageranks of v as a node in the Web graph
I Estimated location of user, commercial intent, ...
I Must we guess the relative importance of these features?
I How to combine these into a single scoring function on
(q,v) so as to induce a ranking on{v}?
Soumen Chakrabarti Learning to Rank in Vector Spaces and Social Networks (WWW 2007 Tutorial) 2 Motivation: Ad and link placement
I Here, the “query” is the surfer’s contextual information
I More noisy than queries, which are noisy enough!
I Plus page and site contents
I A response is an ad to place, or a link to insert
I Must rank and select from a large pool of available ads or
links
I (In this tutorial we will ignore issues of bidding and
visibility pricing)
Soumen Chakrabarti Learning to Rank in Vector Spaces and Social Networks (WWW 2007 Tutorial) 3 Motivation: Desktop search
I The Web has only a few kinds of hyperlinks: same-host
subdirectory, same-host superdirectory, same-host
across-path, different-host same-domain, ...
Voir icon arrow

Publié par

Nombre de lectures

100

Langue

English

Soumen Chakrabarti
Learning r Spaces a (WWW 2
to Rank in nd Social Networks 007 Tutorial)
Vecto
Soumen Chakrabarti IIT Bombay http://www.cse.iitb.ac.in/
Learning to Rank in
soumen
Vector Spaces and Social Networks
WW 2007 Tutorial
1
Motivation:
I
I
I
I
I
I
I
Web search
User queryq, Web pages{v} (q,v) can be represented with a rich feature vector Text match score with title, anchor text, headings, bold text, body text, . . . , ofvas a hypertext document Pagerank, topic-specific Pageranks, personalized Pageranks ofvas a node in the Web graph Estimated location of user, commercial intent, . . . Must we guess the relative importance of these features? How to combine these into a single scoring function on (q,v) so as to induce a ranking on{v}?
Soumen Chakrabarti
Learning to Rank in Vector Spaces and Social Networks
WW 2007 Tutorial
2
Motivation:
I
I
I
I
I
I
Ad and link placement
Here, the “query” is the surfer’s contextual information More noisy than queries, which are noisy enough! Plus page and site contents A response is an ad to place, or a link to insert Must rank and select from a large pool of available ads or links (In this tutorial we will ignore issues of bidding and visibility pricing)
Soumen Chakrabarti
Learning to Rank in Vector Spaces and Social Networks
WW 2007 Tutorial
3
Motivation:
I
I
I
I
I
I
Desktop search
The Web has only a few kinds of hyperlinks: same-host subdirectory, same-host superdirectory, same-host across-path, different-host same-domain, different-domain etc. Often differentiated by hardwired policy, e.g, HITS completely ignores same-host links Entity-relationship (ER) graphs are richer E.g. A personal information management (PIM) system has many node/entity types (person, organization, email, paper, conference, phone number) and edge/relation types (works-for, sent, received, authored, published-in) Ranking needs to exploit the richer type system Don’t want to guess the relative importance of edge types (may be dependent on queries)
Soumen Chakrabarti
Learning to Rank in Vector Spaces and Social Networks
WW 2007 Tutorial
4
Desktop
Soumen
Chakrabarti
search
example
Learning to Rank in
Vector Spaces and Social Networks (WWW 2007 Tutorial)
5
Relevance feedback
I
I
I
I
Relevance feedback is well-explored in traditional IR User-assisted local modification of ranking function for vector-space models How to extend these to richer data representations that incorporate entities, relationship links, entity and relation types? Surprisingly unexplored area
Soumen Chakrabarti
Learning to Rank in Ve
ctor Spaces and Social Networks
WW 2007 Tutorial
6
Tutorial
I
I
I
outline:
Preliminaries
Training and evaluation scenarios Measurements to evaluate quality of ranking ILabel mismatch loss functions for ordinal regression IPreference pair violations Iunder (true positive, false positive) curveArea IAverage precision IRank-discounted reward for relevance IRank correlations What’s useful vs. what’s easy to learn
Soumen Chakrabarti
Learning to Rank
in Vector Spaces and Social Networks
WW 2007 Tutorial
7
Tutorial outline:
Ranking in vector spaces
Instancevis represented by a feature vectorxvRd IDiscriminative max-margin ranking (RankSVM) ILinear-time max-margin approximation IProbabilistic ranking in vector spaces (RankNet) Iabsolute rank and cost of poor rankingsSensitivity to
Soumen Chakrabarti
Learning to Rank in
Vector Spaces and Social Networks
WW 2007 Tutorial
8
Tutorial
outline:
Ranking
in
graphs
Instancevis a node in a graphG= (V,E) IThe graph-Laplacian approach Iscores to nodes to induce ranAssign
I
I
king IGimposes a smoothness constraint on node scores ILarge difference between neighboring node scores penalized The Markov walk approach IRandom surfer, Pagerank and variants; by far most popular way to use graphs for scoring nodes IWalks constrained by preferences IHow to incorporate node, edge types and query words Surprising connections between the two approaches
Soumen Chakrabarti
Learning to Rank in Vector Spaces and Social Networks
WW 2007 Tutorial
9
Tutorial outline:
I
I
I
I
Stability and generalization
Some notes on score- vs. rank-stability Stability and generalization of max-margin ranking in vector spaces Stability and generalization of graph-Laplacian ranking Stability and generalization of Markov walk based ranking
Soumen Chakrabarti
Learning to Rank in Vector Spaces
and Social Networks
WW 2007 Tutorial
10
Preliminaries Motivation Training and evaluation setup Performance measures
I I I
Ranking in vector spaces
I I
Discriminative, max-margin algorithms Probabilistic models, gradient-descent algorithms
Ranking nodes in graphs
I I
Roughness penalty using graph Laplacian Constrained network flows
Stability and generalization
I I
Admissibility and stability Ranking loss and generalization bounds
Soumen Chakrabarti
Learning to Rank in Vector Spaces and Social Networks (WWW 2007 Tutorial)
11
Voir icon more
Alternate Text