Abstract: “Categorization of web-search queries in semantically coherent topics is a crucial task to understand the interest trends of search engine users and, therefore, to provide more intelligent personalization services. Query clustering usually relies on lexical and clickthrough data, while the information originating from the user actions in submitting their queries is currently neglected. In particular, the intent that drives users to submit their requests is an important element for meaningful aggregation of queries. We propose a new intent-centric notion of topical query clusters and we define a query clustering technique that differs from existing algorithms in both methodology and nature of the resulting clusters. Our method extracts topics from the query log by merging missions, i.e., activity fragments that express a coherent user intent, on the basis of their topical affinity. Our approach works in a bottom-up way, without any a-priori knowledge of topical categorization, and produces good quality topics compared to state-of-the-art clustering techniques. It can also summarize topically-coherent missions that occur far away from each other, thus enabling a more compact user profiling on a topical basis. Furthermore, such a topical user profiling discriminates the stream of activity of a particular user from the activity of others, with a potential to predict future user search activity.”
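To make the idea of bottom-up mission merging concrete, here is a minimal Python sketch. It is not the paper's algorithm: the bag-of-words representation, the cosine similarity used as a stand-in for "topical affinity", the `threshold` value, and the names `cosine` and `merge_missions` are all assumptions made for illustration only.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two term-count vectors (assumed affinity measure)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def merge_missions(missions, threshold=0.3):
    """Greedily merge the most similar pair of groups until none reaches the threshold.

    `missions` is a list of missions, each a list of query strings.
    Returns groups of mission indices; each group is one hypothetical "topic".
    """
    # One term-count vector per mission, built from all of its queries.
    topics = [Counter(t for q in m for t in q.lower().split()) for m in missions]
    members = [[i] for i in range(len(missions))]
    while len(topics) > 1:
        best, pair = threshold, None
        for i in range(len(topics)):
            for j in range(i + 1, len(topics)):
                s = cosine(topics[i], topics[j])
                if s >= best:
                    best, pair = s, (i, j)
        if pair is None:
            break  # no remaining pair is similar enough to merge
        i, j = pair
        topics[i] += topics[j]      # pool the term counts of the two groups
        members[i] += members[j]
        del topics[j], members[j]
    return members


# Toy query log: four missions, possibly far apart in time.
missions = [
    ["cheap flights rome", "rome hotels"],
    ["python list sort"],
    ["rome flights deals"],
    ["sort list python reverse"],
]
clusters = merge_missions(missions)
print(clusters)
```

On this toy input the two travel missions and the two programming missions end up grouped together even though they are not adjacent in the log, which is the kind of summarization of distant, topically coherent missions the abstract describes.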
New Paper at CIKM
More from Publications
- “Measuring scientific brain drain with hubs and authorities: A dual perspective” is out!
- “Developing Real Estate Automated Valuation Models by Learning from Heterogeneous Data Sources” is out!
- “PyPlutchik: Visualising and comparing emotion-annotated corpora” is out!
- Our study on “Immigration as a Divisive Topic” is out on Future Internet
- The emergence of structural inequalities from a nation-wide wire transfers network