Paper Digest: SIGIR 2017 Highlights
SIGIR (Annual International ACM SIGIR Conference on Research and Development in Information Retrieval) is one of the top information retrieval conferences in the world.
To help the community quickly catch up on the work presented at this conference, the Paper Digest Team processed all accepted papers and generated one highlight sentence (typically the main topic) for each paper. Readers are encouraged to read these machine-generated highlights / summaries to quickly get the main idea of each paper.
If you do not want to miss any interesting academic paper, you are welcome to sign up for our free daily paper digest service to get updates on new papers published in your area every day. You are also welcome to follow us on Twitter and LinkedIn to keep up with new conference digests.
Paper Digest Team
team@paperdigest.org
TABLE 1: SIGIR 2017 Papers
# | Title | Authors | Highlight |
---|---|---|---|
1 | Forward to the Past: Notes towards a Pre-history of Web Search | Stephen Robertson | Forward to the Past: Notes towards a Pre-history of Web Search |
2 | Mail Search: It’s Getting Personal! | Yoelle Maarek | Failure is evident when we can’t find a message that we remember having read, and this increases our frustration. |
3 | Navigating Imprecision in Relevance Assessments on the Road to Total Recall: Roger and Me | Gordon V. Cormack, Maura R. Grossman | Models are presented that better fit available data than the infallible ground-truth model. |
4 | Meta-evaluation of Online and Offline Web Search Evaluation Metrics | Ye Chen, Ke Zhou, Yiqun Liu, Min Zhang, Shaoping Ma | Offline metrics are usually based on relevance judgments of query-document pairs from assessors while online metrics exploit the user behavior data, such as clicks, collected from search engines to compare search algorithms. |
5 | The Probability that Your Hypothesis Is Correct, Credible Intervals, and Effect Sizes for IR Evaluation | Tetsuya Sakai | For both paired and unpaired tests, we propose that the IR community report the EAP, the credible interval, and the probability of hypothesis being true, not only for the raw difference in means but also for the effect size in terms of Glass’s Δ. |
6 | Can Deep Effectiveness Metrics Be Evaluated Using Shallow Judgment Pools? | Xiaolu Lu, Alistair Moffat, J. Shane Culpepper | Here we study the problem of metric score adjustment, with the goal of accurately estimating system performance when using deep metrics and limited judgment sets, assuming that dynamic score adjustment is required per topic due to the variability in the number of relevant documents. |
7 | Learning to Rank Using Localized Geometric Mean Metrics | Yuxin Su, Irwin King, Michael Lyu | In this paper, we propose a novel Riemannian metric learning algorithm to capture the local structures and develop a robust LtR algorithm. |
8 | End-to-End Neural Ad-hoc Ranking with Kernel Pooling | Chenyan Xiong, Zhuyun Dai, Jamie Callan, Zhiyuan Liu, Russell Power | This paper proposes K-NRM, a kernel based neural model for document ranking. |
9 | Neural Ranking Models with Weak Supervision | Mostafa Dehghani, Hamed Zamani, Aliaksei Severyn, Jaap Kamps, W. Bruce Croft | Hence, in this paper, we propose to train a neural ranking model using weak supervision, where labels are obtained automatically without human annotators or any external resources (e.g., click data). |
10 | Variational Deep Semantic Hashing for Text Documents | Suthee Chaidaroon, Yi Fang | In this paper, we propose a series of novel deep document generative models for text hashing. |
11 | ICE: Item Concept Embedding via Textual Information | Chuan-Ju Wang, Ting-Hsiang Wang, Hsiu-Wei Yang, Bo-Sin Chang, Ming-Feng Tsai | This paper proposes an item concept embedding (ICE) framework to model item concepts via textual information. |
12 | Leveraging Contextual Sentence Relations for Extractive Summarization Using a Neural Attention Model | Pengjie Ren, Zhumin Chen, Zhaochun Ren, Furu Wei, Jun Ma, Maarten de Rijke | We propose a neural network model, Contextual Relation-based Summarization (CRSum), to take advantage of contextual relations among sentences so as to improve the performance of sentence regression. |
13 | Stacking Bagged and Boosted Forests for Effective Automated Classification | Raphael Campos, Sérgio Canuto, Thiago Salles, Clebson C.A. de Sá, Marcos André Gonçalves | We show that BERT is among the top performers in the vast majority of analyzed cases, while retaining the unique benefits of RF classifiers (explainability, parallelization, easiness of parameterization). |
14 | Deep Learning for Extreme Multi-label Text Classification | Jingzhou Liu, Wei-Cheng Chang, Yuexin Wu, Yiming Yang | This paper presents the first attempt at applying deep learning to XMTC, with a family of new Convolutional Neural Network (CNN) models which are tailored for multi-label classification in particular. |
15 | Engaged or Frustrated?: Disambiguating Emotional State in Search | Ashlee Edwards, Diane Kelly | This research examines the differences in the search behaviors and physiologies of people who are engaged or frustrated during search. |
16 | A Study of Snippet Length and Informativeness: Behaviour, Performance and User Experience | David Maxwell, Leif Azzopardi, Yashar Moshfeghi | The design and presentation of a Search Engine Results Page (SERP) has been subject to much research. |
17 | Improving Exploratory Search Experience through Hierarchical Knowledge Graphs | Bahareh Sarrafzadeh, Edward Lank | In this paper, we explore a multi-layer extension to knowledge graphs, hierarchical knowledge graphs (HKGs), that combines hierarchical and network visualizations into a unified data representation. |
18 | Searching on the Go: The Effects of Fragmented Attention on Mobile Web Search Tasks | Morgan Harvey, Matthew Pointon | In this work we conducted a laboratory experiment with both device types in which we simulated everyday, common mobile situations that may cause fragmented attention, impact search performance and affect user perception. |
19 | User Interaction Sequences for Search Satisfaction Prediction | Rishabh Mehrotra, Imed Zitouni, Ahmed Hassan Awadallah, Ahmed El Kholy, Madian Khabsa | In this work, we focus on considering a holistic view of user interaction with the search engine result page (SERP) and construct detailed universal interaction sequences of their activity. |
20 | Multi-site User Behavior Modeling and Its Application in Video Recommendation | Chunfeng Yang, Huan Yan, Donghan Yu, Yong Li, Dah Ming Chiu | In this work, we try to model user preferences in six popular video websites with user viewing records obtained from a large ISP in China. |
21 | Item Silk Road: Recommending Items from Information Domains to Social Users | Xiang Wang, Xiangnan He, Liqiang Nie, Tat-Seng Chua | In this work, we address the problem of cross-domain social recommendation, i.e., recommending relevant items of information domains to potential users of social networks. |
22 | Cross-Domain Recommendation via Clustering on Multi-Layer Graphs | Aleksandr Farseev, Ivan Samborskii, Andrey Filchenkov, Tat-Seng Chua | Taking into account these two aspects, we introduce a novel cross-network collaborative recommendation framework C3R, which utilizes both individual and group knowledge, while being trained on data from multiple social media sources. |
23 | Optimizing Trade-offs Among Stakeholders in Real-Time Bidding by Incorporating Multimedia Metrics | Xiang Chen, Bowei Chen, Mohan Kankanhalli | In this paper, we propose a two-stage computational framework that selects a banner ad based on the optimized trade-offs among all stakeholders. |
24 | A Probabilistic Reformulation of Memory-Based Collaborative Filtering: Implications on Popularity Biases | Rocío Cañamares, Pablo Castells | We develop a probabilistic formulation giving rise to a formal version of heuristic k nearest-neighbor (kNN) collaborative filtering. |
25 | Deep Semantic Hashing with Generative Adversarial Networks | Zhaofan Qiu, Yingwei Pan, Ting Yao, Tao Mei | This paper studies the exploration of generating synthetic data through semi-supervised generative adversarial networks (GANs), which leverages largely unlabeled and limited labeled training data to produce highly compelling data with intrinsic invariance and global coherence, for better understanding statistical structures of natural data. |
26 | Characterizing and Predicting Enterprise Email Reply Behavior | Liu Yang, Susan T. Dumais, Paul N. Bennett, Ahmed Hassan Awadallah | In this paper, we extend previous work on predicting email reply behavior by looking at enterprise settings and considering more than dyadic communications. |
27 | ReAct: Online Multimodal Embedding for Recency-Aware Spatiotemporal Activity Modeling | Chao Zhang, Keyang Zhang, Quan Yuan, Fangbo Tao, Luming Zhang, Tim Hanratty, Jiawei Han | We propose ReAct, a method that processes continuous GTSM streams and obtains recency-aware spatiotemporal activity models on the fly. |
28 | EntiTables: Smart Assistance for Entity-Focused Tables | Shuo Zhang, Krisztian Balog | We introduce and focus on two specific tasks: populating rows with additional instances (entities) and populating columns with new headings. |
29 | Understanding and Modeling Success in Email Search | Jin Young Kim, Nick Craswell, Susan Dumais, Filip Radlinski, Fang Liu | In this study, we built an opt-in client application which monitors a user’s email search activity and then pops up an in-situ survey when a search session is finished. |
30 | Investigating Examination Behavior of Image Search Users | Xiaohui Xie, Yiqun Liu, Xiaochuan Wang, Meng Wang, Zhijing Wu, Yingying Wu, Min Zhang, Shaoping Ma | To shed light on this research question, we conducted an eye-tracking study to investigate users’ examination behavior in image searches. |
31 | Extracting Hierarchies of Search Tasks & Subtasks via a Bayesian Nonparametric Approach | Rishabh Mehrotra, Emine Yilmaz | To this end, we propose an efficient Bayesian nonparametric model for extracting hierarchies of such tasks & subtasks. |
32 | Using Information Scent to Understand Mobile and Desktop Web Search Behavior | Kevin Ong, Kalervo Järvelin, Mark Sanderson, Falk Scholer | This paper investigates if Information Foraging Theory can be used to understand differences in user behavior when searching on mobile and desktop web search systems. |
33 | Deep Character-Level Click-Through Rate Prediction for Sponsored Search | Bora Edizel, Amin Mantrach, Xiao Bai | In this paper, we propose two novel approaches (one working at character level and the other working at word level) that use deep convolutional neural networks to predict the click-through rate of a query-advertisement pair. |
34 | Personalized Key Frame Recommendation | Xu Chen, Yongfeng Zhang, Qingyao Ai, Hongteng Xu, Junchi Yan, Zheng Qin | In this paper, we propose and investigate the problem of personalized key frame recommendation to bridge the above gap. |
35 | Personalized Itinerary Recommendation with Queuing Time Awareness | Kwan Hui Lim, Jeffrey Chan, Shanika Karunasekera, Christopher Leckie | To solve these challenges, we propose the PersQ algorithm for recommending personalized itineraries that take into consideration attraction popularity, user interests and queuing times. |
36 | Attentive Collaborative Filtering: Multimedia Recommendation with Item- and Component-Level Attention | Jingyuan Chen, Hanwang Zhang, Xiangnan He, Liqiang Nie, Wei Liu, Tat-Seng Chua | In this paper, we introduce a novel attention mechanism in CF to address the challenging item- and component-level implicit feedback in multimedia recommendation, dubbed Attentive Collaborative Filtering (ACF). |
37 | Neural Rating Regression with Abstractive Tips Generation for Recommendation | Piji Li, Zihao Wang, Zhaochun Ren, Lidong Bing, Wai Lam | We propose a deep learning based framework named NRT which can simultaneously predict precise ratings and generate abstractive tips with good linguistic quality simulating user experience and feelings. |
38 | Neural Factorization Machines for Sparse Predictive Analytics | Xiangnan He, Tat-Seng Chua | In this paper, we propose a novel model Neural Factorization Machine (NFM) for prediction under sparse settings. |
39 | Accounting for the Correspondence in Commented Data | Renqin Cai, Chi Wang, Hongning Wang | In this work, we develop a Commented Correspondence Topic Model to model correspondence in commented text data. |
40 | Jointly Learning Word Embeddings and Latent Topics | Bei Shi, Wai Lam, Shoaib Jameel, Steven Schockaert, Kwun Ping Lai | In this paper, we propose STE, a framework which can learn word embeddings and latent topics in a unified manner. |
41 | On the Power Laws of Language: Word Frequency Distributions | Flavio Chierichetti, Ravi Kumar, Bo Pang | A simple generative model is proposed to capture this phenomenon. |
42 | Retrieval Consistency in the Presence of Query Variations | Peter Bailey, Alistair Moffat, Falk Scholer, Paul Thomas | In this paper we examine the retrieval consistency of a set of five systems responding to syntactic query variations over one hundred topics, working with the UQV100 test collection, and using Rank-Biased Overlap (RBO) relative to a centroid ranking over the query variations per topic as a measure of consistency. |
43 | Comparing In Situ and Multidimensional Relevance Judgments | Jiepu Jiang, Daqing He, James Allan | The second one collects multidimensional assessments to complement relevance or usefulness judgments, with four distinct alternative aspects examined in this paper – novelty, understandability, reliability, and effort. |
44 | Online In-Situ Interleaved Evaluation of Real-Time Push Notification Systems | Adam Roegiest, Luchen Tan, Jimmy Lin | We describe a user study of such systems in the context of the TREC 2016 Real-Time Summarization Track, where system updates are immediately delivered as push notifications to the mobile devices of a cohort of users. |
45 | Evaluating Web Search with a Bejeweled Player Model | Fan Zhang, Yiqun Liu, Xin Li, Min Zhang, Yinghui Xu, Shaoping Ma | Inspired by a popular computer game named Bejeweled, we propose a Bejeweled Player Model (BPM) to simulate users’ search interaction processes and evaluate their search performances. |
46 | Evaluating Mobile Search with Height-Biased Gain | Cheng Luo, Yiqun Liu, Tetsuya Sakai, Fan Zhang, Min Zhang, Shaoping Ma | Based on these findings, we propose a new evaluation metric, Height-Biased Gain, which is calculated by summing up the product of gain distribution and discount factors that are both modeled in terms of result height. |
47 | Efficient Cost-Aware Cascade Ranking in Multi-Stage Retrieval | Ruey-Cheng Chen, Luke Gallagher, Roi Blanco, J. Shane Culpepper | In this paper, we re-examine the importance of tightly integrating feature costs into multi-stage learning-to-rank (LTR) IR systems. |
48 | Computational Social Indicators: A Case Study of Chinese University Ranking | Fuli Feng, Liqiang Nie, Xiang Wang, Richang Hong, Tat-Seng Chua | Towards this end, we present a novel graph-based multi-channel ranking scheme for social indicator computation by exploring the rich multi-channel Web data. |
49 | Information Retrieval Meets Game Theory: The Ranking Competition Between Documents’ Authors | Nimrod Raifer, Fiana Raiber, Moshe Tennenholtz, Oren Kurland | We present a novel theoretical and empirical analysis of the strategic behavior of publishers using these foundations. |
50 | On Application of Learning to Rank for E-Commerce Search | Shubhra Kanti Karmaker Santu, Parikshit Sondhi, ChengXiang Zhai | In this paper, we discuss the practical challenges in applying learning to rank methods to E-Com search, including the challenges in feature representation, obtaining reliable relevance judgments, and optimally exploiting multiple user feedback signals such as click rates, add-to-cart ratios, order rates, and revenue. |
51 | Intent-Aware Semantic Query Annotation | Rafael Glater, Rodrygo L.T. Santos, Nivio Ziviani | In this paper, we propose a framework for learning semantic query annotations suitable to the target intent of each individual query. |
52 | Efficient & Effective Selective Query Rewriting with Efficiency Predictions | Craig Macdonald, Nicola Tonellotto, Iadh Ounis | In this paper, we propose a novel framework for using the predicted execution time of various query rewritings to select between alternatives on a per-query basis, in a manner that ensures both effectiveness and efficiency. |
53 | Relevance-based Word Embedding | Hamed Zamani, W. Bruce Croft | In this paper, we propose two learning models with different objective functions; one learns a relevance distribution over the vocabulary set for each query, and the other classifies each term as belonging to the relevant or non-relevant class for each query. |
54 | IRGAN: A Minimax Game for Unifying Generative and Discriminative Information Retrieval Models | Jun Wang, Lantao Yu, Weinan Zhang, Yu Gong, Yinghui Xu, Benyou Wang, Peng Zhang, Dell Zhang | We propose a game theoretical minimax game to iteratively optimise both models. |
55 | Personalized PageRank in Uncertain Graphs with Mutually Exclusive Edges | Jung Hyun Kim, Mao-Lin Li, K. Selçuk Candan, Maria Luisa Sapino | To tackle this challenge, in this paper, we propose an efficient Uncertain Personalized PageRank (UPPR) algorithm to approximately compute personalized PageRank values on an uncertain graph with edge uncertainties. |
56 | Adapting Markov Decision Process for Search Result Diversification | Long Xia, Jun Xu, Yanyan Lan, Jiafeng Guo, Wei Zeng, Xueqi Cheng | In this paper we address the issue of learning diverse ranking models for search result diversification. |
57 | Learning to Diversify Search Results via Subtopic Attention | Zhengbao Jiang, Ji-Rong Wen, Zhicheng Dou, Wayne Xin Zhao, Jian-Yun Nie, Ming Yue | In this paper, we propose a learning framework for explicit result diversification where subtopics are explicitly modeled. |
58 | Retrieval Algorithms Optimized for Human Learning | Rohail Syed, Kevyn Collins-Thompson | We address this problem by introducing a novel theoretical framework, algorithms, and empirical analysis of an information retrieval model that is optimized for learning outcomes instead of generic relevance. |
59 | Content Recommendation for Viral Social Influence | Sergei Ivanov, Konstantinos Theocharidis, Manolis Terrovitis, Panagiotis Karras | In this paper, we address the natural problem that arises in such circumstances: Suggest content, expressed as a limited set of attributes, for a creative promotion campaign that starts out from a given seed set of initiators, so as to maximize its expected spread over a social network. |
60 | Exploiting Food Choice Biases for Healthier Recipe Recommendation | David Elsweiler, Christoph Trattner, Morgan Harvey | In this paper, using insights gained from various data sources, we explore the feasibility of substituting meals that would typically be recommended to users with similar, healthier dishes. |
61 | Embedding Factorization Models for Jointly Recommending Items and User Generated Lists | Da Cao, Liqiang Nie, Xiangnan He, Xiaochi Wei, Shunzhi Zhu, Tat-Seng Chua | Specifically, we employ a factorization model to capture users’ preferences over items and lists, and utilize embedding-based models to discover the co-occurrence information among items and lists. |
62 | Classification by Retrieval: Binarizing Data and Classifiers | Fumin Shen, Yadong Mu, Yang Yang, Wei Liu, Li Liu, Jingkuan Song, Heng Tao Shen | This paper proposes a generic formulation that significantly expedites the training and deployment of image classification models, particularly under the scenarios of many image categories and high feature dimensions. |
63 | BitFunnel: Revisiting Signatures for Search | Bob Goodwin, Michael Hopcroft, Dan Luu, Alex Clemmer, Mihaela Curmei, Sameh Elnikety, Yuxiong He | This paper describes algorithmic innovations and changes in the cloud computing landscape that led us to reconsider and eventually field a technology that was once considered unusable. |
64 | Efficient Data Structures for Massive N-Gram Datasets | Giulio Ermanno Pibiri, Rossano Venturini | In this paper we study the problem of reducing the space required by the representation of such datasets, maintaining the capability of looking up a given N-gram within microseconds. |
65 | Faster BlockMax WAND with Variable-sized Blocks | Antonio Mallia, Giuseppe Ottaviano, Elia Porciani, Nicola Tonellotto, Rossano Venturini | We introduce a refinement for BlockMaxWAND that uses variable-sized blocks, rather than constant-sized. |
66 | LoSHa: A General Framework for Scalable Locality Sensitive Hashing | Jinfeng Li, James Cheng, Fan Yang, Yuzhen Huang, Yunjian Zhao, Xiao Yan, Ruihao Zhao | We propose LoSHa, a distributed computing framework that reduces the development cost by designing a tailor-made, general programming interface and achieves high efficiency by exploring LSH-specific system implementation and optimizations. |
67 | Learning a Hierarchical Embedding Model for Personalized Product Search | Qingyao Ai, Yongfeng Zhang, Keping Bi, Xu Chen, W. Bruce Croft | In this paper, we propose a hierarchical embedding model to learn semantic representations for entities (i.e. words, products, users and queries) from different levels with their associated language data. Following the methodology of previous studies, we constructed personalized product search benchmarks with Amazon product data. |
68 | Exploring User-Specific Information in Music Retrieval | Zhiyong Cheng, Jialie Shen, Liqiang Nie, Tat-Seng Chua, Mohan Kankanhalli | In this paper, we propose a novel model, named User-Information-Aware Music Interest Topic (UIA-MIT) model. |
69 | The Utility and Privacy Effects of a Click | Rachid Guerraoui, Anne-Marie Kermarrec, Mahsa Taziki | In this paper, for the first time, we propose a way to quantify the exact utility and privacy effects of each user click. |
70 | Privacy through Solidarity: A User-Utility-Preserving Framework to Counter Profiling | Asia J. Biega, Rishiraj Saha Roy, Gerhard Weikum | In this work, we propose a framework which leverages solidarity in a large community to scramble user interaction histories. |
71 | Joint Learning of Response Ranking and Next Utterance Suggestion in Human-Computer Conversation System | Rui Yan, Dongyan Zhao, Weinan E. | In this paper, we propose a new task for conversation systems: joint learning of response ranking featured with next utterance suggestion. |
72 | Learning to Rank Question Answer Pairs with Holographic Dual LSTM Architecture | Yi Tay, Minh C. Phan, Luu Anh Tuan, Siu Cheung Hui | We describe a new deep learning architecture for learning to rank question answer pairs. |
73 | Incomplete Follow-up Question Resolution using Retrieval based Sequence to Sequence Learning | Vineet Kumar, Sachindra Joshi | In this work, we present a retrieval based sequence to sequence learning system that can generate the complete (or intended) question for an incomplete follow-up question (given the conversation context). |
74 | Modelling Information Needs in Collaborative Search Conversations | Sosuke Shiga, Hideo Joho, Roi Blanco, Johanne R. Trippas, Mark Sanderson | The contribution of this work is three-fold. |
75 | Predicting Which Topics You Will Join in the Future on Social Media | Haoran Huang, Qi Zhang, Jindou Wu, Xuanjing Huang | In this study, we investigate the problem of predicting whether a user will join a topic based on his posting history. To train and evaluate the proposed method, we collected a large-scale dataset from Twitter. |
76 | What Are You Known For?: Learning User Topical Profiles with Implicit and Explicit Footprints | Cheng Cao, Hancheng Ge, Haokai Lu, Xia Hu, James Caverlee | In this paper, we propose a unified model for learning user topical profiles that simultaneously considers multiple footprints. |
77 | Hierarchical Community-Level Information Diffusion Modeling in Social Networks | Yuan Zhang, Tianshu Lyu, Yan Zhang | In this paper, we propose a Hierarchical Community-level Information Diffusion (HCID) model to capture the information diffusion process in social networks. |
78 | Word-Entity Duet Representations for Document Ranking | Chenyan Xiong, Jamie Callan, Tie-Yan Liu | This paper presents a word-entity duet framework for utilizing knowledge bases in ad-hoc retrieval. |
79 | Dynamic Factual Summaries for Entity Cards | Faegheh Hasibi, Krisztian Balog, Svein Erik Bratsberg | In this paper, we make the first effort towards generating and evaluating such factual summaries. We introduce and address the novel problem of dynamic entity summarization for entity cards, and break it down to two specific subtasks: fact ranking and summary generation. |
80 | MEmbER: Max-Margin Based Embeddings for Entity Retrieval | Shoaib Jameel, Zied Bouraoui, Steven Schockaert | We propose a new class of methods for learning vector space embeddings of entities. |
81 | On the Reusability of "Living Labs" Test Collections: A Case Study of Real-Time Summarization | Luchen Tan, Gaurav Baruah, Jimmy Lin | In this paper, we performed a "leave-one-out" analysis of human judgment data derived from the TREC 2016 Real-Time Summarization Track and show that those judgments do not appear to be reusable. |
82 | Automatically Extracting High-Quality Negative Examples for Answer Selection in Question Answering | Haotian Zhang, Jinfeng Rao, Jimmy Lin, Mark D. Smucker | We propose a heuristic called "one answer per document" for automatically extracting high-quality negative examples for answer selection in question answering. |
83 | Leveraging Cross-Network Information for Graph Sparsification in Influence Maximization | Xiao Shen, Fu-lai Chung, Sitong Mao | In this work, a Cross-Network Graph Sparsification (CNGS) model is proposed to leverage the influence backbone knowledge pre-detected in a source network to predict and remove the edges least likely to contribute to the influence propagation in the target networks. |
84 | Combining Top-N Recommenders with Metasearch Algorithms | Daniel Valcarce, Javier Parapar, Álvaro Barreiro | In this paper, we explore methods for combining multiple recommendation approaches. |
85 | Counter Deanonymization Query: H-index Based | Jianliang Gao, Bo Song, Zheng Chen, Weimao Ke, Wanying Ding, Xiaohua Hu | In this paper, we propose a novel k-anonymization scheme to counter deanonymization queries on social networks. |
86 | SPOT: Selecting occuPations frOm Trajectories | Peipei Li, Junjie Yao, Liping Wang, Xuemin Lin | This paper proposes a novel approach, i.e., SPOT (Selecting occuPation frOm Trajectories). |
87 | Personalized Query Suggestion Diversification | Wanyu Chen, Fei Cai, Honghui Chen, Maarten de Rijke | We propose a personalized query suggestion diversification model (PQSD), where a user’s long-term search behavior is injected into a basic greedy query suggestion diversification model (G-QSD) that considers a user’s search context in their current session. |
88 | Weighted Domain Translation for Online News Comments Emotion Tagging | Ying Zhang, Li Yu, Xue Zhao, Xiaojie Yuan, Lei Xu | In this paper, we accomplish cross-domain emotion tagging based on an advanced neural network BLSTM (bidirectional long short-term memory) with "domain translation", which can overcome the difference between domains. |
89 | From Footprint to Friendship: Modeling User Followership in Mobile Social Networks from Check-in Data | Cheng Wang, Jieren Zhou, Bo Yang | In this paper we aim at addressing the correlation between two critical factors in mobile social networks (MSNs): the social-relationship networking among users and the spatial mobility pattern of users. |
90 | Video Question Answering via Attribute-Augmented Attention Network Learning | Yunan Ye, Zhou Zhao, Yimeng Li, Long Chen, Jun Xiao, Yueting Zhuang | In this paper, we study the problem of video question answering by modeling its temporal dynamics with frame-level attention mechanism. |
91 | Learning Max-Margin GeoSocial Multimedia Network Representations for Point-of-Interest Suggestion | Zhou Zhao, Qifan Yang, Hanqing Lu, Min Yang, Jun Xiao, Fei Wu, Yueting Zhuang | In this paper, we consider the problem of POI suggestion from the viewpoint of learning geosocial multimedia network representations. |
92 | Learning To Rank Resources | Zhuyun Dai, Yubin Kim, Jamie Callan | We present a learning-to-rank approach for resource selection. |
93 | DeepStyle: Learning User Preferences for Visual Recommendation | Qiang Liu, Shu Wu, Liang Wang | Accordingly, we propose a DeepStyle method for learning style features of items and sensing preferences of users. |
94 | Target Type Identification for Entity-Bearing Queries | Darío Garigliotti, Faegheh Hasibi, Krisztian Balog | In this work, we address the problem of automatically detecting the target types of a query with respect to a type taxonomy. |
95 | Query Expansion for Email Search | Saar Kuzi, David Carmel, Alex Libov, Ariel Raviv | Three state-of-the-art expansion methods are examined: 1) a global translation-based expansion model; 2) a personalized-based word embedding model; 3) the classical pseudo-relevance-feedback model. |
96 | Generating Clinical Queries from Patient Narratives: A Comparison between Machines and Humans | Bevan Koopman, Liam Cripwell, Guido Zuccon | This paper investigates how automated query generation methods can be used to derive effective ad-hoc queries from verbose patient narratives. |
97 | Centered kNN Graph for Semi-Supervised Learning | Ikumi Suzuki, Kazuo Hara | So we present a new graph construction method, the centered kNN graph, which not only reduces hub nodes but also avoids the over sparsification problem. |
98 | Event Recommendation based on Graph Random Walking and History Preference Reranking | Shenghao Liu, Bang Wang, Minghua Xu | In this paper, we study how to exploit diverse relations in an EBSN as well as individual history preferences to recommend preferred events. |
99 | An Extended Relevance Model for Session Search | Nir Levine, Haggai Roitman, Doron Cohen | We propose an extended relevance model that captures the user’s dynamic information need in the session. |
100 | An Enhanced Approach to Query Performance Prediction Using Reference Lists | Haggai Roitman | In this work, we try to fill the gaps. |
101 | Autonomous Crowdsourcing through Human-Machine Collaborative Learning | Azad Abad, Moin Nabi, Alessandro Moschitti | In this paper, we introduce a general iterative human-machine collaborative method for training crowdsource workers: the classifier (i.e., the machine) selects the highest quality examples for training the crowdsource workers (i.e., the humans). |
102 | Game State Retrieval with Keyword Queries | Atsushi Ushiku, Shinsuke Mori, Hirotaka Kameko, Yoshimasa Tsuruoka | In this work, we propose a search system that allows users to retrieve game states from a game record database by using keywords. |
103 | Improving Search Engines via Large-Scale Physiological Sensing | Ryen W. White, Ryan Ma | In this paper, we focus on heart rate and show that there are strong relationships between heart rate and various measures of user interest in a search result. |
104 | A Poisson Regression Method for Top-N Recommendation | Jiajin Huang, Jian Wang, Ning Zhong | As a user’s decision may be affected by correlations among items, we incorporate such correlations with the user and item latent factors to propose a Poisson-regression-based method for top-N recommendation tasks. |
105 | A Hierarchical Multimodal Attention-based Neural Network for Image Captioning | Yong Cheng, Fei Huang, Lian Zhou, Cheng Jin, Yuejie Zhang, Tao Zhang | The main contribution of our work is that the hierarchical structure and multimodal attention mechanism are both applied, thus each caption word can be generated with the multimodal attention on the intermediate semantic objects and the global visual content. |
106 | Unifying Multi-Source Social Media Data for Personalized Travel Route Planning | Gang Hu, Jie Shao, Fumin Shen, Zi Huang, Heng Tao Shen | This paper presents an approach which mines the user interest model by multi-source social media (e.g., travelogues and check-in records), and understands the user’s real intention by active behavior such as point of interest (POI) inputs. |
107 | LiveMaps: Converting Map Images into Interactive Maps | Michael R. Evans, Dragomir Yankov, Pavel Berkhin, Pavel Yudin, Florin Teodorescu, Wei Wu | In this paper, we describe a novel system, LiveMaps, for analyzing and retrieving an appropriate map viewport for a given image of a map. |
108 | Sub-corpora Impact on System Effectiveness | Nicola Ferro, Mark Sanderson | We found that sub-corpora have a significant effect. |
109 | Automatic and Semi-Automatic Document Selection for Technology-Assisted Review | Maura R. Grossman, Gordon V. Cormack, Adam Roegiest | In this work, we investigate the extent to which the observed effectiveness of the different methods may be confounded by chance, by inconsistent adherence to the Track guidelines, by selection bias in the evaluation method, or by discordant relevance assessments. |
110 | Recommending Complementary Products in E-Commerce Push Notifications with a Mixture Model Approach | Huasha Zhao, Luo Si, Xiaogang Li, Qiong Zhang | This paper proposes a mixture model approach for predicting push message open rate for a post-purchase complementary product recommendation task. |
111 | Neural Network based Reinforcement Learning for Real-time Pushing on Text Stream | Haihui Tan, Ziyu Lu, Wenjie Li | In this paper, we formulate real-time pushing on a text stream as a sequential decision-making problem and propose a Neural Network based Reinforcement Learning (NNRL) algorithm for real-time decision making, e.g., push or skip the incoming text, considering both history dependencies and future uncertainty. |
112 | Joint Latent Subspace Learning and Regression for Cross-Modal Retrieval | Jianlong Wu, Zhouchen Lin, Hongbin Zha | We conduct extensive experiments on multiple benchmark datasets and our proposed method outperforms the state-of-the-art approaches. |
113 | Label Aggregation for Crowdsourcing with Bi-Layer Clustering | Jing Zhang, Victor S. Sheng, Tao Li | This paper proposes a novel general label aggregation method for both binary and multi-class labeling in crowdsourcing, namely Bi-Layer Clustering (BLC), which clusters two layers of features – the conceptual-level and the physical-level features – to infer true labels of instances. |
114 | Effective Music Feature NCP: Enhancing Cover Song Recognition with Music Transcription | Yao Cheng, Xiaoou Chen, Deshun Yang, Xiaoshuo Xu | In this paper, we propose a similar but more effective feature, Note Class Profile (NCP), derived with music transcription techniques. |
115 | Deep Multimodal Embedding Model for Fine-grained Sketch-based Image Retrieval | Fei Huang, Yong Cheng, Cheng Jin, Yuejie Zhang, Tao Zhang | In this paper, we consider Fine-grained SBIR as a cross-modal retrieval problem and propose a deep multimodal embedding model that exploits all the beneficial multimodal information sources in sketches and images. |
116 | Cross-Device User Linking: URL, Session, Visiting Time, and Device-log Embedding | Minh C. Phan, Aixin Sun, Yi Tay | In this paper, we present insightful analysis on the dataset and propose a solution to link users based on their visited URLs, visiting time, and profile embeddings. |
117 | Mailbox-Based vs. Log-Based Query Completion for Mail Search | Michal Horovitz, Liane Lewin-Eytan, Alex Libov, Yoelle Maarek, Ariel Raviv | We therefore propose here to leverage the mailbox content in order to generate suggestions, taking advantage of mail-specific features. |
118 | The Impact of Linkage Methods in Hierarchical Clustering for Active Learning to Rank | Ziming Li, Maarten de Rijke | We propose a sampling method which selects a set of instances and labels the full set only once before training the ranking model. |
119 | Reinforcement Learning to Rank with Markov Decision Process | Zeng Wei, Jun Xu, Yanyan Lan, Jiafeng Guo, Xueqi Cheng | In this paper, we propose a novel learning to rank model on the basis of Markov decision process (MDP), referred to as MDPRank. |
120 | A Comparative Live Evaluation of Multileaving Methods on a Commercial cQA Search | Tomohiro Manabe, Akiomi Nishida, Makoto P. Kato, Takehiro Yamamoto, Sumio Fujita | We present one of the world’s first attempts to examine the feasibility of multileaving evaluation of document rankings on a large scale commercial community Question Answering (cQA) service. |
121 | Support for Interactive Identification of Mentioned Entities in Conversational Speech | Ning Gao, Douglas W. Oard, Mark Dredze | This paper describes our initial experiments with a freely available collection of Enron telephone conversations. |
122 | AutoSVD++: An Efficient Hybrid Collaborative Filtering Model via Contractive Auto-encoders | Shuai Zhang, Lina Yao, Xiwei Xu | In this paper, we propose a new hybrid model by generalizing the contractive auto-encoder paradigm into the matrix factorization framework with good scalability and computational efficiency, which jointly models content information as representations of effectiveness and compactness, and leverages implicit user feedback to make accurate recommendations. |
123 | Unsupervised Query-Focused Multi-Document Summarization using the Cross Entropy Method | Guy Feigenblat, Haggai Roitman, Odellia Boni, David Konopnicki | We present a novel unsupervised query-focused multi-document summarization approach. |
124 | Translation of Natural Language Query Into Keyword Query Using a RNN Encoder-Decoder | Hyun-Je Song, A-Yeong Kim, Seong-Bae Park | This paper proposes a novel method to translate a natural language query into a keyword query relevant to the natural language query for retrieving better search results without change of the engines. |
125 | Predictive Network Representation Learning for Link Prediction | Zhitao Wang, Chengyao Chen, Wenjie Li | In this paper, we propose a predictive network representation learning (PNRL) model to solve the structural link prediction problem. |
126 | Sentence-level Sentiment Classification with Weak Supervision | Fangzhao Wu, Jia Zhang, Zhigang Yuan, Sixing Wu, Yongfeng Huang, Jun Yan | In this paper, we propose an approach for sentence-level sentiment classification without the need of sentence labels. |
127 | Predicting Session Length in Media Streaming | Theodore Vasiloudis, Hossein Vahabi, Ross Kravitz, Valery Rashkov | In this work we present the first analysis of session length in a mobile-focused online service, using a real world data-set from a major music streaming service. |
128 | Exploring the Query Halo Effect in Site Search: Leading People to Longer Queries | Djoerd Hiemstra, Claudia Hauff, Leif Azzopardi | In this paper, we test whether a similar increase is observed when the same component is deployed in a production system for site search and used by real end users. |
129 | Top-N Recommendation with High-Dimensional Side Information via Locality Preserving Projection | Yifan Chen, Xiang Zhao, Maarten de Rijke | In this paper, we leverage high-dimensional side information to enhance top-N recommendations. |
130 | Applying Information Extraction for Patent Structure Analysis | Masayuki Okamoto, Zifei Shan, Ryohei Orihara | In this paper, we propose an information-extraction-based technique to grasp the patent claim structure. |
131 | Enhancing Recurrent Neural Networks with Positional Attention for Question Answering | Qin Chen, Qinmin Hu, Jimmy Xiangji Huang, Liang He, Weijie An | Based on this assumption, we propose a positional attention based RNN model, which incorporates the positional context of the question words into the answers’ attentive representations. |
132 | Event Early Embedding: Predicting Event Volume Dynamics at Early Stage | Zhiwei Liu, Yang Yang, Zi Huang, Fumin Shen, Dongxiang Zhang, Heng Tao Shen | In order to overcome these two problems, in this paper, we design an event early embedding model (EEEM) that can 1) extract social events from noise, 2) find the previous similar events, and 3) predict future dynamics of a new event. |
133 | POI Popularity Prediction via Hierarchical Fusion of Multiple Social Clues | Yaqian Duan, Xinze Wang, Yang Yang, Zi Huang, Ning Xie, Heng Tao Shen | In this paper, we propose a novel approach, termed Hierarchical Multi-Clue Fusion (HMCF), for predicting the popularity of POIs. We collect a multi-source POI dataset from four widely-used tourism platforms. |
134 | Multitask Learning for Fine-Grained Twitter Sentiment Analysis | Georgios Balikas, Simon Moura, Massih-Reza Amini | We argue that such classification tasks are correlated and we propose a multitask approach based on a recurrent neural network that benefits by jointly learning them. |
135 | Evolution of Information Needs based on Life Event Experiences with Topic Transition | Naoto Takeda, Yohei Seki, Mimpei Morishita, Yoichi Inagaki | We propose a method to clarify the evolution of users’ information needs related to a user’s interests and actions based upon life events such as "childbirth." |
136 | Distribution-oriented Aesthetics Assessment for Image Search | Chaoran Cui, Huidi Fang, Xiang Deng, Xiushan Nie, Hongshuai Dai, Yilong Yin | In this paper, distinguished from existing studies relying on a single label, we propose to quantify the image aesthetics by a distribution over quality levels. |
137 | On the Benefit of Incorporating External Features in a Neural Architecture for Answer Sentence Selection | Ruey-Cheng Chen, Evi Yulianti, Mark Sanderson, W. Bruce Croft | Incorporating conventional, unsupervised features into a neural architecture has the potential to improve modeling effectiveness, but this aspect is often overlooked in the research of deep learning models for information retrieval. |
138 | Personalized Response Generation via Domain adaptation | Min Yang, Zhou Zhao, Wei Zhao, Xiaojun Chen, Jia Zhu, Lianqiang Zhou, Zigang Cao | In this paper, we propose a novel personalized response generation model via domain adaptation (PRG-DM). |
139 | An Accurate, Efficient, and Scalable Approach to Channel Matching in Smart TVs | Jiwon Hong, Sang-Wook Kim, Mina Rho, YoonHee Choi, Yoonsik Tak | In this paper, we introduce our TV channel matching system that resolves such problems. |
140 | Top-K Influential Nodes in Social Networks: A Game Perspective | Yu Zhang, Yan Zhang | In this paper, we study influence maximization from a game perspective. |
141 | Online Learning to Rank for Cross-Language Information Retrieval | Razieh Rahimi, Azadeh Shakery | In this work, we present the first empirical study of optimizing a model for Cross-Language Information Retrieval (CLIR) based on implicit feedback inferred from user interactions. |
142 | Mining Business Opportunities from Location-based Social Networks | Shenglin Zhao, Irwin King, Michael R. Lyu, Jia Zeng, Mingxuan Yuan | In this paper, we take this challenge and define the business opportunity mining problem, which recommends new business categories at a partitioned business district. |
143 | On Including the User Dynamic in Learning to Rank | Nicola Ferro, Claudio Lucchese, Maria Maistro, Raffaele Perego | In this paper, we explore the possibility of integrating the user dynamic directly into the LtR algorithms. |
144 | Document Expansion Using External Collections | Garrick Sherman, Miles Efron | Our experiments demonstrate that the proposed model improves ad-hoc document retrieval effectiveness on a variety of corpus types, with a particular benefit on more heterogeneous collections of documents. |
145 | Detecting Positive Medical History Mentions | Bing Bai, Pierre-Francois Laquerre, Richard Jackson, Robert Stewart | In this work we designed a scheme to automatically extract the medical history of patients from a large healthcare database. |
146 | Emotional Social Signals for Search Ranking | Ismail Badache, Mohand Boughanem | Our objective in this paper is to study the impact of the new social signals, called Facebook reactions (love, haha, angry, wow, sad) in the retrieval. |
147 | Skill Translation Models in Expert Finding | Arash Dargahi Nobari, Sajad Sotudeh Gharebagh, Mahmood Neshati | In this paper, we propose two translation models to augment a given query with relevant words. |
148 | A Metric for Sentence Ordering Assessment Based on Topic-Comment Structure | Liana Ermakova, Josiane Mothe, Anton Firsov | In contrast to that, we present a self-sufficient metric for SO assessment based on text topic-comment structure. |
149 | Gaussian Embeddings for Collaborative Filtering | Ludovic Dos Santos, Benjamin Piwowarski, Patrick Gallinari | In this paper, we leverage recent works in learning Gaussian embeddings for the recommendation task. |
150 | Detecting Controversies in Online News Media | Kaspar Beelen, Evangelos Kanoulas, Bob van de Velde | This paper sets out to detect controversial news reports using online discussions as a source of information. |
151 | Generating and Personalizing Bundle Recommendations on Steam | Apurva Pathak, Kshitiz Gupta, Julian McAuley | In this paper, we seek to understand the semantics of what constitutes a ‘good’ bundle, in order to recommend existing bundles to users on the basis of their constituent products, as well as the more difficult task of generating new bundles that are personalized to a user. To do so we collect a new dataset from the Steam video game distribution platform, which is unique in that it contains both ‘traditional’ recommendation data (rating and purchase histories between users and items), as well as bundle purchase information. |
152 | X-DART: Blending Dropout and Pruning for Efficient Learning to Rank | Claudio Lucchese, Franco Maria Nardini, Salvatore Orlando, Raffaele Perego, Salvatore Trani | In this paper we propose X-DART, a new Learning to Rank algorithm focusing on the training of robust and compact ranking models. |
153 | Non-negative Matrix Factorization Meets Word Embedding | Melissa Ailem, Aghiles Salah, Mohamed Nadif | In this paper, we aim to address the above issue and propose a new model which successfully integrates a word embedding model, word2vec, into an NMF framework so as to leverage the semantic relationships between words. |
154 | Term Proximity Constraints for Pseudo-Relevance Feedback | Ali Montazeralghaem, Hamed Zamani, Azadeh Shakery | Previous work has introduced a set of constraints (axioms) that should be satisfied by any PRF model. In this paper, we propose three additional constraints based on the proximity of feedback terms to the query terms in the feedback documents. |
155 | Gauging the Quality of Relevance Assessments using Inter-Rater Agreement | Tadele T. Damessie, Thao P. Nghiem, Falk Scholer, J. Shane Culpepper | In this work, we directly compare the reliability of judgments using three different types of bronze assessor groups. |
156 | Neural Citation Network for Context-Aware Citation Recommendation | Travis Ebesu, Yi Fang | We propose a flexible encoder-decoder architecture called Neural Citation Network (NCN), embodying a robust representation of the citation context with a max time delay neural network, further augmented with an attention mechanism and author networks. |
157 | A Study of SVM Kernel Functions for Sensitivity Classification Ensembles with POS Sequences | Graham McDonald, Nicolás García-Pedrajas, Craig Macdonald, Iadh Ounis | Therefore, in this work, we present an evaluation of five SVM kernel functions for sensitivity classification using POS sequences. |
158 | Entity Set Expansion via Knowledge Graphs | Xiangling Zhang, Yueguo Chen, Jun Chen, Xiaoyong Du, Ke Wang, Ji-Rong Wen | We propose a novel approach to solve the problem using knowledge graphs, by considering the deficiency (e.g., incompleteness) of knowledge graphs. |
159 | Word Embedding Causes Topic Shifting; Exploit Global Context! | Navid Rekabsaz, Mihai Lupu, Allan Hanbury, Hamed Zamani | To address this issue, we revisit the use of global context (i.e. the term co-occurrence in documents) to measure the term relatedness. |
160 | Building Your Own Reading List Anytime via Embedding Relevance, Quality, Timeliness and Diversity | Bo-Wen Zhang, Xu-Cheng Yin, Fang Zhou, Jian-Lin Jin | In this paper, we propose a searching framework for building a topical reading list anytime, where the Relevance (between topics and books), Quality (of books), Timeliness (of popularities) and Diversity (of results) are embedded into vector representations respectively based on user-generated contents and statistics on social media. |
161 | ENCORE: External Neural Constraints Regularized Distant Supervision for Relation Extraction | Siliang Tang, Jinjian Zhang, Ning Zhang, Fei Wu, Jun Xiao, Yueting Zhuang | In this paper, we proposed a novel neural framework, named ENCORE (External Neural COnstraints REgularized distant supervision), which allows an integration of other information for standard DS through regularizations under multiple external neural networks. |
162 | Information Retrieval Model using Generalized Pareto Distribution and Its Application to Instance Search | Masaya Murata, Kaoru Hiramatsu, Shin’ichi Satoh | We adopt the generalized Pareto distribution for the information-based model and show that the parameters can be estimated based on the mean excess function. |
163 | Predicting Information Seeking Intentions from Search Behaviors | Matthew Mitsui, Jiqun Liu, Nicholas J. Belkin, Chirag Shah | We present results of a study of 40 participants, each working on two different journalism tasks, which investigated how their search behaviors could indicate their intentions. |
164 | But Is It Statistically Significant?: Statistical Significance in IR Research, 1995-2014 | Ben Carterette | We analyze 5,792 IR conference papers published over 20 years to investigate how researchers have used and are using statistical significance testing in their experiments. |
165 | Generalized Mixed Effect Models for Personalizing Job Search | Ankan Saha, Dhruv Arya | We describe the details of the new method along with the challenges faced in launching such a model into production and making it efficient at a very large scale. |
166 | Contextualizing Citations for Scientific Summarization using Word Embeddings and Domain Knowledge | Arman Cohan, Nazli Goharian | To address this problem, we propose an unsupervised model that uses distributed representation of words as well as domain knowledge to extract the appropriate context from the reference paper. |
167 | Crowdsourced App Review Manipulation | Shanshan Li, James Caverlee, Wei Niu, Parisa Kaghazgaran | And yet, the user reviews and ratings on these marketplaces may be strategically targeted by app developers. |
168 | CitySearcher: A City Search Engine For Interests | Mohamed Abdel Maksoud, Gaurav Pandey, Shuaiqiang Wang | We introduce CitySearcher, a vertical search engine that searches for cities when queried for an interest. To reduce the effect of the mismatched semantic relationships, we generate a set of features for learning based on a novel clustering-based method. |
169 | Cross-Language Question Re-Ranking | Giovanni Da San Martino, Salvatore Romeo, Alberto Barrón-Cedeño, Shafiq Joty, Lluís Màrquez, Alessandro Moschitti, Preslav Nakov | We observe that both approaches degrade the performance of the system when working on the translated text, especially the kernel-based system, which depends heavily on a syntactic kernel. |
170 | Open Relation Extraction for Support Passage Retrieval: Merit and Open Issues | Amina Kadry, Laura Dietz | Our goal is to complement an entity ranking with human-readable explanations of how those retrieved entities are connected to the information need. |
171 | Generating Query Suggestions to Support Task-Based Search | Dario Garigliotti, Krisztian Balog | We propose a probabilistic modeling framework that obtains keyphrases from multiple sources and generates query suggestions from these keyphrases. |
172 | De-duping URLs with Sequence-to-Sequence Neural Networks | Keyang Xu, Zhengzhong Liu, Jamie Callan | Traditional de-duping methods are usually limited to heavily engineered rule matching strategies. In this work, we propose a novel URL de-duping framework based on sequence-to-sequence (Seq2Seq) neural networks. |
173 | Graph Summarization for Entity Relatedness Visualization | Yukai Miao, Jianbin Qin, Wei Wang | In this work, we investigate how to summarize the relatedness graphs and how to use the summarized graphs to assist users in retrieving target information. |
174 | Layout and Semantics: Combining Representations for Mathematical Formula Search | Kenny Davila, Richard Zanibbi | We propose searching both formula representations using a three-layer model. |
175 | Understanding and Predicting Usefulness Judgment in Web Search | Jiaxin Mao, Yiqun Liu, Huanbo Luan, Min Zhang, Shaoping Ma, Hengliang Luo, Yuntao Zhang | Our study sheds light on the understanding of the dynamics of the user-perceived usefulness of documents in a search session and provides implications for the evaluation and design of Web search engines. |
176 | Open Source Repository Recommendation in Social Coding | Jyun-Yu Jiang, Pu-Jen Cheng, Wei Wang | The aim of this paper is to investigate the feasibility of leveraging user programming language preference to improve the performance of OCCF-based repository recommendation. |
177 | Venue Appropriateness Prediction for Personalized Context-Aware Venue Suggestion | Mohammad Aliannejadi, Fabio Crestani | In this paper, we present a set of novel scores to measure the similarity between a user and a candidate venue in a new city. |
178 | Building Bridges across Social Platforms: Answering Twitter Questions with Yahoo! Answers | Mossaab Bagdouri, Douglas W. Oard | This paper investigates techniques for answering microblog questions by searching in a large community question answering website. |
179 | Large-Scale Goodness Polarity Lexicons for Community Question Answering | Todor Mihaylov, Daniel Balchev, Yasen Kiprov, Ivan Koychev, Preslav Nakov | In particular, we use pointwise mutual information in order to build large-scale goodness polarity lexicons in a semi-supervised manner starting with a small number of initial seeds. |
180 | A Neural Language Model for Query Auto-Completion | Dae Hoon Park, Rikio Chiba | In order to suggest queries for previously unseen text, we propose a neural language model that learns how to generate a query from a starting text, a prefix. |
181 | Social Media Advertisement Outreach: Learning the Role of Aesthetics | Avikalp Srivastava, Madhav Datt, Jaikrishna Chaparala, Shubham Mangla, Priyadarshi Patnaik | Our paper is an initial study, where we propose a novel method to evaluate and improve outreach of promotional images from corporations on Twitter, based purely on their describable aesthetic attributes. |
182 | Seasonal Web Search Query Selection for Influenza-Like Illness (ILI) Estimation | Niels Dalum Hansen, Kåre Mølbak, Ingemar J. Cox, Christina Lioma | To address this issue, we propose modeling the seasonal variation in ILI activity and selecting queries that are correlated with the residual of the seasonal model and the observed ILI signal. |
183 | Improving Retrieval Performance for Verbose Queries via Axiomatic Analysis of Term Discrimination Heuristic | Mozhdeh Ariannezhad, Ali Montazeralghaem, Hamed Zamani, Azadeh Shakery | In this paper, we propose a constraint to model the interaction between query length and IDF. |
184 | Timestamping Entities using Contextual Information | Adam Jatowt, Daisuke Kawai, Katsumi Tanaka | In this paper, we propose to estimate entities’ lifetimes using the link structure in Wikipedia, focusing on person entities. |
185 | On the Use of an Intermediate Class in Boolean Crowdsourced Relevance Annotations for Learning to Rank Comments | Alberto Barrón-Cedeño, Giovanni Da San Martino, Simone Filice, Alessandro Moschitti | Our main contribution is to show empirically that the inclusion of an intermediate class to assess Boolean relevance is not useful. |
186 | Embedding-based Query Expansion for Weighted Sequential Dependence Retrieval Model | Saeid Balaneshin-kordan, Alexander Kotov | In this paper, we propose Semantic Weighted Dependence Model (SWDM), a PRF based query expansion method for WSDM, which utilizes distributed low-dimensional word representations (i.e., word embeddings). |
187 | Experiments with Convolutional Neural Network Models for Answer Selection | Jinfeng Rao, Hua He, Jimmy Lin | This paper focuses on the problem of answer selection for question answering: we attempt to replicate the results of Severyn and Moschitti using their open-source code as well as to reproduce their results via a de novo (i.e., from scratch) implementation using a completely different deep learning toolkit. |
188 | Luandri: A Clean Lua Interface to the Indri Search Engine | Bhaskar Mitra, Fernando Diaz, Nick Craswell | To bridge this gap, we introduce Luandri (pronounced "laundry"), a simple interface for exposing the search capabilities of Indri to Torch models implemented in Lua. |
189 | Finally, a Downloadable Test Collection of Tweets | Royal Sequiera, Jimmy Lin | We analyzed both datasets in terms of content overlap and retrieval baselines to show that the Internet Archive data can serve as a drop-in replacement for the Tweets2013 collection, thereby providing the research community with, finally, a downloadable collection of tweets. |
190 | Cookpad Image Dataset: An Image Collection as Infrastructure for Food Research | Jun Harashima, Yuichiro Someya, Yohei Kikuta | In this study, we construct the Cookpad Image Dataset, a novel collection of food images taken from Cookpad, the largest recipe search service in the world. |
191 | SogouT-16: A New Web Corpus to Embrace IR Research | Cheng Luo, Yukun Zheng, Yiqun Liu, Xiaochuan Wang, Jingfang Xu, Min Zhang, Shaoping Ma | In this study, we present a Chinese Web collection, SogouT-16, which is the largest free-of-charge public Chinese Web collection so far. |
192 | A Test Collection for Evaluating Retrieval of Studies for Inclusion in Systematic Reviews | Harrisen Scells, Guido Zuccon, Bevan Koopman, Anthony Deacon, Leif Azzopardi, Shlomo Geva | This paper introduces a test collection for evaluating the effectiveness of different methods used to retrieve research studies for inclusion in systematic reviews. |
193 | One Million Posts: A Data Set of German Online Discussions | Dietmar Schabus, Marcin Skowron, Martin Trapp | In this paper we introduce a new data set consisting of user comments posted to the website of a German-language Austrian newspaper. |
194 | KASANDR: A Large-Scale Dataset with Implicit Feedback for Recommendation | Sumit Sidana, Charlotte Laclau, Massih R. Amini, Gilles Vandelle, André Bois-Crettez | In this paper, we describe a novel, publicly available collection for recommendation systems that records the behavior of customers of the European leader in eCommerce advertising, Kelkoo (https://www.kelkoo.com/), during one month. |
195 | A Collection for Detecting Triggers of Sentiment Spikes | Anastasia Giachanou, Ida Mele, Fabio Crestani | With the aim to facilitate research on this problem, we describe a collection of tweets that can be used for detecting and ranking the likely triggers of sentiment spikes towards different entities. |
196 | Anserini: Enabling the Use of Lucene for Information Retrieval Research | Peilin Yang, Hui Fang, Jimmy Lin | This paper introduces Anserini, a new information retrieval toolkit that aims to provide the best of both worlds, to better align information retrieval practice and research. |
197 | A Stream-based Resource for Multi-Dimensional Evaluation of Recommender Algorithms | Benjamin Kille, Andreas Lommatzsch, Frank Hopfgartner, Martha Larson, Arjen P. de Vries | We introduce two resources supporting such evaluation methodologies: the new data set of stream recommendation interactions released for CLEF NewsREEL 2017, and the new Open Recommendation Platform (ORP). |
198 | A Large-Scale Query Spelling Correction Corpus | Matthias Hagen, Martin Potthast, Marcel Gohsen, Anja Rathgeber, Benno Stein | We present a new large-scale collection of 54,772 queries with manually annotated spelling corrections. |
199 | DBpedia-Entity v2: A Test Collection for Entity Search | Faegheh Hasibi, Fedor Nikolaev, Chenyan Xiong, Krisztian Balog, Svein Erik Bratsberg, Alexander Kotov, Jamie Callan | We develop and release a new version of this test collection, DBpedia-Entity v2, which uses a more recent DBpedia dump and a unified candidate result pool from the same set of retrieval models. |
200 | A Cross-Platform Collection for Contextual Suggestion | Mohammad Aliannejadi, Ida Mele, Fabio Crestani | In this paper, we release both collections that were used by the system above. |
201 | RELink: A Research Framework and Test Collection for Entity-Relationship Retrieval | Pedro Saleiro, Natasa Milic-Frayling, Eduarda Mendes Rodrigues, Carlos Soares | In this paper we describe a method for generating E-R test queries to support comprehensive E-R search experiments. |
202 | BioNex: A System For Biomedical News Event Exploration | Patrick Ernst, Arunav Mishra, Avishek Anand, Vinay Setty | We demonstrate BioNex, a system to mine, rank and visualize biomedical news events. |
203 | RankEval: An Evaluation and Analysis Framework for Learning-to-Rank Solutions | Claudio Lucchese, Cristina Ioana Muntean, Franco Maria Nardini, Raffaele Perego, Salvatore Trani | In this demo paper we propose RankEval, an open-source tool for the analysis and evaluation of Learning-to-Rank (LtR) models based on ensembles of regression trees. |
204 | ReviewMiner: An Aspect-based Review Analytics System | Derek Wu, Hongning Wang | We develop an aspect-based sentiment analysis system named ReviewMiner. |
205 | Nordlys: A Toolkit for Entity-Oriented and Semantic Search | Faegheh Hasibi, Krisztian Balog, Darío Garigliotti, Shuo Zhang | We introduce Nordlys, a toolkit for entity-oriented and semantic search. |
206 | Teaching the Information Retrieval Process Using a Web-Based Environment and Game Mechanics | Thomas Wilhelm-Stein, Stefan Kahl, Maximilian Eibl | Teaching the Information Retrieval Process Using a Web-Based Environment and Game Mechanics |
207 | Smart Media Generation System for Broadcasting Contents | Jeong-Woo Son, Wonjoo Park, Sang-Yun Lee, Jinwoo Kim, Sun-Joong Kim | This paper proposes a new system for this purpose. |
208 | EvALL: Open Access Evaluation for Information Access Systems | Enrique Amigó, Jorge Carrillo-de-Albornoz, Mario Almagro-Cádiz, Julio Gonzalo, Javier Rodríguez-Vidal, Felisa Verdejo | The EvALL online evaluation service aims to provide a unified evaluation framework for Information Access systems that makes results completely comparable and publicly available for the whole research community. |
209 | TOTEM: Personal Tweets Summarization on Mobile Devices | Jin Yao Chin, Sourav S. Bhowmick, Adam Jatowt | Given that 80% of active Twitter users access the site on mobile devices, in this demonstration we present a lightweight, personalized, on-demand, topic modeling-based tweets summarization engine called TOTEM, designed for such devices. |
210 | LSTM vs. BM25 for Open-domain QA: A Hands-on Comparison of Effectiveness and Efficiency | Sosuke Kato, Riku Togashi, Hideyuki Maeda, Sumio Fujita, Tetsuya Sakai | In this demonstration, we provide the attendees of SIGIR 2017 an opportunity to experience a live comparison of two open-domain QA systems, one based on a long short-term memory (LSTM) architecture with over 11 million Yahoo! Chiebukuro (i.e., Japanese Yahoo! Answers) questions and over 27.4 million answers for training, and the other based on BM25. |
211 | Proactive Information Retrieval via Screen Surveillance | Tung Vuong, Giulio Jacucci, Tuukka Ruotsalo | We demonstrate proactive information retrieval via screen surveillance. |
212 | ASTERIX: Ambiguity and Missing Element-Aware XML Keyword Search Engine | Ba Quan Truong, Sourav S. Bhowmick, Curtis Dyreson, Hong Jing Khok | We demonstrate ASTERIX, an innovative XKS engine that addresses these limitations. |
213 | Visual Pool: A Tool to Visualize and Interact with the Pooling Method | Aldo Lipani, Mihai Lupu, Allan Hanbury | In this paper we develop a novel visualization technique for the pooling method, and integrate it in a demo application named Visual Pool. |
214 | Event Detection on Curated Tweet Streams | Nimesh Ghelani, Salman Mohammed, Shine Wang, Jimmy Lin | We present a system for identifying interesting social media posts on Twitter and delivering them to users’ mobile devices in real time as push notifications. |
215 | A Task-oriented Search Engine for Evidence-based Medicine | Bevan Koopman, Guido Zuccon, Jack Russell | This paper describes a search engine specifically designed for searching medical literature for the purpose of EBM and in a clinical decision support setting. |
216 | Social Media Image Recognition for Food Trend Analysis | Giuseppe Amato, Paolo Bolettieri, Vinicius Monteiro de Lira, Cristina Ioana Muntean, Raffaele Perego, Chiara Renso | The system that we propose, WorldFoodMap, captures the stream of food photos from social media and, thanks to a CNN food image classifier, identifies the categories of food that people are sharing. |
217 | Computing Web-scale Topic Models using an Asynchronous Parameter Server | Rolf Jagerman, Carsten Eickhoff, Maarten de Rijke | We present APS-LDA, which integrates state-of-the-art topic modeling with cluster computing frameworks such as Spark using a novel asynchronous parameter server. |
218 | Seeing Bot | Yingwei Pan, Zhaofan Qiu, Ting Yao, Houqiang Li, Tao Mei | We demonstrate a video captioning bot, named Seeing Bot, which can generate a natural language description about what it is seeing in near real time. |
219 | MAPping the probability of start-up success | David Hawking | It turned out that end-users cared about search relevance only to the extent of reaching for buckets of vitriol when search failed to find what they wanted, and people making purchasing decisions were more interested in other things: what repositories can be included in the search, how responsive search is to updates, what security models are provided, the appearance of the search pages, and achieving internal business goals — even at the expense of end-user needs. |
220 | Search Without a Query: Powering Job Recommendations via Search Index at LinkedIn | Dhruv Arya, Ganesh Venkataraman | In this paper, we describe how job recommendations are powered by a search index and some practical challenges involved in scaling such a system. |
221 | Spotify: Music Access At Scale | Fernando Diaz | In this presentation, we will highlight the research involved in developing Spotify and outline a research program for large scale music access. |
222 | Structuring the Unstructured: From Startup to Making Sense of eBay’s Huge eCommerce Inventory | Ido Guy, Kira Radinsky | In this proposed presentation, we will share the story of a research startup from its inception until its acquisition and integration as eBay’s data science team. |
223 | Making Ads More Relevant Innovations in Digital Advertising | Sudong Chung | In this talk, I will discuss the history of digital advertising and the recent innovations in advertising technologies that make ads more relevant to individual viewers. |
224 | Traditional IR Meets Ontology Engineering in Search for Data | Anton Firsov | Traditional IR Meets Ontology Engineering in Search for Data |
225 | Semantic Query Understanding | Ricardo Baeza-Yates | To accomplish semantic ranking, we use machine learning in several stages. |
226 | Twicalli: An Earthquake Detection System Based on Citizen Sensors Used for Emergency Response in Chile | Barbara Poblete | In this talk I will describe "Twicalli", a real-time earthquake detection system based on citizen sensors. |
227 | Cross-Lingual Information Retrieve in Sogou Search | Jingfang Xu, Feifei Zhai, Zhengshan Xue | Cross-Lingual Information Retrieve in Sogou Search |
228 | Find Shoes Like These | Hideyuki Maeda | We present a Euclidean embedding image representation that ranks auction item images across a wide range of the semantic similarity spectrum, in order of relevance to the given query image, much more effectively than the baseline method in terms of a graded relevance measure. |
229 | Machine Learning Powered A/B Testing | Pavel Serdyukov | We proposed several metrics that learn models of the trend in such time series and use them to quantify changes in user behavior. |
230 | Naver Search: Deep Learning Powered Search Portal for Intelligent Information Provision | Inho Kang | In this talk, I’ll cover some efforts and challenges in understanding and satisfying users on various devices. |
231 | Managing Tail Latencies in Large Scale IR Systems | Joel Mackenzie | In the proposed PhD project, we focus on improving the efficiency of high percentile tail latencies in large scale IR systems while minimising end-to-end effectiveness loss. |
232 | Health Misinformation in Search and Social Media | Amira Ghenai | In this paper, I briefly discuss my current work, including background and key references. |
233 | Relevance Judgments: Preferences, Scores and Ties | Ziying Yang | In order to have a better understanding of users’ perceptions of relevance and collect data with high fidelity, we propose to use the Pairwise Preference technique [2] to collect relevance judgments from a crowdsourcing platform. |
234 | Searchbots: Using Chatbots in Collaborative Information-seeking Tasks | Sandeep Avula | RG1: Our first research goal will be to investigate the use of searchbots in a collaborative search scenario. |
235 | Examining Information on Social Media: Topic Modelling, Trend Prediction and Community Classification | Anjie Fang | The second task is to propose a topic modelling approach that generates more coherent topics for social media data. |
236 | Dynamic Personalized Ranking of Facets for Exploratory Search | Esraa Ali | I am proposing a personalized approach to the dynamic ranking of facets. |
237 | Multi-dimensional Formula Feature Modeling for Mathematical Information Retrieval | Ke Yuan | In this study, I propose a novel formula feature modeling method for mathematical information retrieval. |
238 | Deep Collaborative Filtering Approaches for Context-Aware Venue Recommendation | Jarana Manotumruksa | Deep Collaborative Filtering Approaches for Context-Aware Venue Recommendation |
239 | Utilizing Online Social Media for Disaster Relief: Practical Challenges in Retrieval | Moumita Basu | Utilizing Online Social Media for Disaster Relief: Practical Challenges in Retrieval |
240 | Statistical Significance Testing in Information Retrieval: Theory and Practice | Ben Carterette | Statistical Significance Testing in Information Retrieval: Theory and Practice |
241 | Candidate Selection for Large Scale Personalized Search and Recommender Systems | Dhruv Arya, Ganesh Venkataraman, Aman Grover, Krishnaram Kenthapadi | In this tutorial we survey various candidate selection techniques and deep dive into case studies on a large scale social media platform. |
242 | A/B Testing at Scale: Accelerating Software Innovation | Alex Deng, Pavel Dmitriev, Somit Gupta, Ron Kohavi, Paul Raff, Lukas Vermeer | In this tutorial we will give an introduction to A/B testing, share key lessons learned from scaling experimentation at Bing to thousands of experiments per year, present real examples, and outline promising directions for future work. |
243 | Probabilistic Topic Models for Text Data Retrieval and Analysis | ChengXiang Zhai | This tutorial will systematically review the major research progress in probabilistic topic models and discuss their applications in text retrieval and text mining. |
244 | Neural Networks for Information Retrieval | Tom Kenter, Alexey Borisov, Christophe Van Gysel, Mostafa Dehghani, Maarten de Rijke, Bhaskar Mitra | The aim of this full-day tutorial is to give a clear overview of current tried-and-trusted neural methods in IR and how they benefit IR research. |
245 | Building Test Collections: An Interactive Guide for Students and Others Without Their Own Evaluation Conference Series | Ian Soboroff | The goal of this tutorial is to lay out issues, procedures, pitfalls, and practical advice. |
246 | From Design to Analysis: Conducting Controlled Laboratory Experiments with Users | Diane Kelly, Anita Crescenzi | The goals of the tutorial are to increase participants’ (1) understanding of the uses of controlled laboratory experiments with human participants; (2) understanding of the technical vocabulary and procedures associated with such experiments and (3) confidence in conducting and evaluating IIR experiments. |
247 | SIGIR 2017 Tutorial on Health Search (HS2017): A Full-day from Consumers to Clinicians | Guido Zuccon, Bevan Koopman | The HS2017 tutorial will cover topics from an area of information retrieval (IR) with significant societal impact – health search. |
248 | Axiomatic Thinking for Information Retrieval: And Related Tasks | Enrique Amigó, Hui Fang, Stefano Mizzaro, ChengXiang Zhai | The workshop aims to help foster collaboration of researchers working on different perspectives of axiomatic thinking and encourage discussion and research on general methodological issues related to applying axiomatic thinking to IR and related tasks. |
249 | Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural Language Processing for Digital Libraries (BIRNDL 2017) | Muthu Kumar Chandrasekaran, Kokil Jaidka, Philipp Mayr | Bibliometrics, information retrieval (IR), text mining and NLP techniques could help in these search and look-up activities, but are not yet widely used. |
250 | First International Workshop on Conversational Approaches to Information Retrieval (CAIR’17) | Hideo Joho, Lawrence Cavedon, Jaime Arguello, Milad Shokouhi, Filip Radlinski | A specific focus is on techniques that support complex and multi-turn user-machine dialogues for information access and retrieval, and multi-modal interfaces for interacting with such systems. |
251 | SIGIR 2017 Workshop on eCommerce (ECOM17) | Jon Degenhardt, Surya Kallumadi, Maarten de Rijke, Luo Si, Andrew Trotman, Yinghui Xu | SIGIR 2017 Workshop on eCommerce (ECOM17) |
252 | The First Workshop on Knowledge Graphs and Semantics for Text Retrieval and Analysis (KG4IR) | Laura Dietz, Chenyan Xiong, Edgar Meij | The goal of this workshop is to bring together a community of researchers and practitioners who are interested in using, aligning, and constructing knowledge graphs and similar semantic resources for information retrieval applications. |
253 | The Lucene for Information Access and Retrieval Research (LIARR) Workshop at SIGIR 2017 | Leif Azzopardi, Matt Crane, Hui Fang, Grant Ingersoll, Jimmy Lin, Yashar Moshfeghi, Harrisen Scells, Peilin Yang, Guido Zuccon | Our goal is to promote the use of Lucene for information access and retrieval research. |
254 | SIGIR 2017 Workshop on Neural Information Retrieval (Neu-IR’17) | Nick Craswell, W. Bruce Croft, Maarten de Rijke, Jiafeng Guo, Bhaskar Mitra | After the first successful Neu-IR workshop at SIGIR 2016, our goal this year will be to host a highly interactive full-day workshop to bring the neural IR community together to specifically address these key challenges facing this line of research. |
255 | SIGIR 2017 Workshop on Open Knowledge Base and Question Answering (OKBQA2017) | Key-Sun Choi, Teruko Mitamura, Piek Vossen, Jin-Dong Kim, Axel-Cyrille Ngonga Ngomo | SIGIR 2017 Workshop on Open Knowledge Base and Question Answering (OKBQA2017) |