Paper Digest: CIKM 2015 Highlights
The ACM Conference on Information and Knowledge Management (CIKM) is an annual computer science research conference dedicated to information management and knowledge management.
To help the community quickly catch up on the work presented at this conference, the Paper Digest Team processed all accepted papers and generated one highlight sentence (typically the main topic) for each paper. Readers are encouraged to read these machine-generated highlights/summaries to quickly get the main idea of each paper.
If you do not want to miss any interesting academic papers, you are welcome to sign up for our free daily paper digest service to get updates on new papers published in your area every day. You are also welcome to follow us on Twitter and LinkedIn to stay updated on new conference digests.
Paper Digest Team
team@paperdigest.org
TABLE 1: CIKM 2015 Papers
No. | Title | Authors | Highlight |
---|---|---|---|
1 | Slow Search: Improving Information Retrieval Using Human Assistance | Jaime Teevan | We propose the concept of "slow search," where search engines use additional time to provide a higher quality search experience than is possible given conventional time constraints. |
2 | External Data Access And Indexing In AsterixDB | Abdullah A. Alamoudi, Raman Grover, Michael J. Carey, Vinayak Borkar | In this paper, we describe techniques to achieve the qualities offered by DBMSs when accessing external data. |
3 | Dynamic Resource Management In a Massively Parallel Stream Processing Engine | Kasper Grud Skat Madsen, Yongluan Zhou | In this paper, we propose an approach to integrate dynamic resource management with passive fault-tolerance mechanisms in an MPSPE so that we can harvest the checkpoints prepared for failure recovery to enhance the efficiency of dynamic load migrations. |
4 | A Parallel GPU-Based Approach to Clustering Very Fast Data Streams | Pengtao Huang, Xiu Li, Bo Yuan | In this paper, we present a parallel algorithm called PaStream, which is based on advanced Graphics Processing Unit (GPU) and follows the online-offline framework of CluStream. |
5 | Scalable Clustering Algorithm via a Triangle Folding Processing for Complex Networks | Ying Kang, Xiaoyan Gu, Weiping Wang, Dan Meng | In this paper, we propose a scalable clustering algorithm via a triangle folding processing for complex networks (SCAFT). |
6 | Understanding the Impact of the Role Factor in Collaborative Information Retrieval | Lynda Tamine, Laure Soulier | In this paper, we investigate whether and how different factors, such as users’ behavior, search strategies, and effectiveness, are related to role assignment within a collaborative exploratory search. |
7 | Experiments with a Venue-Centric Model for Personalised and Time-Aware Venue Suggestion | Romain Deveaud, M-Dyaa Albakour, Craig Macdonald, Iadh Ounis | In contrast, in this paper, we introduce a venue-centric yet personalised probabilistic approach that suggests personalised and popular venues for users to visit in the near future. |
8 | Search Result Diversification Based on Hierarchical Intents | Sha Hu, Zhicheng Dou, Xiaojie Wang, Tetsuya Sakai, Ji-Rong Wen | In this paper, we introduce a new hierarchical structure to represent user intents and propose two general hierarchical diversification models to leverage hierarchical intents. |
9 | Category-Driven Approach for Local Related Business Recommendations | Yonathan Perez, Michael Schueppert, Matthew Lawlor, Shaunak Kishore | We address the problem of constructing a useful and diverse list of such recommendations that would include an optimal combination of substitutes and complements. |
10 | A Soft Computing Approach for Learning to Aggregate Rankings | Javier Alvaro Vargas Muñoz, Ricardo da Silva Torres, Marcos André Gonçalves | This paper presents an approach to combine rank aggregation techniques using a soft computing technique — Genetic Programming — in order to improve the results in Information Retrieval tasks. |
11 | Approximate String Matching by End-Users using Active Learning | Lutz Büch, Artur Andrzejak | To address this problem, we propose an Active Learning algorithm which selects a best performing similarity measure in a given set while optimizing a decision threshold. |
12 | A Unified Posterior Regularized Topic Model with Maximum Margin for Learning-to-Rank | Shoaib Jameel, Wai Lam, Steven Schockaert, Lidong Bing | In contrast, we propose a learning-to-rank framework which integrates the supervised learning of a maximum margin classifier with the discovery of a suitable probabilistic topic model. |
13 | Collaborating between Local and Global Learning for Distributed Online Multiple Tasks | Xin Jin, Ping Luo, Fuzhen Zhuang, Jia He, Qing He | Thus, in this paper a collaborative learning scheme is proposed for this problem. |
14 | Lifespan-based Partitioning of Index Structures for Time-travel Text Search | Animesh Nandi, Suriya Subramanian, Sriram Lakshminarasimhan, Prasad M. Deshpande, Sriram Raghavan | The problem we tackle is how to efficiently handle different query classes using the same index layout. |
15 | Contextual Text Understanding in Distributional Semantic Space | Jianpeng Cheng, Zhongyuan Wang, Ji-Rong Wen, Jun Yan, Zheng Chen | In this work, we propose a new framework for generating context-aware text representations without diving into the sense space. |
16 | External Knowledge and Query Strategies in Active Learning: a Study in Clinical Information Extraction | Mahnoosh Kholghi, Laurianne Sitbon, Guido Zuccon, Anthony Nguyen | This paper presents a new active learning query strategy for information extraction, called Domain Knowledge Informativeness (DKI). |
17 | Ranking Deep Web Text Collections for Scalable Information Extraction | Pablo Barrio, Luis Gravano, Chris Develder | In this paper, we focus on an especially valuable family of text sources, the so-called deep web collections, whose (remote) contents are only accessible via querying. |
18 | Forming Online Support Groups for Internet and Behavior Related Addictions | Chih-Ya Shen, Hong-Han Shuai, De-Nian Yang, Yi-Feng Lan, Wang-Chien Lee, Philip S. Yu, Ming-Syan Chen | We prove that MSSG is NP-Hard and inapproximable within any ratio, and design a 3-approximation algorithm with a guaranteed error bound. |
19 | Concept-Based Relevance Models for Medical and Semantic Information Retrieval | Chunye Wang, Ramakrishna Akella | Using this framework, we transform documents and queries from term space into concept space, and propose a concept-based relevance model for improved estimation of relevance. |
20 | PlateClick: Bootstrapping Food Preferences Through an Adaptive Visual Interface | Longqi Yang, Yin Cui, Fan Zhang, John P. Pollak, Serge Belongie, Deborah Estrin | In this paper, we propose PlateClick, a novel system that bootstraps food preference using a simple, visual quiz-based user interface. |
21 | Data Driven Water Pipe Failure Prediction: A Bayesian Nonparametric Approach | Peng Lin, Bang Zhang, Yi Wang, Zhidong Li, Bin Li, Yang Wang, Fang Chen | In this paper, we propose a Bayesian nonparametric approach, namely the Dirichlet process mixture of hierarchical beta process model, for water pipe failure prediction. |
22 | Tumblr Blog Recommendation with Boosted Inductive Matrix Completion | Donghyuk Shin, Suleyman Cetintas, Kuang-Chih Lee, Inderjit S. Dhillon | In this paper, we propose a novel boosted inductive matrix completion method (BIMC) for blog recommendation. |
23 | BiasWatch: A Lightweight System for Discovering and Tracking Topic-Sensitive Opinion Bias in Social Media | Haokai Lu, James Caverlee, Wei Niu | We propose a lightweight system for (i) semi-automatically discovering and tracking bias themes associated with opposing sides of a topic; (ii) identifying strong partisans who drive the online discussion; and (iii) inferring the opinion bias of "regular" participants. |
24 | Knowlywood: Mining Activity Knowledge From Hollywood Narratives | Niket Tandon, Gerard de Melo, Abir De, Gerhard Weikum | This paper presents a novel approach that taps into movie scripts and other narrative texts. |
25 | Entity and Aspect Extraction for Organizing News Comments | Radityo Eko Prasojo, Mouna Kacimi, Werner Nutt | In this work, we address the above problem by organizing comments around the entities and the aspects they discuss. |
26 | HDRF: Stream-Based Partitioning for Power-Law Graphs | Fabio Petroni, Leonardo Querzoni, Khuzaima Daudjee, Shahin Kamali, Giorgio Iacoboni | In this paper, we propose High-Degree (are) Replicated First (HDRF), a novel streaming vertex-cut graph partitioning algorithm that effectively exploits skewed degree distributions by explicitly taking into account vertex degree in the placement decision. |
27 | Towards Scale-out Capability on Social Graphs | Haichuan Shang, Xiang Zhao, Uday Kiran, Masaru Kitsuregawa | We propose a novel separator-combiner based query processing engine which provides native load-balancing and very low communication overhead, such that increasingly larger graphs can be simply addressed by adding more computing nodes to the cluster. The proposed system achieves remarkable scale-out capability in processing large social graphs with skew degree distributions, while providing many critical features for big data analytics, such as easy-to-use API, fault-tolerance and recovery. |
28 | Identifying Top- | Mojtaba Rezvani, Weifa Liang, Wenzheng Xu, Chengfei Liu | In this paper, we formulate the problem as the top-k structural hole spanner problem. |
29 | Scalable Facility Location for Massive Graphs on Pregel-like Systems | Kiran Garimella, Gianmarco De Francisci Morales, Aristides Gionis, Mauro Sozio | We propose a new scalable algorithm for the facility-location problem. |
30 | Rank by Time or by Relevance?: Revisiting Email Search | David Carmel, Guy Halawi, Liane Lewin-Eytan, Yoelle Maarek, Ariel Raviv | In this paper, we study the current search traffic of Yahoo mail, a major Web commercial mail service, and discuss the limitations of ranking search results by date. |
31 | On the Cost of Extracting Proximity Features for Term-Dependency Models | Xiaolu Lu, Alistair Moffat, J. Shane Culpepper | In this paper we examine the processes used to compute these statistics. |
32 | An Optimization Framework for Merging Multiple Result Lists | Chia-Jung Lee, Qingyao Ai, W. Bruce Croft, Daniel Sheldon | In this paper, we study in depth and extend a neural network-based approach, LambdaMerge, for merging results of ranked lists drawn from one (i.e., data fusion) or more (i.e., collection fusion) verticals. |
33 | Searching and Stopping: An Analysis of Stopping Rules and Strategies | David Maxwell, Leif Azzopardi, Kalervo Järvelin, Heikki Keskustalo | In this paper, we undertake the first large scale study of stopping rules, investigating how they influence overall session performance, and which rules best match actual stopping behaviour. |
34 | Automated News Suggestions for Populating Wikipedia Entity Pages | Besnik Fetahu, Katja Markert, Avishek Anand | In this work, we therefore look at Wikipedia through the lens of news and propose a novel news-article suggestion task to improve news coverage in Wikipedia, and reduce the lag of newsworthy references. |
35 | Mining Coordinated Intent Representation for Entity Search and Recommendation | Huizhong Duan, ChengXiang Zhai | We propose a novel generative model to discover coordinated intent representations from the entity search logs. |
36 | Sentiment Extraction by Leveraging Aspect-Opinion Association Structure | Li Zhao, Minlie Huang, Jiashen Sun, Hengliang Luo, Xiankai Yang, Xiaoyan Zhu | In this paper, we investigate the aspect-opinion association structure, and propose a "first clustering, then extracting" unsupervised model to leverage properties of the structure for sentiment extraction. |
37 | Leveraging Joint Interactions for Credibility Analysis in News Communities | Subhabrata Mukherjee, Gerhard Weikum | News communities such as digg, reddit, or newstrust offer recommendations, reviews, quality ratings, and further insights on journalistic works. |
38 | Clustering-based Active Learning on Sensor Type Classification in Buildings | Dezhi Hong, Hongning Wang, Kamin Whitehouse | We propose a clustering-based active learning algorithm to differentiate sensors in buildings by type, e.g., temperature vs. humidity. |
39 | gSparsify: Graph Motif Based Sparsification for Graph Clustering | Peixiang Zhao | In this paper, we propose gSparsify, a graph sparsification method, to preferentially retain a small subset of edges from a graph which are more likely to be within clusters, while eliminating others with less or no structure correlation to clusters. |
40 | Incomplete Multi-view Clustering via Subspace Learning | Qiyue Yin, Shu Wu, Liang Wang | In this paper, a novel incomplete multi-view clustering method is therefore developed, which learns unified latent representations and projection matrices for the incomplete multi-view data. |
41 | Robust Subspace Clustering via Tighter Rank Approximation | Zhao Kang, Chong Peng, Qiang Cheng | In this paper, an arctangent function is used as a tighter approximation to the rank function. |
42 | Interactive User Group Analysis | Behrooz Omidvar-Tehrani, Sihem Amer-Yahia, Alexandre Termier | Since user data is often sparse and noisy, we propose to produce labeled groups that describe users with common properties and develop IUGA, an interactive framework based on group discovery primitives to explore the user space. |
43 | Viewability Prediction for Online Display Ads | Chong Wang, Achir Kalra, Cristian Borcea, Yi Chen | We analyze a real-life dataset from a large publisher, identify a number of features that impact the scroll depth for a given user and a page, and propose a probabilistic latent class model that predicts the viewability of any given scroll depth for a user-page pair. |
44 | 10 Bits of Surprise: Detecting Malicious Users with Minimum Information | Reza Zafarani, Huan Liu | In this study, we develop a methodology that identifies malicious users with limited information. |
45 | MAPer: A Multi-scale Adaptive Personalized Model for Temporal Human Behavior Prediction | Sarah Masud Preum, John A. Stankovic, Yanjun Qi | The primary objective of this research is to develop a simple and interpretable predictive framework to perform temporal modeling of individual user’s behavior traits based on each person’s past observed traits/behavior. |
46 | Classification with Active Learning and Meta-Paths in Heterogeneous Information Networks | Chang Wan, Xiang Li, Ben Kao, Xiao Yu, Quanquan Gu, David Cheung, Jiawei Han | We propose class-level meta-paths and study how they can be used to (1) build more accurate classifiers and (2) improve active learning in identifying objects for which training labels should be obtained. |
47 | Semantic Path based Personalized Recommendation on Weighted Heterogeneous Information Networks | Chuan Shi, Zhiqiang Zhang, Ping Luo, Philip S. Yu, Yading Yue, Bin Wu | In this paper, we are the first to propose the weighted HIN and weighted meta path concepts to subtly depict the path semantics through distinguishing different link attribute values. |
48 | A Graph-based Recommendation across Heterogeneous Domains | Deqing Yang, Jingrui He, Huazheng Qin, Yanghua Xiao, Wei Wang | To this end, in this paper, we propose a graph-based approach for recommendation across heterogeneous domains. |
49 | Query Relaxation across Heterogeneous Data Sources | Verena Kantere, George Orfanoudakis, Anastasios Kementsietsidis, Timos Sellis | In this paper, we propose a technique to compute query relaxations of an input query that can be rewritten and evaluated in an environment of collaborating autonomous and heterogeneous data sources. |
50 | Approximated Summarization of Data Provenance | Eleanor Ainy, Pierre Bourhis, Susan B. Davidson, Daniel Deutch, Tova Milo | Based on this notion, we present a novel provenance summarization algorithm which, based on the semantics of the underlying data and the intended use of provenance, outputs a summary of the input provenance. |
51 | An Integrated Bayesian Approach for Effective Multi-Truth Discovery | Xianzhi Wang, Quan Z. Sheng, Xiu Susie Fang, Lina Yao, Xiaofei Xu, Xue Li | Based on this insight, we propose an integrated Bayesian approach to the multi-truth-finding problem, by taking these features into account. |
52 | Approximate Truth Discovery via Problem Scale Reduction | Xianzhi Wang, Quan Z. Sheng, Xiu Susie Fang, Xue Li, Xiaofei Xu, Lina Yao | To address this issue, we propose an approximate truth discovery approach, which divides sources and values into groups according to a user-specified approximation criterion. |
53 | Organic or Organized?: Exploring URL Sharing Behavior | Cheng Cao, James Caverlee, Kyumin Lee, Hancheng Ge, Jinwook Chung | In this paper, we investigate the individual-based and group-based user behavior of URL sharing in social media toward uncovering these organic versus organized user groups. |
54 | Mining Brokers in Dynamic Social Networks | Chonggang Song, Wynne Hsu, Mong Li Lee | In this paper, we formally define the problem of detecting top-$k$ brokers given a social network and show that it is NP-hard. |
55 | Who Will You "@"? | Yeyun Gong, Qi Zhang, Xuyang Sun, Xuanjing Huang | In this paper, we present our work on building a recommendation system for the mention function in microblogging services. |
56 | Characterizing and Predicting Voice Query Reformulation | Ahmed Hassan Awadallah, Ranjitha Gurunath Kulkarni, Umut Ozertem, Rosie Jones | In this paper, we study the problem of voice query reformulation. |
57 | A Hierarchical Recurrent Encoder-Decoder for Generative Context-Aware Query Suggestion | Alessandro Sordoni, Yoshua Bengio, Hossein Vahabi, Christina Lioma, Jakob Grue Simonsen, Jian-Yun Nie | We present a novel hierarchical recurrent encoder-decoder architecture that makes it possible to account for sequences of previous queries of arbitrary lengths. |
58 | A Network-Aware Approach for Searching As-You-Type in Social Media | Paul Lagrée, Bogdan Cautis, Hossein Vahabi | We present in this paper a novel approach for as-you-type top-k keyword search over social media. |
59 | Improving Microblog Retrieval with Feedback Entity Model | Feifan Fan, Runwei Qiang, Chao Lv, Jianwu Yang | In this paper, we propose a feedback entity model and integrate it into an adaptive language modeling framework in order to improve the retrieval performance. |
60 | Extracting Situational Information from Microblogs during Disaster Events: a Classification-Summarization Approach | Koustav Rudra, Subham Ghosh, Niloy Ganguly, Pawan Goyal, Saptarshi Ghosh | The proposed framework takes into consideration the typicalities pertaining to disaster events where (i) the same tweet often contains a mixture of situational and non-situational information, and (ii) certain numerical information, such as number of casualties, vary rapidly with time, and thus achieves superior performance compared to state-of-the-art tweet summarization approaches. |
61 | Profession-Based Person Search in Microblogs: Using Seed Sets to Find Journalists | Mossaab Bagdouri, Douglas W. Oard | We introduce the problem of searching for professionals in microblogging platforms. |
62 | Learning Entity Types from Query Logs via Graph-Based Modeling | Jingyuan Zhang, Luo Jie, Altaf Rahman, Sihong Xie, Yi Chang, Philip S. Yu | In this paper, we study the problem of learning entity types from search query logs and address the following challenges: (1) queries are short texts, and information related to entities is usually very sparse; (2) large amounts of irrelevant information exists in search logs, bringing noise in detecting entity types. |
63 | Collaborative Prediction for Multi-entity Interaction With Hierarchical Representation | Qiang Liu, Shu Wu, Liang Wang | In this work, we propose a Hierarchical Interaction Representation (HIR) model, which models the mutual action among different entities as a joint representation. |
64 | Learning to Represent Knowledge Graphs with Gaussian Embedding | Shizhu He, Kang Liu, Guoliang Ji, Jun Zhao | Therefore, this paper switches to density-based embedding and proposes KG2E for explicitly modeling the certainty of entities and relations, which learns the representations of KGs in the space of multi-dimensional Gaussian distributions. |
65 | Associative Classification with Statistically Significant Positive and Negative Rules | Jundong Li, Osmar Zaiane | To solve the above mentioned problems, we propose a novel associative classifier which is built upon both positive and negative classification association rules that show statistically significant dependencies. |
66 | A Min-Max Optimization Framework For Online Graph Classification | Peng Yang, Peilin Zhao | To solve this issue, we propose a more general min-max optimization framework for online graph node classification. |
67 | An Inference Approach to Basic Level of Categorization | Zhongyuan Wang, Haixun Wang, Ji-Rong Wen, Yanghua Xiao | In this paper, we introduce a method based on typicality and PMI for BLC. |
68 | Making Sense of Spatial Trajectories | Xiaofang Zhou, Kai Zheng, Hoyoung Jueng, Jiajie Xu, Shazia Sadiq | In this paper we will present a review of the extensive work in spatiotemporal data management and trajectory mining, and discuss new challenges and new opportunities in the context of new applications, focusing on recent advances in trajectory data management and trajectory mining from their foundations to high performance processing with modern computing infrastructure. |
69 | ReverseCloak: Protecting Multi-level Location Privacy over Road Networks | Chao Li, Balaji Palanisamy | This paper presents ReverseCloak, a new class of reversible location cloaking mechanisms that effectively support multi-level location privacy, allowing selective de-anonymization of the cloaking region to reduce the granularity of the perturbed location when suitable access credentials are provided. |
70 | GLUE: a Parameter-Tuning-Free Map Updating System | Hao Wu, Chuanchuan Tu, Weiwei Sun, Baihua Zheng, Hao Su, Wei Wang | Besides, we propose theoretical models behind all the important parameters to enable self-adaptive parameter setting. |
71 | A Cost-based Method for Location-Aware Publish/Subscribe Services | Minghe Yu, Guoliang Li, Jianhua Feng | To this end, in this paper we propose two novel indexing structures, mbrtrie and PKQ. |
72 | Probabilistic Forecasts of Bike-Sharing Systems for Journey Planning | Nicolas Gast, Guillaume Massonnet, Daniel Reijsbergen, Mirco Tribastone | Instead we introduce a new metric based on scoring rules. |
73 | Efficient Computation of Polynomial Explanations of Why-Not Questions | Nicole Bidoit, Melanie Herschel, Aikaterini Tzompanaki | Our first contribution is a general definition of a Why-Not explanation by means of a polynomial. |
74 | Interruption-Sensitive Empty Result Feedback: Rethinking the Visual Query Feedback Paradigm for Semistructured Data | Sourav S Bhowmick, Curtis Dyreson, Byron Choi, Min-Hwee Ang | In this paper, we rethink the traditional way of providing feedback. |
75 | Implementing Query Completeness Reasoning | Werner Nutt, Sergey Paramonov, Ognjen Savkovic | With this paper we make two main contributions: (i) we develop techniques to reason about the completeness of a query answer over a partially complete database, taking into account constraints that hold over the database, and (ii) we implement them by an encoding into logic programming paradigms. |
76 | Towards Scalable and Complete Query Explanation with OWL 2 EL Ontologies | Zhe Wang, Mahsa Chitsaz, Kewen Wang, Jianfeng Du | In this paper, we present a hybrid approach to achieve this. |
77 | Crowdsourcing Pareto-Optimal Object Finding By Pairwise Comparisons | Abolfazl Asudeh, Gensheng Zhang, Naeemul Hassan, Chengkai Li, Gergely V. Zaruba | It employs an iterative question-selection framework. |
78 | Practical Aspects of Sensitivity in Online Experimentation with User Engagement Metrics | Alexey Drutsa, Anna Ufliand, Gleb Gusev | We introduce the notion of Overall Acceptance Criterion (OAC) that includes both the components of an OEC and a statistical significance test. |
79 | Generalized Team Draft Interleaving | Eugene Kharitonov, Craig Macdonald, Pavel Serdyukov, Iadh Ounis | In this paper, we propose an interleaving framework that generalizes the previously studied interleaving methods in two aspects. |
80 | Exploiting Document Content for Efficient Aggregation of Crowdsourcing Votes | Martin Davtyan, Carsten Eickhoff, Thomas Hofmann | In this paper, we propose an alternative approach by relying on document information. |
81 | L2Knng: Fast Exact K-Nearest Neighbor Graph Construction with L2-Norm Pruning | David C. Anastasiu, George Karypis | We present L2Knng, an efficient algorithm that finds the exact cosine similarity k-nearest neighbor graph for a set of sparse high-dimensional objects. |
82 | Lingo: Linearized Grassmannian Optimization for Nuclear Norm Minimization | Qian Li, Wenjia Niu, Gang Li, Yanan Cao, Jianlong Tan, Li Guo | This paper proposes an efficient and accurate Linearized Grassmannian Optimization (Lingo) algorithm, which adopts matrix factorization and Grassmann manifold structure to alternatively minimize the subproblems. |
83 | Deep Collaborative Filtering via Marginalized Denoising Auto-encoder | Sheng Li, Jaya Kawale, Yun Fu | In particular, we propose a general deep architecture for CF by integrating matrix factorization with deep feature learning. |
84 | Improving Latent Factor Models via Personalized Feature Projection for One Class Recommendation | Tong Zhao, Julian McAuley, Irwin King | Therefore, in this paper we propose a novel personalized feature projection method to model users’ preferences over items. |
85 | Node Immunization over Infectious Period | Chonggang Song, Wynne Hsu, Mong Li Lee | We propose a NIIP algorithm to select $k$ nodes to immunize over a time period. |
86 | Enterprise Social Link Recommendation | Jiawei Zhang, Yuanhua Lv, Philip Yu | In this paper, we study this novel problem. |
87 | Exploiting Game Theoretic Analysis for Link Recommendation in Social Networks | Tong Zhao, H. Vicky Zhao, Irwin King | Therefore, in this paper, we study the problem of Exploiting Game Theoretic Analysis for Link Recommendation in Social Networks. |
88 | Extracting Interest Tags for Non-famous Users in Social Network | Wei He, Hongyan Liu, Jun He, Shu Tang, Xiaoyong Du | In this paper, we propose a modified topic model, Bi-Labeled LDA with a term weighting scheme, to extract interest tags for users in social network. |
89 | Robust Capped Norm Nonnegative Matrix Factorization: Capped Norm NMF | Hongchang Gao, Feiping Nie, Weidong Cai, Heng Huang | In this paper, we present a novel robust capped norm orthogonal Nonnegative Matrix Factorization model, which utilizes the capped norm for the objective to handle these extreme outliers. |
90 | MF-Tree: Matrix Factorization Tree for Large Multi-Class Learning | Lei Liu, Pang-Ning Tan, Xi Liu | To overcome these challenges, we propose a novel hierarchical learning method known as MF-Tree to efficiently classify data sets with large number of classes while simultaneously inducing a taxonomy structure that captures relationships among the classes. |
91 | GraRep: Learning Graph Representations with Global Structural Information | Shaosheng Cao, Wei Lu, Qiongkai Xu | In this paper, we present GraRep, a novel model for learning vertex representations of weighted graphs. |
92 | Context-Adaptive Matrix Factorization for Multi-Context Recommendation | Tong Man, Huawei Shen, Junming Huang, Xueqi Cheng | In this paper, we propose a context-adaptive matrix factorization method for multi-context recommendation by simultaneously modeling context-specific factors and entity-intrinsic factors in a unified model. |
93 | Personalized Trip Recommendation with POI Availability and Uncertain Traveling Time | Chenyi Zhang, Hongwei Liang, Ke Wang, Jianling Sun | This work presents efficient solutions to personalized trip recommendation by incorporating these constraints to prune the search space. |
94 | Range Search on Uncertain Trajectories | Liming Zhan, Ying Zhang, Wenjie Zhang, Xiaoyang Wang, Xuemin Lin | In particular, we propose a general framework for range search on uncertain trajectories following the filtering-and-refinement paradigm where summaries of uncertain trajectories are constructed to facilitate the filtering process. |
95 | Efficient Computation of Trips with Friends and Families | Tanzima Hashem, Sukarna Barua, Mohammed Eunus Ali, Lars Kulik, Egemen Tanin | In this paper, we develop both optimal and approximation algorithms for GTP queries for both Euclidean space and road networks. |
96 | Sampling Big Trajectory Data | Yanhua Li, Chi-Yin Chow, Ke Deng, Mingxuan Yuan, Jia Zeng, Jia-Dong Zhang, Qiang Yang, Zhi-Li Zhang | In this paper, we study the problem of approximate query processing for trajectory aggregate queries. |
97 | EsdRank: Connecting Query and Documents through External Semi-Structured Data | Chenyan Xiong, Jamie Callan | This paper presents EsdRank, a new technique for improving ranking using external semi-structured data such as controlled vocabularies and knowledge bases. |
98 | A Probabilistic Framework for Temporal User Modeling on Microblogs | Jitao Sang, Dongyuan Lu, Changsheng Xu | In this work, in the context of microblogs, we propose a unified probabilistic framework to simultaneously model the process of transient event detection and temporal user tweeting. |
99 | Deriving Intensional Descriptions for Web Services | Maria Koutraki, Dan Vodislav, Nicoleta Preda | In this paper, we model an API method as a view with binding patterns over a global RDF schema. |
100 | An Optimization Framework for Propagation of Query-Document Features by Query Similarity Functions | Maxim Zhukovskiy, Tsimafei Khatkevich, Gleb Gusev, Pavel Serdyukov | In this paper, we propose new algorithms that facilitate and increase the effectiveness of this propagation. |
101 | Rank Consistency based Multi-View Learning: A Privacy-Preserving Approach | Han-Jia Ye, De-Chuan Zhan, Yuan Miao, Yuan Jiang, Zhi-Hua Zhou | In this paper, we propose a novel multi-view learning framework which works in a hybrid fusion manner. |
102 | Differentially Private Histogram Publication for Dynamic Datasets: an Adaptive Sampling Approach | Haoran Li, Li Xiong, Xiaoqian Jiang, Jinfei Liu | In this paper, we address the problem of releasing series of dynamic datasets in real time with differential privacy, using a novel adaptive distance-based sampling approach. |
103 | WaveCluster with Differential Privacy | Ling Chen, Ting Yu, Rada Chirkova | In this paper, we investigate techniques to perform WaveCluster while ensuring differential privacy. Our goal is to develop a general technique for achieving differential privacy on WaveCluster that accommodates different wavelet transforms. |
104 | Process-Driven Data Privacy | Weiyi Xia, Murat Kantarcioglu, Zhiyu Wan, Raymond Heatherly, Yevgeniy Vorobeychik, Bradley Malin | We introduce a principled approach to explicitly model the attack process as a series of steps. |
105 | Unsupervised Feature Selection on Data Streams | Hao Huang, Shinjae Yoo, Shiva Prasad Kasiviswanathan | In this paper, we introduce a novel unsupervised feature selection approach on data streams that selects important features by making only one pass over the data while utilizing limited storage. |
106 | Unsupervised Streaming Feature Selection in Social Media | Jundong Li, Xia Hu, Jiliang Tang, Huan Liu | In this paper, we study a novel problem to conduct unsupervised streaming feature selection for social media data. |
107 | Weighted Similarity Estimation in Data Streams | Konstantin Kutzkov, Mohamed Ahmed, Sofia Nikitaki | Motivated by applications such as collaborative filtering in large-scale recommender systems, and influence probabilities learning in social networks, we present new randomized algorithms for the estimation of weighted similarity in data streams. |
108 | Private Analysis of Infinite Data Streams via Retroactive Grouping | Rui Chen, Yilin Shen, Hongxia Jin | In this paper, we consider the problem of private analysis of infinite data streams under differential privacy. |
109 | Parallel Lazy Semi-Naive Bayes Strategies for Effective and Efficient Document Classification | Felipe Viegas, Marcos André Gonçalves, Wellington Martins, Leonardo Rocha | In this paper, we investigate whether the relaxation of the NB feature independence assumption (aka, Semi-NB approaches) can improve its effectiveness in large text collections. |
110 | A Novel Class Noise Estimation Method and Application in Classification | Lin Gui, Qin Lu, Ruifeng Xu, Minglei Li, Qikang Wei | In this paper, we propose a method to estimate class noise rate at the level of individual samples in real data. In this paper, we first present the problem of binary classification in the presence of random noise on the class labels, which we call class noise. |
111 | Learning Task Grouping using Supervised Task Space Partitioning in Lifelong Multitask Learning | Meenakshi Mishra, Jun Huan | In this paper, we propose learning functions to model the task relationships as it is computationally cheaper in an online setting. |
112 | KSGM: Keynode-driven Scalable Graph Matching | Xilun Chen, K. Selçuk Candan, Maria Luisa Sapino, Paulo Shakarian | In this paper we note that the expensive refinement phase of graph matching algorithms is not practical in any application where scalability is critical. |
113 | Protecting Your Children from Inappropriate Content in Mobile Apps: An Automatic Maturity Rating Framework | Bing Hu, Bin Liu, Neil Zhenqiang Gong, Deguang Kong, Hongxia Jin | In this work, we aim to design and build a machine learning framework to automatically predict maturity levels for mobile Apps and the associated reasons with a high accuracy and a low cost. |
114 | The Role of Query Sessions in Interpreting Compound Noun Phrases | Marius Pasca | The Role of Query Sessions in Interpreting Compound Noun Phrases |
115 | Deep Semantic Frame-Based Deceptive Opinion Spam Analysis | Seongsoon Kim, Hyeokyoon Chang, Seongwoon Lee, Minhwan Yu, Jaewoo Kang | In this paper, we propose a frame-based deep semantic analysis method for understanding rich characteristics of deceptive and truthful opinions written by various types of individuals including crowdsourcing workers, employees who have expert-level domain knowledge about local businesses, and online users who post on Yelp and TripAdvisor. |
116 | Topic Modeling in Semantic Space with Keywords | Xiaojia Pu, Rong Jin, Gangshan Wu, Dingyi Han, Gui-Rong Xue | In this paper, for the information need about a topic or category, we propose a novel method called TDCS (Topic Distilling with Compressive Sensing) for explicitly and accurately modeling the topic implied by several keywords. |
117 | F1: Accelerating the Optimization of Aggregate Continuous Queries | Anatoli U. Shein, Panos K. Chrysanthis, Alexandros Labrinidis | In this paper we propose a novel closed formula, F1, that accelerates Weavability calculations, and thus allows WeaveShare to achieve exceptional scalability in systems with heavy workloads. |
118 | Fast Distributed Correlation Discovery Over Streaming Time-Series Data | Tian Guo, Saket Sathe, Karl Aberer | To tackle the challenge, we propose a framework called AEGIS. |
119 | Time Series Analysis of Nursing Notes for Mortality Prediction via a State Transition Topic Model | Yohan Jo, Natasha Loghmanpour, Carolyn Penstein Rosé | We propose a time series model that uncovers the temporal dynamics of patients’ underlying states from nursing notes. |
120 | Learning Relative Similarity from Data Streams: Active Online Learning Approaches | Shuji Hao, Peilin Zhao, Steven C.H. Hoi, Chunyan Miao | To overcome the limitation, we propose a novel framework of active online similarity learning. |
121 | Ad Hoc Monitoring of Vocabulary Shifts over Time | Tom Kenter, Melvin Wevers, Pim Huijnen, Maarten de Rijke | In this paper, we propose an algorithm for monitoring shifts in vocabulary over time, given a small set of seed terms. As the task of monitoring shifting vocabularies over time for an ad hoc set of seed words is, to the best of our knowledge, a new one, we construct our own evaluation set. |
122 | Balancing Novelty and Salience: Adaptive Learning to Rank Entities for Timeline Summarization of High-impact Events | Tuan A. Tran, Claudia Niederee, Nattiya Kanhabua, Ujwal Gadiraju, Avishek Anand | In this work, we present a novel approach for timeline summarization of high-impact events, which uses entities instead of sentences for summarizing the event at each individual point in time. |
123 | Location-Based Influence Maximization in Social Networks | Tao Zhou, Jiuxin Cao, Bo Liu, Shuai Xu, Ziqing Zhu, Junzhou Luo | In this paper, we target product promotion in the O2O (online-to-offline) model and study location-based influence maximization on location-based social networks (LBSNs). |
124 | Location and Time Aware Social Collaborative Retrieval for New Successive Point-of-Interest Recommendation | Wei Zhang, Jianyong Wang | In order to solve this problem, we propose a new model called location and time aware social collaborative retrieval model (LTSCR), which has two distinct advantages: (1) it models the location, time, and social information simultaneously for the successive POI recommendation task; (2) it efficiently utilizes the merits of the collaborative retrieval model which leverages weighted approximately ranked pairwise (WARP) loss for achieving better top-n ranking results, just as the new successive POI recommendation task needs. |
125 | Where you Instagram?: Associating Your Instagram Photos with Points of Interest | Xutao Li, Tuan-Anh Nguyen Pham, Gao Cong, Quan Yuan, Xiao-Li Li, Shonali Krishnaswamy | In this paper, we propose to study the problem of mapping Instagram photos to points of interest. |
126 | Gradient-based Signatures for Efficient Similarity Search in Large-scale Multimedia Databases | Christian Beecks, Merih Seran Uysal, Judith Hermanns, Thomas Seidl | In this paper, we propose the concept of gradient-based signatures in order to aggregate content-based features of multimedia objects by means of generative models. |
127 | Cross-Modal Similarity Learning: A Low Rank Bilinear Formulation | Cuicui Kang, Shengcai Liao, Yonghao He, Jian Wang, Wenjia Niu, Shiming Xiang, Chunhong Pan | In this research, there are two critical issues: how to get rid of the heterogeneity between different modalities and how to match the cross-modal features of different dimensions. |
128 | Efficient Sparse Matrix Multiplication on GPU for Large Social Network Analysis | Yong-Yeon Jo, Sang-Wook Kim, Duck-Ho Bae | In this paper, we propose a GPU-based method for efficient sparse matrix multiplication through the parallel computing paradigm. |
129 | The Role Of Citation Context In Predicting Long-Term Citation Profiles: An Experimental Study Based On A Massive Bibliographic Text Dataset | Mayank Singh, Vikas Patidar, Suhansanu Kumar, Tanmoy Chakraborty, Animesh Mukherjee, Pawan Goyal | In this paper, we argue that features gathered from the citation contexts of the research papers can be very relevant for citation prediction. |
130 | Discovering Canonical Correlations between Topical and Topological Information in Document Networks | Yuan He, Cheng Wang, Changjun Jiang | In this paper, we simultaneously incorporate community detection and topic modeling in a unified framework, and appeal to Canonical Correlation Analysis (CCA) to capture the latent semantic correlations between the two heterogeneous latent factors, community and topic. |
131 | Chronological Citation Recommendation with Information-Need Shifting | Zhuoren Jiang, Xiaozhong Liu, Liangcai Gao | In this study, we propose a novel method called "Chronological Citation Recommendation" which assumes initial user information needs could shift while users are searching for papers in different time slices. |
132 | Answering Questions with Complex Semantic Constraints on Open Knowledge Bases | Pengcheng Yin, Nan Duan, Ben Kao, Junwei Bao, Ming Zhou | We propose using n-tuple assertions, which are assertions with an arbitrary number of arguments, and n-tuple open KB (nOKB), which is an open knowledge base of n-tuple assertions. |
133 | Inducing Space Dirichlet Process Mixture Large-Margin Entity Relationship Inference in Knowledge Bases | Sotirios P. Chatzis | In this paper, we focus on the problem of extending a given knowledge base by accurately predicting additional true facts based on the facts included in it. |
134 | Semi-Automated Exploration of Data Warehouses | Thibault Sellam, Emmanuel Müller, Martin Kersten | In this paper, we introduce Claude, a hypothesis generator for data warehouses. |
135 | Large-scale Knowledge Base Completion: Inferring via Grounding Network Sampling over Selected Instances | Zhuoyu Wei, Jun Zhao, Kang Liu, Zhenyu Qi, Zhengya Sun, Guanhua Tian | To resolve the limitations of the above two types of methods, we propose an approach through Inferring via Grounding Network Sampling over Selected Instances. |
136 | Large-Scale Analysis of Dynamics of Choice Among Discrete Alternatives | Andrew Tomkins | The work described in this talk is partly due to other researchers, and partly joint with various colleagues including Ashton Anderson, Ravi Kumar, Mohammad Mahdian, Bo Pang, Sergei Vassilvitskii and Erik Vee. |
137 | On Gapped Set Intersection Size Estimation | Chen Chen, Jianbin Qin, Wei Wang | In this paper, we consider a generalized problem for integer sets where, given a gap parameter δ, two elements are deemed as matches if their numeric difference equals δ or is within δ. |
138 | Inclusion Dependencies Reloaded | Henning Köhler, Sebastian Link | Resolving this conundrum we establish an optimal solution by identifying the desirable class of not-null inclusion dependencies (NNINDs) that subsumes simple and partial semantics as special cases, and whose associated implication problem has the same computational properties as inclusion dependencies in the relational model. |
139 | Comprehensible Models for Reconfiguring Enterprise Relational Databases to Avoid Incidents | Ioana Giurgiu, Mirela Botezatu, Dorothea Wiesmann | We propose using machine learning to understand how configuring a DBMS can lead to such high risk incidents. We collect historical data from three IT environments that run both IBM DB2 and Oracle DBMS. |
140 | An Optimal Online Algorithm For Retrieving Heavily Perturbed Statistical Databases In The Low-Dimensional Querying Model | Krzysztof Marcin Choromanski, Afshin Rostamizadeh, Umar Syed | We assume the distribution D is defined on the neighborhood of a low-dimensional manifold. |
141 | Aggregation of Crowdsourced Ordinal Assessments and Integration with Learning to Rank: A Latent Trait Model | Pavel Metrikov, Virgil Pavlu, Javed A. Aslam | To use such assessments for either evaluation or learning, we propose a new framework for the inference of true document relevance from crowdsourced data—one simpler than previous approaches and achieving better performance. |
142 | Weakly Supervised Natural Language Processing Framework for Abstractive Multi-Document Summarization: Weakly Supervised Abstractive Multi-Document Summarization | Peng Li, Weidong Cai, Heng Huang | In this paper, we propose a new weakly supervised abstractive news summarization framework using pattern based approaches. |
143 | Short Text Similarity with Word Embeddings | Tom Kenter, Maarten de Rijke | We propose to go from word-level to text-level semantics by combining insights from methods based on external sources of semantic knowledge with word embeddings. |
144 | Building Representative Composite Items | Vincent Leroy, Sihem Amer-Yahia, Eric Gaussier, Hamid Mirisaee | We formalize building representative CIs as an optimization problem and propose KFC, an extended fuzzy clustering algorithm to solve it. |
145 | More Accurate Question Answering on Freebase | Hannah Bast, Elmar Haussmann | We evaluate our system, called Aqqu, on two standard benchmarks, Free917 and WebQuestions, improving the previous best result for each benchmark considerably. |
146 | Improving Ranking Consistency for Web Search by Leveraging a Knowledge Base and Search Logs | Jyun-Yu Jiang, Jing Liu, Chin-Yew Lin, Pu-Jen Cheng | In this paper, we propose a new idea called ranking consistency in web search. |
147 | Assessing the Impact of Syntactic and Semantic Structures for Answer Passages Reranking | Kateryna Tymoshenko, Alessandro Moschitti | In this paper, we extensively study the use of syntactic and semantic structures obtained with shallow and deeper syntactic parsers in the answer passage reranking task. |
148 | Ranking Entities for Web Queries Through Text and Knowledge | Michael Schuhmacher, Laura Dietz, Simone Paolo Ponzetto | In this paper, we aim at automating this process by retrieving and ranking entities that are relevant to understand free-text web-style queries like "Argentine British relations", which typically demand a set of heterogeneous entities with no specific target type like, for instance, Falklands_War or Margaret_Thatcher, as answer. |
149 | What Is a Network Community?: A Novel Quality Function and Detection Algorithms | Atsushi Miyauchi, Yasushi Kawase | In this study, we introduce a novel quality function for a network community, which we refer to as the communitude. |
150 | DifRec: A Social-Diffusion-Aware Recommender System | Hossein Vahabi, Iordanis Koutsopoulos, Francesco Gullo, Maria Halkidi | In this work we take a step towards rethinking recommender systems by exploiting the anticipated social-network information diffusion and withholding recommendation of items that are expected to reach a user through sharing/re-posting. |
151 | Who With Whom And How?: Extracting Large Social Networks Using Search Engines | Stefan Siersdorfer, Philipp Kemkes, Hanno Ackermann, Sergej Zerr | In this paper, we introduce novel methodologies for query-based search engine mining, enabling efficient extraction of social networks from large amounts of Web data. |
152 | Modeling Individual-Level Infection Dynamics Using Social Network Information | Suppawong Tuarob, Conrad S. Tucker, Marcel Salathe, Nilam Ram | In this paper, we demonstrate how social media information can be incorporated into and improve upon traditional techniques used to model the dynamics of infectious diseases. |
153 | Finding Probabilistic k-Skyline Sets on Uncertain Data | Jinfei Liu, Haoyu Zhang, Li Xiong, Haoran Li, Jun Luo | We present an efficient algorithm for computing probabilistic k-skyline sets. |
154 | Ordering Selection Operators Under Partial Ignorance | Khaled H. Alyoubi, Sven Helmer, Peter T. Wood | The selectivities are modelled as intervals rather than exact values and we apply a concept from decision theory, the minimisation of the maximum regret, as a measure of optimality. |
155 | Querying Temporal Drifts at Multiple Granularities | Sofia Kleisarchaki, Sihem Amer-Yahia, Ahlame Douzal-Chouakria, Vassilis Christophides | In this paper, we adopt a query-based approach to drift detection. |
156 | Efficient Incremental Evaluation of Succinct Regular Expressions | Henrik Björklund, Wim Martens, Thomas Timm | In this paper we study the usage and effectiveness of the counting operator (or: limited repetition) in regular expressions. |
157 | Struggling and Success in Web Search | Daan Odijk, Ryen W. White, Ahmed Hassan Awadallah, Susan T. Dumais | We address this important issue using a mixed methods study using large-scale logs, crowd-sourced labeling, and predictive modeling. |
158 | Behavioral Dynamics from the SERP’s Perspective: What are Failed SERPs and How to Fix Them? | Julia Kiseleva, Jaap Kamps, Vadim Nikulin, Nikita Makarov | Our analysis of behavioral dynamics at the SERP level gives new insight into one of the primary causes of search failure: temporal query intent drifts. |
159 | What Users Ask a Search Engine: Analyzing One Billion Russian Question Queries | Michael Völske, Pavel Braslavski, Matthias Hagen, Galina Lezina, Benno Stein | As an alternative, we propose a robust question query classification method that uses the labeled questions from a large community question answering platform (CQA) as a training set. |
160 | Does Vertical Bring more Satisfaction?: Predicting Search Satisfaction in a Heterogeneous Environment | Ye Chen, Yiqun Liu, Ke Zhou, Meng Wang, Min Zhang, Shaoping Ma | In this paper, we carry out a lab-based user study with specifically designed SERPs to determine how verticals with different qualities and presentation styles affect search satisfaction. |
161 | Characterizing and Predicting Viral-and-Popular Video Content | David Vallet, Shlomo Berkovsky, Sebastien Ardon, Anirban Mahanti, Mohamed Ali Kafaar | In this paper, we focus on the observable dependencies between the virality of video content on a micro-blogging social network (in this case, Twitter) and the popularity of such content on a video distribution service (YouTube). |
162 | Social Spammer and Spam Message Co-Detection in Microblogging with Social Context Regularization | Fangzhao Wu, Jinyun Shu, Yongfeng Huang, Zhigang Yuan | In this paper, we propose a unified framework for social spammer and spam message co-detection in microblogging. |
163 | Central Topic Model for Event-oriented Topics Mining in Microblog Stream | Min Peng, Jiahui Zhu, Xuhui Li, Jiajia Huang, Hua Wang, Yanchun Zhang | In this paper, we propose a central topic model (CenTM), where a Multi-view Clustering algorithm with Two-phase Random Walk (MC-TRW) is devised to aggregate the LDA’s latent topics into central topics. |
164 | Video Popularity Prediction by Sentiment Propagation via Implicit Network | Wanying Ding, Yue Shang, Lifan Guo, Xiaohua Hu, Rui Yan, Tingting He | Here, we propose a Dual Sentimental Hawkes Process (DSHP) to cope with all the problems above. |
165 | Joint Modeling of User Check-in Behaviors for Point-of-Interest Recommendation | Hongzhi Yin, Xiaofang Zhou, Yingxia Shao, Hao Wang, Shazia Sadiq | In light of the above, we propose a joint probabilistic generative model to mimic user check-in behaviors in a process of decision making, which strategically integrates the above factors to effectively overcome the data sparsity, especially for out-of-town users. |
166 | ORec: An Opinion-Based Point-of-Interest Recommendation Framework | Jia-Dong Zhang, Chi-Yin Chow, Yu Zheng | In this paper, we propose an opinion-based POI recommendation framework called ORec to take full advantage of the user opinions on POIs expressed as tips. |
167 | Toward Dual Roles of Users in Recommender Systems | Suhang Wang, Jiliang Tang, Huan Liu | In this paper, we investigate how to exploit dual roles of users in recommender systems. |
168 | TriRank: Review-aware Explainable Recommendation by Modeling Aspects | Xiangnan He, Tao Chen, Min-Yen Kan, Xiao Chen | Aside from users’ ratings, their affiliated reviews often provide the rationale for their ratings and identify what aspects of the item they cared most about. |
169 | RoadRank: Traffic Diffusion and Influence Estimation in Dynamic Urban Road Networks | Tarique Anwar, Chengfei Liu, Hai L. Vu, Md. Saiful Islam | In this work, we propose RoadRank, an algorithm to compute the influence scores of each road segment in an urban road network, and rank them based on their overall influence. |
170 | On Query-Update Independence for SPARQL | Nicola Guido, Pierre Genevès, Nabil Layaïda, Cécile Roisin | This paper investigates techniques for detecting independence of SPARQL queries from updates. |
171 | A Structured Query Model for the Deep Relational Web | Hasan M. Jamil, Hosagrahar V. Jagadish | In this paper, we describe ongoing research on a generic structured query model that can be used against the deep web. |
172 | A Flash-aware Buffering Scheme using On-the-fly Redo | Kyosung Jeong, Sang-Wook Kim, Sungchae Lim | In this paper, we address how to reduce the amount of page updates in flash-based DBMS equipped with SSD (Solid State Drive). |
173 | Defragging Subgraph Features for Graph Classification | Haishuai Wang, Peng Zhang, Ivor Tsang, Ling Chen, Chengqi Zhang | In this paper, we propose a new Subgraph Join Feature Selection (SJFS) algorithm. |
174 | Structural Constraints for Multipartite Entity Resolution with Markov Logic Network | Tengyuan Ye, Hady W. Lauw | We propose a principled solution to the multipartite entity resolution problem, building on the foundation of Markov Logic Network (MLN) that combines probabilistic graphical model and first-order logic. |
175 | Know Your Onions: Understanding the User Experience with the Knowledge Module in Web Search | Ioannis Arapakis, Luis A. Leiva, B. Barla Cambazoglu | Our work is an early attempt to bridge this gap. |
176 | Personalized Federated Search at LinkedIn | Dhruv Arya, Viet Ha-Thuc, Shakti Sinha | To tackle this problem, we exploit a data-driven approach that extracts searcher intents from their profile data and recent activities at a large scale. |
177 | Balancing Exploration and Exploitation: Empirical Parameterization of Exploratory Search Systems | Kumaripaba Ahukorala, Alan Medlar, Kalle Ilves, Dorota Glowacka | We present a user study to analyze how different exploration rates affect search performance, user satisfaction, and the number of documents selected. |
178 | On Predicting Deletions of Microblog Posts | Mossaab Bagdouri, Douglas W. Oard | This paper addresses the problem of deletion prediction by analyzing the distribution of deleted tweets, presenting a new evaluation framework, exploring tweet-based and user-based features, and reporting prediction scores. |
179 | Semi-Automated Text Classification for Sensitivity Identification | Giacomo Berardi, Andrea Esuli, Craig Macdonald, Iadh Ounis, Fabrizio Sebastiani | We use a recently proposed utility-theoretic approach to SATC that explicitly optimizes the chosen effectiveness function when ranking the documents by sensitivity; this is especially useful in our case, since sensitivity identification is a recall-oriented task, thus requiring the use of a recall-oriented evaluation measure such as F2. |
180 | Identification of Microblogs Prominent Users during Events by Learning Temporal Sequences of Features | Imen Bizid, Nibal Nayef, Patrice Boursier, Sami Faiz, Antoine Doucet | This work proposes a probabilistic model for the identification of prominent users in microblogs during specific events. |
181 | A Real-Time Eye Tracking Based Query Expansion Approach via Latent Topic Modeling | Yongqiang Chen, Peng Zhang, Dawei Song, Benyou Wang | In this paper, we propose a real-time eye tracking based query expansion method, which is able to: (1) automatically capture the terms that the user is viewing by utilizing eye tracking techniques; (2) derive the user’s latent intent based on the eye tracking terms and by using the Latent Dirichlet Allocation (LDA) approach. |
182 | Clustered Semi-Supervised Relevance Feedback | Kripabandhu Ghosh, Swapan Kumar Parui | In this paper, we consider an intermediate, semi-supervised scheme, in which only a subset of results is selected for annotation, and then their labels are propagated to their nearest neighbours. |
183 | On the Effect of "Stupid" Search Components on User Interaction with Search Engines | Lidia Grauer, Aleksandra Lomakina | Using eye-tracking, we investigate how searchers interact with Web search engines which get affected by nonsensical results. |
184 | Social-Relational Topic Model for Social Networks | Weiyu Guo, Shu Wu, Liang Wang, Tieniu Tan | To address the above limitations, we propose a novel Social-Relational Topic Model (SRTM), which can alleviate the effect of topic-irrelevant links by analyzing relational users’ topics of each link. |
185 | Building Effective Query Classifiers: A Case Study in Self-harm Intent Detection | Ashiqur R. KhudaBukhsh, Paul N. Bennett, Ryen W. White | We address a common scenario in designing such triggers for real-world settings where positives are rare and search providers possess only a small seed set of positive examples to learn query classification models. |
186 | Modelling the Usefulness of Document Collections for Query Expansion in Patient Search | Nut Limsopatham, Craig Macdonald, Iadh Ounis | In this work, we investigate two automatic approaches that measure and leverage the usefulness of document collections when exploiting multiple document collections to improve query representation. |
187 | A Convolutional Click Prediction Model | Qiang Liu, Feng Yu, Shu Wu, Liang Wang | In this work, we propose a novel model, Convolutional Click Prediction Model (CCPM), based on a convolutional neural network. |
188 | A Study of Query Length Heuristics in Information Retrieval | Yuanhua Lv | In this paper, we reveal that query length actually interacts with term frequency (TF) normalization, a key component of all effective retrieval models. |
189 | Detect Rumors Using Time Series of Social Context Information on Microblogging Websites | Jing Ma, Wei Gao, Zhongyu Wei, Yueming Lu, Kam-Fai Wong | In this study, we propose a novel approach to capture the temporal characteristics of these features based on the time series of rumor’s lifecycle, for which time series modeling technique is applied to incorporate various social context information. |
190 | Query Auto-Completion for Rare Prefixes | Bhaskar Mitra, Nick Craswell | In particular, we describe a candidate generation approach using frequently observed query suffixes mined from historical search logs. |
191 | Pooled Evaluation Over Query Variations: Users are as Diverse as Systems | Alistair Moffat, Falk Scholer, Paul Thomas, Peter Bailey | Therefore an approach called pooling is typically used where, for example, the documents to be judged can be determined by taking the union of all documents returned in the top positions of the answer lists returned by a range of systems. |
192 | The Influence of Pre-processing on the Estimation of Readability of Web Documents | João Rafael de Moura Palotti, Guido Zuccon, Allan Hanbury | This paper investigates the effect that text pre-processing approaches have on the estimation of the readability of web pages. |
193 | Atypical Queries in eCommerce | Neeraj Pradhan, Vinay Deolalikar, Kang Li | In this paper, we use query-click log data to address the problem of identifying "atypical queries": these are queries that are extremal in terms of specificity, ambiguity, or breadth of intent. |
194 | Bottom-up Faceted Search: Creating Search Neighbourhoods with Datacube Cells | Mark Sifer | This paper extends this approach to curated corpora that contain items or documents that have been classified in multiple dimensions (facets), where each dimension classification may be a hierarchy. |
195 | Personalized Recommendation Meets Your Next Favorite | Qiang Song, Jian Cheng, Ting Yuan, Hanqing Lu | In this paper, we propose a unified model, namely States Transition pAir-wise Ranking Model (STAR), to address users’ favorites mining for sequential-set recommendation. |
196 | Recommending Short-lived Dynamic Packages for Golf Booking Services | Robin Swezey, Young-joo Chung | We introduce an approach to recommending short-lived dynamic packages for golf booking services. |
197 | Large-Scale Question Answering with Joint Embedding and Proof Tree Decoding | Zhenghao Wang, Shengquan Yan, Huaming Wang, Xuedong Huang | We frame the problem from a proof-theoretic perspective, and formulate it as a proof tree search problem that seamlessly unifies semantic parsing, logic reasoning, and answer ranking. |
198 | Query Length, Retrievability Bias and Performance | Colin Wilkie, Leif Azzopardi | In this paper, we examine whether there are benefits of longer queries beyond performance. |
199 | Gauging Correct Relative Rankings For Similarity Search | Weiren Yu, Julie McCann | In this paper, we propose efficient ranking criteria that can secure correct relative orders of node-pairs with respect to SimRank scores when they are computed in an iterative fashion. |
200 | Learning User Preferences for Topically Similar Documents | Mustafa Zengin, Ben Carterette | In this study, we collect user preference judgements of web document similarity in order to investigate: (1) the correlation between similarity measures and users’ perception of similarity, (2) the correlation between the web document features plus document-query features and users’ similarity judgements. |
201 | Modeling Parameter Interactions in Ranking SVM | Yaogong Zhang, Jun Xu, Yanyan Lan, Jiafeng Guo, Maoqiang Xie, Yalou Huang, Xueqi Cheng | This paper aims to answer the question. |
202 | Best First Over-Sampling for Multilabel Classification | Xusheng Ai, Jian Wu, Victor S. Sheng, Yufeng Yao, Pengpeng Zhao, Zhiming Cui | In this paper we propose a MultiLabel Best First Over-sampling (ML-BFO) to improve the performance of multilabel classification algorithms, based on imbalance minimization and Wilson’s ENN rule. |
203 | Co-clustering Document-term Matrices by Direct Maximization of Graph Modularity | Melissa Ailem, François Role, Mohamed Nadif | We present Coclus, a novel diagonal co-clustering algorithm which is able to effectively co-cluster binary or contingency matrices by directly maximizing an adapted version of the modularity measure traditionally used for networks. |
204 | A Data-Driven Approach to Distinguish Cyber-Attacks from Physical Faults in a Smart Grid | Adnan Anwar, Abdun Naser Mahmood, Zubair Shah | In this paper, we utilize a data-driven approach to accurately differentiate the physical faults from cyber-attacks. First, we create a realistic dataset by generating different types of faults and cyber-attacks on the IEEE 30 bus benchmark test system. |
205 | Improving Event Detection by Automatically Assessing Validity of Event Occurrence in Text | Andrea Ceroni, Ujwal Kumar Gadiraju, Marco Fisichella | In this paper, we automatize event validation, defined as the task of determining whether a given event occurs in a given document or corpus. |
206 | DAAV: Dynamic API Authority Vectors for Detecting Software Theft | Dong-Kyu Chae, Sang-Wook Kim, Seong-Je Cho, Yesol Kim | This paper proposes a novel birthmark, a dynamic API authority vector (DAAV), for detecting software theft. |
207 | Towards Multi-level Provenance Reconstruction of Information Diffusion on Social Media | Tom De Nies, Io Taxidou, Anastasia Dimou, Ruben Verborgh, Peter M. Fischer, Erik Mannens, Rik Van de Walle | Therefore in this paper, we propose an approach to reconstruct the provenance of messages on social media on multiple levels. |
208 | Profiling Pedestrian Distribution and Anomaly Detection in a Dynamic Environment | Minh Tuan Doan, Sutharshan Rajasegarar, Mahsa Salehi, Masud Moshtaghi, Christopher Leckie | In this paper we model the normal behaviours of pedestrian flows and detect anomalous events from pedestrian counting data of the City of Melbourne. |
209 | A Clustering-based Approach to Detect Probable Outcomes of Lawsuits | Daniel Lemes Gribel, Maira Gatti de Bayser, Leonardo Guerreiro Azevedo | This work proposes an approach to identify possible judgment outcomes that considers the use of similarity calculations and clustering mechanisms based on lawsuits patterns. |
210 | Detecting Check-worthy Factual Claims in Presidential Debates | Naeemul Hassan, Chengkai Li, Mark Tremayne | Specifically, we prepared a U.S. presidential debate dataset and built classification models to distinguish check-worthy factual claims from non-factual claims and unimportant factual claims. |
211 | Where You Go Reveals Who You Know: Analyzing Social Ties from Millions of Footprints | Hsun-Ping Hsieh, Rui Yan, Cheng-Te Li | This paper aims to investigate how the geographical footprints of users correlate to their social ties. |
212 | Message Clustering based Matrix Factorization Model for Retweeting Behavior Prediction | Bo Jiang, Jiguang Liang, Ying Sha, Lihong Wang | In this paper, we propose two message clustering based matrix factorization models for retweeting prediction. |
213 | Heterogeneous Multi-task Semantic Feature Learning for Classification | Xin Jin, Fuzhen Zhuang, Sinno Jialin Pan, Changying Du, Ping Luo, Qing He | In this paper, we study the problem of MTL with heterogeneous features for each task. |
214 | Top-k Reliable Edge Colors in Uncertain Graphs | Arijit Khan, Francesco Gullo, Thomas Wohler, Francesco Bonchi | To this end, we aim at designing effective and scalable solutions for the top-k reliable color set problem. |
215 | Probabilistic Non-negative Inconsistent-resolution Matrices Factorization | Masahiro Kohjima, Tatsushi Matsubayashi, Hiroshi Sawada | In this paper, we tackle the problem of analyzing datasets with different resolutions, such as a pair of a user’s individual data and a user group’s data, for example "userA visited shopA 5 times" and "users whose attributes are men purchased itemA 80 times in total". |
216 | Identifying Attractive News Headlines for Social Media | Sawa Kourogi, Hiroyuki Fujishiro, Akisato Kimura, Hitoshi Nishikawa | This paper provides a novel solution to this problem by identifying attractive headlines as a gateway to news articles. |
217 | A Probabilistic Rating Auto-encoder for Personalized Recommender Systems | Huizhi Liang, Timothy Baldwin | In this paper, we propose a probabilistic rating auto-encoder to perform unsupervised feature learning and generate latent user feature profiles from large-scale user rating data. |
218 | Real-time Rumor Debunking on Twitter | Xiaomo Liu, Armineh Nourbakhsh, Quanzhi Li, Rui Fang, Sameena Shah | In this paper, we propose the first real time rumor debunking algorithm for Twitter. |
219 | Fraud Transaction Recognition: A Money Flow Network Approach | Renxin Mao, Zhao Li, Jinhua Fu | In this paper, we provide some insights into analysis of fraud transaction recognition on Alipay’s Money Flow Network. |
220 | Identifying Top-k Consistent News-Casters on Twitter | Sahisnu Mazumder, Sameep Mehta, Dhaval Patel | In this paper, we present a framework, NCFinder, to discover top-k consistent news-casters directly from Twitter. |
221 | Mining the Minds of Customers from Online Chat Logs | Kunwoo Park, Jaewoo Kim, Jaram Park, Meeyoung Cha, Jiin Nam, Seunghyun Yoon, Eunhee Rhim | This study investigates factors that may determine satisfaction in customer service operations. |
222 | A Fast k-Nearest Neighbor Search Using Query-Specific Signature Selection | Youngki Park, Heasoo Hwang, Sang-goo Lee | In this paper, we aim to improve the performance of k-NN search and to achieve a consistent k-NN search that performs well on various datasets. |
223 | Core-Sets For Canonical Correlation Analysis | Saurabh Paul | In this work, we consider the over-constrained case where the number of rows is greater than the number of columns (m > max(n,l)). |
224 | DeepCamera: A Unified Framework for Recognizing Places-of-Interest based on Deep ConvNets | Pai Peng, Hongxiang Chen, Lidan Shou, Ke Chen, Gang Chen, Chang Xu | In this work, we present a novel project called DeepCamera (DC) for recognizing places-of-interest (POI) with smartphones. |
225 | Structured Sparse Regression for Recommender Systems | Mingjie Qian, Liangjie Hong, Yue Shi, Suju Rajan | In this paper we employ rich features from both user and item sides to enhance latent factors learnt from interaction data, uncovering hidden structures from features’ relationships and learning sparse pairwise and tree structural connections among features. |
226 | Analyzing Document Intensive Business Processes using Ontology | Suman Roychoudhury, Vinay Kulkarni, Nikhil Bellarykar | In particular, this paper presents a real life example of a document intensive business process (International Trade) and attempts to model and analyze the process in a formal way. |
227 | Transductive Domain Adaptation with Affinity Learning | Le Shu, Longin Jan Latecki | We propose a novel method to solve domain adaptation task in a transductive setting. |
228 | Update Summarization using Semi-Supervised Learning Based on Hellinger Distance | Dingding Wang, Sahar Sohangir, Tao Li | In this paper, we propose a new method to generate the sentence similarity graph using a novel similarity measure based on Hellinger distance and apply semi-supervised learning on the sentence graph to select the sentences with maximum consistency and minimum redundancy to form the summaries. |
229 | Multi-view Clustering via Structured Low-rank Representation | Dong Wang, Qiyue Yin, Ran He, Liang Wang, Tieniu Tan | In this paper, we present a novel solution to multi-view clustering through a structured low-rank representation. |
230 | Partially Labeled Data Tuple Can Optimize Multivariate Performance Measures | Jim Jing-Yan Wang, Xin Gao | In this paper, we show that the multivariate performance measures can also be optimized by learning from partially labeled data tuple, when the label tuple is incomplete. |
231 | Modeling Infinite Topics on Social Behavior Data with Spatio-temporal Dependence | Peng Wang, Peng Zhang, Chuan Zhou, Zhao Li, Guo Li | In this paper we present a new nonparametric Bayesian model Time and Space Dependent Chinese Restaurant Processes (TSD-CRP for short). |
232 | ASEM: Mining Aspects and Sentiment of Events from Microblog | Ruhui Wang, Weijing Huang, Wei Chen, Tengjiao Wang, Kai Lei | In this paper we propose a novel probabilistic generative model (ASEM) to simultaneously discover aspects and the specified opinions. |
233 | Enhanced Word Embeddings from a Hierarchical Neural Language Model | Xun Wang, Katsuhito Sudoh, Masaaki Nagata | This paper proposes a neural language model to capture the interaction of text units of different levels, i.e.. |
234 | Improving Label Quality in Crowdsourcing Using Noise Correction | Jing Zhang, Victor S. Sheng, Jian Wu, Xiaoqin Fu, Xindong Wu | This paper proposes a novel framework that introduces noise correction techniques to further improve label quality after ground truth inference in crowdsourcing. |
235 | Improving Collaborative Filtering via Hidden Structured Constraint | Qing Zhang, Houfeng Wang | To solve this problem, we propose a novel matrix factorization model with adaptive graph regularization framework, which can automatically discover latent user communities jointly with learning latent user representations, to enhance the discriminative power for recommendation. |
236 | DOLAP 2015 Workshop Summary | Carlos Garcia-Alvarado, Carlos Ordonez, Il-Yeol Song | The ACM DOLAP workshop presents research that bridges data warehousing, On-Line Analytical Processing (OLAP), and other large-scale data processing platforms. |
237 | DTMBIO 2015: International Workshop on Data and Text Mining in Biomedical Informatics | Min Song, Doheon Lee, Karin Verspoor | DTMBIO 2015: International Workshop on Data and Text Mining in Biomedical Informatics |
238 | ECol 2015: First international workshop on the Evaluation on Collaborative Information Seeking and Retrieval | Leif Azzopardi, Jeremy Pickens, Tetsuya Sakai, Laure Soulier, Lynda Tamine | The goal of this workshop is to investigate the evaluation challenges in CIS/CIR with the hope of building standardized evaluation frameworks, methodologies, and task specifications that would foster and grow the research area (in a collaborative fashion). |
239 | Eighth Workshop on Exploiting Semantic Annotations in Information Retrieval (ESAIR’15) | Krisztian Balog, Jeffrey Dalton, Antoine Doucet, Yusra Ibrahim | We dedicate a special "annotations in action" track to demonstrations that showcase innovative prototype systems, in addition to the regular research and position paper contributions. |
240 | LSDS-IR’15: 2015 Workshop on Large-Scale and Distributed Systems for Information Retrieval | Ismail Sengor Altingovde, B. Barla Cambazoglu, Nicola Tonellotto | The LSDS-IR’15 workshop will provide space for researchers to discuss the existing performance problems in the context of large-scale and distributed information retrieval systems and define new research directions in the modern Big Data era. |
241 | NWSearch 2015: International Workshop on Novel Web Search Interfaces and Systems | Davood Rafiei, Katsumi Tanaka | In particular, the workshop seeks to identify some of the problems and challenges facing the development of such tools and interfaces and to flourish new ideas and findings that can shape or influence future research directions and developments. |
242 | PIKM 2015: The 8th ACM Workshop for Ph.D. Students in Information and Knowledge Management | Mouna Kacimi, Nicoleta Preda, Maya Ramanath | Similar to the CIKM, the PIKM workshop covers a wide range of topics in the areas of databases, information retrieval and knowledge management. |
243 | TM 2015 — Topic Models: Post-Processing and Applications Workshop | Nikolaos Aletras, Jey Han Lau, Timothy Baldwin, Mark Stevenson | The main objective of the workshop is to bring together researchers who are interested in applications of topic models and improving their output. |
244 | UCUI’15: The 1st International Workshop on Understanding the City with Urban Informatics | Yashar Moshfeghi, Iadh Ounis, Craig Macdonald, Joemon M. Jose, Peter Triantafillou, Mark Livingston, Piyushimita Thakuriah | The goal of the workshop is to provide a multidisciplinary forum which brings together researchers in Big Data (BD), Information Retrieval (IR), Data Mining, and Urban Studies, to explore novel solutions to the numerous theoretical, practical and ethical challenges arising in this context. |