Paper Digest: WWW 2016 Highlights
The Web Conference (WWW) is one of the top internet conferences in the world.
To help the community quickly catch up on the work presented at this conference, the Paper Digest Team processed all accepted papers and generated one highlight sentence (typically the main topic) for each paper. Readers are encouraged to read these machine-generated highlights / summaries to quickly get the main idea of each paper.
If you do not want to miss any interesting academic paper, you are welcome to sign up for our free daily paper digest service to get updates on new papers published in your area every day. You are also welcome to follow us on Twitter and LinkedIn for updates on new conference digests.
Paper Digest Team
team@paperdigest.org
TABLE 1: WWW 2016 Papers
No. | Title | Authors | Highlight |
---|---|---|---|
1 | The Semantic Web and the Semantics of the Web: Where Does Meaning Come From? | Peter Norvig | This talk investigates the possibilities. |
2 | Dot Everyone!!: Power, the Internet and You | Martha Lane Fox | Dot Everyone!!: Power, the Internet and You |
3 | La Sécurité Ouverte How We Doin? So Far? | Mary Ellen Zurko | La Sécurité Ouverte How We Doin? So Far? |
4 | Social Networks Under Stress | Daniel M. Romero, Brian Uzzi, Jon Kleinberg | Here, we study how external events are associated with a network’s change in structure and communications. |
5 | Measuring Urban Social Diversity Using Interconnected Geo-Social Networks | Desislava Hristova, Matthew J. Williams, Mirco Musolesi, Pietro Panzarasa, Cecilia Mascolo | In this work, we present a novel network perspective on the interconnected nature of people and places, allowing us to capture the social diversity of urban locations through the social network and mobility patterns of their visitors. |
6 | Recommendations in Signed Social Networks | Jiliang Tang, Charu Aggarwal, Huan Liu | We provide a principled and mathematical approach to exploit signed social networks for recommendation, and propose a model, RecSSN, to leverage positive and negative links in signed social networks. |
7 | HeteroSales: Utilizing Heterogeneous Social Networks to Identify the Next Enterprise Customer | Qingbo Hu, Sihong Xie, Jiawei Zhang, Qiang Zhu, Songtao Guo, Philip S. Yu | Unlike many previous research works focusing on machine learning algorithms to support online sales, this paper introduces an approach that utilizes heterogeneous social networks to improve the effectiveness of offline sales. |
8 | Immersive Recommendation: News and Event Recommendations Using Personal Digital Traces | Cheng-Kang Hsieh, Longqi Yang, Honghao Wei, Mor Naaman, Deborah Estrin | We propose a new user-centric recommendation model, called Immersive Recommendation, that incorporates cross-platform and diverse personal digital traces into recommendations. |
9 | Beyond Collaborative Filtering: The List Recommendation Problem | Oren Sar Shalom, Noam Koenigstein, Ulrich Paquet, Hastagiri P. Vanchinathan | In this work, we introduce the list recommendation problem. |
10 | Economic Recommendation with Surplus Maximization | Yongfeng Zhang, Qi Zhao, Yi Zhang, Daniel Friedman, Min Zhang, Yiqun Liu, Shaoping Ma | In this paper, we show how to adapt economists’ traditional idea of maximizing total surplus (the sum of consumer net benefit and producer profit) to the heterogeneous world of online service allocation, in an effort to promote the web intelligence for social good in online eco-systems. |
11 | When do Recommender Systems Work the Best?: The Moderating Effects of Product Attributes and Consumer Reviews on Recommender Performance | Dokyun Lee, Kartik Hosanagar | We investigate the moderating effect of product attributes and consumer reviews on the efficacy of a collaborative filtering recommender system on an e-commerce site. |
12 | TrackMeOrNot: Enabling Flexible Control on Web Tracking | Wei Meng, Byoungyoung Lee, Xinyu Xing, Wenke Lee | In this paper, we propose TrackMeOrNot, a new anti-tracking mechanism. |
13 | In a World That Counts: Clustering and Detecting Fake Social Engagement at Scale | Yixuan Li, Oscar Martinez, Xing Chen, Yi Li, John E. Hopcroft | In this paper, we focus on the social site of YouTube and the problem of identifying bad actors posting inorganic contents and inflating the count of social engagement metrics. |
14 | Tracking the Trackers | Zhonghao Yu, Sam Macbeth, Konark Modi, Josep M. Pujol | In this paper we propose a novel approach, based on concepts leveraged from k-anonymity, in which users collectively identify unsafe data elements, which have the potential to uniquely identify an individual user, and remove them from requests. |
15 | Crowdsourcing Annotations for Websites’ Privacy Policies: Can It Really Work? | Shomir Wilson, Florian Schaub, Rohan Ramanath, Norman Sadeh, Fei Liu, Noah A. Smith, Frederick Liu | In this paper, we assess the viability of crowdsourcing privacy policy annotations. |
16 | Abusive Language Detection in Online User Content | Chikashi Nobata, Joel Tetreault, Achint Thomas, Yashar Mehdad, Yi Chang | In this work, we develop a machine learning based method to detect hate speech in online user comments from two domains, which outperforms a state-of-the-art deep learning approach. We also develop a corpus of user comments annotated for abusive language, the first of its kind. |
17 | Hidden Topic Sentiment Model | Md Mustafizur Rahman, Hongning Wang | In this paper, we build a Hidden Topic Sentiment Model (HTSM) to explicitly capture topic coherence and sentiment consistency in an opinionated text document to accurately extract latent aspects and corresponding sentiment polarities. |
18 | Mining Aspect-Specific Opinion using a Holistic Lifelong Topic Model | Shuai Wang, Zhiyuan Chen, Bing Liu | To further improve it, we incorporate the idea of lifelong machine learning and propose a more advanced model, called the LAST (Lifelong Aspect-based Sentiment Topic) model. |
19 | Mean Field Equilibria for Competitive Exploration in Resource Sharing Settings | Pu Yang, Krishnamurthy Iyer, Peter I. Frazier | We consider a model of nomadic agents exploring and competing for time-varying location-specific resources, arising in crowdsourced transportation services, online communities, and in traditional location-based economic activity. |
20 | A Field Guide to Personalized Reserve Prices | Renato Paes Leme, Martin Pal, Sergei Vassilvitskii | We study the question of setting and testing reserve prices in single item auctions when the bidders are not identical. |
21 | Competition on Price and Quality in Cloud Computing | Cinar Kilcioglu, Justin M. Rao | Competition on Price and Quality in Cloud Computing |
22 | Mechanism Design for Mixed Bidders | Yoram Bachrach, Sofia Ceppi, Ian A. Kash, Peter Key, Mohammad Reza Khani | We introduce a transitional mechanism which encourages advertisers to update their bids to their valuations, while mitigating revenue loss. |
23 | Semantics and Expressive Power of Subqueries and Aggregates in SPARQL 1.1 | Mark Kaminski, Egor V. Kostylev, Bernardo Cuenca Grau | In this paper we provide an in-depth formal analysis of the semantics and expressive power of these new constructs as defined in the SPARQL 1.1 specification, and hence lay the necessary foundations for the development of robust, scalable and extensible query engines supporting complex numerical and analytics tasks. |
24 | Reverse Engineering SPARQL Queries | Marcelo Arenas, Gonzalo I. Diaz, Egor V. Kostylev | We provide here an investigation of the reverse engineering problem in the context of SPARQL. |
25 | Profiling the Potential of Web Tables for Augmenting Cross-domain Knowledge Bases | Dominique Ritze, Oliver Lehmberg, Yaser Oulabi, Christian Bizer | In this paper, we match a large, publicly available Web table corpus to the DBpedia knowledge base. |
26 | Foundations of JSON Schema | Felipe Pezoa, Juan L. Reutter, Fernando Suarez, Martín Ugarte, Domagoj Vrgoč | In this paper we provide the first formal definition of syntax and semantics for JSON Schema and use it to show that implementing this layer on top of JSON is feasible in practice. |
27 | Mining Online Social Data for Detecting Social Network Mental Disorders | Hong-Han Shuai, Chih-Ya Shen, De-Nian Yang, Yi-Feng Lan, Wang-Chien Lee, Philip S. Yu, Ming-Syan Chen | In this paper, we argue that mining online social behavior provides an opportunity to actively identify SNMDs at an early stage. |
28 | Visualizing Large-scale and High-dimensional Data | Jian Tang, Jingzhou Liu, Ming Zhang, Qiaozhu Mei | We propose LargeVis, a technique that first constructs an accurately approximated K-nearest neighbor graph from the data and then lays out the graph in the low-dimensional space. |
29 | Predicting Pre-click Quality for Native Advertisements | Ke Zhou, Miriam Redi, Andrew Haines, Mounia Lalmas | In this work, we explore the notion of ad quality, namely the effectiveness of advertising from a user experience perspective. |
30 | The Lifecycle and Cascade of WeChat Social Messaging Groups | Jiezhong Qiu, Yixuan Li, Jie Tang, Zheng Lu, Hao Ye, Bo Chen, Qiang Yang, John E. Hopcroft | In this paper, we analyze the daily usage logs from the WeChat group messaging platform (the largest standalone messaging communication service in China) with the goal of understanding the processes by which social messaging groups come together, grow new members, and evolve over time. |
31 | Characterizing Long-tail SEO Spam on Cloud Web Hosting Services | Xiaojing Liao, Chang Liu, Damon McCoy, Elaine Shi, Shuang Hao, Raheem Beyah | In this paper, we take the first step toward understanding how long-tail SEO spam is implemented on cloud hosting platforms. |
32 | Automatic Extraction of Indicators of Compromise for Web Applications | Onur Catakoglu, Marco Balduzzi, Davide Balzarotti | In this paper we propose for the first time an automated technique to extract and validate IOCs for web applications, by analyzing the information collected by a high-interaction honeypot. |
33 | Cracking Classifiers for Evasion: A Case Study on the Google’s Phishing Pages Filter | Bin Liang, Miaoqiang Su, Wei You, Wenchang Shi, Gang Yang | In this paper, we use Google’s phishing pages filter (GPPF), a classifier deployed in the Chrome browser, which has over one billion users, as a case study to investigate the security challenges for client-side classifiers. |
34 | Understanding the Detection of View Fraud in Video Content Portals | Miriam Marciel, Rubén Cuevas, Albert Banchs, Roberto González, Stefano Traverso, Mohamed Ahmed, Arturo Azcorra | In this paper we present a first set of tools to serve this purpose. |
35 | On the Temporal Dynamics of Opinion Spamming: Case Studies on Yelp | Santosh KC, Arjun Mukherjee | We analyze these questions in the light of time-series analysis on Yelp. |
36 | Scaling up Dynamic Topic Models | Arnab Bhadury, Jianfei Chen, Jun Zhu, Shixia Liu | This paper fills this research void, and presents a fast and parallelizable inference algorithm using Gibbs Sampling with Stochastic Gradient Langevin Dynamics that does not make any unwarranted assumptions. |
37 | Learning Global Term Weights for Content-based Recommender Systems | Yupeng Gu, Bo Zhao, David Hardtke, Yizhou Sun | In this paper, we focus on the latter, i.e., optimizing the global term weights, for a particular recommendation domain by leveraging supervised approaches. |
38 | Exploring Patterns of Identity Usage in Tweets: A New Problem, Solution and Case Study | Kenneth Joseph, Wei Wei, Kathleen M. Carley | The present work makes two contributions to the study of identity, in particular the study of identity in text. |
39 | The Death and Life of Great Italian Cities: A Mobile Phone Data Perspective | Marco De Nadai, Jacopo Staiano, Roberto Larcher, Nicu Sebe, Daniele Quercia, Bruno Lepri | In this paper, we identify a valuable alternative to the lengthy and costly collection of activity survey data: mobile phone data. |
40 | Beyond the Baseline: Establishing the Value in Mobile Phone Based Poverty Estimates | Chris Smith-Clarke, Licia Capra | We present extensive analysis of the performance of all these models on data acquired for two developing countries — Senegal and Ivory Coast. |
41 | Pushing the Frontier: Exploring the African Web Ecosystem | Rodérick Fanou, Gareth Tyson, Pierre Francois, Arjuna Sathiaseelan | Whereas others have explored web infrastructure in developed regions, we shed light on practices in developing regions. |
42 | Latent Space Model for Multi-Modal Social Data | Yoon-Sik Cho, Greg Ver Steeg, Emilio Ferrara, Aram Galstyan | To this purpose, here we propose the Constrained Latent Space Model (CLSM), a generalized framework that combines Mixed Membership Stochastic Blockmodels (MMSB) and Latent Dirichlet Allocation (LDA) incorporating a constraint that forces the latent space to concurrently describe the multiple data modalities. |
43 | Modeling a Retweet Network via an Adaptive Bayesian Approach | Bin Bi, Junghoo Cho | In this paper, we propose two novel Bayesian nonparametric models, URM and UCM, on retweet data. |
44 | On Sampling Nodes in a Network | Flavio Chierichetti, Anirban Dasgupta, Ravi Kumar, Silvio Lattanzi, Tamás Sarlós | In this paper we consider the problem of sampling nodes from a large graph according to a prescribed distribution by using random walk as the basic primitive. |
45 | Distributed Estimation of Graph 4-Profiles | Ethan R. Elenberg, Karthikeyan Shanmugam, Michael Borokhovich, Alexandros G. Dimakis | We present a novel distributed algorithm for counting all four-node induced subgraphs in a big graph. |
46 | Detecting Good Abandonment in Mobile Search | Kyle Williams, Julia Kiseleva, Aidan C. Crook, Imed Zitouni, Ahmed Hassan Awadallah, Madian Khabsa | This paper proposes a solution to this problem using gesture interactions, such as reading times and touch actions, as signals for differentiating between good and bad abandonment. |
47 | Ups and Downs: Modeling the Visual Evolution of Fashion Trends with One-Class Collaborative Filtering | Ruining He, Julian McAuley | In this paper we build novel models for the One-Class Collaborative Filtering setting, where our goal is to estimate users’ fashion-aware personalized ranking functions based on their past feedback. |
48 | Modeling User Consumption Sequences | Austin R. Benson, Ravi Kumar, Andrew Tomkins | We study sequences of consumption in which the same item may be consumed multiple times. |
49 | A Neural Click Model for Web Search | Alexey Borisov, Ilya Markov, Maarten de Rijke, Pavel Serdyukov | We propose an alternative based on the idea of distributed representations: to represent the user’s information need and the information available to the user with a vector state. |
50 | Query-Less: Predicting Task Repetition for NextGen Proactive Search and Recommendation Engines | Yang Song, Qi Guo | In this work, we aim at discovering and characterizing these types of tasks so that we can automatically predict when and what types of tasks will be repeated by the users in the future, through analyzing search logs from a commercial Web search engine and user interaction logs from a mobile App that offers proactive recommendations. We first introduce a set of novel features that can accurately capture task repetition. |
51 | Behavior Driven Topic Transition for Search Task Identification | Liangda Li, Hongbo Deng, Yunlong He, Anlei Dong, Yi Chang, Hongyuan Zha | In this paper, we propose an unsupervised approach to identify search tasks via topic membership along with topic transition probabilities, so that it becomes possible to interpret how a user’s search intent emerges and evolves over time. |
52 | A Piggyback System for Joint Entity Mention Detection and Linking in Web Queries | Marco Cornolti, Paolo Ferragina, Massimiliano Ciaramita, Stefan Rüd, Hinrich Schütze | In this paper we study the problem of linking open-domain web-search queries towards entities drawn from the full entity inventory of Wikipedia articles. |
53 | Towards Mobile Query Auto-Completion: An Efficient Mobile Application-Aware Approach | Aston Zhang, Amit Goyal, Ricardo Baeza-Yates, Yi Chang, Jiawei Han, Carl A. Gunter, Hongbo Deng | We propose AppAware, a novel QAC model using installed app and recently opened app signals to suggest queries for matching input prefixes on mobile devices. |
54 | Disinformation on the Web: Impact, Characteristics, and Detection of Wikipedia Hoaxes | Srijan Kumar, Robert West, Jure Leskovec | In this paper we study false information on Wikipedia by focusing on the hoax articles that have been created throughout its history. |
55 | Tweet Properly: Analyzing Deleted Tweets to Understand and Identify Regrettable Ones | Lu Zhou, Wenbo Wang, Keke Chen | In this paper, we study how to identify the regrettable tweets published by normal individual users via the contents and users’ historical deletion patterns. |
56 | Winning Arguments: Interaction Dynamics and Persuasion Strategies in Good-faith Online Discussions | Chenhao Tan, Vlad Niculae, Cristian Danescu-Niculescu-Mizil, Lillian Lee | In this work, we study these interactions to understand the mechanisms behind persuasion. |
57 | Addressing Complex and Subjective Product-Related Queries with Customer Reviews | Julian McAuley, Alex Yang | In this paper we hope to fuse these two paradigms: given a large volume of previously answered queries about products, we hope to automatically learn whether a review of a product is relevant to a given query. |
58 | A Robust Framework for Estimating Linguistic Alignment in Twitter Conversations | Gabriel Doyle, Dan Yurovsky, Michael C. Frank | In this paper, we lay out a set of desiderata for a linguistic alignment measure, including robustness to sparse and short messages, explicit conditionality, and consistency across linguistic features with different baseline frequencies. |
59 | Detecting Evolution of Concepts based on Cause-Effect Relationships in Online Reviews | Yating Zhang, Adam Jatowt, Katsumi Tanaka | In this paper, we propose a novel approach for investigating the technology evolution based on collections of product reviews. |
60 | The QWERTY Effect on the Web: How Typing Shapes the Meaning of Words in Online Human-Computer Interaction | David Garcia, Markus Strohmaier | In this paper, we perform large scale investigations of the QWERTY effect on the web. |
61 | Do Cascades Recur? | Justin Cheng, Lada A. Adamic, Jon M. Kleinberg, Jure Leskovec | In this paper, we perform a large-scale analysis of cascades on Facebook over significantly longer time scales, and find that a more complex picture emerges, in which many large cascades recur, exhibiting multiple bursts of popularity with periods of quiescence in between. |
62 | Exploring Limits to Prediction in Complex Social Systems | Travis Martin, Jake M. Hofman, Amit Sharma, Ashton Anderson, Duncan J. Watts | In this paper we attempt to clarify the question by presenting a simple stylized model of success that attributes prediction error to one of two generic sources: insufficiency of available data and/or models on the one hand; and inherent unpredictability of complex social systems on the other. |
63 | TribeFlow: Mining & Predicting User Trajectories | Flavio Figueiredo, Bruno Ribeiro, Jussara M. Almeida, Christos Faloutsos | Mindful of these challenges we propose TribeFlow, a method designed to cope with the complex challenges of learning personalized predictive models of non-stationary, transient, and time-heterogeneous user trajectories. |
64 | Linking Users Across Domains with Location Data: Theory and Validation | Christopher Riederer, Yunsung Kim, Augustin Chaintreau, Nitish Korula, Silvio Lattanzi | In this paper, we address the reconciliation problem for location-based datasets and introduce a robust method for this general setting. |
65 | Exploiting Dining Preference for Restaurant Recommendation | Fuzheng Zhang, Nicholas Jing Yuan, Kai Zheng, Defu Lian, Xing Xie, Yong Rui | In this paper, based on users’ implicit dining feedback (restaurant visits via check-ins), explicit feedback (restaurant reviews), and some metadata (e.g., location, user demographics, restaurant attributes), we aim to recommend to each user a list of restaurants for their next meal. |
66 | Non-Linear Mining of Competing Local Activities | Yasuko Matsubara, Yasushi Sakurai, Christos Faloutsos | We present COMPCUBE, a unifying non-linear model, which provides a compact and powerful representation of co-evolving activities; and also a novel fitting algorithm, COMPCUBE-FIT, which is parameter-free and scalable. |
67 | PCT: Partial Co-Alignment of Social Networks | Jiawei Zhang, Philip S. Yu | In this paper, we aim at inferring such potential corresponding connections linking multiple kinds of shared entities across networks simultaneously. |
68 | Improving Post-Click User Engagement on Native Ads via Survival Analysis | Nicola Barbieri, Fabrizio Silvestri, Mounia Lalmas | In this paper we focus on estimating the post-click engagement on native ads by predicting the dwell time on the corresponding ad landing pages. |
69 | Table Cell Search for Question Answering | Huan Sun, Hao Ma, Xiaodong He, Wen-tau Yih, Yu Su, Xifeng Yan | This work proposes a novel table cell search framework to attack this problem. |
70 | Identifying Web Queries with Question Intent | Gilad Tsur, Yuval Pinter, Idan Szpektor, David Carmel | We present a supervised classification scheme, random forest over word-clusters for variable length texts, which can model the query structure. |
71 | A Study of Retrieval Models for Long Documents and Queries in Information Retrieval | Ronan Cummins | In this paper, we formally analyse two important but distinct reasons for normalising documents with respect to length, namely verbosity and scope, and discuss the practical implications of not normalising accordingly. |
72 | Effective Construction of Relative Lempel-Ziv Dictionaries | Kewen Liao, Matthias Petri, Alistair Moffat, Anthony Wirth | In this work, we develop new dictionary design heuristics, based on effective construction, rather than on pruning; we identify dictionary construction as a (string) covering problem. |
73 | Just in Time: Controlling Temporal Performance in Crowdsourcing Competitions | Markus Rokicki, Sergej Zerr, Stefan Siersdorfer | In this paper, we investigate how incentive mechanisms in competition based crowdsourcing can be employed in such scenarios. |
74 | Averaging Gone Wrong: Using Time-Aware Analyses to Better Understand Behavior | Samuel Barbosa, Dan Cosley, Amit Sharma, Roberto M. Cesar | Using Reddit as an example, we study the evolution of users based on comment and submission data from 2007 to 2014. |
75 | Using Hierarchical Skills for Optimized Task Assignment in Knowledge-Intensive Crowdsourcing | Panagiotis Mavridis, David Gross-Amblard, Zoltán Miklós | In this paper we propose to finely model tasks and participants using a skill tree, that is a taxonomy of skills equipped with a similarity distance within skills. |
76 | Scheduling Human Intelligence Tasks in Multi-Tenant Crowd-Powered Systems | Djellel Eddine Difallah, Gianluca Demartini, Philippe Cudré-Mauroux | In this paper, we propose a new crowdsourcing system architecture that leverages scheduling algorithms to optimize task execution in a shared resources environment, in this case a crowdsourcing platform. |
77 | MapWatch: Detecting and Monitoring International Border Personalization on Online Maps | Gary Soeller, Karrie Karahalios, Christian Sandvig, Christo Wilson | In this paper, we present the architecture of MapWatch, and analyze the instances of border personalization on Google and Bing, including one border change that MapWatch identified live, as Google was rolling out the update. |
78 | What Links Alice and Bob?: Matching and Ranking Semantic Patterns in Heterogeneous Networks | Jiongqian Liang, Deepak Ajwani, Patrick K. Nicholson, Alessandra Sala, Srinivasan Parthasarathy | Building on such efforts, in this work we articulate a novel approach for mining relationships across entities in such networks while accounting for user preference (prioritization) over relationship type and interestingness metric. |
79 | An Empirical Study of Web Cookies | Aaron Cahn, Scott Alfeld, Paul Barford, S. Muthukrishnan | In this paper, we present an empirical study of web cookie characteristics, placement practices and information transmission. |
80 | From Social Machines to Social Protocols: Software Engineering Foundations for Sociotechnical Systems | Amit K. Chopra, Munindar P. Singh | The contribution of this paper is a new paradigm for STSs, evaluated via conceptual analysis. |
81 | As Time Goes By: Comprehensive Tagging of Textual Phrases with Temporal Scopes | Erdal Kuzey, Vinay Setty, Jannik Strötgen, Gerhard Weikum | We present methods for this kind of temponym resolution, using an entity- and TempEx-oriented document model and the Yago knowledge base for distant supervision. |
82 | Probabilistic Bag-Of-Hyperlinks Model for Entity Linking | Octavian-Eugen Ganea, Marina Ganea, Aurelien Lucchi, Carsten Eickhoff, Thomas Hofmann | We here propose a probabilistic approach that makes use of an effective graphical model to perform collective entity disambiguation. |
83 | Discovering Structure in the Universe of Attribute Names | Alon Halevy, Natalya Noy, Sunita Sarawagi, Steven Euijong Whang, Xiao Yu | This paper introduces the problem of organizing the attributes by expressing the compositional structure of their names as a rule-based grammar. It describes an unsupervised learning method to generate such a grammar automatically from a large set of attribute names. |
84 | Modeling User Exposure in Recommendation | Dawen Liang, Laurent Charlin, James McInerney, David M. Blei | In this paper, we propose a new probabilistic approach that directly incorporates user exposure to items into collaborative filtering. |
85 | On the Relevance of Irrelevant Alternatives | Austin R. Benson, Ravi Kumar, Andrew Tomkins | We present the first such algorithm, which runs in quadratic time under an oracle model, and we pair it with a matching lower bound. |
86 | Growing Wikipedia Across Languages via Recommendation | Ellery Wulczyn, Robert West, Leila Zia, Jure Leskovec | In this paper, we present an approach to filling gaps in article coverage across different Wikipedia editions. |
87 | Using Shortlists to Support Decision Making and Improve Recommender System Performance | Tobias Schnabel, Paul N. Bennett, Susan T. Dumais, Thorsten Joachims | In this paper, we study shortlists as an interface component for recommender systems with the dual goal of supporting the user’s decision process, as well as improving implicit feedback elicitation for increased recommendation quality. |
88 | Tell Me About Yourself: The Malicious CAPTCHA Attack | Nethanel Gelernter, Amir Herzberg | We present the malicious CAPTCHA attack, allowing a rogue website to trick users into unknowingly disclosing their private information. |
89 | Remedying Web Hijacking: Notification Effectiveness and Webmaster Comprehension | Frank Li, Grant Ho, Eric Kuan, Yuan Niu, Lucas Ballard, Kurt Thomas, Elie Bursztein, Vern Paxson | In this work we present the first large-scale measurement study on the effectiveness of combinations of browser, search, and direct webmaster notifications at reducing the duration a site remains compromised. |
90 | No Honor Among Thieves: A Large-Scale Analysis of Malicious Web Shells | Oleksii Starov, Johannes Dahse, Syed Sharique Ahmad, Thorsten Holz, Nick Nikiforakis | Despite their high prevalence in practice and heavy involvement in security breaches, web shells have never been the direct subject of any study. |
91 | Stress Testing the Booters: Understanding and Undermining the Business of DDoS Services | Mohammad Karami, Youngsam Park, Damon McCoy | In this paper, we empirically measure many facets of their technical and payment infrastructure. |
92 | N-gram over Context | Noriaki Kawamae | We develop a parallelizable inference algorithm, D-NOC, to support large data sets. |
93 | Representing Documents via Latent Keyphrase Inference | Jialu Liu, Xiang Ren, Jingbo Shang, Taylor Cassidy, Clare R. Voss, Jiawei Han | In this paper, we propose a data-driven model named Latent Keyphrase Inference (LAKI) that represents documents with a vector of closely related domain keyphrases instead of single words or existing concepts in the knowledge base. |
94 | Unsupervised, Efficient and Semantic Expertise Retrieval | Christophe Van Gysel, Maarten de Rijke, Marcel Worring | We introduce an unsupervised discriminative model for the task of retrieving experts in online document collections. |
95 | Using Metafeatures to Increase the Effectiveness of Latent Semantic Models in Web Search | Alexey Borisov, Pavel Serdyukov, Maarten de Rijke | In web search, latent semantic models have been proposed to bridge the lexical gap between queries and documents that is due to the fact that searchers and content creators often use different vocabularies and language styles to express the same concept. |
96 | Bayesian Budget Feasibility with Posted Pricing | Eric Balkanski, Jason D. Hartline | We consider the problem of budget feasible mechanism design proposed by Singer, but in a Bayesian setting. |
97 | People and Cookies: Imperfect Treatment Assignment in Online Experiments | Dominic Coey, Michael Bailey | We show that the estimated treatment effect in a cookie-level experiment converges to a weighted average of the marginal effects of treating more of a user’s cookies. |
98 | Objective Variables for Probabilistic Revenue Maximization in Second-Price Auctions with Reserve | Maja R. Rudolph, Joseph G. Ellis, David M. Blei | In this paper, we develop a probabilistic method to learn a profitable strategy to set the reserve price. |
99 | Understanding User Economic Behavior in the City Using Large-scale Geotagged and Crowdsourced Data | Yingjie Zhang, Beibei Li, Jason Hong | In this study, we focus on understanding users’ economic behavior in the city by examining the economic value from crowdsourced and geotagged data. |
100 | IncApprox: A Data Analytics System for Incremental Approximate Computing | Dhanya R. Krishnan, Do Le Quoc, Pramod Bhatotia, Christof Fetzer, Rodrigo Rodrigues | In this paper, we observe that these two paradigms are complementary, and can be married together! |
101 | From Diversity-based Prediction to Better Ontology & Schema Matching | Avigdor Gal, Haggai Roitman, Tomer Sagi | We propose MCD (Match Competitor Deviation), a new diversity-based predictor that compares the strength of a matcher confidence in the correspondence of a concept pair with respect to other correspondences that involve either concept. |
102 | The Effect of Recommendations on Network Structure | Jessica Su, Aneesh Sharma, Sharad Goel | We investigate this issue by empirically and theoretically analyzing abrupt changes in Twitter’s network structure around the mid-2010 introduction of its "Who to Follow" feature. |
103 | Gender, Productivity, and Prestige in Computer Science Faculty Hiring Networks | Samuel F. Way, Daniel B. Larremore, Aaron Clauset | Using comprehensive data on both hiring outcomes and scholarly productivity for 2659 tenure-track faculty across 205 Ph.D.-granting departments in North America, we investigate the multi-dimensional nature of gender inequality in computer science faculty hiring through a network model of the hiring process. |
104 | Which to View: Personalized Prioritization for Broadcast Emails | Beidou Wang, Martin Ester, Jiajun Bu, Yu Zhu, Ziyu Guan, Deng Cai | In this paper, we propose the first framework for broadcast email prioritization by designing a novel active learning model that considers the collaborative filtering, implicit feedback and time sensitive responsiveness features of broadcast emails. |
105 | Learning-to-Rank for Real-Time High-Precision Hashtag Recommendation for Streaming News | Bichen Shi, Georgiana Ifrim, Neil Hurley | We present the data collection and processing pipeline, as well as our methodology for achieving low latency, high precision recommendations. |
106 | Discovery of Topical Authorities in Instagram | Aditya Pal, Amaç Herdagdelen, Sourav Chatterji, Sumit Taank, Deepayan Chakrabarti | In this paper, we present a novel approach that we call the Authority Learning Framework (ALF) to find topical authorities in Instagram. |
107 | Did You Say U2 or YouTube?: Inferring Implicit Transcripts from Voice Search Logs | Milad Shokouhi, Umut Ozertem, Nick Craswell | This paper considers an alternative source of training data for speech recognition, called implicit transcription. |
108 | Where Can I Buy a Boulder?: Searching for Offline Retail Locations | Sandro Bauer, Filip Radlinski, Ryen W. White | In this paper, we investigate "where can I buy"-style queries related to in-person purchases of products and services. |
109 | Exploiting Green Energy to Reduce the Operational Costs of Multi-Center Web Search Engines | Roi Blanco, Matteo Catena, Nicola Tonellotto | In this paper, we tackle the problem of targeting the usage of green energy to minimize the expenditure of running multi-center Web search engines, i.e., systems composed of multiple, geographically remote computing facilities. |
110 | Strengthening Weak Identities Through Inter-Domain Trust Transfer | Giridhari Venkatadri, Oana Goga, Changtao Zhong, Bimal Viswanath, Krishna P. Gummadi, Nishanth Sastry | In this paper, we investigate the feasibility of leveraging information about identities that is aggregated across multiple domains to reason about their trustworthiness. |
111 | Entity Disambiguation with Linkless Knowledge Bases | Yang Li, Shulong Tan, Huan Sun, Jiawei Han, Dan Roth, Xifeng Yan | In this work, we propose the challenging Named Entity Disambiguation with Linkless Knowledge Bases (LNED) problem and tackle it by leveraging the useful disambiguation evidence scattered across the reference knowledge base. |
112 | Joint Recognition and Linking of Fine-Grained Locations from Tweets | Zongcheng Ji, Aixin Sun, Gao Cong, Jialong Han | We formulate this end-to-end location linking problem as a structured prediction problem and propose a beam-search based algorithm. |
113 | Internet Collaboration on Extremely Difficult Problems: Research versus Olympiad Questions on the Polymath Site | Isabel Mette Kloumann, Chenhao Tan, Jon Kleinberg, Lillian Lee | We find interesting differences between the two domains through each of these analyses, and present these analyses as a template to facilitate comparison between Polymath and other domains for collaboration and communication. |
114 | The Communication Network Within the Crowd | Ming Yin, Mary L. Gray, Siddharth Suri, Jennifer Wortman Vaughan | Since its inception, crowdsourcing has been considered a black-box approach to solicit labor from a crowd of workers. |
115 | An In-depth Study of Mobile Browser Performance | Javad Nejati, Aruna Balasubramanian | Towards understanding mobile Web page load times, in this paper we: (1) perform an in-depth pairwise comparison of loading a page on a mobile versus a non-mobile browser, and (2) characterize the bottlenecks in the mobile browser vis-a-vis non-mobile browsers. |
116 | The Case for Robotic Wireless Networks | Mahanth Gowda, Ashutosh Dhekne, Romit Roy Choudhury | This paper explores the possibility of injecting mobility into wireless network infrastructure. |
117 | GoCAD: GPU-Assisted Online Content-Adaptive Display Power Saving for Mobile Devices in Internet Streaming | Yao Liu, Mengbai Xiao, Ming Zhang, Xin Li, Mian Dong, Zhan Ma, Zhenhua Li, Songqing Chen | To address these challenges, in this paper, we design and implement GoCAD, a GPU-assisted Online Content-Adaptive Display power saving scheme for mobile devices in Internet streaming sessions. |
118 | An Empirical Analysis of Algorithmic Pricing on Amazon Marketplace | Le Chen, Alan Mislove, Christo Wilson | In this study, we develop a methodology for detecting algorithmic pricing, and use it empirically to analyze its prevalence and behavior on Amazon Marketplace. |
119 | Voting with Their Feet: Inferring User Preferences from App Management Activities | Huoran Li, Wei Ai, Xuanzhe Liu, Jian Tang, Gang Huang, Feng Feng, Qiaozhu Mei | We present a surprising finding that the metrics commonly used by app stores to rank apps do not truly reflect the users’ real attitudes towards the apps. |
120 | User Fatigue in Online News Recommendation | Hao Ma, Xueqing Liu, Zhihong Shen | In this paper, we present a comprehensive study of user fatigue in online recommender systems. |
121 | Mining User Intentions from Medical Queries: A Neural Network Based Heterogeneous Jointly Modeling Approach | Chenwei Zhang, Wei Fan, Nan Du, Philip S. Yu | In this paper, we present a neural network based jointly modeling approach to model and capture user intentions in medical related text queries. |
122 | Who Benefits from the "Sharing" Economy of Airbnb? | Giovanni Quattrone, Davide Proserpio, Daniele Quercia, Licia Capra, Mirco Musolesi | Here we propose to gather evidence from the Web. |
123 | Socialized Language Model Smoothing via Bi-directional Influence Propagation on Social Networks | Rui Yan, Cheng-Te Li, Hsun-Ping Hsieh, Po Hu, Xiaohua Hu, Tingting He | In this paper we propose to tackle the Achilles Heel of social networks by smoothing the language model via influence propagation. |
124 | Collaborative Nowcasting for Contextual Recommendation | Yu Sun, Nicholas Jing Yuan, Xing Xie, Kieran McDonald, Rui Zhang | Inspired by the nowcasting practice in meteorology and macroeconomics, we propose an innovative collaborative nowcasting model to effectively resolve these challenges. |
125 | From Freebase to Wikidata: The Great Migration | Thomas Pellissier Tanon, Denny Vrandečić, Sebastian Schaffert, Thomas Steiner, Lydia Pintscher | In this paper, we report on the ongoing transfer efforts and data mapping challenges, and provide an analysis of the effort so far. |
126 | Automatic Discovery of Attribute Synonyms Using Query Logs and Table Corpora | Yeye He, Kaushik Chakrabarti, Tao Cheng, Tomasz Tylenda | To address that problem, we propose to automatically discover all the alternate ways of referring to the attributes of a given class of entities (referred to as attribute synonyms) in order to improve search quality. |