Paper Digest: WWW 2018 Highlights
The Web Conference (WWW) is one of the top internet conferences in the world.
To help the community quickly catch up on the work presented at this conference, the Paper Digest Team processed all accepted papers and generated one highlight sentence (typically the main topic) for each paper. Readers are encouraged to read these machine-generated highlights / summaries to quickly get the main idea of each paper.
If you do not want to miss any interesting academic paper, you are welcome to sign up for our free daily paper digest service to get updates on new papers published in your area every day. You are also welcome to follow us on Twitter and LinkedIn to get updates on new conference digests.
Paper Digest Team
team@paperdigest.org
TABLE 1: WWW 2018 Papers
No. | Title | Authors | Highlight |
---|---|---|---|
1 | Creating Crowdsourced Research Talks at Scale | Rajan Vaish, Shirish Goyal, Amin Saberi, Sharad Goel | To address this gap, we propose, deploy, and evaluate a scalable, end-to-end system for crowdsourcing the creation of short, 5-minute research videos based on academic papers. |
2 | Attack under Disguise: An Intelligent Data Poisoning Attack Mechanism in Crowdsourcing | Chenglin Miao, Qi Li, Lu Su, Mengdi Huai, Wenjun Jiang, Jing Gao | In this paper, we study the data poisoning attacks against such crowdsourcing systems with the Dawid-Skene model empowered. |
3 | Leveraging Crowdsourcing Data for Deep Active Learning An Application: Learning Intents in Alexa | Jie Yang, Thomas Drake, Andreas Damianou, Yoelle Maarek | This paper presents a generic Bayesian framework that enables any deep learning model to actively learn from targeted crowds. |
4 | Web-Based VR Experiments Powered by the Crowd | Xiao Ma, Megan Cackett, Leslie Park, Eric Chien, Mor Naaman | We build on the increasing availability of Virtual Reality (VR) devices and Web technologies to conduct behavioral experiments in VR using crowdsourcing techniques. |
5 | CHIMP: Crowdsourcing Human Inputs for Mobile Phones | Mario Almeida, Muhammad Bilal, Alessandro Finamore, Ilias Leontiadis, Yan Grunenberger, Matteo Varvello, Jeremy Blackburn | To address these limitations we present CHIMP, a system that integrates automated tools and large-scale crowdsourced inputs. |
6 | Crowd-based Multi-Predicate Screening of Papers in Literature Reviews | Evgeny Krivosheev, Fabio Casati, Boualem Benatallah | In this paper we derive and analyze a set of strategies for crowd-based screening, and show that an adaptive strategy, that continuously re-assesses the statistical properties of the problem to minimize the number of votes needed to take decisions for each paper, significantly outperforms a number of non-adaptive approaches in terms of cost and accuracy. |
7 | The Size Conundrum: Why Online Knowledge Markets Can Fail at Scale | Himel Dev, Chase Geigle, Qingtao Hu, Jiahui Zheng, Hari Sundaram | In this paper, we interpret the community question answering websites on the StackExchange platform as knowledge markets, and analyze how and why these markets can fail at scale. |
8 | A Fast Deep Learning Model for Textual Relevance in Biomedical Information Retrieval | Sunil Mohan, Nicolas Fiorini, Sun Kim, Zhiyong Lu | Towards addressing the problem of relevance in biomedical literature search, we introduce a deep learning model for the relevance of a document’s text to a keyword style query. |
9 | Multi-Task Learning Improves Disease Models from Web Search | Bin Zou, Vasileios Lampos, Ingemar Cox | We explore both linear and nonlinear models, specifically a multi-task expansion of elastic net and a multi-task Gaussian Process, and compare them to their respective single task formulations. |
10 | Multi-instance Domain Adaptation for Vaccine Adverse Event Detection | Junxiang Wang, Liang Zhao | In this paper, we propose a novel generic framework named Multi-instance Domain Adaptation (MIDA) to maximize the synergy between these two domains in the vaccine adverse event detection task for social media users. |
11 | Modeling Individual Cyclic Variation in Human Behavior | Emma Pierson, Tim Althoff, Jure Leskovec | Here, we present Cyclic Hidden Markov Models (CyHMMs) for detecting and modeling cycles in a collection of multidimensional heterogeneous time series data. |
12 | Multi-Task Pharmacovigilance Mining from Social Media Posts | Shaika Chowdhury, Chenwei Zhang, Philip S. Yu | Aiming to effectively monitor various aspects of Adverse Drug Reactions (ADRs) from diversely expressed social medical posts, we propose a multi-task neural network framework that learns several tasks associated with ADR monitoring with different levels of supervisions collectively. |
13 | Did You Really Just Have a Heart Attack?: Towards Robust Detection of Personal Health Mentions in Social Media | Payam Karisani, Eugene Agichtein | To address this problem, we propose a general, robust method for detecting PHMs in social media, which we call WESPAD, that combines lexical, syntactic, word embedding-based, and context-based features. |
14 | Detecting Absurd Conversations from Intelligent Assistant Logs by Exploiting User Feedback Utterances | Chikara Hashimoto, Manabu Sassano | To facilitate improvement of their conversational ability, we have developed a method that detects absurd conversations recorded in intelligent assistant logs by identifying user feedback utterances that indicate users’ favorable and unfavorable evaluations of intelligent assistant responses; e.g., "great!" |
15 | ROSC: Robust Spectral Clustering on Multi-scale Data | Xiang Li, Ben Kao, Siqiang Luo, Martin Ester | We review existing spectral methods that are designed to handle multi-scale data and propose an alternative approach that is orthogonal to existing methods. |
16 | DRN: A Deep Reinforcement Learning Framework for News Recommendation | Guanjie Zheng, Fuzheng Zhang, Zihan Zheng, Yang Xiang, Nicholas Jing Yuan, Xing Xie, Zhenhui Li | In this paper, we propose a novel Deep Reinforcement Learning framework for news recommendation. |
17 | Sharing Deep Neural Network Models with Interpretation | Huijun Wu, Chen Wang, Jie Yin, Kai Lu, Liming Zhu | In this paper, we propose a method to disclose a small set of training data that is just sufficient for users to get the insight into a complicated model. |
18 | Unsupervised Anomaly Detection via Variational Auto-Encoder for Seasonal KPIs in Web Applications | Haowen Xu, Wenxiao Chen, Nengwen Zhao, Zeyan Li, Jiahao Bu, Zhihan Li, Ying Liu, Youjian Zhao, Dan Pei, Yang Feng, Jie Chen, Zhaogang Wang, Honglin Qiao | In this paper, we proposed Donut, an unsupervised anomaly detection algorithm based on VAE. |
19 | ProxyTorrent: Untangling the Free HTTP(S) Proxy Ecosystem | Diego Perino, Matteo Varvello, Claudio Soriente | In this paper we shed light on this ecosystem via ProxyTorrent, a distributed measurement platform that leverages both active and passive measurements. |
20 | An Automated Approach to Auditing Disclosure of Third-Party Data Collection in Website Privacy Policies | Timothy Libert | To examine the efficacy of this approach, this study presents the first large-scale audit of disclosure of third-party data collection in website privacy policies. |
21 | Your Secrets Are Safe: How Browsers’ Explanations Impact Misconceptions About Private Browsing Mode | Yuxi Wu, Panya Gupta, Miranda Wei, Yasemin Acar, Sascha Fahl, Blase Ur | In this paper, we focus on browsers’ disclosures, or their in-browser explanations of private browsing mode. |
22 | Betrayed by Your Dashboard: Discovering Malicious Campaigns via Web Analytics | Oleksii Starov, Yuchen Zhou, Xiao Zhang, Najmeh Miramirkhani, Nick Nikiforakis | In this paper, we analyze the analytics identifiers utilized by eighteen different third-party analytics platforms and show that these identifiers enable the clustering of seemingly unrelated websites as part of a common third-party analytics account (i.e. websites whose analytics are managed by a single person or team). |
23 | Large-Scale Analysis of Style Injection by Relative Path Overwrite | Sajjad Arshad, Seyed Ali Mirheidari, Tobias Lauinger, Bruno Crispo, Engin Kirda, William Robertson | In this paper, we present the first large-scale study of the Web to measure the prevalence and significance of style injection using RPO. |
24 | Uncovering HTTP Header Inconsistencies and the Impact on Desktop/Mobile Websites | Abner Mendoza, Phakpoom Chinprutthiwong, Guofei Gu | In this work, we have conducted the first systematic measurement study of inconsistencies between mobile and desktop HTTP security response configuration in the top 70,000 websites. |
25 | Panning for gold.com: Understanding the Dynamics of Domain Dropcatching | Najmeh Miramirkhani, Timothy Barron, Michael Ferdman, Nick Nikiforakis | In this paper, we investigate the dynamics of domain dropcatching where companies, on behalf of users, compete to register the most desirable domains as soon as they are made available and then auction them off to the highest bidder. |
26 | Incognito: A Method for Obfuscating Web Data | Rahat Masood, Dinusha Vatsalan, Muhammad Ikram, Mohamed Ali Kaafar | To this end, we propose a privacy-aware obfuscation method for Web data addressing these identified drawbacks of existing methods. |
27 | Platform Criminalism: The ‘Last-Mile’ Geography of the Darknet Market Supply Chain | Martin Dittus, Joss Wright, Mark Graham | We present strong evidence that cannabis and cocaine vendors are primarily located in a small number of consumer countries, rather than producer countries, suggesting that darknet trading happens at the ‘last mile’, possibly leaving old trafficking routes intact. |
28 | Tagvisor: A Privacy Advisor for Sharing Hashtags | Yang Zhang, Mathias Humbert, Tahleen Rahman, Cheng-Te Li, Jun Pang, Michael Backes | In this paper, we present the first systematic analysis of privacy issues induced by hashtags. |
29 | AdBudgetKiller: Online Advertising Budget Draining Attack | I Luk Kim, Weihang Wang, Yonghwi Kwon, Yunhui Zheng, Yousra Aafer, Weijie Meng, Xiangyu Zhang | In this paper, we present a new ad budget draining attack. |
30 | Hiding in the Crowd: an Analysis of the Effectiveness of Browser Fingerprinting at Large Scale | Alejandro Gómez-Boix, Pierre Laperdrix, Benoit Baudry | We collected 2,067,942 browser fingerprints from one of the top 15 French websites. |
31 | Exposing Search and Advertisement Abuse Tactics and Infrastructure of Technical Support Scammers | Bharat Srinivasan, Athanasios Kountouras, Najmeh Miramirkhani, Monjur Alam, Nick Nikiforakis, Manos Antonakakis, Mustaque Ahamad | We use a data-driven approach to understand search-and-ad abuse by TSS to gain visibility into the online infrastructure that facilitates it. By carefully formulating tech support queries with multiple search engines, we collect data about both the support infrastructure and the websites to which TSS victims are directed when they search online for tech support resources. |
32 | Mind Your Credit: Assessing the Health of the Ripple Credit Network | Pedro Moreno-Sanchez, Navin Modi, Raghuvir Songhela, Aniket Kate, Sonia Fahmy | We find that about 13M USD are at risk in the current Ripple network due to inappropriate configuration of the rippling flag on credit links, facilitating undesired redistribution of credit across those links. |
33 | I’m Listening to your Location! Inferring User Location with Acoustic Side Channels. | Youngbae Jeon, Minchul Kim, Hyunsoo Kim, Hyoungshick Kim, Jun Ho Huh, Ji Won Yoon | Based on this reference map of ENF signals, we propose a novel side-channel attack that can identify the physical location of where a target video or sound was recorded or streamed from. |
34 | SafeKeeper: Protecting Web Passwords using Trusted Execution Environments | Klaudia Krawiecka, Arseny Kurnikov, Andrew Paverd, Mohammad Mannan, N. Asokan | We present SafeKeeper, a novel and comprehensive solution to ensure secrecy of passwords in web authentication systems. |
35 | RaRE: Social Rank Regulated Large-scale Network Embedding | Yupeng Gu, Yizhou Sun, Yanen Li, Yang Yang | Rather than simply treating these two factors independent with each other, a carefully designed link generation model is proposed, which explicitly models the interdependency between these two types of embeddings. |
36 | Minimizing Polarization and Disagreement in Social Networks | Cameron Musco, Christopher Musco, Charalampos E. Tsourakakis | In this work we initiate the study of the following question: Given n agents, each with its own initial opinion that reflects its core value on a topic, and an opinion dynamics model, what is the structure of a social network that minimizes disagreement and controversy simultaneously? |
37 | Minimizing Latency in Online Ride and Delivery Services | Abhimanyu Das, Sreenivas Gollapudi, Anthony Kim, Debmalya Panigrahi, Chaitanya Swamy | In this paper, we consider point-to-point requests that come with source-destination pairs and release-time constraints that restrict when each request can be served. |
38 | TIMES: Temporal Information Maximally Extracted from Structures | Abram N. Magner, Jithin K. Sreedharan, Ananth Y. Grama, Wojciech Szpankowski | We present a new formulation of the problem that admits probabilistic solutions for broad classes of dynamic network models. |
39 | Deep Collective Classification in Heterogeneous Information Networks | Yizhou Zhang, Yun Xiong, Xiangnan Kong, Shanshan Li, Jinhong Mi, Yangyong Zhu | In this paper, we study the problem of deep collective classification in Heterogeneous Information Networks (HINs), which involves different types of autocorrelations, from simple to complex relations, among the instances. |
40 | Fast and Accurate Random Walk with Restart on Dynamic Graphs with Guarantees | Minji Yoon, Woojeong Jin, U. Kang | In this paper, we propose OSP, a fast and accurate algorithm for computing dynamic RWR with insertion/deletion of nodes/edges in a directed/undirected graph. |
41 | SIR-Hawkes: Linking Epidemic Models and Hawkes Processes to Model Diffusions in Finite Populations | Marian-Andrei Rizoiu, Swapnil Mishra, Quyu Kong, Mark Carman, Lexing Xie | Here, we establish a novel connection between these two frameworks. |
42 | When Online Dating Meets Nash Social Welfare: Achieving Efficiency and Fairness | Yongzheng Jia, Xue Liu, Wei Xu | We verify our models and algorithms through sound theoretical analysis and empirical studies by using real data and show that our algorithms can significantly improve the ecosystems of the online dating applications. |
43 | A Correlation Clustering Framework for Community Detection | Nate Veldt, David F. Gleich, Anthony Wirth | In this paper we introduce a new community detection framework called LambdaCC that is based on a specially weighted version of correlation clustering. |
44 | Provable and Practical Approximations for the Degree Distribution using Sublinear Graph Samples | Talya Eden, Shweta Jain, Ali Pinar, Dana Ron, C. Seshadhri | For the analysis, we define two fatness measures of the degree distribution, called the h-index and the z-index. |
45 | Mining Tours and Paths in Activity Networks | Sofia Maria Nikolakaki, Charalampos Mavroforakis, Alina Ene, Evimaria Terzi | The k-PCSubgraphs problem we address in this paper is defined as follows: given an activity network and an integer k, identify k non-overlapping and connected subgraphs of the network such that the nodes of each subgraph are close to each other, and the total number of actions they are associated with is high. |
46 | Co-Regularized Deep Multi-Network Embedding | Jingchao Ni, Shiyu Chang, Xiao Liu, Wei Cheng, Haifeng Chen, Dongkuan Xu, Xiang Zhang | Thus, in this paper, we propose a novel multi-network embedding method, DMNE. |
47 | On Exploring Semantic Meanings of Links for Embedding Social Networks | Linchuan Xu, Xiaokai Wei, Jiannong Cao, Philip S. Yu | In this paper, the former type of links are referred to as structure-close links while the latter type are referred to as content-close links. |
48 | Any-k: Anytime Top-k Tree Pattern Retrieval in Labeled Graphs | Xiaofeng Yang, Deepak Ajwani, Wolfgang Gatterbauer, Patrick K. Nicholson, Mirek Riedewald, Alessandra Sala | We therefore propose the novel notion of an any-k ranking algorithm: for a given time budget, return as many of the top-ranked results as possible. |
49 | Dual Graph Convolutional Networks for Graph-Based Semi-Supervised Classification | Chenyi Zhuang, Qiang Ma | In this paper, we present a simple and scalable semi-supervised learning method for graph-structured data in which only a very small portion of the training data are labeled. |
50 | SIDE: Representation Learning in Signed Directed Networks | Junghwan Kim, Haekyu Park, Ji-Eun Lee, U Kang | In this paper, we propose SIDE, a general network embedding method that represents both sign and direction of edges in the embedding space. |
51 | Spectral Algorithms for Temporal Graph Cuts | Arlei Silva, Ambuj Singh, Ananthram Swami | In this paper, we introduce sparsest and normalized cuts in temporal graphs, which generalize their standard definitions by enforcing the smoothness of cuts over time. |
52 | Collective Classification of Spam Campaigners on Twitter: A Hierarchical Meta-Path Based Approach | Srishti Gupta, Abhinav Khattar, Arpit Gogia, Ponnurangam Kumaraguru, Tanmoy Chakraborty | In this paper, we aim to detect spammers that use phone numbers to promote campaigns on Twitter. |
53 | VERSE: Versatile Graph Embeddings from Similarity Measures | Anton Tsitsulin, Davide Mottin, Panagiotis Karras, Emmanuel Müller | In this paper, we carry the similarity orientation of previous works to its logical conclusion; we propose VERtex Similarity Embeddings (VERSE), a simple, versatile, and memory-efficient method that derives graph embeddings explicitly calibrated to preserve the distributions of a selected vertex-to-vertex similarity measure. |
54 | Demarcating Endogenous and Exogenous Opinion Diffusion Process on Social Networks | Abir De, Sourangshu Bhattacharya, Niloy Ganguly | In this paper, we design CherryPick, a novel learning machinery that classifies the opinions and users by solving a joint inference task in message and user set, from a temporal stream of sentiment messages. |
55 | Preferential Attachment as a Unique Equilibrium | Chen Avin, Avi Cohen, Pierre Fraigniaud, Zvi Lotker, David Peleg | This paper demonstrates that the Preferential Attachment rule naturally emerges in the context of evolutionary network formation, as the unique Nash equilibrium of a simple social network game. |
56 | Modeling Success and Engagement for the App Economy | Haim Mendelson, Ken Moon | We address this challenge by proposing an empirical framework for analyzing an app’s patterns of adoption and engagement. |
57 | Fully Dynamic k-Center Clustering | T-H. Hubert Chan, Arnaud Guerqin, Mauro Sozio | We develop a (2+ε)-approximation algorithm for the k-center clustering problem with "small" amortized cost under the fully dynamic adversarial model. |
58 | Listing k-cliques in Sparse Real-World Graphs | Maximilien Danisch, Oana Balalau, Mauro Sozio | Motivated by recent studies in the data mining community which require to efficiently list all k-cliques, we revisit the iconic algorithm of Chiba and Nishizeki and develop the most efficient parallel algorithm for such a problem. |
59 | Fast Exact CoSimRank Search on Evolving and Static Graphs | Weiren Yu, Fan Wang | In this study, we propose a fast dynamic scheme, DCoSim, for accurate CoSimRank search over evolving graphs. |
60 | Measuring and Improving the Core Resilience of Networks | Ricky Laishram, Ahmet Erdem Sariyüce, Tina Eliassi-Rad, Ali Pinar, Sucheta Soundarajan | To measure this, we introduce two novel node properties, Core Strength and Core Influence, which measure the resilience of individual nodes’ core numbers and their influence on other nodes’ core numbers. |
61 | Low Rank Spectral Network Alignment | Huda Nassar, Nate Veldt, Shahin Mohammadi, Ananth Grama, David F. Gleich | For this task, we show a new, a-posteriori, approximation bound for a simple algorithm to approximate a maximum weight bipartite matching problem on a low-rank matrix. |
62 | Mapping the Invocation Structure of Online Political Interaction | Manish Raghavan, Ashton Anderson, Jon Kleinberg | In this paper, we develop network-based methods that operate on the ways in which users share content; we construct invocation graphs on Web domains showing the extent to which pages from one domain are invoked by users to reply to posts containing pages from other domains. |
63 | Aspect-Aware Latent Factor Model: Rating Prediction with Ratings and Reviews | Zhiyong Cheng, Ying Ding, Lei Zhu, Mohan Kankanhalli | In this paper, we employ textual review information with ratings to tackle these limitations. |
64 | Aesthetic-based Clothing Recommendation | Wenhui Yu, Huidi Zhang, Xiangnan He, Xu Chen, Li Xiong, Zheng Qin | To bridge this gap, we propose to introduce the aesthetic information, which is highly relevant with user preference, into clothing recommender systems. |
65 | On the Causal Effect of Badges | Tomasz Kusmierczyk, Manuel Gomez-Rodriguez | In this paper, we focus on first-time badges, which are awarded after a user takes a particular type of action for the first time, and study their causal effect by harnessing the delayed introduction of several badges in a popular Q&A website. |
66 | Robust Factorization Machines for User Response Prediction | Surabhi Punjabi, Priyanka Bhatt | In this work, we characterize the data uncertainty using Robust Optimization (RO) paradigm to design approaches that are immune against perturbations. |
67 | Bayesian Models for Product Size Recommendations | Vivek Sembium, Rajeev Rastogi, Lavanya Tekumalla, Atul Saroop | We propose a novel approach based on Bayesian logit and probit regression models with ordinal categories Small, Fit, Large to model size fits as a function of the difference between latent sizes of customers and products. |
68 | Variational Autoencoders for Collaborative Filtering | Dawen Liang, Rahul G. Krishnan, Matthew D. Hoffman, Tony Jebara | This non-linear probabilistic model enables us to go beyond the limited modeling capacity of linear factor models which still largely dominate collaborative filtering research. We introduce a generative model with multinomial likelihood and use Bayesian inference for parameter estimation. |
69 | Learning Causal Effects From Many Randomized Experiments Using Regularized Instrumental Variables | Alexander Peysakhovich, Dean Eckles | We use experimental groups as instrumental variables (IV) and show that a standard method (two-stage least squares) is biased even when the number of experiments is infinite. |
70 | Camel: Content-Aware and Meta-path Augmented Metric Learning for Author Identification | Chuxu Zhang, Chao Huang, Lu Yu, Xiangliang Zhang, Nitesh V. Chawla | In this paper, we study the problem of author identification in big scholarly data, which is to effectively rank potential authors for each anonymous paper by using historical data. |
71 | Prediction of Sparse User-Item Consumption Rates with Zero-Inflated Poisson Regression | Moshe Lichman, Padhraic Smyth | In this paper we address the problem of building user models that can predict the rate at which individuals consume items from a finite set, including items they have consumed in the past and items that are new. |
72 | Latent Relational Metric Learning via Memory-based Attention for Collaborative Ranking | Yi Tay, Luu Anh Tuan, Siu Cheung Hui | This paper proposes a new neural architecture for collaborative ranking with implicit feedback. |
73 | AdaError: An Adaptive Learning Rate Method for Matrix Approximation-based Collaborative Filtering | Dongsheng Li, Chao Chen, Qin Lv, Hansu Gu, Tun Lu, Li Shang, Ning Gu, Stephen M. Chu | This paper proposes AdaError, an adaptive learning rate method for matrix approximation-based collaborative filtering. |
74 | Anxiety and Information Seeking: Evidence From Large-Scale Mouse Tracking | Brit Youngmann, Elad Yom-Tov | Based on this observation, we develop a model which can predict the level of anxiety experienced by a user, using attributes derived from mouse tracking data and other user interactions. |
75 | Through a Gender Lens: Learning Usage Patterns of Emojis from Large-Scale Android Users | Zhenpeng Chen, Xuan Lu, Wei Ai, Huoran Li, Qiaozhu Mei, Xuanzhe Liu | We present various interesting findings that evidence a considerable difference in emoji usage by female and male users. |
76 | Coevolutionary Recommendation Model: Mutual Learning between Ratings and Reviews | Yichao Lu, Ruihai Dong, Barry Smyth | In this paper, we present a novel deep learning recommendation model, which co-learns user and item information from ratings and customer reviews, by optimizing matrix factorization and an attention-based GRU network. |
77 | How to Impute Missing Ratings?: Claims, Solution, and Its Application to Collaborative Filtering | Youngnam Lee, Sang-Wook Kim, Sunju Park, Xing Xie | In this paper, we identify the limitations of existing data imputation approaches and suggest three new claims that all data imputation approaches should follow to achieve high recommendation accuracy. |
78 | When Sheep Shop: Measuring Herding Effects in Product Ratings with Natural Experiments | Gael Lederrey, Robert West | The study of herding poses methodological challenges. |
79 | Modeling Interdependent and Periodic Real-World Action Sequences | Takeshi Kurashima, Tim Althoff, Jure Leskovec | In this work, we develop a novel statistical model, called TIPAS, for Time-varying, Interdependent, and Periodic Action Sequences. |
80 | The Effect of Ad Blocking on User Engagement with the Web | Ben Miroglio, David Zeber, Jofish Kaye, Rebecca Weiss | To approach this problem, we conduct a retrospective natural field experiment using Firefox browser usage data, with the goal of estimating the effect of adblocking on user engagement with the Web. |
81 | Me, My Echo Chamber, and I: Introspection on Social Media Polarization | Nabeel Gillani, Ann Yuan, Martin Saveski, Soroush Vosoughi, Deb Roy | In this paper, we introduce Social Mirror, a social network visualization tool that enables a sample of Twitter users to explore the politically-active parts of their social network. |
82 | Geographical Feature Extraction for Entities in Location-based Social Networks | Daizong Ding, Mi Zhang, Xudong Pan, Duocai Wu, Pearl Pu | In this paper, we propose a geographical convolutional neural tensor network (GeoCNTN) as a generic embedding model. |
83 | (Don’t) Mention the War: A Comparison of Wikipedia and Britannica Articles on National Histories | Anna Samoilenko, Florian Lemmerich, Maria Zens, Mohsen Jadidi, Mathieu Génois, Markus Strohmaier | In this paper we present a large-scale quantitative comparison between expert- and crowdsourced writing of history by analysing articles from the English Wikipedia and Britannica. |
84 | Adaptive Sensitive Reweighting to Mitigate Bias in Fairness-aware Classification | Emmanouil Krasanakis, Eleftherios Spyromitros-Xioufis, Symeon Papadopoulos, Yiannis Kompatsiaris | To circumvent shortcomings prevalent in data repairing approaches, such as those that weight training samples of the sensitive group (e.g. gender, race, financial status) based on their misclassification error, we present a process that iteratively adapts training sample weights with a theoretically grounded model. |
85 | On Ridesharing Competition and Accessibility: Evidence from Uber, Lyft, and Taxi | Shan Jiang, Le Chen, Alan Mislove, Christo Wilson | In this paper, we comprehensively compare Uber, Lyft, and taxis with respect to key market features (supply, demand, price, and wait time) in San Francisco and New York City. |
86 | VizByWiki: Mining Data Visualizations from the Web to Enrich News Articles | Allen Yilun Lin, Joshua Ford, Eytan Adar, Brent Hecht | To address this issue, we define a new problem: given a news article, retrieve relevant visualizations that already exist on the web. To facilitate further advances on our "news visualization retrieval problem", we release our ground truth dataset and make our system and its source code publicly available. |
87 | Computationally Inferred Genealogical Networks Uncover Long-Term Trends in Assortative Mating | Eric Malmi, Aristides Gionis, Arno Solin | To demonstrate the applicability of the inferred large-scale genealogical networks, we present a longitudinal analysis on the mating patterns observed in a network. |
88 | What We Read, What We Search: Media Attention and Public Attention Among 193 Countries | Haewoon Kwak, Jisun An, Joni Salminen, Soon-Gyo Jung, Bernard J. Jansen | We investigate the alignment of international attention of news media organizations within 193 countries with the expressed international interests of the public within those same countries from March 7, 2016 to April 14, 2017. |
89 | Human Perceptions of Fairness in Algorithmic Decision Making: A Case Study of Criminal Risk Prediction | Nina Grgic-Hlaca, Elissa M. Redmiles, Krishna P. Gummadi, Adrian Weller | A key contribution of this work is the framework we propose to understand why people perceive certain features as fair or unfair to be used in algorithms. |
90 | Political Discourse on Social Media: Echo Chambers, Gatekeepers, and the Price of Bipartisanship | Kiran Garimella, Gianmarco De Francisci Morales, Aristides Gionis, Michael Mathioudakis | This paper studies the phenomenon of political echo chambers on social media. |
91 | Algorithmic Glass Ceiling in Social Networks: The effects of social recommendations on network diversity | Ana-Andreea Stoica, Christopher Riederer, Augustin Chaintreau | We discuss ways to address this concern in future design. |
92 | Community Interaction and Conflict on the Web | Srijan Kumar, William L. Hamilton, Jure Leskovec, Dan Jurafsky | Altogether, this work presents a data-driven view of community interactions and conflict, and paves the way towards healthier online communities. |
93 | "You are no Jack Kennedy": On Media Selection of Highlights from Presidential Debates | Chenhao Tan, Hao Peng, Noah A. Smith | To quantitatively explore the selection process, we build a three-decade dataset of presidential debate transcripts and post-debate coverage. |
94 | Auditing the Personalization and Composition of Politically-Related Search Engine Results Pages | Ronald E. Robertson, David Lazer, Christo Wilson | In this study, we conduct a targeted algorithm audit of Google Search using a dynamic set of political queries. |
95 | To Stay or to Leave: Churn Prediction for Urban Migrants in the Initial Period | Yang Yang, Zongtao Liu, Chenhao Tan, Fei Wu, Yueting Zhuang, Yafeng Li | In this paper, we use Shanghai as an example to investigate migrants’ behavior in their first weeks and in particular, how their behavior relates to early departure. |
96 | Time Expression Recognition Using a Constituent-based Tagging Scheme | Xiaoshi Zhong, Erik Cambria | We find from four datasets that time expressions are formed by loose structure and the words used to express time information can differentiate time expressions from common text. |
97 | Parabel: Partitioned Label Trees for Extreme Classification with Application to Dynamic Search Advertising | Yashoteja Prabhu, Anil Kag, Shrutendra Harsola, Rahul Agrawal, Manik Varma | This paper develops the Parabel algorithm for extreme multi-label learning where the objective is to learn classifiers that can annotate each data point with the most relevant subset of labels from an extremely large label set. |
98 | Dynamic Embeddings for Language Evolution | Maja Rudolph, David Blei | Here, we develop dynamic embeddings, building on exponential family embeddings to capture how the meanings of words change over time. |
99 | HighLife: Higher-arity Fact Harvesting | Patrick Ernst, Amy Siu, Gerhard Weikum | In this work, we present an approach to harvest higher-arity facts from textual sources. |
100 | Content Attention Model for Aspect Based Sentiment Analysis | Qiao Liu, Haibin Zhang, Yifu Zeng, Ziqi Huang, Zufeng Wu | To solve this problem, we propose a novel content attention based aspect based sentiment classification model, with two attention enhancing mechanisms: sentence-level content attention mechanism is capable of capturing the important information about given aspects from a global perspective, while the context attention mechanism is responsible for simultaneously taking the order of the words and their correlations into account, by embedding them into a series of customized memories. |
101 | Financing the Web of Data with Delayed-Answer Auctions | Tobias Grubenmann, Abraham Bernstein, Dmitry Moor, Sven Seuken | We introduce a new model which captures the particular situation when a user accesses data in the Web of Data. |
102 | Estimating the Cardinality of Conjunctive Queries over RDF Data Using Graph Summarisation | Giorgio Stefanoni, Boris Motik, Egor V. Kostylev | We present a new, principled cardinality estimation technique based on graph summarisation. |
103 | Never-Ending Learning for Open-Domain Question Answering over Knowledge Bases | Abdalghani Abujabal, Rishiraj Saha Roy, Mohamed Yahya, Gerhard Weikum | To overcome these limitations, this paper presents NEQA, a continuous learning paradigm for KB-QA. |
104 | Large-Scale Hierarchical Text Classification with Recursively Regularized Deep Graph-CNN | Hao Peng, Jianxin Li, Yu He, Yaopeng Liu, Mengjiao Bao, Lihong Wang, Yangqiu Song, Qiang Yang | In this paper, we propose a graph-CNN based deep learning model to first convert texts to graph-of-words, and then use graph convolution operations to convolve the word graph. |
105 | Estimating Rule Quality for Knowledge Base Completion with the Relationship between Coverage Assumption | Kaja Zupanc, Jesse Davis | To address this problem, we propose a novel score function for evaluating the quality of a first-order rule learned from a KB. |
106 | Improving Word Embedding Compositionality using Lexicographic Definitions | Thijs Scheepers, Evangelos Kanoulas, Efstratios Gavves | We present an in-depth analysis of four popular word embeddings (Word2Vec, GloVe, fastText and Paragram) in terms of their semantic compositionality. |
107 | Browserless Web Data Extraction: Challenges and Opportunities | Ruslan R. Fayzrakhmanov, Emanuel Sallinger, Ben Spencer, Tim Furche, Georg Gottlob | In this paper, we demonstrate the principal feasibility of automatically translating browser-based wrappers into "browserless" wrappers. |
108 | Short-Text Topic Modeling via Non-negative Matrix Factorization Enriched with Local Word-Context Correlations | Tian Shi, Kyeongpil Kang, Jaegul Choo, Chandan K. Reddy | To tackle this problem, in this paper, we propose a semantics-assisted non-negative matrix factorization (SeaNMF) model to discover topics for the short texts. |
109 | Are All People Married?: Determining Obligatory Attributes in Knowledge Bases | Jonathan Lajus, Fabian M. Suchanek | In this paper, we propose a new way to model incompleteness in KBs. |
110 | Socioeconomic Dependencies of Linguistic Patterns in Twitter: a Multivariate Analysis | Jacob Levy Abitbol, Márton Karsai, Jean-Philippe Magué, Jean-Pierre Chevrot, Eric Fleury | We show how key linguistic variables measured in individual Twitter streams depend on factors like socioeconomic status, location, time, and the social network of individuals. |
111 | An Attention Factor Graph Model for Tweet Entity Linking | Chenwei Ran, Wei Shen, Jianyong Wang | In this work, we formalize the tweet entity linking problem into a factor graph model which has shown its effectiveness and efficiency in many other applications. |
112 | Find the Conversation Killers: A Predictive Study of Thread-ending Posts | Yunhao Jiao, Cheng Li, Fei Wu, Qiaozhu Mei | In this study, we are particularly interested in identifying a post in a multi-party conversation that is unlikely to be further replied to, which therefore kills that thread of the conversation. |
113 | Semantics and Complexity of GraphQL | Olaf Hartig, Jorge Pérez | We present experiments showing that current practical implementations suffer from this issue. |
114 | Sentiment Analysis by Capsules | Yequan Wang, Aixin Sun, Jialong Han, Ying Liu, Xiaoyan Zhu | In this paper, we propose RNN-Capsule, a capsule model based on Recurrent Neural Network (RNN) for sentiment analysis. |
115 | Modelling Dynamics in Semantic Web Knowledge Graphs with Formal Concept Analysis | Larry González, Aidan Hogan | In this paper, we propose a novel data-driven schema for large-scale heterogeneous knowledge graphs inspired by Formal Concept Analysis (FCA). |
116 | Scalable Instance Reconstruction in Knowledge Bases via Relatedness Affiliated Embedding | Richong Zhang, Junpeng Li, Jiajie Mei, Yongyi Mao | In this paper, we present a new formulation of KB completion, called instance reconstruction. |
117 | Leveraging Social Media Signals for Record Linkage | Andrew T. Schneider, Arjun Mukherjee, Eduard C. Dragut | We present a method for record linkage that uses this hitherto untapped source of entity information. |
118 | A Structured Approach to Understanding Recovery and Relapse in AA | Yue Zhang, Arti Ramesh, Jennifer Golbeck, Dhanya Sridhar, Lise Getoor | In this work, we take a structured approach to understand recovery and relapse from AUD using social media data. |
119 | Facet Annotation Using Reference Knowledge Bases | Riccardo Porrini, Matteo Palmonari, Isabel F. Cruz | In this paper, we annotate the facet property with a predicate from a reference Knowledge Base (KB) so as to maximize the semantic similarity between the property and the predicate. |
120 | MemeSequencer: Sparse Matching for Embedding Image Macros | Abhimanyu Dubey, Esteban Moro, Manuel Cebrian, Iyad Rahwan | In this study, we provide a first step in the systematic study of image evolution on the Internet, by proposing an algorithm based on sparse representations and deep learning to decouple various types of content in such images and produce a rich semantic embedding. |
121 | Matching Natural Language Sentences with Hierarchical Sentence Factorization | Bang Liu, Ting Zhang, Fred X. Han, Di Niu, Kunfeng Lai, Yu Xu | In this paper, we propose Hierarchical Sentence Factorization—a technique to factorize a sentence into a hierarchical representation, with the components at each different scale reordered into a "predicate-argument" form. |
122 | Why Reinvent the Wheel: Let’s Build Question Answering Systems Together | Kuldeep Singh, Arun Sethupat Radhakrishna, Andreas Both, Saeedeh Shekarpour, Ioanna Lytra, Ricardo Usbeck, Akhilesh Vyas, Akmal Khikmatullaev, Dharmen Punjani, Christoph Lange, Maria Esther Vidal, Jens Lehmann, Sören Auer | We study this optimisation problem and train classifiers, which take features of a question as input and have the goal of optimising the selection of QA components based on those features. |
123 | Weakly-supervised Relation Extraction by Pattern-enhanced Embedding Learning | Meng Qu, Xiang Ren, Yu Zhang, Jiawei Han | In this paper, we study the integration of distributional and pattern-based methods in a weakly-supervised setting such that the two kinds of methods can provide complementary supervision for each other to build an effective, unified model. |
124 | Finding Needles in an Encyclopedic Haystack: Detecting Classes Among Wikipedia Articles | Marius Pasca | Finding Needles in an Encyclopedic Haystack: Detecting Classes Among Wikipedia Articles |
125 | User-guided Hierarchical Attention Network for Multi-modal Social Image Popularity Prediction | Wei Zhang, Wen Wang, Jun Wang, Hongyuan Zha | To this end, we propose a model named User-guided Hierarchical Attention Network (UHAN) with two novel user-guided attention mechanisms to hierarchically attend both visual and textual modalities. |
126 | A Coherent Unsupervised Model for Toponym Resolution | Ehsan Kamalloo, Davood Rafiei | In this paper, we study the problem of toponym resolution with no additional information other than a gazetteer and no training data. |
127 | Inferring Missing Categorical Information in Noisy and Sparse Web Markup | Nicolas Tempelmeier, Elena Demidova, Stefan Dietze | In this work, we introduce a supervised approach for inferring missing categorical properties in Web markup. |
128 | Towards Annotating Relational Data on the Web with Language Models | Matteo Cannaviccio, Denilson Barbosa, Paolo Merialdo | Tables and structured lists on Web pages are a potential source of valuable information, and several methods have been proposed to annotate them with semantics that can be leveraged for search, question answering and information extraction. |
129 | CESI: Canonicalizing Open Knowledge Bases using Embeddings and Side Information | Shikhar Vashishth, Prince Jain, Partha Talukdar | In order to overcome this challenge, we propose Canonicalization using Embeddings and Side Information (CESI) — a novel approach which performs canonicalization over learned embeddings of Open KBs. |
130 | Bid-Limited Targeting | Patrick Hummel, Uri Nadav | This paper analyzes a mechanism for selling items in auctions in which the auctioneer specifies a cap on the ratio between the maximum and minimum bids that bidders may use in the different auctions. |
131 | Reinforcement Mechanism Design for e-commerce | Qingpeng Cai, Aris Filos-Ratsikas, Pingzhong Tang, Yiwei Zhang | We study the problem of allocating impressions to sellers in e-commerce websites, such as Amazon, eBay or Taobao, aiming to maximize the total revenue generated by the platform. |
132 | Field-weighted Factorization Machines for Click-Through Rate Prediction in Display Advertising | Junwei Pan, Jian Xu, Alfonso Lobos Ruiz, Wenliang Zhao, Shengjun Pan, Yu Sun, Quan Lu | In this paper, we propose Field-weighted Factorization Machines (FwFMs) to model the different feature interactions between different fields in a much more memory-efficient way. |
133 | Dynamic Mechanism Design in the Field | Vahab Mirrokni, Renato Paes Leme, Rita Ren, Song Zuo | In this paper, we aim to address these shortcomings and develop simple dynamic mechanisms that can be implemented efficiently, and provide theoretical guidelines for decreasing the sensitivity of dynamic mechanisms on prediction accuracy of buyers' value distributions. |
134 | Incentive-Aware Learning for Large Markets | Alessandro Epasto, Mohammad Mahdian, Vahab Mirrokni, Song Zuo | In this paper, we study such incentive-aware learning problem in a general setting and show that it is possible to approximately optimize the objective function under two assumptions: (i) each individual agent is a "small" (part of the market); and (ii) there is a cost associated with manipulation. |
135 | Incentive-Compatible Diffusion | Yakov Babichenko, Oren Dean, Moshe Tennenholtz | We introduce the study of finding an incentive-compatible (strategy-proof) mechanism for selecting an influential vertex in a directed graph (e.g. Twitter's network). |
136 | A Short-term Intervention for Long-term Fairness in the Labor Market | Lily Hu, Yiling Chen | In this paper, we show that current group disparate outcomes may be immovable even when hiring decisions are bound by an input-output notion of "individual fairness." |
137 | Optimizing Ad Refresh In Mobile App Advertising | Florin Constantin, Christopher Harris, Samuel Ieong, Aranyak Mehta, Xi Tan | We propose a new, natural, "two-phase" click model for this setting that explains this independence, as well as our measurements of the click-through rate as a function of the impression's time-on-screen and of ad-repeat counts. |
138 | Detecting Ponzi Schemes on Ethereum: Towards Healthier Blockchain Technology | Weili Chen, Zibin Zheng, Jiahui Cui, Edith Ngai, Peilin Zheng, Yuren Zhou | To help deal with this issue, this paper proposes an approach to detect Ponzi schemes on blockchain by using data mining and machine learning methods. |
139 | Testing Incentive Compatibility in Display Ad Auctions | Sébastien Lahaie, Andrés Munoz Medina, Balasubramanian Sivan, Sergei Vassilvitskii | In this work we develop tests based on simple bid perturbations that a buyer can use to answer these questions, with a focus on dynamic incentive compatibility. |
140 | Simple vs Optimal Contests with Convex Costs | Amy Greenwald, Takehiro Oyakawa, Vasilis Syrgkanis | We study an optimal contest design problem where contributors' abilities are private, their costs are convex as a function of their effort, and the designer seeks to maximize their total productivity. |
141 | Arrays of (locality-sensitive) Count Estimators (ACE): Anomaly Detection on the Edge | Chen Luo, Anshumali Shrivastava | In this paper, we propose ACE (Arrays of (locality-sensitive) Count Estimators) algorithm that can be 60x faster than most state-of-the-art unsupervised anomaly detection algorithms. |
142 | Mile High WiFi: A First Look At In-Flight Internet Connectivity | John P. Rula, James Newman, Fabián E. Bustamante, Arash Molavi Kakhki, David Choffnes | In this paper, we present the first characterization of deployed IFC systems. |
143 | DeepMove: Predicting Human Mobility with Attentional Recurrent Networks | Jie Feng, Yong Li, Chao Zhang, Funing Sun, Fanchao Meng, Ang Guo, Depeng Jin | In this paper, we propose DeepMove, an attentional recurrent network for mobility prediction from lengthy and sparse trajectories. |
144 | Aladdin: Automating Release of Deep-Link APIs on Android | Yun Ma, Ziniu Hu, Yunxin Liu, Tao Xie, Xuanzhe Liu | In this paper, we present a large-scale empirical study to investigate how deep links are really adopted, over 25,000 Android apps. |
145 | The Cost of Digital Advertisement: Comparing User and Advertiser Views | Panagiotis Papadopoulos, Nicolas Kourtellis, Evangelos P. Markatos | In this study, we aim to increase user awareness regarding the hidden costs of digital advertisement in mobile devices, and compare the user and advertiser views. |
146 | Facebook (A)Live?: Are Live Social Broadcasts Really Broadcasts? | Aravindh Raman, Gareth Tyson, Nishanth Sastry | With this in mind, we explore one such prominent platform – Facebook Live. |
147 | I’ll Be Back: On the Multiple Lives of Users of a Mobile Activity Tracking Application | Zhiyuan Lin, Tim Althoff, Jure Leskovec | Based on insights developed in this work, including a marker of improved primary intent performance, our prediction models achieve 71% ROC AUC. |
148 | Joint User- and Event- Driven Stable Social Event Organization | Xin Wang, Wenwu Zhu, Chun Chen, Martin Ester | In this paper, we investigate joint user- and event- driven SEO by simultaneously considering user preferences (towards events) and event preferences (towards users). |
149 | Learning on Partial-Order Hypergraphs | Fuli Feng, Xiangnan He, Yiqun Liu, Liqiang Nie, Tat-Seng Chua | In this work, we address the inherent limitation of existing hypergraphs by proposing a new data structure named Partial-Order Hypergraph, which specifically injects the partially ordering relations among vertices into a hyperedge. |
150 | "Satisfaction with Failure" or "Unsatisfied Success": Investigating the Relationship between Search Success and User Satisfaction | Mengyang Liu, Yiqun Liu, Jiaxin Mao, Cheng Luo, Min Zhang, Shaoping Ma | In this study, we investigate the differences between user satisfaction and search success, and try to adopt the findings to predict search success in complex search tasks. |
151 | TEM: Tree-enhanced Embedding Model for Explainable Recommendation | Xiang Wang, Xiangnan He, Fuli Feng, Liqiang Nie, Tat-Seng Chua | In this work, we propose a novel solution named Tree-enhanced Embedding Method that combines the strengths of embedding-based and tree-based models. |
152 | Ad Hoc Table Retrieval using Semantic Similarity | Shuo Zhang, Krisztian Balog | The main novel contribution of this work is a method for performing semantic matching between queries and tables. We introduce and address the problem of ad hoc table retrieval: answering a keyword query with a ranked list of tables. |
153 | Query Suggestion with Feedback Memory Network | Bin Wu, Chenyan Xiong, Maosong Sun, Zhiyuan Liu | This paper presents Feedback Memory Network (FMN) which models user interactions with the search engine for query suggestion. |
154 | A Sparse Topic Model for Extracting Aspect-Specific Summaries from Online Reviews | Vineeth Rakesh, Weicong Ding, Aman Ahuja, Nikhil Rao, Yifan Sun, Chandan K. Reddy | To address these challenges, in this paper, we propose a generative aspect summarization model called APSUM that is capable of providing fine-grained summaries of online reviews. |
155 | Neural Attentional Rating Regression with Review-level Explanations | Chong Chen, Min Zhang, Yiqun Liu, Shaoping Ma | In this paper, we introduce a novel attention mechanism to explore the usefulness of reviews, and propose a Neural Attentional Regression model with Review-level Explanations (NARRE) for recommendation. |
156 | Learning from Multi-View Multi-Way Data via Structural Factorization Machines | Chun-Ta Lu, Lifang He, Hao Ding, Bokai Cao, Philip S. Yu | In this paper, we introduce a multi-tensor-based approach that can preserve the underlying structure of multi-view data in a generic predictive model. |
157 | Scalable Supervised Discrete Hashing for Large-Scale Search | Xin Luo, Ye Wu, Xin-Shun Xu | To address these issues and make the supervised method scalable to large datasets, we present a novel hashing method, named Scalable Supervised Discrete Hashing (SSDH). |
158 | Subgraph-augmented Path Embedding for Semantic User Search on Heterogeneous Social Network | Zemin Liu, Vincent W. Zheng, Zhou Zhao, Hongxia Yang, Kevin Chen-Chuan Chang, Minghui Wu, Jing Ying | Therefore, in this paper we introduce a new concept of subgraph-augmented path for semantic user search. |
159 | Leveraging Fine-Grained Wikipedia Categories for Entity Search | Denghao Ma, Yueguo Chen, Kevin Chen-Chuan Chang, Xiaoyong Du, Chuanfei Xu, Yi Chang | Based on the observation of how people describe entities of a specific type, we propose a headword-and-modifier model to deeply interpret both queries and fine-grained entity types/categories. |
160 | Online Compact Convexified Factorization Machine | Xiao Lin, Wenpeng Zhang, Min Zhang, Wenwu Zhu, Jian Pei, Peilin Zhao, Junzhou Huang | To address this subsequent challenge, we follow the general projection-free algorithmic framework of Online Conditional Gradient and propose an Online Compact Convex Factorization Machine (OCCFM) algorithm that eschews the projection operation with efficient linear optimization steps. |
161 | Understanding and Predicting Delay in Reciprocal Relations | Jundong Li, Jiliang Tang, Yilin Wang, Yali Wan, Yi Chang, Huan Liu | This paper presents the initial investigation of the time delay in reciprocal relations. |
162 | Hierarchical Variational Memory Network for Dialogue Generation | Hongshen Chen, Zhaochun Ren, Jiliang Tang, Yihong Eric Zhao, Dawei Yin | In this paper, we propose a novel hierarchical variational memory network (HVMN), by adding the hierarchical structure and the variational memory network into a neural encoder-decoder network. |
163 | Strategies for Geographical Scoping and Improving a Gazetteer | Sanket Kumar Singh, Davood Rafiei | In this paper, we study the problem of detecting the scope of locations in a geographical database and its applications in identifying inconsistencies and improving the quality of a gazetteer. |
164 | Detecting Crowdturfing "Add to Favorites" Activities in Online Shopping | Ning Su, Yiqun Liu, Zhao Li, Yuli Liu, Min Zhang, Shaoping Ma | With a comprehensive analysis of some ground truth spamming activities from the perspective of behavior, user and item, we propose a factor graph based model to identify this kind of spamming activity. |
165 | Search Process as Transitions Between Neural States | Yashar Moshfeghi, Frank E. Pollick | These models significantly contribute to our understanding of the search process. |
166 | StaQC: A Systematically Mined Question-Code Dataset from Stack Overflow | Ziyu Yao, Daniel S. Weld, Wei-Peng Chen, Huan Sun | In this paper, we investigate a new problem of systematically mining question-code pairs from Stack Overflow (in contrast to heuristically collecting them). |
167 | Finding Subcube Heavy Hitters in Analytics Data Streams | Branislav Kveton, S. Muthukrishnan, Hoa T. Vu, Yikun Xian | We present a simple one-pass sampling algorithm to solve the subcube heavy hitters problem in $\tilde{O}(kd/\gamma)$ space. |
168 | Conversational Query Understanding Using Sequence to Sequence Modeling | Gary Ren, Xiaochuan Ni, Manish Malik, Qifa Ke | In this paper, we define a conversational query as a query that depends on the context of the current conversation, and we formulate the conversational query understanding problem as context-aware query reformulation, where the goal is to reformulate the conversational query into a search engine friendly query in order to satisfy users' information needs in conversational settings. We present a large scale open domain dataset of conversational queries and various sequence to sequence models that are learned from this dataset. |
169 | Privacy and Efficiency Tradeoffs for Multiword Top K Search with Linear Additive Rank Scoring | Daniel Agun, Jinjin Shao, Shiyu Ji, Stefano Tessaro, Tao Yang | This paper proposes a private ranking scheme with linear additive scoring for efficient top K keyword search on modest-sized cloud datasets. |
170 | Manifold Learning for Rank Aggregation | Shangsong Liang, Ilya Markov, Zhaochun Ren, Maarten de Rijke | We propose manifold learning aggregation approaches, ManX and v-ManX, that build on the cluster hypothesis and exploit inter-document similarity information. |
171 | Identifying Modes of User Engagement with Online News and Their Relationship to Information Gain in Text | Nir Grinberg | In this study, we examine patterns of user engagement in a large, client-side log dataset of over 7.7 million page views (including both mobile and non-mobile devices) of 66,821 news articles from seven popular news publishers. |
172 | HTTP/2 Prioritization and its Impact on Web Performance | Maarten Wijnants, Robin Marx, Peter Quax, Wim Lamotte | Web performance is a hot topic, as many studies have shown a strong correlation between slow webpages and loss of revenue due to user dissatisfaction. |
173 | Discovering Progression Stages in Trillion-Scale Behavior Logs | Kijung Shin, Mahdi Shafiei, Myunghwan Kim, Aastha Jain, Hema Raghavan | To answer these questions, we propose a behavior model that discovers the progressions of users’ behaviors from a given starting point – such as a new subscription or first experience of certain features – to a particular target stage such as a predefined engagement level of interest. |
174 | Pixie: A System for Recommending 3+ Billion Items to 200+ Million Users in Real-Time | Chantat Eksombatchai, Pranav Jindal, Jerry Zitao Liu, Yuchen Liu, Rahul Sharma, Charles Sugnet, Mark Ulrich, Jure Leskovec | Here we present Pixie, a scalable graph-based real-time recommender system that we developed and deployed at Pinterest. |
175 | A Cross-Platform Consumer Behavior Analysis of Large-Scale Mobile Shopping Data | Hong Huang, Bo Zhao, Hao Zhao, Zhou Zhuang, Zhenxuan Wang, Xiaoming Yao, Xinggang Wang, Hai Jin, Xiaoming Fu | In this paper, we examine the consumer behaviors across multiple platforms based on a large-scale mobile Internet dataset from a major telecom operator, which covers 9.8 million users from two regions among which 1.4 million users have visited e-commerce platforms within one week of our study. |
176 | Towards Automatic Numerical Cross-Checking: Extracting Formulas from Text | Yixuan Cao, Hongwei Li, Ping Luo, Jiaquan Yao | Specifically, we formulate this task as a DAG-structure prediction problem, and propose an iterative relation extraction model to address it. |
177 | Mining E-Commerce Query Relations using Customer Interaction Networks | Bijaya Adhikari, Parikshit Sondhi, Wenke Zhang, Mohit Sharma, B. Aditya Prakash | In this work, we begin by studying the properties of CINs developed using Walmart.com's product search logs. |
178 | Modeling Dynamic Competition on Crowdfunding Markets | Yusan Lin, Peifeng Yin, Wang-Chien Lee | In this paper, we study the competition on crowdfunding markets through data analysis, and propose a probabilistic generative model, Dynamic Market Competition (DMC) model, to capture the competitiveness of projects in crowdfunding. |
179 | No Silk Road for Online Gamers!: Using Social Network Analysis to Unveil Black Markets in Online Games | Eunjo Lee, Jiyoung Woo, Hyoungshick Kim, Huy Kang Kim | We studied the characteristics of exchanging virtual goods with real money through the processes called "real money trading (RMT)". |
180 | DKN: Deep Knowledge-Aware Network for News Recommendation | Hongwei Wang, Fuzheng Zhang, Xing Xie, Minyi Guo | To solve the above problem, in this paper, we propose a deep knowledge-aware network (DKN) that incorporates knowledge graph representation into news recommendation. |
181 | CrimeBB: Enabling Cybercrime Research on Underground Forums at Scale | Sergio Pastrana, Daniel R. Thomas, Alice Hutchings, Richard Clayton | We describe CrimeBot, a crawler designed around the particular challenges of capturing data from underground forums. |
182 | Attention Convolutional Neural Network for Advertiser-level Click-through Rate Forecasting | Hongchang Gao, Deguang Kong, Miao Lu, Xiao Bai, Jian Yang | In this paper, we focus on the advertiser-level CTR forecasting and formulate it as a time series forecasting problem based on the historical CTR record. |
183 | Hidden in Plain Sight: Classifying Emails Using Embedded Image Contents | Navneet Potti, James B. Wendt, Qi Zhao, Sandeep Tata, Marc Najork | In this paper, we tackle the problem of extracting information from commercial emails promoting an offer to the user. |
184 | Better Caching in Search Advertising Systems with Rapid Refresh Predictions | Conglong Li, David G. Andersen, Qiang Fu, Sameh Elnikety, Yuxiong He | Using the gradient boosting regression tree algorithm with well selected features, we introduce a rapid prediction framework that provides refresh decisions at higher accuracy compared to the heuristic. |
185 | Attribution Inference for Digital Advertising using Inhomogeneous Poisson Models | Zachary Nichols, Adam Stein | Here, we present a new observational attribution method based on a successful model of neural spiking that learns the temporal interactions between event-based time series. |
186 | PhotoReply: Automatically Suggesting Conversational Responses to Photos | Ning Ye, Ariel Fuxman, Vivek Ramavajjala, Sergey Nazarov, J. Patrick McGregor, Sujith Ravi | We introduce the problem of automatically suggesting conversational responses to photos and present an intelligent assistant called PhotoReply that solves the problem in the context of a messaging application. |
187 | A Feature-Oriented Sentiment Rating for Mobile App Reviews | Washington Luiz, Felipe Viegas, Rafael Alencar, Fernando Mourão, Thiago Salles, Dárlinton Carvalho, Marcos Andre Gonçalves, Leonardo Rocha | In this paper, we propose a general framework that allows developers to filter, summarize and analyze user reviews written about applications on App Stores. |
188 | Beyond Keywords and Relevance: A Personalized Ad Retrieval Framework in E-Commerce Sponsored Search | Su Yan, Wei Lin, Tianshu Wu, Daorui Xiao, Xu Zheng, Bo Wu, Kaipeng Liu | To address these problems, we propose a novel ad retrieval framework beyond keywords and relevance in e-commerce sponsored search. |
189 | Unveiling a Socio-Economic System in a Virtual World: A Case Study of an MMORPG | Selin Chun, Deajin Choi, Jinyoung Han, Huy Kang Kim, Taekyoung Kwon | In this paper, we model the socio-economic system of Aion, a popular MMORPG, as a multi-layer graph. |
190 | Learning to Collaborate: Multi-Scenario Ranking via Multi-Agent Reinforcement Learning | Jun Feng, Heng Li, Minlie Huang, Shichen Liu, Wenwu Ou, Zhirong Wang, Xiaoyan Zhu | In this paper, we formulate multi-scenario ranking as a fully cooperative, partially observable, multi-agent sequential decision problem. |