Paper Digest: WWW 2019 Highlights
The Web Conference (WWW) is one of the top internet conferences in the world. In 2019, it takes place in San Francisco, California. This year, there were 1,247 full-paper submissions, of which 225 were accepted. More than 100 submissions were also accepted as short papers.
To help the AI community quickly catch up on the work presented at this conference, the Paper Digest Team processed all accepted papers and generated one highlight sentence (typically the main topic) for each paper. Readers are encouraged to read these machine-generated highlights and summaries to quickly get the main idea of each paper.
We thank all authors for writing these interesting papers, and readers for reading our digests. If you do not want to miss any interesting AI paper, you are welcome to sign up for our free paper digest service to get new paper updates customized to your own interests on a daily basis.
Paper Digest Team
team@paperdigest.org
TABLE 1: WWW 2019 Papers
# | Title | Authors | Highlight |
---|---|---|---|
1 | Addressing Trust Bias for Unbiased Learning-to-Rank | Aman Agarwal, Xuanhui Wang, Cheng Li, Michael Bendersky, Marc Najork | In this paper, we relax this unrealistic assumption and study click noise explicitly in the unbiased learning-to-rank setting. |
2 | Learning Edge Properties in Graphs from Path Aggregations | Rakshit Agrawal, Luca de Alfaro | We introduce LEAP, a trainable, general framework for predicting the presence and properties of edges on the basis of the local structure, topology, and labels of the graph. |
3 | Evaluating User Actions as a Proxy for Email Significance | Tarfah Alrashed, Chia-Jung Lee, Peter Bailey, Christopher Lin, Milad Shokouhi, Susan Dumais | In this work, we hypothesize that the cumulative set of actions on any individual email can be considered as a proxy for the perceived significance of that email. |
4 | DDGK: Learning Graph Representations for Deep Divergence Graph Kernels | Rami Al-Rfou, Bryan Perozzi, Dustin Zelle | In this paper, we show that it is possible to learn representations for graph similarity with neither domain knowledge nor supervision (i.e. feature engineering or labeled graphs). |
5 | Stereotypical Bias Removal for Hate Speech Detection Task using Knowledge-based Generalizations | Pinkesh Badjatiya, Manish Gupta, Vasudeva Varma | In this paper, we make two important contributions. |
6 | Personalized Bundle List Recommendation | Jinze Bai, Chang Zhou, Junshuai Song, Xiaoru Qu, Weiting An, Zhao Li, Jun Gao | In this paper, we formalize the personalized bundle list recommendation as a structured prediction problem and propose a bundle generation network (BGN), which decomposes the problem into quality/diversity parts by the determinantal point processes (DPPs). |
7 | Improving Medical Code Prediction from Clinical Text via Incorporating Online Knowledge Sources | Tian Bai, Slobodan Vucetic | In this paper we consider Wikipedia as an external knowledge source and propose Knowledge Source Integration (KSI), a novel end-to-end code assignment framework, which can integrate external knowledge during training of any baseline deep learning model. |
8 | No Place to Hide: Catching Fraudulent Entities in Tensors | Ban Yikun, Liu Xin, Huang Ling, Duan Yitao, Liu Xue, Xu Wei | In this paper, we novelly identify dense-block detection with dense-subgraph mining, by modeling a tensor into a weighted graph without any density information lost. |
9 | Link Prediction in Networks with Core-Fringe Data | Austin Benson, Jon Kleinberg | A common example arises in collecting network data: we often obtain network datasets by recording all of the interactions among a small set of core nodes, so that we end up with a measurement of the network consisting of these core nodes along with a potentially much larger set of fringe nodes that have links to the core. |
10 | Spiders like Onions: on the Network of Tor Hidden Services | Massimo Bernaschi, Alessandro Celestini, Stefano Guarino, Flavio Lombardi, Enrico Mastrostefano | In this paper, we describe the topology of the Tor graph (aggregated at the hidden service level) measuring both global and local properties by means of well-known metrics. |
11 | Be Concise and Precise: Synthesizing Open-Domain Entity Descriptions from Facts | Rajarshi Bhowmik, Gerard de Melo | To this end, we propose a novel fact-to-sequence encoder-decoder model with a suitable copy mechanism to generate concise and precise textual descriptions of entities. |
12 | Navigating the Maze of Wikidata Query Logs | Angela Bonifati, Wim Martens, Thomas Timm | A further investigation that we pursue in this paper is to find, given a query, a number of queries structurally similar to the given query. |
13 | What happened? The Spread of Fake News Publisher Content During the 2016 U.S. Presidential Election | Ceren Budak | In this paper, we address these questions using tweets that mention the two presidential candidates sampled at the daily level, the news content mentioned in such tweets, and open-ended responses from nationally representative telephone interviews. |
14 | Unifying Knowledge Graph Learning and Recommendation: Towards a Better Understanding of User Preferences | Yixin Cao, Xiang Wang, Xiangnan He, Zikun Hu, Tat-Seng Chua | In this paper, we jointly learn the model of recommendation and knowledge graph completion. |
15 | Enriching News Articles with Related Search Queries | David Carmel, Yaroslav Fyodorov, Saar Kuzi, Avihai Mejer, Fiana Raiber, Elad Rainshmidt | We present a three-phase retrieval framework for query recommendation that incorporates various article-dependent and article-independent relevance signals. |
16 | Revisiting Opinion Dynamics with Varying Susceptibility to Persuasion via Non-Convex Local Search | T-H. Hubert Chan, Zhibin Liang, Mauro Sozio | Contrary to the claim in the aforementioned KDD 2018 paper, the objective function is in general non-convex. |
17 | Trajectories of Blocked Community Members: Redemption, Recidivism and Departure | Jonathan Chang, Cristian Danescu-Niculescu-Mizil | In this work, we introduce a computational framework for studying the future behavior of blocked users on Wikipedia. |
18 | Selling a Single Item with Negative Externalities | Matheus Xavier Ferreira, S. Matthew Weinberg, Danny Yuxing Huang, Nick Feamster, Tithi Chattopadhyay | We consider the problem of regulating products with negative externalities to a third party that is neither the buyer nor the seller, but where both the buyer and seller can take steps to mitigate the externality. |
19 | Revisiting Mobile Advertising Threats with MAdLife | Gong Chen, Wei Meng, John Copeland | In this paper, we present an ad collection framework – MAdLife – on Android to capture all the in-app ad traffic generated during an ad’s entire lifespan. |
20 | Modeling Relational Drug-Target-Disease Interactions via Tensor Factorization with Multiple Web Sources | Huiyuan Chen, Jing Li | In this work, we investigate the utility of tensor factorization to model the relationships of drug-target-disease, specifically leveraging different types of online data. |
21 | SamWalker: Social Recommendation with Informative Sampling Strategy | Jiawei Chen, Can Wang, Sheng Zhou, Qihao Shi, Yan Feng, Chun Chen | To address the above two problems, we propose a new recommendation method SamWalker that leverages social information to infer data confidence and guide the sampling process. |
22 | How Serendipity Improves User Satisfaction with Recommendations? A Large-Scale User Evaluation | Li Chen, Yonghua Yang, Ningxia Wang, Keping Yang, Quan Yuan | In this paper, we report the results of a large-scale user survey (involving over 3,000 users) conducted in an industrial mobile e-commerce setting. |
23 | Emoji-Powered Representation Learning for Cross-Lingual Sentiment Classification | Zhenpeng Chen, Sheng Shen, Ziniu Hu, Xuan Lu, Qiaozhu Mei, Xuanzhe Liu | In this paper, we employ emojis, which are widely available in many languages, as a new channel to learn both the cross-language and the language-specific sentiment patterns. |
24 | Decoupled Smoothing on Graphs | Alex Chin, Yatong Chen, Kristen M. Altenburger, Johan Ugander | In this work, we demonstrate that for social networks, the basic friendship graph itself may often not be the appropriate graph for predicting node attributes using graph smoothing. |
25 | Cross-Network Embedding for Multi-Network Alignment | Xiaokai Chu, Xinxin Fan, Di Yao, Zhihua Zhu, Jianhui Huang, Jingping Bi | In this paper, we propose a cross-network embedding method CrossMNA for multi-network alignment problem through investigating structural information only. |
26 | Improving Treatment Effect Estimators Through Experiment Splitting | Dominic Coey, Tom Cunningham | We present a method for implementing shrinkage of treatment effect estimators, and hence improving their precision, via experiment splitting. |
27 | A Semi-Supervised Active-learning Truth Estimator for Social Networks | Hang Cui, Tarek Abdelzaher, Lance Kaplan | This paper introduces an active-learning-based truth estimator for social networks, such as Twitter, that enhances estimation accuracy significantly by requesting a well-selected (small) fraction of data to be labeled. |
28 | Dressing as a Whole: Outfit Compatibility Learning Based on Node-wise Graph Neural Networks | Zeyu Cui, Zekun Li, Shu Wu, Xiao-Yu Zhang, Liang Wang | In this paper, we aim to investigate a practical problem of fashion recommendation by answering the question “which item should we select to match with the given fashion items and form a compatible outfit”. |
29 | Are All Successful Communities Alike? Characterizing and Predicting the Success of Online Communities | Tiago Cunha, David Jurgens, Chenhao Tan, Daniel Romero | Here, we present a systematic study to understand the relations between these success definitions and test how well they can be predicted based on community properties and behaviors from the earliest period of a community’s lifetime. |
30 | Adversarial Training Methods for Network Embedding | Quanyu Dai, Xiao Shen, Liang Zhang, Qiang Li, Dan Wang | In this paper, we aim to introduce a more succinct and effective local regularization method, namely adversarial training, to network embedding so as to achieve model robustness and better generalization performance. |
31 | Local Matching Networks for Engineering Diagram Search | Zhuyun Dai, Zhen Fan, Hafeezul Rahman, Jamie Callan | This paper investigates several local matching networks that explicitly model local region-to-region similarities. |
32 | Evaluating Anti-Fingerprinting Privacy Enhancing Technologies | Amit Datta, Jianan Lu, Michael Carl Tschantz | We propose a novel combination of these methods, offering the best of both worlds, by applying experimentally created models of an AFPET’s behavior to an observational dataset. |
33 | Heterographic Pun Recognition via Pronunciation and Spelling Understanding Gated Attention Network | Yufeng Diao, Hongfei Lin, Liang Yang, Xiaochao Fan, Di Wu, Dongyu Zhang, Kan Xu | In this paper, we propose an end-to-end computational approach – Pronunciation Spelling Understanding Gated Attention (PSUGA) network. |
34 | Evaluating Login Challenges as a Defense Against Account Takeover | Periwinkle Doerfler, Kurt Thomas, Maija Marincenko, Juri Ranieri, Yu Jiang, Angelika Moscicki, Damon McCoy | In this paper, we study the efficacy of login challenges at preventing account takeover, as well as evaluate the amount of friction these challenges create for normal users. |
35 | On Attribution of Recurrent Neural Network Predictions via Additive Decomposition | Mengnan Du, Ninghao Liu, Fan Yang, Shuiwang Ji, Xia Hu | In this paper, we enhance the interpretability of RNNs by providing interpretable rationales for RNN predictions. |
36 | Is a Single Embedding Enough? Learning Node Representations that Capture Multiple Social Contexts | Alessandro Epasto, Bryan Perozzi | In this work, we propose a method for learning multiple representations of the nodes in a graph (e.g., the users of a social network). |
37 | On-Device Algorithms for Public-Private Data with Absolute Privacy | Alessandro Epasto, Hossein Esfandiari, Vahab Mirrokni | Motivated by the increasing need to preserve privacy in digital devices, we introduce the on-device public-private model of computation. |
38 | Graph Neural Networks for Social Recommendation | Wenqi Fan, Yao Ma, Qing Li, Yuan He, Eric Zhao, Jiliang Tang, Dawei Yin | To address the three aforementioned challenges simultaneously, in this paper, we present a novel graph neural network framework (GraphRec) for social recommendations. |
39 | Knowledge-Enhanced Ensemble Learning for Word Embeddings | Lanting Fang, Yong Luo, Kaiyu Feng, Kaiqi Zhao, Aiqun Hu | In this paper, we propose a knowledge-enhanced ensemble method to combine both knowledge graphs and pre-trained word embedding models. |
40 | Joint Entity Linking with Deep Reinforcement Learning | Zheng Fang, Yanan Cao, Qian Li, Dongjie Zhang, Zhenyu Zhang, Yanbing Liu | To address these problems, we convert the global linking into a sequence decision problem and propose a reinforcement learning model which makes decisions from a global perspective. |
41 | Measurement and Early Detection of Third-Party Application Abuse on Twitter | Shehroze Farooqi, Zubair Shafiq | In this paper, we perform a longitudinal study of abusive third-party applications on Twitter that perform a variety of malicious and spam activities in violation of Twitter’s Terms of Service (ToS). |
42 | DPLink: User Identity Linkage via Deep Neural Network From Heterogeneous Mobility Data | Jie Feng, Mingyang Zhang, Huandong Wang, Zeyu Yang, Chao Zhang, Yong Li, Depeng Jin | In this paper, we propose DPLink, an end-to-end deep learning based framework, to complete the user identity linkage task for heterogeneous mobility data collected from different services with different properties. |
43 | MARINE: Multi-relational Network Embeddings with Relational Proximity and Node Attributes | Ming-Han Feng, Chin-Chi Hsu, Cheng-Te Li, Mi-Yen Yeh, Shou-De Lin | We observe that there are two diverse branches of network embedding: for homogeneous graphs and for multi-relational graphs. |
44 | Gaps in Information Access in Social Networks? | Benjamin Fish, Ashkan Bashardoust, Danah Boyd, Sorelle Friedler, Carlos Scheidegger, Suresh Venkatasubramanian | In this work, we study how best to spread information in a social network while minimizing this access gap. |
45 | Cross-domain Recommendation Without Sharing User-relevant Data | Chen Gao, Xiangning Chen, Fuli Feng, Kai Zhao, Xiangnan He, Yong Li, Depeng Jin | In this work, we consider a more practical scenario to perform cross-domain recommendation. |
46 | Towards Self-Adaptive Metric Learning On the Fly | Yang Gao, Yi-Fan Li, Swarup Chandra, Latifur Khan, Bhavani Thuraisingham | Existing studies have proposed various solutions to learn a Mahalanobis or bilinear metric in an online fashion by either restricting distances between similar (dissimilar) pairs to be smaller (larger) than a given lower (upper) bound or requiring similar instances to be separated from dissimilar instances with a given margin. |
47 | Knowledge-aware Assessment of Severity of Suicide Risk for Early Intervention | Manas Gaur, Amanuel Alambo, Joy Prakash Sain, Ugur Kursuncu, Krishnaprasad Thirunarayan, Ramakanth Kavuluru, Amit Sheth, Randy Welton, Jyotishman Pathak | A report by Substance Abuse and Mental Health Services Administration (SAMHSA) shows that 80% of the patients suffering from Borderline Personality Disorder (BPD) have suicidal behavior, 5-10% of whom commit suicide. |
48 | SpeedReader: Reader Mode Made Fast and Private | Mohammad Ghasemisharif, Peter Snyder, Andrius Aucinas, Benjamin Livshits | Instead of its use as a post-render feature to clean up the clutter on a page we propose SpeedReader as an alternative multistep pipeline that is part of the rendering pipeline. |
49 | Externalities and Fairness | Masoud Seddighin, Hamed Saleh, Mohammad Ghodsi | Inspired by the models in the context of network diffusion, we present a simple and natural model, namely network externalities, to capture the externalities. |
50 | “It’s almost like they’re trying to hide it”: How User-Provided Image Descriptions Have Failed to Make Twitter Accessible | Cole Gleason, Patrick Carrington, Cameron Cassidy, Meredith Ringel Morris, Kris M. Kitani, Jeffrey P. Bigham | In this paper, we present a study of 1.09 million tweets with images, finding that only 0.1% of those tweets included descriptions. |
51 | Characterizing Speed and Scale of Cryptocurrency Discussion Spread on Reddit | Maria Glenski, Emily Saldanha, Svitlana Volkova | Our analysis aims to bring the awareness to online discussion spread relevant to cryptocurrencies in addition to informing models for forecasting cryptocurrency price that rely on discussions in social media. |
52 | Goal-setting And Achievement In Activity Tracking Apps: A Case Study Of MyFitnessPal | Mitchell Gordon, Tim Althoff, Jure Leskovec | We present a large-scale study of 1.4 million users and weight loss goals, allowing for an unprecedented detailed view of how people set and achieve their goals. |
53 | Link Prediction on N-ary Relational Data | Saiping Guan, Xiaolong Jin, Yuanzhuo Wang, Xueqi Cheng | To overcome these problems, in this paper, without decomposition, we represent each n-ary relational fact as a set of its role-value pairs. |
54 | PrivIdEx: Privacy Preserving and Secure Exchange of Digital Identity Assets. | Hasini Gunasinghe, Ashish Kundu, Elisa Bertino, Hugo Krawczyk, Suresh Chari, Kapil Singh, Dong Su | This paper presents a decentralized protocol for privacy preserving exchange of users’ identity information addressing such challenges. |
55 | RED: Redundancy-Driven Data Extraction from Result Pages | Jinsong Guo, Valter Crescenzi, Tim Furche, Giovanni Grasso, Georg Gottlob | We present RED, an automatic approach and a prototype system for extracting data records from sites following this publishing pattern. |
56 | Securing the Deep Fraud Detector in Large-Scale E-Commerce Platform via Adversarial Machine Learning Approach | Qingyu Guo, Zhao Li, Bo An, Pengrui Hui, Jiaming Huang, Long Zhang, Mengchen Zhao | (iii) The model trained with an adversarial training process is significantly robust against attacks and performs well on the unperturbed data. |
57 | Predictive Crawling for Commercial Web Content | Shuguang Han, Bernhard Brodowsky, Przemek Gajda, Sergey Novikov, Mike Bendersky, Marc Najork, Robin Dua, Alexandrin Popescul | We describe our solution to technical challenges due to partial observability of price history, feedback loops arising from applying machine learned models, and offers in cold start state. |
58 | Generating Titles for Web Tables | Braden Hancock, Hongrae Lee, Cong Yu | We propose instead the application of a sequence-to-sequence neural network model as a more generalizable approach for generating high-quality table titles. |
59 | Understanding the Effects of the Neighbourhood Built Environment on Public Health with Open Data | Apinan Hasthanasombat, Cecilia Mascolo | Here we propose an approach to link the effects of neighbourhood services over citizen health using a technique that attempts to highlight the cause-effect aspects of these relationships. |
60 | Distributed Tensor Decomposition for Large Scale Health Analytics | Huan He, Jette Henderson, Joyce C Ho | To address this scaling problem more efficiently, we introduce SGranite, a distributed, scalable, and sparse tensor factorization method fit through stochastic gradient descent. |
61 | Debiasing Vandalism Detection Models at Wikidata | Stefan Heindorf, Yan Scholten, Gregor Engels, Martin Potthast | We address this problem for the first time by analyzing and measuring the sources of bias, and by developing a new vandalism detection model that avoids them. |
62 | Message Distortion in Information Cascades | Manoel Horta Ribeiro, Kristina Gligoric, Robert West | Via careful manual coding, we annotate lexical and semantic units in the medical abstracts and track them along cascades. |
63 | Auditing the Partisanship of Google Search Snippets | Desheng Hu, Shan Jiang, Ronald E. Robertson, Christo Wilson | Motivated by the growing body of evidence suggesting that search engine rankings can influence undecided voters, we conducted an algorithm audit of the political partisanship of Google Search snippets relative to the webpages they are extracted from. Then, we collected a large dataset of Search Engine Results Pages (SERPs) by running our partisan queries and their autocomplete suggestions on Google Search. |
64 | To Return or to Explore: Modelling Human Mobility and Dynamics in Cyberspace | Tianran Hu, Yinglong Xia, Jiebo Luo | In this work, we study the statistical patterns that characterize human movements in cyberspace. |
65 | MiST: A Multiview and Multimodal Spatial-Temporal Learning Framework for Citywide Abnormal Event Forecasting | Chao Huang, Chuxu Zhang, Jiashu Zhao, Xian Wu, Dawei Yin, Nitesh Chawla | In this paper, we develop a Multi-View and Multi-Modal Spatial-Temporal learning (MiST) framework to address the above challenges by promoting the collaboration of different views (spatial, temporal and semantic) and map the multi-modal units into the same latent space. |
66 | Who Watches the Watchmen: Exploring Complaints on the Web | Damilola Ibosiola, Ignacio Castro, Gianluca Stringhini, Steve Uhlig, Gareth Tyson | We present the first large-scale study of web complaints (over 1 billion URLs). |
67 | Nonlinear Diffusion for Community Detection and Semi-Supervised Learning | Rania Ibrahim, David Gleich | In this paper, we illustrate a class of nonlinear graph diffusions that are competitive with state of the art embedding techniques and outperform classic diffusions. |
68 | Bayesian Exploration with Heterogeneous Agents | Nicole Immorlica, Jieming Mao, Aleksandrs Slivkins, Zhiwei Steven Wu | We consider Bayesian Exploration: a simple model in which the recommendation system (the “principal”) controls the information flow to the users (the “agents”) and strives to incentivize exploration via information asymmetry. |
69 | Diversity and Exploration in Social Learning | Nicole Immorlica, Jieming Mao, Christos Tzamos | We consider a sequential model of consumer search in which agents’ values are correlated and each agent updates her priors based on the exploration of past agents before performing her search. |
70 | Alleviating Users’ Pain of Waiting: Effective Task Grouping for Online-to-Offline Food Delivery Services | Shenggong Ji, Yu Zheng, Zhaoyuan Wang, Tianrui Li | Thus, in this paper, we study the food delivery task grouping problem so as to improve food delivery efficiency and alleviate the pain of waiting for users, which to the best of our knowledge has not been studied yet. |
71 | CommunityGAN: Community Detection with Generative Adversarial Nets | Yuting Jia, Qinqin Zhang, Weinan Zhang, Xinbing Wang | In this paper, we propose CommunityGAN, a novel community detection framework that jointly solves overlapping community detection and graph representation learning. |
72 | Semantic Text Matching for Long-Form Documents | Jyun-Yu Jiang, Mingyang Zhang, Cheng Li, Michael Bendersky, Nadav Golbandi, Marc Najork | In this paper, we propose a novel Siamese multi-depth attention-based hierarchical recurrent neural network (SMASH RNN) that learns the long-form semantics, and enables long-form document based semantic text matching. |
73 | RealGraph: A Graph Engine Leveraging the Power-Law Distribution of Real-World Graphs | Yong-Yeon Jo, Myung-Hwan Jang, Sang-Wook Kim, Sunju Park | In this paper, we propose RealGraph, a single-machine based graph engine equipped with the hierarchical indicator and the block-based workload allocation. |
74 | (Mis)Information Dissemination in WhatsApp: Gathering, Analyzing and Countermeasures | Gustavo Resende, Philipe Melo, Hugo Sousa, Johnnatan Messias, Marisa Vasconcelos, Jussara Almeida, Fabrício Benevenuto | In this work, we analyze information dissemination within WhatsApp, focusing on publicly accessible political-oriented groups, collecting all shared messages during major social events in Brazil: a national truck drivers’ strike and the Brazilian presidential campaign. |
75 | The Block Point Process Model for Continuous-time Event-based Dynamic Networks | Ruthwik Junuthula, Maysam Haghdan, Kevin S. Xu, Vijay Devabhaktuni | In this paper, we introduce a block point process model (BPPM) for continuous-time event-based dynamic networks. |
76 | Outguard: Detecting In-Browser Covert Cryptocurrency Mining in the Wild | Amin Kharraz, Zane Ma, Paul Murley, Charles Lever, Joshua Mason, Andrew Miller, Nikita Borisov, Manos Antonakakis, Michael Bailey | In this paper, we design, implement, and evaluate Outguard, an automated cryptojacking detection system. We construct a large ground-truth dataset, extract several features using an instrumented web browser, and ultimately select seven distinctive features that are used to build an SVM classification model. |
77 | From Small-scale to Large-scale Text Classification | Kang-Min Kim, Yeachan Kim, Jungho Lee, Ji-Min Lee, SangKeun Lee | In this paper, we propose a novel neural network-based multi-task learning framework for large-scale text classification. |
78 | Dual Neural Personalized Ranking | Seunghyeon Kim, Jongwuk Lee, Hyunjung Shim | In this paper, we propose dual neural personalized ranking (DualNPR), which fully exploits both user- and item-side pairwise rankings in a unified manner. |
79 | Studying Preferences and Concerns about Information Disclosure in Email Notifications | Yongsung Kim, Adam Fourney, Ece Kamar | We conclude by exploring machine learning for predicting people’s comfort levels, and we present implications for the design of future social-context aware notification systems. |
80 | RiSER: Learning Better Representations for Richly Structured Emails | Furkan Kocayusufoglu, Ying Sheng, Nguyen Vo, James Wendt, Qi Zhao, Sandeep Tata, Marc Najork | In this paper, we argue that the rich formatting used in business-to-consumer emails contains valuable information that can be used to learn better representations. |
81 | Reputation Deflation Through Dynamic Expertise Assessment in Online Labor Markets | Marios Kokkodis | This work proposes a data-driven approach that deflates reputation scores by solving the problems of reputation attribution and staticity. |
82 | Learning from On-Line User Feedback in Neural Question Answering on the Web | Bernhard Kratzwald, Stefan Feuerriegel | We thus address these challenges through a novel combination of neural question answering and a dynamic process based on distant supervision, asynchronous updates, and an automatic validation of feedback credibility in order to mine high-quality training samples from the web for the purpose of achieving continuous improvement over time. |
83 | Blockchain Mining Games with Pay Forward | Elias Koutsoupias, Philip Lazos, Foluso Ogunlana, Paolo Serafino | We propose that when adding a block, miners also have the ability to pay forward an amount to be collected by the first miner who successfully extends their branch, giving them the power to influence the incentives for mining. |
84 | ContraVis: Contrastive and Visual Topic Modeling for Comparing Document Collections | Tuan Le, Leman Akoglu | We introduce (to the best of our knowledge) the first contrastive and visual topic model, called ContraVis, that jointly addresses both problems: (1) contrastive topic modeling, and (2) contrastive visualization. |
85 | Classifying Extremely Short Texts by Exploiting Semantic Centroids in Word Mover’s Distance Space | Changchun Li, Jihong Ouyang, Ximing Li | To address this problem, we use a better regularized word mover’s distance (RWMD), which can measure distances among short texts at the semantic level. |
86 | Large Scale Semantic Indexing with Deep Level-wise Extreme Multi-label Learning | Dingcheng Li, Jingyuan Zhang, Ping Li | In this paper, in order to lessen the curse of dimensionality and enhance the training efficiency, we propose an approach named Deep Level-wise Extreme Multi-label Learning and Classification (Deep Level-wise XMLC), to facilitate the semantic indexing of literatures. |
87 | Current Flow Group Closeness Centrality for Complex Networks | Huan Li, Richard Peng, Liren Shan, Yuhao Yi, Zhongzhi Zhang | We show the NP-hardness of the problem, but propose two greedy algorithms to minimize the reciprocal of C(S). |
88 | Semi-Supervised Graph Classification: A Hierarchical Graph Perspective | Jia Li, Yu Rong, Hong Cheng, Helen Meng, Wenbing Huang, Junzhou Huang | In this work, we consider a more challenging but practically useful setting, in which a node itself is a graph instance. |
89 | Efficient Ridesharing Order Dispatching with Mean Field Multi-Agent Reinforcement Learning | Minne Li, Zhiwei Qin, Yan Jiao, Yaodong Yang, Jun Wang, Chenxi Wang, Guobin Wu, Jieping Ye | In this paper, we address the order dispatching problem using multi-agent reinforcement learning (MARL), which follows the distributed nature of the peer-to-peer ridesharing problem and possesses the ability to capture the stochastic demand-supply dynamics in large-scale ridesharing scenarios. |
90 | Exploiting Ratings, Reviews and Relationships for Item Recommendations in Topic Based Social Networks | Pengfei Li, Hua Lu, Gang Zheng, Qian Zheng, Long Yang, Gang Pan | In this paper, we propose TSNPF, a latent factor model to effectively capture user preferences and item features. |
91 | Persona-Aware Tips Generation | Piji Li, Zihao Wang, Lidong Bing, Wai Lam | In this paper, we investigate the task of tips generation by considering the “persona” information which captures the intrinsic language style of the users or the different characteristics of the product items. |
92 | Learning Travel Time Distributions with Deep Generative Model | Xiucheng Li, Gao Cong, Aixin Sun, Yun Cheng | In this paper, we develop a deep generative model – DeepGTT – to learn the travel time distribution for any route by conditioning on the real-time traffic. |
93 | Truth Inference at Scale: A Bayesian Model for Adjudicating Highly Redundant Crowd Annotations | Yuan Li, Benjamin I. P. Rubinstein, Trevor Cohn | We propose a novel technique, based on a Bayesian graphical model with conjugate priors, and simple iterative expectation-maximisation inference. |
94 | Unsupervised Semantic Generative Adversarial Networks for Expert Retrieval | Shangsong Liang | In this paper, we study the problem of expert retrieval in enterprise corpora: given a topic, also known as query containing a set of words, identify a rank list of candidate experts who have expertise on the topic. |
95 | Estimating the Total Volume of Queries to Google | Fabrizio Lillo, Salvatore Ruggieri | We study the problem of estimating the total volume of queries of a specific domain, which were submitted to the Google search engine in a given time period. |
96 | Forecasting U.S. Domestic Migration Using Internet Search Queries | Allen Yilun Lin, Justin Cranshaw, Scott Counts | We show that migration intent mined from internet search queries can forecast domestic migration and provide new insights beyond government data. |
97 | Learning Dual Retrieval Module for Semi-supervised Relation Extraction | Hongtao Lin, Jun Yan, Meng Qu, Xiang Ren | In this paper, we leverage a key insight that retrieving sentences expressing a relation is a dual task of predicting the relation label for a given sentence; the two tasks are complementary to each other and can be optimized jointly for mutual enhancement. |
98 | Distributed Algorithms for Fully Personalized PageRank on Large Graphs | Wenqing Lin | To address this problem, this paper presents a novel study on the computation of fully edge-weighted PPR on large graphs using the distributed computing framework. |
99 | Improving Outfit Recommendation with Co-supervision of Fashion Generation | Yujie Lin, Pengjie Ren, Zhumin Chen, Zhaochun Ren, Jun Ma, Maarten de Rijke | To address this problem, we propose a neural co-supervision learning framework, called the FAshion Recommendation Machine (FARM). |
100 | Learning to Generate Questions by Learning What not to Generate | Bang Liu, Mingjun Zhao, Di Niu, Kunfeng Lai, Yancheng He, Haojie Wei, Yu Xu | In this paper, we propose our Clue Guided Copy Network for Question Generation (CGC-QG), which is a sequence-to-sequence generative model with copying mechanism, yet employing a variety of novel components and techniques to boost the performance of question generation. |
101 | Feature Generation by Convolutional Neural Network for Click-Through Rate Prediction | Bin Liu, Ruiming Tang, Yingzhi Chen, Jinkai Yu, Huifeng Guo, Yuzhou Zhang | In this paper, we propose a novel Feature Generation by Convolutional Neural Network (FGCNN) model with two components: Feature Generation and Deep Classifier. |
102 | Efficient (α, β)-core Computation: an Index-based Approach | Boge Liu, Long Yuan, Xuemin Lin, Lu Qin, Wenjie Zhang, Jingren Zhou | In this paper, we present an efficient algorithm based on a novel index such that the algorithm runs in linear time regarding the result size (thus, the algorithm is optimal since it needs at least linear time to output the result). |
103 | Neural Variational Correlated Topic Modeling | Luyang Liu, Heyan Huang, Yang Gao, Yongfeng Zhang, Xiaochi Wei | In this paper, we propose a novel Centralized Transformation Flow to capture the correlations among topics by reshaping topic distributions. |
104 | A Hybrid BitFunnel and Partitioned Elias-Fano Inverted Index | Xinyu Liu, Zhaohua Zhang, Rebecca Stones, Yusen Li, Gang Wang, Xiaoguang Liu | We propose a hybrid method which uses both (a) the recently published mapping-matrix-style index BitFunnel (BF) for search efficiency, and (b) the state-of-the-art Partitioned Elias-Fano (PEF) inverted-index compression method. |
105 | How Do Your Neighbors Disclose Your Information: Social-Aware Time Series Imputation | Zongtao Liu, Yang Yang, Wei Huang, Zhongyi Tang, Ning Li, Fei Wu | In this paper, we study the social-aware time series imputation problem. |
106 | Autonomous Learning for Face Recognition in the Wild via Ambient Wireless Cues | Chris Xiaoxuan Lu, Xuan Kan, Bowen Du, Changhao Chen, Hongkai Wen, Andrew Markham, Niki Trigoni, John Stankovic | We propose a novel technique, AutoTune, which learns and refines the association between a face and wireless identifier over time, by increasing the inter-cluster separation and minimizing the intra-cluster distance. |
107 | Quality Effects on User Preferences and Behaviors in Mobile News Streaming | Hongyu Lu, Min Zhang, Weizhi Ma, Yunqiu Shao, Yiqun Liu, Shaoping Ma | Based on these quality effects we have discovered, we propose the Preference Behavior Quality (PBQ) probability model which incorporates the quality into traditional behavior-only implicit feedback. |
108 | What We Vote for? Answer Selection from User Expertise View in Community Question Answering | Shanshan Lyu, Wentao Ouyang, Yongqing Wang, Huawei Shen, Xueqi Cheng | In this paper, we formalize the answer selection problem from the user expertise view, considering both the semantic relevance in question-answer pair and user expertise in question-user pair. |
109 | Jointly Learning Explainable Rules for Recommendation with Knowledge Graph | Weizhi Ma, Min Zhang, Yue Cao, Woojeong Jin, Chenyang Wang, Yiqun Liu, Shaoping Ma, Xiang Ren | In this paper, we propose a novel joint learning framework to integrate induction of explainable rules from knowledge graph with construction of a rule-guided neural recommendation model. |
110 | Exploring Perceived Emotional Intelligence of Personality-Driven Virtual Agents in Handling User Challenges | Xiaojuan Ma, Emily Yang, Pascale Fung | In this paper, we propose to improve a VA’s perceived EI by equipping it with personality-driven responsive expression of emotions. |
111 | Moving Deep Learning into Web Browser: How Far Can We Go? | Yun Ma, Dongwei Xiang, Shuyu Zheng, Deyu Tian, Xuanzhe Liu | To bridge the knowledge gap, in this paper, we conduct the first empirical study of deep learning in browsers. |
112 | Exploring User Behavior in Email Re-Finding Tasks | Joel Mackenzie, Kshitiz Gupta, Fang Qiao, Ahmed Hassan Awadallah, Milad Shokouhi | In this work, we propose a novel framework that allows for experimentation with real email data. |
113 | Jointly Leveraging Intent and Interaction Signals to Predict User Satisfaction with Slate Recommendations | Rishabh Mehrotra, Mounia Lalmas, Doug Kenney, Thomas Lim-Meng, Golli Hashemian | In this work, we consider a complex recommendation scenario, called Slate Recommendation, wherein a user is presented with an ordered set of collections, called slates, in a specific page layout. |
114 | SaGe: Web Preemption for Public SPARQL Query Services | Thomas Minier, Hala Skaf-Molli, Pascal Molli | In this paper, we propose SaGe: a SPARQL query engine based on Web preemption. |
115 | Hack for Hire: Exploring the Emerging Market for Account Hijacking | Ariana Mirian, Joe DeBlasio, Stefan Savage, Geoffrey M. Voelker, Kurt Thomas | In this paper, we study a segment of targeted attackers known as “hack for hire” services to understand the playbook that attackers use to gain access to victim accounts. |
116 | Anomaly Detection in the Dynamics of Web and Social Networks Using Associative Memory | Volodymyr Miz, Benjamin Ricaud, Kirell Benzi, Pierre Vandergheynst | In this work, we propose a new, fast and scalable method for anomaly detection in large time-evolving graphs. |
117 | Towards Predicting a Realisation of an Information Need based on Brain Signals | Yashar Moshfeghi, Peter Triantafillou, Frank Pollick | We present two methods for predicting the realisation of an IN, i.e. Generalised method (GM) and Personalised method (PM). |
118 | GhostLink: Latent Network Inference for Influence-aware Recommendation | Subhabrata Mukherjee, Stephan Guennemann | Therefore, we propose GhostLink, an unsupervised probabilistic graphical model, to automatically learn the latent influence network underlying a review community – given only the temporal traces (timestamps) of users’ posts and their content. |
119 | Estimating Walk-Based Similarities Using Random Walk | Shogo Murai, Yuichi Yoshida | In this paper, we propose a random-walk reduction method that reduces the computation of any walk-based similarity to the computation of a stationary distribution of a random walk. |
120 | Sensitivity Analysis of Centralities on Unweighted Networks | Shogo Murai, Yuichi Yoshida | In this work, we compare centralities based on their sensitivity to modifications in the graph. |
121 | Modeling Heart Rate and Activity Data for Personalized Fitness Recommendation | Jianmo Ni, Larry Muhlstein, Julian McAuley | In this paper, we develop context-aware sequential models to capture the personalized and temporal patterns of fitness data. |
122 | Generating Product Descriptions from User Reviews | Slava Novgorodov, Ido Guy, Guy Elad, Kira Radinsky | In this work, we suggest to mitigate these issues by generating short crowd-based product descriptions from user reviews. |
123 | Google Dataset Search: Building a search engine for datasets in an open Web ecosystem | Dan Brickley, Matthew Burgess, Natasha Noy | In this paper, we discuss Google Dataset Search, a dataset-discovery tool that provides search capabilities over potentially all datasets published on the Web. |
124 | Dealing with Interdependencies and Uncertainty in Multi-Channel Advertising Campaigns Optimization | Alessandro Nuara, Nicola Sosio, Francesco Trovò, Maria Chiara Zaccardi, Nicola Gatti, Marcello Restelli | In this paper, we provide the first model capturing the sub-campaigns interdependence. |
125 | Reconciliation k-median: Clustering with Non-polarized Representatives | Bruno Ordozgoiti, Aristides Gionis | We propose a new variant of the k-median problem, where the objective function models not only the cost of assigning data points to cluster representatives, but also a penalty term for disagreement among the representatives. |
126 | ActiveLink: Deep Active Learning for Link Prediction in Knowledge Graphs | Natalia Ostapuk, Jie Yang, Philippe Cudre-Mauroux | In this work, we demonstrate that we can get the best of both worlds while drastically reducing the amount of data needed to train a deep network by leveraging active learning. |
127 | Choosing to Grow a Graph: Modeling Network Formation as Discrete Choice | Jan Overgoor, Austin Benson, Johan Ugander | We provide a framework for modeling social network formation through conditional multinomial logit models from discrete choice and random utility theory, in which each new edge is viewed as a “choice” made by a node to connect to another node, based on (generic) features of the other nodes available to make a connection. |
128 | Policy Gradients for Contextual Recommendations | Feiyang Pan, Qingpeng Cai, Pingzhong Tang, Fuzhen Zhuang, Qing He | In this work, we put forward Policy Gradients for Contextual Recommendations (PGCR) to solve the problem without those unrealistic assumptions. |
129 | Cookie Synchronization: Everything You Always Wanted to Know But Were Afraid to Ask | Panagiotis Papadopoulos, Nicolas Kourtellis, Evangelos Markatos | Through our study, we aim to understand the characteristics of the CSync protocol and the impact it has on web users’ privacy. |
130 | Adversarial Sampling and Training for Semi-Supervised Information Retrieval | Dae Hoon Park, Yi Chang | To solve the problems at the same time, we propose an adversarial sampling and training framework to learn ad-hoc retrieval models with implicit feedback. |
131 | Nameles: An intelligent system for Real-Time Filtering of Invalid Ad Traffic | Antonio Pastor, Matti Pärssinen, Patricia Callejo, Pelayo Vallina, Rubén Cuevas, Ángel Cuevas, Mikko Kotila, Arturo Azcorra | Our first contribution consists of providing evidence that shows how the Demand Side Platforms (DSPs), one of the most important intermediaries in the programmatic advertising supply chain, may be suffering from economic losses due to invalid ad traffic. |
132 | Learning How to Correct a Knowledge Base from the Edit History | Thomas Pellissier Tanon, Camille Bourgaux, Fabian Suchanek | In this work, we propose to take advantage of the edit history of the knowledge base in order to learn how to correct constraint violations. |
133 | Bootstrapping Domain-Specific Content Discovery on the Web | Kien Pham, Aecio Santos, Juliana Freire | In this paper, we propose DISCO, an approach designed to bootstrap domain-specific search. |
134 | Privacy-Preserving Crowd-Sourcing of Web Searches with Private Data Donor | Vincent Primault, Vasileios Lampos, Ingemar Cox, Emiliano De Cristofaro | Aiming to overcome these issues, this paper presents Private Data Donor (PDD), a decentralized and private-by-design platform providing crowd-sourced Web searches to researchers. |
135 | Community Detection through Likelihood Optimization: In Search of a Sound Model | Liudmila Prokhorenkova, Alexey Tikhonov | We provide an extensive theoretical and empirical analysis to compare several models: the widely used planted partition model, recently proposed degree-corrected modification of this model, and a new null model having some desirable statistical properties. |
136 | NetSMF: Large-Scale Network Embedding as Sparse Matrix Factorization | Jiezhong Qiu, Yuxiao Dong, Hao Ma, Jian Li, Chi Wang, Kuansan Wang, Jie Tang | In this work, we present the algorithm of large-scale network embedding as sparse matrix factorization (NetSMF). |
137 | Crowd-Mapping Urban Objects from Street-Level Imagery | Sihang Qiu, Achilleas Psyllidis, Alessandro Bozzon, Geert-Jan Houben | In this paper, a novel approach to crowd-mapping urban objects is proposed. |
138 | Web Experience in Mobile Networks: Lessons from Two Million Page Visits | Mohammad Rajiullah, Andra Lutu, Ali Safari Khatouni, Mah-Rukh Fida, Marco Mellia, Anna Brunstrom, Ozgu Alay, Stefan Alfredsson, Vincenzo Mancuso | Aiming at reproducibility, we present a large scale empirical study of web page performance collected in eleven commercial mobile networks spanning four countries. For this, we are releasing the dataset as open data for validation and further research. |
139 | A Dynamic Embedding Model of the Media Landscape | Jérémie Rappaz, Dylan Bourgeois, Karl Aberer | In this work, we present a dynamic embedding method that learns to capture the decision process of individual news sources in their selection of reported events while also enabling the systematic detection of large-scale transformations in the media landscape over prolonged periods of time. |
140 | Keyphrase Extraction from Disaster-related Tweets | Jishnu Ray Chowdhury, Cornelia Caragea, Doina Caragea | Instead of the F1-measure, we propose the use of embedding-based metrics to better capture the correctness of the predicted keyphrases. |
141 | Citation Needed: A Taxonomy and Algorithmic Assessment of Wikipedia’s Verifiability | Miriam Redi, Besnik Fetahu, Jonathan Morgan, Dario Taraborelli | In this paper, we aim to provide an empirical characterization of the reasons why and how Wikipedia cites external sources to comply with its own verifiability guidelines. |
142 | SWAT: Seamless Web Authentication Technology | Florentin Rochet, Kyriakos Efthymiadis, François Koeune, Olivier Pereira | We present the threat model against which our protocol is expected to live and discuss its security. |
143 | Before and After GDPR: The Changes in Third Party Presence at Public and Private European Websites | Jannick Sørensen, Sokol Kosta | Based on an eight-month longitudinal study from February to September 2018 of 1250 popular websites in Europe and the US, we present a mapping of the subtle shifts in the third party topology before and after May 25, 2018. |
144 | Multiple Treatment Effect Estimation using Deep Generative Model with Task Embedding | Shiv Kumar Saini, Sunny Dhamnani, Aakash, Akil Arif Ibrahim, Prithviraj Chavan | For evaluation, the model is compared against competitive baseline models on two semi-synthetic datasets created using the covariates from the real dataset. |
145 | A Human-in-the-loop Attribute Design Framework for Classification | Md Abdus Salam, Mary E. Koone, Saravanan Thirumuruganathan, Gautam Das, Senjuti Basu Roy | In this paper, we present a semi-automated, “human-in-the-loop” framework for attribute design that assists human analysts to transform raw attributes into effective derived attributes for classification problems. |
146 | How Representative Is a SPARQL Benchmark? An Analysis of RDF Triplestore Benchmarks | Muhammad Saleem, Gábor Szárnyas, Felix Conrads, Syed Ahmad Chan Bukhari, Qaiser Mehmood, Axel-Cyrille Ngonga Ngomo | With this fine-grained evaluation, we aim to support the design and implementation of more diverse benchmarks. |
147 | Self- and Cross-Excitation in Stack Exchange Question & Answer Communities | Tiago Santos, Simon Walk, Roman Kern, Markus Strohmaier, Denis Helic | In this paper, we quantify the impact of self- and cross-excitation on the temporal development of user activity in Stack Exchange Question & Answer (Q&A) communities. |
148 | Automatic Boolean Query Refinement for Systematic Review Literature Search | Harrisen Scells, Guido Zuccon, Bevan Koopman | In this paper, we propose automatic methods for Boolean query refinement in the context of systematic review literature retrieval with the aim of alleviating this high-recall, low-precision problem. |
149 | Exploiting Diversity in Android TLS Implementations for Mobile App Traffic Classification | Satadal Sengupta, Niloy Ganguly, Pradipta De, Sandip Chakraborty | In this paper, we propose a novel set of bit-sequence based features which exploit differences in randomness of data generated by different applications. |
150 | BaG: Behavior-aware Group Detection in Crowded Urban Spaces using WiFi Probes | Jiaxing Shen, Jiannong Cao, Xuefeng Liu | In this work, we propose a behavior-aware group detection system (BaG). |
151 | SWeG: Lossless and Lossy Summarization of Web-Scale Graphs | Kijung Shin, Amol Ghoting, Myunghwan Kim, Hema Raghavan | In this work, we propose SWeG, a fast parallel algorithm for summarizing graphs with compact representations. |
152 | Generative Graph Models based on Laplacian Spectra? | Alana Shine, David Kempe | We present techniques for generating random graphs whose Laplacian spectrum approximately matches that of a given input graph. |
153 | VACCINE: Using Contextual Integrity For Data Leakage Detection | Yan Shvartzshnaider, Zvonimir Pavlinovic, Ananth Balashankar, Thomas Wies, Lakshminarayanan Subramanian, Helen Nissenbaum, Prateek Mittal | We use the CI framework to abstract real-world communication exchanges into formally defined information flows where privacy policies describe sequences of admissible flows. |
154 | Shapley Meets Uniform: An Axiomatic Framework for Attribution in Online Advertising | Raghav Singal, Omar Besbes, Antoine Desir, Vineet Goyal, Garud Iyengar | The main contribution in this work is to develop an axiomatic framework for attribution in online advertising. |
155 | Urban Vibes and Rural Charms: Analysis of Geographic Diversity in Mobile Service Usage at National Scale | Rajkarn Singh, Marco Fiore, Mahesh Marina, Alberto Tarable, Alessandro Nordio | We derive our results through the analysis of substantial measurement data collected by a major mobile network operator, leveraging an approach rooted in information theory that can be readily applied to other scenarios. |
156 | Anything to Hide? Studying Minified and Obfuscated Code in the Web | Philippe Skolka, Cristian-Alexandru Staicu, Michael Pradel | This paper presents an empirical study of obfuscation and minification in 967,149 scripts (424,023 unique) from the top 100,000 websites. Our learned classifiers provide an automated and accurate way to identify obfuscated code, and we release a set of real-world obfuscated scripts for future research. |
157 | SciLens: Evaluating the Quality of Scientific News Articles Using Social Media and Scientific Literature Indicators | Panayiotis Smeros, Carlos Castillo, Karl Aberer | This paper describes, develops, and validates SciLens, a method to evaluate the quality of scientific news articles. |
158 | Using Variability as a Guiding Principle to Reduce Latency in Web Applications via OS Profiling | Amoghavarsha Suresh, Anshul Gandhi | In this paper, we explore an alternative approach to reducing latency – using variability as a guiding principle when designing web services. |
159 | Aggregating E-commerce Search Results from Heterogeneous Sources via Hierarchical Reinforcement Learning | Ryuichi Takanobu, Tao Zhuang, Minlie Huang, Jun Feng, Haihong Tang, Bo Zheng | In this paper, we investigate the task of aggregating search results from heterogeneous sources in an E-commerce environment. |
160 | Towards Neural Mixture Recommender for Long Range Dependent User Sequences | Jiaxi Tang, Francois Belletti, Sagar Jain, Minmin Chen, Alex Beutel, Can Xu, Ed H. Chi | In this paper we examine how to build a model that can make use of different temporal ranges and dynamics depending on the request context. |
161 | Leveraging Peer Communication to Enhance Crowdsourcing | Wei Tang, Ming Yin, Chien-Ju Ho | In this paper, we relax such independence property and explore the usage of peer communication – a kind of direct interaction between workers – in crowdsourcing. |
162 | Joint Modeling of Dense and Incomplete Trajectories for Citywide Traffic Volume Inference | Xianfeng Tang, Boqing Gong, Yanwei Yu, Huaxiu Yao, Yandong Li, Haiyong Xie, Xiaoyu Wang | In this paper, we propose a novel framework for the citywide traffic volume inference using both dense GPS trajectories and incomplete trajectories captured by camera surveillance systems. |
163 | Listening between the Lines: Learning Personal Attributes from Conversations | Anna Tigunova, Andrew Yates, Paramita Mirza, Gerhard Weikum | In this work we address the acquisition of such knowledge, for personalization in downstream Web applications, by extracting personal attributes from conversations. |
164 | Dynamic Deep Multi-modal Fusion for Image Privacy Prediction | Ashwini Tonge, Cornelia Caragea | In this paper, we propose an approach for fusing object, scene context, and image tags modalities derived from convolutional neural networks for accurately predicting the privacy of images shared online. |
165 | Signed Distance-based Deep Memory Recommender | Thanh Tran, Xinyue Liu, Kyumin Lee, Xiangnan Kong | To overcome this limitation, in this paper, we design and propose a deep learning framework called Signed Distance-based Deep Memory Recommender, which captures non-linear relationships between users and items explicitly and implicitly, and works well in both general recommendation and shopping basket-based recommendation tasks. |
166 | Rating Worker Skills and Task Strains in Collaborative Crowd Computing: A Competitive Perspective | George Trimponias, Xiaojuan Ma, Qiang Yang | In our work, we address this question by taking a competitive perspective and leveraging the vast prior work on competitive games. |
167 | Multimodal Review Generation for Recommender Systems | Quoc-Tuan Truong, Hady Lauw | Therefore, we propose Multimodal Review Generation (MRG), a neural approach that simultaneously models a rating prediction component and a review text generation component. |
168 | Revisiting Wedge Sampling for Triangle Counting | Ata Turk, Duru Turkoglu | In this paper we offer a mechanism to significantly improve wedge sampling for triangle counting. |
169 | RAQ: Relationship-Aware Graph Querying in Large Networks | Jithin Vachery, Akhil Arora, Sayan Ranu, Arnab Bhattacharya | In this paper, we propose RAQ – Relationship-Aware Graph Querying – to mitigate this gap. |
170 | BOLT-K: Bootstrapping Ontology Learning via Transfer of Knowledge | Nikhita Vedula, Pranav Maneriker, Srinivasan Parthasarathy | To this end, we propose a novel LSTM-based framework with attentive pooling, BOLT-K, to learn an ontology for a target subject or domain. |
171 | Learning Resolution Parameters for Graph Clustering | Nate Veldt, David Gleich, Anthony Wirth | To aid practitioners in determining the best clustering approach to use in different applications, we present new techniques for automatically learning how to set clustering resolution parameters. |
172 | Auditing Offline Data Brokers via Facebook’s Advertising Platform | Giridhari Venkatadri, Piotr Sapiezynski, Elissa M. Redmiles, Alan Mislove, Oana Goga, Michelle Mazurek, Krishna P. Gummadi | In this paper, we leverage the Facebook advertising system – and their partnership with six data brokers across seven countries – in order to gain insight into the extent and accuracy of data collection by data brokers today. |
173 | “Data Strikes”: Evaluating the Effectiveness of a New Form of Collective Action Against Technology Companies | Nicholas Vincent, Brent Hecht, Shilad Sen | Focusing on the important commercial domain of recommender systems, we simulate data strikes under a wide variety of conditions and explore how they can augment traditional boycotts. |
174 | Learning Semantic Models of Data Sources Using Probabilistic Graphical Models | Binh Vu, Craig Knoblock, Jay Pujara | In this paper, we present a novel approach that efficiently searches over the combinatorial space of possible semantic models, and applies a probabilistic graphical model to identify the most probable semantic model for a data source. |
175 | Generalists and Specialists: Using Community Embeddings to Quantify Activity Diversity in Online Platforms | Isaac Waller, Ashton Anderson | In this work, we propose a principled measure of how generalist or specialist a user is, and study behavior in online platforms through this lens. We develop sets of community analogies and use them to optimize our embeddings so that they encode community relationships extremely well. |
176 | A Family of Fuzzy Orthogonal Projection Models for Monolingual and Cross-lingual Hypernymy Prediction | Chengyu Wang, Yan Fan, Xiaofeng He, Aoying Zhou | In this work, we present a family of fuzzy orthogonal projection models for both monolingual and cross-lingual hypernymy prediction. |
177 | Modeling Item-Specific Temporal Dynamics of Repeat Consumption for Recommender Systems | Chenyang Wang, Min Zhang, Weizhi Ma, Yiqun Liu, Shaoping Ma | In this paper, we propose a novel unified model that introduces the Hawkes Process into Collaborative Filtering (CF). |
178 | Understanding the Evolution of Mobile App Ecosystems: A Longitudinal Measurement Study of Google Play | Haoyu Wang, Hao Li, Yao Guo | In this paper, we seek to shed light on the dynamics of mobile app ecosystems. |
179 | Multi-Task Feature Learning for Knowledge Graph Enhanced Recommendation | Hongwei Wang, Fuzheng Zhang, Miao Zhao, Wenjie Li, Xing Xie, Minyi Guo | In this paper, we consider knowledge graphs as the source of side information. |
180 | Towards Efficient Sharing: A Usage Balancing Mechanism for Bike Sharing Systems | Shuai Wang, Tian He, Desheng Zhang, Yunhuai Liu, Sang H. Son | In this paper, we analyze the bike usage status in three typical bikeshare systems based on 140-month multi-event data. |
181 | Heterogeneous Graph Attention Network | Xiao Wang, Houye Ji, Chuan Shi, Bai Wang, Yanfang Ye, Peng Cui, Philip S Yu | Heterogeneous Graph Attention Network |
182 | Aspect-level Sentiment Analysis using AS-Capsules | Yequan Wang, Aixin Sun, Minlie Huang, Xiaoyan Zhu | In this paper, we propose the aspect-level sentiment capsules model (AS-Capsules), which is capable of performing aspect detection and sentiment classification simultaneously, in a joint manner. |
183 | Quality-Sensitive Training! Social Advertisement Generation by Leveraging User Click Behavior | Yongzhen Wang, Heng Huang, Yuliang Yan, Xiaozhong Liu | In this paper, we put forward a novel seq2seq model to generate social advertisements automatically, in which a quality-sensitive loss function is proposed based on user click behavior to differentiate training samples of varied qualities. |
184 | Demographic Inference and Representative Population Estimates from Multilingual Social Media Data | Zijian Wang, Scott Hale, David Ifeoluwa Adelani, Przemyslaw Grabowicz, Timo Hartman, Fabian Flöck, David Jurgens | To correct for sampling biases, we propose fully interpretable multilevel regression methods that estimate inclusion probabilities from inferred joint population counts and ground-truth population counts. |
185 | Iterative Discriminant Tensor Factorization for Behavior Comparison in Massive Open Online Courses | Xidao Wen, Yu-Ru Lin, Xi Liu, Peter Brusilovsky, Jordan Barría Pineda | This work proposes a multi-level pattern discovery through hierarchical discriminative tensor factorization. |
186 | Dynamic Ensemble of Contextual Bandits to Satisfy Users’ Changing Interests | Qingyun Wu, Huazheng Wang, Yanen Li, Hongning Wang | In this work, we focus on contextual bandit algorithms for making adaptive recommendations. |
187 | Dual Graph Attention Networks for Deep Latent Representation of Multifaceted Social Effects in Recommender Systems | Qitian Wu, Hengrui Zhang, Xiaofeng Gao, Peng He, Paul Weng, Han Gao, Guihai Chen | To relax this strong assumption, in this paper, we propose dual graph attention networks to collaboratively learn representations for two-fold social effects, where one is modeled by a user-specific attention weight and the other is modeled by a dynamic and context-aware attention weight. |
188 | Grid-based Evaluation Metrics for Web Image Search | Xiaohui Xie, Jiaxin Mao, Yiqun Liu, Maarten de Rijke, Yunqiu Shao, Zixin Ye, Min Zhang, Shaoping Ma | Motivated by these observations, we propose corresponding user behavior assumptions to capture users’ search interaction processes and evaluate their search performance. |
189 | Sarcasm Detection with Self-matching Networks and Low-rank Bilinear Pooling | Tao Xiong, Peiran Zhang, Hongbo Zhu, Yihui Yang | In this work, we propose a novel self-matching network to capture sentence “incongruity” information by exploring word-to-word interactions. |
190 | A First Look at Deep Learning Apps on Smartphones | Mengwei Xu, Jiawei Liu, Yuanqiang Liu, Felix Xiaozhu Lin, Yunxin Liu, Xuanzhe Liu | To bridge the knowledge gap between research and practice, we present the first empirical study on 16,500 of the most popular Android apps, demystifying how smartphone apps exploit deep learning in the wild. |
191 | Constrained Local Graph Clustering by Colored Random Walk | Yaowei Yan, Yuchen Bian, Dongsheng Luo, Dongwon Lee, Xiang Zhang | In this paper, we propose a method to take advantage of such a relationship. |
192 | Revisiting User Mobility and Social Relationships in LBSNs: A Hypergraph Embedding Approach | Dingqi Yang, Bingqing Qu, Jie Yang, Philippe Cudre-Mauroux | In this paper, by revisiting user mobility and social relationships based on a large-scale LBSN dataset collected over a long-term period, we propose LBSN2Vec, a hypergraph embedding approach designed specifically for LBSN data for automatic feature learning. |
193 | Scalpel-CD: Leveraging Crowdsourcing and Deep Probabilistic Modeling for Debugging Noisy Training Data | Jie Yang, Alisa Smirnova, Dingqi Yang, Gianluca Demartini, Yuan Lu, Philippe Cudre-Mauroux | This paper presents Scalpel-CD, a first-of-its-kind system that leverages both human and machine intelligence to debug noisy labels from the training data of machine learning systems. |
194 | How Intention Informed Recommendations Modulate Choices: A Field Study of Spoken Word Content | Longqi Yang, Michael Sobolev, Yu Wang, Jenny Chen, Drew Dunne, Christina Tsangouri, Nicola Dell, Mor Naaman, Deborah Estrin | The study was conducted in the context of spoken word web content (podcasts) which is often consumed through subscription sites or apps. |
195 | Learning from Multiple Cities: A Meta-Learning Approach for Spatial-Temporal Prediction | Huaxiu Yao, Yiding Liu, Ying Wei, Xianfeng Tang, Zhenhui Li | In this paper, we tackle the problem of spatial-temporal prediction for the cities with only a short period of data collection. |
196 | STFNets: Learning Sensing Signals from the Time-Frequency Perspective with Short-Time Fourier Neural Networks | Shuochao Yao, Ailing Piao, Wenjun Jiang, Yiran Zhao, Huajie Shao, Shengzhong Liu, Dongxin Liu, Jinyang Li, Tianshi Wang, | Hence, in this paper, instead of using conventional building blocks (e.g., convolutional and recurrent layers), we propose a new foundational neural network building block, the Short-Time Fourier Neural Network (STFNet). |
197 | CoaCor: Code Annotation for Code Retrieval with Reinforcement Learning | Ziyu Yao, Jayavardhan Reddy Peddamail, Huan Sun | In this work, we investigate a novel perspective of Code annotation for Code retrieval (hence called “CoaCor”), where a code annotation model is trained to generate a natural language annotation that can represent the semantic meaning of a given code snippet and can be leveraged by a code retrieval model to better distinguish relevant code snippets from others. |
198 | Snapshot-based Loading Acceleration of Web Apps with Nondeterministic JavaScript Execution | Jihwan Yeo, Changhyun Shin, Soo-Mook Moon | In this paper, we perform an empirical study for the nondeterministic behavior of web apps during app loading. |
199 | Doppelgängers on the Dark Web: A Large-scale Assessment on Phishing Hidden Web Services | Changhoon Yoon, Kwanwoo Kim, Yongdae Kim, Seungwon Shin, Sooel Son | We conducted an in-depth measurement study to demystify the prevalent phishing websites on the Dark Web. |
200 | Hierarchical Temporal Convolutional Networks for Dynamic Recommender Systems | Jiaxuan You, Yichen Wang, Aditya Pal, Pong Eksombatchai, Chuck Rosenburg, Jure Leskovec | Here we propose Hierarchical Temporal Convolutional Networks (HierTCN), a hierarchical deep learning architecture that makes dynamic recommendations based on users’ sequential multi-session interactions with items. |
201 | Spectrum-enhanced Pairwise Learning to Rank | Wenhui Yu, Zheng Qin | To address these gaps, we introduce the spectral features extracted from two hypergraph structures of the purchase records. |
202 | Beyond Shortest Paths: Route Recommendations for Ride-sharing | Chak Fai Yuen, Abhishek Pratap Singh, Sagar Goyal, Sayan Ranu, Amitabha Bagchi | In this paper, we ask: Is the shortest path the optimal path for ride-sharing? |
203 | Detecting Low Self-Esteem in Youths from Web Search Data | Anis Zaman, Rupam Acharyya, Henry Kautz, Vincent Silenzio | We target college students, a population prone to depression, anxiety, and low self-esteem, and ask them to take a mental health assessment survey and share their individual search history. |
204 | The Matthew Effect in Computation Contests: High Difficulty May Lead to 51% Dominance? | Yulong Zeng, Song Zuo | We study the computation contests where players compete for searching a solution to a given problem with a winner-take-all reward. |
205 | Judging a Book by Its Cover: The Effect of Facial Perception on Centrality in Social Networks | Dongyu Zhang, Teng Guo, Hanxiao Pan, Jie Hou, Zhitao Feng, Liang Yang, Hongfei Lin, Feng Xia | In this paper, we examine whether perceived traits based on facial appearance affect network centrality by exploring the initial stage of social network formation in a first-year college residential area. We then collected facial perception data by requiring other participants to rate facial images for three main attributions: dominance, trustworthiness, and attractiveness. |
206 | Pruning based Distance Sketches with Provable Guarantees on Random Graphs | Hongyang Zhang, Huacheng Yu, Ashish Goel | In this work, we present a preprocessing algorithm that is able to create landmark based distance sketches efficiently, with strong theoretical guarantees. |
207 | Large-Scale Talent Flow Forecast with Dynamic Latent Factor Model | Le Zhang, Hengshu Zhu, Tong Xu, Chen Zhu, Chuan Qin, Hui Xiong, Enhong Chen | To this end, in this paper, we aim to introduce a big data-driven approach for predictive talent flow analysis. |
208 | From Stances’ Imbalance to Their Hierarchical Representation and Detection | Qiang Zhang, Shangsong Liang, Aldo Lipani, Zhaochun Ren, Emine Yilmaz | In this paper, we address this problem by proposing a hierarchical representation of these classes, which combines the agree, disagree, and discuss classes under a new related class. |
209 | Reply-Aided Detection of Misinformation via Bayesian Deep Learning | Qiang Zhang, Aldo Lipani, Shangsong Liang, Emine Yilmaz | We address this problem by proposing a Bayesian deep learning model. |
210 | Multilevel Network Alignment | Si Zhang, Hanghang Tong, Ross Maciejewski, Tina Eliassi-Rad | In this paper, we propose a multilevel network alignment algorithm (Moana) which consists of three key steps. |
211 | Automatic Generation of Pattern-controlled Product Description in E-commerce | Tao Zhang, Jin Zhang, Chengfu Huo, Weijun Ren | To address this issue, we propose a novel pointer-generator neural network to generate product description. |
212 | Iteratively Learning Embeddings and Rules for Knowledge Graph Reasoning | Wen Zhang, Bibek Paudel, Liang Wang, Jiaoyan Chen, Hai Zhu, Wei Zhang, Abraham Bernstein, Huajun Chen | Based on this observation, in this paper we explore how embedding and rule learning can be combined together and complement each other’s difficulties with their advantages. |
213 | Language in Our Time: An Empirical Analysis of Hashtags | Yang Zhang | In the end, we propose a bipartite graph embedding model to summarize users’ hashtag profiles, and rely on these profiles to perform friendship prediction. |
214 | Neural IR Meets Graph Embedding: A Ranking Model for Product Search | Yuan Zhang, Dong Wang, Yan Zhang | In this paper, we leverage the recent advances in graph embedding techniques to enable neural retrieval models to exploit graph-structured data for automatic feature extraction. |
215 | Neural Multimodal Belief Tracker with Adaptive Attention for Dialogue Systems | Zheng Zhang, Lizi Liao, Minlie Huang, Xiaoyan Zhu, Tat-Seng Chua | In this paper, we present the first neural multimodal belief tracker (NMBT) to demonstrate how multimodal evidence can facilitate semantic understanding and dialogue state tracking. |
216 | Auto-EM: End-to-end Fuzzy Entity-Matching using Pre-trained Deep Models and Transfer Learning | Chen Zhao, Yeye He | We in this work take a different tack, proposing a transfer-learning approach to EM, leveraging pre-trained EM models from large-scale, production knowledge bases (KB). |
217 | Review Response Generation in E-Commerce Platforms with External Product Information | Lujun Zhao, Kaisong Song, Changlong Sun, Qi Zhang, Xuanjing Huang, Xiaozhong Liu | In this study, we propose a novel deep neural network model based on the Seq2Seq framework for the review response generation task in e-commerce platforms, which can incorporate product information by a gated multi-source attention mechanism and a copy mechanism. To evaluate the proposed model, we constructed a large-scale dataset from a popular e-commerce website, which contains product information. |
218 | CBHE: Corner-based Building Height Estimation for Complex Street Scene Images | Yunxiang Zhao, Jianzhong Qi, Rui Zhang | We propose CBHE, a building height estimation algorithm considering both building corners and rooflines. |
219 | Domain-Constrained Advertising Keyword Generation | Hao Zhou, Minlie Huang, Yishun Mao, Changlei Zhu, Peng Shu, Xiaoyan Zhu | To address the above issues, this work investigates the use of generative neural networks for keyword generation in sponsored search. |
220 | Predicting ConceptNet Path Quality Using Crowdsourced Assessments of Naturalness | Yilun Zhou, Steven Schockaert, Julie Shah | In this paper we instead propose to learn to predict path quality from crowdsourced human assessments. |
221 | A Hierarchical Attention Retrieval Model for Healthcare Question Answering | Ming Zhu, Aman Ahuja, Wei Wei, Chandan K. Reddy | In this paper, we propose a neural network model for ranking documents for question answering in the healthcare domain. We also construct a new large-scale healthcare question-answering dataset, which we use to evaluate our model. |
222 | ShadowBlock: A Lightweight and Stealthy Adblocking Browser | Shitong Zhu, Umar Iqbal, Zhongjie Wang, Zhiyun Qian, Zubair Shafiq, Weiteng Chen | In this work we propose ShadowBlock, a new Chromium-based adblocking browser that can hide traces of adblocking activities from anti-adblockers as it removes ads from web pages. |
223 | GraphVite: A High-Performance CPU-GPU Hybrid System for Node Embedding | Zhaocheng Zhu, Shizhen Xu, Jian Tang, Meng Qu | In this paper, we propose GraphVite, a high-performance CPU-GPU hybrid system for training node embeddings, by co-optimizing the algorithm and the system. |
224 | Transfer Learning for Unsupervised Influenza-like Illness Models from Online Search Data | Bin Zou, Vasileios Lampos, Ingemar Cox | The vast majority of previous work proposes solutions that are based on supervised learning paradigms, in which historical disease rates are required for training a model. |
225 | Tortoise or Hare? Quantifying the Effects of Performance on Mobile App Retention | Agustin Zuniga, Huber Flores, Eemil Lagerspetz, Petteri Nurmi, Sasu Tarkoma, Pan Hui, Jukka Manner | As our second contribution, we develop a model for predicting retention based on performance metrics. |
226 | BotCamp: Bot-driven Interactions in Social Campaigns | Noor Abu-El-Rub, Abdullah Mueen | In this work, we detect a large number of bots interested in politics. |
227 | City-Wide Signal Strength Maps: Prediction with Random Forests | Emmanouil Alimpertis, Athina Markopoulou, Carter Butts, Konstantinos Psounis | In this paper, we develop a prediction framework based on random forests to improve signal strength maps from limited measurements. |
228 | Is Yelp Actually Cleaning Up the Restaurant Industry? A Re-Analysis on the Relative Usefulness of Consumer Reviews | Kristen M. Altenburger, Daniel E. Ho | We show that extreme imbalanced sampling is responsible for claims about the power of Yelp information in the original classification setup. |
229 | Bi-LSTM-CRF Sequence Labeling for Keyphrase Extraction from Scholarly Documents | Rabah Alzaidy, Cornelia Caragea, C. Lee Giles | In this paper, we address the keyphrase extraction problem as sequence labeling and propose a model that jointly exploits the complementary strengths of Conditional Random Fields that capture label dependencies through a transition parameter matrix consisting of the transition probabilities from one label to the neighboring label, and Bidirectional Long Short Term Memory networks that capture hidden semantics in text through the long distance dependencies. |
230 | Longitudinal Adversarial Attack on Electronic Health Records Data | Sungtae An, Cao Xiao, Walter F. Stewart, Jimeng Sun | We propose Longitudinal AdVersarial Attack (LAVA), a saliency score based adversarial example method that requires a minimal number of perturbations and that automatically minimizes the likelihood of detection. |
231 | A Graph is Worth a Thousand Words: Telling Event Stories using Timeline Summarization Graphs | Jeffery Ansah, Lin Liu, Wei Kang, Selasie Kwashie, Jixue Li, Jiuyong Li | In this paper, we propose StoryGraph, a novel graph timeline summarization structure that is capable of identifying the different themes of a story. |
232 | Firsthand Opiates Abuse on Social Media: Monitoring Geospatial Patterns of Interest Through a Digital Cohort | Duilio Balsamo, Paolo Bajardi, André Panisson | In this paper we present a computational approach to identify a digital cohort that might provide an updated and complementary view on the opioid crisis. |
233 | Learn2Clean: Optimizing the Sequence of Tasks for Web Data Preparation | Laure Berti-Equille | In this paper, we propose Learn2Clean, a method based on Q-Learning, a model-free reinforcement learning technique that selects, for a given dataset, a ML model, and a quality performance metric, the optimal sequence of tasks for preprocessing the data such that the quality of the ML model result is maximized. |
234 | Global Vectors for Node Representations | Robin Brochier, Adrien Guille, Julien Velcin | In this paper, we propose a matrix factorization approach for network embedding, inspired by GloVe, that better handles non co-occurrence with a competitive time-complexity. |
235 | The Music Streaming Sessions Dataset | Brian Brost, Rishabh Mehrotra, Tristan Jehan | This dataset enables research on important problems including how to model user listening and interaction behaviour in streaming, as well as Music Information Retrieval (MIR), and session-based sequential recommendations. In order to spur that research, we release the Music Streaming Sessions Dataset (MSSD), which consists of 160 million listening sessions and associated user actions. |
236 | Rethinking the Detection of Child Sexual Abuse Imagery on the Internet | Elie Bursztein, Einat Clarke, Michelle DeLaune, David M. Eliff, Nick Hsu, Lindsey Olson, John Shehan, Madhukar Thakur, Kurt Thomas, | In this paper, we present the first longitudinal measurement study of CSAI distribution online and the threat it poses to society’s ability to combat child sexual abuse. |
237 | The Illusion of Change: Correcting for Biases in Change Inference for Sparse, Societal-Scale Data | Gabriel Cadamuro, Ramya Korlakai Vinayak, Joshua Blumenstock, Sham Kakade, Jacob Shapiro | We propose a plug-in correction that can be applied to any estimator, including several recently proposed procedures. |
238 | Rating Augmentation with Generative Adversarial Networks towards Accurate Collaborative Filtering | Dong-Kyu Chae, Jin-Soo Kang, Sang-Wook Kim, Jaeho Choi | In this paper, we propose a Rating Augmentation framework with GAN, named RAGAN, aiming to alleviate the data sparsity problem in collaborative filtering (CF), eventually improving recommendation accuracy significantly. |
239 | On the Impact of Choice Architectures on Inequality in Online Donation Platforms | Abhijnan Chakraborty, Nuno Mota, Asia J. Biega, Krishna P. Gummadi, Hoda Heidari | In this paper, we focus on (i) quantifying inequality in the project funding in online donation platforms, and (ii) understanding the impact of platform design on donors’ behavior in magnifying those inequalities. |
240 | Multi-Domain Gated CNN for Review Helpfulness Prediction | Cen Chen, Minghui Qiu, Yinfei Yang, Jun Zhou, Jun Huang, Xiaolong Li, Forrest Sheng Bao | Presenting the most helpful reviews, instead of all, to them will greatly ease purchase decision making. |
241 | Collaborative Similarity Embedding for Recommender Systems | Chih-Ming Chen, Chuan-Ju Wang, Ming-Feng Tsai, Yi-Hsuan Yang | We present collaborative similarity embedding (CSE), a unified framework that exploits comprehensive collaborative relations available in a user-item bipartite graph for representation learning and recommendation. |
242 | Sampled in Pairs and Driven by Text: A New Graph Embedding Framework | Liheng Chen, Yanru Qu, Zhenghui Wang, Lin Qiu, Weinan Zhang, Ken Chen, Shaodian Zhang, Yong Yu | To solve these problems, we propose a novel framework, namely Text-driven Graph Embedding with Pairs Sampling (TGE-PS). |
243 | BoFGAN: Towards A New Structure of Backward-or-Forward Generative Adversarial Nets | M.K.Sophie Chen, Xinyi Lin, Chen Wei, Rui Yan | In this paper, we propose a Backward-or-Forward Generative Adversarial Nets model (BoFGAN) to address this problem. |
244 | Outage Prediction and Diagnosis for Cloud Service Systems | Yujun Chen, Xian Yang, Qingwei Lin, Hongyu Zhang, Feng Gao, Zhangwei Xu, Yingnong Dang, Dongmei Zhang, Hang Dong, | To minimize service downtime and ensure high system availability, we develop an intelligent outage management approach, called AirAlert, which can forecast the occurrence of outages before they actually happen and diagnose the root cause after they indeed occur. |
245 | What Makes a Good Team? A Large-scale Study on the Effect of Team Composition in Honor of Kings | Ziqiang Cheng, Yang Yang, Chenhao Tan, Denny Cheng, Alex Cheng, Yueting Zhuang | In this paper, we present a large-scale study on the effect of team composition on multiple measures of team effectiveness. |
246 | TiFi: Taxonomy Induction for Fictional Domains | Cuong Xuan Chu, Simon Razniewski, Gerhard Weikum | In this paper we focus on the construction of taxonomies for fictional domains, using noisy category systems from fan wikis or text extraction as input. |
247 | Deriving User- and Content-specific Rewards for Contextual Bandits | Paolo Dragone, Rishabh Mehrotra, Mounia Lalmas | We explore alternative methods to provide a more informed reward function, based on the assumptions that streaming time distribution heavily depends on the type of user and the type of content being streamed. |
248 | Pcard: Personalized Restaurants Recommendation from Card Payment Transaction Records | Min Du, Robert Christensen, Wei Zhang, Feifei Li | Personalized Point of Interest (POI) recommendation that incorporates users’ personal preferences is an important subject of research. |
249 | Improving Multiclass Classification in Crowdsourcing by Using Hierarchical Schemes | Xiaoni Duan, Keishi Tajima | In this paper, we propose a method of improving accuracy of multiclass classification tasks in crowdsourcing. |
250 | Modeling the Factors of User Success in Online Debate | Esin Durmus, Claire Cardie | In this work, we aim to better understand the mechanisms behind success in online debates. |
251 | HopRank: How Semantic Structure Influences Teleportation in PageRank (A Case Study on BioPortal) | Lisette Espín-Noboa, Florian Lemmerich, Simon Walk, Markus Strohmaier, Mark Musen | This paper introduces HopRank, an algorithm for modeling human navigation on semantic networks. |
252 | Product-Aware Helpfulness Prediction of Online Reviews | Miao Fan, Chao Feng, Lin Guo, Mingming Sun, Ping Li | Hence, in this paper we propose an end-to-end deep neural architecture directly fed by both the metadata of a product and the raw text of its reviews to acquire product-aware review representations for helpfulness prediction. We also construct two large-scale datasets which are a portion of the real-world web data in Amazon and Yelp, respectively, to train and test our approach. |
253 | The World Wants Mangoes and Kangaroos: A Study of New Emoji Requests Based on Thirty Million Tweets | Yunhe Feng, Wenjun Zhou, Zheng Lu, Zhibo Wang, Qing Cao | In this study, we collected more than thirty million tweets mentioning the word “emoji” in a one-year period to study emoji requests on Twitter. |
254 | Online Learning for Measuring Incentive Compatibility in Ad Auctions | Zhe Feng, Okke Schrijvers, Eric Sodomka | In this paper we investigate the problem of measuring end-to-end Incentive Compatibility (IC) regret given black-box access to an auction mechanism. |
255 | TableNet: An Approach for Determining Fine-grained Relations for Wikipedia Tables | Besnik Fetahu, Avishek Anand, Maria Koutraki | We propose TableNet, an approach for interlinking tables with subPartOf and equivalent relations. |
256 | Learning Graph Pooling and Hybrid Convolutional Operations for Text Representations | Hongyang Gao, Yongjun Chen, Shuiwang Ji | In this work, we propose the graph pooling (gPool) layer, which employs a trainable projection vector to measure the importance of nodes in graphs. |
257 | Predicting Human Mobility via Variational Attention | Qiang Gao, Fan Zhou, Goce Trajcevski, Kunpeng Zhang, Ting Zhong, Fengli Zhang | Motivated by recent success of deep variational inference, we propose VANext (Variational Attention based Next) POI prediction: a latent variable model for inferring user’s next footprint, with historical mobility attention. |
258 | Maximizing Marginal Utility per Dollar for Economic Recommendation | Yingqiang Ge, Shuyuan Xu, Shuchang Liu, Shijie Geng, Zuohui Fu, Yongfeng Zhang | Motivated by the first consideration, in this paper, we propose a learning algorithm to maximize marginal utility per dollar for recommendations. |
259 | The few-get-richer: a surprising consequence of popularity-based rankings | Fabrizio Germano, Vicenç Gómez, Gaël Le Mens | In this paper, we identify a surprising consequence of popularity-based rankings: the fewer the items reporting a given signal, the higher the share of the overall traffic they collectively attract. |
260 | Context-Sensitive Malicious Spelling Error Correction | Hongyu Gong, Yuchen Li, Suma Bhat, Pramod Viswanath | In this paper, we focus on malicious spelling correction, which requires an approach that relies on the context and the surface forms of targeted keywords. |
261 | With a Little Help from My Friends (and Their Friends): Influence Neighborhoods for Social Recommendations | Avni Gulati, Magdalini Eirinaki | This has been achieved in various ways, and under different assumptions about the network characteristics, structure, and availability of other information (such as trust, content, etc.). In this work, we create neighborhoods of influence leveraging only the social graph structure. |
262 | Personalized Online Spell Correction for Personal Search | Jai Gupta, Zhen Qin, Michael Bendersky, Donald Metzler | In this work, we propose a simple and effective personalized spell correction solution that augments existing global solutions for search over private corpora. |
263 | Inferring Search Queries from Web Documents via a Graph-Augmented Sequence to Attention Network | Fred.X Han, Di Niu, Kunfeng Lai, Weidong Guo, Yancheng He, Yu Xu | Toward this end, we propose a novel generative model called the Graph-augmented Sequence to Attention (G-S2A) network. |
264 | Learning Novelty-Aware Ranking of Answers to Complex Questions | Shahar Harel, Sefi Albo, Eugene Agichtein, Kira Radinsky | We present a new method, DRN, which learns novelty-related features from unlabeled data with minimal social signals, to emphasize diversity in ranking. |
265 | Spatio-Temporal Capsule-based Reinforcement Learning for Mobility-on-Demand Network Coordination | Suining He, Kang G. Shin | To meet this need effectively, we propose STRide, an MOD coordination-learning mechanism reinforced spatio-temporally with capsules. |
266 | An Investigation of Cyber Autonomy on Government Websites | Hsu-Chun Hsiao, Tiffany Hyun-Jin Kim, Yu-Ming Ku, Chun-Ming Chang, Hung-Fang Chen, Yu-Jen Chen, Chun-Wen Wang, Wei Jeng | By reviewing policy documents and surveying technicians who maintain government websites, we identify four significant forces that can influence the degree of a government’s autonomy, including government mandates on HTTPS adoption, website development outsourcing, the citizens’ fear of large-scale surveillance, and user confusion. |
267 | Transfer Meets Hybrid: A Synthetic Approach for Cross-Domain Collaborative Filtering with Text | Guangneng Hu, Yu Zhang, Qiang Yang | To achieve this, we propose a Transfer Meeting Hybrid (TMH) model for cross-domain recommendation with unstructured text. |
268 | Unbiased LambdaMART: An Unbiased Pairwise Learning-to-Rank Algorithm | Ziniu Hu, Yang Wang, Qu Peng, Hang Li | In this paper, we propose a novel framework to accomplish the goal and apply this framework to the state-of-the-art pairwise learning-to-rank algorithm, LambdaMART. |
269 | Domain-aware Neural Model for Sequence Labeling using Joint Learning | Huang Heng, Liu Xiaozhong, Yuliang Yan | In this paper, we propose an innovative joint learning neural network which can encapsulate the global domain knowledge and the local sentence/token information to enhance the sequence labeling model. |
270 | A Multimodal Text Matching Model for Obfuscated Language Identification in Adversarial Communication | Longtao Huang, Ting Ma, Junyu Lin, Jizhong Han, Songlin Hu | We propose a multimodal text matching model that combines textual and visual features. |
271 | The Chain of Implicit Trust: An Analysis of the Web Third-party Resources Loading | Muhammad Ikram, Rahat Masood, Gareth Tyson, Mohamed Ali Kaafar, Noha Loizon, Roya Ensafi | This paper performs a large-scale study of dependency chains in the Web, to find that around 50% of first-party websites render content that they did not directly load. |
272 | Efficient Interaction-based Neural Ranking with Locality Sensitive Hashing | Shiyu Ji, Jinjin Shao, Tao Yang | This paper presents the design choices with cost analysis, and an evaluation that assesses efficiency benefits and relevance tradeoffs for the tested datasets. |
273 | Triple Trustworthiness Measurement for Knowledge Graph | Shengbin Jia, Yang Xiang, Xiaojun Chen, Kun Wang, Shijia | In this paper, we establish a knowledge graph triple trustworthiness measurement model that quantifies the semantic correctness of triples and the degree of truth of the facts they express. |
274 | A Tree-Structured Neural Network Model for Household Energy Breakdown | Yiling Jia, Nipun Batra, Hongning Wang, Kamin Whitehouse | In this paper, we propose a TreeCNN model for energy breakdown on low frequency data. |
275 | Improving Neural Response Diversity with Frequency-Aware Cross-Entropy Loss | Shaojie Jiang, Pengjie Ren, Christof Monz, Maarten de Rijke | In this paper, we address the low-diversity problem by investigating its connection with model over-confidence reflected in predicted distributions. |
276 | A Novel Generative Topic Embedding Model by Introducing Network Communities | Di Jin, Jiantao Huang, Pengfei Jiao, Liang Yang, Dongxiao He, Françoise Soulie-Fogelman, Yuxiao Huang | In this paper, we utilize community structure to solve these problems. |
277 | A Scalable Hybrid Research Paper Recommender System for Microsoft Academic | Anshul Kanakia, Zhihong Shen, Darrin Eide, Kuansan Wang | We present the design and methodology for the large scale hybrid paper recommender system used by Microsoft Academic. |
278 | Topic Structure-Aware Neural Language Model: Unified language model that maintains word and topic ordering by their embedded representations | Noriaki Kawamae | Because topic models can be shared among, and indeed complement, embedding models and neural language models, we propose Word and topic 2 vec (Wat2vec) and the Topic Structure-Aware Neural Language Model (TSANL). |
279 | Fairness in Algorithmic Decision Making: An Excursion Through the Lens of Causality | Aria Khademi, Sanghack Lee, David Foley, Vasant Honavar | We introduce two definitions of group fairness grounded in causality: fair on average causal effect (FACE), and fair on average causal effect on the treated (FACT). |
280 | MVAE: Multimodal Variational Autoencoder for Fake News Detection | Dhruv Khattar, Jaipal Singh Goud, Manish Gupta, Vasudeva Varma | We propose an end-to-end network, Multimodal Variational Autoencoder (MVAE), which uses a bimodal variational autoencoder coupled with a binary classifier for the task of fake news detection. |
281 | Mobile App Risk Ranking via Exclusive Sparse Coding | Deguang Kong, Lei Cen | We propose an efficient iterative re-weighted method to solve the resultant optimization problem, the convergence of which can be rigorously proved. |
282 | A Neural Bag-of-Words Modelling Framework for Link Prediction in Knowledge Bases with Sparse Connectivity | Fanshuang Kong, Richong Zhang, Hongyu Guo, Samuel Mensah, Zhiyuan Hu, Yongyi Mao | In this paper, we present a simple and efficient model that can attain these two goals. |
283 | FARE: Diagnostics for Fair Ranking using Pairwise Error Metrics | Caitlin Kuhlman, MaryAnn VanValkenburg, Elke Rundensteiner | Therefore, in this work we propose to broaden the scope of fairness assessment to include error-based fairness criteria for rankings. |
284 | Adversarial Adaptation of Scene Graph Models for Understanding Civic Issues | Shanu Kumar, Shubham Atreja, Anjali Singh, Mohit Jain | In this work, given an image, we propose to generate a Civic Issue Graph consisting of a set of objects and the semantic relations between them, which are representative of the underlying civic issue. We also release two multi-modal (text and images) datasets, that can help in further analysis of civic issues from images. |
285 | Redesigning Bitcoin’s fee market | Ron Lavi, Or Sattath, Aviv Zohar | To decouple them, we analyze the “monopolistic auction” [8], showing: (i) its revenue does not decrease as the maximal block size increases, (ii) it is resilient to an untrusted auctioneer (the miner), and (iii) simplicity for transaction issuers (bidders), as the average gain from strategic bid shading (relative to bidding one’s true maximal willingness to pay) diminishes as the number of bids increases. |
286 | Measuring Political Personalization of Google News Search | Huyen Le, Raven Maragh, Brian Ekdale, Andrew High, Timothy Havens, Zubair Shafiq | In this paper, we investigate whether web search results are personalized based on a user’s browsing history, which can be inferred by search engines via third-party tracking. |
287 | TiSSA: A Time Slice Self-Attention Approach for Modeling Sequential User Behaviors | Chenyi Lei, Shouling Ji, Zhao Li | In this paper, we propose to integrate a novel Time Slice Self-Attention (TiSSA) mechanism into RNNs for better modeling sequential user behaviors, which utilizes the time-interval-based gated recurrent units to exploit the temporal dimension when encoding user actions, and has a specially designed time slice hierarchical self-attention function to characterize both local and global dependency of user actions, while the final context-aware user representations can be used for downstream applications. |
288 | Search Mindsets: Understanding Focused and Non-Focused Information Seeking in Music Search | Ang Li, Jennifer Thom, Praveen Chandar, Christine Hosey, Brian St. Thomas, Jean Garcia-Gathright | We propose that users who engage in domain-specific search (e.g., music search) have different information-seeking needs than in general search. |
289 | Click Feedback-Aware Query Recommendation Using Adversarial Examples | Ruirui Li, Liangda Li, Xian Wu, Yunhong Zhou, Wei Wang | In this work, we propose Click Feedback-Aware Network (CFAN) to provide feedback-aware query suggestions. |
290 | Learning Fast Matching Models from Weak Annotations | Xue Li, Zhipeng Luo, Hao Sun, Jianjin Zhang, Weihao Han, Xianqi Chu, Liangjie Zhang, Qi Zhang | We propose a novel training scheme for fast matching models in Search Ads, motivated by practical challenges. |
291 | Multistream Classification for Cyber Threat Data with Heterogeneous Feature Space | Yi-Fan Li, Yang Gao, Gbadebo Ayoade, Hemeng Tao, Latifur Khan, Bhavani Thuraisingham | We propose a framework of multistream classification by using projected data from a common latent feature space, which is embedded from both source and target domains. |
292 | Predicting pregnancy using large-scale data from a women’s health tracking mobile application | Bo Liu, Shuyang Shi, Yongshang Wu, Daniel Thomas, Laura Symul, Emma Pierson, Jure Leskovec | Here we develop four models – a logistic regression model, and 3 LSTM models – to predict a woman’s probability of becoming pregnant using data from a women’s health tracking app, Clue by BioWink GmbH. |
293 | Fuzzy Multi-task Learning for Hate Speech Type Identification | Han Liu, Pete Burnap, Wafa Alorainy, Matthew L. Williams | In this paper, we introduce a novel formulation of the hate speech type identification problem in the setting of multi-task learning through our proposed fuzzy ensemble approach. |
294 | Neural Chinese Word Segmentation with Lexicon and Unlabeled Data via Posterior Regularization | Junxin Liu, Fangzhao Wu, Chuhan Wu, Yongfeng Huang, Xing Xie | In this paper, we propose a neural approach for Chinese word segmentation which can exploit both lexicon and unlabeled data. |
295 | User-Video Co-Attention Network for Personalized Micro-video Recommendation | Shang Liu, Zhenzhong Chen, Hongyi Liu, Xinghai Hu | In this paper, a hypothesis we explore is that, not only do users have multi-modal interest, but micro-videos have multi-modal targeted audience segments. |
296 | Recommender Systems with Heterogeneous Side Information | Tianqiao Liu, Zhiwei Wang, Jiliang Tang, Songfan Yang, Gale Yan Huang, Zitao Liu | In this paper, we investigate the problem of exploiting heterogeneous side information for recommendations. |
297 | Globally-Optimized Realtime Supply-Demand Matching in On-Demand Ridesharing | Yifang Liu, Will Skinner, Chongyuan Xiang | This paper proposes a solution that performs a global optimization offline periodically based on forecasted supply and demand data, and uses the offline results to guide realtime supply and demand matching. |
298 | MCVAE: Margin-based Conditional Variational Autoencoder for Relation Classification and Pattern Generation | Fenglong Ma, Yaliang Li, Chenwei Zhang, Jing Gao, Nan Du, Wei Fan | To address this challenge, in this paper, we propose to employ a generative model, called conditional variational autoencoder (CVAE), to handle the pattern sparsity. |
299 | Detect Rumors on Twitter by Promoting Information Campaigns with Generative Adversarial Learning | Jing Ma, Wei Gao, Kam-Fai Wong | In this paper, we attempt to fight such chaos with itself to make automatic rumor detection more robust and effective. |
300 | Parametric Models for Intransitivity in Pairwise Rankings | Rahul Makhijani, Johan Ugander | In this work we generalize this trend, showing that there cannot exist a parametric model that both (i) has a log-likelihood function that is concave in item-level parameters and (ii) can exhibit intransitive preferences. |
301 | A Large-scale Study on the Risks of the HTML5 WebAPI for Mobile Sensor-based Attacks | Francesco Marcantoni, Michalis Diamantaris, Sotiris Ioannidis, Jason Polakis | In this paper we provide a comprehensive evaluation of the multifaceted threat that mobile web browsing poses to users, by conducting a large-scale study of mobile-specific HTML5 WebAPI calls used in the wild. |
302 | PYTHIA: a Framework for the Automated Analysis of Web Hosting Environments | Srdjan Matic, Gareth Tyson, Gianluca Stringhini | In this work we propose Pythia, a novel lightweight approach for identifying Web content hosted on third-party infrastructures, including both traditional Web hosts and content delivery networks. |
303 | Event Detection using Hierarchical Multi-Aspect Attention | Sneha Mehta, Mohammad Raihanul Islam, Huzefa Rangwala, Naren Ramakrishnan | In this work we present a novel factorized bilinear multi-aspect attention mechanism (FBMA) that attends to different aspects of text while constructing its representation. |
304 | Signals Matter: Understanding Popularity and Impact of Users on Stack Overflow | Arpit Merchant, Daksh Shah, Gurpreet Singh Bhatia, Anurag Ghosh, Ponnurangam Kumaraguru | We present evidence that certain non-trivial badges, reputation scores and age of the user on the site positively correlate with popularity and impact. |
305 | Fine-grained Type Inference in Knowledge Graphs via Probabilistic and Tensor Factorization Methods | A. B. M. Moniruzzaman, Richi Nayak, Maolin Tang, Thirunavukarasu Balasubramaniam | In order to address the issue, this paper proposes a new approach to the fine-grained type inference problem. |
306 | Augmenting Knowledge Tracing by Considering Forgetting Behavior | Koki Nagatani, Qian Zhang, Masahiro Sato, Yan-Ying Chen, Francine Chen, Tomoko Ohkuma | In this paper, we focus on modeling and predicting a student’s knowledge by considering their forgetting behavior. |
307 | Think Outside the Dataset: Finding Fraudulent Reviews using Cross-Dataset Analysis | Shirin Nilizadeh, Hojjat Aghakhani, Eric Gustafson, Christopher Kruegel, Giovanni Vigna | We propose OneReview, a method for locating fraudulent reviews, correlating data from multiple crowd-sourced review sites. |
308 | Entity Personalized Talent Search Models with Tree Interaction Features | Cagri Ozcaglar, Sahin Geyik, Brian Schmitz, Prakhar Sharma, Alex Shelkovnykov, Yiming Ma, Erik Buchanan | In this paper, we propose an entity-personalized Talent Search model which utilizes a combination of generalized linear mixed (GLMix) models and gradient boosted decision tree (GBDT) models, and provides personalized talent recommendations using nonlinear tree interaction features generated by the GBDT. |
309 | Value-aware Recommendation based on Reinforcement Profit Maximization | Changhua Pei, Xinru Yang, Qing Cui, Xiao Lin, Fei Sun, Peng Jiang, Wenwu Ou, Yongfeng Zhang | In this work, we blend the fundamental concepts in online advertising and micro-economics into personalized recommendation for profit maximization. |
310 | Semi-Supervised Entity Alignment via Knowledge Graph Embedding with Awareness of Degree Difference | Shichao Pei, Lu Yu, Robert Hoehndorf, Xiangliang Zhang | We propose a semi-supervised entity alignment method (SEA) to leverage both labeled entities and the abundant unlabeled entity information for the alignment. |
311 | Event-Driven Analysis of Crowd Dynamics in the Black Lives Matter Online Social Movement | Hao Peng, Ceren Budak, Daniel M. Romero | Here, focusing on the Black Lives Matter OSM and utilizing an event-driven approach on a dataset of 36 million tweets and thousands of offline events, we study how different types of offline events (police violence and heightened protests) influence crowd behavior over time. |
312 | CnGAN: Generative Adversarial Networks for Cross-network user preference generation for non-overlapped users | Dilruk Perera, Roger Zimmermann | As a solution, we propose CnGAN, a novel multi-task learning based, encoder-GAN-recommender architecture. |
313 | Learning Clusters through Information Diffusion | Liudmila Prokhorenkova, Alexey Tikhonov, Nelly Litvak | In this paper, we analyze the problem of finding communities of highly interconnected nodes, given only the infection times of nodes. |
314 | Constructing Test Collections using Multi-armed Bandits and Active Learning | Md Mustafizur Rahman, Mucahid Kutlu, Matthew Lease | We propose a two-phase approach to intelligent judging across topics which does not require document rankings from a shared task. |
315 | A Multi-modal Neural Embeddings Approach for Detecting Mobile Counterfeit Apps | Jathushan Rajasegaran, Naveen Karunanayake, Ashanie Gunathillake, Suranga Seneviratne, Guillaume Jourjon | In this paper, we propose a novel approach of combining content embeddings and style embeddings generated from pre-trained convolutional neural networks to detect counterfeit apps. |
316 | Context-Aware Sequential Recommendations with Stacked Recurrent Neural Networks | Lakshmanan Rakkappan, Vaibhav Rajan | In this paper we design new context-aware sequential recommendation methods, based on Stacked Recurrent Neural Networks, that model the dynamics of contexts and temporal gaps. |
317 | Improved Cross-Lingual Question Retrieval for Community Question Answering | Andreas Rücklé, Krishnkant Swarnkar, Iryna Gurevych | This is even more the case for specialized domains such as in technical cQA, which we explore in this work. |
318 | TurkScanner: Predicting the Hourly Wage of Microtasks | Susumu Saito, Chun-Wei Chiang, Saiph Savage, Teppei Nakano, Tetsunori Kobayashi, Jeffrey P. Bigham | This study explores various computational methods for predicting the working times (and thus hourly wages) required for tasks based on data collected from other workers completing crowd work. |
319 | A Large-scale Study of Wikipedia Users’ Quality of Experience | Flavia Salutari, Diego Da Hora, Gilles Dubuc, Dario Rossi | Whereas Web performances are typically gathered with controlled experiments, in this work we perform a large-scale study of one of the most popular websites, namely Wikipedia, explicitly asking (a small fraction of its) users for feedback on the browsing experience. |
320 | Genre Differences of Song Lyrics and Artist Wikis: An Analysis of Popularity, Length, Repetitiveness, and Readability | Markus Schedl | Exploiting content information from song lyrics, contextual information reflected in music artists’ Wikipedia articles, and listening information, we particularly study the aspects of popularity, length, repetitiveness, and readability of lyrics and Wikipedia articles. |
321 | Growing Attributed Networks through Local Processes | Harshay Shah, Suhansanu Kumar, Hari Sundaram | This paper proposes an attributed network growth model. |
322 | ε-Diagnosis: Unsupervised and Real-time Diagnosis of Small-window Long-tail Latency in Large-scale Microservice Platforms | Huasong Shan, Yuan Chen, Haifeng Liu, Yunpeng Zhang, Xiao Xiao, Xiaofeng He, Min Li, Wei Ding | To diagnose root-causes of SWLT, we propose an unsupervised and low-cost diagnosis algorithm, ε-Diagnosis. |
323 | Adaptive matrix completion for the users and the items in tail | Mohit Sharma, George Karypis | In this work, we show that the skewed distribution of ratings in the user-item rating matrix of real-world datasets affects the accuracy of matrix-completion-based approaches. |
324 | What is in Your Password? Analyzing Memorable and Secure Passwords using a Tensor Decomposition | Youjin Shin, Simon S. Woo | In this work, we aim to answer some of these questions by analyzing a password dataset through the lenses of data science and machine learning. |
325 | Understanding Reader Backtracking Behavior in Online News Articles | Uzi Smadja, Max Grusky, Yoav Artzi, Mor Naaman | In this work, we investigate a specific type of interaction, backtracking, which refers to the action of scrolling back in a browser while reading an online news article. |
326 | Unnecessarily Identifiable: Quantifying the fingerprintability of browser extensions due to bloat | Oleksii Starov, Pierre Laperdrix, Alexandros Kapravelos, Nick Nikiforakis | In this paper, we investigate to what extent the page modifications that make browser extensions fingerprintable are necessary for their operation. |
327 | Embarrassingly Shallow Autoencoders for Sparse Data | Harald Steck | Combining simple elements from the literature, we define a linear model that is geared toward sparse data, in particular implicit feedback data for recommender systems. |
328 | Rock, Rap, or Reggaeton?: Assessing Mexican Immigrants’ Cultural Assimilation Using Facebook Data | Ian Stewart, René D. Flores, Timothy Riffe, Ingmar Weber, Emilio Zagheni | To examine this question, we focus on musical taste, a key symbolic resource that signals the social positions of individuals. |
329 | Learning Intent to Book Metrics for Airbnb Search | Bradley C. Turnbull | In this paper, we describe the development of a model-based user intent metric, “intentful listing view”, which combines the signals of a variety of user micro-actions on the listing description page. |
330 | Detection and Analysis of Self-Disclosure in Online News Commentaries | Prasanna Umar, Anna Squicciarini, Sarah Rajtmajer | In this paper, we study self-disclosure as it occurs in newspaper comment forums. |
331 | ViTOR: Learning to Rank Webpages Based on Visual Features | Bram van den Akker, Ilya Markov, Maarten de Rijke | We introduce the Visual learning TO Rank (ViTOR) model that integrates state-of-the-art visual features extraction methods: (i) transfer learning from a pre-trained image classification model, and (ii) synthetic saliency heat maps generated from webpage snapshots. Since there is currently no public dataset for the task of LTR with visual features, we also introduce and release the ViTOR dataset, containing visually rich and diverse webpages. |
332 | Evaluating Neural Text Simplification in the Medical Domain | Laurens van den Bercken, Robert-Jan Sips, Christoph Lofi | In this paper, we introduce such a dataset to aid medical text simplification research. |
333 | Semantic Hilbert Space for Text Representation Learning | Benyou Wang, Qiuchi Li, Massimo Melucci, Dawei Song | To address this issue, we propose a new framework that models different levels of semantic units (e.g. sememe, word, sentence, and semantic abstraction) on a single Semantic Hilbert Space, which naturally admits a non-linear semantic composition by means of a complex-valued vector word representation. |
334 | Learning Task-Specific City Region Partition | Hongjian Wang, Porter Jenkins, Hua Wei, Fei Wu, Zhenhui Li | In this paper, we propose a new problem of task-specific city region partitioning, aiming to find the best partition in a city w.r.t. a given task. |
335 | Knowledge Graph Convolutional Networks for Recommender Systems | Hongwei Wang, Miao Zhao, Xing Xie, Wenjie Li, Minyi Guo | In this paper, we propose Knowledge Graph Convolutional Networks (KGCN), an end-to-end framework that captures inter-item relatedness effectively by mining their associated attributes on the KG. |
336 | Tag2Vec: Learning Tag Representations in Tag Networks | Junshan Wang, Zhicong Lu, Guojia Song, Yue Fan, Lun Du, Wei Lin | In this paper, we propose a tag representation learning model, Tag2Vec, which mixes nodes and tags into a hybrid network. |
337 | The Silent Majority Speaks: Inferring Silent Users’ Opinions in Online Social Networks | Lei Wang, Jianwei Niu, Xuefeng Liu, Kaili Mao | Inspired by the collaborative filtering techniques in cold-start recommendations, we infer the opinions of silent users by leveraging the text content posted by active users and their relationships with silent users. |
338 | A Novel Unsupervised Approach for Precise Temporal Slot Filling from Incomplete and Noisy Temporal Contexts | Xueying Wang, Haiqiao Zhang, Qi Li, Yiyu Shi, Meng Jiang | In this work, we propose an unsupervised approach of two modules that mutually enhance each other: one is a reliability estimator on fact extractors conditionally to the temporal contexts; the other is a fact trustworthiness estimator based on the extractor’s reliability. |
339 | Learning Binary Hash Codes for Fast Anchor Link Retrieval across Networks | Yongqing Wang, Huawei Shen, Jinhua Gao, Xueqi Cheng | To combat the challenges, in this paper we propose a novel embedding and matching architecture to directly learn binary hash code for each node. |
340 | Neural Chinese Named Entity Recognition via CNN-LSTM-CRF and Joint Training with Word Segmentation | Fangzhao Wu, Junxin Liu, Chuhan Wu, Yongfeng Huang, Xing Xie | In this paper, we propose a neural approach for CNER. |
341 | Semi-supervised Multi-view Individual and Sharable Feature Learning for Webpage Classification | Fei Wu, Xiao-Yuan Jing, Jun Zhou, Yimu Ji, Chao Lan, Qinghua Huang, Ruchuan Wang | In this paper, we propose a semi-supervised multi-view individual and sharable feature learning (SMISFL) approach, which jointly learns multiple view-individual transformations and one sharable transformation to explore the view-specific property for each view and the common property across views. |
342 | On Convexity and Bounds of Fairness-aware Classification | Yongkai Wu, Lu Zhang, Xintao Wu | In this paper, we study the fairness-aware classification problem by formulating it as a constrained optimization problem. |
343 | Understanding Urban Dynamics via State-sharing Hidden Markov Model | Tong Xia, Yue Yu, Fengli Xu, Funing Sun, Diansheng Guo, Depeng Jin, Yong Li | To model the temporal dynamics of human activities concisely and specifically, we present State-sharing Hidden Markov Model (SSHMM). |
344 | Efficient Path Prediction for Semi-Supervised and Weakly Supervised Hierarchical Text Classification | Huiru Xiao, Xin Liu, Yangqiu Song | We use a generative model to leverage the large amount of unlabeled data and introduce path constraints into the learning algorithm to incorporate the structural information of the class hierarchy. |
345 | Hierarchical Neural Variational Model for Personalized Sequential Recommendation | Teng Xiao, Shangsong Liang, Zaiqiao Meng | In this paper, we study the problem of recommending personalized items to users given their sequential behaviors. |
346 | Focusing Attention Network for Answer Ranking | Yufei Xie, Shuchun Liu, Tangren Yao, Yao Peng, Zhao Lu | In this paper, we propose a new attention mechanism, called Focusing Attention Network (FAN), which can automatically draw back the divergent attention by adding semantic and metadata features. |
347 | DLocRL: A Deep Learning Pipeline for Fine-Grained Location Recognition and Linking in Tweets | Canwen Xu, Jing Li, Xiangyang Luo, Jiaxin Pei, Chenliang Li, Donghong Ji | In this paper, we propose DLocRL, a new deep learning pipeline for fine-grained location recognition and linking in tweets, and verify its effectiveness on a real-world Twitter dataset. |
348 | Recurrent Convolutional Neural Network for Sequential Recommendation | Chengfeng Xu, Pengpeng Zhao, Yanchi Liu, Jiajie Xu, Victor S. Sheng, Zhiming Cui, Xiaofang Zhou, Hui Xiong | In this paper, we propose a novel Recurrent Convolutional Neural Network model (RCNN). |
349 | No More than What I Post: Preventing Linkage Attacks on Check-in Services | Fengli Xu, Zhen Tu, Hongjia Huang, Shuhao Chang, Funing Sun, Diansheng Guo, Yong Li | To address this problem, we design a partition-and-group framework to integrate the information of check-ins and additional mobility data to attain a novel privacy criterion – kt, l-anonymity. |
350 | Open-world Learning and Application to Product Classification | Hu Xu, Bing Liu, Lei Shu, P. Yu | This paper proposes a new OWL method based on meta-learning. |
351 | Place Deduplication with Embeddings | Carl Yang, Do Huy Hoang, Tomas Mikolov, Jiawei Han | In this work, we take the anonymous place graph from Facebook as an example to systematically study the problem of place deduplication: We carefully formulate the problem, study its connections to various related tasks that lead to several promising basic models, and arrive at a systematic two-step data-driven pipeline based on place embedding with multiple novel techniques that works significantly better than the state-of-the-art. |
352 | Cyberbullying Ends Here: Towards Robust Detection of Cyberbullying in Social Media | Mengfan Yao, Charalampos Chelmis, Daphney-Stavroula Zois | In this work, we introduce CONcISE, a novel approach for timely and accurate Cyberbullying detectiON on Instagram media SEssions. |
353 | Enhancing Fashion Recommendation with Visual Compatibility Relationship | Ruiping Yin, Kan Li, Jie Lu, Guangquan Zhang | In this paper, we propose a fashion compatibility knowledge learning method that incorporates visual compatibility relationships as well as style information. |
354 | Discovering Product Defects and Solutions from Online User Generated Contents | Xuan Zhang, Zhilei Qiao, Aman Ahuja, Weiguo Fan, Edward A. Fox, Chandan K. Reddy | In this paper, we propose the Product Defect Latent Dirichlet Allocation model (PDLDA), a probabilistic model that identifies domain-specific knowledge about product issues using interdependent three-dimensional topics: Component, Symptom, and Resolution. |
355 | Your Style Your Identity: Leveraging Writing and Photography Styles for Drug Trafficker Identification in Darknet Markets over Attributed Heterogeneous Information Network | Yiming Zhang, Yujie Fan, Wei Song, Shifu Hou, Yanfang Ye, Xin Li, Liang Zhao, Chuan Shi, Jiabin Wang | Built on the constructed AHIN, to efficiently measure the relatedness over nodes (i.e., traffickers) in the constructed AHIN, we propose a new network embedding model Vendor2Vec to learn the low-dimensional representations for the nodes in AHIN, which leverages complementary attribute information attached in the nodes to guide the meta-path based random walk for path instances sampling. |
356 | Abstractive Meeting Summarization via Hierarchical Adaptive Segmental Network Learning | Zhou Zhao, Haojie Pan, Changjie Fan, Yan Liu, Linlin Li, Min Yang, Deng Cai | In this paper, we consider the problem of abstractive meeting summarization from the viewpoint of hierarchical adaptive segmental encoder-decoder network learning. |
357 | Adversarial Point-of-Interest Recommendation | Fan Zhou, Ruiyang Yin, Kunpeng Zhang, Goce Trajcevski, Ting Zhong, Jin Wu | In this work, we initiate the first attempt to learn the distribution of user latent preference by proposing an Adversarial POI Recommendation (APOIR) model, consisting of two major components: (1) the recommender (R) which suggests POIs based on the learned distribution by maximizing the probabilities that these POIs are predicted as unvisited and potentially interested; and (2) the discriminator (D) which distinguishes the recommended POIs from the true check-ins and provides gradients as the guidance to improve R in a rewarding framework. |
358 | Context-aware Variational Trajectory Encoding and Human Mobility Inference | Fan Zhou, Xiaoli Yue, Goce Trajcevski, Ting Zhong, Kunpeng Zhang | We propose a new paradigm for moving pattern mining based on learning trajectory context, and a method – Context-Aware Variational Trajectory Encoding and Human Mobility Inference (CATHI) – for learning user trajectory representation via a framework consisting of: (1) a variational encoder and a recurrent encoder; (2) a variational attention layer; (3) two decoders. |
359 | Variational Session-based Recommendation Using Normalizing Flows | Fan Zhou, Zijing Wen, Kunpeng Zhang, Goce Trajcevski, Ting Zhong | We present a novel generative Session-Based Recommendation (SBR) framework, called VAriational SEssion-based Recommendation (VASER) – a non-linear probabilistic methodology allowing Bayesian inference for flexible parameter estimation of sequential recommendations. |
360 | Improving Top-K Recommendation via Joint Collaborative Autoencoders | Ziwei Zhu, Jianling Wang, James Caverlee | In this paper, we propose a Joint Collaborative Autoencoder framework that learns both user-user and item-item correlations simultaneously, leading to a more robust model and improved top-K recommendation performance. |