Most Influential KDD Papers (2024-05)
The ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD) is one of the top data mining conferences in the world. The Paper Digest Team analyzes all papers published at KDD in past years and presents the 15 most influential papers for each year. This ranking is constructed automatically from citations in both research papers and granted patents, and is updated frequently to reflect the most recent changes. To find the latest version of this list, or the most influential papers from other conferences and journals, please visit the Best Paper Digest page. Note: the most influential papers may or may not include the papers that won best paper awards. (Version: 2024-05)
To search or review KDD papers on a specific topic, please use the search by venue (KDD) and review by venue (KDD) services. To browse the most productive KDD authors by year, ranked by number of accepted papers, see the list of most productive KDD authors.
Based in New York, Paper Digest is dedicated to producing high-quality text analysis results that people can actually use on a daily basis. Since 2018, we have been serving users across the world with a number of exclusive services to track, search, review and rewrite scientific literature.
You are welcome to follow us on Twitter and LinkedIn to stay updated on new conference digests.
Paper Digest Team
New York City, New York, 10017
team@paperdigest.org
TABLE 1: Most Influential KDD Papers (2024-05)
Year | Rank | Paper | Author(s) |
---|---|---|---|
2023 | 1 | Text Is All You Need: Learning Language Representations for Sequential Recommendation (IF:3). Highlight: In this paper, we propose to model user preferences and item features as language representations that can be generalized to new items and datasets. | JIACHENG LI et al. |
2023 | 2 | Deep Weakly-supervised Anomaly Detection (IF:3). Highlight: To detect both seen and unseen anomalies, we introduce a novel deep weakly-supervised approach, namely Pairwise Relation prediction Network (PReNet), that learns pairwise relation features and anomaly scores by predicting the relation of any two randomly sampled training instances, in which the pairwise relation can be anomaly-anomaly, anomaly-unlabeled, or unlabeled-unlabeled. | Guansong Pang; Chunhua Shen; Huidong Jin; Anton van den Hengel |
2023 | 3 | TwHIN-BERT: A Socially-Enriched Pre-trained Language Model for Multilingual Tweet Representations at Twitter (IF:3). Highlight: We present TwHIN-BERT, a multilingual language model productionized at Twitter, trained on in-domain data from the popular social network. | XINYANG ZHANG et al. |
2023 | 4 | WebGLM: Towards An Efficient Web-Enhanced Question Answering System with Human Preferences (IF:3). Highlight: We present WebGLM, a web-enhanced question-answering system based on the General Language Model (GLM). | XIAO LIU et al. |
2023 | 5 | CodeGeeX: A Pre-Trained Model for Code Generation with Multilingual Benchmarking on HumanEval-X (IF:3). Highlight: In this paper, we introduce CodeGeeX, a multilingual model with 13 billion parameters for code generation. | QINKAI ZHENG et al. |
2023 | 6 | All in One: Multi-Task Prompting for Graph Neural Networks (IF:3). Highlight: In this paper, we propose a novel multi-task prompting method for graph models. | Xiangguo Sun; Hong Cheng; Jia Li; Bo Liu; Jihong Guan |
2023 | 7 | To Aggregate or Not? Learning with Separate Noisy Labels (IF:3). Highlight: The literature has also studied extensively on effective aggregation approaches. This paper revisits this choice and aims to provide an answer to the question of whether one should aggregate separate noisy labels into single ones or use them separately as given. | JIAHENG WEI et al. |
2023 | 8 | TSMixer: Lightweight MLP-Mixer Model for Multivariate Time Series Forecasting (IF:3). Highlight: However, their memory and compute-intensive requirements pose a critical bottleneck for long-term forecasting, despite numerous advancements in compute-aware self-attention modules. To address this, we propose TSMixer, a lightweight neural architecture exclusively composed of multi-layer perceptron (MLP) modules. | Vijay Ekambaram; Arindam Jati; Nam Nguyen; Phanwadee Sinthong; Jayant Kalagnanam |
2023 | 9 | PEPNet: Parameter and Embedding Personalized Network for Infusing with Personalized Prior Information (IF:3). Highlight: In this paper, we propose a plug-and-play Parameter and Embedding Personalized Network (PEPNet) for multi-domain and multi-task recommendation. | JIANXIN CHANG et al. |
2023 | 10 | UnifieR: A Unified Retriever for Large-Scale Retrieval (IF:3). Highlight: Inspired by their complementary global-local contextualization and distinct representing views, we propose a new learning framework, Unifier, which unifies dense-vector and lexicon-based retrieval in one model with a dual-representing capability. | TAO SHEN et al. |
2023 | 11 | What’s Behind The Mask: Understanding Masked Graph Modeling for Graph Autoencoders (IF:3). Highlight: In this work, we present masked graph autoencoder (MaskGAE), a self-supervised learning framework for graph-structured data. | JINTANG LI et al. |
2023 | 12 | DCdetector: Dual Attention Contrastive Representation Learning for Time Series Anomaly Detection (IF:3). Highlight: In this paper, we propose DCdetector, a multi-scale dual attention contrastive representation learning model. | Yiyuan Yang; Chaoli Zhang; Tian Zhou; Qingsong Wen; Liang Sun |
2023 | 13 | Multi-factor Sequential Re-ranking with Perception-Aware Diversification (IF:3). Highlight: To this end, this work proposes a general re-ranking framework named Multi-factor Sequential Re-ranking with Perception-Aware Diversification (MPAD) to jointly optimize accuracy and diversity for feed recommendation in a sequential manner. | YUE XU et al. |
2023 | 14 | Semantic-Enhanced Differentiable Search Index Inspired By Learning Strategies (IF:3). Highlight: In this work, we propose a Semantic-Enhanced DSI model (SE-DSI) motivated by Learning Strategies in the area of Cognitive Psychology. | YUBAO TANG et al. |
2023 | 15 | Learning Strong Graph Neural Networks with Weak Information (IF:3). Highlight: Accordingly, we propose D2PT, a dual-channel GNN framework that performs long-range information propagation not only on the input graph with incomplete structure, but also on a global graph that encodes global semantic similarities. | YIXIN LIU et al. |
2022 | 1 | GraphMAE: Self-Supervised Masked Graph Autoencoders (IF:5). Highlight: In this paper, we identify and examine the issues that negatively impact the development of GAEs, including their reconstruction objective, training robustness, and error metric. | ZHENYU HOU et al. |
2022 | 2 | Towards Universal Sequence Representation Learning for Recommender Systems (IF:4). Highlight: In order to develop effective sequential recommenders, a series of sequence representation learning (SRL) methods are proposed to model historical user behaviors. | YUPENG HOU et al. |
2022 | 3 | Pre-training Enhanced Spatial-temporal Graph Neural Network for Multivariate Time Series Forecasting (IF:3). Highlight: However, the patterns of time series and the dependencies between them (i.e., the temporal and spatial patterns) need to be analyzed based on long-term historical MTS data. To address this issue, we propose a novel framework, in which STGNN is Enhanced by a scalable time series Pre-training model (STEP). | Zezhi Shao; Zhao Zhang; Fei Wang; Yongjun Xu |
2022 | 4 | ROLAND: Graph Learning Framework for Dynamic Graphs (IF:3). Highlight: Here we propose ROLAND, an effective graph representation learning framework for real-world dynamic graphs. | Jiaxuan You; Tianyu Du; Jure Leskovec |
2022 | 5 | FLDetector: Defending Federated Learning Against Model Poisoning Attacks Via Detecting Malicious Clients (IF:3). Highlight: It is still an open challenge how to defend against model poisoning attacks with a large number of malicious clients. Our FLDetector addresses this challenge via detecting malicious clients. | Zaixi Zhang; Xiaoyu Cao; Jinyuan Jia; Neil Zhenqiang Gong |
2022 | 6 | A New Generation of Perspective API: Efficient Multilingual Character-level Transformers (IF:3). Highlight: In this paper, we present the fundamentals behind the next version of the Perspective API from Google Jigsaw. | ALYSSA LEES et al. |
2022 | 7 | Learned Token Pruning for Transformers (IF:3). Highlight: Efficient deployment of transformer models in practice is challenging due to their inference cost including memory footprint, latency, and power consumption, which scales quadratically with input sequence length. To address this, we present a novel token reduction method dubbed Learned Token Pruning (LTP) which adaptively removes unimportant tokens as an input sequence passes through transformer layers. | SEHOON KIM et al. |
2022 | 8 | Towards Representation Alignment and Uniformity in Collaborative Filtering (IF:3). Highlight: In this paper, we measure the representation quality in CF from the perspective of alignment and uniformity on the hypersphere. | CHENYANG WANG et al. |
2022 | 9 | Graph Attention Multi-Layer Perceptron (IF:3). Highlight: Although some scalable GNNs are proposed for large-scale graphs, they adopt a fixed K-hop neighborhood for each node, thus facing the over-smoothing issue when adopting large propagation depths for nodes within sparse regions. To tackle the above issue, we propose a new GNN architecture, Graph Attention Multi-Layer Perceptron (GAMLP), which can capture the underlying correlations between different scales of graph knowledge. | WENTAO ZHANG et al. |
2022 | 10 | Towards Unified Conversational Recommender Systems Via Knowledge-Enhanced Prompt Learning (IF:3). Highlight: However, these approaches still rely on different architectures or techniques to develop the two modules, making it difficult for effective module integration. To address this problem, we propose a unified CRS model named UniCRS based on knowledge-enhanced prompt learning. | Xiaolei Wang; Kun Zhou; Ji-Rong Wen; Wayne Xin Zhao |
2022 | 11 | Causal Attention for Interpretable and Generalizable Graph Classification (IF:3). Highlight: In this work, we take a causal look at the GNN modeling for graph classification. | YONGDUO SUI et al. |
2022 | 12 | Graph-Flashback Network for Next Location Recommendation (IF:3). Highlight: To incorporate the learned graph into sequential model, we propose a novel network Graph-Flashback for recommendation. | XUAN RAO et al. |
2022 | 13 | MSDR: Multi-Step Dependency Relation Networks for Spatial Temporal Forecasting (IF:3). Highlight: In this paper, we argue that it is insufficient to capture the long-range spatial dependencies from the implicit representations learned by temporal extracting modules. | Dachuan Liu; Jin Wang; Shuo Shang; Peng Han |
2022 | 14 | Contrastive Cross-domain Recommendation in Matching (IF:3). Highlight: In this work, we propose a novel Contrastive Cross-Domain Recommendation (CCDR) framework for CDR in matching. | RUOBING XIE et al. |
2022 | 15 | Global Self-Attention As A Replacement for Graph Convolution (IF:3). Highlight: We propose an extension to the transformer neural network architecture for general-purpose graph learning by adding a dedicated pathway for pairwise structural information, called edge channels. | Md Shamim Hussain; Mohammed J. Zaki; Dharmashankar Subramanian |
2021 | 1 | A Transformer-based Framework for Multivariate Time Series Representation Learning (IF:7). Highlight: We present a novel framework for multivariate time series representation learning based on the transformer encoder architecture. | George Zerveas; Srideepika Jayaraman; Dhaval Patel; Anuradha Bhamidipaty; Carsten Eickhoff |
2021 | 2 | Self-supervised Heterogeneous Graph Neural Network with Co-contrastive Learning (IF:5). Highlight: In this paper, we study the problem of self-supervised HGNNs and propose a novel co-contrastive learning mechanism for HGNNs, named HeCo. | Xiao Wang; Nian Liu; Hui Han; Chuan Shi |
2021 | 3 | MiniRocket: A Very Fast (Almost) Deterministic Transform for Time Series Classification (IF:5). Highlight: We reformulate Rocket into a new method, MiniRocket. MiniRocket is up to 75 times faster than Rocket on larger datasets, and almost deterministic (and optionally, fully deterministic), while maintaining essentially the same accuracy. | Angus Dempster; Daniel F. Schmidt; Geoffrey I. Webb |
2021 | 4 | Spatial-Temporal Graph ODE Networks for Traffic Flow Forecasting (IF:5). Highlight: To this end, we propose Spatial-Temporal Graph Ordinary Differential Equation Networks (STGODE). Specifically, we capture spatial-temporal dynamics through a tensor-based ordinary differential equation (ODE), as a result, deeper networks can be constructed and spatial-temporal features are utilized synchronously. | Zheng Fang; Qingqing Long; Guojie Song; Kunqing Xie |
2021 | 5 | Model-Agnostic Counterfactual Reasoning for Eliminating Popularity Bias in Recommender System (IF:5). Highlight: In this work, we explore the popularity bias issue from a novel and fundamental perspective: cause-effect. | TIANXIN WEI et al. |
2021 | 6 | Are We Really Making Much Progress?: Revisiting, Benchmarking and Refining Heterogeneous Graph Neural Networks (IF:4). Highlight: In this work, we present a systematical reproduction of 12 recent HGNNs by using their official codes, datasets, settings, and hyperparameters, revealing surprising findings about the progress of HGNNs. | QINGSONG LV et al. |
2021 | 7 | Multivariate Time Series Anomaly Detection and Interpretation Using Hierarchical Inter-Metric and Temporal Embedding (IF:4). Highlight: In this paper, we propose InterFusion, an unsupervised method that simultaneously models the inter-metric and temporal dependency for MTS. | ZHIHAN LI et al. |
2021 | 8 | Dynamic and Multi-faceted Spatio-temporal Deep Learning for Traffic Speed Forecasting (IF:4). Highlight: To this end, in this paper, we aim to explore these dynamic and multi-faceted spatio-temporal characteristics inherent in traffic data for further unleashing the power of DGNNs for better traffic speed forecasting. | LIANGZHE HAN et al. |
2021 | 9 | Socially-Aware Self-Supervised Tri-Training for Recommendation (IF:4). Highlight: To capture these signals, a general socially-aware SSL framework that integrates tri-training is proposed in this paper. | JUNLIANG YU et al. |
2021 | 10 | Deconfounded Recommendation for Alleviating Bias Amplification (IF:4). Highlight: In this work, we scrutinize the cause-effect factors for bias amplification, identifying the main reason lies in the confounding effect of imbalanced item distribution on user representation and prediction score. | Wenjie Wang; Fuli Feng; Xiangnan He; Xiang Wang; Tat-Seng Chua |
2021 | 11 | MixGCF: An Improved Training Method for Graph Neural Network-based Recommender Systems (IF:4). Highlight: In this work, we propose to study negative sampling by leveraging both the user-item graph structure and GNNs’ aggregation process. | TINGLIN HUANG et al. |
2021 | 12 | Structure-aware Interactive Graph Neural Networks for The Prediction of Protein-Ligand Binding Affinity (IF:4). Highlight: To this end, we propose a structure-aware interactive graph neural network (SIGN) which consists of two components: polar-inspired graph attention layers (PGAL) and pairwise interactive pooling (PiPool). | SHUANGLI LI et al. |
2021 | 13 | Contrastive Learning for Debiased Candidate Generation in Large-Scale Recommender Systems (IF:4). Highlight: In this paper, we theoretically prove that a popular choice of contrastive loss is equivalent to reducing the exposure bias via inverse propensity weighting, which provides a new perspective for understanding the effectiveness of contrastive learning. | Chang Zhou; Jianxin Ma; Jianwei Zhang; Jingren Zhou; Hongxia Yang |
2021 | 14 | Practical Approach to Asynchronous Multivariate Time Series Anomaly Detection and Localization (IF:4). Highlight: We propose a practical approach for inferring anomalies from large multivariate sets. | Ahmed Abdulaal; Zhuanghua Liu; Tomer Lancewicki |
2021 | 15 | Relational Message Passing for Knowledge Graph Completion (IF:3). Highlight: In this work, we propose a relational message passing method for knowledge graph completion. | Hongwei Wang; Hongyu Ren; Jure Leskovec |
2020 | 1 | Connecting The Dots: Multivariate Time Series Forecasting With Graph Neural Networks (IF:7). Highlight: In this paper, we propose a general graph neural network framework designed specifically for multivariate time series data. | ZONGHAN WU et al. |
2020 | 2 | GCC: Graph Contrastive Coding For Graph Neural Network Pre-Training (IF:7). Highlight: We design GCC’s pre-training task as subgraph instance discrimination in and across networks and leverage contrastive learning to empower graph neural networks to learn the intrinsic and transferable structural representations. | JIEZHONG QIU et al. |
2020 | 3 | LayoutLM: Pre-training Of Text And Layout For Document Image Understanding (IF:6). Highlight: In this paper, we propose the LayoutLM to jointly model interactions between text and layout information across scanned document images, which is beneficial for a great number of real-world document image understanding tasks such as information extraction from scanned documents. | YIHENG XU et al. |
2020 | 4 | Graph Structure Learning For Robust Graph Neural Networks (IF:6). Highlight: Therefore, in this paper, we explore these properties to defend adversarial attacks on graphs. | WEI JIN et al. |
2020 | 5 | Towards Deeper Graph Neural Networks (IF:6). Highlight: In this work, we study this observation systematically and develop new insights towards deeper graph neural networks. | Meng Liu; Hongyang Gao; Shuiwang Ji |
2020 | 6 | USAD: UnSupervised Anomaly Detection On Multivariate Time Series (IF:6). Highlight: In this paper, we propose a fast and stable method called UnSupervised Anomaly Detection for multivariate time series (USAD) based on adversely trained autoencoders. | Julien Audibert; Pietro Michiardi; Frédéric Guyard; Sébastien Marti; Maria A. Zuluaga |
2020 | 7 | GPT-GNN: Generative Pre-Training Of Graph Neural Networks (IF:6). Highlight: In this paper, we present the GPT-GNN framework to initialize GNNs by generative pre-training. | Ziniu Hu; Yuxiao Dong; Kuansan Wang; Kai-Wei Chang; Yizhou Sun |
2020 | 8 | Exploring Automatic Diagnosis Of COVID-19 From Crowdsourced Respiratory Sound Data (IF:6). Highlight: In this paper we describe our data analysis over a large-scale crowdsourced dataset of respiratory sounds collected to aid diagnosis of COVID-19. | CHLOË BROWN et al. |
2020 | 9 | On Sampled Metrics For Item Recommendation (IF:6). Highlight: We show that it is possible to improve the quality of the sampled metrics by applying a correction, obtained by minimizing different criteria such as bias or mean squared error. | Walid Krichene; Steffen Rendle |
2020 | 10 | AM-GCN: Adaptive Multi-channel Graph Convolutional Networks (IF:6). Highlight: We tackle the challenge and propose an adaptive multi-channel graph convolutional networks for semi-supervised classification (AM-GCN). | XIAO WANG et al. |
2020 | 11 | Scaling Graph Neural Networks With Approximate PageRank (IF:5). Highlight: We present the PPRGo model which utilizes an efficient approximation of information diffusion in GNNs resulting in significant speed gains while maintaining state-of-the-art prediction performance. | ALEKSANDAR BOJCHEVSKI et al. |
2020 | 12 | Towards Physics-informed Deep Learning For Turbulent Flow Prediction (IF:5). Highlight: In this paper, we aim to predict turbulent flow by learning its highly nonlinear dynamics from spatiotemporal velocity fields of large-scale fluid flow simulations of relevance to turbulence modeling and climate modeling. | Rui Wang; Karthik Kashinath; Mustafa Mustafa; Adrian Albert; Rose Yu |
2020 | 13 | XGNN: Towards Model-Level Explanations Of Graph Neural Networks (IF:5). Highlight: In this work, we propose a novel approach, known as XGNN, to interpret GNNs at the model-level. | Hao Yuan; Jiliang Tang; Xia Hu; Shuiwang Ji |
2020 | 14 | Improving Conversational Recommender Systems Via Knowledge Graph Based Semantic Fusion (IF:5). Highlight: To address these issues, we incorporate both word-oriented and entity-oriented knowledge graphs (KG) to enhance the data representations in CRSs, and adopt Mutual Information Maximization to align the word-level and entity-level semantic spaces. | KUN ZHOU et al. |
2020 | 15 | Embedding-based Retrieval In Facebook Search (IF:5). Highlight: In this paper, we discuss the techniques for applying EBR to a Facebook Search system. | JUI-TING HUANG et al. |
2019 | 1 | Optuna: A Next-generation Hyperparameter Optimization Framework (IF:8). Highlight: The purpose of this study is to introduce new design-criteria for next-generation hyperparameter optimization software. | Takuya Akiba; Shotaro Sano; Toshihiko Yanase; Takeru Ohta; Masanori Koyama |
2019 | 2 | KGAT: Knowledge Graph Attention Network For Recommendation (IF:8). Highlight: In this work, we investigate the utility of knowledge graph (KG), which breaks down the independent interaction assumption by linking items with their attributes. We release the codes and datasets at https://github.com/xiangwang1223/knowledge_graph_attention_network. | Xiang Wang; Xiangnan He; Yixin Cao; Meng Liu; Tat-Seng Chua |
2019 | 3 | Heterogeneous Graph Neural Network (IF:8). Highlight: In this paper, we propose HetGNN, a heterogeneous graph neural network model, to resolve this issue. | Chuxu Zhang; Dongjin Song; Chao Huang; Ananthram Swami; Nitesh V. Chawla |
2019 | 4 | Cluster-GCN: An Efficient Algorithm For Training Deep And Large Graph Convolutional Networks (IF:7). Highlight: In this paper, we propose Cluster-GCN, a novel GCN algorithm that is suitable for SGD-based training by exploiting the graph clustering structure. To test the scalability of our algorithm, we create a new Amazon2M data with 2 million nodes and 61 million edges which is more than 5 times larger than the previous largest publicly available dataset (Reddit). | WEI-LIN CHIANG et al. |
2019 | 5 | Robust Anomaly Detection For Multivariate Time Series Through Stochastic Recurrent Neural Network (IF:7). Highlight: This paper proposes OmniAnomaly, a stochastic recurrent neural network for multivariate time series anomaly detection that works well robustly for various devices. | YA SU et al. |
2019 | 6 | Auto-Keras: An Efficient Neural Architecture Search System (IF:7). Highlight: In this paper, we propose a novel framework enabling Bayesian optimization to guide the network morphism for efficient neural architecture search. | Haifeng Jin; Qingquan Song; Xia Hu |
2019 | 7 | Predicting Dynamic Embedding Trajectory In Temporal Interaction Networks (IF:6). Highlight: Here we propose JODIE, a coupled recurrent neural network model that learns the embedding trajectories of users and items. | Srijan Kumar; Xikun Zhang; Jure Leskovec |
2019 | 8 | DEFEND: Explainable Fake News Detection (IF:6). Highlight: In this paper, therefore, we study the explainable detection of fake news. | Kai Shu; Limeng Cui; Suhang Wang; Dongwon Lee; Huan Liu |
2019 | 9 | Knowledge-aware Graph Neural Networks With Label Smoothness Regularization For Recommender Systems (IF:6). Highlight: Here we propose Knowledge-aware Graph Neural Networks with Label Smoothness regularization (KGNN-LS) to provide better recommendations. | HONGWEI WANG et al. |
2019 | 10 | Urban Traffic Prediction From Spatio-Temporal Data Using Deep Meta Learning (IF:6). Highlight: To tackle these challenges, we proposed a deep-meta-learning based model, entitled ST-MetaNet, to collectively predict traffic in all location at once. | ZHEYI PAN et al. |
2019 | 11 | Time-Series Anomaly Detection Service At Microsoft (IF:6). Highlight: In this paper, we introduce the pipeline and algorithm of our anomaly detection service, which is designed to be accurate, efficient and general. | HANSHENG REN et al. |
2019 | 12 | Representation Learning For Attributed Multiplex Heterogeneous Network (IF:6). Highlight: In this paper, we formalize the problem of embedding learning for the Attributed Multiplex Heterogeneous Network and propose a unified framework to address this problem. | YUKUO CEN et al. |
2019 | 13 | Robust Graph Convolutional Networks Against Adversarial Attacks (IF:6). Highlight: To address this problem, we propose Robust GCN (RGCN), a novel model that “fortifies” GCNs against adversarial attacks. | Dingyuan Zhu; Ziwei Zhang; Peng Cui; Wenwu Zhu |
2019 | 14 | Fairness In Recommendation Ranking Through Pairwise Comparisons (IF:6). Highlight: In this paper we offer a set of novel metrics for evaluating algorithmic fairness concerns in recommender systems. | ALEX BEUTEL et al. |
2019 | 15 | AliGraph: A Comprehensive Graph Neural Network Platform (IF:6). Highlight: In this paper, we present a comprehensive graph neural network system, namely AliGraph, which consists of distributed graph storage, optimized sampling operators and runtime to efficiently support not only existing popular GNNs but also a series of in-house developed ones for different scenarios. | Hongxia Yang |
2018 | 1 | Graph Convolutional Neural Networks For Web-Scale Recommender Systems IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here we describe a large-scale deep recommendation engine that we developed and deployed at Pinterest. |
REX YING et. al. |
2018 | 2 | Deep Interest Network For Click-Through Rate Prediction IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel model: Deep Interest Network (DIN) which tackles this challenge by designing a local activation unit to adaptively learn the representation of user interests from historical behaviors with respect to a certain ad. |
GUORUI ZHOU et. al. |
2018 | 3 | XDeepFM: Combining Explicit And Implicit Feature Interactions For Recommender Systems IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel Compressed Interaction Network (CIN), which aims to generate feature interactions in an explicit fashion and at the vector-wise level. |
JIANXUN LIAN et. al. |
2018 | 4 | Adversarial Attacks On Neural Networks For Graph Data IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce the first study of adversarial attacks on attributed graphs, specifically focusing on models exploiting ideas of graph convolutions. |
Daniel Zügner; Amir Akbarnejad; Stephan Günnemann; |
2018 | 5 | Detecting Spacecraft Anomalies Using LSTMs And Nonparametric Dynamic Thresholding IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We demonstrate the effectiveness of Long Short-Term Memory (LSTMs) networks, a type of Recurrent Neural Network (RNN), in overcoming these issues using expert-labeled telemetry anomaly data from the Soil Moisture Active Passive (SMAP) satellite and the Mars Science Laboratory (MSL) rover, Curiosity. |
Kyle Hundman; Valentino Constantinou; Christopher Laporte; Ian Colwell; Tom Soderstrom; |
2018 | 6 | Modeling Task Relationships In Multi-task Learning With Multi-gate Mixture-of-Experts IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a novel multi-task learning approach, Multi-gate Mixture-of-Experts (MMoE), which explicitly learns to model task relationships from data. |
JIAQI MA et. al. |
2018 | 7 | EANN: Event Adversarial Neural Networks For Multi-Modal Fake News Detection IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In order to address this issue, we propose an end-to-end framework named Event Adversarial Neural Network (EANN), which can derive event-invariant features and thus benefit the detection of fake news on newly arrived events. |
YAQING WANG et. al. |
2018 | 8 | STAMP: Short-Term Attention/Memory Priority Model For Session-based Recommendation IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we argue that a long-term memory model may be insufficient for modeling long sessions that usually contain user interests drift caused by unintended clicks. |
Qiao Liu; Yifu Zeng; Refuoe Mokhosi; Haibin Zhang; |
2018 | 9 | SUSTain: Scalable Unsupervised Scoring For Tensors And Its Application To Phenotyping IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a new method, which we call SUSTain, that extends real-valued matrix and tensor factorizations to data where values are integers. |
IOAKEIM PERROS et. al. |
2018 | 10 | Leveraging Meta-path Based Context For Top-N Recommendation With A Neural Co-Attention Model IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To construct the meta-path based context, we propose to use a priority based sampling technique to select high-quality path instances. |
Binbin Hu; Chuan Shi; Wayne Xin Zhao; Philip S. Yu; |
2018 | 11 | Large-Scale Learnable Graph Convolutional Networks IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To enable model training on large-scale graphs, we propose a sub-graph training method to reduce the excessive memory and computational resource requirements suffered by prior methods on graph convolutions. |
Hongyang Gao; Zhengyang Wang; Shuiwang Ji; |
2018 | 12 | Fairness Of Exposure In Rankings IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address these often conflicting responsibilities, we propose a conceptual and computational framework that allows the formulation of fairness constraints on rankings in terms of exposure allocation. |
Ashudeep Singh; Thorsten Joachims; |
2018 | 13 | IntelliLight: A Reinforcement Learning Approach For Intelligent Traffic Light Control IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a more effective deep reinforcement learning model for traffic light control. |
Hua Wei; Guanjie Zheng; Huaxiu Yao; Zhenhui Li; |
2018 | 14 | DeepInf: Social Influence Prediction With Deep Learning IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Social and information networking activities such as on Facebook, Twitter, WeChat, and Weibo have become an indispensable part of our everyday life, where we can easily access … |
JIEZHONG QIU et. al. |
2018 | 15 | Billion-scale Commodity Embedding For E-commerce Recommendation In Alibaba IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present our technical solutions to address these three challenges. |
JIZHE WANG et. al. |
2017 | 1 | Metapath2vec: Scalable Representation Learning For Heterogeneous Networks IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We study the problem of representation learning in heterogeneous networks. |
Yuxiao Dong; Nitesh V. Chawla; Ananthram Swami; |
2017 | 2 | Anomaly Detection With Robust Deep Autoencoders IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Such "Group Robust Deep Autoencoders (GRDA)" give rise to novel anomaly detection approaches whose superior performance we demonstrate on a selection of benchmark problems. |
Chong Zhou; Randy C. Paffenroth; |
2017 | 3 | Algorithmic Decision Making And The Cost Of Fairness IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To mitigate such disparities, several techniques have recently been proposed to achieve algorithmic fairness. |
Sam Corbett-Davies; Emma Pierson; Avi Feller; Sharad Goel; Aziz Huq; |
2017 | 4 | Struc2vec: Learning Node Representations From Structural Identity IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work presents struc2vec, a novel and flexible framework for learning latent representations for the structural identity of nodes. |
Leonardo F.R. Ribeiro; Pedro H.P. Saverese; Daniel R. Figueiredo; |
2017 | 5 | Google Vizier: A Service For Black-Box Optimization IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we describe Google Vizier, a Google-internal service for performing black-box optimization that has become the de facto parameter tuning engine at Google. |
DANIEL GOLOVIN et. al. |
2017 | 6 | GRAM: Graph-based Attention Model For Healthcare Representation Learning IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address these challenges, we propose GRaph-based Attention Model (GRAM) that supplements electronic health records (EHR) with hierarchical information inherent to medical ontologies. |
Edward Choi; Mohammad Taha Bahadori; Le Song; Walter F. Stewart; Jimeng Sun; |
2017 | 7 | Local Higher-Order Graph Clustering IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Local graph clustering methods aim to find a cluster of nodes by exploring a small region of the graph. |
Hao Yin; Austin R. Benson; Jure Leskovec; David F. Gleich; |
2017 | 8 | Patient Subtyping Via Time-Aware LSTM Networks IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In the study of various diseases, heterogeneity among patients usually leads to different progression patterns and may require different types of therapeutic intervention. |
INCI M. BAYTAS et. al. |
2017 | 9 | Dipole: Diagnosis Prediction In Healthcare Via Attention-based Bidirectional Recurrent Neural Networks IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address these issues, we propose Dipole, an end-to-end, simple and robust model for predicting patients’ future health information. |
FENGLONG MA et. al. |
2017 | 10 | Meta-Graph Based Recommendation Fusion Over Heterogeneous Information Networks IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: With different meta-graph based features, we propose to use FM with Group lasso (FMG) to automatically learn from the observed ratings to effectively select useful meta-graph based features. |
Huan Zhao; Quanming Yao; Jianda Li; Yangqiu Song; Dik Lun Lee; |
2017 | 11 | Collaborative Variational Autoencoder For Recommender Systems IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a Bayesian generative model called collaborative variational autoencoder (CVAE) that considers both rating and content for recommendation in multimedia scenario. |
Xiaopeng Li; James She; |
2017 | 12 | Embedding-based News Recommendation For Millions Of Users IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Services that incorporated the method we propose are already open to all users and provide recommendations to over ten million individual users per day who make billions of accesses per month. |
Shumpei Okura; Yukihiro Tagami; Shingo Ono; Akira Tajima; |
2017 | 13 | TFX: A TensorFlow-Based Production-Scale Machine Learning Platform IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present TensorFlow Extended (TFX), a TensorFlow-based general-purpose machine learning platform implemented at Google. |
DENIS BAYLOR et. al. |
2017 | 14 | Anomaly Detection In Streams With Extreme Value Theory IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here, we propose a new approach to detect outliers in streaming univariate time series based on Extreme Value Theory that does not require to hand-set thresholds and makes no assumption on the distribution: the main parameter is only the risk, controlling the number of false positives. |
Alban Siffer; Pierre-Alain Fouque; Alexandre Termier; Christine Largouet; |
2017 | 15 | ReasoNet: Learning To Stop Reading In Machine Comprehension IF:5 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we describe a novel neural network architecture called the Reasoning Network (ReasoNet) for machine comprehension tasks. |
Yelong Shen; Po-Sen Huang; Jianfeng Gao; Weizhu Chen; |
2016 | 1 | XGBoost: A Scalable Tree Boosting System IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning. |
Tianqi Chen; Carlos Guestrin; |
2016 | 2 | Node2vec: Scalable Feature Learning For Networks IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here we propose node2vec, an algorithmic framework for learning continuous feature representations for nodes in networks. |
Aditya Grover; Jure Leskovec; |
2016 | 3 | Structural Deep Network Embedding IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To solve this problem, in this paper we propose a Structural Deep Network Embedding method, namely SDNE. |
Daixin Wang; Peng Cui; Wenwu Zhu; |
2016 | 4 | Collaborative Knowledge Base Embedding For Recommender Systems IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate how to leverage the heterogeneous information in a knowledge base to improve the quality of recommender systems. |
Fuzheng Zhang; Nicholas Jing Yuan; Defu Lian; Xing Xie; Wei-Ying Ma; |
2016 | 5 | Asymmetric Transitivity Preserving Graph Embedding IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To tackle this challenge, we propose the idea of preserving asymmetric transitivity by approximating high-order proximity which are based on asymmetric transitivity. |
Mingdong Ou; Peng Cui; Jian Pei; Ziwei Zhang; Wenwu Zhu; |
2016 | 6 | Interpretable Decision Sets: A Joint Framework For Description And Prediction IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here we propose interpretable decision sets, a framework for building predictive models that are highly accurate, yet also highly interpretable. |
Himabindu Lakkaraju; Stephen H. Bach; Jure Leskovec; |
2016 | 7 | Recurrent Marked Temporal Point Processes: Embedding Event History To Vector IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose the Recurrent Marked Temporal Point Process (RMTPP) to simultaneously model the event timings and the markers. |
NAN DU et. al. |
2016 | 8 | Convolutional Neural Networks For Steady Flow Approximation IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a general and flexible approximation model for real-time prediction of non-uniform steady laminar flow in a 2D or 3D domain based on convolutional neural networks (CNNs). |
Xiaoxiao Guo; Wei Li; Francesco Iorio; |
2016 | 9 | Multi-layer Representation Learning For Medical Concepts IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose Med2Vec, which not only learns the representations for both medical codes and visits from large EHR datasets with over million visits, but also allows us to interpret the learned representations confirmed positively by clinical experts. |
EDWARD CHOI et. al. |
2016 | 10 | CNTK: Microsoft’s Open-Source Deep-Learning Toolkit IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This tutorial will introduce the Computational Network Toolkit, or CNTK, Microsoft’s cutting-edge open-source deep-learning toolkit for Windows and Linux. |
Frank Seide; Amit Agarwal; |
2016 | 11 | Deep Crossing: Web-Scale Modeling Without Manually Crafted Combinatorial Features IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes the Deep Crossing model which is a deep neural network that automatically combines features to produce superior models. |
YING SHAN et. al. |
2016 | 12 | Towards Conversational Recommender Systems IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The goal of this paper is to begin to reduce this gap. |
Konstantina Christakopoulou; Filip Radlinski; Katja Hofmann; |
2016 | 13 | Algorithmic Bias: From Discrimination Discovery To Fairness-aware Data Mining IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The aim of this tutorial is to survey algorithmic bias, presenting its most common variants, with an emphasis on the algorithmic techniques and key ideas developed to derive efficient solutions. |
Sara Hajian; Francesco Bonchi; Carlos Castillo; |
2016 | 14 | Smart Reply: Automated Response Suggestion For Email IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we propose and investigate a novel end-to-end method for automatically generating short email responses, called Smart Reply. |
ANJULI KANNAN et. al. |
2016 | 15 | FRAUDAR: Bounding Graph Fraud In The Face Of Camouflage IF:5 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose FRAUDAR, an algorithm that (a) is camouflage-resistant, (b) provides upper bounds on the effectiveness of fraudsters, and (c) is effective in real-world data. |
BRYAN HOOI et. al. |
2015 | 1 | Certifying And Removing Disparate Impact IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Instead of requiring access to the process, we propose making inferences based on the data it uses. |
Michael Feldman; Sorelle A. Friedler; John Moeller; Carlos Scheidegger; Suresh Venkatasubramanian; |
2015 | 2 | Collaborative Deep Learning For Recommender Systems IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address this problem, we generalize recent advances in deep learning from i.i.d. input to non-i.i.d. (CF-based) input and propose in this paper a hierarchical Bayesian model called collaborative deep learning (CDL), which jointly performs deep representation learning for the content information and collaborative filtering for the ratings (feedback) matrix. |
Hao Wang; Naiyan Wang; Dit-Yan Yeung; |
2015 | 3 | Intelligible Models For HealthCare: Predicting Pneumonia Risk And Hospital 30-day Readmission IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In the 30-day hospital readmission case study, we show that the same methods scale to large datasets containing hundreds of thousands of patients and thousands of attributes while remaining intelligible and providing accuracy comparable to the best (unintelligible) machine learning methods. |
RICH CARUANA et. al. |
2015 | 4 | Deep Graph Kernels IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present Deep Graph Kernels, a unified framework to learn latent representations of sub-structures for graphs, inspired by latest advancements in language modeling and deep learning. |
Pinar Yanardag; S.V.N. Vishwanathan; |
2015 | 5 | PTE: Predictive Text Embedding Through Large-scale Heterogeneous Text Networks IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we fill this gap by proposing a semi-supervised representation learning method for text data, which we call the predictive text embedding (PTE). |
Jian Tang; Meng Qu; Qiaozhu Mei; |
2015 | 6 | Inferring Networks Of Substitutable And Complementary Products IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our goal in this paper is to learn the semantics of substitutes and complements from the text of online reviews. |
Julian McAuley; Rahul Pandey; Jure Leskovec; |
2015 | 7 | SEISMIC: A Self-Exciting Point Process Model For Predicting Tweet Popularity IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on predicting the final number of reshares of a given post. |
Qingyuan Zhao; Murat A. Erdogdu; Hera Y. He; Anand Rajaraman; Jure Leskovec; |
2015 | 8 | Heterogeneous Network Embedding Via Deep Architectures IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we examine the scenario of a heterogeneous network with nodes and content of various types. |
SHIYU CHANG et. al. |
2015 | 9 | Collective Opinion Spam Detection: Bridging Review Networks And Metadata IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a new holistic approach called SPEAGLE that utilizes clues from all metadata (text, timestamp, rating) as well as relational data (network), and harness them collectively under a unified framework to spot suspicious users and reviews, as well as products targeted by spam. |
Shebuti Rayana; Leman Akoglu; |
2015 | 10 | Petuum: A New Platform For Distributed Machine Learning On Big Data IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a general-purpose framework that systematically addresses data- and model-parallel challenges in large-scale ML, by leveraging several fundamental properties underlying ML programs that make them different from conventional operation-centric programs: error tolerance, dynamic structure, and nonuniform convergence; all stem from the optimization-centric nature shared in ML programs’ mathematical definitions, and the iterative-convergent behavior of their algorithmic solutions. |
ERIC P. XING et. al. |
2015 | 11 | Generic And Scalable Framework For Automated Time-series Anomaly Detection IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a generic and scalable framework for automated anomaly detection on large scale time-series data. |
Nikolay Laptev; Saeed Amizadeh; Ian Flint; |
2015 | 12 | Forecasting Fine-Grained Air Quality Based On Big Data IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we forecast the reading of an air quality monitoring station over the next 48 hours, using a data-driven method that considers current meteorological data, weather forecasts, and air quality data of the station and that of other stations within a few hundred kilometers. |
YU ZHENG et. al. |
2015 | 13 | E-commerce In Your Inbox: Product Recommendations At Scale IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we describe a system that leverages user purchase history determined from e-mail receipts to deliver highly personalized product ads to Yahoo Mail users. |
MIHAJLO GRBOVIC et. al. |
2015 | 14 | From Group To Individual Labels Using Deep Features IF:5 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we focus on the problem of learning classifiers to make predictions at the instance level. |
Dimitrios Kotzias; Misha Denil; Nando de Freitas; Padhraic Smyth; |
2015 | 15 | COSNET: Connecting Heterogeneous Social Networks With Local And Global Consistency IF:5 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose COSNET (COnnecting heterogeneous Social NETworks with local and global consistency), a novel energy-based model, to address this problem by considering both local and global consistency among multiple networks. |
Yutao Zhang; Jie Tang; Zhilin Yang; Jian Pei; Philip S. Yu; |
2014 | 1 | DeepWalk: Online Learning Of Social Representations IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present DeepWalk, a novel approach for learning latent representations of vertices in a network. |
Bryan Perozzi; Rami Al-Rfou; Steven Skiena; |
2014 | 2 | Knowledge Vault: A Web-scale Approach To Probabilistic Knowledge Fusion IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here we introduce Knowledge Vault, a Web-scale probabilistic knowledge base that combines extractions from Web content (obtained via analysis of text, tabular data, page structure, and human annotations) with prior knowledge derived from existing knowledge repositories. |
XIN DONG et. al. |
2014 | 3 | Efficient Mini-batch Training For Stochastic Optimization IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a technique based on approximate optimization of a conservatively regularized objective function within each minibatch. |
Mu Li; Tong Zhang; Yuqiang Chen; Alexander J. Smola; |
2014 | 4 | Clustering And Projected Clustering With Adaptive Neighbors IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel clustering model to learn the data similarity matrix and clustering structure simultaneously. |
Feiping Nie; Xiaoqian Wang; Heng Huang; |
2014 | 5 | GeoMF: Joint Geographical Modeling And Matrix Factorization For Point-of-interest Recommendation IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Besides, researchers have recently discovered a spatial clustering phenomenon in human mobility behavior on the LBSNs, i.e., individual visiting locations tend to cluster together, and also demonstrated its effectiveness in POI recommendation, thus we incorporate it into the factorization model. |
DEFU LIAN et. al. |
2014 | 6 | Travel Time Estimation Of A Path Using Sparse Trajectories IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a citywide and real-time model for estimating the travel time of any path (represented as a sequence of connected road segments) in real time in a city, based on the GPS trajectories of vehicles received in current time slots and over a period of history as well as map data sources. |
Yilun Wang; Yu Zheng; Yexiang Xue; |
2014 | 7 | A Dirichlet Multinomial Mixture Model-based Approach For Short Text Clustering IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we proposed a collapsed Gibbs Sampling algorithm for the Dirichlet Multinomial Mixture model for short text clustering (abbr. |
Jianhua Yin; Jianyong Wang; |
2014 | 8 | Jointly Modeling Aspects, Ratings And Sentiments For Movie Recommendation (JMARS) IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work we propose a probabilistic model based on collaborative filtering and topic modeling. |
QIMING DIAO et. al. |
2014 | 9 | Open Question Answering Over Curated And Extracted Knowledge Bases IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present OQA, the first approach to leverage both curated and extracted KBs. |
Anthony Fader; Luke Zettlemoyer; Oren Etzioni; |
2014 | 10 | Learning Time-series Shapelets IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In contrast to the state-of-the-art, this paper proposes a novel perspective in terms of learning shapelets. |
Josif Grabocka; Nicolas Schilling; Martin Wistuba; Lars Schmidt-Thieme; |
2014 | 11 | FastXML: A Fast, Accurate And Stable Tree-classifier For Extreme Multi-label Learning IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our objective, in this paper, is to develop an extreme multi-label classifier that is faster to train and more accurate at prediction than the state-of-the-art Multi-label Random Forest (MLRF) algorithm [2] and the Label Partitioning for Sub-linear Ranking (LPSR) algorithm [35]. |
Yashoteja Prabhu; Manik Varma; |
2014 | 12 | Streaming Submodular Maximization: Massive Data Summarization On The Fly IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we address the problem of extracting representative elements from a large stream of data. |
Ashwinkumar Badanidiyuru; Baharan Mirzasoleiman; Amin Karbasi; Andreas Krause; |
2014 | 14 | Optimal Real-time Bidding For Display Advertising IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we study bid optimisation for real-time bidding (RTB) based display advertising. |
Weinan Zhang; Shuai Yuan; Jun Wang; |
2014 | 15 | Inferring Gas Consumption And Pollution Emission Of Vehicles Throughout A City IF:5 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: As many road segments are not traversed by trajectories (i.e., data sparsity), we propose a Travel Speed Estimation (TSE) model based on a context-aware matrix factorization approach. |
Jingbo Shang; Yu Zheng; Wenzhu Tong; Eric Chang; Yong Yu; |
2013 | 1 | Auto-WEKA: Combined Selection And Hyperparameter Optimization Of Classification Algorithms IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We consider the problem of simultaneously selecting a learning algorithm and setting its hyperparameters, going beyond previous work that attacks these issues separately. |
Chris Thornton; Frank Hutter; Holger H. Hoos; Kevin Leyton-Brown; |
2013 | 2 | Ad Click Prediction: A View From The Trenches IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The goal of this paper is to highlight the close relationship between theoretical advances and practical engineering in this industrial setting, and to show the depth of challenges that appear when applying traditional machine learning methods in a complex dynamic system. |
H. BRENDAN MCMAHAN et. al. |
2013 | 3 | U-Air: When Urban Air Quality Inference Meets Big Data IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we infer the real-time and fine-grained air quality information throughout a city, based on the (historical and real-time) air quality data reported by existing monitor stations and a variety of data sources we observed in the city, such as meteorology, traffic flow, human mobility, structure of road networks, and point of interests (POIs). |
Yu Zheng; Furui Liu; Hsun-Ping Hsieh; |
2013 | 4 | FISM: Factored Item Similarity Models For Top-N Recommender Systems IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To alleviate this problem, we present an item-based method for generating top-N recommendations that learns the item-item similarity matrix as the product of two low dimensional latent factor matrices. |
Santosh Kabbur; Xia Ning; George Karypis; |
2013 | 5 | Accurate Intelligible Models With Pairwise Interactions IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we suggest adding selected terms of interacting pairs of features to standard GAMs. |
Yin Lou; Rich Caruana; Johannes Gehrke; Giles Hooker; |
2013 | 6 | Spotting Opinion Spammers Using Behavioral Footprints IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work proposes a novel angle to the problem by modeling spamicity as latent. |
ARJUN MUKHERJEE et. al. |
2013 | 7 | Learning Geographical Preferences For Point-of-interest Recommendation IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, in this paper, we propose a novel geographical probabilistic factor analysis framework which strategically takes various factors into consideration. |
Bin Liu; Yanjie Fu; Zijun Yao; Hui Xiong; |
2013 | 8 | Connecting Users Across Social Media Sites: A Behavioral-modeling Approach IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper aims to address the cross-media user identification problem. |
Reza Zafarani; Huan Liu; |
2013 | 9 | Why People Hate Your App: Making Sense Of User Feedback In A Mobile App Store IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose Wiscom, a system that can analyze tens of millions user ratings and comments in mobile app markets at three different levels of detail. |
BIN FU et. al. |
2013 | 10 | Online Controlled Experiments At Large Scale IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We discuss why negative experiments, which degrade the user experience short term, should be run, given the learning value and long-term benefits. |
RON KOHAVI et. al. |
2013 | 11 | LCARS: A Location-content-aware Recommender System IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose LCARS, a location-content-aware recommender system that offers a particular user a set of venues (e.g., restaurants) or events (e.g., concerts and exhibitions) by giving consideration to both personal interest and local preference. |
Hongzhi Yin; Yizhou Sun; Bin Cui; Zhiting Hu; Ling Chen; |
2013 | 12 | Fast And Scalable Polynomial Kernels Via Explicit Feature Maps IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Approximation of non-linear kernels using random feature mapping has been successfully employed in large-scale data analysis applications, accelerating the training of kernel … |
Ninh Pham; Rasmus Pagh; |
2013 | 13 | Simple And Deterministic Matrix Sketching IF:5 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we adapt a well known streaming algorithm for approximating item frequencies to the matrix sketching setting. |
Edo Liberty; |
2013 | 14 | Collaborative Matrix Factorization With Multiple Similarities For Predicting Drug-target Interactions IF:5 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a factor model, named Multiple Similarities Collaborative Matrix Factorization (MSCMF), which projects drugs and targets into a common low-rank feature space, which is further consistent with weighted similarity matrices over drugs and those over targets. |
Xiaodong Zheng; Hao Ding; Hiroshi Mamitsuka; Shanfeng Zhu; |
2013 | 15 | Denser Than The Densest Subgraph: Extracting Optimal Quasi-cliques With Quality Guarantees IF:5 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we define a novel density function, which gives subgraphs of much higher quality than densest subgraphs: the graphs found by our method are compact, dense, and with smaller diameter. |
Charalampos Tsourakakis; Francesco Bonchi; Aristides Gionis; Francesco Gullo; Maria Tsiarli; |
2012 | 1 | Discovering Regions Of Different Functions In A City Using Human Mobility And POIs IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a framework (titled DRoF) that Discovers Regions of different Functions in a city using both human mobility among regions and points of interest (POIs) located in a region. |
Jing Yuan; Yu Zheng; Xing Xie; |
2012 | 2 | Searching And Mining Trillions Of Time Series Subsequences Under Dynamic Time Warping IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work we show that by using a combination of four novel ideas we can search and mine truly massive time series for the first time. |
THANAWIN RAKTHANMANON et. al. |
2012 | 3 | Open Domain Event Extraction From Twitter IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper describes TwiCal, the first open-domain event-extraction and categorization system for Twitter. |
Alan Ritter; Oren Etzioni; Sam Clark; |
2012 | 4 | Information Diffusion And External Influence In Networks IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a model in which information can reach a node via the links of the social network or through the influence of external sources. |
Seth A. Myers; Chenguang Zhu; Jure Leskovec; |
2012 | 5 | Streaming Graph Partitioning For Large Distributed Graphs IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose natural, simple heuristics and compare their performance to hashing and METIS, a fast, offline heuristic. |
Isabelle Stanton; Gabriel Kliot; |
2012 | 6 | Intelligible Models For Classification And Regression IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present the first large-scale empirical comparison of existing methods for learning GAMs. |
Yin Lou; Rich Caruana; Johannes Gehrke; |
2012 | 7 | Review Spam Detection Via Temporal Pattern Discovery IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a hierarchical algorithm to robustly detect the time windows where such attacks are likely to have happened. |
Sihong Xie; Guan Wang; Shuyang Lin; Philip S. Yu; |
2012 | 8 | Circle-based Recommendation In Online Social Networks IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents an effort to develop circle-based RS. |
Xiwang Yang; Harald Steck; Yong Liu; |
2012 | 9 | Discovering Value From Community Activity On Focused Question Answering Sites: A Case Study Of Stack Overflow IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To better understand this shift in focus from one-off answers to a group knowledge-creation process, we consider a question together with its entire set of corresponding answers as our fundamental unit of analysis, in contrast with the focus on individual question-answer pairs that characterized previous work. |
Ashton Anderson; Daniel Huttenlocher; Jon Kleinberg; Jure Leskovec; |
2012 | 10 | Towards Social User Profiling: Unified And Discriminative Influence Model For Inferring Home Locations IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on the problem of profiling users’ home locations in the context of social network (Twitter). |
Rui Li; Shengjie Wang; Hongbo Deng; Rui Wang; Kevin Chen-Chuan Chang; |
2012 | 11 | Constructing Popular Routes From Uncertain Trajectories IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a Route Inference framework based on Collective Knowledge (abbreviated as RICK) to construct the popular routes from uncertain trajectories. |
Ling-Yin Wei; Yu Zheng; Wen-Chih Peng; |
2012 | 12 | A Shapelet Transform For Time Series Classification IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose disconnecting the process of finding shapelets from the classification algorithm by proposing a shapelet transformation. |
Jason Lines; Luke M. Davis; Jon Hills; Anthony Bagnall; |
2012 | 13 | Rise And Fall Patterns Of Information Diffusion: Model And Implications IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose SpikeM, a concise yet flexible analytical model for the rise and fall patterns of influence propagation. |
Yasuko Matsubara; Yasushi Sakurai; B. Aditya Prakash; Lei Li; Christos Faloutsos; |
2012 | 14 | Event-based Social Networks: Linking The Online And Offline Social Worlds IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We subsequently studied the heterogeneous nature (co-existence of both online and offline social interactions) of EBSNs on two challenging problems: community detection and information flow. |
XINGJIE LIU et. al. |
2012 | 15 | Cross-domain Collaboration Recommendation IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we analyze the cross-domain collaboration data from research publications and confirm the above patterns. |
Jie Tang; Sen Wu; Jimeng Sun; Hang Su; |
2011 | 1 | Friendship And Mobility: User Movement In Location-based Social Networks IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Using cell phone location data, as well as data from two online location-based social networks, we aim to understand what basic laws govern human motion and dynamics. |
Eunjoon Cho; Seth A. Myers; Jure Leskovec; |
2011 | 2 | Collaborative Topic Modeling For Recommending Scientific Articles IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we develop an algorithm to recommend scientific articles to users of an online community. |
Chong Wang; David M. Blei; |
2011 | 3 | Driving With Knowledge From The Physical World IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a Cloud-based system computing customized and practically fast driving routes for an end user using (historical and real-time) traffic conditions and driver behavior. |
Jing Yuan; Yu Zheng; Xing Xie; Guangzhong Sun; |
2011 | 4 | Human Mobility, Social Ties, And Link Prediction IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here we address this challenge for the first time by tracking the trajectories and communication records of 6 million mobile phone users. |
Dashun Wang; Dino Pedreschi; Chaoming Song; Fosca Giannotti; Albert-Laszlo Barabasi; |
2011 | 5 | Large-scale Matrix Factorization With Distributed Stochastic Gradient Descent IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We describe the practical techniques used to optimize performance in our DSGD implementation. |
Rainer Gemulla; Erik Nijkamp; Peter J. Haas; Yannis Sismanis; |
2011 | 6 | Exploiting Place Features In Link Prediction On Location-based Social Networks IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we study the problem of designing a link prediction system for online location-based social networks. |
Salvatore Scellato; Anastasios Noulas; Cecilia Mascolo; |
2011 | 7 | User-level Sentiment Analysis Incorporating Social Networks IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Employing Twitter as a source for our experimental data, and working within a semi-supervised framework, we propose models that are induced either from the Twitter follower/followee network or from the network in Twitter formed by users referring to each other using "@" mentions. |
CHENHAO TAN et. al. |
2011 | 8 | Leakage In Data Mining: Formulation, Detection, And Avoidance IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In our new approach, these cases and others are explained by explicitly defining modeling goals and analyzing the broader framework of the data mining problem. |
Shachar Kaufman; Saharon Rosset; Claudia Perlich; |
2011 | 9 | Differentially Private Data Release For Data Mining IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose the first anonymization algorithm for the non-interactive setting based on the generalization technique. |
Noman Mohammed; Rui Chen; Benjamin C.M. Fung; Philip S. Yu; |
2011 | 10 | Discovering Spatio-temporal Causal Interactions In Traffic Data Streams IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we propose algorithms which construct outlier causality trees based on temporal and spatial properties of detected outliers. |
Wei Liu; Yu Zheng; Sanjay Chawla; Jing Yuan; Xing Xie; |
2011 | 11 | Democrats, Republicans And Starbucks Afficionados: User Classification In Twitter IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We describe a general and robust machine learning framework for large-scale classification of social media users according to dimensions of interest. |
Marco Pennacchiotti; Ana-Maria Popescu; |
2011 | 12 | Integrating Low-rank And Group-sparse Structures For Robust Multi-task Learning IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a robust multi-task learning (RMTL) algorithm which learns multiple tasks simultaneously as well as identifies the irrelevant (outlier) tasks. |
Jianhui Chen; Jiayu Zhou; Jieping Ye; |
2011 | 13 | Latent Aspect Rating Analysis Without Aspect Keyword Supervision IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a unified generative model for LARA, which does not need pre-specified aspect keywords and simultaneously mines 1) latent topical aspects, 2) ratings on each identified aspect, and 3) weights placed on different aspects by a reviewer. |
Hongning Wang; Yue Lu; ChengXiang Zhai; |
2011 | 14 | Logical-shapelets: An Expressive Primitive For Time Series Classification IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we address the latter problem by introducing a novel algorithm that finds shapelets in less time than current methods by an order of magnitude. |
Abdullah Mueen; Eamonn Keogh; Neal Young; |
2011 | 15 | K-NN As An Implementation Of Situation Testing For Discrimination Discovery And Prevention IF:5 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: With the support of the legally-grounded methodology of situation testing, we tackle the problems of discrimination discovery and prevention from a dataset of historical decisions by adopting a variant of k-NN classification. |
Binh Thanh Luong; Salvatore Ruggieri; Franco Turini; |
2010 | 1 | Scalable Influence Maximization For Prevalent Viral Marketing In Large-scale Social Networks IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we design a new heuristic algorithm that is easily scalable to millions of nodes and edges in our experiments. |
Wei Chen; Chi Wang; Yajun Wang; |
2010 | 2 | Unsupervised Feature Selection For Multi-cluster Data IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we consider the feature selection problem in unsupervised learning scenario, which is particularly difficult due to the absence of class labels that would guide the search for relevant information. |
Deng Cai; Chiyuan Zhang; Xiaofei He; |
2010 | 3 | Latent Aspect Rating Analysis On Review Text Data: A Rating Regression Approach IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we define and study a new opinionated text data analysis problem called Latent Aspect Rating Analysis (LARA), which aims at analyzing opinions expressed about an entity in an online review at the level of topical aspects to discover each individual reviewer’s latent opinion on each aspect as well as the relative emphasis on different aspects when forming the overall judgment of the entity. |
Hongning Wang; Yue Lu; ChengXiang Zhai; |
2010 | 4 | New Perspectives And Methods In Link Prediction IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we consider these factors by first motivating the use of a supervised framework through a careful investigation of issues such as network observational period, generality of existing methods, variance reduction, topological causes and degrees of imbalance, and sampling approaches. |
Ryan N. Lichtenwalter; Jake T. Lussier; Nitesh V. Chawla; |
2010 | 5 | Data Mining With Differential Privacy IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We consider the problem of data mining with formal privacy guarantees, given a data access interface based on the differential privacy framework. |
Arik Friedman; Assaf Schuster; |
2010 | 6 | Community-based Greedy Algorithm For Mining Top-K Influential Nodes In Mobile Social Networks IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we propose a new algorithm called Community-based Greedy algorithm for mining top-K influential nodes. |
Yu Wang; Gao Cong; Guojie Song; Kunqing Xie; |
2010 | 7 | UP-Growth: An Efficient Algorithm For High Utility Itemset Mining IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose an efficient algorithm, namely UP-Growth (Utility Pattern Growth), for mining high utility itemsets with a set of techniques for pruning candidate itemsets. |
Vincent S. Tseng; Cheng-Wei Wu; Bai-En Shie; Philip S. Yu; |
2010 | 8 | The Community-search Problem And How To Plan A Successful Cocktail Party IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we study a query-dependent variant of the community-detection problem, which we call the community-search problem: given a graph G, and a set of query nodes in the graph, we seek to find a subgraph of G that contains the query nodes and is densely connected. |
Mauro Sozio; Aristides Gionis; |
2010 | 9 | Multi-label Learning By Exploiting Label Dependency IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose to use a Bayesian network structure to efficiently encode the conditional dependencies of the labels as well as the feature set, with the feature set as the common parent of all labels. |
Min-Ling Zhang; Kun Zhang; |
2010 | 10 | Temporal Recommendation On Graphs Via Long- And Short-term Preference Fusion IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Based on the STG model framework, we propose a novel recommendation algorithm Injected Preference Fusion (IPF) and extend the personalized Random Walk for temporal recommendation. |
LIANG XIANG et. al. |
2010 | 11 | An Energy-efficient Mobile Recommender System IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To that end, in this paper, we provide a focused study of extracting energy-efficient transportation patterns from location traces. |
YONG GE et. al. |
2010 | 12 | Training And Testing Of Recommender Systems On Data Missing Not At Random IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To test recommender systems, we present two performance measures that can be estimated, under mild assumptions, without bias from data even when ratings are missing not at random (MNAR). |
Harald Steck; |
2010 | 13 | Mining Periodic Behaviors For Moving Objects IF:5 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we address the problem of mining periodic behaviors for moving objects. |
Zhenhui Li; Bolin Ding; Jiawei Han; Roland Kays; Peter Nye; |
2010 | 14 | Overlapping Experiment Infrastructure: More, Better, Faster Experimentation IF:5 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we describe Google’s overlapping experiment infrastructure that is a key component to solving these problems. |
Diane Tang; Ashish Agarwal; Deirdre O’Brien; Mike Meyer; |
2010 | 15 | Combining Predictions For Accurate Recommender Systems IF:5 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: For our analysis we use a set of diverse state-of-the-art collaborative filtering (CF) algorithms, which include: SVD, Neighborhood Based Approaches, Restricted Boltzmann Machine, Asymmetric Factor Model and Global Effects. |
Michael Jahrer; Andreas Töscher; Robert Legenstein; |
2009 | 1 | Efficient Influence Maximization In Social Networks IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we study the efficient influence maximization from two complementary directions. |
Wei Chen; Yajun Wang; Siyu Yang; |
2009 | 2 | Collaborative Filtering With Temporal Dynamics IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Thus, modeling temporal dynamics should be a key when designing recommender systems or general customer preference models. |
Yehuda Koren; |
2009 | 3 | Meme-tracking And The Dynamics Of The News Cycle IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We develop a framework for tracking short, distinctive phrases that travel relatively intact through on-line text; developing scalable algorithms for clustering textual variants of such phrases, we identify a broad class of memes that exhibit wide spread and rich variation on a daily basis. |
Jure Leskovec; Lars Backstrom; Jon Kleinberg; |
2009 | 4 | Social Influence Analysis In Large-scale Networks IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address these fundamental questions, we propose Topical Affinity Propagation (TAP) to model the topic-level social influence on large networks. |
Jie Tang; Jimeng Sun; Chi Wang; Zi Yang; |
2009 | 5 | Time Series Shapelets: A New Primitive For Data Mining IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work we introduce a new time series primitive, time series shapelets, which addresses these limitations. |
Lexiang Ye; Eamonn Keogh; |
2009 | 6 | TrustWalker: A Random Walk Model For Combining Trust-based And Item-based Recommendation IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In order to find a good trade-off, we propose a random walk model combining the trust-based and the collaborative filtering approach for recommendation. |
Mohsen Jamali; Martin Ester; |
2009 | 7 | Beyond Blacklists: Learning To Detect Malicious Web Sites From Suspicious URLs IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we describe an approach to this problem based on automated URL classification, using statistical methods to discover the tell-tale lexical and host-based properties of malicious Web site URLs. |
Justin Ma; Lawrence K. Saul; Stefan Savage; Geoffrey M. Voelker; |
2009 | 8 | Differentially Private Recommender Systems: Building Privacy Into The Netflix Prize Contenders IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Such algorithms necessarily introduce uncertainty (i.e., noise) to computations, trading accuracy for privacy. |
Frank McSherry; Ilya Mironov; |
2009 | 9 | Relational Learning Via Latent Social Dimensions IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We conduct extensive experiments on social media data (one from a real-world blog site and the other from a popular content sharing site). |
Lei Tang; Huan Liu; |
2009 | 10 | Finding A Team Of Experts In Social Networks IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Given a task T, a pool of individuals X with different skills, and a social network G that captures the compatibility among these individuals, we study the problem of finding X′, a subset of X, to perform the task. |
Theodoros Lappas; Kun Liu; Evimaria Terzi; |
2009 | 11 | WhereNext: A Location Predictor On Trajectory Pattern Mining IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we propose WhereNext, which is a method aimed at predicting with a certain level of accuracy the next location of a moving object. In addition, we propose a set of other measures, that evaluate a priori the predictive power of a set of Trajectory Patterns. |
Anna Monreale; Fabio Pinelli; Roberto Trasarti; Fosca Giannotti; |
2009 | 12 | New Ensemble Methods For Evolving Data Streams IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a new experimental data stream framework for studying concept drift, and two new variants of Bagging: ADWIN Bagging and Adaptive-Size Hoeffding Tree (ASHT) Bagging. |
Albert Bifet; Geoff Holmes; Bernhard Pfahringer; Richard Kirkby; Ricard Gavaldà; |
2009 | 13 | Sentiment Analysis Of Blogs By Combining Lexical Knowledge With Text Classification IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a unified framework in which one can use background lexical information in terms of word-class associations, and refine this information for specific domains using any available training examples. |
Prem Melville; Wojciech Gryc; Richard D. Lawrence; |
2009 | 14 | Regression-based Latent Factor Models IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a novel latent factor model to accurately predict response for large scale dyadic data in the presence of features. |
Deepak Agarwal; Bee-Chung Chen; |
2009 | 15 | Ranking-based Clustering Of Heterogeneous Information Networks With Star Network Schema IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we study clustering of multi-typed heterogeneous networks with a star network schema and propose a novel algorithm, NetClus, that utilizes links across multi-typed objects to generate high-quality net-clusters. |
Yizhou Sun; Yintao Yu; Jiawei Han; |
2008 | 1 | Factorization Meets The Neighborhood: A Multifaceted Collaborative Filtering Model IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work we introduce some innovations to both approaches. |
Yehuda Koren; |
2008 | 2 | ArnetMiner: Extraction And Mining Of Academic Social Networks IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We integrate publications from online Web databases and propose a probabilistic framework to deal with the name ambiguity problem. |
JIE TANG et. al. |
2008 | 3 | Get Another Label? Improving Data Quality And Data Mining Using Multiple, Noisy Labelers IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: (iv) Repeatedly labeling a carefully chosen set of points is generally preferable, and we present a robust technique that combines different notions of uncertainty to select data points for which quality should be improved. |
Victor S. Sheng; Foster Provost; Panagiotis G. Ipeirotis; |
2008 | 4 | Relational Learning Via Collective Matrix Factorization IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose a collective matrix factorization model: we simultaneously factor several matrices, sharing parameters among factors when an entity participates in multiple relations. |
Ajit P. Singh; Geoffrey J. Gordon; |
2008 | 5 | Learning Classifiers From Only Positive And Unlabeled Data IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The problem solved in this paper is how to learn a standard binary classifier given a nontraditional training set of this nature. |
Charles Elkan; Keith Noto; |
2008 | 6 | Microscopic Evolution Of Social Networks IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a detailed study of network evolution by analyzing four large online social networks with full temporal information about node and edge arrivals. |
Jure Leskovec; Lars Backstrom; Ravi Kumar; Andrew Tomkins; |
2008 | 7 | Angle-based Outlier Detection In High-dimensional Data IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel approach named ABOD (Angle-Based Outlier Detection) and some variants assessing the variance in the angles between the difference vectors of a point to the other points. |
Hans-Peter Kriegel; Matthias Schubert; Arthur Zimek; |
2008 | 8 | Influence And Correlation In Social Networks IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we study this problem systematically. |
Aris Anagnostopoulos; Ravi Kumar; Mohammad Mahdian; |
2008 | 9 | Discrimination-aware Data Mining IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, the notion of discriminatory classification rules is introduced and studied. |
Dino Pedreschi; Salvatore Ruggieri; Franco Turini; |
2008 | 10 | Feedback Effects Between Similarity And Social Influence In Online Communities IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We develop techniques for identifying and modeling the interactions between social influence and selection, using data from online communities where both social interaction and changes in behavior over time can be measured. |
David Crandall; Dan Cosley; Daniel Huttenlocher; Jon Kleinberg; Siddharth Suri; |
2008 | 11 | Context-aware Query Suggestion By Mining Click-through And Session Data IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel two-step context-aware query suggestion approach. |
HUANHUAN CAO et. al. |
2008 | 12 | Fast Collapsed Gibbs Sampling For Latent Dirichlet Allocation IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we introduce a novel collapsed Gibbs sampling method for the widely used latent Dirichlet allocation (LDA) model. |
IAN PORTEOUS et. al. |
2008 | 13 | Joint Latent Topic Models For Text And Citations IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we address the problem of joint modeling of text and citations in the topic modeling framework. |
Ramesh M. Nallapati; Amr Ahmed; Eric P. Xing; William W. Cohen; |
2008 | 14 | Composition Attacks And Auxiliary Information In Data Privacy IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper explores how one can reason about privacy in the face of rich, realistic sources of auxiliary information. |
Srivatsava Ranjit Ganta; Shiva Prasad Kasiviswanathan; Adam Smith; |
2008 | 15 | ISAX: Indexing And Mining Terabyte Sized Time Series IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we show how a novel multi-resolution symbolic representation can be used to index datasets which are several orders of magnitude larger than anything else considered in the literature. |
Jin Shieh; Eamonn Keogh; |
2007 | 1 | Learning Bayesian Networks IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: No abstract available. … |
Richard E. Neapolitan; |
2007 | 2 | Cost-effective Outbreak Detection In Networks IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a general methodology for near optimal sensor placement in these and related problems. |
JURE LESKOVEC et. al. |
2007 | 3 | Trajectory Pattern Mining IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we move towards this direction and develop an extension of the sequential pattern mining paradigm that analyzes the trajectories of moving objects. |
Fosca Giannotti; Mirco Nanni; Fabio Pinelli; Dino Pedreschi; |
2007 | 4 | SCAN: A Structural Clustering Algorithm For Networks IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we proposed a novel algorithm called SCAN (Structural Clustering Algorithm for Networks), which detects clusters, hubs and outliers in networks. |
Xiaowei Xu; Nurcan Yuruk; Zhidan Feng; Thomas A. J. Schweiger; |
2007 | 5 | Truth Discovery With Multiple Conflicting Information Providers On The Web IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we propose a new problem called Veracity, i.e., conformity to truth, which studies how to find true facts from a large amount of conflicting information on many subjects that is provided by various web sites. |
Xiaoxin Yin; Jiawei Han; Philip S. Yu; |
2007 | 6 | GraphScope: Parameter-free Mining Of Large Time-evolving Graphs IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose GraphScope, that addresses both problems, using information theoretic principles. |
Jimeng Sun; Christos Faloutsos; Spiros Papadimitriou; Philip S. Yu; |
2007 | 7 | Density-based Clustering For Real-time Stream Data IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address these issues, this paper proposes D-Stream, a framework for clustering stream data using a density-based approach. |
Yixin Chen; Li Tu; |
2007 | 8 | A Framework For Community Identification In Dynamic Social Networks IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose frameworks and algorithms for identifying communities in social networks that change over time. |
Chayant Tantipathananandh; Tanya Berger-Wolf; David Kempe; |
2007 | 9 | Automatic Labeling Of Multinomial Topic Models IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose probabilistic approaches to automatically labeling multinomial topic models in an objective way. |
Qiaozhu Mei; Xuehua Shen; ChengXiang Zhai; |
2007 | 10 | An Event-based Framework For Characterizing The Evolutionary Behavior Of Interaction Graphs IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present an event-based characterization of critical behavioral patterns for temporally varying interaction graphs. |
Sitaram Asur; Srinivasan Parthasarathy; Duygu Ucar; |
2007 | 11 | Modeling Relationships At Multiple Scales To Improve Accuracy Of Large Recommender Systems IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose novel algorithms for predicting user ratings of items by integrating complementary models that focus on patterns at different scales. |
Robert Bell; Yehuda Koren; Chris Volinsky; |
2007 | 12 | Evolutionary Spectral Clustering By Incorporating Temporal Smoothness IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose two frameworks that incorporate temporal smoothness in evolutionary spectral clustering. |
Yun Chi; Xiaodan Song; Dengyong Zhou; Koji Hino; Belle L. Tseng; |
2007 | 13 | Extracting Semantic Relations From Query Logs IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we study a large query log of more than twenty million queries with the goal of extracting the semantic relations that are implicitly captured in the actions of users submitting queries and clicking answers. |
Ricardo Baeza-Yates; Alessandro Tiberi; |
2007 | 14 | Practical Guide To Controlled Experiments On The Web: Listen To Your Customers Not To The Hippo IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We describe common architectures for experimentation systems and analyze their advantages and disadvantages. |
Ron Kohavi; Randal M. Henne; Dan Sommerfield; |
2007 | 15 | Co-clustering Based Classification For Out-of-domain Documents IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we address this problem for a text-mining task, where the labeled data are under one distribution in one domain known as in-domain data, while the unlabeled data are under a related but different domain known as out-of-domain data. |
Wenyuan Dai; Gui-Rong Xue; Qiang Yang; Yong Yu; |
2006 | 1 | Training Linear SVMs In Linear Time IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a Cutting Plane Algorithm for training linear SVMs that provably has training time O(sn) for classification problems and O(sn log(n)) for ordinal regression problems. |
Thorsten Joachims; |
2006 | 2 | Group Formation In Large Social Networks: Membership, Growth, And Evolution IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We use decision-tree techniques to identify the most significant structural determinants of these properties. |
Lars Backstrom; Dan Huttenlocher; Jon Kleinberg; Xiangyang Lan; |
2006 | 3 | Structure And Evolution Of Online Social Networks IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we consider the evolution of structure within large online social networks. |
Ravi Kumar; Jasmine Novak; Andrew Tomkins; |
2006 | 4 | Topics Over Time: A Non-Markov Continuous-time Model Of Topical Trends IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents an LDA-style topic model that captures not only the low-dimensional structure of data, but also how the structure changes over time. |
Xuerui Wang; Andrew McCallum; |
2006 | 5 | YALE: Rapid Prototyping For Complex Data Mining Tasks IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: These case studies cover tasks like feature engineering, text mining, data stream mining and tracking drifting concepts, ensemble methods and distributed data mining. |
Ingo Mierswa; Michael Wurst; Ralf Klinkenberg; Martin Scholz; Timm Euler; |
2006 | 6 | Sampling From Large Graphs IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We consider several sampling methods, propose novel methods to check the goodness of sampling, and develop a set of scaling laws that describe relations between the properties of the original and the sample. In addition to the theoretical contributions, the practical conclusions from our work are: Sampling strategies based on edge selection do not perform well; simple uniform random node selection performs surprisingly well. |
Jure Leskovec; Christos Faloutsos; |
2006 | 7 | Orthogonal Nonnegative Matrix T-factorizations For Clustering IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We study the orthogonality constraint because it leads to rigorous clustering interpretation. |
Chris Ding; Tao Li; Wei Peng; Haesun Park; |
2006 | 8 | Model Compression IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a method for "compressing" large, complex ensembles into smaller, faster models, usually without significant loss in performance. |
Cristian Buciluǎ; Rich Caruana; Alexandru Niculescu-Mizil; |
2006 | 9 | (α, K)-anonymity: An Enhanced K-anonymity Model For Privacy Preserving Data Publishing IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose an (α, k)-anonymity model to protect both identifications and relationships to sensitive information in data. |
Raymond Chi-Wing Wong; Jiuyong Li; Ada Wai-Chee Fu; Ke Wang; |
2006 | 10 | Evolutionary Clustering IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a generic framework for this problem, and discuss evolutionary versions of two widely-used clustering algorithms within this framework: k-means and agglomerative hierarchical clustering. |
Deepayan Chakrabarti; Ravi Kumar; Andrew Tomkins; |
2006 | 11 | Very Sparse Random Projections IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: There has been considerable interest in random projections, an approximate algorithm for estimating distances between pairs of points in a high-dimensional vector space. Let A in … |
Ping Li; Trevor J. Hastie; Kenneth W. Church; |
2006 | 12 | Beyond Streams And Graphs: Dynamic Tensor Analysis IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Thus, we introduce the dynamic tensor analysis (DTA) method, and its variants. |
Jimeng Sun; Dacheng Tao; Christos Faloutsos; |
2006 | 13 | GPLAG: Detection Of Software Plagiarism By Program Dependence Graph Analysis IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we develop a new plagiarism detection tool, called GPLAG, which detects plagiarism by mining program dependence graphs (PDGs). |
Chao Liu; Chen Chen; Jiawei Han; Philip S. Yu; |
2006 | 14 | Utility-based Anonymization Using Local Recoding IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The framework covers both numeric and categorical data. |
JIAN XU et. al. |
2006 | 15 | Mining Long-term Search History To Improve Search Accuracy IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we study statistical language modeling based methods to mine contextual information from long-term search history and exploit it for a more accurate estimate of the query language model. |
Bin Tan; Xuehua Shen; ChengXiang Zhai; |
2005 | 1 | Graphs Over Time: Densification Laws, Shrinking Diameters And Possible Explanations IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We provide a new graph generator, based on a "forest fire" spreading process, that has a simple, intuitive justification, requires very few parameters (like the "flammability" of nodes), and produces graphs exhibiting the full range of properties observed both in prior work and in the present study. |
Jure Leskovec; Jon Kleinberg; Christos Faloutsos; |
2005 | 2 | Adversarial Learning IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce the adversarial classifier reverse engineering (ACRE) learning problem, the task of learning sufficient information about a classifier to construct adversarial attacks. |
Daniel Lowd; Christopher Meek; |
2005 | 3 | Feature Bagging For Outlier Detection IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, a novel feature bagging approach for detecting outliers in very large, high dimensional and noisy databases is proposed. |
Aleksandar Lazarevic; Vipin Kumar; |
2005 | 4 | Discovering Evolutionary Theme Patterns From Text: An Exploration Of Temporal Text Mining IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we study a particular TTM task — discovering and summarizing the evolutionary patterns of themes in a text stream. |
Qiaozhu Mei; ChengXiang Zhai; |
2005 | 5 | Query Chains: Learning To Rank From Implicit Feedback IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a novel approach for using clickthrough data to learn ranked retrieval functions for web search results. |
Filip Radlinski; Thorsten Joachims; |
2005 | 6 | The Predictive Power Of Online Chatter IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: An increasing fraction of the global discourse is migrating online in the form of blogs, bulletin boards, web pages, wikis, editorials, and a dizzying array of new collaborative … |
Daniel Gruhl; R. Guha; Ravi Kumar; Jasmine Novak; Andrew Tomkins; |
2005 | 7 | Privacy-preserving Distributed K-means Clustering Over Arbitrarily Partitioned Data IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our paper makes two contributions in privacy-preserving data mining. |
Geetha Jagannathan; Rebecca N. Wright; |
2005 | 8 | Evaluating Similarity Measures: A Large-scale Study In The Orkut Social Network IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present an extensive empirical comparison of six distinct measures of similarity for recommending online communities to members of the Orkut social network. |
Ellen Spertus; Mehran Sahami; Orkut Buyukkokten; |
2005 | 9 | Density-based Clustering Of Uncertain Data IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose to express the similarity between two fuzzy objects by distance probability functions. |
Hans-Peter Kriegel; Martin Pfeifle; |
2005 | 10 | On Mining Cross-graph Quasi-cliques IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Such clusters may be potential pathways. In this paper, we investigate a novel data mining problem, mining cross-graph quasi-cliques, which is generalized from several interesting applications such as cross-market customer segmentation and joint mining of gene expression data and protein interaction data. |
Jian Pei; Daxin Jiang; Aidong Zhang; |
2005 | 11 | Deriving Marketing Intelligence From Online Discussion IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents such a system that gathers and annotates online discussion relating to consumer products using a wide variety of state-of-the-art techniques, including crawling, wrapping, search, text classification and computational linguistics. |
NATALIE GLANCE et. al. |
2005 | 12 | Dynamic Syslog Mining For Network Failure Monitoring IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a new methodology of dynamic syslog mining in order to detect failure symptoms with higher confidence and to discover sequential alarm patterns among computer devices. |
Kenji Yamanishi; Yuko Maruyama; |
2005 | 13 | An Approach To Spacecraft Anomaly Detection Problem Using Kernel Feature Space IF:5 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: A reasonable alternative to this conventional anomaly detection method is to reuse a vast amount of telemetry data which is multi-dimensional time-series continuously produced from a number of system components in the spacecraft. This paper proposes a novel "knowledge-free" anomaly detection method for spacecraft based on Kernel Feature Space and directional distribution, which constructs a system behavior model from the past normal telemetry data from a set of telemetry data in normal operation and monitors the current system status by checking incoming data with the model. In this method, we regard anomaly phenomena as unexpected changes of causal associations in the spacecraft system, and hypothesize that the significant causal associations inside the system will appear in the form of principal component directions in a high-dimensional non-linear feature space which is constructed by a kernel function and a set of data. We have confirmed the effectiveness of the proposed anomaly detection method by applying it to the telemetry data obtained from a simulator of an orbital transfer vehicle designed to make a rendezvous maneuver with the International Space Station. |
Ryohei Fujimaki; Takehisa Yairi; Kazuo Machida; |
2005 | 14 | Model-based Overlapping Clustering IF:5 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we interpret an overlapping clustering model proposed by Segal et al. [23] as a generalization of Gaussian mixture models, and we extend it to an overlapping clustering model based on mixtures of any regular exponential family distribution and the corresponding Bregman divergence. |
Arindam Banerjee; Chase Krumpelman; Joydeep Ghosh; Sugato Basu; Raymond J. Mooney; |
2005 | 15 | Summarizing Itemset Patterns: A Profile-based Approach IF:5 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Based on the restoration error, we propose a quality measure function to determine the optimal value of parameter K. Polynomial time algorithms are developed together with several optimization heuristics for efficiency improvement. |
Xifeng Yan; Hong Cheng; Jiawei Han; Dong Xin; |
2004 | 1 | Mining And Summarizing Customer Reviews IF:10 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research, we aim to mine and to summarize all the customer reviews of a product. |
Minqing Hu; Bing Liu; |
2004 | 2 | Regularized Multi–task Learning IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we present an approach to multi–task learning based on the minimization of regularization functionals similar to existing ones, such as the one for Support Vector Machines (SVMs), that have been successfully used in the past for single–task learning. |
Theodoros Evgeniou; Massimiliano Pontil; |
2004 | 3 | Kernel K-means: Spectral Clustering And Normalized Cuts IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we give an explicit theoretical connection between them. |
Inderjit S. Dhillon; Yuqiang Guan; Brian Kulis; |
2004 | 4 | Adversarial Classification IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we develop a formal framework and algorithms for this problem. |
Nilesh Dalvi; Pedro Domingos; Sumit Sanghai; Deepak Verma; |
2004 | 5 | A Probabilistic Framework For Semi-supervised Clustering IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a probabilistic model for semi-supervised clustering based on Hidden Markov Random Fields (HMRFs) that provides a principled framework for incorporating supervision into prototype-based clustering. |
Sugato Basu; Mikhail Bilenko; Raymond J. Mooney; |
2004 | 6 | Probabilistic Author-topic Models For Information Discovery IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a new unsupervised learning technique for extracting information from large text collections. |
Mark Steyvers; Padhraic Smyth; Michal Rosen-Zvi; Thomas Griffiths; |
2004 | 7 | Towards Parameter-free Data Mining IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we show that recent results in bioinformatics and computational theory hold great promise for a parameter-free data-mining paradigm. |
Eamonn Keogh; Stefano Lonardi; Chotirat Ann Ratanamahatana; |
2004 | 8 | Learning To Detect Malicious Executables In The Wild IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we describe the development of a fielded application for detecting malicious executables in the wild. |
Jeremy Z. Kolter; Marcus A. Maloof; |
2004 | 9 | A Quickstart In Frequent Structure Mining Can Make A Difference IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce the GrAph/Sequence/Tree extractiON (Gaston) algorithm that implements this idea by searching first for frequent paths, then frequent free trees and finally cyclic graphs. |
Siegfried Nijssen; Joost N. Kok; |
2004 | 10 | Automatic Multimedia Cross-modal Correlation Discovery IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a novel, graph-based approach, "MMG", to discover such cross-modal correlations. Our "MMG" method requires no tuning, no clustering, no user-determined constants; it can be applied to any multimedia collection, as long as we have a similarity function for each medium; and it scales linearly with the database size. |
Jia-Yu Pan; Hyung-Jeong Yang; Christos Faloutsos; Pinar Duygulu; |
2004 | 11 | A Generalized Maximum Entropy Approach To Bregman Co-clustering And Matrix Approximation IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a substantially generalized co-clustering framework wherein any Bregman divergence can be used in the objective function, and various conditional expectation based constraints can be considered based on the statistics that need to be preserved. |
Arindam Banerjee; Inderjit Dhillon; Joydeep Ghosh; Srujana Merugu; Dharmendra S. Modha; |
2004 | 12 | Fast Discovery Of Connection Subgraphs IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a formal definition of this problem, and an ideal solution based on electricity analogues. |
Christos Faloutsos; Kevin S. McCurley; Andrew Tomkins; |
2004 | 13 | SPIN: Mining Maximal Frequent Subgraphs From Graph Databases IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a new algorithm that mines only maximal frequent subgraphs, i.e. subgraphs that are not a part of any other frequent subgraphs. |
Jun Huan; Wei Wang; Jan Prins; Jiong Yang; |
2004 | 14 | Cyclic Pattern Kernels For Predictive Graph Mining IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In contrast to these approaches, we propose a kernel function based on a natural set of cyclic and tree patterns independent of their frequency, and discuss its computational aspects. |
Tamás Horváth; Thomas Gärtner; Stefan Wrobel; |
2004 | 15 | Data Mining In Metric Space: An Empirical Analysis Of Supervised Learning Performance Criteria IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a new metric, SAR, that combines squared error, accuracy, and ROC area into one metric. |
Rich Caruana; Alexandru Niculescu-Mizil; |
2003 | 1 | Maximizing The Spread Of Influence Through A Social Network IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Models for the processes by which ideas and influence propagate through a social network have been studied in a number of domains, including the diffusion of medical and technological innovations, the sudden and widespread adoption of various strategies in game-theoretic settings, and the effects of "word of mouth" in the promotion of new products. |
David Kempe; Jon Kleinberg; Éva Tardos; |
2003 | 2 | Mining Concept-drifting Data Streams Using Ensemble Classifiers IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a general framework for mining concept-drifting data streams using weighted ensemble classifiers. |
Haixun Wang; Wei Fan; Philip S. Yu; Jiawei Han; |
2003 | 3 | Information-theoretic Co-clustering IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present an innovative co-clustering algorithm that monotonically increases the preserved mutual information by intertwining both the row and column clusterings at all stages. |
Inderjit S. Dhillon; Subramanyam Mallela; Dharmendra S. Modha; |
2003 | 4 | Adaptive Duplicate Detection Using Learnable String Similarity Measures IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a framework for improving duplicate detection using trainable measures of textual similarity. |
Mikhail Bilenko; Raymond J. Mooney; |
2003 | 5 | CloseGraph: Mining Closed Frequent Graph Patterns IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Instead of mining all the subgraphs, we propose to mine closed frequent graph patterns. |
Xifeng Yan; Jiawei Han; |
2003 | 6 | Mining Distance-based Outliers In Near Linear Time With Randomization And A Simple Pruning Rule IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We show that a simple nested loop algorithm that in the worst case is quadratic can give near linear time performance when the data is in random order and a simple pruning rule is used. |
Stephen D. Bay; Mark Schwabacher; |
2003 | 7 | CLOSET+: Searching For The Best Strategies For Mining Frequent Closed Itemsets IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we answer the above questions by a systematic study of the search strategies and develop a winning algorithm CLOSET+. |
Jianyong Wang; Jiawei Han; Jian Pei; |
2003 | 8 | Fast Vertical Mining Using Diffsets IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The main problem with these approaches is when intermediate results of vertical tid lists become too large for memory, thus affecting the algorithm scalability. In this paper we present a novel vertical data representation called Diffset, that only keeps track of differences in the tids of a candidate pattern from its generating frequent patterns. |
Mohammed J. Zaki; Karam Gouda; |
2003 | 9 | Probabilistic Discovery Of Time Series Motifs IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Two limitations of this work were the poor scalability of the motif discovery algorithm, and the inability to discover motifs in the presence of noise. Here we address these limitations by introducing a novel algorithm inspired by recent advances in the problem of pattern discovery in biosequences. |
Bill Chiu; Eamonn Keogh; Stefano Lonardi; |
2003 | 10 | Mining Data Records In Web Pages IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a more effective technique to perform the task. |
Bing Liu; Robert Grossman; Yanhong Zhai; |
2003 | 11 | Graph-based Anomaly Detection IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce two techniques for graph-based anomaly detection. |
Caleb C. Noble; Diane J. Cook; |
2003 | 12 | Algorithms For Estimating Relative Importance In Networks IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We investigate the problem of answering such queries in this paper, focusing in particular on defining and computing the importance of nodes in a graph relative to one or more root nodes. |
Scott White; Padhraic Smyth; |
2003 | 13 | Weighted Association Rule Mining Using Weighted Support And Significance Framework IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We address the issues of discovering significant binary relationships in transaction datasets in a weighted setting. |
Feng Tao; Fionn Murtagh; Mohsen Farid; |
2003 | 14 | Indexing Multi-dimensional Time-series With Support For Multiple Distance Measures IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Although most time-series data mining research has concentrated on providing solutions for a single distance function, in this work we motivate the need for a single index … |
Michail Vlachos; Marios Hadjieleftheriou; Dimitrios Gunopulos; Eamonn Keogh; |
2003 | 15 | Finding Recent Frequent Itemsets Adaptively Over Online Data Streams IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a data mining method for finding recent frequent itemsets adaptively over an online data stream. |
Joong Hyuk Chang; Won Suk Lee; |
2002 | 1 | Optimizing Search Engines Using Clickthrough Data IF:10 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents an approach to automatically optimizing the retrieval quality of search engines using clickthrough data. |
Thorsten Joachims; |
2002 | 2 | Bursty And Hierarchical Structure In Streams IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Underlying much of the text mining work in this area is the following intuitive premise — that the appearance of a topic in a document stream is signaled by a "burst of activity," with certain features rising sharply in frequency as the topic emerges. The goal of the present work is to develop a formal approach for modeling such "bursts," in such a way that they can be robustly and efficiently identified, and can provide an organizational framework for analyzing the underlying content. |
Jon Kleinberg; |
2002 | 3 | SimRank: A Measure Of Structural-context Similarity IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a complementary approach, applicable in any domain with object-to-object relationships, that measures similarity of the structural context in which objects occur, based on their relationships with other objects. |
Glen Jeh; Jennifer Widom; |
2002 | 4 | Mining Knowledge-sharing Sites For Viral Marketing IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we extend our previous techniques, achieving a large reduction in computational cost, and apply them to data from a knowledge-sharing site. |
Matthew Richardson; Pedro Domingos; |
2002 | 5 | On The Need For Time Series Data Mining Benchmarks: A Survey And Empirical Demonstration IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work we make the following claim. |
Eamonn Keogh; Shruti Kasetty; |
2002 | 6 | Sequential PAttern Mining Using A Bitmap Representation IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a new algorithm for mining sequential patterns. |
Jay Ayres; Jason Flannick; Johannes Gehrke; Tomi Yiu; |
2002 | 7 | Transforming Classifier Scores Into Accurate Multiclass Probability Estimates IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here, we show how to obtain accurate probability estimates for multiclass problems by combining calibrated binary probability estimates. |
Bianca Zadrozny; Charles Elkan; |
2002 | 8 | Privacy Preserving Mining Of Association Rules IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a framework for mining association rules from transactions consisting of categorical items where the data has been randomized to preserve privacy of individual transactions. |
Alexandre Evfimievski; Ramakrishnan Srikant; Rakesh Agrawal; Johannes Gehrke; |
2002 | 9 | Selecting The Right Interestingness Measure For Association Patterns IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present an overview of various measures proposed in the statistics, machine learning and data mining literature. |
Pang-Ning Tan; Vipin Kumar; Jaideep Srivastava; |
2002 | 10 | Privacy Preserving Association Rule Mining In Vertically Partitioned Data IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a two-party algorithm for efficiently discovering frequent itemsets with minimum support levels, without either site revealing individual transaction values. |
Jaideep Vaidya; Chris Clifton; |
2002 | 11 | Transforming Data To Satisfy Privacy Constraints IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper addresses the important issue of preserving the anonymity of the individuals or entities during the data dissemination process. |
Vijay S. Iyengar; |
2002 | 12 | Interactive Deduplication Using Active Learning IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We investigate various design issues that arise in building a system to provide interactive response, fast convergence, and interpretable output. |
Sunita Sarawagi; Anuradha Bhamidipaty; |
2002 | 13 | Discovering Word Senses From Text IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a clustering algorithm called CBC (Clustering By Committee) that automatically discovers word senses from text. |
Patrick Pantel; Dekang Lin; |
2002 | 14 | Frequent Term-based Text Clustering IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a novel approach which uses frequent item (term) sets for text clustering. |
Florian Beil; Martin Ester; Xiaowei Xu; |
2002 | 15 | Efficiently Mining Frequent Trees In A Forest IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present TREEMINER, a novel algorithm to discover all frequent subtrees in a forest, using a new data structure called scope-list. |
Mohammed J. Zaki; |
2001 | 1 | Mining The Network Value Of Customers IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: So far, work in this area has considered only the intrinsic value of the customer (i.e., the expected profit from sales to her). |
Pedro Domingos; Matt Richardson; |
2001 | 2 | Co-clustering Documents And Words Using Bipartite Spectral Graph Partitioning IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we present the novel idea of modeling the document collection as a bipartite graph between documents and words, using which the simultaneous clustering problem can be posed as a bipartite graph partitioning problem. |
Inderjit S. Dhillon; |
2001 | 3 | Mining Time-changing Data Streams IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we propose an efficient algorithm for mining decision trees from continuously-changing data streams, based on the ultra-fast VFDT decision tree learner. |
Geoff Hulten; Laurie Spencer; Pedro Domingos; |
2001 | 4 | Random Projection In Dimensionality Reduction: Applications To Image And Text Data IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present experimental results on using random projection as a dimensionality reduction tool in a number of cases, where the high dimensionality of the data would otherwise lead to burdensome computations. |
Ella Bingham; Heikki Mannila; |
2001 | 5 | A Streaming Ensemble Algorithm (SEA) For Large-scale Classification IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The methods presented in this paper take advantage of plentiful data, building separate classifiers on sequential chunks of training points. |
W. Nick Street; YongSeog Kim; |
2001 | 6 | Proximal Support Vector Machine Classifiers IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Instead of a standard support vector machine (SVM) that classifies points by assigning them to one of two disjoint half-spaces, points are classified by assigning them to the … |
Glenn Fung; Olvi L. Mangasarian; |
2001 | 7 | A Robust And Scalable Clustering Algorithm For Mixed Type Attributes In Large Database Environment IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a distance measure that enables clustering data with both continuous and categorical attributes. |
Tom Chiu; DongPing Fang; John Chen; Yao Wang; Christopher Jeris; |
2001 | 8 | Real World Performance Of Association Rule Algorithms IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study compares five well-known association rule algorithms using three real-world datasets and an artificial dataset. |
Zijian Zheng; Ron Kohavi; Llew Mason; |
2001 | 9 | Learning And Making Decisions When Costs And Probabilities Are Both Unknown IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: After discussing how to make optimal decisions given cost and probability estimates, we present decision tree and naive Bayesian learning methods for obtaining well-calibrated probability estimates. |
Bianca Zadrozny; Charles Elkan; |
2001 | 10 | Mining Top-n Local Outliers In Large Databases IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel method to efficiently find the top-n local outliers in large databases. |
Wen Jin; Anthony K. H. Tung; Jiawei Han; |
2001 | 11 | Empirical Bayes Screening For Multi-item Associations IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper considers the framework of the so-called "market basket problem", in which a database of transactions is mined for the occurrence of unusually frequent item sets. |
William DuMouchel; Daryl Pregibon; |
2001 | 12 | Visualizing Multi-dimensional Clusters, Trends, And Outliers Using Star Coordinates IF:5 Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Interactive visualizations are effective tools in mining scientific, engineering, and business data to support decision-making activities. Star Coordinates is proposed as a new … |
Eser Kandogan; |
2001 | 13 | Molecular Feature Mining In HIV Data IF:5 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present the application of Feature Mining techniques to the Developmental Therapeutics Program’s AIDS antiviral screen database. |
Stefan Kramer; Luc De Raedt; Christoph Helma; |
2001 | 14 | Experimental Comparisons Of Online And Batch Versions Of Bagging And Boosting IF:5 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In previous work, we presented online bagging and boosting algorithms that only require one pass through the training data and presented experimental results on some relatively small datasets. |
Nikunj C. Oza; Stuart Russell; |
2001 | 15 | Mining Web Logs For Prediction Models In WWW Caching And Prefetching IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present an application of web log mining to obtain web-document access patterns and use these patterns to extend the well-known GDSF caching policies and prefetching policies. |
Qiang Yang; Haining Henry Zhang; Tianyi Li; |
2000 | 1 | Mining High-speed Data Streams IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: No abstract available. … |
Pedro Domingos; Geoff Hulten; |
2000 | 2 | Efficient Clustering Of High-dimensional Data Sets With Application To Reference Matching IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: No abstract available. … |
Andrew McCallum; Kamal Nigam; Lyle H. Ungar; |
2000 | 3 | Agglomerative Clustering Of A Search Engine Query Log IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: No abstract available. … |
Doug Beeferman; Adam Berger; |
2000 | 4 | FreeSpan: Frequent Pattern-projected Sequential Pattern Mining IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: No abstract available. … |
Jiawei Han et al. |
2000 | 5 | Scaling Up Dynamic Time Warping For Datamining Applications IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: No abstract available. … |
Eamonn J. Keogh; Michael J. Pazzani; |
2000 | 6 | On-line Unsupervised Outlier Detection Using Finite Mixtures With Discounting Learning Algorithms IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: No abstract available. … |
Kenji Yamanishi; Jun-Ichi Takeuchi; Graham Williams; Peter Milne; |
2000 | 7 | Generating Non-redundant Association Rules IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: No abstract available. … |
Mohammed J. Zaki; |
2000 | 8 | Depth First Generation Of Long Patterns IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: No abstract available. … |
Ramesh C. Agarwal; Charu C. Aggarwal; V. V. V. Prasad; |
2000 | 9 | Visualization Of Navigation Patterns On A Web Site Using Model-based Clustering IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: No abstract available. … |
Igor Cadez; David Heckerman; Christopher Meek; Padhraic Smyth; Steven White; |
2000 | 10 | Efficient Identification Of Web Communities IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: No abstract available. … |
Gary William Flake; Steve Lawrence; C. Lee Giles; |
2000 | 11 | Feature Selection In Unsupervised Learning Via Evolutionary Search IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: No abstract available. … |
YongSeog Kim; W. Nick Street; Filippo Menczer; |
2000 | 12 | Efficient Mining Of Weighted Association Rules (WAR) IF:5 Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: No abstract available. … |
Wei Wang; Jiong Yang; Philip S. Yu; |
2000 | 13 | Can We Push More Constraints Into Frequent Pattern Mining? IF:5 Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: No abstract available. … |
Jian Pei; Jiawei Han; |
2000 | 14 | Mining Asynchronous Periodic Patterns In Time Series Data IF:5 Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: No abstract available. … |
Jiong Yang; Wei Wang; Philip S. Yu; |
2000 | 15 | Efficient Search For Association Rules IF:5 Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: No abstract available. … |
Geoffrey I. Webb; |
1999 | 1 | MetaCost: A General Method For Making Classifiers Cost-sensitive IF:9 Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: No abstract available. … |
Pedro Domingos; |
1999 | 2 | Efficient Mining Of Emerging Patterns: Discovering Trends And Differences IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: No abstract available. … |
Guozhu Dong; Jinyan Li; |
1999 | 3 | Fast And Effective Text Mining Using Linear-time Document Clustering IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: No abstract available. … |
Bjornar Larsen; Chinatsu Aone; |
1999 | 4 | Mining Association Rules With Multiple Minimum Supports IF:8 Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: No abstract available. … |
Bing Liu; Wynne Hsu; Yiming Ma; |
1999 | 5 | Mining The Most Interesting Rules IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: No abstract available. … |
Roberto J. Bayardo; Rakesh Agrawal; |
1999 | 6 | Entropy-based Subspace Clustering For Mining Numerical Data IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: No abstract available. … |
Chun-Hung Cheng; Ada Waichee Fu; Yi Zhang; |
1999 | 7 | Event Detection From Time Series Data IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: No abstract available. … |
Valery Guralnik; Jaideep Srivastava; |
1999 | 8 | CACTUS—clustering Categorical Data Using Summaries IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: No abstract available. … |
Venkatesh Ganti; Johannes Gehrke; Raghu Ramakrishnan; |
1999 | 9 | Trajectory Clustering With Mixtures Of Regression Models IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: No abstract available. … |
Scott Gaffney; Padhraic Smyth; |
1999 | 10 | Activity Monitoring: Noticing Interesting Changes In Behavior IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: No abstract available. … |
Tom Fawcett; Foster Provost; |
1999 | 11 | Horting Hatches An Egg: A New Graph-theoretic Approach To Collaborative Filtering IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: No abstract available. … |
Charu C. Aggarwal; Joel L. Wolf; Kun-Lung Wu; Philip S. Yu; |
1999 | 12 | Pruning And Summarizing The Discovered Associations IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: No abstract available. … |
Bing Liu; Wynne Hsu; Yiming Ma; |
1999 | 13 | Using Association Rules For Product Assortment Decisions: A Case Study IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: No abstract available. … |
Tom Brijs; Gilbert Swinnen; Koen Vanhoof; Geert Wets; |
1999 | 14 | Efficient Progressive Sampling IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: No abstract available. … |
Foster Provost; David Jensen; Tim Oates; |
1999 | 15 | Mining In A Data-flow Environment: Experience In Network Intrusion Detection IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: No abstract available. … |
Wenke Lee; Salvatore J. Stolfo; Kui W. Mok; |