Paper Digest: SIGMOD 2019 Highlights
The ACM Special Interest Group on Management of Data (SIGMOD) conference is one of the top conferences on database management systems and data management technology. This year, there were 430 paper submissions, of which 88 were accepted. In 2019, it is to be held in Amsterdam, Netherlands.
To help the community quickly catch up on the work presented at this conference, the Paper Digest Team processed all accepted papers and generated one highlight sentence (typically the main topic) for each paper. Readers are encouraged to read these machine-generated highlights / summaries to quickly get the main idea of each paper.
We thank all authors for writing these interesting papers, and readers for reading our digests. If you do not want to miss any interesting academic paper, you are welcome to sign up for our free paper digest service to get new paper updates customized to your own interests on a daily basis.
Paper Digest Team
team@paperdigest.org
TABLE 1: SIGMOD 2019 Papers
No. | Title | Authors | Highlight |
---|---|---|---|
1 | Exact Cardinality Query Optimization with Bounded Execution Cost | Immanuel Trummer | We propose a novel algorithm for ECQO. |
2 | Pessimistic Cardinality Estimation: Tighter Upper Bounds for Intermediate Join Cardinalities | Walter Cai, Magdalena Balazinska, Dan Suciu | In this work we introduce a novel approach to the problem of cardinality estimation over multijoin queries. |
3 | Efficiently Searching In-Memory Sorted Arrays: Revenge of the Interpolation Search? | Peter Van Sandt, Yannis Chronis, Jignesh M. Patel | In this paper, we focus on the problem of searching sorted, in-memory datasets. |
4 | Iterative Query Processing based on Unified Optimization Techniques | Kisung Park, Hojin Seo, Mostofa Kamal Rasel, Young-Koo Lee, Chanho Jeong, Sung Yeol Lee, Chungmin Lee, Dong-Hun Lee | In this paper, we propose a novel unified optimization technique for efficient iterative query processing. |
5 | Approximate Distinct Counts for Billions of Datasets | Daniel Ting | We demonstrate existing approaches to solve this problem are inherently flawed, exhibiting bias that can be arbitrarily large, and propose new methods for solving this problem that have theoretical guarantees of correctness and tight, practical error estimates. |
6 | Cache-oblivious High-performance Similarity Join | Martin Perdacher, Claudia Plant, Christian Böhm | In this paper, we propose to refine the pairs in an order defined by a space-filling curve which dramatically improves data locality. |
7 | Blurring the Lines between Blockchains and Database Systems: the Case of Hyperledger Fabric | Ankur Sharma, Felix Martin Schuhknecht, Divya Agrawal, Jens Dittrich | To tackle these questions, we first explore Fabric from the perspective of database research, where we observe weaknesses in the transaction pipeline. We then solve these issues by transitioning well-understood database concepts to Fabric, namely transaction reordering as well as early transaction abort. |
8 | Towards Scaling Blockchain Systems via Sharding | Hung Dang, Tien Tuan Anh Dinh, Dumitrel Loghin, Ee-Chien Chang, Qian Lin, Beng Chin Ooi | This work takes a principled approach to apply sharding to blockchain systems in order to improve their transaction throughput at scale. |
9 | vChain: Enabling Verifiable Boolean Range Queries over Blockchain Databases | Cheng Xu, Ce Zhang, Jianliang Xu | In this paper, we take the first step toward investigating the problem of verifiable query processing over blockchain databases. |
10 | Answering Multi-Dimensional Analytical Queries under Local Differential Privacy | Tianhao Wang, Bolin Ding, Jingren Zhou, Cheng Hong, Zhicong Huang, Ninghui Li, Somesh Jha | In this paper, we study the problem of answering MDA queries under local differential privacy (LDP). |
11 | APEx: Accuracy-Aware Differentially Private Data Exploration | Chang Ge, Xi He, Ihab F. Ilyas, Ashwin Machanavajjhala | We present APEx, a novel system that allows data analysts to pose adaptively chosen sequences of queries along with required accuracy bounds. |
12 | Active Sparse Mobile Crowd Sensing Based on Matrix Completion | Kun Xie, Xiaocan Li, Xin Wang, Gaogang Xie, Jigang Wen, Dafang Zhang | Rather than only taking random measurements following the basic MC theory, to further reduce the cost of MCS while ensuring the quality of missing data inference, we propose an Active Sparse MCS (AS-MCS) scheme which includes a bipartite-graph-based sensing scheduling scheme to actively determine the sampling positions in each upcoming time slot, and a bipartite-graph-based matrix completion algorithm to robustly and accurately recover the un-sampled data in the presence of sensing and communications errors. |
13 | Autocompletion for Prefix-Abbreviated Input | Sheng Hu, Chuan Xiao, Jianbin Qin, Yoshiharu Ishikawa, Qiang Ma | In this paper, we propose a novel QAC paradigm through which users may abbreviate keywords by prefixes and do not have to explicitly separate them. |
14 | Progressive Deep Web Crawling Through Keyword Queries For Data Enrichment | Pei Wang, Ryan Shea, Jiannan Wang, Eugene Wu | In this paper, we study a novel problem: how to progressively crawl the deep web (i.e., a hidden database) through a keyword-search API to enrich a local database in an effective way. |
15 | Visual Segmentation for Information Extraction from Heterogeneous Visually Rich Documents | Ritesh Sarkhel, Arnab Nandi | We propose VS2, a generalized approach for information extraction from heterogeneous visually rich documents. |
16 | RRR: Rank-Regret Representative | Abolfazl Asudeh, Azade Nazi, Nan Zhang, Gautam Das, H. V. Jagadish | Therefore, we consider the position of the items in the ranked list for defining the regret and propose the rank-regret representative as the minimal subset of the data containing at least one of the top-k of any possible ranking function. |
17 | Strongly Truthful Interactive Regret Minimization | Min Xie, Raymond Chi-Wing Wong, Ashwin Lall | Specifically, we present a generic framework for interactive regret minimization, under which we propose algorithms that ask an asymptotically optimal number of questions in 2-dimensional spaces and algorithms with provable performance guarantees in d-dimensional spaces ($d \geq 2$) where each dimension corresponds to a description of a tuple. |
18 | Verifying Text Summaries of Relational Data Sets | Saehan Jo, Immanuel Trummer, Weicheng Yu, Xuezhi Wang, Cong Yu, Daniel Liu, Niyati Mehta | We present a novel natural language query interface, the AggChecker, aimed at text summaries of relational data sets. |
19 | An End-to-End Automatic Cloud Database Tuning System Using Deep Reinforcement Learning | Ji Zhang, Yu Liu, Ke Zhou, Guoliang Li, Zhili Xiao, Bin Cheng, Jiashu Xing, Yangtao Wang, Tianheng Cheng, | To address these challenges, we design an end-to-end automatic CDB tuning system, CDBTune, using deep reinforcement learning (RL). |
20 | Fast General Distributed Transactions with Opacity | Alex Shamis, Matthew Renzelmann, Stanko Novakovic, Georgios Chatzopoulos, Aleksandar Dragojevic, Dushyanth Narayanan, Miguel Castro | This paper extends the design of FaRM — which provides strict serializability only for committed transactions — to provide opacity while maintaining FaRM’s high throughput, low latency, and high availability within a modern data center. |
21 | The Log-Structured Merge-Bush & the Wacky Continuum | Niv Dayan, Stratos Idreos | We introduce the Log-Structured Merge-Bush (LSM-Bush), a new data structure that sets increasing capacity ratios between adjacent pairs of smaller levels. |
22 | RaSQL: Greater Power and Performance for Big Data Analytics with Recursive-aggregate-SQL on Spark | Jiaqi Gu, Yugo H. Watanabe, William A. Mazza, Alexander Shkapsky, Mohan Yang, Ling Ding, Carlo Zaniolo | RaSQL: Greater Power and Performance for Big Data Analytics with Recursive-aggregate-SQL on Spark |
23 | Going Beyond Provenance: Explaining Query Answers with Pattern-based Counterbalances | Zhengjie Miao, Qitian Zeng, Boris Glavic, Sudeepa Roy | We present a novel approach for explaining outliers in aggregation queries through counterbalancing. |
24 | Explaining Wrong Queries Using Small Examples | Zhengjie Miao, Sudeepa Roy, Jun Yang | Therefore, in this paper, given a known counterexample D for $Q_1$ and $Q_2$, we aim to find the smallest counterexample $D' \subseteq D$ where $Q_1(D') \neq Q_2(D')$. |
25 | Ariadne: Online Provenance for Big Graph Analytics | Vicky Papavasileiou, Ken Yocum, Alin Deutsch | This paper presents Ariadne, a system for capturing and querying provenance from Vertex-Centric graph processing systems. |
26 | Hypothetical Reasoning via Provenance Abstraction | Daniel Deutch, Yuval Moskovitch, Noam Rinetzky | To this end, we present a framework for reducing provenance size. |
27 | Event Trend Aggregation Under Rich Event Matching Semantics | Olga Poppe, Chuan Lei, Elke A. Rundensteiner, David Maier | Event Trend Aggregation Under Rich Event Matching Semantics |
28 | Elasticutor: Rapid Elasticity for Realtime Stateful Stream Processing | Li Wang, Tom Z. J. Fu, Richard T. B. Ma, Marianne Winslett, Zhenjie Zhang | We propose an executor-centric approach that avoids operator-level key repartitioning and implements executors as the building blocks of elasticity. |
29 | Real-Time Multi-Pattern Detection over Event Streams | Ilya Kolchinsky, Assaf Schuster | In this paper, we present a novel framework for real-time multi-pattern complex event processing. |
30 | AStream: Ad-hoc Shared Stream Processing | Jeyhun Karimov, Tilmann Rabl, Volker Markl | The goal of this paper is to bridge the gap between stream processing and ad-hoc queries in SPEs by sharing computation and resources. |
31 | Concurrent Prefix Recovery: Performing CPR on a Database | Guna Prasaad, Badrish Chandramouli, Donald Kossmann | In this paper, we propose a new recovery model based on group commit, called concurrent prefix recovery (CPR). |
32 | BriskStream: Scaling Data Stream Processing on Shared-Memory Multicore Architectures | Shuhao Zhang, Jiong He, Amelie Chi Zhou, Bingsheng He | We introduce BriskStream, an in-memory data stream processing system (DSPS) specifically designed for modern shared-memory multicore architectures. |
33 | Border-Collie: A Wait-free, Read-optimal Algorithm for Database Logging on Multicore Hardware | Jongbin Kim, Hyeongwon Jang, Seohui Son, Hyuck Han, Sooyong Kang, Hyungsoo Jung | Based on our understanding, we propose Border-Collie, a wait-free and read-optimal algorithm for database logging that finds such an upper bound even with some worker threads often being idle. |
34 | Designing Distributed Tree-based Index Structures for Fast RDMA-capable Networks | Tobias Ziegler, Sumukha Tumkur Vani, Carsten Binnig, Rodrigo Fonseca, Tim Kraska | In this paper, we therefore discuss design alternatives for distributed tree-based index structures in the NAM architecture. |
35 | DistME: A Fast and Elastic Distributed Matrix Computation Engine using GPUs | Donghyoung Han, Yoon-Min Nam, Jihye Lee, Kyongseok Park, Hyunwoo Kim, Min-Soo Kim | We propose a distributed elastic matrix multiplication method called CuboidMM that achieves both high performance and large-scale processing. |
36 | GPU-based Graph Traversal on Compressed Graphs | Mo Sha, Yuchen Li, Kian-Lee Tan | In this paper, we introduce GPU-based graph traversal on compressed graphs, so as to enable the processing of graphs having a larger size than the device memory. |
37 | Interventional Fairness: Causal Database Repair for Algorithmic Fairness | Babak Salimi, Luke Rodriguez, Bill Howe, Dan Suciu | In this paper, we formalize the situation as a database repair problem, proving sufficient conditions for fair classifiers in terms of admissible variables as opposed to a complete causal model. |
38 | Uni-Detect: A Unified Approach to Automated Error Detection in Tables | Pei Wang, Yeye He | We propose Uni-Detect, a unified framework to automatically detect diverse types of errors. |
39 | HoloDetect: Few-Shot Learning for Error Detection | Alireza Heidari, Joshua McGrath, Ihab F. Ilyas, Theodoros Rekatsinas | We introduce a few-shot learning framework for error detection. |
40 | JOSIE: Overlap Set Similarity Search for Finding Joinable Tables in Data Lakes | Erkang Zhu, Dong Deng, Fatemeh Nargesian, Renée J. Miller | We present a new solution for finding joinable tables in massive data lakes: given a table and one join column, find tables that can be joined with the given table on the largest number of distinct values. |
41 | Raha: A Configuration-Free Error Detection System | Mohammad Mahdavi, Ziawasch Abedjan, Raul Castro Fernandez, Samuel Madden, Mourad Ouzzani, Michael Stonebraker, Nan Tang | In this paper, we present Raha, a new configuration-free error detection system. |
42 | Speculative Distributed CSV Data Parsing for Big Data Analytics | Chang Ge, Yinan Li, Eric Eilebrecht, Badrish Chandramouli, Donald Kossmann | To parallelize parsing, this paper proposes a speculation-based approach for the CSV format, arguably the most commonly used raw data format. |
43 | CATAPULT: Data-driven Selection of Canned Patterns for Efficient Visual Graph Query Formulation | Kai Huang, Huey Eng Chua, Sourav S. Bhowmick, Byron Choi, Shuigeng Zhou | In this paper, we present a generic and extensible framework called Catapult to address these limitations. |
44 | iQCAR: inter-Query Contention Analyzer for Data Analytics Frameworks | Prajakta Kalmegh, Shivnath Babu, Sudeepa Roy | We introduce iQCAR, an inter-Query Contention Analyzer, that attributes blame for the slowdown of a query to concurrent queries. |
45 | A Holistic Approach for Query Evaluation and Result Vocalization in Voice-Based OLAP | Immanuel Trummer, Yicheng Wang, Saketh Mahankali | We present a holistic approach that combines query processing and result vocalization. |
46 | Top-k Queries over Digital Traces | Yifan Li, Xiaohui Yu, Nick Koudas | We theoretically analyze the pruning effectiveness of the proposed methods based on a mobility model which we propose and validate in real life situations. |
47 | Visual Road: A Video Data Management Benchmark | Brandon Haynes, Amrita Mazumdar, Magdalena Balazinska, Luis Ceze, Alvin Cheung | To accelerate innovation in this area, we present Visual Road, a benchmark that evaluates the performance of these systems. |
48 | Mining Precision Interfaces From Query Logs | Qianrui Zhang, Haoci Zhang, Thibault Sellam, Eugene Wu | We propose a syntactic approach that uses queries from an analysis to generate a tailored interface. |
49 | Distance-generalized Core Decomposition | Francesco Bonchi, Arijit Khan, Lorenzo Severini | In this work we introduce a distance-based generalization of the notion of k-core, which we refer to as the $(k,h)$-core, i.e., the maximal subgraph in which every vertex has at least k other vertices at distance $\leq h$ within that subgraph. |
50 | Unboundedness and Efficiency of Truss Maintenance in Evolving Graphs | Yikai Zhang, Jeffrey Xu Yu | We report our findings in this paper. |
51 | PRSim: Sublinear Time SimRank Computation on Large Power-Law Graphs | Zhewei Wei, Xiaodong He, Xiaokui Xiao, Sibo Wang, Yu Liu, Xiaoyong Du, Ji-Rong Wen | This paper proposes PRSim, an algorithm that exploits the structure of graphs to efficiently answer single-source SimRank queries. |
52 | Scaling Distance Labeling on Small-World Networks | Wentao Li, Miao Qiao, Lu Qin, Ying Zhang, Lijun Chang, Xuemin Lin | This paper scales distance labeling on small-world networks by proposing a Parallel Shortest-distance Labeling (PSL) scheme and further reducing the index size by exploiting graph and label properties. |
53 | Maximizing Welfare in Social Networks under A Utility Driven Influence Diffusion model | Prithu Banerjee, Wei Chen, Laks V.S. Lakshmanan | In this paper, we address all three limitations and propose a novel model called UIC that combines utility-driven item adoption with influence propagation over networks. |
54 | Efficient Approximation Algorithms for Adaptive Seed Minimization | Jing Tang, Keke Huang, Xiaokui Xiao, Laks V.S. Lakshmanan, Xueyan Tang, Aixin Sun, Andrew Lim | We propose a novel algorithm, ASTI, which addresses the adaptive seed minimization problem in $O\Big(\frac? |
55 | DeepBase: Deep Inspection of Neural Networks | Thibault Sellam, Kevin Lin, Ian Huang, Michelle Yang, Carl Vondrick, Eugene Wu | Our insight is that many of those studies follow a common analysis pattern, and therefore there is opportunity to provide a declarative abstraction to easily express, execute and optimize them. We discuss how DeepBase can express existing analyses, propose a set of simple and effective optimizations to speed up a standard Python implementation by up to 72x, and reproduce recent studies from the NLP literature. |
56 | BlinkML: Efficient Maximum Likelihood Estimation with Probabilistic Guarantees | Yongjoo Park, Jingyi Qing, Xiaoyang Shen, Barzan Mozafari | In this paper, we introduce BlinkML, a system for fast, quality-guaranteed ML training. |
57 | SkinnerDB: Regret-Bounded Query Evaluation via Reinforcement Learning | Immanuel Trummer, Junxiong Wang, Deepak Maram, Samuel Moseley, Saehan Jo, Joseph Antonakakis | Along with SkinnerDB, we introduce a new quality criterion for query execution strategies. |
58 | Democratizing Data Science through Interactive Curation of ML Pipelines | Zeyuan Shang, Emanuel Zgraggen, Benedetto Buratti, Ferdinand Kossmann, Philipp Eichmann, Yeounoh Chung, Carsten Binnig, Eli Upfal, Tim Kraska | In this paper we present Alpine Meadow, a first Interactive Automated Machine Learning tool. |
59 | FITing-Tree: A Data-aware Index Structure | Alex Galakatos, Michael Markovitch, Carsten Binnig, Rodrigo Fonseca, Tim Kraska | In this paper, we present a novel data-aware index structure called FITing-Tree which approximates an index using piece-wise linear functions with a bounded error specified at construction time. |
60 | Hyperion: Building the Largest In-memory Search Tree | Markus Mäsker, Tim Süß, Lars Nagel, Lingfang Zeng, André Brinkmann | In this paper we present Hyperion, a trie-based main-memory key-value store achieving extreme space efficiency. |
61 | Designing Succinct Secondary Indexing Mechanism by Exploiting Column Correlations | Yingjun Wu, Jia Yu, Yuanyuan Tian, Richard Sidle, Ronald Barber | In this paper, we demonstrate that there exist many opportunities to exploit column correlations for accelerating data access. |
62 | AI Meets AI: Leveraging Query Executions to Improve Index Recommendations | Bailu Ding, Sudipto Das, Ryan Marcus, Wentao Wu, Surajit Chaudhuri, Vivek R. Narasayya | We present a study of the design space for this classification problem. |
63 | Designing Fair Ranking Schemes | Abolfazl Asudeh, H. V. Jagadish, Julia Stoyanovich, Gautam Das | In this paper, we develop a system that helps users choose criterion weights that lead to greater fairness. |
64 | Anti-Freeze for Large and Complex Spreadsheets: Asynchronous Formula Computation | Mangesh Bendre, Tana Wattanawaroon, Kelly Mack, Kevin Chang, Aditya Parameswaran | We propose a new asynchronous formula computation framework: instead of freezing the interface we return control to users quickly to ensure interactivity, while computing the formulae in the background. |
65 | Anytime Approximation in Probabilistic Databases via Scaled Dissociations | Maarten Van den Heuvel, Peter Ivanov, Wolfgang Gatterbauer, Floris Geerts, Martin Theobald | We propose a new anytime B&B approximation scheme that encompasses all prior model-based approximation schemes proposed in the PDB and SRL literature. |
66 | Uncertainty Annotated Databases – A Lightweight Approach for Approximating Certain Answers | Su Feng, Aaron Huber, Boris Glavic, Oliver Kennedy | In this paper, we propose Uncertainty Annotated Databases (UA-DBs), which combine an under- and over-approximation of certain answers to achieve the reliability of certain answers, with the performance of a classical database system. |
67 | Efficient Estimation of Heat Kernel PageRank for Local Clustering | Renchi Yang, Xiaokui Xiao, Zhewei Wei, Sourav S. Bhowmick, Jun Zhao, Rong-Hua Li | In this paper, we present TEA and TEA+, two novel local graph clustering algorithms based on HKPR, to address the aforementioned limitations. |
68 | Fractal: A General-Purpose Graph Pattern Mining System | Vinicius Dias, Carlos H. C. Teixeira, Dorgival Guedes, Wagner Meira, Srinivasan Parthasarathy | In this paper we propose Fractal, a high performance and high productivity system for supporting distributed graph pattern mining (GPM) applications. |
69 | Experimental Analysis of Streaming Algorithms for Graph Partitioning | Anil Pacaci, M. Tamer Özsu | The main objective of this study is to understand how the choice of graph partitioning algorithm affects system performance, resource usage and scalability. |
70 | Interactive Graph Search | Yufei Tao, Yuanbing Li, Guoliang Li | We describe algorithms that solve the problem by asking a provably small number of questions, and establish lower bounds indicating that the algorithms are optimal up to a small additive factor. |
71 | Optimizing Declarative Graph Queries at Large Scale | Qizhen Zhang, Akash Acharya, Hongzhi Chen, Simran Arora, Ang Chen, Vincent Liu, Boon Thau Loo | This paper presents GraphRex, an efficient, robust, scalable, and easy-to-program framework for graph processing on datacenter infrastructure. |
72 | Efficient Subgraph Matching: Harmonizing Dynamic Programming, Adaptive Matching Order, and Failing Set Together | Myoungji Han, Hyunjoon Kim, Geonmo Gu, Kunsoo Park, Wook-Shin Han | In this paper, we introduce three novel concepts to address these inherent limitations: 1) dynamic programming between a directed acyclic graph (DAG) and a graph, 2) adaptive matching order with DAG ordering, and 3) pruning by failing sets, which together lead to a much faster algorithm DAF for subgraph matching. |
73 | CECI: Compact Embedding Cluster Index for Scalable Subgraph Matching | Bibek Bhattarai, Hang Liu, H. Howie Huang | In this paper, we propose a novel framework for subgraph listing based on Compact Embedding Cluster Index (CECI), which divides the data graph into multiple embedding clusters for parallel processing. |
74 | Efficiently Answering Regular Simple Path Queries on Large Labeled Networks | Sarisht Wadhwa, Anagh Prasad, Sayan Ranu, Amitabha Bagchi, Srikanta Bedathur | In this paper, we circumvent this computational bottleneck by designing a random-walk based sampling algorithm called ARRIVAL, which is backed by theoretical guarantees on its expected quality. |
75 | Answering Why-questions by Exemplars in Attributed Graphs | Mohammad Hossein Namaki, Qi Song, Yinghui Wu, Shengqi Yang | We characterize the problem by Q-Chase. |
76 | An Efficient Index for RDF Query Containment | Theofilos Mailis, Yannis Kotidis, Vaggelis Nikolopoulos, Evgeny Kharlamov, Ian Horrocks, Yannis Ioannidis | Based on this observation, we propose a novel indexing structure, named mv-index, that allows for fast containment checking between a single f-graph query and an arbitrary number of stored queries. |
77 | Tuple-oriented Compression for Large-scale Mini-batch Stochastic Gradient Descent | Fengan Li, Lingjiao Chen, Yijing Zeng, Arun Kumar, Xi Wu, Jeffrey F. Naughton, Jignesh M. Patel | We fill this crucial research gap by proposing a new lossless compression scheme we call tuple-oriented compression (TOC) that is inspired by an unlikely source, the string/text compression scheme Lempel-Ziv-Welch, but tailored to MGD in a way that preserves tuple boundaries within mini-batches. |
78 | Towards Model-based Pricing for Machine Learning in a Data Marketplace | Lingjiao Chen, Paraschos Koutris, Arun Kumar | In this paper, we propose a model-based pricing (MBP) framework, which instead of pricing the data, directly prices ML model instances. |
79 | DBEst: Revisiting Approximate Query Processing Engines with Machine Learning Models | Qingzhi Ma, Peter Triantafillou | The paper presents DBEst, a system based on Machine Learning models (regression models and probability density estimators). |
80 | Enabling and Optimizing Non-linear Feature Interactions in Factorized Linear Algebra | Side Li, Lingjiao Chen, Arun Kumar | In this work, we take a first step towards closing this gap by introducing a new abstraction to enable pairwise feature interactions in multi-table data and present an extensive framework of algebraic rewrite rules for factorized LA operators over feature interactions. |
81 | Incremental and Approximate Inference for Faster Occlusion-based Deep CNN Explanations | Supun Nakandala, Arun Kumar, Yannis Papakonstantinou | We prototype our ideas in Python to create a tool we call Krypton that supports both CPUs and GPUs. |
82 | MNC: Structure-Exploiting Sparsity Estimation for Matrix Expressions | Johanna Sommer, Matthias Boehm, Alexandre V. Evfimievski, Berthold Reinwald, Peter J. Haas | In this paper, we introduce MNC (Matrix Non-zero Count), a remarkably simple, count-based matrix synopsis that exploits these structural properties for efficient, accurate, and general sparsity estimation. |
83 | A Scalable Index for Top-k Subtree Similarity Queries | Daniel Kocher, Nikolaus Augsten | We present a scalable solution for the top-k subtree similarity problem that does not assume specific data types, nor does it require any tuning. |
84 | A Layered Aggregate Engine for Analytics Workloads | Maximilian Schleich, Dan Olteanu, Mahmoud Abo Khamis, Hung Q. Ngo, XuanLong Nguyen | This paper introduces LMFAO (Layered Multiple Functional Aggregate Optimization), an in-memory optimization and execution engine for batches of aggregates over the input database. |
85 | Towards Scalable Hybrid Stores: Constraint-Based Rewriting to the Rescue | Rana Alotaibi, Damian Bursztyn, Alin Deutsch, Ioana Manolescu, Stamatis Zampetakis | We present ESTOCADA, a novel approach connecting applications to the potentially heterogeneous systems where their input data resides. |
86 | MIFO: A Query-Semantic Aware Resource Allocation Policy | Prajakta Kalmegh, Shivnath Babu | We present heuristics that exploit query semantics to proactively trigger MIFO-based allocations in a workload. |
87 | Dissecting the Performance of Strongly-Consistent Replication Protocols | Ailidani Ailijiang, Aleksey Charapko, Murat Demirbas | To fill this gap, we study single-leader, multi-leader, hierarchical multi-leader, and leaderless (opportunistic leader) consensus protocols, and present a comprehensive evaluation of their performance in local area networks (LANs) and wide area networks (WANs). |
88 | FishStore: Faster Ingestion with Subset Hashing | Dong Xie, Badrish Chandramouli, Yinan Li, Donald Kossmann | This paper builds on recent advances in parsing and indexing techniques to propose FishStore, a concurrent latch-free storage layer for data with flexible schema, based on multi-chain hash indexing of dynamically registered predicated subsets of data. |