Paper Digest: SIGMOD 2019 Highlights

July 1, 2019 | October 16, 2019 | admin

The ACM Special Interest Group on Management of Data (SIGMOD) conference is one of the top conferences on database management systems and data management technology. This year, there were 430 paper submissions, of which 88 were accepted. The 2019 conference is held in Amsterdam, Netherlands.

To help the community quickly catch up on the work presented at this conference, the Paper Digest Team processed all accepted papers and generated one highlight sentence (typically the main topic) for each paper. Readers are encouraged to read these machine-generated highlights / summaries to quickly get the main idea of each paper.

We thank all authors for writing these interesting papers, and readers for reading our digests. If you do not want to miss any interesting academic paper, you are welcome to sign up for our free paper digest service to get new paper updates customized to your own interests on a daily basis.

Paper Digest Team
team@paperdigest.org

TABLE 1: SIGMOD 2019 Papers

No.	Title	Authors	Highlight
1	Exact Cardinality Query Optimization with Bounded Execution Cost	Immanuel Trummer	We propose a novel algorithm for ECQO.
2	Pessimistic Cardinality Estimation: Tighter Upper Bounds for Intermediate Join Cardinalities	Walter Cai, Magdalena Balazinska, Dan Suciu	In this work we introduce a novel approach to the problem of cardinality estimation over multijoin queries.
3	Efficiently Searching In-Memory Sorted Arrays: Revenge of the Interpolation Search?	Peter Van Sandt, Yannis Chronis, Jignesh M. Patel	In this paper, we focus on the problem of searching sorted, in-memory datasets.
4	Iterative Query Processing based on Unified Optimization Techniques	Kisung Park, Hojin Seo, Mostofa Kamal Rasel, Young-Koo Lee, Chanho Jeong, Sung Yeol Lee, Chungmin Lee, Dong-Hun Lee	In this paper, we propose a novel unified optimization technique for efficient iterative query processing.
5	Approximate Distinct Counts for Billions of Datasets	Daniel Ting	We demonstrate existing approaches to solve this problem are inherently flawed, exhibiting bias that can be arbitrarily large, and propose new methods for solving this problem that have theoretical guarantees of correctness and tight, practical error estimates.
6	Cache-oblivious High-performance Similarity Join	Martin Perdacher, Claudia Plant, Christian Böhm	In this paper, we propose to refine the pairs in an order defined by a space-filling curve which dramatically improves data locality.
7	Blurring the Lines between Blockchains and Database Systems: the Case of Hyperledger Fabric	Ankur Sharma, Felix Martin Schuhknecht, Divya Agrawal, Jens Dittrich	To tackle these questions, we first explore Fabric from the perspective of database research, where we observe weaknesses in the transaction pipeline. We then solve these issues by transitioning well-understood database concepts to Fabric, namely transaction reordering as well as early transaction abort.
8	Towards Scaling Blockchain Systems via Sharding	Hung Dang, Tien Tuan Anh Dinh, Dumitrel Loghin, Ee-Chien Chang, Qian Lin, Beng Chin Ooi	This work takes a principled approach to apply sharding to blockchain systems in order to improve their transaction throughput at scale.
9	vChain: Enabling Verifiable Boolean Range Queries over Blockchain Databases	Cheng Xu, Ce Zhang, Jianliang Xu	In this paper, we take the first step toward investigating the problem of verifiable query processing over blockchain databases.
10	Answering Multi-Dimensional Analytical Queries under Local Differential Privacy	Tianhao Wang, Bolin Ding, Jingren Zhou, Cheng Hong, Zhicong Huang, Ninghui Li, Somesh Jha	In this paper, we study the problem of answering MDA queries under local differential privacy (LDP).
11	APEx: Accuracy-Aware Differentially Private Data Exploration	Chang Ge, Xi He, Ihab F. Ilyas, Ashwin Machanavajjhala	We present APEx, a novel system that allows data analysts to pose adaptively chosen sequences of queries along with required accuracy bounds.
12	Active Sparse Mobile Crowd Sensing Based on Matrix Completion	Kun Xie, Xiaocan Li, Xin Wang, Gaogang Xie, Jigang Wen, Dafang Zhang	Rather than only taking random measurements following the basic MC theory, to further reduce the cost of MCS while ensuring the quality of missing data inference, we propose an Active Sparse MCS (AS-MCS) scheme which includes a bipartite-graph-based sensing scheduling scheme to actively determine the sampling positions in each upcoming time slot, and a bipartite-graph-based matrix completion algorithm to robustly and accurately recover the un-sampled data in the presence of sensing and communications errors.
13	Autocompletion for Prefix-Abbreviated Input	Sheng Hu, Chuan Xiao, Jianbin Qin, Yoshiharu Ishikawa, Qiang Ma	In this paper, we propose a novel QAC paradigm through which users may abbreviate keywords by prefixes and do not have to explicitly separate them.
14	Progressive Deep Web Crawling Through Keyword Queries For Data Enrichment	Pei Wang, Ryan Shea, Jiannan Wang, Eugene Wu	In this paper, we study a novel problem: how to progressively crawl the deep web (i.e., a hidden database) through a keyword-search API to enrich a local database in an effective way.
15	Visual Segmentation for Information Extraction from Heterogeneous Visually Rich Documents	Ritesh Sarkhel, Arnab Nandi	We propose VS2, a generalized approach for information extraction from heterogeneous visually rich documents.
16	RRR: Rank-Regret Representative	Abolfazl Asudeh, Azade Nazi, Nan Zhang, Gautam Das, H. V. Jagadish	Therefore, we consider the position of the items in the ranked list for defining the regret and propose the rank-regret representative as the minimal subset of the data containing at least one of the top-k of any possible ranking function.
17	Strongly Truthful Interactive Regret Minimization	Min Xie, Raymond Chi-Wing Wong, Ashwin Lall	Specifically, we present a generic framework for interactive regret minimization, under which we propose algorithms that ask an asymptotically optimal number of questions in 2-dimensional spaces and algorithms with provable performance guarantees in d-dimensional spaces ($d \geq 2$) where each dimension corresponds to a description of a tuple.
18	Verifying Text Summaries of Relational Data Sets	Saehan Jo, Immanuel Trummer, Weicheng Yu, Xuezhi Wang, Cong Yu, Daniel Liu, Niyati Mehta	We present a novel natural language query interface, the AggChecker, aimed at text summaries of relational data sets.
19	An End-to-End Automatic Cloud Database Tuning System Using Deep Reinforcement Learning	Ji Zhang, Yu Liu, Ke Zhou, Guoliang Li, Zhili Xiao, Bin Cheng, Jiashu Xing, Yangtao Wang, Tianheng Cheng,	To address these challenges, we design an end-to-end automatic CDB tuning system, CDBTune, using deep reinforcement learning (RL).
20	Fast General Distributed Transactions with Opacity	Alex Shamis, Matthew Renzelmann, Stanko Novakovic, Georgios Chatzopoulos, Aleksandar Dragojevic, Dushyanth Narayanan, Miguel Castro	This paper extends the design of FaRM — which provides strict serializability only for committed transactions — to provide opacity while maintaining FaRM’s high throughput, low latency, and high availability within a modern data center.
21	The Log-Structured Merge-Bush & the Wacky Continuum	Niv Dayan, Stratos Idreos	We introduce the Log-Structured Merge-Bush (LSM-Bush), a new data structure that sets increasing capacity ratios between adjacent pairs of smaller levels.
22	RaSQL: Greater Power and Performance for Big Data Analytics with Recursive-aggregate-SQL on Spark	Jiaqi Gu, Yugo H. Watanabe, William A. Mazza, Alexander Shkapsky, Mohan Yang, Ling Ding, Carlo Zaniolo	RaSQL: Greater Power and Performance for Big Data Analytics with Recursive-aggregate-SQL on Spark
23	Going Beyond Provenance: Explaining Query Answers with Pattern-based Counterbalances	Zhengjie Miao, Qitian Zeng, Boris Glavic, Sudeepa Roy	We present a novel approach for explaining outliers in aggregation queries through counterbalancing.
24	Explaining Wrong Queries Using Small Examples	Zhengjie Miao, Sudeepa Roy, Jun Yang	Therefore, in this paper, given a known counterexample D for $Q_1$ and $Q_2$, we aim to find the smallest counterexample $D' \subseteq D$ where $Q_1(D') \neq Q_2(D')$.
25	Ariadne: Online Provenance for Big Graph Analytics	Vicky Papavasileiou, Ken Yocum, Alin Deutsch	This paper presents Ariadne, a system for capturing and querying provenance from Vertex-Centric graph processing systems.
26	Hypothetical Reasoning via Provenance Abstraction	Daniel Deutch, Yuval Moskovitch, Noam Rinetzky	To this end, we present a framework that allows reducing provenance size.
27	Event Trend Aggregation Under Rich Event Matching Semantics	Olga Poppe, Chuan Lei, Elke A. Rundensteiner, David Maier	Event Trend Aggregation Under Rich Event Matching Semantics
28	Elasticutor: Rapid Elasticity for Realtime Stateful Stream Processing	Li Wang, Tom Z. J. Fu, Richard T. B. Ma, Marianne Winslett, Zhenjie Zhang	We propose an executor-centric approach that avoids operator-level key repartitioning and implements executors as the building blocks of elasticity.
29	Real-Time Multi-Pattern Detection over Event Streams	Ilya Kolchinsky, Assaf Schuster	In this paper, we present a novel framework for real-time multi-pattern complex event processing.
30	AStream: Ad-hoc Shared Stream Processing	Jeyhun Karimov, Tilmann Rabl, Volker Markl	The goal of this paper is to bridge the gap between stream processing and ad-hoc queries in SPEs by sharing computation and resources.
31	Concurrent Prefix Recovery: Performing CPR on a Database	Guna Prasaad, Badrish Chandramouli, Donald Kossmann	In this paper, we propose a new recovery model based on group commit, called concurrent prefix recovery (CPR).
32	BriskStream: Scaling Data Stream Processing on Shared-Memory Multicore Architectures	Shuhao Zhang, Jiong He, Amelie Chi Zhou, Bingsheng He	We introduce BriskStream, an in-memory data stream processing system (DSPS) specifically designed for modern shared-memory multicore architectures.
33	Border-Collie: A Wait-free, Read-optimal Algorithm for Database Logging on Multicore Hardware	Jongbin Kim, Hyeongwon Jang, Seohui Son, Hyuck Han, Sooyong Kang, Hyungsoo Jung	Based on our understanding, we propose Border-Collie, a wait-free and read-optimal algorithm for database logging that finds such an upper bound even with some worker threads often being idle.
34	Designing Distributed Tree-based Index Structures for Fast RDMA-capable Networks	Tobias Ziegler, Sumukha Tumkur Vani, Carsten Binnig, Rodrigo Fonseca, Tim Kraska	In this paper, we therefore discuss design alternatives for distributed tree-based index structures in the NAM architecture.
35	DistME: A Fast and Elastic Distributed Matrix Computation Engine using GPUs	Donghyoung Han, Yoon-Min Nam, Jihye Lee, Kyongseok Park, Hyunwoo Kim, Min-Soo Kim	We propose a distributed elastic matrix multiplication method called CuboidMM that achieves both high performance and large-scale processing.
36	GPU-based Graph Traversal on Compressed Graphs	Mo Sha, Yuchen Li, Kian-Lee Tan	In this paper, we introduce GPU-based graph traversal on compressed graphs, so as to enable the processing of graphs having a larger size than the device memory.
37	Interventional Fairness: Causal Database Repair for Algorithmic Fairness	Babak Salimi, Luke Rodriguez, Bill Howe, Dan Suciu	In this paper, we formalize the situation as a database repair problem, proving sufficient conditions for fair classifiers in terms of admissible variables as opposed to a complete causal model.
38	Uni-Detect: A Unified Approach to Automated Error Detection in Tables	Pei Wang, Yeye He	We propose Uni-Detect, a unified framework to automatically detect diverse types of errors.
39	HoloDetect: Few-Shot Learning for Error Detection	Alireza Heidari, Joshua McGrath, Ihab F. Ilyas, Theodoros Rekatsinas	We introduce a few-shot learning framework for error detection.
40	JOSIE: Overlap Set Similarity Search for Finding Joinable Tables in Data Lakes	Erkang Zhu, Dong Deng, Fatemeh Nargesian, Renée J. Miller	We present a new solution for finding joinable tables in massive data lakes: given a table and one join column, find tables that can be joined with the given table on the largest number of distinct values.
41	Raha: A Configuration-Free Error Detection System	Mohammad Mahdavi, Ziawasch Abedjan, Raul Castro Fernandez, Samuel Madden, Mourad Ouzzani, Michael Stonebraker, Nan Tang	In this paper, we present Raha, a new configuration-free error detection system.
42	Speculative Distributed CSV Data Parsing for Big Data Analytics	Chang Ge, Yinan Li, Eric Eilebrecht, Badrish Chandramouli, Donald Kossmann	To parallelize parsing, this paper proposes a speculation-based approach for the CSV format, arguably the most commonly used raw data format.
43	CATAPULT: Data-driven Selection of Canned Patterns for Efficient Visual Graph Query Formulation	Kai Huang, Huey Eng Chua, Sourav S. Bhowmick, Byron Choi, Shuigeng Zhou	In this paper, we present a generic and extensible framework called Catapult to address these limitations.
44	iQCAR: inter-Query Contention Analyzer for Data Analytics Frameworks	Prajakta Kalmegh, Shivnath Babu, Sudeepa Roy	We introduce iQCAR, an inter-Query Contention Analyzer, that attributes blame for the slowdown of a query to concurrent queries.
45	A Holistic Approach for Query Evaluation and Result Vocalization in Voice-Based OLAP	Immanuel Trummer, Yicheng Wang, Saketh Mahankali	We present a holistic approach that combines query processing and result vocalization.
46	Top-k Queries over Digital Traces	Yifan Li, Xiaohui Yu, Nick Koudas	We theoretically analyze the pruning effectiveness of the proposed methods based on a mobility model which we propose and validate in real life situations.
47	Visual Road: A Video Data Management Benchmark	Brandon Haynes, Amrita Mazumdar, Magdalena Balazinska, Luis Ceze, Alvin Cheung	To accelerate innovation in this area, we present Visual Road, a benchmark that evaluates the performance of these systems.
48	Mining Precision Interfaces From Query Logs	Qianrui Zhang, Haoci Zhang, Thibault Sellam, Eugene Wu	We propose a syntactic approach that uses queries from an analysis to generate a tailored interface.
49	Distance-generalized Core Decomposition	Francesco Bonchi, Arijit Khan, Lorenzo Severini	In this work we introduce a distance-based generalization of the notion of k-core, which we refer to as the $(k,h)$-core, i.e., the maximal subgraph in which every vertex has at least k other vertices at distance $\leq h$ within that subgraph.
50	Unboundedness and Efficiency of Truss Maintenance in Evolving Graphs	Yikai Zhang, Jeffrey Xu Yu	We report our findings in this paper.
51	PRSim: Sublinear Time SimRank Computation on Large Power-Law Graphs	Zhewei Wei, Xiaodong He, Xiaokui Xiao, Sibo Wang, Yu Liu, Xiaoyong Du, Ji-Rong Wen	This paper proposes PRSim, an algorithm that exploits the structure of graphs to efficiently answer single-source SimRank queries.
52	Scaling Distance Labeling on Small-World Networks	Wentao Li, Miao Qiao, Lu Qin, Ying Zhang, Lijun Chang, Xuemin Lin	This paper scales distance labeling on small-world networks by proposing a Parallel Shortest-distance Labeling (PSL) scheme and further reducing the index size by exploiting graph and label properties.
53	Maximizing Welfare in Social Networks under A Utility Driven Influence Diffusion model	Prithu Banerjee, Wei Chen, Laks V.S. Lakshmanan	In this paper, we address all three limitations and propose a novel model called UIC that combines utility-driven item adoption with influence propagation over networks.
54	Efficient Approximation Algorithms for Adaptive Seed Minimization	Jing Tang, Keke Huang, Xiaokui Xiao, Laks V.S. Lakshmanan, Xueyan Tang, Aixin Sun, Andrew Lim	We propose a novel algorithm, ASTI, which addresses the adaptive seed minimization problem in $O\Big(\frac?
55	DeepBase: Deep Inspection of Neural Networks	Thibault Sellam, Kevin Lin, Ian Huang, Michelle Yang, Carl Vondrick, Eugene Wu	Our insight is that many of those studies follow a common analysis pattern, and therefore there is opportunity to provide a declarative abstraction to easily express, execute and optimize them. We discuss how DeepBase can express existing analyses, propose a set of simple and effective optimizations to speed up a standard Python implementation by up to 72x, and reproduce recent studies from the NLP literature.
56	BlinkML: Efficient Maximum Likelihood Estimation with Probabilistic Guarantees	Yongjoo Park, Jingyi Qing, Xiaoyang Shen, Barzan Mozafari	In this paper, we introduce BlinkML, a system for fast, quality-guaranteed ML training.
57	SkinnerDB: Regret-Bounded Query Evaluation via Reinforcement Learning	Immanuel Trummer, Junxiong Wang, Deepak Maram, Samuel Moseley, Saehan Jo, Joseph Antonakakis	Along with SkinnerDB, we introduce a new quality criterion for query execution strategies.
58	Democratizing Data Science through Interactive Curation of ML Pipelines	Zeyuan Shang, Emanuel Zgraggen, Benedetto Buratti, Ferdinand Kossmann, Philipp Eichmann, Yeounoh Chung, Carsten Binnig, Eli Upfal, Tim Kraska	In this paper we present Alpine Meadow, a first Interactive Automated Machine Learning tool.
59	FITing-Tree: A Data-aware Index Structure	Alex Galakatos, Michael Markovitch, Carsten Binnig, Rodrigo Fonseca, Tim Kraska	In this paper, we present a novel data-aware index structure called FITing-Tree which approximates an index using piece-wise linear functions with a bounded error specified at construction time.
60	Hyperion: Building the Largest In-memory Search Tree	Markus Mäsker, Tim Süß, Lars Nagel, Lingfang Zeng, André Brinkmann	In this paper we present Hyperion, a trie-based main-memory key-value store achieving extreme space efficiency.
61	Designing Succinct Secondary Indexing Mechanism by Exploiting Column Correlations	Yingjun Wu, Jia Yu, Yuanyuan Tian, Richard Sidle, Ronald Barber	In this paper, we demonstrate that there exist many opportunities to exploit column correlations for accelerating data access.
62	AI Meets AI: Leveraging Query Executions to Improve Index Recommendations	Bailu Ding, Sudipto Das, Ryan Marcus, Wentao Wu, Surajit Chaudhuri, Vivek R. Narasayya	We present a study of the design space for this classification problem.
63	Designing Fair Ranking Schemes	Abolfazl Asudeh, H. V. Jagadish, Julia Stoyanovich, Gautam Das	In this paper, we develop a system that helps users choose criterion weights that lead to greater fairness.
64	Anti-Freeze for Large and Complex Spreadsheets: Asynchronous Formula Computation	Mangesh Bendre, Tana Wattanawaroon, Kelly Mack, Kevin Chang, Aditya Parameswaran	We propose a new asynchronous formula computation framework: instead of freezing the interface we return control to users quickly to ensure interactivity, while computing the formulae in the background.
65	Anytime Approximation in Probabilistic Databases via Scaled Dissociations	Maarten Van den Heuvel, Peter Ivanov, Wolfgang Gatterbauer, Floris Geerts, Martin Theobald	We propose a new anytime B&B approximation scheme that encompasses all prior model-based approximation schemes proposed in the PDB and SRL literature.
66	Uncertainty Annotated Databases – A Lightweight Approach for Approximating Certain Answers	Su Feng, Aaron Huber, Boris Glavic, Oliver Kennedy	In this paper, we propose Uncertainty Annotated Databases (UA-DBs), which combine an under- and over-approximation of certain answers to achieve the reliability of certain answers, with the performance of a classical database system.
67	Efficient Estimation of Heat Kernel PageRank for Local Clustering	Renchi Yang, Xiaokui Xiao, Zhewei Wei, Sourav S. Bhowmick, Jun Zhao, Rong-Hua Li	In this paper, we present TEA and TEA+, two novel local graph clustering algorithms based on HKPR, to address the aforementioned limitations.
68	Fractal: A General-Purpose Graph Pattern Mining System	Vinicius Dias, Carlos H. C. Teixeira, Dorgival Guedes, Wagner Meira, Srinivasan Parthasarathy	In this paper we propose Fractal, a high performance and high productivity system for supporting distributed graph pattern mining (GPM) applications.
69	Experimental Analysis of Streaming Algorithms for Graph Partitioning	Anil Pacaci, M. Tamer Özsu	The main objective of this study is to understand how the choice of graph partitioning algorithm affects system performance, resource usage and scalability.
70	Interactive Graph Search	Yufei Tao, Yuanbing Li, Guoliang Li	We describe algorithms that solve the problem by asking a provably small number of questions, and establish lower bounds indicating that the algorithms are optimal up to a small additive factor.
71	Optimizing Declarative Graph Queries at Large Scale	Qizhen Zhang, Akash Acharya, Hongzhi Chen, Simran Arora, Ang Chen, Vincent Liu, Boon Thau Loo	This paper presents GraphRex, an efficient, robust, scalable, and easy-to-program framework for graph processing on datacenter infrastructure.
72	Efficient Subgraph Matching: Harmonizing Dynamic Programming, Adaptive Matching Order, and Failing Set Together	Myoungji Han, Hyunjoon Kim, Geonmo Gu, Kunsoo Park, Wook-Shin Han	In this paper, we introduce three novel concepts to address these inherent limitations: 1) dynamic programming between a directed acyclic graph (DAG) and a graph, 2) adaptive matching order with DAG ordering, and 3) pruning by failing sets, which together lead to a much faster algorithm DAF for subgraph matching.
73	CECI: Compact Embedding Cluster Index for Scalable Subgraph Matching	Bibek Bhattarai, Hang Liu, H. Howie Huang	In this paper, we propose a novel framework for subgraph listing based on Compact Embedding Cluster Index (CECI), which divides the data graph into multiple embedding clusters for parallel processing.
74	Efficiently Answering Regular Simple Path Queries on Large Labeled Networks	Sarisht Wadhwa, Anagh Prasad, Sayan Ranu, Amitabha Bagchi, Srikanta Bedathur	In this paper, we circumvent this computational bottleneck by designing a random-walk based sampling algorithm called ARRIVAL, which is backed by theoretical guarantees on its expected quality.
75	Answering Why-questions by Exemplars in Attributed Graphs	Mohammad Hossein Namaki, Qi Song, Yinghui Wu, Shengqi Yang	(1) We characterize the problem by Q-Chase.
76	An Efficient Index for RDF Query Containment	Theofilos Mailis, Yannis Kotidis, Vaggelis Nikolopoulos, Evgeny Kharlamov, Ian Horrocks, Yannis Ioannidis	Based on this observation, we propose a novel indexing structure, named mv-index, that allows for fast containment checking between a single f-graph query and an arbitrary number of stored queries.
77	Tuple-oriented Compression for Large-scale Mini-batch Stochastic Gradient Descent	Fengan Li, Lingjiao Chen, Yijing Zeng, Arun Kumar, Xi Wu, Jeffrey F. Naughton, Jignesh M. Patel	We fill this crucial research gap by proposing a new lossless compression scheme we call tuple-oriented compression (TOC) that is inspired by an unlikely source, the string/text compression scheme Lempel-Ziv-Welch, but tailored to MGD in a way that preserves tuple boundaries within mini-batches.
78	Towards Model-based Pricing for Machine Learning in a Data Marketplace	Lingjiao Chen, Paraschos Koutris, Arun Kumar	In this paper, we propose a model-based pricing (MBP) framework, which instead of pricing the data, directly prices ML model instances.
79	DBEst: Revisiting Approximate Query Processing Engines with Machine Learning Models	Qingzhi Ma, Peter Triantafillou	The paper presents DBEst, a system based on Machine Learning models (regression models and probability density estimators).
80	Enabling and Optimizing Non-linear Feature Interactions in Factorized Linear Algebra	Side Li, Lingjiao Chen, Arun Kumar	In this work, we take a first step towards closing this gap by introducing a new abstraction to enable pairwise feature interactions in multi-table data and present an extensive framework of algebraic rewrite rules for factorized LA operators over feature interactions.
81	Incremental and Approximate Inference for Faster Occlusion-based Deep CNN Explanations	Supun Nakandala, Arun Kumar, Yannis Papakonstantinou	We prototype our ideas in Python to create a tool we call Krypton that supports both CPUs and GPUs.
82	MNC: Structure-Exploiting Sparsity Estimation for Matrix Expressions	Johanna Sommer, Matthias Boehm, Alexandre V. Evfimievski, Berthold Reinwald, Peter J. Haas	In this paper, we introduce MNC (Matrix Non-zero Count), a remarkably simple, count-based matrix synopsis that exploits these structural properties for efficient, accurate, and general sparsity estimation.
83	A Scalable Index for Top-k Subtree Similarity Queries	Daniel Kocher, Nikolaus Augsten	We present a scalable solution for the top-k subtree similarity problem that does not assume specific data types, nor does it require any tuning.
84	A Layered Aggregate Engine for Analytics Workloads	Maximilian Schleich, Dan Olteanu, Mahmoud Abo Khamis, Hung Q. Ngo, XuanLong Nguyen	This paper introduces LMFAO (Layered Multiple Functional Aggregate Optimization), an in-memory optimization and execution engine for batches of aggregates over the input database.
85	Towards Scalable Hybrid Stores: Constraint-Based Rewriting to the Rescue	Rana Alotaibi, Damian Bursztyn, Alin Deutsch, Ioana Manolescu, Stamatis Zampetakis	We present ESTOCADA, a novel approach connecting applications to the potentially heterogeneous systems where their input data resides.
86	MIFO: A Query-Semantic Aware Resource Allocation Policy	Prajakta Kalmegh, Shivnath Babu	We present heuristics that exploit query semantics to proactively trigger MIFO-based allocations in a workload.
87	Dissecting the Performance of Strongly-Consistent Replication Protocols	Ailidani Ailijiang, Aleksey Charapko, Murat Demirbas	To fill this gap, we study single-leader, multi-leader, hierarchical multi-leader, and leaderless (opportunistic leader) consensus protocols, and present a comprehensive evaluation of their performance in local area networks (LANs) and wide area networks (WANs).
88	FishStore: Faster Ingestion with Subset Hashing	Dong Xie, Badrish Chandramouli, Yinan Li, Donald Kossmann	This paper builds on recent advances in parsing and indexing techniques to propose FishStore, a concurrent latch-free storage layer for data with flexible schema, based on multi-chain hash indexing of dynamically registered predicated subsets of data.