Paper Digest: NSDI 2019 Highlights

February 25, 2019 | November 5, 2019 | admin

The USENIX Symposium on Networked Systems Design and Implementation (NSDI) is one of the top conferences on networked systems. In 2019, it is to be held in Boston, Massachusetts. Of 332 submissions, 49 were accepted.

To help the community quickly catch up on the work presented at this conference, the Paper Digest Team processed all accepted papers and generated one highlight sentence (typically the main topic) for each paper. Readers are encouraged to read these machine-generated highlights / summaries to quickly get the main idea of each paper.

We thank all authors for writing these interesting papers, and readers for reading our digests. If you do not want to miss any interesting academic paper, you are welcome to sign up for our free daily paper digest service to get new paper updates customized to your own interests on a daily basis. You are also welcome to follow us on Twitter and LinkedIn for conference digest updates.

Paper Digest Team
team@paperdigest.org

TABLE 1: NSDI 2019 Papers

	Title	Authors	Highlight
1	Datacenter RPCs can be General and Fast	Anuj Kalia, Michael Kaminsky, David Andersen,	It is commonly believed that datacenter networking software must sacrifice generality to attain high performance. The popularity of specialized distributed systems designed specifically for niche technologies such as RDMA, lossless networks, FPGAs, and programmable switches testifies to this belief. In this paper, we show that such specialization is not necessary.
2	Eiffel: Efficient and Flexible Software Packet Scheduling	Ahmed Saeed and Yimeng Zhao, Nandita Dukkipati, Ellen Zegura and Mostafa Ammar, Khaled Harras, Amin Vahdat,	Our focus in this paper is on the design and deployment of packet scheduling in software.
3	Loom: Flexible and Efficient NIC Packet Scheduling	Brent Stephens, Aditya Akella and Michael Swift,	To overcome these limitations, we present Loom, a new NIC design that moves all per-flow scheduling decisions out of the OS and into the NIC.
4	Exploiting Commutativity For Practical Fast Replication	Seo Jin Park and John Ousterhout,	In this paper, we show that this entanglement of ordering and durability is unnecessary for strong consistency.
5	Flashield: a Hybrid Key-value Cache that Controls Flash Write Amplification	Assaf Eisenman, Asaf Cidon, Evgenya Pergament and Or Haimovich, Ryan Stutsman, Mohammad Alizadeh, Sachin Katti,	We present Flashield, a hybrid key-value cache that uses DRAM as a "filter" to control and limit writes to SSD.
6	Size-aware Sharding For Improving Tail Latencies in In-memory Key-value Stores	Diego Didona, Willy Zwaenepoel,	This paper introduces the concept of size-aware sharding to improve tail latencies for in-memory key-value stores, and describes its implementation in the Minos key-value store.
7	Monoxide: Scale out Blockchains with Asynchronous Consensus Zones	Jiaping Wang, Hao Wang,	In this paper, we introduce the Asynchronous Consensus Zones, which scales blockchain system linearly without compromising decentralization or security.
8	FreeFlow: Software-based Virtual RDMA Networking for Containerized Clouds	Daehyeok Kim and Tianlong Yu, Hongqiang Harry Liu, Yibo Zhu, Jitu Padhye and Shachar Raindel, Chuanxiong Guo, Vyas Sekar and Srinivasan Seshan,	In this paper, we present FreeFlow, a software-based RDMA virtualization framework designed for containerized clouds.
9	Direct Universal Access: Making Data Center Resources Available to FPGA	Ran Shu and Peng Cheng, Guo Chen, Zhiyuan Guo, Lei Qu and Yongqiang Xiong, Derek Chiou and Thomas Moscibroda,	In this paper, we present Direct Universal Access (DUA), a communication architecture that provides uniform access for FPGA to heterogeneous data center resources.
10	Stardust: Divide and Conquer in the Data Center Network	Noa Zilberman, Gabi Bracha and Golan Schzukin,	We introduce Stardust, a fabric architecture for data center scale networks, inspired by network-switch systems.
11	Blink: Fast Connectivity Recovery Entirely in the Data Plane	Thomas Holterbach, Edgar Costa Molero, and Maria Apostolaki, Alberto Dainotti, Stefano Vissicchio, Laurent Vanbever,	In this paper, we explore new possibilities, created by programmable switches, for fast rerouting upon signals triggered by Internet traffic disruptions.
12	Hydra: a federated resource manager for data-center scale analytics	Carlo Curino, Subru Krishnan, and Konstantinos Karanasos, Sriram Rao, Giovanni M. Fumarola, Botong Huang, Kishore Chaliparambil, Arun Suresh, Young Chen, Solom Heddaya, Roni Burd, Sarvesh Sakalanaga, Chris Douglas, Bill Ramsey, and Raghu Ramakrishnan,	In this paper, we present Hydra, the resource management infrastructure we built to meet these requirements. Hydra leverages a federated architecture, in which a cluster is comprised of multiple, loosely coordinating subclusters.
13	Shuffling, Fast and Slow: Scalable Analytics on Serverless Infrastructure	Qifan Pu, Shivaram Venkataraman, Ion Stoica,	In this paper, we present Locus, a serverless analytics system that judiciously combines (1) cheap but slow storage with (2) fast but expensive storage, to achieve good performance while remaining cost-efficient.
14	dShark: A General, Easy to Program and Scalable Framework for Analyzing In-network Packet Traces	Da Yu, Yibo Zhu, Behnaz Arzani, Rodrigo Fonseca, Tianrong Zhang, Karl Deng, and Lihua Yuan,	Arbitrary combinations of middleboxes which transform packet headers make it challenging to even identify the same packet across multiple hops; packet drops in the collection system create ambiguities that must be handled; the large volume of captures, and their distributed nature, make it hard to do even simple processing; and the one-off and urgent nature of problems tends to generate ad-hoc solutions that are not reusable and do not scale. In this paper we propose dShark to address these challenges.
15	Minimal Rewiring: Efficient Live Expansion for Clos Data Center Networks	Shizhen Zhao, Rui Wang, Junlan Zhou, Joon Ong, Jeffrey C. Mogul, and Amin Vahdat,	We use a layer of patch panels between blocks of switches in a Clos DCN, which makes physical rewiring feasible, and we describe how to use integer linear programming (ILP) to minimize the number of patch-panel connections that must be changed, which makes expansions faster and cheaper.
16	Understanding Lifecycle Management Complexity of Datacenter Topologies	Mingyang Zhang, Radhika Niranjan Mysore, Sucha Supittayapornpong and Ramesh Govindan,	In this paper, we explore a new dimension, life cycle management, which attempts to capture operational costs of topologies.
17	Shoal: A Network Architecture for Disaggregated Racks	Vishal Shrivastav, Asaf Valadarsky, Hitesh Ballani and Paolo Costa, Ki Suh Lee, Han Wang, Rachit Agarwal and Hakim Weatherspoon,	We present Shoal, a power-efficient yet performant intra-rack network fabric built using fast circuit switches.
18	NetScatter: Enabling Large-Scale Backscatter Networks	Mehrdad Hessar, Ali Najafi, and Shyamnath Gollakota,	We present the first wireless protocol that scales to hundreds of concurrent transmissions from backscatter devices.
19	Towards Programming the Radio Environment with Large Arrays of Inexpensive Antennas	Zhuqi Li, Yaxiong Xie, and Longfei Shangguan, Rotman Ivan Zelaya, Jeremy Gummeson, Wenjun Hu, Kyle Jamieson,	In this work, we instrument the environment with a large array of inexpensive antennas (LAIA) and design algorithms to configure them in real time.
20	Pushing the Range Limits of Commercial Passive RFIDs	Jingxian Wang, Junbo Zhang, Rajarshi Saha, Haojian Jin and Swarun Kumar,	We present PushID, a system that exploits collaboration between readers to enhance the range of commercial passive RFID tags, without altering the tags whatsoever.
21	SweepSense: Sensing 5 GHz in 5 Milliseconds with Low-cost Radios	Yeswanth Guddeti, Raghav Subbaraman, Moein Khazraee, Aaron Schulman, and Dinesh Bharadia,	To overcome this challenge, we correct the distortion with self-generated calibration data, and classify the protocol that originated each transmission with only a fraction of the transmission's samples.
22	Slim: OS Kernel Support for a Low-Overhead Container Overlay Network	Danyang Zhuo and Kaiyuan Zhang, Yibo Zhu, Hongqiang Harry Liu, Matthew Rockett, Arvind Krishnamurthy, and Thomas Anderson,	We have designed and implemented Slim, a low-overhead container overlay network that implements network virtualization by manipulating connection-level metadata.
23	Shinjuku: Preemptive Scheduling for µsecond-scale Tail Latency	Kostis Kaffes, Timothy Chong, and Jack Tigar Humphries, Adam Belay, David Mazières and Christos Kozyrakis,	Shinjuku is a single-address space operating system that uses hardware support for virtualization to make preemption practical at the microsecond scale.
24	Shenango: Achieving High CPU Efficiency for Latency-sensitive Datacenter Workloads	Amy Ousterhout, Joshua Fried, Jonathan Behrens, Adam Belay, and Hari Balakrishnan,	It achieves such fast reallocation rates with (1) an efficient algorithm that detects when applications would benefit from more cores, and (2) a privileged component called the IOKernel that runs on a dedicated core, steering packets from the NIC and orchestrating core reallocations.
25	End-to-end I/O Monitoring on a Leading Supercomputer	Bin Yang, Xu Ji, Xiaosong Ma, Xiyang Wang, Tianyu Zhang and Xiupeng Zhu, Nosayba El-Sayed, Haidong Lan and Yibo Yang, Jidong Zhai, Weiguo Liu, Wei Xue,	This paper presents an effort to overcome the complexities of production-use I/O performance monitoring.
26	Zeno: Diagnosing Performance Problems with Temporal Provenance	Yang Wu, Ang Chen, Linh Thi Xuan Phan,	We present an algorithm for generating temporal provenance and an experimental debugger called Zeno; our experimental evaluation shows that Zeno can successfully diagnose several realistic performance bugs.
27	Confluo: Distributed Monitoring and Diagnosis Stack for High-speed Networks	Anurag Khandelwal, Rachit Agarwal, Ion Stoica,	Confluo is an end-host stack that can be integrated with existing network management tools to enable monitoring and diagnosis of network-wide events using telemetry data distributed across end-hosts, even for high-speed networks.
28	DETER: Deterministic TCP Replay for Performance Diagnosis	Yuliang Li, Rui Miao, Mohammad Alizadeh, Minlan Yu,	In this paper, we introduce DETER, a deterministic TCP replay tool, which runs lightweight recording all the time at all the hosts and then replays selected collections where operators can collect packet traces and trace TCP executions for diagnosis.
29	JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs	Eunji Jeong, Sungwoo Cho, Gyeong-In Yu, Joo Seong Jeong, Dong-Jin Shin, and Byung-Gon Chun,	This paper presents JANUS, a system that combines the advantages from both sides by transparently converting an imperative DL program written in Python, the de-facto scripting language for DL, into an efficiently executable symbolic dataflow graph.
30	BLAS-on-flash: An Efficient Alternative for Large Scale ML Training and Inference?	Suhas Jayaram Subramanya and Harsha Vardhan Simhadri, Srajan Garg, Anil Kag and Venkatesh Balasubramanian,	We propose an inexpensive and efficient alternative based on the observation that many ML tasks admit algorithms that can be programmed with linear algebra subroutines.
31	Tiresias: A GPU Cluster Manager for Distributed Deep Learning	Juncheng Gu, Mosharaf Chowdhury, and Kang G. Shin, Yibo Zhu, Myeongjae Jeon, Junjie Qian, Hongqiang Liu, Chuanxiong Guo,	We present Tiresias, a GPU cluster manager tailored for distributed DL training jobs, which efficiently schedules and places DL jobs to reduce their job completion times (JCTs).
32	Correctness and Performance for Stateful Chained Network Functions	Junaid Khalid and Aditya Akella,	To this end, we built CHC, an NFV framework that leverages an external state store coupled with state management algorithms and metadata maintenance for correct operation even under a range of failures.
33	Performance Contracts for Software Network Functions	Rishabh Iyer, Luis Pedrosa, Arseniy Zaostrovnykh, Solal Pirelli, Katerina Argyraki, and George Candea,	We describe BOLT, a technique and tool for computing such performance contracts for the entire software stack of NFs written in C, including the core NF logic, DPDK packet processing framework, and NIC driver.
34	FlowBlaze: Stateful Packet Processing in Hardware	Salvatore Pontarelli, Roberto Bifulco, Marco Bonola, Carmelo Cascone, Marco Spaziani and Valerio Bruschi, Davide Sanvito, Giuseppe Siracusano, Antonio Capone, Michio Honda and Felipe Huici, Giuseppe Bianchi,	We address the problem with FlowBlaze, an open abstraction for building stateful packet processing functions in hardware.
35	SIMON: A Simple and Scalable Method for Sensing, Inference and Measurement in Data Center Networks	Yilong Geng, Shiyu Liu, and Zi Yin, Ashish Naik, Balaji Prabhakar and Mendel Rosenblum, Amin Vahdat,	In this paper, we set out to push the boundary of edge-based measurement by scalably and accurately reconstructing the full queueing dynamics in the network with data gathered entirely at the transmit and receive network interface cards (NICs).
36	Is advance knowledge of flow sizes a plausible assumption?	Vojislav Đukić, Sangeetha Abdu Jyothi, Bojan Karlaš, Muhsen Owaida, Ce Zhang, and Ankit Singla,	These results indicate that a presumed lack of advance knowledge of flow sizes is not necessarily prohibitive for highly efficient scheduling, and suggest further exploration in two directions: (a) scheduling under partial knowledge; and (b) evaluating the practical payoff and expense of obtaining more knowledge.
37	Stable and Practical AS Relationship Inference with ProbLink	Yuchen Jin, Colin Scott, Amogh Dhamdhere, Vasileios Giotsas, Arvind Krishnamurthy, Scott Shenker,	We then develop a probabilistic algorithm, ProbLink, to overcome the inference barriers for hard links, such as non-valley-free routing, limited visibility, and non-conventional peering practices.
38	NetBouncer: Active Device and Link Failure Localization in Data Center Networks	Cheng Tan, Ze Jin, Chuanxiong Guo, Tianrong Zhang, Haitao Wu, Karl Deng, Dongming Bi, and Dong Xiang,	In this paper, we propose NetBouncer, a failure localization system that leverages the IP-in-IP technique to actively probe paths in a data center network.
39	Riverbed: Enforcing User-defined Privacy Constraints in Distributed Web Services	Frank Wang, Ronny Ko and James Mickens,	Riverbed is a new framework for building privacy-respecting web services.
40	Hyperscan: A Fast Multi-pattern Regex Matcher for Modern CPUs	Xiang Wang, Yang Hong, and Harry Chang, KyoungSoo Park, Geoff Langdale, Jiayu Hu and Heqing Zhu,	In this paper, we present Hyperscan, a high performance regular expression matcher for commodity server machines.
41	Deniable Upload and Download via Passive Participation	David Sommer, Aritra Dhar, Luka Malisa, and Esfandiar Mohammadi, Daniel Ronzani, Srdjan Capkun,	In order to enable plausible deniability while providing or accessing controversial information, we design CoverUp: a system that enables users to asynchronously upload and download data.
42	CAUDIT: Continuous Auditing of SSH Servers To Mitigate Brute-Force Attacks	Phuong M. Cao, Yuming Wu, and Subho S. Banerjee, Justin Azoff and Alex Withers, Zbigniew T. Kalbarczyk and Ravishankar K. Iyer,	This paper describes CAUDIT, an operational system deployed at the National Center for Supercomputing Applications (NCSA) at the University of Illinois.
43	Dataplane equivalence and its applications	Dragos Dumitrescu, Radu Stoenescu, Matei Popovici, Lorina Negreanu, and Costin Raiciu,	We present the design and implementation of netdiff, an algorithm that uses symbolic execution to check the equivalence of two network dataplanes modeled in SEFL.
44	Alembic: Automated Model Inference for Stateful Network Functions	Soo-Jin Moon, Jeffrey Helt, Yifei Yuan, Yves Bieri, Sujata Banerjee, Vyas Sekar, Wenfei Wu, Mihalis Yannakakis, Ying Zhang,	In this work, we present Alembic, which synthesizes NF models viewed as an ensemble of finite-state machines (FSMs).
45	Model-Agnostic and Efficient Exploration of Numerical State Space of Real-World TCP Congestion Control Implementations	Wei Sun and Lisong Xu, Sebastian Elbaum, Di Zhao,	In this paper, we propose an automated numerical state space exploration method, called ACT, which leverages the model-agnostic feature of random testing and greatly improves its efficiency by guiding random testing under the feedback iteratively obtained in a test.
46	Scaling Community Cellular Networks with CommunityCellularManager	Shaddi Hasan, Mary Claire Barela, Matthew Johnson, Eric Brewer, Kurtis Heimerl,	In this paper, we present CommunityCellularManager (CCM), a system for operating community cellular networks at scale.
47	TrackIO: Tracking First Responders Inside-Out	Ashutosh Dhekne, Ayon Chakraborty, Karthikeyan Sundaresan, and Sampath Rangarajan,	In this work, we present the design, implementation and evaluation of TrackIO–a system capable of accurately localizing and tracking mobile responders real-time in large indoor environments.
48	3D Backscatter Localization for Fine-Grained Robotics	Zhihong Luo, Qiping Zhang, Yunfei Ma, Manish Singh, and Fadel Adib,	This paper presents the design and implementation of TurboTrack, a 3D localization system for fine-grained robotic tasks.
49	Many-to-Many Beam Alignment in Millimeter Wave Networks	Suraj Jog, Jiaming Wang, Junfeng Guan, Thomas Moon, Haitham Hassanieh, and Romit Roy Choudhury,	This paper presents BounceNet, the first many-to-many millimeter wave beam alignment protocol that can exploit dense spatial reuse to allow many links to operate in parallel in a confined space and scale the wireless throughput with the number of clients.