
IEEE/ACM TRANSACTIONS ON NETWORKING
Fluid-Shuttle: Efficient Cloud Data Transmission
Based on Serverless Computing Compression
Rong Gu, Member, IEEE, Shulin Wang, Haipeng Dai, Senior Member, IEEE, Xiaofei Chen,
Zhaokang Wang, Wenjie Bao, Jiaqi Zheng, Senior Member, IEEE, Yaofeng Tu, Yihua Huang,
Lianyong Qi, Senior Member, IEEE, Xiaolong Xu, Senior Member, IEEE,
Wanchun Dou, and Guihai Chen, Fellow, IEEE
Abstract— Nowadays, there exists a lot of cross-region data
transmission demand on the cloud. It is promising to use
serverless computing to compress data and thus reduce the total data
size. However, it is challenging to estimate the data transmission
time and monetary cost with serverless compression. In addition,
minimizing the data transmission cost is non-trivial due to the
enormous parameter space. This paper focuses on this problem
and makes the following contributions: 1) We propose empirical
data transmission time and monetary cost models based on
serverless compression. The models can also predict compression
information, e.g., ratio and speed, using chunk sampling and machine
learning techniques. 2) For single-task cloud data transmission,
we propose two efficient parameter search methods based on
Sequential Quadratic Programming (SQP) and Eliminate then
Divide and Conquer (EDC) with proven error upper bounds.
Besides, we propose a parameter fine-tuning strategy to deal
with transmission bandwidth variance. 3) Furthermore, for multi-
task scenarios, a parameter search method based on dynamic
programming and numerical computation is proposed. We have
implemented a system called Fluid-Shuttle, which includes
straggler optimization, cache optimization, and an autoscaling
decompression mechanism. Finally, we evaluate the performance
of Fluid-Shuttle with various workloads and applications on the
real-world AWS serverless computing platform. Experimental
results show that the proposed approach improves the parameter
search efficiency by over 3× compared with the state-of-the-art
methods and achieves better parameter quality. In addition, our
approach achieves higher time efficiency and lower monetary cost
compared with competing cloud data transmission approaches.
Manuscript received 25 May 2023; revised 7 February 2024;
accepted 18 April 2024; approved by IEEE/ACM TRANSACTIONS ON
NETWORKING Editor R. Pedarsani. This work was supported in part by
the National Natural Science Foundation of China under Grant 62072230,
Grant 62272223, and Grant U22A2031; in part by Jiangsu Province
Science and Technology Key Program under Grant BE2021729; and in
part by the Collaborative Innovation Center of Novel Software Technology
and Industrialization. (Corresponding authors: Rong Gu; Haipeng Dai;
Jiaqi Zheng.)
Rong Gu, Shulin Wang, Haipeng Dai, Xiaofei Chen, Wenjie Bao,
Jiaqi Zheng, Yihua Huang, Wanchun Dou, and Guihai Chen are
with the State Key Laboratory for Novel Software Technology,
Nanjing University, Nanjing, Jiangsu 210023, China (e-mail:
gurong@nju.edu.cn; wangshulin@smail.nju.edu.cn; haipengdai@nju.edu.cn;
xfchen@smail.nju.edu.cn; bwj_678@qq.com; jzheng@nju.edu.cn;
yhuang@nju.edu.cn; douwc@nju.edu.cn; gchen@nju.edu.cn).
Zhaokang Wang is with the College of Computer Science and Technology,
Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China
(e-mail: wangzhaokang@nuaa.edu.cn).
Yaofeng Tu is with ZTE Corporation, Shenzhen 518057, China (e-mail:
tu.yaofeng@zte.com.cn).
Lianyong Qi is with the College of Computer Science and Technology,
China University of Petroleum (East China), Dongying 257099, China
(e-mail: lianyongqi@gmail.com).
Xiaolong Xu is with the School of Software, Nanjing University of
Information Science and Technology, Nanjing 210044, China (e-mail:
xlxu@nuist.edu.cn).
Digital Object Identifier 10.1109/TNET.2024.3402561
Index Terms— Data transmission, serverless compression,
cloud function configuration.
I. INTRODUCTION
NOWADAYS, a large amount of data needs to be transferred
across data centers or cloud regions [1]. For
example, software/model distribution, database replication,
search index synchronization, and other data backup opera-
tions require frequent data transmission on the cloud [2]. It is
reported that 70% of IT firms have massive data transmission
among data centers, ranging from 330 TB to 3.3 PB per
month, and the amount of data keeps growing rapidly [3]. Data
transmission over long distances consumes massive bandwidth
resources, which is costly on the cloud. Cloud service
providers spend hundreds of millions of US dollars on
data transmission every year [4]. Therefore, improving time
efficiency and reducing the monetary cost of cross-region
data transmission on the cloud is vital. In order to save the
bandwidth cost and improve data transmission efficiency, data
is usually compressed before transmission [5], [6]. However,
data compression itself brings extra computation costs. Thus,
it is important to make a tradeoff between the compression
computation cost and the saved bandwidth cost.
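The tradeoff above can be illustrated with a minimal sketch. It is a toy linear cost model, not the model proposed in this paper: it assumes that bandwidth is billed per transmitted GB and that compression is billed per raw GB. The function names, the prices, and the compression ratio are hypothetical values chosen only for illustration.

```python
"""Toy break-even model for compress-then-transfer (illustrative only)."""


def transfer_cost(size_gb, egress_price, ratio=1.0, compute_price=0.0):
    """Return the monetary cost of sending `size_gb` of raw data.

    egress_price  -- bandwidth price per transmitted GB (hypothetical)
    ratio         -- compression ratio, raw size / compressed size (>= 1)
    compute_price -- compression cost per raw GB (hypothetical)
    """
    if size_gb < 0 or ratio < 1.0:
        raise ValueError("size_gb must be >= 0 and ratio must be >= 1")
    compute = size_gb * compute_price             # paid on the raw bytes
    bandwidth = (size_gb / ratio) * egress_price  # paid on the bytes sent
    return compute + bandwidth


def break_even_compute_price(egress_price, ratio):
    """Largest per-GB compression price at which compressing still pays off."""
    return egress_price * (1.0 - 1.0 / ratio)


def compression_pays_off(size_gb, egress_price, ratio, compute_price):
    """True if compressing before sending is strictly cheaper than sending raw."""
    raw = transfer_cost(size_gb, egress_price)
    compressed = transfer_cost(size_gb, egress_price, ratio, compute_price)
    return compressed < raw


if __name__ == "__main__":
    # Hypothetical numbers: 10 GB, 0.02 per GB egress, ratio 4.
    print(break_even_compute_price(0.02, 4.0))
    print(compression_pays_off(10, 0.02, 4.0, 0.005))
    print(compression_pays_off(10, 0.02, 4.0, 0.02))
```

Under these assumptions, compression is worthwhile only while its per-GB computation price stays below the egress price multiplied by (1 - 1/ratio). A higher compression ratio therefore tolerates a more expensive compressor, which is why the ratio and speed of compression both matter when choosing parameters.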
In the traditional cloud environment, it is common to rent
virtual machines (VMs) for data compression [7]. However,
VMs are heavyweight for data transmission tasks because they
usually take nearly 1 minute to start [8] and are charged
hourly. In recent years, serverless computing has been emerging
as the next generation of cloud computing technology [9].
It provides computing resources through cloud functions (e.g., AWS
Lambda [10]) with strong elasticity and fine-grained billing.
We compared the data transmission time and monetary cost
of cloud function compression with those of virtual machine
compression when transferring a 1 GB Lineitem dataset [11] on
AWS.¹ Experimental results show that the end-to-end data
transmission time of serverless compression (including cold
start time) is 1/4 of that of virtual machine compression
(including boot-up time).
¹We use a typical AWS EC2 c5a.xlarge instance with 4 vCPUs and 8 GB of
memory, and 4 cloud functions, each with 1536 MB of memory.
Nevertheless, achieving efficient data transmission with
serverless compression faces the challenge of choosing
1558-2566 © 2024 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See https://www.ieee.org/publications/rights/index.html for more information.
This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.