暂无图片
暂无图片
暂无图片
暂无图片
暂无图片
Efficiently Supporting Adaptive Multi-Level Serializability Models in Distributed Database Systems .pdf
175
3页
3次
2021-11-29
免费下载
Eiciently Supporting Adaptive Multi-Level Serializability
Models in Distributed Database Systems
Zhanhao Zhao
Renmin Univerisity of China
zhanhaozhao@ruc.edu.cn
ABSTRACT
Informally, serializability means that transactions appear to have
occurred in some total order. In this paper, we show that only the se-
rializability guarantee with some total order is not enough for many
real applications. As a complement, extra partial orders of transac-
tions, like real-time order and program order, need to be introduced.
Motivated by this observation, we present a framework that mod-
els serializable transactions by adding extra partial orders, namely
multi-level serializability models. Following this framework, we pro-
pose a novel concurrency control algorithm, called bi-directionally
timestamp adjustment (BDTA), to supporting multi-level serializ-
ability models in distributed database systems. We integrate the
framework and BDTA into Greenplum and Deneva to show the
benets of our work. Our experiments show the performance gaps
among serializability levels and conrm BDTA achieves up to 1.7
×
better than state-of-the-art concurrency control algorithms.
ACM Reference Format:
Zhanhao Zhao. 2021. Eciently Supporting Adaptive Multi-Level Serial-
izability Models in Distributed Database Systems. In Proceedings of the
2021 International Conference on Management of Data (SIGMOD ’21), June
20–25, 2021, Virtual Event , China. ACM, New York, NY, USA, 3 pages.
https://doi.org/10.1145/3448016.3450579
1 INTRODUCTION
Thus far, almost every major RDBMS is multi-version concurrency
control based to achieve serializability. However, recent studies en-
gendered by decentralized database systems under multi-version
concurrency control show that serializability guarantee could still
lead to stale reads. Take transactions shown in the top part of Fig-
ure 1 as an example. Transaction
T
3
starts after
T
1
commits, and
T
1
creates a new version
x
1
of tuple
x
. A stale read happens since the
read of
T
3
returns
x
0
but not
x
1
. This is possible, due to a lack of a
globally synchronized clock, for a database system to execute
T
3
over a state of the database before
T
1
exists (e.g.,
T
1
’s write cannot
be seen by
T
3
since the snapshot of
T
3
is less than the commit times-
tamp of
T
1
). Strict serializability, in essence, adds a simple additional
constraint, i.e., linearizable consistency [
15
], which preserves the
real-time order,
, of non-concurrent operations (i.e.,
op
1
op
2
I would like to thank Wei Lu, Haixiang Li and Xiaoyong Du for advising this work
and the following grants: National Natural Science Foundation of China 61972403, and
Tencent Rhino-Bird Joint Research Program.
Permission to make digital or hard copies of part or all of this work for personal or
classroom use is granted without fee provided that copies are not made or distributed
for prot or commercial advantage and that copies bear this notice and the full citation
on the rst page. Copyrights for third-party components of this work must be honored.
For all other uses, contact the owner/author(s).
SIGMOD ’21, June 20–25, 2021, Virtual Event , China
© 2021 Copyright held by the owner/author(s).
ACM ISBN 978-1-4503-8343-1/21/06.
https://doi.org/10.1145/3448016.3450579
T3
Process P
1
Process P
2
R
3
(x
0
)
C
3
Time
T1
W
1
(x
1
)
C
1
T0
W
0
(x
0
)
C
0
History
T2
R
2
(x
0
)
C
2
Projection
H | T0
W
0
(x
0
)
C
0
H | T1
W
1
(x
1
)
C
1
H | T2
R
2
(x
0
)
C
2
H | T3
R
3
(x
0
)
C
3
Equivalent
Sequential History
T0
W
0
(x
0
)
C
0
T1
W
1
(x
1
)
C
1
T2
R
2
(x
0
)
C
2
T3
R
3
(x
0
)
C
3
Figure 1: Formulation of Multi-level Serializability
if operation
op
2
starts after operation
op
1
completes), on top of
serializability to prevent stale reads. That is, strict serializability
guarantees the properties of both serializability and linearizable
consistency hold during the entire execution of transactions.
Except Spanner [
2
,
9
], the majority of NewSQL systems, includ-
ing CockroachDB [
25
], TiDB [
16
] and MongoDB [
27
], do not sup-
port strict serializability. The reason behind this is not the diculty
of the implementation, but the costly overhead to request glob-
ally ordered timestamps to preserve real-time order. Consider that
linearizable consistency is the strongest consistency model in dis-
tributed systems, while a weaker consistency model yields higher
performance. We are therefore motivated to provide a framework to
optimize between the performance and consistency in distributed
database systems based on the requirements of applications.
In this paper, we aim to address the issue of serializability with
multi-level consistency models, which include linearizable consis-
tency (LC), sequential consistency (SC) [
18
], and causal consistency
(CC) [
17
], etc. Based on the observation that the consistency model
is dened by selectively preserving certain kind of partial orders
among operations and serializability is dened by preserving certain
kind of total orders among transactions, We rst propose a framework
by re-dening the consistency models in terms of the transaction
granularity rather than the operation granularity. We theoretically
analyze that all possible combinations are limited to serializability
& LC (namely strict serializability), and serializability & SC (namely
sequential serializability), while the other combinations are reduced
to sequential serializability.
Based on the framework, we propose a unied concurrency con-
trol algorithm with various optimizations. We attach each transac-
tion with a timestamp interval (denoted as [LB, UB]), and propose
the bi-directionally timestamp adjustment (BDTA) algorithm to
dynamically tracking the order of transactions by comparing and
coordinating timestamp intervals of transactions. We then integrate
the framework and BDTA into Greenplum [
13
] to study the system
performance under multi-level serializability models. Besides, we
implement BDTA and state-of-the-art concurrency control algo-
rithms in Deneva [
14
] to compare BDTA with other algorithms.
The results show BDTA achieves up to 1.7× performance gain.
Graduate Student Research Competition Track Abstract
SIGMOD ’21, June 20–25, 2021, Virtual Event, China
2908
2 FRAMEWORK
We propose a framework to model serializability transactions by
integrating consistency models in distributed systems. Most exist-
ing works [
1
,
10
,
28
,
29
] focus on the combinations between a weak
isolation level [
4
] and a weak consistency model [
7
]. A few works
discuss strict serializability and serializability [
3
,
23
,
24
]. Diering
from existing works, we make a systematic analysis of possible
combinations between serializability with consistency models. We
dene multi-level serializability below:
Sequential History.
A history
H
is sequential if for any two trans-
actions
T
i
and
T
j
in
H
, either the last event of
H |T
i
precedes the
rst event of
H |T
j
or the last event of
H |T
j
precedes the rst event
of H |T
i
.
We denote T
i
T
j
if T
i
precedes T
j
in H.
Write Legal.
A sequential history
H
is write legal if for any
H |x
,
any read of
x
m
(
m
th
version of
x
) must immediately follow the
write of x
m
when excluding all other read events of x in H |x.
Partial Order.
We extend the partial orders introduced by consis-
tency models from the operation granularity to transaction gran-
ularity. Formally, given two transactions
T
i
and
T
j
in
H
, Program
order
pr
H
:
T
i
pr
H
T
j
if transactions
T
i
and
T
j
are in the same
process
P
and the last event of
T
i
precedes the rst event of
T
j
;
Write-read order
wr
H
:
T
i
wr
H
T
j
if there exists an operation
op
1
in
T
i
and another operation
op
2
in
T
j
such that
op
1
reads a version
written by
op
2
; Casual-related order
cr
H
:
T
i
cr
H
T
j
if (a)
T
i
pr
H
T
j
or (b)
T
i
wr
H
T
j
, or they are related by a transitive closure leverag-
ing (a) and/or (b); Real-time order
r t
H
:
T
i
r t
H
T
j
if the last event
of T
i
precedes the rst event of T
j
.
Multi-level Serializability Modeling.
As shown in Figure 1, given
a history
H
, we check whether an equivalent history
to
H
can be
sequential, write-legal, and can preserve the partial orders required
by a specic serializability level. Following these rules, we propose
the framework by dening strict serializability (SER-L), sequential
serializability (SER-S), and serializability (SER) levels below. Given
a history H, H satises
SER-L level
if
H
is equivalent to some sequential history
S
H
,
which is write legal, with
cr
H
cr
S
H
and
r t
H
r t
S
H
.
SER-S level
if
H
is equivalent to some sequential history
S
H
,
which is write legal, with
cr
H
cr
S
H
.
SER level
if there is a sequential history
S
H
, with
S
H
|T = H |T
for each T in H, and S
H
is write legal, with
wr
H
wr
S
H
.
3 CONCURRENCY CONTROL PROTOCOL
Our BDTA algorithm is multi-version based, and follows the op-
timistic way enhanced with bi-directionally dynamic timestamp
adjustment to do concurrency control.
The main idea of BDTA is described as follows. Firstly, a trans-
action
T
should preserve partial orders required by the specic
serializability level when
T
starts. In our algorithm, this property
is guaranteed by snapshot isolation. We generate a snapshot from
either Timestamp Oracle [
22
] or HLC [
11
] to preserve the partial
A history
H
is a nite sequence of operation events generated by transactions. We
assume each process executes transactions sequentially.
H |T
i
represents the projection
of T
i
in H , i.e., a sequence of all events in H whose operations are from T
i
.
H |x
denotes a subsequence of all events in
H
whose operations executed on item
x
.
H
and
H
are said to be equivalent if for every process
P
,
H |P = H
|P
.
H |P
denotes
a subsequence of all events in H whose process names are P .
0
10
20
30
40
50
60
70
80
8 16 32 64 128 256 512
Throughput (10
3
Txns/s)
# of Threads
SER-L
SER-S
SER-L(1.5ms)
SER-L(5ms)
RC(Default)
(a) Multi-level Serializability
0
10
20
30
40
50
60
70
0 20 40 60 80 100
Throughput (10
3
Txns/s)
% of Read-Write Transactions
MAAT
MVCC
SILO
T/O
SUNDIAL
BDTA
(b) Concurrency Control Evaluations
Figure 2: Verication on Greenplum and Deneva
order required by SER-L or SER-S, respectively. Secondly, a trans-
action
T
guarantees a total order for timestamp intervals with all
concurrent transactions
§
when
T
commits, i.e., for any two con-
current transactions
T
i
, T
j
, we can have either
T
i
.U B < T
j
.LB
or
T
j
.U B < T
i
.LB
, leading to an order of
T
i
T
j
, or
T
j
T
i
, respec-
tively. A transaction
T
aborts if no legal timestamp interval exists,
i.e.,
T .U B < T .LB
. We shall give an example shown in Figure 1
to elaborate on the BDTA mechanism. Focused on transaction
T
1
,
when
T
1
enters the validation phase, it detects
T
2
has read tuple
x
with version
x
0
, and hence
T
1
actively adjusts its timestamp interval
with
T
2
to let
T
2
.U B < T
1
.LB
, indicating that
T
2
is ordered before
T
1
.
This adjustment is bi-directional, which renes timestamp intervals
of both
T
1
and
T
2
, to preserve the partial order
T
2
T
1
, causing
T
2
.U B < T
1
.LB
. After
T
1
passing the validation phase, the commit
timestamp of T
1
and x
1
will be assigned with T
1
.LB.
BDTA algorithm brings the following benets. First, validations
for read-only transactions are eliminated, which bring performance
gain in read-intensive workload compared with single-version
based [
20
,
30
,
31
] and other timestamp interval based [
19
] algo-
rithms. We use timestamp intervals to maintain the order of trans-
actions, which signicantly reduces the network transferring over-
head against dependency graph based algorithms that shue read-
/write set instead [
12
,
21
]. Further, databases equipped with BDTA
can uniformly support SER-L, SER-S, and SER level, which is capa-
ble of running at an appropriate level according to the requirements
of applications.
4 VERIFICATION
As shown in Figure 2(a), we rst verify the performance of multi-
level serializability models. We use the workload-a in YCSB [
8
] and
vary the network latency when communicating with Timestamp
Oracle. There is an obvious performance gap among dierent net-
works between SER-L (when RTT is set to 1
.
5
ms
or 5
ms
) and SER-S
since SER-S does not suer from the increasing network latency.
Besides, BDTA outperforms the default lock-based concurrency
control algorithm which provide the read commited in Greenplum.
In Figure 2(b), we then conrm that BDTA performs up to 1.7
×
better than the next-best concurrency control algorithm. We use
the same settings provided in [
14
] and vary the percentage of read-
write transactions. Benets with the transaction reordering ability,
transactions should be aborted in the other algorithms [
5
,
6
,
20
,
26
,
31], can be successfully committed in some cases by BDTA.
§
A transaction
T
is said to be concurrent to
T
if
T
starts before
T
commits and
commits after T starts.
Graduate Student Research Competition Track Abstract
SIGMOD ’21, June 20–25, 2021, Virtual Event, China
2909
of 3
免费下载
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文档的来源(墨天轮),文档链接,文档作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论

关注
最新上传
暂无内容,敬请期待...
下载排行榜
Top250 周榜 月榜