Decoupled Transactions:
Low Tail Latency Online Transactions Atop Jittery Servers
Pat Helland
phelland@salesforce.com
Salesforce
San Francisco, CA, USA
ABSTRACT
Modern cloud data centers are busy places that share lots of resources. It is common for services to fluctuate in their responsiveness, sometimes becoming slow or very slow. Many distributed systems experience cascading slowness as one or a few slow servers (or their network) bring the entire system to its knees.
Non-transactional work copes by using idempotent retries bypassing the laggards. For transactional databases, it's not so simple.
This paper sketches a design for a distributed database providing
responsive snapshot isolation transactions even when some of its
servers and connections stop or, more perniciously, just slow down.
We present a thought experiment for a decoupled transactions database system that avoids cascading slowdown when a subset of its servers are sick but not necessarily dead. The goal is to provide low tail latency online transactions atop servers and networks that may sometimes go slow. Assume at most F recalcitrant servers in the database. Can we design a robust system that makes predictable progress while not waiting for F slow servers? Can we use these ideas for practical deployments in modern data centers with availability zones and today's expected operational challenges?
This hypothetical design explores techniques to dampen application-visible jitter in a database system running in a cloud datacenter when most of the servers are responsive. This inevitably causes us to examine the nature of a database's knowledge of correctness and how that can exist without a centralized authority.
ACM Reference Format:
Pat Helland. 2022. Decoupled Transactions: Low Tail Latency Online Transactions Atop Jittery Servers. In Proceedings of 12th Annual Conf on Innovative Data Systems Research (CIDR '22) (CIDR'22). ACM, New York, NY, USA, 30 pages. https://doi.org/10.1145/1122445.1122456
1 INTRODUCTION
Typically, online transactional databases are deployed with expensive dedicated hardware¹ in data centers owned by their enterprise. It is desirable to run these solutions in cloud data center environments to avail ourselves of their tremendous advantages in flexible deployments and elasticity of resources. However, there are new challenges as these shared environments do not offer predictable response time to requests flowing across servers. In the cloud, responses exhibit probabilistic latency. Worse, the expected response time varies as the environment experiences pressure. Some servers will take noticeably longer while others provide their normal expected response time distributions.

¹ High quality servers, data storage in SANs (Storage Area Networks) [10, 43], and bespoke networks are common deployments for mission-critical databases.

This article is published under a Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits distribution and reproduction in any medium as well as allowing derivative works, provided that you attribute the original work to the author(s) and CIDR 2022. 12th Annual Conference on Innovative Data Systems Research (CIDR '22). January 9-12, 2022, Chaminade, USA.
CIDR'22, January 10-13, 2022, Chaminade, CA, USA
© 2022 Association for Computing Machinery.
ACM ISBN 978-x-xxxx-xxxx-x/YY/MM…$15.00
https://doi.org/10.1145/1122445.1122456
Human-facing non-transactional work provides a vibrant and responsive experience using retries of idempotent operations². Can we provide high-availability, low-latency responses from a high-throughput, externally consistent³ distributed database that tolerates jittery or sick servers? Can we know what happened in the past quickly with low tail-latency so we can briskly make changes in the future based on what happened? Can we rapidly protect against conflicting concurrent updates as we commit transactions? How the heck can we build a system without a centralized authority to remember what's happened and decide what should happen next?
It is not our goal to define a super-scalable SQL system.
We hope to scale to tens of servers with predictable response
time while running in a largely unpredictable environment.
This thought exercise aspires to pry apart the cross-server dependencies in our imaginary system and push our minds to understand the nature of how state is represented in a system and how that state may be decentralized. Can we bound the internal dependencies within the implementation of a database so transactions can commit rapidly even when some components aren't responsive? It is hoped that this can empower future systems to blithely ignore performance irregularities in large data centers and just give prompt answers when enough servers participate.
1.1 Inspiration for This Work
In the cloud, we frequently measure the latency between a request and its response as one service invokes another. These response latencies are probabilistic and can be expressed as a CDF (Cumulative Distribution Function) showing the expected likelihood a response will be received by a specified time⁴.
For example, an SLO (Service Level Objective) for a service may
specify a 99.9% probability of responding within 2 milliseconds.
A system will typically have multiple dimensions to its SLO such
as windows of time in which the target response latency is actually
met. For example, 99.9% of the responses will be received within
2 milliseconds at least 98% of the time.
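As an illustration (not from the paper), an empirical latency CDF and a single-dimension SLO check can be sketched in a few lines of Python; the distribution and the sample counts below are made up:

```python
import random

random.seed(42)

# Hypothetical latency samples in milliseconds: a tight cluster of fast
# responses plus a handful of jittery outliers.
samples = [random.gauss(1.0, 0.2) for _ in range(10_000)]
samples += [random.uniform(5.0, 50.0) for _ in range(5)]

def empirical_cdf(samples, target_ms):
    """Fraction of responses received by target_ms (the CDF at that point)."""
    return sum(1 for s in samples if s <= target_ms) / len(samples)

# SLO: 99.9% probability of responding within 2 milliseconds.
slo_met = empirical_cdf(samples, 2.0) >= 0.999
```

With only 5 outliers in 10,005 samples the 2 ms target is met 99.95% of the time, so this SLO holds; a modest increase in jitter would break it.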
When a larger system is built using smaller services, the SLO of the smaller services can have significant impact on the resulting system. When all goes well, the system itself offers its expected latency as each of its components meets their SLO for prompt service. When services within the system are less punctual, we can see impact on the larger system's responsiveness⁵.

² See The Tail at Scale [20] for an excellent discussion about bounding tail-latency probabilities by retrying to an alternate server.
³ I.e., snapshot isolation [13, 41] as seen external to a distributed database.
⁴ See §12 (Appendix A: Building on a Jittery Foundation) for an in-depth discussion of the challenges we face within modern data centers.
Retrying slow requests to an alternate server is a common technique to mitigate this [20]. Proactively sending two or more requests to different servers and accepting the first response dramatically lowers the expected latency for the first response.
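To see why hedging helps, assume (as a simplifying illustration) that each request independently misses the latency target with probability p; the first of k hedged requests is slow only when all k are:

```python
def p_first_response_slow(p: float, k: int) -> float:
    """Probability that even the fastest of k independent requests is slow,
    assuming each request is slow with probability p."""
    return p ** k

# One request slow 1% of the time -> two hedged requests are both slow
# only 0.01% of the time: a 100x reduction in tail probability.
single = p_first_response_slow(0.01, 1)
hedged = p_first_response_slow(0.01, 2)
```

Real replicas are not perfectly independent (shared networks and correlated load), so the true improvement is smaller, but the tail still shrinks multiplicatively.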
In the DB world, we see latency bounding for log subsystems. One example is Apache BookKeeper [7]. Log entries are appended by writing to N log-storage-replicas (called Bookies). When Q of N log-storage-replicas have acknowledged the receipt of the log entry, it is durable. Since N - Q log-storage-replicas may be slow, the expected and observed SLO for writing to the log is dramatically reduced. The database system is predictably faster with less variability.
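The Q-of-N append pattern can be sketched as follows. This is a toy single-process illustration with made-up names and simulated replica delays, not BookKeeper's API:

```python
import concurrent.futures
import random
import time

N, Q = 5, 3  # hypothetical ensemble size and ack quorum

def append_to_replica(replica_id: int, entry: str) -> int:
    """Simulated replica append: usually fast, occasionally jittery."""
    time.sleep(0.001 if random.random() < 0.8 else 0.2)
    return replica_id

def quorum_append(entry: str) -> list[int]:
    """The entry is durable once Q of N replicas acknowledge it; the
    N - Q slowest replicas are not waited for."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=N)
    futures = [pool.submit(append_to_replica, r, entry) for r in range(N)]
    acked = []
    for future in concurrent.futures.as_completed(futures):
        acked.append(future.result())
        if len(acked) >= Q:
            break
    pool.shutdown(wait=False)  # don't stall behind the laggards
    return acked
```

The key point is the early `break`: the append's latency is set by the Q-th fastest replica, not the slowest.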
AWS Aurora [46, 48] carries this further with replicated AWS-storage-servers for log replay and block creation. These AWS-storage-servers are effectively the lower half of the database. They are placed with 2 servers in each of 3 AZs (Availability Zones). When 4 of these servers have acknowledged the log entry, Aurora considers the log entry to be durable. Transactions having commit records in the entry may be confirmed to the waiting human. Even when AZ+1⁶ failures have happened, the commit record can be read later. Logging is fast even if 2 of the 6 AWS-storage-servers are slow.
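The 4-of-6 arithmetic can be checked directly. Aurora writes to a quorum of 4 of 6 copies and reads from 3 of 6 [46]; the assertions below sketch why AZ+1 failures leave committed data readable:

```python
N = 6             # 2 storage servers in each of 3 AZs
WRITE_QUORUM = 4  # a log entry is durable after 4 of 6 acks
READ_QUORUM = 3   # any 3 copies must overlap every write quorum

# Any read quorum intersects any write quorum, so a reader always
# reaches at least one copy of every durable log entry.
assert WRITE_QUORUM + READ_QUORUM > N

# AZ+1: lose one whole AZ (2 servers) plus 1 more server elsewhere.
survivors = N - 2 - 1
assert survivors >= READ_QUORUM  # committed records remain readable

# Writes tolerate 2 slow or dead servers without stalling.
assert N - WRITE_QUORUM == 2
```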
Still, major portions of the database run in a single server.
If that server is jittery, humans will experience unpredictable delays.
Can ALL of the database be responsive
atop unpredictable servers?
Can every aspect of a database be decentralized
and responsive even when some of its pieces are not?
1.2 Applicability to a Broad Range of Systems
This paper is about jitter-avoidance through the use of quorum. It investigates techniques to manage complex systems based on an alternate form of order (called seniority) that is largely decoupled from classic happened-before partial-order messaging guarantees [35].

By sketching a design for a transactional database, we
demonstrate the broad applicability of these principles
to many complex distributed systems.

Paxos [36] provides linear order but it jitters.
We provide partial order and avoid jitter.
1.3 Availability in a Complex System
The phone system over land lines offers amazing availability. Occasionally, a call is dropped but you can redial and connect. Availability is defined as the opportunity to dial another call.

In transactional database systems, transactions may fail. Lock conflicts or other problems can cause work to be discarded. You don't want this to be common but once in a while is OK. When a transaction aborts, the application can restart the work and hopefully succeed. In a distributed database, restarted transactions may route to a different database server. If a single transaction gets stuck, the application can abandon it, retry, and probably succeed.

Our decoupled transactions database may occasionally timeout while doing a transaction. Even so, it remains open for new business.

⁵ See the paper Thinking about Availability in Large Service Infrastructures [39] for great discussions of SLAs (Service Level Agreements), SLOs (Service Level Objectives), and SLIs (Service Level Indicators). The discussion also includes multi-dimensional SLOs.
⁶ Tolerating AZ+1 failures means an entire AZ can be lost at the same time as one more server is down for other reasons [46].
1.4 Snapshot Isolation Guarantees
The goal for this design is to imagine a database that runs many existing applications with excellent availability and responsiveness.

Special attention is required to when, how, and with what guarantees the application sees and changes its data. Snapshot Isolation [13, 41] defines the database behavior that an application sees when running with concurrent work. It is arguably the most common isolation guarantee provided by modern commercial databases. In its basic form, snapshot isolation provides guarantees when reading and committing updates in a transaction:

Snapshot reads: Each transaction gets a snapshot time as it begins and, as it reads data, sees updates from transactions committed before the snapshot time and its own updates.

Conflict detection prior to commit: As a transaction is about to commit, the database will ensure that none of the updated records has been changed by other transactions since this transaction's snapshot time.

Snapshot isolation is extremely popular. Many existing applications depend on these guarantees for their successful execution.
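A minimal single-process sketch of the commit-time conflict check follows. The names and structure are illustrative, not the paper's design, and snapshot reads over multi-version storage are omitted:

```python
class SnapshotIsolationDB:
    """Toy timestamp-based conflict detection for snapshot isolation."""

    def __init__(self):
        self.clock = 0       # logical commit-timestamp generator
        self.data = {}       # key -> latest committed value
        self.commit_ts = {}  # key -> timestamp of last committed write

    def begin(self) -> int:
        """Each transaction gets a snapshot time as it begins."""
        return self.clock

    def commit(self, snapshot_ts: int, writes: dict) -> bool:
        """Abort if any written record changed after our snapshot time."""
        if any(self.commit_ts.get(k, -1) > snapshot_ts for k in writes):
            return False  # conflicting concurrent update: first committer wins
        self.clock += 1
        for key, value in writes.items():
            self.data[key] = value
            self.commit_ts[key] = self.clock
        return True

# Two transactions from the same snapshot writing the same key:
db = SnapshotIsolationDB()
t1, t2 = db.begin(), db.begin()
assert db.commit(t1, {"x": 1}) is True   # first committer wins
assert db.commit(t2, {"x": 2}) is False  # x changed after t2's snapshot
```

The aborted transaction can be restarted with a fresh snapshot and will usually succeed, matching the availability story above.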
1.5 What's Jitter?
Jitter means behavior different from the expected norm; in particular, response time varying from what's expected. In electronics, it means deviation from the circuit's expected clock timing.

In networking, jitter describes larger than expected variance in the delay moving through the network [17]. Servers may jitter for many reasons, from gray failure to Operating Systems to Java VMs. See §12 (Appendix A: Building on a Jittery Foundation).

Within a distributed system, jitter comprises both variability from the network and variability from the server processing a request. From the standpoint of the client issuing a request, it can't tell the cause of the delay, just that the response is not as zippy as hoped.

If part of the cluster is not responding, those servers may or may not be doing work. For servers spread widely within or across data centers, an observer may see some servers responding quickly but not all of them. The ones WE see responding quickly may not be providing prompt responses to other servers in the datacenter.

What can we assume so we can make progress?
1.6 Quorum: Avoiding Snapshot Isolation Jitter
Snapshot isolation transactions can do independent changes to the database as long as there are no conflicting updates and each transaction sees only snapshot reads. What if these two things can be accomplished without stalling behind jittery servers?

We can avoid stalling⁷ when we do work with quorums. The idea is to ask a collection of servers to do each operation and wait for only some of them to answer. Suppose we have a quorum of N servers and we only need answers from Q of them (where Q > N/2). When we get Q answers, we know that at least one server of the Q has already seen any single previous operation. By combining all Q answers, we can know about previous work. F is our maximum

⁷ The probability of stalling drops dramatically when we don't wait for every server.
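The intersection argument (Q > N/2 means any two quorums share at least one server) can be verified exhaustively for a small ensemble; the values of N and Q here are illustrative:

```python
from itertools import combinations

N, Q = 5, 3  # Q > N/2, so any two quorums must overlap
F = N - Q    # up to F = 2 slow servers can be left behind

servers = range(N)
quorums = [set(c) for c in combinations(servers, Q)]

# Every pair of quorums intersects: whichever Q servers answer us,
# at least one of them saw any earlier quorum operation.
all_overlap = all(a & b for a in quorums for b in quorums)
```

Any two Q-sized subsets of N servers share at least Q + Q - N = 1 member when Q > N/2, which is why combining Q answers always recovers knowledge of prior work.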