【阿里云2024SIGMOD】Rethink Query Optimization in HTAP Databases.pdf

胖酒精灯

27页

2次

2025-04-15

免费下载

256

Rethink ery Optimization in HTAP Databases

HAOZE SONG

∗

, The University of Hong Kong, Hong Kong SAR

WENCHAO ZHOU, Alibaba Group, China

FEIFEI LI, Alibaba Group, China

XIANG PENG, Alibaba Group, China

HEMING CUI, The University of Hong Kong, Hong Kong SAR

The advent of data-intensive applications has fueled the evolution of hybrid transactional and analytical

processing (HTAP). To support mixed workloads, distributed HTAP databases typically maintain two data

copies that are specially tailored for data freshness and performance isolation. In particular, a copy in a

row-oriented format is well-suited for OLTP workloads, and a second copy in a column-oriented format is

optimized for OLAP workloads. Such a hybrid design opens up a new design space for query optimization:

plans can be optimized over dierent data formats and can be executed over isolated resources, which we

term hybrid plans. In this paper, we demonstrate that hybrid plans can largely benet query execution (e.g.,

up to 11

speedups in our evaluation). However, we also found these benets will be potentially at the cost

of sacricing data freshness or performance isolation since traditional optimizers may not precisely model

and schedule the execution of hybrid plans on real-time updated HTAP databases. Therefore, we propose

Metis, an HTAP-aware optimizer. We show, both theoretically and experimentally, that using the proposed

optimizations, a system can largely benet from hybrid plans while preserving isolated performance for OLTP

and OLAP, and these optimizations are robust to the changes in workloads.

CCS Concepts: • Information systems

→

Data access methods; Query optimization; Data layout; •

Computer systems organization → Real-time system architecture.

Additional Key Words and Phrases: Hybrid Transactional and Analytical Processing (HTAP) Databases,

Adaptive Query Plan, Mixed Workloads

ACM Reference Format:

Haoze Song, Wenchao Zhou, Feifei Li, Xiang Peng, and Heming Cui. 2023. Rethink Query Optimization

in HTAP Databases. Proc. ACM Manag. Data 1, 4 (SIGMOD), Article 256 (December 2023), 27 pages. https:

//doi.org/10.1145/3626750

1 INTRODUCTION

Today, data-intensive applications often utilize vast amounts of data for diverse real-time business

tasks (e.g., data-driven decisions [

]), necessitating weaving analytical and transactional

processing techniques together [

]. In response, many recent academic and industrial eorts have

been devoted to developing hybrid transactional and analytical processing (HTAP) systems [

–

], which are expected to provide

prompt analysis of

∗

Work performed during an internship at Alibaba Group.

Authors’ addresses: Haoze Song, hzsong@cs.hku.hk, The University of Hong Kong, Pok Fu Lam, Hong Kong SAR; Wenchao

Zhou, zwc231487@alibaba-inc.com, Alibaba Group, 969 West Wen Yi Road, Yu Hang District, Hangzhou, Zhejiang, China;

Feifei Li, lifeifei@alibaba-inc.com, Alibaba Group, 969 West Wen Yi Road, Yu Hang District, Hangzhou, Zhejiang, China;

Xiang Peng, pengxiang.px@alibaba-inc.com, Alibaba Group, 969 West Wen Yi Road, Yu Hang District, Hangzhou, Zhejiang,

China; Heming Cui, heming@cs.hku.hk, The University of Hong Kong, Pok Fu Lam, Hong Kong SAR.

This work is licensed under a Creative Commons Attribution-NonCommercial International 4.0

License.

2836-6573/2023/12-ART256

https://doi.org/10.1145/3626750

Proc. ACM Manag. Data, Vol. 1, No. 4 (SIGMOD), Article 256. Publication date: December 2023.

Downloaded from the ACM Digital Library on April 8, 2025.

256:2 Haoze Song et al.

Tuple ID

Results*

Primary Index

Key (s)

Results*

Secondary Indices

Indices

Row-oriented Store

Column-oriented Store

Async. Data Synchronization

N-ary

Storage Model

Decomposition

Storage Model

Delta Merge

(a) Hybrid Physical Layout.

Data

Freshness

Perf.

Isolation

Range Scan

Efficiency

Probe

Efficiency

Better

Resource

Utilization

Optimization

Efficiency

METIS

Row-oriented Plan Column-oriented Plan

HTAP-agnostic Plan

(b) Performance Comparison.

Fig. 1. (a) shows an example of the hybrid physical layout in modern HTAP systems (e.g., SQL Server [

TiDB [

]): the row-oriented tables are well-suited for updates and probes; a second copy in a column-oriented

layout is optimized for range scan. Leveraging a hybrid physical layout, Metis strikes a practical balance

between performance, isolation, and freshness for HTAP (see (b)).

fresh data and 2 isolate the performance of interleaved workloads.

A practical HTAP database generally consists of an online transactional processing (OLTP)

engine that supports high throughput transaction processing, and an online analytical processing

(OLAP) engine supports analytics with low latency. To handle mixed workloads eciently, a popular

category of distributed HTAP databases (e.g., SQL Server [

], TiDB [

], ByteHTAP [

], PolarDB-

IMCI [

], Oracle Dual [

], and AlloyDB [

]) typically employs the two engines with specialized

data stores and asynchronously replicated data from one copy to the other, achieving both

and

An example is shown in Figure 1a: a row-oriented store (for short, row store) that stores data

in rows is optimized for operating on a single data tuple at a time and accessing many attributes,

favor for OLTP; a column-oriented store (for short, column store) that stores the same attributes of

dierent rows contiguously in columns is optimized for accessing a massive number of rows at a

time with a subset of tuple attributes, favor for OLAP. To provide swift OLAP capabilities on new

data, updates are asynchronously replicated from the row store into the column store while posing

minimal impact on OLTP. We further formulate the system model in §2.1.

Given such a design, OLTP and OLAP workloads can be independently processed on their

desirable stores, thus naively providing isolations between OLTP and OLAP in the storage layer.

Unfortunately, restricting each workload to its specialized store leaves much of the performance

potential unrealized. This is because, for read-only queries, both the row and column store can

signicantly outperform one another based on the characteristics of system implementations and

workloads [

] (see our experimental results in §2.2). Thus, there may be queries for which

neither the column store nor the row store is optimal.

To reach the full potential of the hybrid physical layouts, several HTAP systems [

] have

integrated the two stores as alternative data access methods in their query optimizers to generate

hybrid plans for queries. Specically, a hybrid plan allows a single query to retrieve data from both

the row and column stores simultaneously and calculate the query results based on a consistent

data view.

Motivation. Nevertheless, existing approaches [

] select access paths and do query optimiza-

tions simply based on queries’ selectivity [

], neglecting the data dynamicity of HTAP databases.

In §3, we show that blindly pursuing hybrid plans can easily make the generated plans sub-optimal

and damage the two important properties: data freshness ( 1 ) and performance isolation ( 2 ).

Proc. ACM Manag. Data, Vol. 1, No. 4 (SIGMOD), Article 256. Publication date: December 2023.

Downloaded from the ACM Digital Library on April 8, 2025.

Rethink ery Optimization in HTAP Databases 256:3

We regard the two properties essential because high data freshness [

] provides users

with intelligent insights into the fresh data at generation speed (i.e., OLAP queries can always

work on the new data generated by OLTP), which distinguishes HTAP databases from traditional

Extract-Transform-Load (ETL) method [

]. Performance isolation [

] in HTAP refers to

the ability to maintain the performance of both OLTP and OLAP workload while another workload

increases, which is important for providing individual service-level agreements (SLAs).

Therefore, in this paper, we take the rst step to systematically study the benets, side-eects,

and solutions of hybrid plans given our target HTAP system model (§2.1). Our key insight is

that, to keep hybrid plans ecient, we should put data dynamicity into the design of the query

optimizer by capturing the mutual relationship between reads (i.e., read-only queries) and writes

(i.e., write transactions). In our paper, data dynamicity is dened as the nature in that the OLTP

engine continuously executes users’ write transactions and incrementally applies new data from

the row store into the column store. Based on our insight, we identify three key challenges.

The rst challenge is how to precisely model the cost of data access paths when new writes continu-

ously update the replicated data copy for reads? As shown in Figure 1a, distributed HTAP databases

typically support timely updates in the read-optimized column store (i.e., the data copy for reads)

through a separate delta store. Delta store accumulates updates continuously and periodically

merges them into the columnar storage (see Figure 1a, detailed in §2.1). This architecture makes

the traditional cost model imprecise for evaluating the cost of data access paths: there is no xed

selectivity threshold for access path selection; rather, the division depends on the workload’s dy-

namicity (i.e., the concurrency of writes).

To address this challenge, we propose a new delta-store-aware cost model incorporating the data

dynamicity into the optimizer: Demain (Delta-main model). Demain captures the performance of

select operators in both delta stores and column stores and thus can eciently guide the access

path selection.

The second challenge is how to optimize data freshness and execution time together, especially when

new writes are propagated asynchronously? Generally, in an HTAP database that uses asynchronously

replicated data copies, optimizing execution time can be at the cost of data freshness. This is

because, due to data replication, the visibility of new writes in the column store is always delayed.

Hence, even if the column store may outperform the row store (i.e., the data copy for writes) on

the sequential scan, the execution must be blocked until the new writes are fully synchronized to

the column store, leading to a longer response time.

We regard considering visibility delay in query optimizations as an important research problem

because, even though multiple existing works [

] strive to minimize the visibility delay

between the row store and column store from system scopes, depending on the deployment, the

visibility delay is still pronounced (e.g., 10

𝑠

delay in DB2 IDAA [

], 8

𝑚𝑖𝑛𝑠

delay in production at

Google [68], 606𝑚𝑠 delay in experiments at ByteDance [16]).

For this challenge, we propose a new visibility-aware plan selection algorithm. It rstly estimates

the visibility delay between the row store and column store based on the ongoing and predicted

workload characteristics. When optimizing queries, it advances the query performance by pre-

executing plans on the available data, thus masking the notorious visibility delays.

The nal challenge is how to ensure isolated p erformance between reads and writes when query

plans are hybrid? A strawman approach uses a pre-dened quota for the reads in row stores (i.e.,

the data copy for writes). For example, TiDB limits the default access table size on its row store

for the OLAP workload to at most 500 𝑀𝐵 [33]. However, manual intervention cannot eectively

utilize resources while reducing query latency. A conguration that works well for one workload

is unlikely to work well for another.

We develop a new resource-aware query re-optimization approach for hybrid plans. Instead of

Proc. ACM Manag. Data, Vol. 1, No. 4 (SIGMOD), Article 256. Publication date: December 2023.

Downloaded from the ACM Digital Library on April 8, 2025.

of 27

免费下载

htap数据库阿里云 sigmod

相关文档

评论