
Apache IoTDB: A Time Series Database for Large Scale IoT Applications
CHEN WANG, Tsinghua University, Beijing, China
JIALIN QIAO, Timecho Ltd, Beijing, China
XIANGDONG HUANG, Tsinghua University, Beijing, China
SHAOXU SONG, Tsinghua University, Beijing, China
HAONAN HOU, Timecho Ltd, Beijing, China
TIAN JIANG, Tsinghua University, Beijing, China
LEI RUI, Tsinghua University, Beijing, China
JIANMIN WANG, Tsinghua University, Beijing, China
JIAGUANG SUN, Tsinghua University, Beijing, China
A typical industrial scenario encounters thousands of devices with millions of sensors, consistently generating billions of data
points. It poses new requirements of time series data management, not well addressed in existing solutions, including (1)
device-defined ever-evolving schema, (2) mostly periodical data collection, (3) strongly correlated series, (4) variously delayed
data arrival, and (5) highly concurrent data ingestion. In this paper, we present a time series database management system,
Apache IoTDB. It consists of (i) a time series native file format, TsFile, with specially designed data encoding, and (ii) an IoTDB
engine for efficiently handling delayed data arrivals and processing queries. We introduce a native distributed solution with
distributed queries optimized by parallel operators. We also explore efficient TsFile synchronization mechanisms, ensuring
seamless data integration without the need for ETL processes. The system achieves a throughput of 10 million inserted values
per second. Queries such as 1-day data selection of 0.1 million points and 3-year data aggregation over 10 million points can
be processed in 100 ms. Comparisons with InfluxDB, TimescaleDB, KairosDB, Parquet and ORC over real-world data loads
demonstrate the superiority of IoTDB and TsFile.
CCS Concepts: • Information systems → Data management systems.
Additional Key Words and Phrases: time series, data model, database engine, distributed
1 Introduction
In the Internet of Things (IoT), a huge amount of time series is generated by various devices with many sensors
attached. The data need to be managed not only in the cloud for intelligent analysis but also at the edge for
real-time control. For example, more than 20,000 excavators are managed by one of our industrial partners, a
maintenance service provider of heavy industry machines, each of which has hundreds of sensors, e.g., monitoring
Shaoxu Song (https://sxsong.github.io/) and Jianmin Wang are the corresponding authors.
Authors’ Contact Information: Chen Wang, Tsinghua University, Beijing, Beijing, China; e-mail: wang_chen@tsinghua.edu.cn; Jialin Qiao,
Timecho Ltd, Beijing, Beijing, China; e-mail: jialin.qiao@timecho.com; Xiangdong Huang, Tsinghua University, Beijing, Beijing, China;
e-mail: hxd@timecho.com; Shaoxu Song, Tsinghua University, Beijing, Beijing, China; e-mail: sxsong@tsinghua.edu.cn; Haonan Hou,
Timecho Ltd, Beijing, Beijing, China; e-mail: haonan.hou@timecho.com; Tian Jiang, Tsinghua University, Beijing, Beijing, China; e-mail:
jiangtia18@mails.tsinghua.edu.cn; Lei Rui, Tsinghua University, Beijing, Beijing, China; e-mail: rl18@mails.tsinghua.edu.cn; Jianmin Wang,
Tsinghua University, Beijing, Beijing, China; e-mail: jimwang@tsinghua.edu.cn; Jiaguang Sun, Tsinghua University, Beijing, Beijing, China;
e-mail: sunjg@tsinghua.edu.cn.
This work is licensed under a Creative Commons Attribution-NoDerivatives International 4.0 License.
© 2025 Copyright held by the owner/author(s).
ACM 1557-4644/2025/3-ART
https://doi.org/10.1145/3726523
ACM Trans. Datab. Syst.