HybrIDX: New Hybrid Index for Volume-hiding
Range Queries in Data Outsourcing Services
Kui Ren
‡
, Yu Guo
†
, Jiaqi Li
‡
, Xiaohua Jia
†
, Cong Wang
†
, Yajin Zhou
‡
, Sheng Wang
§
, Ning Cao
£
, Feifei Li
§
‡
Zhejiang University,
†
City University of Hong Kong,
§
Alibaba Group,
£
Hangzhou Mishu Technology.
{kuiren, lijiaqi93, yajin zhou}@zju.edu.cn, y.guo@my.cityu.edu.hk, {congwang, csjia}@cityu.edu.hk,
{sh.wang, lifeifei}@alibaba-inc.com, ncao@hzmstech.com
Abstract—An encrypted index is a data structure that assists
untrusted servers to provide various query functionalities in the
ciphertext domain. Although traditional index designs can pre-
vent servers from directly obtaining plaintexts, the confidentiality
of outsourced data could still be compromised by observing
the volume of different queries. Recent volume attacks have
demonstrated the importance of sealing volume-pattern leakage.
To this end, several works are made to design secure indexes with
the volume-hiding property. However, prior designs only work
for encrypted keyword search. Due to the unpredictable range
query results, it is difficult to protect the volume-pattern leakage
while achieving efficient range queries.
In this paper, for the first time, we define and solve the chal-
lenging problem of volume-hiding range queries over encrypted
data. Our proposed hybrid index framework, called HybrIDX,
allows an untrusted server to efficiently search encrypted data
based on order conditions without revealing the exact result
size. It resorts to the trusted hardware techniques to assist
range query processing by moving the comparison algorithm
to trusted SGX enclaves. To enable volume-hiding data retrieval,
we propose to host encrypted file blocks outside the enclave in an
encrypted volume-hiding structure. Apart from this novel hybrid
index framework, we further customize a result caching method
to obfuscate the results co-occurrence among different queries.
We formally analyze the security strengths and complete the
prototype implementation. Evaluation results demonstrate the
feasibility and practicability of our designs.
I. INTRODUCTION
An encrypted index offers a secure interface for data owners
to outsource private data to an untrusted server, while se-
lectively retrieving data without decryption. It is recognized
as a fundamental component of building privacy-preserving
applications. Early studies on secure index construction mainly
focus on the direction of keyword search [1]–[7]. Along
with the development of data outsourcing paradigm, enabling
efficient range queries over secure indexes has attracted a lot of
attention from both academia and industry [8]–[11]. A typical
example in the encrypted database community [12]–[17] is
finding all matched files according to order conditions such as
indexed values or timestamps.
Over the past decade, the demand for designing range-based
indexes has been expected to be addressed by the innovation
of cryptographic techniques, e.g., Order-preserving/revealing
encryption (OPE/ORE) schemes [18]–[27], which allow or-
der comparisons to be performed directly on ciphertexts. In
practice, however, these schemes are developed as crypto-
graphic primitives and can hardly accommodate system-wise
security requirements [28]–[36]. Specifically, range queries
reveal the different number of return results (i.e., volume-
pattern) and the results co-occurrence pattern. Recent leakage-
abuse attacks [28]–[32] have shown how these leakages can
be exploited to learn sensitive information about the query
content and outsourced data. To mitigate this security concern,
a simple approach is to apply naive padding. However, this
approach would introduce expensive storage overhead because
the result set is padded to the maximum length.
In the literature, only a few recent works [37]–[39] have
started to study practical index designs with the protection
of volume-pattern. In [37], Kamara et al. proposed the first
practical volume-hiding encryption scheme by using the multi-
map data structure [38]. The follow-up design [39] utilized
cuckoo hashing and differential privacy to further reduce the
volume and storage overhead. Unfortunately, neither of the
existing schemes can support volume-hiding range queries
because of two challenging issues. Firstly, unlike keyword
search functions, the result size of different range queries
cannot be pre-defined. Clients without pre-knowledge of the
result size have to download all the data in order to hide the
volume-pattern leakage, which demands a large amount of
bandwidth overhead. Secondly, different range queries reveal
the results co-occurrence pattern, which can be exploited for
inferring the plaintext distribution. For example, the result
set associated with the endpoint values must appear in all
the other range query results. By observing the results co-
occurrence among different queries, an attacker can still learn
the result distribution even if the volume-pattern has been well-
protected [28]–[30]. Therefore, how to enable volume-hiding
range queries over encrypted indexes is still challenging and
remains to be fully explored.
In this paper, for the first time, we solve the problem of
volume-hiding range queries in the data outsourcing paradigm.
Our proposed hybrid index framework, called HybrIDX, can
effectively address the aforementioned challenges while being
efficient in supporting range queries. To achieve our design
goals on both security and usability, we propose to bring
together the advancement of both volume-hiding data structure
and trusted hardware to build the desired index construction.
To hide the result size of different range queries, we first
propose to utilize the trusted enclave to securely convert any
range query into multiple independent sub-queries with equal
fixed volume. While seemingly inconvenient, this way of query
评论