暂无图片
暂无图片
暂无图片
暂无图片
暂无图片
SIGMOD2023_Hereditary Cohesive Subgraphs Enumeration on Bipartite Graphs The Power of Pivot-based Approaches_腾讯云.pdf
8
26页
0次
2025-04-21
免费下载
138
Hereditary Cohesive Subgraphs Enumeration on Bipartite
Graphs: The Power of Pivot-based Approaches
QIANGQIANG DAI, RONG-HUA LI, XIAOWEI YE, and MEIHAO LIAO, Beijing Institute of
Technology, China
WEIPENG ZHANG, Tencent Technology (Shenzhen) Company Limited, China
GUOREN WANG, Beijing Institute of Technology, China
Finding cohesive subgraphs from a bipartite graph is a fundamental operator in bipartite graph analysis.
In this paper, we focus on the problem of mining cohesive subgraphs from a bipartite graph that satisfy a
hereditary property. Here a cohesive subgraph meets the hereditary property if all of its subgraphs satisfy the
same property as itself. We show that several important cohesive subgraph models, such as maximal biclique
and maximal
𝑘
-biplex, satisfy the hereditary property. The problem of enumerating all maximal hereditary
subgraphs was known to be NP-hard. To solve this problem, we rst propose a novel and general pivot-based
enumeration framework to eciently enumerate all maximal hereditary subgraphs in a bipartite graph. Then,
based on our general framework, we develop a new pivot-based algorithm with several pruning techniques to
enumerate all maximal bicliques. We prove that the worst-case time complexity of our pivot-based maximal
biclique enumeration algorithm is
𝑂 (𝑚 ×
2
𝑛/2
)
(or
𝑂 (𝑚 ×
1
.
414
𝑛
)
) which is near optimal since there exist
up to
𝑂 (
2
𝑛/2
)
maximal bicliques in a bipartite graph with
𝑛
vertices and
𝑚
edges. Moreover, we also show
that our algorithm can achieve polynomial-delay time complexity with a slight modication. Third, on the
basis of our general framework, we also devise a novel pivot-based algorithm with several non-trivial pruning
techniques to enumerate maximal
𝑘
-biplexes in a bipartite graph. Finally, we conduct extensive experiments
using 11 real-world bipartite graphs to evaluate the proposed algorithms. The results show that our pivot-based
solutions can achieve one order of magnitude (three orders of magnitude) faster than the state-of-the-art
maximal biclique enumeration algorithms (maximal 𝑘-biplex enumeration algorithms).
CCS Concepts: Theory of computation Backtracking.
Additional Key Words and Phrases: hereditary cohesive subgraph, biclique,
𝑘
-biplex, enumeration framework
ACM Reference Format:
Qiangqiang Dai, Rong-Hua Li, Xiaowei Ye, Meihao Liao, Weipeng Zhang, and Guoren Wang. 2023. Hereditary
Cohesive Subgraphs Enumeration on Bipartite Graphs: The Power of Pivot-based Approaches. Proc. ACM
Manag. Data 1, 2, Article 138 (June 2023), 26 pages. https://doi.org/10.1145/3589283
1 INTRODUCTION
Bipartite graphs are ubiquitous in real-world applications. In a bipartite graph, the vertices can be
divided into two disjoint sets and each edge connects a vertex in one set to a vertex in the other set.
Some representative examples of real-world bipartite graphs include user-item networks [
43
,
44
],
author-publication networks [
22
], and biological networks [
23
]. Real-world bipartite graphs often
Authors’ addresses: Qiangqiang Dai, qiangd66@gmail.com; Rong-Hua Li, lironghuabit@126.com; Xiaowei Ye, yexiaowei@
bit.edu.cn; Meihao Liao, mhliao@bit.edu.cn, Beijing Institute of Technology, Beijing, China; Weipeng Zhang, Tencent
Technology (Shenzhen) Company Limited, Shenzhen, China, jackwpzhang@tencent.com; Guoren Wang, Beijing Institute of
Technology, Beijing, China, wanggrbit@126.com.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee
provided that copies are not made or distributed for prot or commercial advantage and that copies bear this notice and the
full citation on the rst page. Copyrights for components of this work owned by others than the author(s) must be honored.
Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires
prior specic permission and/or a fee. Request permissions from permissions@acm.org.
© 2023 Copyright held by the owner/author(s). Publication rights licensed to ACM.
2836-6573/2023/6-ART138 $15.00
https://doi.org/10.1145/3589283
Proc. ACM Manag. Data, Vol. 1, No. 2, Article 138. Publication date: June 2023.
138:2 Qiangqiang Dai et al.
contain cohesive subgraph structures which correspond to communities or densely-connected
groups. Mining cohesive subgraphs from a bipartite graph is a fundamental operator in bipartite
graph analysis which has been widely used in many applications, such as community detection
[20, 22], online recommendation [17, 36], and biological network analysis [6, 42].
There exist many cohesive subgraph models in bipartite graphs. Notable examples include
maximal biclique [
1
,
9
,
14
,
24
,
28
,
50
], maximal
𝑘
-biplex [
40
,
48
,
49
],
(𝛼, 𝛽)
-core [
7
,
27
],
𝑘
-bitruss
[45, 52], and quasi biclique [19, 30, 46]. Among them, the maximal biclique model, perhaps, is the
most fundamental model, as all the other cohesive subgraph models can be considered as a relaxed
biclique model.
Instead of focusing on a particular cohesive subgraph model, in this paper, we study a family of
cohesive subgraph models in bipartite graphs that meet the hereditary property, namely hereditary
cohesive subgraphs. Here a subgraph
𝐺
is called a hereditary subgraph if (1)
𝐺
satises a property
P
and (2) every induced subgraph of
𝐺
also meets the property
P
. For convenience, we refer to
a subgraph that meets a hereditary property
P
as a
P
-subgraph. A
P
-subgraph
𝐺
is a maximal
P
-subgraph if there is no other
P
-subgraph containing
𝐺
. Given a bipartite graph
𝐺
, our goal is
to enumerate all maximal
P
-subgraphs from
𝐺
. To our knowledge, such a maximal
P
-subgraph
enumeration problem has not been investigated before. We show that both the maximal biclique and
maximal
𝑘
-biplex are maximal
P
-subgraphs, thus both the classic maximal biclique enumeration
and maximal 𝑘-biplex enumeration problems are special instances of our problem.
Motivations. Practical solutions for maximal
P
-subgraph enumeration can be applied to many
applications, and two of them are summarized as follows.
Community detection in bipartite graphs.
Detecting communities in a bipartite graph is an impor-
tant graph analysis task [
6
,
20
,
23
,
36
,
42
]. We can use the hereditary cohesive subgraph to model
communities in a bipartite graph. The communities detected by the hereditary cohesive subgraph
model often exhibit strong robustness, since the removal of any subset of vertices from the commu-
nity does not destroy the structural property. In eect, the classic maximal biclique and maximal
𝑘
-biplex models have been widely used for community detection applications [
20
,
28
,
40
,
48
] due to
such a nice hereditary property and cohesive property. Thus, the solution for maximal
P
-subgraph
enumeration can provide a general framework for community detection in bipartite graphs, which
captures a family of dierent community models.
Fraud detection in user-item networks.
Consider an online user-item rating network (e.g., Amazon’s
user-product rating network), where the users can give ratings to the items. The item owners may
wish to improve their items’ ratings by hiring some fake users to frequently give high ratings to
their items. Clearly, the set of fake users and the set of their rated items often form a densely-
connected subgraph. Once again, we can use the hereditary cohesive subgraph model, such as
maximal biclique or maximal
𝑘
-biplex to detect such fake users [
48
]. As a result, the approaches to
maximal
P
-subgraph enumeration can also be used for identifying possible rating frauds in online
user-item networks.
Although the signicance of the maximal
P
-subgraph enumeration problem, a practical solution
for this problem is still lacking due to the intrinsic challenges of this problem. First, as indicated in
[
21
], the problem of enumerating all maximal
P
-subgraphs from a bipartite graph is NP-hard. Thus,
there does not exist a polynomial algorithm to solve this problem unless
NP = P
. Second, since the
hereditary property
P
is arbitrary (the enumeration algorithm for our problem should work for
any hereditary property
P
) and internal structure of the
P
is unclear, it is quite non-trivial to use
such a property
P
to design an enumeration algorithm. Moreover, existing solutions for maximal
biclique enumeration and maximal
𝑘
-biplex enumeration also cannot be generalized to handle
our problem. This is because dierent properties
P
give rise to dierent enumeration problems,
Proc. ACM Manag. Data, Vol. 1, No. 2, Article 138. Publication date: June 2023.
of 26
免费下载
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文档的来源(墨天轮),文档链接,文档作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论

关注
最新上传
暂无内容,敬请期待...
下载排行榜
Top250 周榜 月榜