
138:2 Qiangqiang Dai et al.
contain cohesive subgraph structures which correspond to communities or densely-connected
groups. Mining cohesive subgraphs from a bipartite graph is a fundamental operator in bipartite
graph analysis which has been widely used in many applications, such as community detection
[20, 22], online recommendation [17, 36], and biological network analysis [6, 42].
There exist many cohesive subgraph models in bipartite graphs. Notable examples include
maximal biclique [
1
,
9
,
14
,
24
,
28
,
50
], maximal
𝑘
-biplex [
40
,
48
,
49
],
(𝛼, 𝛽)
-core [
7
,
27
],
𝑘
-bitruss
[45, 52], and quasi biclique [19, 30, 46]. Among them, the maximal biclique model, perhaps, is the
most fundamental model, as all the other cohesive subgraph models can be considered as a relaxed
biclique model.
Instead of focusing on a particular cohesive subgraph model, in this paper, we study a family of
cohesive subgraph models in bipartite graphs that meet the hereditary property, namely hereditary
cohesive subgraphs. Here a subgraph
𝐺
′
is called a hereditary subgraph if (1)
𝐺
′
satises a property
P
and (2) every induced subgraph of
𝐺
′
also meets the property
P
. For convenience, we refer to
a subgraph that meets a hereditary property
P
as a
P
-subgraph. A
P
-subgraph
𝐺
′
is a maximal
P
-subgraph if there is no other
P
-subgraph containing
𝐺
′
. Given a bipartite graph
𝐺
, our goal is
to enumerate all maximal
P
-subgraphs from
𝐺
. To our knowledge, such a maximal
P
-subgraph
enumeration problem has not been investigated before. We show that both the maximal biclique and
maximal
𝑘
-biplex are maximal
P
-subgraphs, thus both the classic maximal biclique enumeration
and maximal 𝑘-biplex enumeration problems are special instances of our problem.
Motivations. Practical solutions for maximal
P
-subgraph enumeration can be applied to many
applications, and two of them are summarized as follows.
Community detection in bipartite graphs.
Detecting communities in a bipartite graph is an impor-
tant graph analysis task [
6
,
20
,
23
,
36
,
42
]. We can use the hereditary cohesive subgraph to model
communities in a bipartite graph. The communities detected by the hereditary cohesive subgraph
model often exhibit strong robustness, since the removal of any subset of vertices from the commu-
nity does not destroy the structural property. In eect, the classic maximal biclique and maximal
𝑘
-biplex models have been widely used for community detection applications [
20
,
28
,
40
,
48
] due to
such a nice hereditary property and cohesive property. Thus, the solution for maximal
P
-subgraph
enumeration can provide a general framework for community detection in bipartite graphs, which
captures a family of dierent community models.
Fraud detection in user-item networks.
Consider an online user-item rating network (e.g., Amazon’s
user-product rating network), where the users can give ratings to the items. The item owners may
wish to improve their items’ ratings by hiring some fake users to frequently give high ratings to
their items. Clearly, the set of fake users and the set of their rated items often form a densely-
connected subgraph. Once again, we can use the hereditary cohesive subgraph model, such as
maximal biclique or maximal
𝑘
-biplex to detect such fake users [
48
]. As a result, the approaches to
maximal
P
-subgraph enumeration can also be used for identifying possible rating frauds in online
user-item networks.
Although the signicance of the maximal
P
-subgraph enumeration problem, a practical solution
for this problem is still lacking due to the intrinsic challenges of this problem. First, as indicated in
[
21
], the problem of enumerating all maximal
P
-subgraphs from a bipartite graph is NP-hard. Thus,
there does not exist a polynomial algorithm to solve this problem unless
NP = P
. Second, since the
hereditary property
P
is arbitrary (the enumeration algorithm for our problem should work for
any hereditary property
P
) and internal structure of the
P
is unclear, it is quite non-trivial to use
such a property
P
to design an enumeration algorithm. Moreover, existing solutions for maximal
biclique enumeration and maximal
𝑘
-biplex enumeration also cannot be generalized to handle
our problem. This is because dierent properties
P
give rise to dierent enumeration problems,
Proc. ACM Manag. Data, Vol. 1, No. 2, Article 138. Publication date: June 2023.
评论