
cannot effectively reduce the impact on surrounding vehicles.
To sum up, an ideal decision framework for the autonomous
vehicle can perform safe and comfortable maneuvers with
high driving efficiency and minimal impact on surrounding
vehicles. In particular, reducing the impact plays a vital role
in addressing poor driving behaviors and reducing traffic
congestion or accidents. Intuitively, one can embed a tra-
jectory prediction model [14], [16] in perception modules
(with onboard sensors), which not only capture the current
state of surrounding vehicles, but also proactively anticipate
their future behaviors, and then utilize a deep reinforce-
ment learning-based model to make maneuver decisions [9].
However, this intuitive approach faces two main challenges: (1)
The states of surrounding vehicles are not always observable
due to sensor limitations like detection range and occlusion,
making the trajectory prediction models less effective. (2)
Reinforcement learning-based models struggle to balance the
factors of safety, efficiency, comfort, and impact for complex
vehicle maneuvers, and it is also challenging to measure the
impact factor. In this work, we aim to address the above
challenges and enable the autonomous vehicle to perform
safe and comfortable maneuvers while maximizing its average
velocity and minimizing its impact on surrounding vehicles.
To this end, we propose a novel perception-and-decision framework, called HEAD, which consists of an enHanced pErception module and a mAneuver Decision module. In the
enhanced perception module, we propose a state prediction
model to predict the one-step future states for multiple sur-
rounding vehicles in parallel. To deal with the incomplete
historical states caused by sensor limitations, it first constructs
phantom vehicles based on observable surrounding vehicles
and organizes their relationships using a spatial-temporal graph, and then utilizes a graph attention mechanism with an LSTM to enable vehicle interactions and parallel prediction (a simplified sketch of this predictor is given at the end of this section). The maneuver decision module first receives the predicted future states of surrounding vehicles and formulates the maneuver decision task as a Parameterized Action Markov Decision Process (PAMDP) with discrete lane change behaviors and a continuous velocity change behavior, and then uses a deep reinforcement learning-based model with a properly designed reward function to solve the PAMDP, learning an optimized policy for the autonomous vehicle to achieve our objective. In summary, we
make the following contributions:
• We develop a perception-and-decision framework that en-
ables the autonomous vehicle to perform safe, efficient, and
comfortable maneuvers with minimal impact on other vehicles.
• We propose a graph-based state prediction model with a phantom vehicle construction strategy to address sensor limitations and support high-accuracy prediction in parallel.
• We propose a deep reinforcement learning-based model
and a hybrid reward function to make maneuver decisions in
a continuous action space that follows a parameterized action
Markov decision process.
• We conduct extensive experiments to evaluate HEAD on
real and simulated data, verifying its effectiveness on both macroscopic and microscopic metrics.
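To make the enhanced perception module above more concrete, the following PyTorch-style sketch shows one possible way to combine a graph attention layer with an LSTM for parallel one-step state prediction (this is the sketch referenced in the framework description). The feature layout (lane number, longitudinal position, velocity), the hidden sizes, and the phantom-vehicle filling rule are illustrative assumptions, not the exact HEAD architecture.

```python
# Illustrative sketch only: a graph-attention + LSTM one-step state predictor.
# Feature layout, hidden sizes, and the phantom-vehicle filling rule are
# assumptions for illustration, not the exact HEAD architecture.
import torch
import torch.nn as nn

class GraphAttentionLayer(nn.Module):
    """Single-head graph attention over vehicle nodes."""
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.proj = nn.Linear(in_dim, out_dim, bias=False)
        self.attn = nn.Linear(2 * out_dim, 1, bias=False)

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # x: (N, in_dim) node features; adj: (N, N) adjacency with self-loops
        h = self.proj(x)                                    # (N, out_dim)
        n = h.size(0)
        pairs = torch.cat([h.unsqueeze(1).expand(n, n, -1),
                           h.unsqueeze(0).expand(n, n, -1)], dim=-1)
        scores = torch.tanh(self.attn(pairs)).squeeze(-1)   # (N, N)
        scores = scores.masked_fill(adj == 0, float("-inf"))
        alpha = torch.softmax(scores, dim=-1)                # attention weights
        return alpha @ h                                      # aggregated features

class StatePredictor(nn.Module):
    """Predicts one-step future (lat, lon, v) for all surrounding vehicles."""
    def __init__(self, feat_dim: int = 3, hidden: int = 64):
        super().__init__()
        self.gat = GraphAttentionLayer(feat_dim, hidden)
        self.lstm = nn.LSTM(hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, feat_dim)

    def forward(self, hist: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # hist: (N, T, feat_dim) per-vehicle histories; unobserved steps are
        # assumed filled by phantom vehicles (e.g., copies of the last
        # observed state).
        interacted = torch.stack(
            [self.gat(hist[:, t], adj) for t in range(hist.size(1))], dim=1)
        out, _ = self.lstm(interacted)                        # (N, T, hidden)
        return self.head(out[:, -1])                          # (N, feat_dim)
```

In this sketch, every observable or phantom vehicle is a node and the adjacency matrix encodes which vehicles may interact at each time step, so all surrounding vehicles are predicted in a single forward pass.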
II. OVERVIEW
A. Preliminary Concepts
Environment. We consider an interactive environment where
there are one autonomous vehicle A and a set of conventional
vehicles C traveling on a straight multi-lane road. For the sake
of simplicity, parking and turning are not considered for now.
The autonomous vehicle can obtain the states (i.e., locations
and velocities) of surrounding conventional vehicles through
its sensors, and perform a maneuver at each time instant t
within a target time duration T of interest.
Lane. A lane is part of the road used to guide vehicles in the same direction. Herein, all the lanes are numbered incrementally from the leftmost side to the rightmost side, i.e., $l_1, l_2, \ldots, l_\kappa$, where $l_1$ and $l_\kappa$ indicate the leftmost lane and the rightmost lane, respectively.
Time Step. In order to model the problem more concisely, we treat the continuous time duration as a set of discrete time steps, i.e., $T = \{1, 2, \ldots, t, \ldots\}$. We denote $\Delta t$ as the time interval between two consecutive time steps, which serves as the minimum interval at which the autonomous vehicle can perform a maneuver. Following the settings used in previous work [14], [17], the time granularity in this work is set to 0.5 seconds (i.e., $\Delta t = 0.5\,s$).
Location. $(C_i^t.lat, C_i^t.lon)$ and $(A^t.lat, A^t.lon)$ indicate the locations of $C_i$ and $A$, respectively, at time step $t$, where $lat$ denotes the lateral lane number and $lon$ refers to the longitudinal location of a vehicle traveled from the origin. $d_{lon}(C_i^t, A^t)$ denotes the relative longitudinal distance between $C_i$ and $A$ at time step $t$, which can be calculated as follows:
$$d_{lon}(C_i^t, A^t) = C_i^t.lon - A^t.lon \quad (1)$$
In addition, $d_{lat}(C_i^t, A^t)$ denotes the relative lateral distance between $C_i$ and $A$ at time step $t$, which can be calculated as follows:
$$d_{lat}(C_i^t, A^t) = (C_i^t.lat - A^t.lat) \times wid_l \quad (2)$$
where $wid_l$ is the width of a lane. An advantage of using this type of lane-aware location is that it allows us to focus on the
lane change behavior itself without worrying about the lateral
location of the vehicle.
Velocity. $C_i^t.v$ and $A^t.v$ indicate the longitudinal velocities of $C_i$ and $A$, respectively, at time step $t$. $v(C_i^t, A^t)$ denotes the relative longitudinal velocity between $C_i$ and $A$ at time step $t$, which can be calculated as follows:
$$v(C_i^t, A^t) = C_i^t.v - A^t.v \quad (3)$$
Benefiting from the discrete time step and the lane-aware location, the lateral motion between two consecutive time steps is assumed to be uniform [14], [18], so
we focus on the longitudinal velocity in this work. In the
rest of the paper, we use velocity and longitudinal velocity
interchangeably when no ambiguity is caused.
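For concreteness, the relative quantities in Equations (1)-(3) can be computed directly from the lane-aware locations and velocities, as in the short sketch below; the record layout and the lane width value are assumptions for illustration, not values taken from the paper.

```python
# Minimal illustration of Equations (1)-(3); the dataclass layout and the
# lane width value are assumptions, not the paper's implementation.
from dataclasses import dataclass

LANE_WIDTH = 3.5  # wid_l in meters (assumed value)

@dataclass
class VehicleState:
    lat: int     # lateral lane number (1 = leftmost lane)
    lon: float   # longitudinal location from the origin (m)
    v: float     # longitudinal velocity (m/s)

def d_lon(c: VehicleState, a: VehicleState) -> float:
    """Relative longitudinal distance, Eq. (1)."""
    return c.lon - a.lon

def d_lat(c: VehicleState, a: VehicleState) -> float:
    """Relative lateral distance, Eq. (2)."""
    return (c.lat - a.lat) * LANE_WIDTH

def rel_v(c: VehicleState, a: VehicleState) -> float:
    """Relative longitudinal velocity, Eq. (3)."""
    return c.v - a.v

# Example: C_i is one lane to the right of A, 20 m ahead, 2 m/s faster.
c_i = VehicleState(lat=3, lon=120.0, v=27.0)
a = VehicleState(lat=2, lon=100.0, v=25.0)
print(d_lon(c_i, a), d_lat(c_i, a), rel_v(c_i, a))  # 20.0 3.5 2.0
```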
Maneuver. A maneuver is a pair of a lateral lane change behavior and a longitudinal velocity change behavior simultaneously performed by a vehicle [19]. $(A^t.b, A^t.a)$ represents the maneuver performed by $A$ at time step $t$.
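Since a maneuver couples a discrete lane change behavior with a continuous velocity change, one possible encoding is sketched below; the behavior set, the use of acceleration as the continuous parameter, and the update rule are illustrative assumptions, not the paper's exact parameterized action space.

```python
# Illustrative encoding of a maneuver (A^t.b, A^t.a): a discrete lane change
# behavior paired with a continuous velocity change (acceleration) parameter.
# The behavior set and the acceleration semantics are assumptions.
from dataclasses import dataclass
from enum import Enum

class LaneChange(Enum):
    LEFT = -1    # move to the adjacent left lane (lane numbers increase rightward)
    KEEP = 0     # stay in the current lane
    RIGHT = 1    # move to the adjacent right lane

@dataclass
class Maneuver:
    b: LaneChange   # discrete lateral behavior
    a: float        # continuous longitudinal acceleration (m/s^2)

def apply(lat: int, v: float, m: Maneuver, dt: float = 0.5):
    """Apply a maneuver for one time step (dt matches the 0.5 s granularity)."""
    new_lat = lat + m.b.value          # lane-aware lateral update
    new_v = max(0.0, v + m.a * dt)     # longitudinal velocity update
    return new_lat, new_v

# Example: change to the right lane while accelerating at 1.2 m/s^2.
print(apply(lat=2, v=25.0, m=Maneuver(LaneChange.RIGHT, 1.2)))  # (3, 25.6)
```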