暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

PostgreSQL 谁堵塞了谁(锁等待检测)- pg_blocking_pids, pg_safe_snapshot_blocking_pids

digoal 2019-02-01
313

作者

digoal

日期

2019-02-01

标签

PostgreSQL , 锁等待 , 队列


背景

当一个进程处于等待(被堵塞)状态时,是谁干的?可以使用如下函数,快速得到捣蛋(堵塞别人)的PID。

1、请求锁时被堵,是哪些PID堵的?

pg_blocking_pids(int) int[] Process ID(s) that are blocking specified server process ID from acquiring a lock

2、请求safe快照时被堵(SSI隔离级别,请求安全快照冲突),是哪些PID堵的?

pg_safe_snapshot_blocking_pids(int) int[] Process ID(s) that are blocking specified server process ID from acquiring a safe snapshot

例子

1、会话1

```
postgres=# begin;
BEGIN
postgres=# select * from tbl limit 1;
id | c1 | c2
--------+----+----
918943 | 1 | 0
(1 row)

postgres=# select pg_backend_pid();
pg_backend_pid


      30862

(1 row)
```

2、会话2

```
postgres=# begin;
BEGIN
postgres=# select pg_backend_pid();
pg_backend_pid


      30928

(1 row)

postgres=# truncate tbl;

等待中
```

3、会话3

```
postgres=# begin;
BEGIN
postgres=# select pg_backend_pid();
pg_backend_pid


      30936

(1 row)

postgres=# select * from tbl limit 1;

等待中
```

4、会话4

```
postgres=# select pg_backend_pid();
pg_backend_pid


      30999

(1 row)

postgres=# select * from tbl limit 1;

等待中
```

5、查看捣蛋PID

postgres=# select pid,pg_blocking_pids(pid),wait_event_type,wait_event,query from pg_stat_activity ; pid | pg_blocking_pids | wait_event_type | wait_event | query -------+------------------+-----------------+---------------------+------------------------------------------------------------------------------------------- 30862 | {} | Client | ClientRead | select pg_backend_pid(); 30928 | {30862} | Lock | relation | truncate tbl; 30936 | {30928} | Lock | relation | select * from tbl limit 1; 30999 | {30928} | Lock | relation | select * from tbl limit 1;

代码

src/backend/utils/adt/lockfuncs.c

```
/
* pg_blocking_pids - produce an array of the PIDs blocking given PID

* The reported PIDs are those that hold a lock conflicting with blocked_pid's
* current request (hard block), or are requesting such a lock and are ahead
* of blocked_pid in the lock's wait queue (soft block).

* In parallel-query cases, we report all PIDs blocking any member of the
* given PID's lock group, and the reported PIDs are those of the blocking
* PIDs' lock group leaders. This allows callers to compare the result to
* lists of clients' pg_backend_pid() results even during a parallel query.

* Parallel query makes it possible for there to be duplicate PIDs in the
* result (either because multiple waiters are blocked by same PID, or
* because multiple blockers have same group leader PID). We do not bother
* to eliminate such duplicates from the result.

* We need not consider predicate locks here, since those don't block anything.
/
Datum
pg_blocking_pids(PG_FUNCTION_ARGS)
{

...............

/
* pg_safe_snapshot_blocking_pids - produce an array of the PIDs blocking
* given PID from getting a safe snapshot

* XXX this does not consider parallel-query cases; not clear how big a
* problem that is in practice
*/
Datum
pg_safe_snapshot_blocking_pids(PG_FUNCTION_ARGS)
{
...........
```

src/backend/storage/lmgr/predicate.c

```
/
* GetSafeSnapshotBlockingPids
* If the specified process is currently blocked in GetSafeSnapshot,
* write the process IDs of all processes that it is blocked by
* into the caller-supplied buffer output[]. The list is truncated at
* output_size, and the number of PIDs written into the buffer is
* returned. Returns zero if the given PID is not currently blocked
* in GetSafeSnapshot.
/
int
GetSafeSnapshotBlockingPids(int blocked_pid, int output, int output_size)
{
int num_written = 0;
SERIALIZABLEXACT
sxact;

    LWLockAcquire(SerializableXactHashLock, LW_SHARED);

    /* Find blocked_pid's SERIALIZABLEXACT by linear search. */  
    for (sxact = FirstPredXact(); sxact != NULL; sxact = NextPredXact(sxact))  
    {  
            if (sxact->pid == blocked_pid)  
                    break;  
    }

    /* Did we find it, and is it currently waiting in GetSafeSnapshot? */  
    if (sxact != NULL && SxactIsDeferrableWaiting(sxact))  
    {  
            RWConflict      possibleUnsafeConflict;

            /* Traverse the list of possible unsafe conflicts collecting PIDs. */  
            possibleUnsafeConflict = (RWConflict)  
                    SHMQueueNext(&sxact->possibleUnsafeConflicts,  
                                             &sxact->possibleUnsafeConflicts,  
                                             offsetof(RWConflictData, inLink));

            while (possibleUnsafeConflict != NULL && num_written < output_size)  
            {  
                    output[num_written++] = possibleUnsafeConflict->sxactOut->pid;  
                    possibleUnsafeConflict = (RWConflict)  
                            SHMQueueNext(&sxact->possibleUnsafeConflicts,  
                                                     &possibleUnsafeConflict->inLink,  
                                                     offsetof(RWConflictData, inLink));  
            }  
    }

    LWLockRelease(SerializableXactHashLock);

    return num_written;

}
```

参考

https://www.postgresql.org/docs/11/functions-info.html

《PostgreSQL 锁等待排查实践 - 珍藏级 - process xxx1 acquired RowExclusiveLock on relation xxx2 of database xxx3 after xxx4 ms at xxx》

《PostgreSQL 锁等待监控 珍藏级SQL - 谁堵塞了谁》

《PostgreSQL 锁等待跟踪》

PostgreSQL 许愿链接

您的愿望将传达给PG kernel hacker、数据库厂商等, 帮助提高数据库产品质量和功能, 说不定下一个PG版本就有您提出的功能点. 针对非常好的提议,奖励限量版PG文化衫、纪念品、贴纸、PG热门书籍等,奖品丰富,快来许愿。开不开森.

9.9元购买3个月阿里云RDS PostgreSQL实例

PostgreSQL 解决方案集合

德哥 / digoal's github - 公益是一辈子的事.

digoal's wechat

文章转载自digoal,如果涉嫌侵权,请发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论