
He3DB (海山数据库) Source Code Explained: Organization of Tables and Tuples in He3DB PG (2)

cxp 2025-01-23


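I. Page Operations

1. Page initialization

Before a Page is accessed it is first loaded into memory, so a Page can be represented by a single char * pointer to the place in memory where that Page starts. Because the size of a Page is known, the Page pointer together with the page size is enough to describe and access a Page. When a Page is built, the PageInit function is called to initialize it.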

    void PageInit(Page page, Size pageSize, Size specialSize) {
        // p points to the start of the page header, which is also the start of the whole Page
        PageHeader p = (PageHeader) page;

        // align the size of the special area
        specialSize = MAXALIGN(specialSize);

        // the Page size must equal the constant BLCKSZ (8192 by default)
        Assert(pageSize == BLCKSZ);
        // besides the header and the special area, the Page must still have usable space
        Assert(pageSize > specialSize + SizeOfPageHeaderData);

        // fill the whole Page with zeroes
        MemSet(p, 0, pageSize);

        // initialize some of the header fields
        p->pd_flags = 0;
        p->pd_lower = SizeOfPageHeaderData;
        p->pd_upper = pageSize - specialSize;
        p->pd_special = pageSize - specialSize;
        PageSetPageSizeAndVersion(page, pageSize, PG_PAGE_LAYOUT_VERSION);
        /* p->pd_prune_xid = InvalidTransactionId; done by above MemSet */
    }
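  • Page initialization function: it first validates its arguments, asserting that pageSize equals BLCKSZ (8 KB by default) and that the MAXALIGN-ed specialSize still leaves usable space beyond the page header.
  • It then zeroes the page and initializes the header fields pd_flags, pd_lower, pd_upper, pd_special, and pd_pagesize_version.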
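To make the relationship between pd_lower, pd_upper, and pd_special concrete, here is a minimal standalone sketch of the same initialization. It is not the He3DB code: ToyPageHeader, toy_page_alloc, toy_page_init, and toy_page_free_space are hypothetical names, the header is trimmed to the fields discussed above, 8-byte maximum alignment is assumed, and the layout version 4 is assumed to match upstream PostgreSQL.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define TOY_BLCKSZ 8192u                                   /* mirrors the default BLCKSZ */
#define TOY_MAXALIGN(x) (((size_t) (x) + 7) & ~(size_t) 7) /* assumes 8-byte alignment */
#define TOY_LAYOUT_VERSION 4u                              /* assumed layout version */

/* Hypothetical, trimmed-down page header: only the fields discussed above. */
typedef struct ToyPageHeader {
    uint16_t pd_flags;
    uint16_t pd_lower;            /* offset where free space starts */
    uint16_t pd_upper;            /* offset where free space ends */
    uint16_t pd_special;          /* offset where the special area starts */
    uint16_t pd_pagesize_version; /* page size and layout version packed together */
} ToyPageHeader;

/* A page is just a block of memory addressed through a char pointer. */
static char *toy_page_alloc(void)
{
    return (char *) malloc(TOY_BLCKSZ);
}

static void toy_page_init(char *page, size_t page_size, size_t special_size)
{
    ToyPageHeader *p = (ToyPageHeader *) page;

    special_size = TOY_MAXALIGN(special_size);
    assert(page_size == TOY_BLCKSZ);
    assert(page_size > special_size + sizeof(ToyPageHeader));

    memset(page, 0, page_size);

    /* Free space starts right after the header and ends at the special area. */
    p->pd_lower = (uint16_t) sizeof(ToyPageHeader);
    p->pd_upper = (uint16_t) (page_size - special_size);
    p->pd_special = (uint16_t) (page_size - special_size);
    p->pd_pagesize_version = (uint16_t) (page_size | TOY_LAYOUT_VERSION);
}

/* Bytes available between the header (line pointers) and the tuple data. */
static size_t toy_page_free_space(const char *page)
{
    const ToyPageHeader *p = (const ToyPageHeader *) page;
    return (size_t) (p->pd_upper - p->pd_lower);
}
```

Calling toy_page_init(page, TOY_BLCKSZ, 10) on a page obtained from toy_page_alloc() rounds the special area up to 16 bytes, so pd_upper and pd_special both land at offset 8176 and everything between the header and that offset is free.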

2. Verifying page validity

    bool PageIsVerifiedExtended(Page page, BlockNumber blkno, int flags) {
        PageHeader p = (PageHeader) page;
        size_t *pagebytes = NULL;
        int i = 0;
        bool checksum_failure = false;
        bool header_sane = false;
        bool all_zeroes = false;
        uint16 checksum = 0;

        if (!PageIsNew(page)) {
            if (DataChecksumsEnabled()) {
                checksum = pg_checksum_page((char *) page, blkno);
                if (checksum != p->pd_checksum) {
                    checksum_failure = true;
                }
            }

            if ((p->pd_flags & ~PD_VALID_FLAG_BITS) == 0 &&
                p->pd_lower <= p->pd_upper &&
                p->pd_upper <= p->pd_special &&
                p->pd_special <= BLCKSZ &&
                p->pd_special == MAXALIGN(p->pd_special)) {
                header_sane = true;
            }

            if (header_sane && !checksum_failure) {
                LOG_FUNCTION_EXIT();
                return true;
            }
        }

        all_zeroes = true;
        pagebytes = (size_t *) page;
        for (i = 0; i < (BLCKSZ / sizeof(size_t)); i++) {
            if (pagebytes[i] != 0) {
                all_zeroes = false;
                break;
            }
        }

        if (all_zeroes) {
            LOG_FUNCTION_EXIT();
            return true;
        }

        if (checksum_failure) {
            if ((flags & PIV_LOG_WARNING) != 0) {
                ereport(WARNING,
                        (errcode(ERRCODE_DATA_CORRUPTED),
                         errmsg("page verification failed, calculated checksum %u but expected %u",
                                checksum, p->pd_checksum)));
            }
            if ((flags & PIV_REPORT_STAT) != 0) {
                pgstat_report_checksum_failure();
            }
            if (header_sane && ignore_checksum_failure) {
                LOG_FUNCTION_EXIT();
                return true;
            }
        }

        return false;
    }
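  • This function checks whether the page header and the page checksum are valid.
  • For a page that is not new, it first checks whether data checksums are enabled; if so, it computes the page checksum with pg_checksum_page() and compares it with the checksum stored in the page header.
  • It then checks that the header fields are plausible: pd_flags contains no unknown bits, pd_lower <= pd_upper <= pd_special <= BLCKSZ, and pd_special is MAXALIGN-ed.
  • A page that consists entirely of zeroes is also accepted as valid. For efficiency, that check walks the page in size_t-sized words rather than byte by byte.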
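The header sanity test and the all-zero test can be tried in isolation with the sketch below. It is a simplified illustration rather than the real function: checksums, flag bits, and the reporting options are left out, and ToyPageHeader, toy_header_is_sane, toy_page_is_all_zeroes, and toy_page_is_verified are hypothetical names. A page is treated as new when pd_upper is 0, and 8-byte alignment of the special area is assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_BLCKSZ 8192

/* Hypothetical header holding only the offsets that the sanity test inspects. */
typedef struct ToyPageHeader {
    uint16_t pd_flags;
    uint16_t pd_lower;
    uint16_t pd_upper;
    uint16_t pd_special;
} ToyPageHeader;

/* Same ordering constraints as in the listing, minus the flag-bit test. */
static bool toy_header_is_sane(const ToyPageHeader *p)
{
    return p->pd_lower <= p->pd_upper &&
           p->pd_upper <= p->pd_special &&
           p->pd_special <= TOY_BLCKSZ &&
           p->pd_special % 8 == 0;      /* assumes 8-byte MAXALIGN */
}

/* Compare one size_t-sized word at a time instead of one byte at a time. */
static bool toy_page_is_all_zeroes(const void *page)
{
    size_t i;

    for (i = 0; i < TOY_BLCKSZ / sizeof(size_t); i++) {
        size_t word;

        memcpy(&word, (const char *) page + i * sizeof(size_t), sizeof(word));
        if (word != 0)
            return false;
    }
    return true;
}

static bool toy_page_is_verified(const void *page)
{
    ToyPageHeader hdr;

    memcpy(&hdr, page, sizeof(hdr));

    /* An initialized page (pd_upper != 0) must have a sane header. */
    if (hdr.pd_upper != 0 && toy_header_is_sane(&hdr))
        return true;

    /* Otherwise only a completely zeroed page is acceptable. */
    return toy_page_is_all_zeroes(page);
}
```

A zero-filled buffer passes, a buffer whose header has pd_lower = 24 and pd_upper = pd_special = 8192 passes, and a header with pd_lower greater than pd_upper is rejected.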

II. Table Operations


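Opening a table does not physically open a file; it returns the table's RelationData structure. Two functions are at the core:

1. relation_open

Obtains the table's RelationData structure from the table's OID and lockmode, locks the table, and returns the RelationData. If the table is being opened for the first time, a new RelationData structure is created in the RelCache.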

    Relation relation_open(Oid relationId, LOCKMODE lockmode) {
        Relation r;

        Assert(lockmode >= NoLock && lockmode < MAX_LOCKMODES);

        if (lockmode != NoLock) {
            LockRelationOid(relationId, lockmode);
        }

        r = RelationIdGetRelation(relationId);
        if (!RelationIsValid(r)) {
            elog(ERROR, "could not open relation with OID %u", relationId);
        }

        Assert(lockmode != NoLock || IsBootstrapProcessingMode() ||
               CheckRelationLockedByMe(r, AccessShareLock, true));

        if (RelationUsesLocalBuffers(r)) {
            MyXactFlags |= XACT_FLAGS_ACCESSEDTEMPNAMESPACE;
        }

        pgstat_init_relation(r);

        return r;
    }
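  • 1. Assertion: Assert(lockmode >= NoLock && lockmode < MAX_LOCKMODES); makes sure the supplied lock mode is in the valid range.
  • 2. Acquire the lock: if the lock mode is not NoLock, LockRelationOid(relationId, lockmode); acquires the corresponding lock. If it is NoLock, no lock is taken.
  • 3. Open the relation: RelationIdGetRelation(relationId); fetches the relation's cache entry by OID.
  • 4. Check validity: if the returned relation is invalid (!RelationIsValid(r)), elog(ERROR, ...) reports the error and aborts the operation.
  • 5. Assert that a lock is held: if the lock mode is NoLock, assert that the caller already holds some lock on the relation (CheckRelationLockedByMe), unless the backend is in bootstrap processing mode.
  • 6. Flag temporary-relation access: if the relation uses local buffers (that is, it is a temporary table), MyXactFlags |= XACT_FLAGS_ACCESSEDTEMPNAMESPACE; records that the current transaction accessed the temporary namespace.
  • 7. Statistics: pgstat_init_relation(r); initializes statistics collection for the relation.
  • 8. Return: finally, the function returns the cache entry of the opened relation.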
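The idea that opening a table means handing out a cached descriptor rather than opening a file can be sketched with a toy cache. Everything here is hypothetical (ToyRelationData, toy_relcache, toy_relation_open, toy_relation_close): a small linear array stands in for the real relation cache, locking is left out entirely, and the reference count only hints at how an entry knows it is still in use.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int ToyOid;

/* Hypothetical descriptor standing in for RelationData. */
typedef struct ToyRelationData {
    ToyOid rd_id;     /* OID of the relation this entry describes */
    int    rd_refcnt; /* how many callers currently have it open */
    int    in_use;    /* is this cache slot occupied? */
} ToyRelationData;

#define TOY_CACHE_SIZE 8

static ToyRelationData toy_relcache[TOY_CACHE_SIZE];
static int toy_cache_builds = 0; /* counts how often a new entry was created */

/* Return the cached descriptor for relid, creating it on first open. */
static ToyRelationData *toy_relation_open(ToyOid relid)
{
    ToyRelationData *free_slot = NULL;

    for (int i = 0; i < TOY_CACHE_SIZE; i++) {
        ToyRelationData *r = &toy_relcache[i];

        if (r->in_use && r->rd_id == relid) {
            r->rd_refcnt++;          /* already cached: just hand it out again */
            return r;
        }
        if (!r->in_use && free_slot == NULL)
            free_slot = r;
    }

    if (free_slot == NULL)
        return NULL;                 /* toy cache is full; no eviction here */

    free_slot->in_use = 1;
    free_slot->rd_id = relid;
    free_slot->rd_refcnt = 1;
    toy_cache_builds++;
    return free_slot;
}

/* Closing only drops the reference; the entry stays in the cache. */
static void toy_relation_close(ToyRelationData *r)
{
    assert(r->rd_refcnt > 0);
    r->rd_refcnt--;
}
```

Opening the same OID twice returns the same pointer and builds the entry only once, which is the behaviour the paragraph above describes for the first and subsequent opens.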

2. relation_openrv

Looks up the table's OID from the table's name and then calls relation_open.

    Relation relation_openrv(const RangeVar *relation, LOCKMODE lockmode) {
        Oid relOid = 0;

        if (lockmode != NoLock) {
            AcceptInvalidationMessages();
        }

        relOid = RangeVarGetRelid(relation, lockmode, false);

        return relation_open(relOid, NoLock);
    }
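  • 1. Variable initialization: Oid relOid = 0; initializes the variable that will hold the relation's object identifier (OID).
  • 2. Lock-mode handling:
    If the lock mode is not NoLock, AcceptInvalidationMessages(); is called. It processes pending cache invalidation messages from other transactions so that the name lookup sees up-to-date catalog information.
    If the lock mode is NoLock, this step is skipped.
  • 3. Get the relation OID: RangeVarGetRelid(relation, lockmode, false); looks up the relation's OID from the supplied RangeVar structure and lock mode, acquiring the corresponding lock when one is requested.
  • 4. Open the relation: relation_open(relOid, NoLock); opens the relation by the OID just obtained. NoLock is passed because RangeVarGetRelid has already taken the lock if one was needed, so it does not have to be taken again.
  • 5. Return value: the function returns the cache entry of the opened relation.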
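The "resolve the name, lock once, then open by OID without locking again" pattern can be illustrated with a small self-contained sketch. All names and values are hypothetical: toy_catalog is a two-row stand-in for the catalog lookup, toy_name_get_relid plays the role of RangeVarGetRelid, toy_open_by_oid plays the role of relation_open, and toy_locks_taken simply counts lock acquisitions.

```c
#include <assert.h>
#include <string.h>

typedef unsigned int ToyOid;
#define TOY_INVALID_OID 0u

/* Hypothetical catalog: maps relation names to OIDs. */
typedef struct ToyCatalogRow {
    const char *relname;
    ToyOid      oid;
} ToyCatalogRow;

static const ToyCatalogRow toy_catalog[] = {
    {"orders", 16384u},
    {"customers", 16390u},
};

static int toy_locks_taken = 0; /* how many times a lock was acquired */

/* Resolve a name to an OID and take the lock here if one is requested. */
static ToyOid toy_name_get_relid(const char *relname, int lock_requested)
{
    size_t n = sizeof(toy_catalog) / sizeof(toy_catalog[0]);

    for (size_t i = 0; i < n; i++) {
        if (strcmp(toy_catalog[i].relname, relname) == 0) {
            if (lock_requested)
                toy_locks_taken++;
            return toy_catalog[i].oid;
        }
    }
    return TOY_INVALID_OID;
}

/* Open by OID; locks only if the caller asks for it. */
static ToyOid toy_open_by_oid(ToyOid oid, int lock_requested)
{
    if (lock_requested)
        toy_locks_taken++;
    return oid;
}

/* Open by name: the lookup already locked, so the open passes "no lock". */
static ToyOid toy_openrv(const char *relname, int lock_requested)
{
    ToyOid oid = toy_name_get_relid(relname, lock_requested);

    if (oid == TOY_INVALID_OID)
        return TOY_INVALID_OID;
    return toy_open_by_oid(oid, 0);
}
```

With this split, toy_openrv("orders", 1) acquires exactly one lock even though two functions are involved, which is why the real code passes NoLock to relation_open.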

3. Scanning a table


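  1. File blocks are first loaded into the buffer pool one by one, and every tuple in each buffer is then examined to find the tuples that satisfy the condition.
  2. While a table is being scanned, the HeapScanDescData structure holds the table's basic information and the current scan state.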
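The two points above can be sketched as a block-by-block loop driven by a small scan descriptor. This is an illustration under simplifying assumptions, not the heap scan code: ToyBlock, ToyScanDesc, toy_scan_begin, and toy_scan_getnext are hypothetical, tuples are plain integers, the blocks are already in memory, and the scan condition is a simple "greater than threshold" test.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_TUPLES_PER_BLOCK 4

/* Hypothetical block: a handful of integer "tuples". */
typedef struct ToyBlock {
    int ntuples;
    int values[TOY_TUPLES_PER_BLOCK];
} ToyBlock;

/* Hypothetical scan descriptor: table information plus the current position. */
typedef struct ToyScanDesc {
    const ToyBlock *blocks; /* the table being scanned */
    int nblocks;            /* number of blocks in the table */
    int cur_block;          /* block currently being examined */
    int cur_offset;         /* next tuple to examine inside cur_block */
} ToyScanDesc;

static void toy_scan_begin(ToyScanDesc *scan, const ToyBlock *blocks, int nblocks)
{
    scan->blocks = blocks;
    scan->nblocks = nblocks;
    scan->cur_block = 0;
    scan->cur_offset = 0;
}

/*
 * Return the next tuple whose value exceeds threshold, or NULL when the scan
 * is exhausted. The position is remembered in the descriptor, so each call
 * resumes where the previous one stopped.
 */
static const int *toy_scan_getnext(ToyScanDesc *scan, int threshold)
{
    while (scan->cur_block < scan->nblocks) {
        const ToyBlock *blk = &scan->blocks[scan->cur_block];

        while (scan->cur_offset < blk->ntuples) {
            const int *v = &blk->values[scan->cur_offset++];

            if (*v > threshold)
                return v;
        }

        /* Block exhausted: move on to the first tuple of the next block. */
        scan->cur_block++;
        scan->cur_offset = 0;
    }
    return NULL;
}
```

Keeping the position in the descriptor rather than in local variables is what lets the caller pull matching tuples one at a time.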

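III. Tuple Operations

Tuples support three operations: insert, delete, and update. An update is implemented by deleting the old tuple and inserting a new one.

1. Inserting a tuple

The interface for inserting a tuple is the heap_insert() function.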

    void heap_insert(Relation relation, HeapTuple tup, CommandId cid, int options, BulkInsertState bistate) {
        TransactionId xid = GetCurrentTransactionId();
        HeapTuple heaptup;
        Buffer buffer = InvalidBuffer;
        Buffer vmbuffer = InvalidBuffer;
        bool all_visible_cleared = false;

        Assert(HeapTupleHeaderGetNatts(tup->t_data) <= RelationGetNumberOfAttributes(relation));

        heaptup = heap_prepare_insert(relation, tup, xid, cid, options);

        buffer = RelationGetBufferForTuple(relation, heaptup->t_len, InvalidBuffer,
                                           options, bistate, &vmbuffer, NULL);

        CheckForSerializableConflictIn(relation, NULL, InvalidBlockNumber);

        START_CRIT_SECTION();

        RelationPutHeapTuple(relation, buffer, heaptup, (options & HEAP_INSERT_SPECULATIVE) != 0);

        if (PageIsAllVisible(BufferGetPage(buffer))) {
            all_visible_cleared = true;
            PageClearAllVisible(BufferGetPage(buffer));
            visibilitymap_clear(relation, ItemPointerGetBlockNumber(&(heaptup->t_self)),
                                vmbuffer, VISIBILITYMAP_VALID_BITS);
        }

        MarkBufferDirty(buffer);

        /* XLOG stuff */
        if (RelationNeedsWAL(relation)) {
            xl_heap_insert xlrec;
            xl_heap_header xlhdr;
            XLogRecPtr recptr = 0;
            Page page = BufferGetPage(buffer);
            uint8 info = XLOG_HEAP_INSERT;
            int bufflags = 0;

            if (RelationIsAccessibleInLogicalDecoding(relation)) {
                log_heap_new_cid(relation, heaptup);
            }

            if (ItemPointerGetOffsetNumber(&(heaptup->t_self)) == FirstOffsetNumber &&
                PageGetMaxOffsetNumber(page) == FirstOffsetNumber) {
                info |= XLOG_HEAP_INIT_PAGE;
                bufflags |= REGBUF_WILL_INIT;
            }

            xlrec.offnum = ItemPointerGetOffsetNumber(&heaptup->t_self);
            xlrec.flags = 0;
            if (all_visible_cleared) {
                xlrec.flags |= XLH_INSERT_ALL_VISIBLE_CLEARED;
            }
            if (options & HEAP_INSERT_SPECULATIVE) {
                xlrec.flags |= XLH_INSERT_IS_SPECULATIVE;
            }
            Assert(ItemPointerGetBlockNumber(&heaptup->t_self) == BufferGetBlockNumber(buffer));

            if (RelationIsLogicallyLogged(relation) && !(options & HEAP_INSERT_NO_LOGICAL)) {
                xlrec.flags |= XLH_INSERT_CONTAINS_NEW_TUPLE;
                bufflags |= REGBUF_KEEP_DATA;
                if (IsToastRelation(relation)) {
                    xlrec.flags |= XLH_INSERT_ON_TOAST_RELATION;
                }
            }

            XLogBeginInsert();
            XLogRegisterData((char *) &xlrec, SizeOfHeapInsert);

            xlhdr.t_infomask2 = heaptup->t_data->t_infomask2;
            xlhdr.t_infomask = heaptup->t_data->t_infomask;
            xlhdr.t_hoff = heaptup->t_data->t_hoff;

            XLogRegisterBuffer(0, buffer, REGBUF_STANDARD | bufflags);
            XLogRegisterBufData(0, (char *) &xlhdr, SizeOfHeapHeader);
            /* PG73FORMAT: write bitmap [+ padding] [+ oid] + data */
            XLogRegisterBufData(0, (char *) heaptup->t_data + SizeofHeapTupleHeader,
                                heaptup->t_len - SizeofHeapTupleHeader);

            /* filtering by origin on a row level is much more efficient */
            XLogSetRecordFlags(XLOG_INCLUDE_ORIGIN);

            recptr = XLogInsert(RM_HEAP_ID, info);

            PageSetLSN(page, recptr);
        }

        END_CRIT_SECTION();

        UnlockReleaseBuffer(buffer);
        if (vmbuffer != InvalidBuffer) {
            ReleaseBuffer(vmbuffer);
        }

        CacheInvalidateHeapTuple(relation, heaptup, NULL);

        pgstat_count_heap_insert(relation, 1);

        if (heaptup != tup) {
            tup->t_self = heaptup->t_self;
            heap_freetuple(heaptup);
        }

        return;
    }

The function proceeds as follows:

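  1. Obtain the current transaction ID with GetCurrentTransactionId. In the version shown here no OID is allocated for the new tuple.
  2. Initialize the tuple with heap_prepare_insert: set t_xmin and t_cmin to the current transaction ID and the current command ID, mark t_xmax as invalid, and set
    t_tableOid (the OID of the table that contains the tuple).
  3. Find a file block that belongs to the table and has enough free space for the new tuple, and load it into a buffer so that tup can be inserted (by calling
    RelationGetBufferForTuple).
  4. With the new tuple tup and the buffer that will hold it, call RelationPutHeapTuple to place the new tuple
    in the chosen buffer, then mark the buffer dirty.
  5. If the relation needs WAL, write one XLog record to the transaction log (XLog).
  6. When all of this is done, unlock and release the buffer. The function returns void; the location of the inserted tuple is available to the caller in tup->t_self.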
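Steps 2 to 4 can be mimicked on a toy page to show where the tuple header and the line pointer end up. This is a simplified sketch, not the heap code: ToyPage, ToyItemId, ToyTupleHeader, toy_page_init, and toy_heap_insert are hypothetical, the page is only 256 bytes, the line pointers live in a separate fixed array, and WAL, locking, and visibility-map handling are omitted. An xmax of 0 is used to mean "invalid".

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_PAGE_SIZE 256
#define TOY_MAX_ITEMS 16

/* Hypothetical line pointer: where a tuple starts and how long it is. */
typedef struct ToyItemId {
    uint16_t off;
    uint16_t len;
} ToyItemId;

/* Hypothetical tuple header with only the two transaction fields. */
typedef struct ToyTupleHeader {
    uint32_t t_xmin; /* inserting transaction */
    uint32_t t_xmax; /* deleting transaction, 0 = invalid (not deleted) */
} ToyTupleHeader;

typedef struct ToyPage {
    uint16_t      nitems;               /* number of line pointers in use */
    uint16_t      upper;                /* tuple data occupies [upper, TOY_PAGE_SIZE) */
    ToyItemId     items[TOY_MAX_ITEMS];
    unsigned char data[TOY_PAGE_SIZE];
} ToyPage;

static void toy_page_init(ToyPage *page)
{
    memset(page, 0, sizeof(*page));
    page->upper = TOY_PAGE_SIZE;        /* empty page: data area starts at the end */
}

/* Returns the 1-based offset number of the new tuple, or 0 if it does not fit. */
static uint16_t toy_heap_insert(ToyPage *page, uint32_t xid,
                                const void *payload, uint16_t payload_len)
{
    ToyTupleHeader hdr;
    size_t total = sizeof(hdr) + payload_len;

    if (page->nitems >= TOY_MAX_ITEMS || total > page->upper)
        return 0;

    /* Prepare the header: stamp the inserter, mark xmax invalid. */
    hdr.t_xmin = xid;
    hdr.t_xmax = 0;

    /* Tuple data grows downward from the end of the page. */
    page->upper = (uint16_t) (page->upper - total);
    memcpy(page->data + page->upper, &hdr, sizeof(hdr));
    memcpy(page->data + page->upper + sizeof(hdr), payload, payload_len);

    /* The line pointer array grows upward and records the tuple's location. */
    page->items[page->nitems].off = page->upper;
    page->items[page->nitems].len = (uint16_t) total;
    page->nitems++;

    return page->nitems;
}
```

Inserting a 3-byte payload into an empty toy page yields offset number 1 and moves upper from 256 to 245 (8 header bytes plus 3 payload bytes); a second insert gets offset number 2 and is stored just below the first.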

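2. Deleting a tuple

PostgreSQL deletes tuples by marking them as deleted. This suits MVCC, and undoing or redoing a delete is fast because only the mark has to be reset. The disk space occupied by tuples marked as deleted is reclaimed by running VACUUM.
Tuple deletion is implemented mainly by heap_delete: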

    TM_Result heap_delete(Relation relation, ItemPointer tid, CommandId cid, Snapshot crosscheck,
                          bool wait, TM_FailureData *tmfd, bool changingPart) {
        TM_Result result;
        TransactionId xid = GetCurrentTransactionId();
        ItemId lp;
        HeapTupleData tp;
        Page page;
        BlockNumber block = 0;
        Buffer buffer = InvalidBuffer;
        Buffer vmbuffer = InvalidBuffer;
        TransactionId new_xmax;
        uint16 new_infomask, new_infomask2;
        bool have_tuple_lock = false;
        bool iscombo = false;
        bool all_visible_cleared = false;
        HeapTuple old_key_tuple = NULL; /* replica identity of the tuple */
        bool old_key_copied = false;

        Assert(ItemPointerIsValid(tid));

        if (IsInParallelMode()) {
            ereport(ERROR,
                    (errcode(ERRCODE_INVALID_TRANSACTION_STATE),
                     errmsg("cannot delete tuples during a parallel operation")));
        }

        block = ItemPointerGetBlockNumber(tid);
        buffer = ReadBuffer(relation, block);
        page = BufferGetPage(buffer);

        if (PageIsAllVisible(page)) {
            visibilitymap_pin(relation, block, &vmbuffer);
        }

        LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);

        lp = PageGetItemId(page, ItemPointerGetOffsetNumber(tid));
        Assert(ItemIdIsNormal(lp));

        tp.t_tableOid = RelationGetRelid(relation);
        tp.t_data = (HeapTupleHeader) PageGetItem(page, lp);
        tp.t_len = ItemIdGetLength(lp);
        tp.t_self = *tid;

    l1:
        if (vmbuffer == InvalidBuffer && PageIsAllVisible(page)) {
            LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
            visibilitymap_pin(relation, block, &vmbuffer);
            LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
        }

        result = HeapTupleSatisfiesUpdate(&tp, cid, buffer);

        if (result == TM_Invisible) {
            UnlockReleaseBuffer(buffer);
            ereport(ERROR,
                    (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
                     errmsg("attempted to delete invisible tuple")));
        } else if (result == TM_BeingModified && wait) {
            TransactionId xwait;
            uint16 infomask;

            /* must copy state data before unlocking buffer */
            xwait = HeapTupleHeaderGetRawXmax(tp.t_data);
            infomask = tp.t_data->t_infomask;

            if (infomask & HEAP_XMAX_IS_MULTI) {
                bool current_is_member = false;

                if (DoesMultiXactIdConflict((MultiXactId) xwait, infomask,
                                            LockTupleExclusive, &current_is_member)) {
                    LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
                    if (!current_is_member) {
                        heap_acquire_tuplock(relation, &(tp.t_self), LockTupleExclusive,
                                             LockWaitBlock, &have_tuple_lock);
                    }

                    /* wait for multixact */
                    MultiXactIdWait((MultiXactId) xwait, MultiXactStatusUpdate, infomask,
                                    relation, &(tp.t_self), XLTW_Delete, NULL);
                    LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);

                    if ((vmbuffer == InvalidBuffer && PageIsAllVisible(page)) ||
                        xmax_infomask_changed(tp.t_data->t_infomask, infomask) ||
                        !TransactionIdEquals(HeapTupleHeaderGetRawXmax(tp.t_data), xwait))
                        goto l1;
                }
            } else if (!TransactionIdIsCurrentTransactionId(xwait)) {
                LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
                heap_acquire_tuplock(relation, &(tp.t_self), LockTupleExclusive,
                                     LockWaitBlock, &have_tuple_lock);
                XactLockTableWait(xwait, relation, &(tp.t_self), XLTW_Delete);
                LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);

                if ((vmbuffer == InvalidBuffer && PageIsAllVisible(page)) ||
                    xmax_infomask_changed(tp.t_data->t_infomask, infomask) ||
                    !TransactionIdEquals(HeapTupleHeaderGetRawXmax(tp.t_data), xwait))
                    goto l1;

                /* Otherwise check if it committed or aborted */
                UpdateXmaxHintBits(tp.t_data, buffer, xwait);
            }

            if ((tp.t_data->t_infomask & HEAP_XMAX_INVALID) ||
                HEAP_XMAX_IS_LOCKED_ONLY(tp.t_data->t_infomask) ||
                HeapTupleHeaderIsOnlyLocked(tp.t_data))
                result = TM_Ok;
            else if (!ItemPointerEquals(&tp.t_self, &tp.t_data->t_ctid))
                result = TM_Updated;
            else
                result = TM_Deleted;
        }

        if (crosscheck != InvalidSnapshot && result == TM_Ok) {
            /* Perform additional check for transaction-snapshot mode RI updates */
            if (!HeapTupleSatisfiesVisibility(&tp, crosscheck, buffer)) {
                result = TM_Updated;
            }
        }

        if (result != TM_Ok) {
            Assert(result == TM_SelfModified || result == TM_Updated ||
                   result == TM_Deleted || result == TM_BeingModified);
            Assert(!(tp.t_data->t_infomask & HEAP_XMAX_INVALID));
            Assert(result != TM_Updated || !ItemPointerEquals(&tp.t_self, &tp.t_data->t_ctid));
            tmfd->ctid = tp.t_data->t_ctid;
            tmfd->xmax = HeapTupleHeaderGetUpdateXid(tp.t_data);
            if (result == TM_SelfModified)
                tmfd->cmax = HeapTupleHeaderGetCmax(tp.t_data);
            else
                tmfd->cmax = InvalidCommandId;
            UnlockReleaseBuffer(buffer);
            if (have_tuple_lock) {
                UnlockTupleTuplock(relation, &(tp.t_self), LockTupleExclusive);
            }
            if (vmbuffer != InvalidBuffer) {
                ReleaseBuffer(vmbuffer);
            }
            return result;
        }

        CheckForSerializableConflictIn(relation, tid, BufferGetBlockNumber(buffer));

        HeapTupleHeaderAdjustCmax(tp.t_data, &cid, &iscombo);

        old_key_tuple = ExtractReplicaIdentity(relation, &tp, true, &old_key_copied);

        MultiXactIdSetOldestMember();

        compute_new_xmax_infomask(HeapTupleHeaderGetRawXmax(tp.t_data),
                                  tp.t_data->t_infomask, tp.t_data->t_infomask2,
                                  xid, LockTupleExclusive, true,
                                  &new_xmax, &new_infomask, &new_infomask2);

        START_CRIT_SECTION();

        PageSetPrunable(page, xid);

        if (PageIsAllVisible(page)) {
            all_visible_cleared = true;
            PageClearAllVisible(page);
            visibilitymap_clear(relation, BufferGetBlockNumber(buffer),
                                vmbuffer, VISIBILITYMAP_VALID_BITS);
        }

        /* store transaction information of xact deleting the tuple */
        tp.t_data->t_infomask &= ~(HEAP_XMAX_BITS | HEAP_MOVED);
        tp.t_data->t_infomask2 &= ~HEAP_KEYS_UPDATED;
        tp.t_data->t_infomask |= new_infomask;
        tp.t_data->t_infomask2 |= new_infomask2;
        HeapTupleHeaderClearHotUpdated(tp.t_data);
        HeapTupleHeaderSetXmax(tp.t_data, new_xmax);
        HeapTupleHeaderSetCmax(tp.t_data, cid, iscombo);
        /* Make sure there is no forward chain link in t_ctid */
        tp.t_data->t_ctid = tp.t_self;

        /* Signal that this is actually a move into another partition */
        if (changingPart) {
            HeapTupleHeaderSetMovedPartitions(tp.t_data);
        }

        MarkBufferDirty(buffer);

        if (RelationNeedsWAL(relation)) {
            xl_heap_delete xlrec;
            xl_heap_header xlhdr;
            XLogRecPtr recptr = 0;

            if (RelationIsAccessibleInLogicalDecoding(relation)) {
                log_heap_new_cid(relation, &tp);
            }

            xlrec.flags = 0;
            if (all_visible_cleared) {
                xlrec.flags |= XLH_DELETE_ALL_VISIBLE_CLEARED;
            }
            if (changingPart) {
                xlrec.flags |= XLH_DELETE_IS_PARTITION_MOVE;
            }
            xlrec.infobits_set = compute_infobits(tp.t_data->t_infomask, tp.t_data->t_infomask2);
            xlrec.offnum = ItemPointerGetOffsetNumber(&tp.t_self);
            xlrec.xmax = new_xmax;

            if (old_key_tuple != NULL) {
                if (relation->rd_rel->relreplident == REPLICA_IDENTITY_FULL)
                    xlrec.flags |= XLH_DELETE_CONTAINS_OLD_TUPLE;
                else
                    xlrec.flags |= XLH_DELETE_CONTAINS_OLD_KEY;
            }

            XLogBeginInsert();
            XLogRegisterData((char *) &xlrec, SizeOfHeapDelete);

            XLogRegisterBuffer(0, buffer, REGBUF_STANDARD);

            if (old_key_tuple != NULL) {
                xlhdr.t_infomask2 = old_key_tuple->t_data->t_infomask2;
                xlhdr.t_infomask = old_key_tuple->t_data->t_infomask;
                xlhdr.t_hoff = old_key_tuple->t_data->t_hoff;

                XLogRegisterData((char *) &xlhdr, SizeOfHeapHeader);
                XLogRegisterData((char *) old_key_tuple->t_data + SizeofHeapTupleHeader,
                                 old_key_tuple->t_len - SizeofHeapTupleHeader);
            }

            /* filtering by origin on a row level is much more efficient */
            XLogSetRecordFlags(XLOG_INCLUDE_ORIGIN);

            recptr = XLogInsert(RM_HEAP_ID, XLOG_HEAP_DELETE);

            PageSetLSN(page, recptr);
        }

        END_CRIT_SECTION();

        LockBuffer(buffer, BUFFER_LOCK_UNLOCK);

        if (vmbuffer != InvalidBuffer) {
            ReleaseBuffer(vmbuffer);
        }

        if (relation->rd_rel->relkind != RELKIND_RELATION &&
            relation->rd_rel->relkind != RELKIND_MATVIEW) {
            /* toast table entries should never be recursively toasted */
            Assert(!HeapTupleHasExternal(&tp));
        } else if (HeapTupleHasExternal(&tp)) {
            heap_toast_delete(relation, &tp, false);
        }

        CacheInvalidateHeapTuple(relation, &tp, NULL);

        /* Now we can release the buffer */
        ReleaseBuffer(buffer);

        /*
         * Release the lmgr tuple lock, if we had it.
         */
        if (have_tuple_lock) {
            UnlockTupleTuplock(relation, &(tp.t_self), LockTupleExclusive);
        }

        pgstat_count_heap_delete(relation);

        if (old_key_tuple != NULL && old_key_copied) {
            heap_freetuple(old_key_tuple);
        }

        return TM_Ok;
    }

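The main flow is as follows:

  1. Use the tid of the tuple to be deleted to read the buffer that holds it, and take an exclusive lock on that buffer.
  2. Call HeapTupleSatisfiesUpdate to check the tuple's visibility to the current transaction. If the tuple is invisible to the current transaction (HeapTupleSatisfiesUpdate returns TM_Invisible), unlock and release the buffer and raise an error.
  3. If the tuple has already been modified by the current transaction (HeapTupleSatisfiesUpdate returns TM_SelfModified) or has already been updated or deleted (TM_Updated or TM_Deleted), report the tuple's ctid, which points to the physical location of the newer version, and its xmax through tmfd, unlock and release the buffer, and return that result code.
  4. If the tuple is being modified by another transaction (HeapTupleSatisfiesUpdate returns TM_BeingModified) and wait is true, wait for that transaction to finish and check again. If the tuple may be modified (the result is TM_Ok), heap_delete continues.
  5. Enter the critical section and set t_xmax and t_cmax to the current transaction ID and the current command ID. At this point the tuple has been marked as deleted.
  6. Write the XLog record.
  7. If the tuple has out-of-line data, that is, data stored through TOAST, delete the corresponding data from its TOAST table as well.
  8. If the tuple belongs to a system catalog, send an invalidation message.
  9. Release the buffer and the tuple-level lock if one was taken, count the delete in the statistics, and return TM_Ok. The code shown here does not update the FSM.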
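Mark-and-reclaim deletion can be illustrated with a deliberately crude sketch. The names ToyTuple, ToyResult, toy_heap_delete, and toy_vacuum are hypothetical, and the rules are assumptions made for the example only: visibility is reduced to comparing transaction IDs (no snapshots, no commit status, no waiting), and the reclaim step drops every marked tuple, whereas the real VACUUM only removes tuples that no transaction can still see.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result codes, loosely modelled on the TM_* values above. */
typedef enum ToyResult {
    TOY_OK,
    TOY_INVISIBLE,
    TOY_SELF_MODIFIED,
    TOY_DELETED
} ToyResult;

typedef struct ToyTuple {
    uint32_t t_xmin; /* inserting transaction */
    uint32_t t_xmax; /* deleting transaction, 0 = not deleted */
} ToyTuple;

/* Delete by marking: the tuple stays where it is, only t_xmax changes. */
static ToyResult toy_heap_delete(ToyTuple *tup, uint32_t xid)
{
    if (tup->t_xmin > xid)
        return TOY_INVISIBLE;      /* crude rule: inserted by a later transaction */
    if (tup->t_xmax == xid)
        return TOY_SELF_MODIFIED;  /* this transaction already deleted it */
    if (tup->t_xmax != 0)
        return TOY_DELETED;        /* some other transaction deleted it */

    tup->t_xmax = xid;             /* the actual "delete" */
    return TOY_OK;
}

/*
 * Crude stand-in for space reclamation: compact the array, keeping only
 * tuples that were never marked. Returns the number of tuples kept.
 */
static int toy_vacuum(ToyTuple *tuples, int n)
{
    int kept = 0;

    for (int i = 0; i < n; i++) {
        if (tuples[i].t_xmax == 0)
            tuples[kept++] = tuples[i];
    }
    return kept;
}
```

Because the delete only writes t_xmax, undoing it in this toy model is a matter of resetting that one field, and the space is not given back until toy_vacuum runs.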
