Doesn't a REPEATABLE READ transaction already freeze the catalog view?

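No. PostgreSQL's internal catalog access (name resolution and catalog-backed functions such as format_type) uses a catalog snapshot that refreshes as needed, so it reflects DDL committed by another session mid-transaction even under REPEATABLE READ. Explicit SELECTs against pg_class and pg_attribute, by contrast, stay on the transaction snapshot at that isolation level, so one transaction can hold two different views of the schema. That is why a before/after fingerprint comparison, with the second read taken on a fresh snapshot, is required rather than relying on transaction isolation.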

Why not just lock the catalog or the tables during extraction?

Taking locks that block DDL would let a read-only compliance scan stall application deploys and migrations, which is unacceptable on a production primary. Fingerprinting detects drift after the fact under only AccessShareLock, so extraction never competes with DDL for a lock it should not hold.

Should one drifted object fail the entire extraction run?

No. Partition the matrix so hundreds of clean objects still publish, and move only the drifted objects into a reason-tagged quarantine for a bounded re-extraction. A single mid-read ALTER should never invalidate a whole environment's baseline.

How does mid-read drift detection differ on MySQL?

MySQL has no pg_export_snapshot equivalent and INFORMATION_SCHEMA reflects live state per statement, so you fingerprint from INFORMATION_SCHEMA.COLUMNS and read the role graph from mysql.role_edges. Because metadata locks serialize DDL rather than freezing reads, the before/after fingerprint comparison remains the portable drift signal across both engines.

Handling schema drift during catalog extraction

When a column is renamed, a view is replaced, or a table is dropped while your privilege extractor is reading the catalog, the resulting matrix is internally inconsistent. This page shows how to detect that mid-read structural drift and quarantine the affected objects instead of emitting a corrupted baseline that the diff engine reads as mass revocations.

The problem is specific and easy to underestimate: extraction is not atomic with respect to DDL. A migration deploy, an ALTER TABLE, or a CREATE OR REPLACE VIEW from another session can commit between the moment you read a table’s grants and the moment you read its columns. The tuples you emit then describe a schema that no longer exists as a coherent whole. Downstream, the schema validation pipeline that gates the drift diff cannot tell a genuinely revoked grant from one that merely pointed at an object that changed shape mid-scan — so it either blocks a clean audit or, worse, lets the corruption through.

When catalog-time drift handling applies

Use this when extraction runs against a live production catalog during business hours, where migrations, CREATE OR REPLACE, and partition maintenance can commit at any point during a multi-second scan across hundreds of schemas.
Use this when the same target is read by several concurrent workers — the drift window widens with every extra second the extraction transaction stays open, as it does under async privilege batching.
Skip this for a frozen target: a read replica with replication paused, or a maintenance window where DDL is administratively blocked. There the whole-catalog read is already point-in-time and a single snapshot suffices without the before/after fingerprint.

Step 1: Pin an extraction snapshot with a read-only transaction

Open the extraction in a REPEATABLE READ, READ ONLY transaction. This freezes the data snapshot and guarantees the extractor can never issue DDL or DML, but — importantly — it does not freeze catalog visibility (see the gotcha below). Export the snapshot id so every worker in a batch reads the same point in time.

BEGIN ISOLATION LEVEL REPEATABLE READ READ ONLY;
SELECT pg_export_snapshot() AS snapshot_id;

Expected output — a token you pass to sibling workers via SET TRANSACTION SNAPSHOT:

    snapshot_id
-------------------
 00000003-0000001B-1
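
A sibling worker adopts the exported snapshot by opening its own REPEATABLE READ transaction and issuing SET TRANSACTION SNAPSHOT before any other query. The sketch below only builds those statements; the helper name import_snapshot_statements and the token pattern are assumptions for illustration, not part of the original extractor. Because SET TRANSACTION SNAPSHOT takes a string literal rather than a bind parameter, the token is validated before it is embedded.

```python
import re

# Assumption: exported snapshot ids look like "00000003-0000001B-1"
# (hex groups separated by dashes). Adjust if your server prints another form.
_SNAPSHOT_ID = re.compile(r"[0-9A-Fa-f]+(?:-[0-9A-Fa-f]+)*")


def import_snapshot_statements(snapshot_id: str) -> list[str]:
    """Return the statements a sibling worker runs to adopt a shared snapshot.

    The token is checked against a strict pattern because it is interpolated
    into SQL text instead of being passed as a query parameter.
    """
    if not _SNAPSHOT_ID.fullmatch(snapshot_id):
        raise ValueError(f"unexpected snapshot id: {snapshot_id!r}")
    return [
        "BEGIN ISOLATION LEVEL REPEATABLE READ READ ONLY",
        f"SET TRANSACTION SNAPSHOT '{snapshot_id}'",
    ]
```

The exporting transaction has to stay open until every worker has imported the token, since an exported snapshot is only importable while its originating transaction is alive.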

Step 2: Fingerprint the catalog surface

Before touching privileges, capture a structural fingerprint of every extractable object. The fingerprint folds each object’s column set — names, resolved types, and nullability — into one md5, so any ALTER that changes the object’s shape changes its fingerprint. Anchor the scan with the exclusion filters from system catalog query optimization so transient and internal namespaces never enter the baseline.

SELECT
    n.nspname AS schema_name,
    c.relname AS object_name,
    c.relkind AS object_kind,
    md5(string_agg(
        a.attname || ':' ||
        format_type(a.atttypid, a.atttypmod) || ':' ||
        a.attnotnull::text,
        ',' ORDER BY a.attnum
    )) AS column_fingerprint
FROM pg_catalog.pg_class c
JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
JOIN pg_catalog.pg_attribute a
     ON a.attrelid = c.oid AND a.attnum > 0 AND NOT a.attisdropped
WHERE n.nspname NOT LIKE 'pg\_%'
  AND n.nspname <> 'information_schema'
  AND c.relkind IN ('r', 'p', 'v', 'm')
GROUP BY n.nspname, c.relname, c.relkind
ORDER BY schema_name, object_name;

Verification — each row is one relation with a stable hash; re-running against an unchanged catalog reproduces the identical fingerprint set:

 schema_name | object_name | object_kind |     column_fingerprint
-------------+-------------+-------------+----------------------------------
 billing     | invoices    | r           | 6f1e...c3a9
 billing     | ar_summary  | v           | 9b02...771d

Step 3: Read privileges, then re-fingerprint to catch mid-read drift

Read the grant graph — object privileges from information_schema.role_table_grants and membership edges from pg_auth_members — then fingerprint the surface a second time inside the same transaction. Any object whose fingerprint moved between the two reads was altered by a committed DDL statement while the privilege read was in flight.

from dataclasses import dataclass

import psycopg  # psycopg 3

FINGERPRINT_SQL = r"""..."""  # the query from Step 2; raw string so the LIKE pattern 'pg\_%' keeps its backslash
GRANTS_SQL = """
    SELECT grantee, table_schema, table_name, privilege_type, is_grantable
    FROM information_schema.role_table_grants
    WHERE table_schema NOT IN ('pg_catalog', 'information_schema')
    ORDER BY grantee, table_schema, table_name, privilege_type
"""


@dataclass(frozen=True)
class SurfaceObject:
    schema: str
    name: str
    kind: str
    column_fingerprint: str


def fingerprint_surface(cur: psycopg.Cursor) -> dict[tuple[str, str], SurfaceObject]:
    cur.execute(FINGERPRINT_SQL)
    return {(r[0], r[1]): SurfaceObject(*r) for r in cur.fetchall()}


def extract_with_drift_guard(dsn: str) -> dict:
    with psycopg.connect(dsn, autocommit=False) as conn:
        conn.execute("SET TRANSACTION ISOLATION LEVEL REPEATABLE READ, READ ONLY")
        with conn.cursor() as cur:
            before = fingerprint_surface(cur)
            cur.execute(GRANTS_SQL)
            grants = cur.fetchall()
            after = fingerprint_surface(cur)
        conn.rollback()  # read-only work: drop the snapshot cleanly

    shared = before.keys() & after.keys()
    return {
        "grants": grants,
        "drift": {
            "altered": sorted(k for k in shared
                              if before[k].column_fingerprint
                              != after[k].column_fingerprint),
            "appeared": sorted(after.keys() - before.keys()),
            "vanished": sorted(before.keys() - after.keys()),
        },
    }

Expected result — drift.altered, drift.appeared, and drift.vanished are empty on a quiet catalog and populated the instant a migration commits mid-read:

>>> result = extract_with_drift_guard(dsn)
>>> result["drift"]
{'altered': [('billing', 'invoices')], 'appeared': [], 'vanished': []}
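
The comparison itself needs no database, so it can be factored into a pure function and unit-tested in isolation. This is a sketch of that refactoring; diff_surfaces and the plain key-to-fingerprint dictionaries are hypothetical names and shapes, chosen to match the drift structure returned above.

```python
from typing import Dict, List, Tuple

Key = Tuple[str, str]  # (schema, object name)


def diff_surfaces(before: Dict[Key, str], after: Dict[Key, str]) -> Dict[str, List[Key]]:
    """Classify drift between two {key: column_fingerprint} maps."""
    shared = before.keys() & after.keys()
    return {
        # present in both reads, but the column set hashed differently
        "altered": sorted(k for k in shared if before[k] != after[k]),
        # only in the second read: created or renamed into place mid-read
        "appeared": sorted(after.keys() - before.keys()),
        # only in the first read: dropped or renamed away mid-read
        "vanished": sorted(before.keys() - after.keys()),
    }
```

Keeping the classification separate from the cursor handling lets a test feed it two hand-built surfaces and assert on each drift class without a live catalog.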

Step 4: Classify and quarantine drifted objects, do not fail the run

A single drifted object must not invalidate an otherwise clean extraction of hundreds. Split the matrix into a trustworthy set and a quarantine set keyed by drift class, then re-extract only the affected objects. The quarantine record carries a machine-readable reason so the validation pipeline can block publication of just those tuples rather than the whole batch.

def partition_matrix(result: dict) -> dict:
    """Split grant tuples into publishable rows and quarantined rows."""
    # Grants on altered or vanished objects may describe a shape that no
    # longer exists, so they are held back instead of published.
    drifted = set(result["drift"]["altered"]) | set(result["drift"]["vanished"])
    clean, quarantined = [], []
    for row in result["grants"]:
        _grantee, schema, table, _priv, _grantable = row
        if (schema, table) in drifted:
            quarantined.append({"tuple": row, "reason": "surface_drift_mid_read"})
        else:
            clean.append(row)
    return {"clean": clean, "quarantined": quarantined}

Feed quarantined back into a bounded re-extraction of just those (schema, table) keys. Because the reason code is explicit, a persistent quarantine — an object that keeps drifting across retries — surfaces as an operational alert instead of a silent gap, and the retry stays inside the same read-only, timeout-bounded contract described below.
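
A bounded re-extraction loop might look like the sketch below. The callback reextract_one is hypothetical: it stands for whatever re-reads one object with its own before/after fingerprint pair and returns True when the pair matches. The outcome label persistent_drift and the default limits are likewise assumptions, not values from the original pipeline.

```python
import random
import time
from typing import Callable, Dict, Iterable, Tuple

Key = Tuple[str, str]  # (schema, table)


def reextract_bounded(
    keys: Iterable[Key],
    reextract_one: Callable[[Key], bool],  # hypothetical: True when the re-read was stable
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[Key, str]:
    """Retry each quarantined key a fixed number of times, then give up loudly."""
    outcome: Dict[Key, str] = {}
    for key in keys:
        outcome[key] = "persistent_drift"  # stays set if no attempt succeeds
        for attempt in range(max_attempts):
            if reextract_one(key):
                outcome[key] = "stable"
                break
            if attempt < max_attempts - 1:
                # Full jitter: random wait up to an exponentially growing cap.
                sleep(random.uniform(0, base_delay * 2 ** attempt))
    return outcome
```

Keys that end as persistent_drift are the ones to raise as an operational alert; the sleep parameter is injectable so tests can run without waiting.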

Worked example: PostgreSQL 15, a column rename during a nightly extraction

A nightly compliance sync extracts three environments. At 02:14, while the extractor is between its grant read and its second fingerprint against prod, a deploy runs ALTER TABLE billing.invoices RENAME COLUMN amount TO amount_cents.

Before fingerprint for ('billing', 'invoices'): 6f1e...c3a9 (built from amount:numeric:true).
After fingerprint: 2d77...4b0e (built from amount_cents:numeric:true).

extract_with_drift_guard returns drift.altered == [('billing', 'invoices')]. partition_matrix moves the four invoices grant tuples into quarantine with reason surface_drift_mid_read and publishes the remaining 512 tuples as clean. A targeted re-read of billing.invoices a second later returns a stable fingerprint across its own before/after pair, so its grants rejoin the clean matrix. The environment comparison workflow that consumes the baseline therefore never sees the four invoices grants flicker to revoked-then-restored; it only ever compares a coherent snapshot. The resulting manifest, with the quarantine list abbreviated to one of the four tuples:

{
  "snapshot_id": "00000003-0000001B-1",
  "clean_count": 512,
  "quarantined": [
    {"tuple": ["ci_reader", "billing", "invoices", "SELECT", "NO"],
     "reason": "surface_drift_mid_read"}
  ],
  "reextracted": {"billing.invoices": "stable"}
}
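
A manifest of this shape can be assembled directly from the partition_matrix output. The sketch below is one possible builder; build_manifest and the format of its reextracted argument are assumptions for illustration.

```python
import json
from typing import Dict, Tuple

Key = Tuple[str, str]  # (schema, table)


def build_manifest(snapshot_id: str, partitioned: dict, reextracted: Dict[Key, str]) -> str:
    """Serialize the clean/quarantine split into the manifest layout shown above."""
    manifest = {
        "snapshot_id": snapshot_id,
        "clean_count": len(partitioned["clean"]),
        "quarantined": [
            # tuples become JSON arrays; the reason code is carried through unchanged
            {"tuple": list(q["tuple"]), "reason": q["reason"]}
            for q in partitioned["quarantined"]
        ],
        "reextracted": {
            f"{schema}.{name}": status
            for (schema, name), status in sorted(reextracted.items())
        },
    }
    return json.dumps(manifest, indent=2)
```

Only the count of clean tuples is recorded, not the tuples themselves, which keeps the manifest small enough to retain per audit cycle alongside the published baseline.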

Gotchas and engine-specific notes

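PostgreSQL — REPEATABLE READ does not give you one consistent catalog. Internal catalog access (name resolution and catalog-backed functions such as format_type) uses a catalog snapshot that is refreshed as needed, so it reflects DDL committed by another session mid-transaction even under REPEATABLE READ. Explicit SELECTs against pg_class and pg_attribute behave differently: per the MVCC caveats in the PostgreSQL documentation, at REPEATABLE READ they follow the transaction snapshot and do not see concurrently committed catalog rows. A single transaction can therefore mix two views of the schema, so transaction isolation alone will neither hide the drift nor protect you from it. For the Step 3 comparison to register a concurrent ALTER in the column set, the second fingerprint has to run on a snapshot taken after the privilege read, for example under READ COMMITTED or in a fresh transaction. Confirm this on your server version with a deliberate mid-read rename before trusting an empty drift report.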
PostgreSQL — dropped columns leave holes. NOT a.attisdropped is mandatory; a column that was DROPped still occupies an attnum and would otherwise poison the fingerprint with a phantom slot. Include relkind = 'p' to cover partitioned parents alongside ordinary tables ('r').
MySQL — information_schema is not point-in-time. There is no pg_export_snapshot() equivalent; INFORMATION_SCHEMA.COLUMNS reflects live catalog state per statement. Fingerprint from INFORMATION_SCHEMA.COLUMNS (COLUMN_NAME, COLUMN_TYPE, IS_NULLABLE) and read membership from mysql.role_edges (8.0+). Because metadata locks (MDL) serialize DDL against open handles rather than freezing reads, the before/after comparison is the portable drift signal across both engines.
MySQL — role grants live in a different place. Object privileges come from INFORMATION_SCHEMA.SCHEMA_PRIVILEGES / TABLE_PRIVILEGES, and the role graph from mysql.role_edges. Those views do carry IS_GRANTABLE, but GRANTEE is a quoted 'user'@'host' string and there is no grantor column, so the rows are not shaped like role_table_grants. The cross-DB parser adapter must normalize both engines into one tuple shape before this drift check runs against it.
Bound every retry. Run the re-extraction with SET LOCAL statement_timeout = '5s' and SET LOCAL lock_timeout = '1s' inside the extraction transaction (SET LOCAL has no effect outside one), so a drifting object under heavy DDL can never leave the extractor queued behind a DDL lock. Exponential backoff with jitter keeps repeated retries from synchronizing into a thundering herd against a busy primary.
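
For the MySQL notes above, the same fingerprint idea can be applied to rows read from INFORMATION_SCHEMA.COLUMNS. This is a sketch under stated assumptions: rows arrive as (ORDINAL_POSITION, COLUMN_NAME, COLUMN_TYPE, IS_NULLABLE), the function name mysql_column_fingerprint is hypothetical, and IS_NULLABLE is inverted so the joined string has the same name:type:notnull layout as the PostgreSQL query.

```python
import hashlib
from typing import Iterable, Tuple

# (ORDINAL_POSITION, COLUMN_NAME, COLUMN_TYPE, IS_NULLABLE) per column.
MySQLColumn = Tuple[int, str, str, str]


def mysql_column_fingerprint(columns: Iterable[MySQLColumn]) -> str:
    """Hash a table's column set in ordinal order, mirroring the PostgreSQL layout."""
    parts = []
    for _, name, column_type, is_nullable in sorted(columns):
        # IS_NULLABLE is 'YES'/'NO'; the PostgreSQL side records NOT NULL instead.
        not_null = is_nullable.upper() == "NO"
        parts.append(f"{name}:{column_type}:{'true' if not_null else 'false'}")
    return hashlib.md5(",".join(parts).encode("utf-8")).hexdigest()
```

Type spellings differ between the engines, so these hashes are only comparable between two reads of the same MySQL server, which is all the before/after check needs.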

Compliance note

Mid-read drift handling directly serves the completeness and accuracy expectations behind SOC 2 CC6.1 and CC6.3 and PCI DSS Requirement 7: it gives the auditor evidence that the access baseline under review describes one coherent catalog state, not a smear across concurrent migrations. The artifact it produces is a drift-classified extraction manifest containing the exported snapshot_id, the clean tuple count, and the quarantine list with per-object reason codes. That manifest is the evidence needed to show that any object excluded from a given audit cycle was excluded deliberately, with a recorded cause, rather than dropped silently.

Schema Validation Pipelines — the parent gate that consumes the clean/quarantine split this page produces and blocks publication when the quarantine is non-empty.
Extracting user grants from Oracle data dictionary — the same read-consistency problem on Oracle’s DBA_* dictionary views.
Python scripts for async batch privilege scraping — how the shared snapshot id fans out to concurrent workers without widening the drift window.
