fix(tables): repair only true duplicate order_keys, re-key by order_key not position#4913

Closed
TheodoreSpeaks wants to merge 1 commit into staging from fix/repair-dup-order-keys


@TheodoreSpeaks

Collaborator

Summary

  • Follow-up to fix(tables): compare order_key bytewise (COLLATE "C") to stop insert collation errors #4908. The detection in repair-table-order-key-collation.ts was wrong: it walked rows in position order and flagged any table where order_key wasn't strictly increasing. Under TABLES_FRACTIONAL_ORDERING that is normal: position is just an append counter and order_key is authoritative. Every middle insert made with the flag on was therefore a false positive, and the position-based re-key would have scrambled the real order.
  • Now it flags only tables with actual duplicate order_keys (GROUP BY table_id, order_key HAVING count(*) > 1) and re-keys in (order_key, id) display order, preserving what users see while making keys distinct. It never touches position.
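For illustration, the duplicate check can be sketched in memory. This is not the script's query: the `TableRow` shape, the sample rows, and the function name are assumptions made up for the example, and the real detection runs as SQL against user_table_rows.

```typescript
// Hypothetical row shape; field names mirror the columns named in this PR.
interface TableRow {
  id: string;
  tableId: string;
  orderKey: string;
  position: number;
}

// In-memory equivalent of selecting DISTINCT table_id from
// GROUP BY table_id, order_key HAVING count(*) > 1.
function tablesWithDuplicateOrderKeys(rows: TableRow[]): string[] {
  const counts = new Map<string, number>();
  for (const row of rows) {
    // JSON-encode the pair so the composite group key is unambiguous.
    const group = JSON.stringify([row.tableId, row.orderKey]);
    counts.set(group, (counts.get(group) ?? 0) + 1);
  }
  const flagged = new Set<string>();
  counts.forEach((count, group) => {
    if (count > 1) {
      const [tableId] = JSON.parse(group) as [string, string];
      flagged.add(tableId); // the Set plays the role of DISTINCT
    }
  });
  return Array.from(flagged).sort();
}

const sampleRows: TableRow[] = [
  // Healthy table: one middle insert (mid order_key, max position), no duplicates.
  { id: 'r1', tableId: 'healthy', orderKey: 'a0', position: 0 },
  { id: 'r2', tableId: 'healthy', orderKey: 'a1', position: 1 },
  { id: 'r3', tableId: 'healthy', orderKey: 'a0V', position: 2 },
  // Broken table: two separate duplicate groups, still listed only once.
  { id: 'r4', tableId: 'broken', orderKey: 'a0', position: 0 },
  { id: 'r5', tableId: 'broken', orderKey: 'a0', position: 1 },
  { id: 'r6', tableId: 'broken', orderKey: 'a1', position: 2 },
  { id: 'r7', tableId: 'broken', orderKey: 'a1', position: 3 },
];

const flaggedTables = tablesWithDuplicateOrderKeys(sampleRows);
console.log(flaggedTables);
```

The check only counts equal keys and never orders them, so it does not depend on how order_key is sorted.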

Why it matters

Caught on staging: a normal row insert added a healthy table to the re-key list, and inspection showed it had zero duplicates — just a mid-table insert (mid order_key, max position). The old logic would have re-keyed it by position and moved the row to the end.
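A minimal sketch of that scenario, assuming a three-row table with made-up ids and keys. The two checks are in-memory stand-ins for the old and new detection, not the script's SQL, and string comparison here is by code unit, which matches bytewise order for ASCII keys.

```typescript
interface PositionedRow {
  id: string;
  orderKey: string;
  position: number;
}

// Old check: walk rows in position order, flag when order_key is not strictly increasing.
function oldDetectionFlags(rows: PositionedRow[]): boolean {
  const byPosition = rows.slice().sort((a, b) => a.position - b.position);
  for (let i = 1; i < byPosition.length; i++) {
    if (!(byPosition[i - 1].orderKey < byPosition[i].orderKey)) return true;
  }
  return false;
}

// New check: flag only when two rows share the same order_key.
function hasDuplicateOrderKeys(rows: PositionedRow[]): boolean {
  return new Set(rows.map((r) => r.orderKey)).size !== rows.length;
}

// Display order under the flag: (order_key, id).
function displayOrder(rows: PositionedRow[]): string[] {
  return rows
    .slice()
    .sort((a, b) =>
      a.orderKey < b.orderKey ? -1 : a.orderKey > b.orderKey ? 1 : a.id < b.id ? -1 : a.id > b.id ? 1 : 0
    )
    .map((r) => r.id);
}

// Order a position-based re-key would have imposed.
function positionOrder(rows: PositionedRow[]): string[] {
  return rows
    .slice()
    .sort((a, b) => a.position - b.position)
    .map((r) => r.id);
}

// 'inserted' was added between the other two rows: mid order_key, max position.
const healthyTable: PositionedRow[] = [
  { id: 'first', orderKey: 'a0', position: 0 },
  { id: 'last', orderKey: 'a1', position: 1 },
  { id: 'inserted', orderKey: 'a0V', position: 2 },
];

console.log(oldDetectionFlags(healthyTable)); // true: false positive
console.log(hasDuplicateOrderKeys(healthyTable)); // false: table is healthy
console.log(displayOrder(healthyTable).join(', ')); // first, inserted, last
console.log(positionOrder(healthyTable).join(', ')); // first, last, inserted
```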

Verification (staging, en_US.UTF-8, flag on)

  • Before: dry-run flagged 3 tables (incl. one with 0 duplicates — a false positive).
  • After: dry-run flags exactly the 2 tables with genuine duplicate keys (data_10mb: 1018 dup groups / 500k rows; sales_crm: 2 dup groups / 4 rows); the false-positive table is no longer listed.
  • bun run lint clean; tsc --noEmit clean.

Type of Change

  • Bug fix

Testing

Validated with --dry-run against staging (read-only). Real run pending.

Checklist

  • Code follows project style guidelines
  • Self-reviewed my changes
  • Tests added/updated and passing
  • No new warnings introduced
  • I confirm that I have read and agree to the terms outlined in the Contributor License Agreement (CLA)

…ey not position

The repair script detected mis-keying by walking position order and
flagging any table whose order_key disagreed. Under
TABLES_FRACTIONAL_ORDERING that disagreement is normal — position is an
append counter, order_key is authoritative — so every flag-on
middle-insert false-positived, and re-keying by position would scramble
the real order. Now it flags only tables with actual duplicate keys and
re-keys in (order_key, id) display order. Verified against staging: drops
the false positive, keeps the two genuinely-duplicated tables.
@vercel

vercel Bot commented Jun 8, 2026


The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Actions Updated (UTC)
docs Skipped Skipped Jun 8, 2026 10:32pm


@cursor

cursor Bot commented Jun 8, 2026


PR Summary

Medium Risk
One-off data repair script; wrong logic could reorder rows on live tables, but the fix narrows scope and aligns re-keying with display order.

Overview
Fixes repair-table-order-key-collation.ts so it only repairs real corruption and re-keys in the order users actually see.

Detection no longer walks rows in position order and flags non-increasing order_key (which falsely hit healthy middle-insert tables when fractional ordering is on). It now selects tables with duplicate order_key values via GROUP BY table_id, order_key HAVING count(*) > 1.

Re-key order changes from (position, id) to (order_key, id), matching app display order and avoiding scrambling rows whose position is only an append counter. Comments and log text are updated to describe duplicate-key repair rather than broad collation mis-ordering.
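The re-key step described above can be sketched as follows. `standInKeys` is a hypothetical placeholder for the script's nKeysBetween(null, null, n) and is not the fractional-indexing algorithm; it only returns n distinct, ascending ASCII keys so the example is self-contained. Row ids and keys are invented.

```typescript
interface KeyedRow {
  id: string;
  orderKey: string;
}

// Code-unit comparison; equals bytewise (COLLATE "C") order for ASCII keys.
function compareAscii(a: string, b: string): number {
  return a < b ? -1 : a > b ? 1 : 0;
}

// Hypothetical stand-in for nKeysBetween(null, null, n): zero-padded counters.
function standInKeys(n: number): string[] {
  const width = String(Math.max(n - 1, 0)).length;
  const keys: string[] = [];
  for (let i = 0; i < n; i++) {
    let digits = String(i);
    while (digits.length < width) digits = '0' + digits;
    keys.push('k' + digits);
  }
  return keys;
}

// Sort by (order_key, id), then hand out fresh keys in that same order.
function rekeyInDisplayOrder(rows: KeyedRow[]): KeyedRow[] {
  const ordered = rows
    .slice()
    .sort((a, b) => compareAscii(a.orderKey, b.orderKey) || compareAscii(a.id, b.id));
  const fresh = standInKeys(ordered.length);
  return ordered.map((row, i) => ({ id: row.id, orderKey: fresh[i] }));
}

// r1 and r2 share a key; id breaks the tie so the result is deterministic.
const duplicatedRows: KeyedRow[] = [
  { id: 'r2', orderKey: 'a0' },
  { id: 'r1', orderKey: 'a0' },
  { id: 'r4', orderKey: 'a1' },
  { id: 'r3', orderKey: 'a0V' },
];

const rekeyedRows = rekeyInDisplayOrder(duplicatedRows);
console.log(rekeyedRows.map((r) => r.id + '=' + r.orderKey).join(' '));
```

Rows with distinct keys keep their relative order; only the tied rows get an order they did not previously have, fixed by id.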

Reviewed by Cursor Bugbot for commit d7e6666. Bugbot is set up for automated code reviews on this repo. Configure here.

@TheodoreSpeaks

Collaborator Author

@greptile review

@greptile-apps

greptile-apps Bot commented Jun 8, 2026

Contributor

Greptile Summary

This follow-up to #4908 corrects a false-positive detection bug in the repair-table-order-key-collation.ts script: the old logic flagged tables whose order_key sequence wasn't strictly increasing in position order, which is a normal state for mid-table inserts under TABLES_FRACTIONAL_ORDERING. The new logic only flags tables with genuine duplicate order_key values (via GROUP BY table_id, order_key HAVING count(*) > 1), and re-keys them in (order_key, id) display order instead of position order.

  • Detection rewrite: replaces the LEAD()-based inversion check with a GROUP BY … HAVING count(*) > 1 duplicate check, eliminating false positives for healthy mid-insert tables.
  • Re-key order fix: rows are now fetched in (order_key, id) order — matching the app's display order under the feature flag — rather than (position, id) order, which would have scrambled the visible row sequence.
  • DISTINCT outer query: correctly deduplicates table_id when a table has multiple distinct duplicate-key groups, ensuring each table appears once in the repair list.

Confidence Score: 4/5

Safe to merge — the fix correctly narrows the repair scope to only genuinely broken tables and preserves the user-visible row order during re-keying.

The detection and re-key logic are both sound and the staging dry-run validation confirms the false-positive is gone. The one noteworthy gap is that the re-key ORDER BY on order_key does not carry an explicit COLLATE C qualifier; it inherits the column's DDL collation, which is only guaranteed to be bytewise after migration 0228 is applied. Since this is a documented prerequisite and the column-level collation handles it in practice, this is not a blocking concern, but a defensive raw-SQL COLLATE C in the ORDER BY would remove the implicit dependency.

apps/sim/scripts/repair-table-order-key-collation.ts — specifically the re-key ORDER BY at line 112.

Important Files Changed

Filename Overview
apps/sim/scripts/repair-table-order-key-collation.ts Fixes false-positive detection (position-order inversion → duplicate order_key GROUP BY) and re-keys in (order_key, id) display order instead of position order; re-key ORDER BY relies implicitly on migration 0228 having set the column collation to C

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Start script] --> B{--dry-run?}
    B -->|Yes| C[Count rows per table\nno lock, no writes]
    B -->|No| D[Begin transaction]

    subgraph Detection
        Q["GROUP BY table_id, order_key\nHAVING count(*) > 1\n(duplicate order_key only)"]
    end

    A --> Detection
    Detection --> E[pending: list of table_ids]

    E --> F{For each table_id}
    F --> C
    F --> D

    D --> G[pg_advisory_xact_lock\nuser_table_rows_pos:tableId]
    G --> H["SELECT id\nORDER BY order_key COLLATE C ASC, id ASC\n(display order)"]
    H --> I["nKeysBetween(null, null, rows.length)\nfresh evenly-spaced keys"]
    I --> J["Chunked UPDATE FROM VALUES\nmap id to new key"]
    J --> K[Commit]

    C --> L[Log stats]
    K --> L
    L --> F
    F -->|Done| M[Print summary]

Reviews (1): Last reviewed commit: "fix(tables): repair only true duplicate ..." | Re-trigger Greptile

  .from(userTableRows)
  .where(eq(userTableRows.tableId, tableId))
- .orderBy(asc(userTableRows.position), asc(userTableRows.id))
+ .orderBy(asc(userTableRows.orderKey), asc(userTableRows.id))

Contributor


P2 The re-key ORDER BY relies on the column's DDL collation being bytewise (COLLATE "C") — which is only guaranteed after migration 0228 is applied. The previous detection query guarded against this by using explicit COLLATE "C" qualifiers. Using raw SQL here would match that defensive pattern and prevent incorrect ordering (and therefore a scrambled re-key) if the script is ever run in an environment where the migration hasn't landed yet.

Suggested change
- .orderBy(asc(userTableRows.orderKey), asc(userTableRows.id))
+ .orderBy(sql`"order_key" COLLATE "C" ASC`, asc(userTableRows.id))

@greptile-apps

greptile-apps Bot commented Jun 8, 2026

Contributor

Greptile Summary

This PR fixes a false-positive detection bug in the repair-table-order-key-collation.ts one-off script. The old window-function approach flagged any table where order_key was not strictly increasing when walked in position order — normal under TABLES_FRACTIONAL_ORDERING — causing healthy tables to be needlessly re-keyed in the wrong order.

  • Detection is replaced with a GROUP BY table_id, order_key HAVING count(*) > 1 query that catches only true duplicates, leaving mid-insert tables untouched.
  • Re-key ordering is changed from ORDER BY position, id to ORDER BY order_key, id, preserving the actual display order users see rather than the append-counter order.

Confidence Score: 4/5

Safe to merge; the detection and re-key logic are both correct, and the fix is validated against staging data.

The GROUP BY / HAVING detection is collation-agnostic and correctly identifies only true duplicates. Re-keying in (order_key, id) order matches what the app displays. The only gap is that the ORDER BY inside the transaction does not pin the collation to C — it trusts the column's post-migration default — so an accidental out-of-order run would silently re-key by locale order instead of bytewise order.

apps/sim/scripts/repair-table-order-key-collation.ts — the re-key ORDER BY relies on implicit post-migration collation; consider explicit COLLATE C.

Important Files Changed

Filename Overview
apps/sim/scripts/repair-table-order-key-collation.ts Corrects duplicate-detection logic (GROUP BY/HAVING instead of window-function inversion) and switches re-key ordering from position to (order_key, id); the re-key ORDER BY relies on the column's post-migration collation without an explicit COLLATE "C" guard.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Start runRepair] --> B{DATABASE_URL set?}
    B -- No --> C[Exit with error]
    B -- Yes --> D["Detect tables:\nGROUP BY table_id, order_key\nHAVING count(*) > 1"]
    D --> E{--dry-run?}
    E -- Yes --> F["Count rows per table\n(no lock, no write)"]
    F --> G[Log stats, continue]
    E -- No --> H["Acquire pg_advisory_xact_lock\n(user_table_rows_pos:tableId)"]
    H --> I["SELECT id ORDER BY order_key ASC, id ASC\n(preserves display order)"]
    I --> J["nKeysBetween(null, null, count)\ngenerate fresh evenly-spaced keys"]
    J --> K["Chunked UPDATE FROM VALUES\n(5 000-row slices)"]
    K --> L{More chunks?}
    L -- Yes --> K
    L -- No --> M[Commit transaction]
    M --> N{More tables?}
    N -- Yes --> H
    N -- No --> O[Print stats, exit]
    G --> N
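The chunking step in the flowchart can be sketched as below. The 5,000 slice size comes from the flowchart; the `chunk` helper, the `KeyUpdate` shape, and the row count are assumptions for illustration, and the actual UPDATE FROM VALUES statement is not shown in this PR.

```typescript
// Hypothetical (id, new key) pair that one UPDATE ... FROM VALUES slice would carry.
interface KeyUpdate {
  id: string;
  orderKey: string;
}

// Split a list into consecutive slices of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  if (size <= 0 || Math.floor(size) !== size) {
    throw new RangeError('size must be a positive integer');
  }
  const slices: T[][] = [];
  for (let start = 0; start < items.length; start += size) {
    slices.push(items.slice(start, start + size));
  }
  return slices;
}

const CHUNK_SIZE = 5000;

// 12,001 made-up updates: two full slices plus a remainder.
const updates: KeyUpdate[] = [];
for (let i = 0; i < 12001; i++) {
  updates.push({ id: 'row-' + i, orderKey: 'k' + i });
}

const slices = chunk(updates, CHUNK_SIZE);
console.log(slices.map((s) => s.length).join(', ')); // 5000, 5000, 2001
```

Bounded slices keep each statement's VALUES list and parameter count small while all slices still run inside the same transaction.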

Reviews (2): Last reviewed commit: "fix(tables): repair only true duplicate ..." | Re-trigger Greptile

Comment on lines 108 to +112
  const rows = await trx
    .select({ id: userTableRows.id })
    .from(userTableRows)
    .where(eq(userTableRows.tableId, tableId))
-   .orderBy(asc(userTableRows.position), asc(userTableRows.id))
+   .orderBy(asc(userTableRows.orderKey), asc(userTableRows.id))

Contributor


P2 The ORDER BY order_key ASC sort inside the transaction is emitted by Drizzle without an explicit COLLATE "C", so it inherits whatever collation the column currently has. The script's contract is to run after migration 0228 (which changes the column to COLLATE "C"), but there is no runtime guard. If the script is mistakenly run before that migration the re-key will reorder rows in en_US.UTF-8 locale order rather than bytewise order, silently producing a scrambled but distinct key set — the exact failure mode the PR is trying to prevent. Using a raw sql fragment for this one column pins the comparison to bytewise order regardless of when the migration runs.

Suggested change
- const rows = await trx
-   .select({ id: userTableRows.id })
-   .from(userTableRows)
-   .where(eq(userTableRows.tableId, tableId))
-   .orderBy(asc(userTableRows.position), asc(userTableRows.id))
-   .orderBy(asc(userTableRows.orderKey), asc(userTableRows.id))
+ const rows = await trx
+   .select({ id: userTableRows.id })
+   .from(userTableRows)
+   .where(eq(userTableRows.tableId, tableId))
+   .orderBy(sql`${userTableRows.orderKey} COLLATE "C" ASC`, asc(userTableRows.id))
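To see why the collation matters, here is a small sketch comparing bytewise order with a locale-aware order. The sample keys are made up and assumed to be ASCII, and `Intl.Collator('en')` is only a rough stand-in for Postgres's en_US.UTF-8 collation, so the locale line may vary with the runtime's ICU data.

```typescript
// Code-unit comparison; equals bytewise (COLLATE "C") order for ASCII keys.
function compareBytewise(a: string, b: string): number {
  return a < b ? -1 : a > b ? 1 : 0;
}

// Locale-aware comparison, standing in for a non-C database collation.
const compareLocale = new Intl.Collator('en').compare;

// Made-up keys mixing upper-case and lower-case leading characters.
const sampleKeys = ['a0', 'Zz', 'a1', 'Zy'];

const bytewiseOrder = sampleKeys.slice().sort(compareBytewise);
const localeOrder = sampleKeys.slice().sort(compareLocale);

// Bytewise: 'Z' (0x5A) sorts before 'a' (0x61), so the Z-keys come first.
console.log('bytewise:', bytewiseOrder.join(' '));
// Locale-aware: typically puts the a-keys first, i.e. a different row order.
console.log('locale:  ', localeOrder.join(' '));
```

When the two orders disagree, a re-key driven by the locale order hands out distinct keys in a sequence that no longer matches the bytewise display order.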

