fix(tables): repair only true duplicate order_keys, re-key by order_key not position#4913
fix(tables): repair only true duplicate order_keys, re-key by order_key not position#4913TheodoreSpeaks wants to merge 1 commit into
Conversation
…ey not position The repair script detected mis-keying by walking position order and flagging any table whose order_key disagreed. Under TABLES_FRACTIONAL_ORDERING that disagreement is normal — position is an append counter, order_key is authoritative — so every flag-on middle-insert false-positived, and re-keying by position would scramble the real order. Now it flags only tables with actual duplicate keys and re-keys in (order_key, id) display order. Verified against staging: drops the false positive, keeps the two genuinely-duplicated tables.
|
The latest updates on your projects. Learn more about Vercel for GitHub. |
PR SummaryMedium Risk Overview Detection no longer walks rows in Re-key order changes from Reviewed by Cursor Bugbot for commit d7e6666. Bugbot is set up for automated code reviews on this repo. Configure here. |
|
@greptile review |
Greptile SummaryThis follow-up to #4908 corrects a false-positive detection bug in the
Confidence Score: 4/5Safe to merge — the fix correctly narrows the repair scope to only genuinely broken tables and preserves the user-visible row order during re-keying. The detection and re-key logic are both sound and the staging dry-run validation confirms the false-positive is gone. The one noteworthy gap is that the re-key ORDER BY on order_key does not carry an explicit COLLATE C qualifier; it inherits the column's DDL collation, which is only guaranteed to be bytewise after migration 0228 is applied. Since this is a documented prerequisite and the column-level collation handles it in practice, this is not a blocking concern, but a defensive raw-SQL COLLATE C in the ORDER BY would remove the implicit dependency. apps/sim/scripts/repair-table-order-key-collation.ts — specifically the re-key ORDER BY at line 112. Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[Start script] --> B{--dry-run?}
B -->|Yes| C[Count rows per table\nno lock, no writes]
B -->|No| D[Begin transaction]
subgraph Detection
Q["GROUP BY table_id, order_key\nHAVING count(*) > 1\n(duplicate order_key only)"]
end
A --> Detection
Detection --> E[pending: list of table_ids]
E --> F{For each table_id}
F --> C
F --> D
D --> G[pg_advisory_xact_lock\nuser_table_rows_pos:tableId]
G --> H["SELECT id\nORDER BY order_key COLLATE C ASC, id ASC\n(display order)"]
H --> I["nKeysBetween(null, null, rows.length)\nfresh evenly-spaced keys"]
I --> J["Chunked UPDATE FROM VALUES\nmap id to new key"]
J --> K[Commit]
C --> L[Log stats]
K --> L
L --> F
F -->|Done| M[Print summary]
Reviews (1): Last reviewed commit: "fix(tables): repair only true duplicate ..." | Re-trigger Greptile |
| .from(userTableRows) | ||
| .where(eq(userTableRows.tableId, tableId)) | ||
| .orderBy(asc(userTableRows.position), asc(userTableRows.id)) | ||
| .orderBy(asc(userTableRows.orderKey), asc(userTableRows.id)) |
There was a problem hiding this comment.
The re-key
ORDER BY relies on the column's DDL collation being bytewise (COLLATE "C") — which is only guaranteed after migration 0228 is applied. The previous detection query guarded against this by using explicit COLLATE "C" qualifiers. Using raw SQL here would match that defensive pattern and prevent incorrect ordering (and therefore a scrambled re-key) if the script is ever run in an environment where the migration hasn't landed yet.
| .orderBy(asc(userTableRows.orderKey), asc(userTableRows.id)) | |
| .orderBy(sql`"order_key" COLLATE "C" ASC`, asc(userTableRows.id)) |
Greptile SummaryThis PR fixes a false-positive detection bug in the
Confidence Score: 4/5Safe to merge; the detection and re-key logic are both correct, and the fix is validated against staging data. The GROUP BY / HAVING detection is collation-agnostic and correctly identifies only true duplicates. Re-keying in (order_key, id) order matches what the app displays. The only gap is that the ORDER BY inside the transaction does not pin the collation to C — it trusts the column's post-migration default — so an accidental out-of-order run would silently re-key by locale order instead of bytewise order. apps/sim/scripts/repair-table-order-key-collation.ts — the re-key ORDER BY relies on implicit post-migration collation; consider explicit COLLATE C. Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[Start runRepair] --> B{DATABASE_URL set?}
B -- No --> C[Exit with error]
B -- Yes --> D["Detect tables:\nGROUP BY table_id, order_key\nHAVING count(*) > 1"]
D --> E{--dry-run?}
E -- Yes --> F["Count rows per table\n(no lock, no write)"]
F --> G[Log stats, continue]
E -- No --> H["Acquire pg_advisory_xact_lock\n(user_table_rows_pos:tableId)"]
H --> I["SELECT id ORDER BY order_key ASC, id ASC\n(preserves display order)"]
I --> J["nKeysBetween(null, null, count)\ngenerate fresh evenly-spaced keys"]
J --> K["Chunked UPDATE FROM VALUES\n(5 000-row slices)"]
K --> L{More chunks?}
L -- Yes --> K
L -- No --> M[Commit transaction]
M --> N{More tables?}
N -- Yes --> H
N -- No --> O[Print stats, exit]
G --> N
Reviews (2): Last reviewed commit: "fix(tables): repair only true duplicate ..." | Re-trigger Greptile |
| const rows = await trx | ||
| .select({ id: userTableRows.id }) | ||
| .from(userTableRows) | ||
| .where(eq(userTableRows.tableId, tableId)) | ||
| .orderBy(asc(userTableRows.position), asc(userTableRows.id)) | ||
| .orderBy(asc(userTableRows.orderKey), asc(userTableRows.id)) |
There was a problem hiding this comment.
The
ORDER BY order_key ASC sort inside the transaction is emitted by Drizzle without an explicit COLLATE "C", so it inherits whatever collation the column currently has. The script's contract is to run after migration 0228 (which changes the column to COLLATE "C"), but there is no runtime guard. If the script is mistakenly run before that migration the re-key will reorder rows in en_US.UTF-8 locale order rather than bytewise order, silently producing a scrambled but distinct key set — the exact failure mode the PR is trying to prevent. Using a raw sql fragment for this one column pins the comparison to bytewise order regardless of when the migration runs.
| const rows = await trx | |
| .select({ id: userTableRows.id }) | |
| .from(userTableRows) | |
| .where(eq(userTableRows.tableId, tableId)) | |
| .orderBy(asc(userTableRows.position), asc(userTableRows.id)) | |
| .orderBy(asc(userTableRows.orderKey), asc(userTableRows.id)) | |
| const rows = await trx | |
| .select({ id: userTableRows.id }) | |
| .from(userTableRows) | |
| .where(eq(userTableRows.tableId, tableId)) | |
| .orderBy(sql`${userTableRows.orderKey} COLLATE "C" ASC`, asc(userTableRows.id)) |
Summary
repair-table-order-key-collation.tsdetection was wrong: it walked rows inpositionorder and flagged any table whereorder_keywasn't strictly increasing. UnderTABLES_FRACTIONAL_ORDERINGthat's normal —positionis just an append counter,order_keyis authoritative — so every flag-on middle-insert false-positived, and theposition-based re-key would have scrambled the real order.order_keys (GROUP BY table_id, order_key HAVING count(*) > 1) and re-keys in(order_key, id)display order, preserving what users see while making keys distinct. Never touchesposition.Why it matters
Caught on staging: a normal row insert added a healthy table to the re-key list, and inspection showed it had zero duplicates — just a mid-table insert (mid
order_key, maxposition). The old logic would have re-keyed it by position and moved the row to the end.Verification (staging, en_US.UTF-8, flag on)
data_10mb: 1018 dup groups / 500k rows;sales_crm: 2 dup groups / 4 rows); the false-positive table is no longer listed.bun run lintclean;tsc --noEmitclean.Type of Change
Testing
Validated with
--dry-runagainst staging (read-only). Real run pending.Checklist