nes-datagen: add cursor-jump (NCLP) sample task #320113
Extend nes-datagen with a next-cursor-line prediction task alongside
the existing xtab path. Detects the user's next intentional cursor
move after the request bookmark and emits a training sample with the
production cursor-prediction prompt + the observed jump as the
expected response.
Three sub-modes via --sample-task:
- cursor-same-file: a jump farther than N lines from cursor at
request time
- cursor-cross-file: focus/selection on a different file
- cursor-both: either of the above
Reuses the production cursor-prediction prompt by capturing it via
the telemetry builder and a no-op fetcher; the cross-file target
line is resolved from a request-time content snapshot + post-request
replay so previously-opened targets get a correct line number
instead of being silently labelled :0.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Convert the string-union + as-const tuple to a proper NesDatagenSampleTask
string enum. CLI surface is unchanged: string-enum members keep the
kebab-case wire values ('xtab', 'cursor-same-file', ...). All consumers
(dispatch, fixtures, response metadata typing) updated to reference enum
members instead of string literals.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
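A minimal sketch of the enum conversion described in the commit above, for illustration only. NesDatagenSampleTask, CursorBoth, and the four wire values come from this PR; the other member names and the parseSampleTask helper are assumptions, not the actual code in simulationOptions.ts.

```typescript
// Sketch only: member names other than CursorBoth, and parseSampleTask, are assumptions.
enum NesDatagenSampleTask {
	Xtab = 'xtab',
	CursorSameFile = 'cursor-same-file',
	CursorCrossFile = 'cursor-cross-file',
	CursorBoth = 'cursor-both',
}

// Map a raw CLI string onto an enum member; the wire values stay kebab-case.
function parseSampleTask(raw: string | undefined): NesDatagenSampleTask {
	if (raw === undefined) {
		return NesDatagenSampleTask.Xtab; // xtab is the documented default
	}
	const known = Object.values(NesDatagenSampleTask) as string[];
	if (!known.includes(raw)) {
		throw new Error(`Unknown --sample-task value: ${raw}`);
	}
	return raw as NesDatagenSampleTask;
}
```

Because string enums have no reverse mapping, Object.values yields only the wire values, so the CLI keeps accepting exactly the same strings as before.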
Upward cursor jumps (back to a definition, an import, etc.) are typically tighter than downward jumps after the user has been writing. Lower the default threshold to match. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Drop the NCLP abbreviation in favor of the more descriptive 'cursor-jump' name already used in the production xtab provider. The rename covers the files (cursorJumpPromptStep, cursorJumpResponseStep), the capture request ids, and all surrounding doc comments / test descriptions.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Previously the doc-id→path map was built from only the pre-request slice, and the post-request slice was then re-walked to backfill any documentEncountered entries that arrived later. Pass the whole recording into documentIndexMapping instead so the helper sees every document the user touched in a single pass; the backfill loop is gone. splitRecordingAtRequestTime now also returns the full entries array so both callers can reuse it without re-deriving it from altAction.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
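The single-pass mapping from this commit can be sketched roughly as follows. The entry shapes and the function name documentIndexMappingSketch are simplified assumptions; the real documentIndexMapping helper and log entry types are not shown in this conversation.

```typescript
// Hypothetical, simplified recording entries; the real log entries carry more fields.
type RecordingEntry =
	| { kind: 'documentEncountered'; id: number; relativePath: string }
	| { kind: 'changed'; id: number }
	| { kind: 'selectionChanged'; id: number };

// One pass over the whole recording: a document is registered whether it is
// first encountered before or after the request bookmark, so no backfill is needed.
function documentIndexMappingSketch(wholeRecording: readonly RecordingEntry[]): Map<number, string> {
	const idToPath = new Map<number, string>();
	for (const entry of wholeRecording) {
		if (entry.kind === 'documentEncountered') {
			idToPath.set(entry.id, entry.relativePath);
		}
	}
	return idToPath;
}
```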
Pull request overview
This PR extends the nes-datagen pipeline (Copilot extension test tooling) to generate training samples for the cursor-jump / next-cursor-line prediction model in addition to the existing xtab (edit prediction) samples. It adds a CLI switch to select the sample task, reuses the production cursor-prediction prompt construction to avoid prompt drift, and introduces pure detector/formatter modules to label expected cursor-jump outputs from post-bookmark recordings.
Changes:
- Add --sample-task (xtab / cursor-same-file / cursor-cross-file / cursor-both) plus same-file threshold flags, and route the pipeline accordingly (including parallel worker argument propagation).
- Implement cursor-jump detection + response formatting (same-file and cross-file) with dedicated unit tests.
- Expose production cursor-prediction prompt construction (buildCursorPredictionPrompt) and capture prompt/kept-range for datagen tooling.
| File | Description |
|---|---|
| extensions/copilot/test/pipeline/test/pipeline.spec.ts | Updates pipeline test wiring to pass new nes-datagen options. |
| extensions/copilot/test/pipeline/test/pipeline.e2e.spec.ts | Updates e2e tests to include new nes-datagen options. |
| extensions/copilot/test/pipeline/replayRecording.ts | Extends processed-row shape with post-request recording slices + request-time cursor/content snapshots. |
| extensions/copilot/test/pipeline/pipeline.ts | Adds cursor-jump pipeline path (jump detection → prompt capture → response formatting → sample output). |
| extensions/copilot/test/pipeline/output.ts | Extends sample metadata with task and optional jump info; threads through assembleSample. |
| extensions/copilot/test/pipeline/cursorJump/detectJump.ts | Adds pure same-file / cross-file jump detectors + path normalization helper. |
| extensions/copilot/test/pipeline/cursorJump/detectJump.spec.ts | Unit tests for detectors, normalization, and response formatting. |
| extensions/copilot/test/pipeline/cursorJump/cursorJumpResponseStep.ts | Formats expected cursor-jump assistant outputs and attaches jump metadata. |
| extensions/copilot/test/pipeline/cursorJump/cursorJumpPromptStep.ts | Runs production NES pipeline with a no-op fetcher to capture the real cursor-jump prompt and kept range. |
| extensions/copilot/test/pipeline/alternativeAction/processor.ts | Exposes recording split-at-request logic and broadens doc-id→path mapping to whole recording. |
| extensions/copilot/test/base/simulationOptions.ts | Introduces NesDatagenSampleTask enum, parses new CLI flags, updates help text. |
| extensions/copilot/src/platform/inlineEdits/common/statelessNextEditProvider.ts | Adds fields intended for debug/datagen access to cursor-jump prompt data. |
| extensions/copilot/src/extension/xtab/node/xtabNextCursorPredictor.ts | Exports cursor-jump system prompt and factors out prompt construction (buildCursorPredictionPrompt). |
| extensions/copilot/src/extension/inlineEdits/node/nextEditProviderTelemetry.ts | Adds a getter for accessing stored stateless telemetry on the NES telemetry builder. |
Copilot's findings
- Files reviewed: 14/14 changed files
- Comments generated: 2
    cursorJumpModelName: this._cursorJumpModelName,
    cursorJumpPrompt: this._cursorJumpPrompt ? JSON.stringify(this._cursorJumpPrompt.map(({ role, content }) => ({ role, content }))) : undefined,
    cursorJumpResponse: this._cursorJumpResponse,
    cursorJumpRawMessages: this._cursorJumpPrompt,
    cursorJumpKeptRange: this._cursorJumpKeptRange,
    // If we observed the cross-file selection but couldn't resolve a line
    // number for it, surface that explicitly so callers can drop the sample
    // instead of treating "unknown line" as "line 0".
    if (selectionEntryIndex !== undefined && targetLine === undefined) {
        return { ok: false, reason: 'crossFileTargetLineUnresolved' };
    }

    return { ok: true, value: { kind: 'crossFile', toDocLogId: targetDocId, toRelativePath: relPath, toLine: targetLine } };
Drop the bespoke { ok, value | reason } discriminated union in
detectJump.ts and reuse the existing Result<T, E> from
src/util/common/result. JumpDetectionResult<T> is now just an alias
for Result<T, string>.
Update callers (.isOk(), .err) and the spec file accordingly.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
cursorJumpRawMessages and cursorJumpKeptRange were added to IStatelessNextEditTelemetry so in-process debug / datagen tooling could read them back via getStatelessNextEditTelemetry(). However, LlmNESTelemetryBuilder.build() spreads ...this._statelessNextEditTelemetry into the emitted payload, so those two fields would leak to telemetry sinks. cursorJumpRawMessages can contain full prompt content (source code), which must never leave the process.
Destructure them out before spreading into the build() payload. They remain readable via getStatelessNextEditTelemetry() for tooling. Documented the privacy contract on the IStatelessNextEditTelemetry field declarations so future edits don't forget.
Addresses copilot-pull-request-reviewer feedback on PR #320113.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
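A rough sketch of the destructure-before-spread fix this commit describes. The interface below is a simplified stand-in for IStatelessNextEditTelemetry, and the field types and the buildPayload name are assumptions made for illustration.

```typescript
// Simplified stand-in for IStatelessNextEditTelemetry; field types are assumptions.
interface StatelessTelemetrySketch {
	cursorJumpModelName?: string;
	cursorJumpPrompt?: string; // JSON-stringified form intended for telemetry
	cursorJumpResponse?: string;
	cursorJumpRawMessages?: { role: string; content: string }[]; // in-process only
	cursorJumpKeptRange?: { start: number; endExclusive: number }; // in-process only
}

function buildPayload(stateless: StatelessTelemetrySketch) {
	// Pull the in-process-only fields out first so the rest spread cannot carry them.
	const { cursorJumpRawMessages: _rawMessages, cursorJumpKeptRange: _keptRange, ...emittable } = stateless;
	return { ...emittable };
}
```

The rest pattern also removes the two fields from the inferred return type, so a later attempt to read them off the payload fails at compile time.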
detectCrossFileJump previously returned Result.ok with toLine
undefined when only a focused event was seen for the target doc (no
selectionChanged). That left generateCrossFileResponse to drop the
sample later while the detector still reported a successful jump.
Treat focused-without-selectionChanged as a failed detection
('crossFileTargetNoSelection') so callers can skip early, and tighten
ICrossFileJump.toLine to non-undefined now that ok results always
have a usable line number. Removes the dead error path in
generateCrossFileResponse.
Adds a regression test that focused-only triggers the new error.
Addresses copilot-pull-request-reviewer feedback on PR #320113.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The datagen pipeline previously stashed the raw cursor-jump prompt and keptRange on IStatelessNextEditTelemetry so cursorJumpPromptStep.ts could read them back via LlmNESTelemetryBuilder.getStatelessNextEditTelemetry(). That leaked raw prompts into the telemetry payload (worked around by a destructure-strip hack in LlmNESTelemetryBuilder.build()) and was asymmetric with the xtab path, which captures via InlineEditRequestLogContext.rawMessages.
Move the cursor-jump capture vehicle onto InlineEditRequestLogContext to match xtab:
- Add cursorJumpRawMessages / cursorJumpKeptRange fields and setCursorJumpPrompt(messages, keptRange) to InlineEditRequestLogContext.
- XtabNextCursorPredictor.predictNextCursorPosition now takes a logContext parameter and writes to it directly. The xtabProvider callsite passes the same logContext it already had in scope.
- cursorJumpPromptStep reads from logContext instead of the telemetry builder.
- Remove cursorJumpRawMessages / cursorJumpKeptRange from IStatelessNextEditTelemetry, plus the corresponding setter/getter on StatelessNextEditTelemetryBuilder and the getter on LlmNESTelemetryBuilder.
- Revert the destructure-strip hack in LlmNESTelemetryBuilder.build().
The pre-existing cursorJumpPrompt telemetry field (JSON-stringified, fed by setCursorJumpPrompt(messages)) is intentional and unchanged.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…landing
Selection-based detection treated peek, navigation, IDE auto-scroll, and recursive cursor settling as if they were the user's next intended edit location. The model's job is to predict where the user will EDIT next, so key off the first 'changed' event after the request bookmark instead.
Same-file detector:
- Walks for the first 'changed' on the active doc; uses the first edit's start offset to compute toLine; applies the linesAbove/linesBelow threshold. Bails with editsAnotherFileFirst when a non-active doc is edited first (lets the cross-file detector claim the sample in cursor-both mode). 'selectionChanged' is no longer consulted, so the settle-after-edit filter is gone too; it was a workaround for the selection-based approach.
Cross-file detector:
- Walks for the first 'changed' on a non-active doc; uses the first edit's start offset, resolved against the target doc's snapshot just before applying the event. Drops focused / selectionChanged heuristics and the crossFileTargetNoSelection error path (a focused event without an edit no longer counts; background peek can't pollute the dataset).
buildLineResolver: tightened i <= entryIndex to i < entryIndex so the resolver returns the pre-edit line when entryIndex is itself a 'changed' event. The bound is equivalent for the old selectionChanged caller.
Spec: switched ground-truth events from selChanged to changed; added coverage for first-edit-of-multi-edit, editsAnotherFileFirst, and active-doc-then-other-doc ordering.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
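The edit-based same-file walk described in this commit could look roughly like the sketch below. The entry and result shapes are simplified assumptions, the 'noEditAfterRequest' reason is hypothetical, and the threshold check is left out; only editsAnotherFileFirst and the "first edit's start offset in the pre-edit text" rule are taken from the commit message.

```typescript
// Hypothetical, simplified shapes; the real recording entries and Result type differ.
type Edit = { start: number; endEx: number; text: string };
type Entry =
	| { kind: 'changed'; docId: number; edits: Edit[] }
	| { kind: 'selectionChanged'; docId: number }
	| { kind: 'focused'; docId: number };
type Detection =
	| { ok: true; docId: number; toLine: number }
	| { ok: false; reason: string };

// 0-based line containing `offset`, counted in the pre-edit text.
function lineAtOffset(text: string, offset: number): number {
	let line = 0;
	for (let i = 0; i < offset && i < text.length; i++) {
		if (text[i] === '\n') { line++; }
	}
	return line;
}

// Find the first 'changed' event after the bookmark; selection and focus events are ignored.
function detectSameFileEdit(postRequest: readonly Entry[], activeDocId: number, activeDocText: string): Detection {
	for (const entry of postRequest) {
		if (entry.kind !== 'changed') { continue; }
		if (entry.docId !== activeDocId) {
			// Another file is edited first: leave the sample to the cross-file detector.
			return { ok: false, reason: 'editsAnotherFileFirst' };
		}
		const firstEdit = entry.edits[0];
		if (firstEdit === undefined) { continue; }
		return { ok: true, docId: entry.docId, toLine: lineAtOffset(activeDocText, firstEdit.start) };
	}
	return { ok: false, reason: 'noEditAfterRequest' }; // hypothetical reason string
}
```

Resolving the line against the text as it was before the edit mirrors the i < entryIndex bound: the target line is where the user started typing, not where the text ended up afterwards.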
Every other helper in xtabProvider takes RequestTracingContext (the
{ tracer, logContext, telemetry } bundle). The cursor predictor was the
odd one out, taking the three pieces as separate positional params with
the latter two optional; that asymmetry made the new logContext-capture
plumbing look more invasive than it is and forced an awkward
?.setCursorJumpPrompt chain at the use site.
Switch the predictor to take RequestTracingContext directly:
- Export RequestTracingContext from xtabProvider so the predictor can
type-import it (TS-erased to avoid the runtime circular import).
- predictNextCursorPosition signature collapses from 5 params to 3.
- Drop the optional chains; tracing.telemetry / tracing.logContext are
always present in production and the spec constructs a real bundle.
- Spec adds a createTestTracingContext helper using the cheap
InlineEditRequestLogContext / StatelessNextEditTelemetryBuilder
constructors already used by other inlineEdits specs.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
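To illustrate the shape of the refactor in the commit above, here is a small sketch of a predictor taking the bundle rather than separate optional parameters. The three member interfaces are minimal stand-ins, setCursorJumpModelName and the keptRange tuple type are hypothetical, and the real predictNextCursorPosition parameters are not shown in this conversation.

```typescript
// Hypothetical minimal stand-ins for the tracer, log context, and telemetry builder.
interface Tracer { trace(message: string): void }
interface LogContext { setCursorJumpPrompt(messages: string[], keptRange: [number, number]): void }
interface TelemetryBuilder { setCursorJumpModelName(name: string): void }

// The bundle shared by the helpers, instead of three positional parameters.
interface RequestTracingContext {
	readonly tracer: Tracer;
	readonly logContext: LogContext;
	readonly telemetry: TelemetryBuilder;
}

function predictNextCursorPositionSketch(
	messages: string[],
	keptRange: [number, number],
	tracing: RequestTracingContext,
): void {
	tracing.tracer.trace('captured cursor-jump prompt');
	tracing.telemetry.setCursorJumpModelName('hypothetical-model');
	// All members are required, so no optional chaining is needed at the use site.
	tracing.logContext.setCursorJumpPrompt(messages, keptRange);
}
```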
ulugbekna
left a comment
- have you added tests similar to what we have for the xtab samples?
    const recordingAfterRequest = recording.slice(recordingIdxOfRequestTime + 1);

    return {
        entries: recording,
I don't like this being called entries -- it should be "wholeRecording"
        readonly toLine: number;
    }

    export type JumpDetectionResult<T> = Result<T, string>;
this looks like it could've been inlined?
     * cursor variants, never `CursorBoth`.
     */
    readonly task: Exclude<NesDatagenSampleTask, NesDatagenSampleTask.CursorBoth>;
    /** Present only when {@link task} is a cursor-* task. */
I wonder if having a union type to avoid optional fields would be a better engineering decision here.
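One way to read this suggestion is a discriminated union keyed on task, sketched below. Only task, toRelativePath, and toLine appear in the PR; the type names, the jump field, and describeSample are hypothetical, and string literals stand in for the enum members to keep the sketch short.

```typescript
// Hypothetical shapes; string literals stand in for NesDatagenSampleTask members.
interface JumpInfo { toRelativePath: string; toLine: number }

type SampleMetadata =
	| { task: 'xtab' }
	| { task: 'cursor-same-file' | 'cursor-cross-file'; jump: JumpInfo };

function describeSample(meta: SampleMetadata): string {
	switch (meta.task) {
		case 'xtab':
			return 'xtab sample';
		case 'cursor-same-file':
		case 'cursor-cross-file':
			// `jump` is guaranteed in this branch, so no undefined check is needed.
			return `${meta.task} -> ${meta.jump.toRelativePath}:${meta.jump.toLine}`;
	}
}
```

With this shape an xtab sample cannot carry jump info and a cursor sample cannot omit it, which is the invariant the optional field currently documents only in a comment.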
Extends nes-datagen to generate training data for the cursor-jump (next-cursor-line) prediction model alongside the existing xtab path.
CLI
A new --sample-task flag selects what to generate:
- xtab (default)
- cursor-same-file
- cursor-cross-file
- cursor-both
Same-file detection is gated by --same-file-jump-min-above (default 2) and --same-file-jump-min-below (default 5); a recording is only used if the post-bookmark cursor lands farther than that window from the request-time cursor.
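The same-file gate can be sketched as a small predicate. The defaults 2 and 5 come from the flag descriptions above; the names below are hypothetical, and reading "farther than" as a strict comparison is an assumption.

```typescript
// Hypothetical names; the defaults mirror the documented flag defaults.
interface SameFileThresholds {
	minAbove: number; // --same-file-jump-min-above
	minBelow: number; // --same-file-jump-min-below
}

const DEFAULT_THRESHOLDS: SameFileThresholds = { minAbove: 2, minBelow: 5 };

// True when the landing line is outside the window around the request-time cursor.
// Assumes "farther than" means strictly greater than the threshold.
function isSameFileJump(fromLine: number, toLine: number, t: SameFileThresholds = DEFAULT_THRESHOLDS): boolean {
	const delta = toLine - fromLine;
	return delta < 0 ? -delta > t.minAbove : delta > t.minBelow;
}
```

The window is asymmetric on purpose: an upward move of three lines already counts as a jump, while a downward move needs to cover more than five lines.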
How it works
The datagen path reuses the production XtabNextCursorPredictor to build the prompt verbatim; drift between datagen and production would corrupt the training distribution. The prompt is captured via a no-op fetcher + telemetry builder; the expected response is built by pure detector / formatter modules from the post-bookmark recording.
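The capture idea can be illustrated with a fetcher that records what it is asked to send and never calls a model. Everything in this sketch is a hypothetical simplification: the Fetcher interface, the message shape, and predictWith stand in for the production predictor, and the capture lives on the fetcher itself rather than on the telemetry builder or log context used by the PR.

```typescript
// Hypothetical, simplified types; not the production fetcher or message interfaces.
type ChatMessage = { role: 'system' | 'user'; content: string };

interface Fetcher {
	fetch(messages: ChatMessage[]): Promise<string>;
}

class CapturingNoopFetcher implements Fetcher {
	captured: ChatMessage[] | undefined;

	fetch(messages: ChatMessage[]): Promise<string> {
		this.captured = messages; // record the prompt exactly as it was built
		return Promise.resolve(''); // never contact a model
	}
}

// Stand-in for the production predictor: builds the prompt and hands it to the fetcher.
async function predictWith(fetcher: Fetcher, documentText: string): Promise<string> {
	const messages: ChatMessage[] = [
		{ role: 'system', content: 'hypothetical system prompt' },
		{ role: 'user', content: documentText },
	];
	return fetcher.fetch(messages);
}
```

Because the prompt is whatever the predictor actually built, datagen has no second copy of the prompt-building logic that could drift from production.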
Polish commits
After the feature landed, four follow-up commits address review feedback:
- Replace the SAMPLE_TASK_VALUES tuple with a NesDatagenSampleTask string enum
- Lower the default upward same-file jump threshold to 2
- Rename NCLP to cursor-jump everywhere (files via git mv, ids / comments / test descriptions updated)
- Pass the whole recording to documentIndexMapping instead of slicing then backfilling post-request documentEncountered entries
Tests
- 68 / 1 skipped on npx vitest run test/pipeline. Notable: detectJump.spec.ts covers same-file thresholds, cross-file resolution, and the (regression-tested) bug where multiple changed edits were applied ascending in-place instead of descending.
- npx tsgo --noEmit --project tsconfig.json in extensions/copilot.
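The ordering bug mentioned in the test notes can be reproduced in a few lines. This sketch assumes every edit's offsets refer to the same pre-edit text and that edits do not overlap; the type and function names are hypothetical.

```typescript
// Hypothetical edit shape: offsets refer to the ORIGINAL text, ranges do not overlap.
type OffsetEdit = { start: number; endEx: number; text: string };

// Apply from the highest start offset down, so an edit never shifts the
// offsets of edits that are still waiting to be applied.
function applyEditsDescending(original: string, edits: readonly OffsetEdit[]): string {
	const descending = [...edits].sort((a, b) => b.start - a.start);
	let result = original;
	for (const edit of descending) {
		result = result.slice(0, edit.start) + edit.text + result.slice(edit.endEx);
	}
	return result;
}

const updated = applyEditsDescending('abcdef', [
	{ start: 0, endEx: 1, text: 'XYZ' },
	{ start: 4, endEx: 5, text: 'Q' },
]);
console.log(updated); // XYZbcdQf
```

Applied in ascending order and in place, the first edit would grow the text by two characters and the second edit would then replace the wrong character, giving XYZbQdef.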