nes-datagen: add cursor-jump (NCLP) sample task #320113

Draft
ulugbekna wants to merge 11 commits into main from
agents/nes-datagen-currently-only-supports-generating-t-52f1de8c

@ulugbekna
Contributor

Extends nes-datagen to generate training data for the cursor-jump
(next-cursor-line) prediction model alongside the existing xtab path.

CLI

A new --sample-task flag selects what to generate:

| value | what it emits |
| --- | --- |
| xtab (default) | unchanged xtab samples |
| cursor-same-file | cursor-jump samples where the jump stays in the active file |
| cursor-cross-file | cursor-jump samples where the jump goes to another file |
| cursor-both | both of the above |

Same-file detection is gated by --same-file-jump-min-above (default
2) and --same-file-jump-min-below (default 5); a recording is
only used if the post-bookmark cursor lands farther than that window
from the request-time cursor.
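
A minimal sketch of that gating, assuming the thresholds are line counts and that "farther than" means strictly greater; the helper name and its signature are hypothetical and not taken from the PR:

```typescript
// Hypothetical helper illustrating the same-file window check.
// Assumption: a jump qualifies only when it lands strictly outside the window.
interface SameFileThresholds {
	minAbove: number; // mirrors --same-file-jump-min-above (default 2)
	minBelow: number; // mirrors --same-file-jump-min-below (default 5)
}

const DEFAULT_THRESHOLDS: SameFileThresholds = { minAbove: 2, minBelow: 5 };

/** Returns true when targetLine lies outside the window around requestLine. */
function isFarEnoughJump(
	requestLine: number,
	targetLine: number,
	thresholds: SameFileThresholds = DEFAULT_THRESHOLDS,
): boolean {
	const delta = targetLine - requestLine;
	// Negative delta: the jump goes up; positive delta: the jump goes down.
	return delta < 0 ? -delta > thresholds.minAbove : delta > thresholds.minBelow;
}
```

With the defaults, a jump from line 10 to line 7 would qualify, while a jump from line 10 to line 15 would not.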

How it works

The datagen path reuses the production XtabNextCursorPredictor to
build the prompt verbatim; drift between datagen and production would
corrupt the training distribution. The prompt is captured via a
no-op fetcher + telemetry builder; the expected response is built by
pure detector / formatter modules from the post-bookmark recording.

                  
  recording
      |
      v
  detectJump (pure)  -->  DetectedJump
    - same-file vs cross-file
    - applies edits in correct (descending) order
      |
      v
  cursorJumpPromptStep
    - reuses production XtabNextCursorPredictor
    - reads back prompt from telemetry builder
      |
      v
  cursorJumpResponseStep
    - same-file: keptRange
    - cross-file: path:line

Polish commits

After the feature landed, four follow-up commits address review
feedback:

  • replace SAMPLE_TASK_VALUES tuple with a NesDatagenSampleTask
    string enum
  • lower the --same-file-jump-min-above default to 2
  • rename NCLP to cursor-jump everywhere (files via git mv, ids /
    comments / test descriptions updated)
  • pass the whole recording to documentIndexMapping instead of
    slicing then backfilling post-request documentEncountered entries

Tests

68 passed / 1 skipped in npx vitest run test/pipeline. Notable:

  • detectJump.spec.ts covers same-file thresholds, cross-file
    resolution, and the (regression-tested) bug where multiple
    changed edits were applied ascending in-place instead of
    descending.
  • Typecheck clean: npx tsgo --noEmit --project tsconfig.json in
    extensions/copilot.
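
The ordering bug named in the first bullet can be illustrated with a small sketch. It assumes every edit's offsets refer to the same pre-edit text and that edits do not overlap; the OffsetEdit shape and applyEdits name are hypothetical, not the PR's types:

```typescript
// Hypothetical edit shape: replace [start, endExclusive) with newText.
interface OffsetEdit {
	start: number;
	endExclusive: number;
	newText: string;
}

/**
 * Applies non-overlapping edits whose offsets all refer to the original text.
 * Working from the highest start offset down means an applied edit never
 * shifts the offsets of the edits that are still pending.
 */
function applyEdits(text: string, edits: readonly OffsetEdit[]): string {
	const descending = edits.slice().sort((a, b) => b.start - a.start);
	let result = text;
	for (const edit of descending) {
		result = result.slice(0, edit.start) + edit.newText + result.slice(edit.endExclusive);
	}
	return result;
}
```

Applying the same edits in ascending order without adjusting offsets would land the later edits at shifted positions, which is the failure mode the regression test guards against.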

ulugbekna and others added 5 commits June 5, 2026 16:26
Extend nes-datagen with a next-cursor-line prediction task alongside
the existing xtab path. Detects the user's next intentional cursor
move after the request bookmark and emits a training sample with the
production cursor-prediction prompt + the observed jump as the
expected response.

Three sub-modes via --sample-task:
  - cursor-same-file: a jump farther than N lines from cursor at
    request time
  - cursor-cross-file: focus/selection on a different file
  - cursor-both: either of the above

Reuses the production cursor-prediction prompt by capturing it via
the telemetry builder and a no-op fetcher; the cross-file target
line is resolved from a request-time content snapshot + post-request
replay so previously-opened targets get a correct line number
instead of being silently labelled :0.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Convert the string-union + as-const tuple to a proper NesDatagenSampleTask
string enum. CLI surface is unchanged; string-enum members keep the
kebab-case wire values ('xtab', 'cursor-same-file', ...). All consumers
(dispatch, fixtures, response metadata typing) updated to reference enum
members instead of string literals.
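
A sketch of what such an enum could look like. The wire values come from the CLI table and CursorBoth appears in the reviewed code; the other member names and the parseSampleTask helper are assumptions made for illustration:

```typescript
// Member names other than CursorBoth are guesses; the string values are the CLI wire values.
enum NesDatagenSampleTask {
	Xtab = 'xtab',
	CursorSameFile = 'cursor-same-file',
	CursorCrossFile = 'cursor-cross-file',
	CursorBoth = 'cursor-both',
}

const ALL_SAMPLE_TASKS: readonly NesDatagenSampleTask[] = [
	NesDatagenSampleTask.Xtab,
	NesDatagenSampleTask.CursorSameFile,
	NesDatagenSampleTask.CursorCrossFile,
	NesDatagenSampleTask.CursorBoth,
];

/** Hypothetical parser: maps a raw --sample-task value to an enum member. */
function parseSampleTask(raw: string | undefined): NesDatagenSampleTask {
	if (raw === undefined) {
		return NesDatagenSampleTask.Xtab; // xtab is the documented default
	}
	for (const task of ALL_SAMPLE_TASKS) {
		if (task === raw) {
			return task;
		}
	}
	throw new Error(`Unknown --sample-task value: ${raw}`);
}
```

Because the members keep the kebab-case strings, existing command lines continue to work while call sites compare against enum members.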

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Upward cursor jumps (back to a definition, an import, etc.) are
typically tighter than downward jumps after the user has been
writing. Lower the default threshold to match.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Drop the NCLP abbreviation in favor of the more descriptive
'cursor-jump' name already used in the production xtab provider.
The rename covers the step modules (cursorJumpPromptStep,
cursorJumpResponseStep), the capture request ids,
and all surrounding doc comments / test descriptions.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Previously the doc-id to path map was built from only the
pre-request slice, and the post-request slice was then re-walked to
backfill any documentEncountered entries that arrived later. Pass the
whole recording into documentIndexMapping instead so the helper sees
every document the user touched in a single pass; the backfill loop is
gone.

splitRecordingAtRequestTime now also returns the full entries array so
both callers can reuse it without re-deriving it from altAction.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings June 5, 2026 14:36
Copilot AI left a comment

Pull request overview

This PR extends the nes-datagen pipeline (Copilot extension test tooling) to generate training samples for the cursor-jump / next-cursor-line prediction model in addition to the existing xtab (edit prediction) samples. It adds a CLI switch to select the sample task, reuses the production cursor-prediction prompt construction to avoid prompt drift, and introduces pure detector/formatter modules to label expected cursor-jump outputs from post-bookmark recordings.

Changes:

  • Add --sample-task (xtab / cursor-same-file / cursor-cross-file / cursor-both) plus same-file threshold flags, and route the pipeline accordingly (including parallel worker argument propagation).
  • Implement cursor-jump detection + response formatting (same-file and cross-file) with dedicated unit tests.
  • Expose production cursor-prediction prompt construction (buildCursorPredictionPrompt) and capture prompt/kept-range for datagen tooling.
| File | Description |
| --- | --- |
| extensions/copilot/test/pipeline/test/pipeline.spec.ts | Updates pipeline test wiring to pass new nes-datagen options. |
| extensions/copilot/test/pipeline/test/pipeline.e2e.spec.ts | Updates e2e tests to include new nes-datagen options. |
| extensions/copilot/test/pipeline/replayRecording.ts | Extends processed-row shape with post-request recording slices + request-time cursor/content snapshots. |
| extensions/copilot/test/pipeline/pipeline.ts | Adds cursor-jump pipeline path (jump detection → prompt capture → response formatting → sample output). |
| extensions/copilot/test/pipeline/output.ts | Extends sample metadata with task and optional jump info; threads through assembleSample. |
| extensions/copilot/test/pipeline/cursorJump/detectJump.ts | Adds pure same-file / cross-file jump detectors + path normalization helper. |
| extensions/copilot/test/pipeline/cursorJump/detectJump.spec.ts | Unit tests for detectors, normalization, and response formatting. |
| extensions/copilot/test/pipeline/cursorJump/cursorJumpResponseStep.ts | Formats expected cursor-jump assistant outputs and attaches jump metadata. |
| extensions/copilot/test/pipeline/cursorJump/cursorJumpPromptStep.ts | Runs production NES pipeline with a no-op fetcher to capture the real cursor-jump prompt and kept range. |
| extensions/copilot/test/pipeline/alternativeAction/processor.ts | Exposes recording split-at-request logic and broadens doc-id→path mapping to whole recording. |
| extensions/copilot/test/base/simulationOptions.ts | Introduces NesDatagenSampleTask enum, parses new CLI flags, updates help text. |
| extensions/copilot/src/platform/inlineEdits/common/statelessNextEditProvider.ts | Adds fields intended for debug/datagen access to cursor-jump prompt data. |
| extensions/copilot/src/extension/xtab/node/xtabNextCursorPredictor.ts | Exports cursor-jump system prompt and factors out prompt construction (buildCursorPredictionPrompt). |
| extensions/copilot/src/extension/inlineEdits/node/nextEditProviderTelemetry.ts | Adds a getter for accessing stored stateless telemetry on the NES telemetry builder. |

Copilot's findings

  • Files reviewed: 14/14 changed files
  • Comments generated: 2

Comment on lines +522 to +526
cursorJumpModelName: this._cursorJumpModelName,
cursorJumpPrompt: this._cursorJumpPrompt ? JSON.stringify(this._cursorJumpPrompt.map(({ role, content }) => ({ role, content }))) : undefined,
cursorJumpResponse: this._cursorJumpResponse,
cursorJumpRawMessages: this._cursorJumpPrompt,
cursorJumpKeptRange: this._cursorJumpKeptRange,
Comment on lines +175 to +182
// If we observed the cross-file selection but couldn't resolve a line
// number for it, surface that explicitly so callers can drop the sample
// instead of treating "unknown line" as "line 0".
if (selectionEntryIndex !== undefined && targetLine === undefined) {
	return { ok: false, reason: 'crossFileTargetLineUnresolved' };
}

return { ok: true, value: { kind: 'crossFile', toDocLogId: targetDocId, toRelativePath: relPath, toLine: targetLine } };
ulugbekna and others added 6 commits June 5, 2026 16:44
Drop the bespoke { ok, value | reason } discriminated union in
detectJump.ts and reuse the existing Result<T, E> from
src/util/common/result. JumpDetectionResult<T> is now just an alias
for Result<T, string>.

Updated the call sites (.isOk(), .err) and the spec file accordingly.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
cursorJumpRawMessages and cursorJumpKeptRange were added to
IStatelessNextEditTelemetry so in-process debug / datagen tooling
could read them back via getStatelessNextEditTelemetry(). However
LlmNESTelemetryBuilder.build() spreads ...this._statelessNextEditTelemetry
into the emitted payload, so those two fields would leak to telemetry
sinks; cursorJumpRawMessages can contain full prompt content (source
code), which must never leave the process.

Destructure them out before spreading into the build() payload. They
remain readable via getStatelessNextEditTelemetry() for tooling.
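
The destructure-then-spread approach can be sketched as follows. The interface below is a cut-down stand-in for IStatelessNextEditTelemetry with guessed field types, and toEmittedPayload is a hypothetical name rather than the real build() method:

```typescript
// Cut-down stand-in for the telemetry record; field types are assumptions.
interface StatelessTelemetrySketch {
	cursorJumpModelName?: string;
	cursorJumpResponse?: string;
	cursorJumpRawMessages?: { role: string; content: string }[]; // may hold source code
	cursorJumpKeptRange?: [number, number];
}

/**
 * Builds the payload that would be emitted. The two in-process-only fields
 * are pulled out by destructuring so the rest-spread cannot carry them along.
 */
function toEmittedPayload(telemetry: StatelessTelemetrySketch) {
	const {
		cursorJumpRawMessages: _rawMessages, // intentionally dropped
		cursorJumpKeptRange: _keptRange, // intentionally dropped
		...emittable
	} = telemetry;
	return { ...emittable };
}
```

The original object is left untouched, so in-process tooling can still read the two fields from it.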

Documented the privacy contract on the IStatelessNextEditTelemetry
field declarations so future edits don't forget.

Addresses copilot-pull-request-reviewer feedback on PR #320113.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
detectCrossFileJump previously returned Result.ok with toLine
undefined when only a focused event was seen for the target doc (no
selectionChanged). That left generateCrossFileResponse to drop the
sample later while the detector still reported a successful jump.

Treat focused-without-selectionChanged as a failed detection
('crossFileTargetNoSelection') so callers can skip early, and tighten
ICrossFileJump.toLine to non-undefined now that ok results always
have a usable line number. Removes the dead error path in
generateCrossFileResponse.

Adds a regression test that focused-only triggers the new error.

Addresses copilot-pull-request-reviewer feedback on PR #320113.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The datagen pipeline previously stashed the raw cursor-jump prompt and
keptRange on IStatelessNextEditTelemetry so cursorJumpPromptStep.ts could
read them back via LlmNESTelemetryBuilder.getStatelessNextEditTelemetry().
That leaked raw prompts into the telemetry payload (worked around by a
destructure-strip hack in LlmNESTelemetryBuilder.build()) and was
asymmetric with the xtab path, which captures via
InlineEditRequestLogContext.rawMessages.

Move the cursor-jump capture vehicle onto InlineEditRequestLogContext to
match xtab:

- Add cursorJumpRawMessages / cursorJumpKeptRange fields and
  setCursorJumpPrompt(messages, keptRange) to InlineEditRequestLogContext.
- XtabNextCursorPredictor.predictNextCursorPosition now takes a logContext
  parameter and writes to it directly. The xtabProvider callsite passes
  the same logContext it already had in scope.
- cursorJumpPromptStep reads from logContext instead of the telemetry
  builder.
- Remove cursorJumpRawMessages / cursorJumpKeptRange from
  IStatelessNextEditTelemetry, plus the corresponding setter/getter on
  StatelessNextEditTelemetryBuilder and the getter on
  LlmNESTelemetryBuilder.
- Revert the destructure-strip hack in LlmNESTelemetryBuilder.build().

The pre-existing cursorJumpPrompt telemetry field (JSON-stringified, fed
by setCursorJumpPrompt(messages)) is intentional and unchanged.
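
A rough sketch of the capture flow this commit describes: the predictor writes the prompt to the log context before fetching, so a no-op fetcher is enough for datagen. Only setCursorJumpPrompt, cursorJumpRawMessages and cursorJumpKeptRange are names from the commit message; the class, the message and range shapes, the prompt text, and the synchronous fetcher are all simplifying assumptions:

```typescript
// Assumed shapes; the real message and range types are not shown in the PR text.
interface ChatMessageSketch { role: 'system' | 'user'; content: string }
interface KeptRangeSketch { startLine: number; endLineExclusive: number }

// Stand-in for InlineEditRequestLogContext, reduced to the capture fields.
class LogContextSketch {
	cursorJumpRawMessages: readonly ChatMessageSketch[] | undefined = undefined;
	cursorJumpKeptRange: KeptRangeSketch | undefined = undefined;

	setCursorJumpPrompt(messages: readonly ChatMessageSketch[], keptRange: KeptRangeSketch): void {
		this.cursorJumpRawMessages = messages;
		this.cursorJumpKeptRange = keptRange;
	}
}

type FetcherSketch = (messages: readonly ChatMessageSketch[]) => string | undefined;

// Datagen never needs a model answer, so the fetcher returns nothing.
const noopFetcher: FetcherSketch = () => undefined;

/** Builds a placeholder prompt, records it on the log context, then fetches. */
function predictNextCursorLineSketch(
	documentText: string,
	keptRange: KeptRangeSketch,
	logContext: LogContextSketch,
	fetcher: FetcherSketch,
): string | undefined {
	const messages: ChatMessageSketch[] = [
		{ role: 'system', content: 'placeholder system prompt' },
		{ role: 'user', content: documentText },
	];
	logContext.setCursorJumpPrompt(messages, keptRange);
	return fetcher(messages);
}
```

After a call with noopFetcher, the caller can read the captured prompt and kept range straight from the log context, which mirrors how the commit says the xtab path captures rawMessages.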

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…landing

Selection-based detection treated peek, navigation, IDE auto-scroll, and
recursive cursor settling as if they were the user's next intended edit
location. The model's job is to predict where the user will EDIT next, so
key off the first 'changed' event after the request bookmark instead.

Same-file detector:
- Walks for the first 'changed' on the active doc; uses the first edit's
  start offset to compute toLine; applies the linesAbove/linesBelow
  threshold. Bails with editsAnotherFileFirst when a non-active doc is
  edited first (lets the cross-file detector claim the sample in
  cursor-both mode). 'selectionChanged' is no longer consulted, so the
  settle-after-edit filter is gone too; it was a workaround for the
  selection-based approach.

Cross-file detector:
- Walks for the first 'changed' on a non-active doc; uses the first
  edit's start offset, resolved against the target doc's snapshot
  just-before applying the event. Drops focused / selectionChanged
  heuristics and the crossFileTargetNoSelection error path (a focused
  event without an edit no longer counts; background peek can't
  pollute the dataset).
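
The same-file walk described above might look roughly like this. The entry shape is a simplified, hypothetical stand-in for the recording format; 'editsAnotherFileFirst' is the reason named in the commit, while 'noEditAfterRequest' is invented for the sketch, and the tagged result object stands in for the Result type the PR actually uses:

```typescript
// Simplified, hypothetical recording entries for the slice after the request bookmark.
type LogEntrySketch =
	| { kind: 'changed'; docId: number; editStartOffsets: number[] }
	| { kind: 'selectionChanged' | 'focused'; docId: number };

type FirstEditResult =
	| { ok: true; entryIndex: number; startOffset: number }
	| { ok: false; reason: string };

/** Finds the first 'changed' event and checks that it touches the active doc. */
function findFirstSameFileEdit(
	afterRequest: readonly LogEntrySketch[],
	activeDocId: number,
): FirstEditResult {
	for (let i = 0; i < afterRequest.length; i++) {
		const entry = afterRequest[i];
		if (entry.kind !== 'changed' || entry.editStartOffsets.length === 0) {
			continue; // selection / focus events no longer count as a jump
		}
		if (entry.docId !== activeDocId) {
			// Another file was edited first; leave it to the cross-file detector.
			return { ok: false, reason: 'editsAnotherFileFirst' };
		}
		// Only the first edit of a multi-edit event determines the target.
		return { ok: true, entryIndex: i, startOffset: entry.editStartOffsets[0] };
	}
	return { ok: false, reason: 'noEditAfterRequest' };
}
```

The line-distance threshold would then be applied to the line resolved from startOffset; that step is omitted here.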

buildLineResolver: tightened i <= entryIndex to i < entryIndex so the
resolver returns the pre-edit line when entryIndex is itself a 'changed'
event. The bound is equivalent for the old selectionChanged caller.
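
To make the bound concrete, here is a minimal sketch of a resolver that replays only the entries strictly before entryIndex. It is not the PR's buildLineResolver: the entry shape is hypothetical, each 'changed' entry carries a single edit for brevity, and lines are assumed to be 0-based:

```typescript
// Hypothetical, simplified shapes: one edit per 'changed' entry.
interface OffsetEditSketch { start: number; endExclusive: number; newText: string }
type RecordingEntrySketch =
	| { kind: 'changed'; edit: OffsetEditSketch }
	| { kind: 'selectionChanged' };

/**
 * Returns the 0-based line of `offset` in the document as it looked just
 * before entries[entryIndex] was applied.
 */
function resolveLineBeforeEntry(
	initialText: string,
	entries: readonly RecordingEntrySketch[],
	entryIndex: number,
	offset: number,
): number {
	let text = initialText;
	// Strict bound (i < entryIndex): the entry at entryIndex itself is not replayed.
	for (let i = 0; i < entryIndex; i++) {
		const entry = entries[i];
		if (entry.kind === 'changed') {
			const { start, endExclusive, newText } = entry.edit;
			text = text.slice(0, start) + newText + text.slice(endExclusive);
		}
	}
	// Count newline characters that precede the offset.
	let line = 0;
	const end = Math.min(offset, text.length);
	for (let k = 0; k < end; k++) {
		if (text.charCodeAt(k) === 10) {
			line++;
		}
	}
	return line;
}
```

For a 'selectionChanged' entry the inclusive and strict bounds give the same text because that entry changes nothing, which is why the old caller is unaffected.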

Spec: switched ground-truth events from selChanged to changed; added
coverage for first-edit-of-multi-edit, editsAnotherFileFirst, and
active-doc-then-other-doc ordering.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Every other helper in xtabProvider takes RequestTracingContext (the
{ tracer, logContext, telemetry } bundle). The cursor predictor was the
odd one out, taking the three pieces as separate positional params with
the latter two optional; that asymmetry made the new logContext-capture
plumbing look more invasive than it is and forced an awkward
?.setCursorJumpPrompt chain at the use site.

Switch the predictor to take RequestTracingContext directly:
- Export RequestTracingContext from xtabProvider so the predictor can
  type-import it (TS-erased to avoid the runtime circular import).
- predictNextCursorPosition signature collapses from 5 params to 3.
- Drop the optional chains; tracing.telemetry / tracing.logContext are
  always present in production and the spec constructs a real bundle.
- Spec adds a createTestTracingContext helper using the cheap
  InlineEditRequestLogContext / StatelessNextEditTelemetryBuilder
  constructors already used by other inlineEdits specs.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@ulugbekna ulugbekna left a comment

  • have you added tests similar to what we have for the xtab samples?

const recordingAfterRequest = recording.slice(recordingIdxOfRequestTime + 1);

return {
entries: recording,

I don't like this being called entries -- it should be "wholeRecording"

readonly toLine: number;
}

export type JumpDetectionResult<T> = Result<T, string>;

this looks like it could've been inlined?

* cursor variants, never `CursorBoth`.
*/
readonly task: Exclude<NesDatagenSampleTask, NesDatagenSampleTask.CursorBoth>;
/** Present only when {@link task} is a cursor-* task. */

I wonder if having a union type to avoid optional fields would be a better engineering decision here.
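
One way the union suggested in this comment could look, as a sketch only. The enum member names other than CursorBoth, the JumpInfoSketch fields, and the summarize helper are hypothetical and not taken from the PR:

```typescript
// Member names other than CursorBoth are assumptions.
enum NesDatagenSampleTask {
	Xtab = 'xtab',
	CursorSameFile = 'cursor-same-file',
	CursorCrossFile = 'cursor-cross-file',
	CursorBoth = 'cursor-both',
}

// Hypothetical jump payload.
interface JumpInfoSketch {
	toLine: number;
	toRelativePath?: string;
}

// Each variant carries exactly the fields that make sense for its task,
// so `jump` is required on cursor samples and absent on xtab samples.
type SampleMetadataSketch =
	| { task: NesDatagenSampleTask.Xtab }
	| {
		task: NesDatagenSampleTask.CursorSameFile | NesDatagenSampleTask.CursorCrossFile;
		jump: JumpInfoSketch;
	};

function summarize(meta: SampleMetadataSketch): string {
	if (meta.task === NesDatagenSampleTask.Xtab) {
		return 'xtab';
	}
	// Narrowed to the cursor variant: `jump` is known to exist here.
	return `${meta.task} -> line ${meta.jump.toLine}`;
}
```

The compiler then rejects an xtab sample that carries jump info and a cursor sample that lacks it, which an optional field cannot express.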
