Reduce TypeForm recognition slowdown: 2 identifier fast-reject filters by davidfstr · Pull Request #21596 · python/mypy

davidfstr · 2026-06-05T22:39:24Z

References #21262.

Includes 2 filters from the original combined 7-filter PR:

Reduce TypeForm recognition slowdown #21585

Summary

Enabling TypeForm made SemanticAnalyzer.try_parse_as_type_expression run
eagerly on every expression in certain syntactic positions (~2.84M calls per
self-check). Fewer than 5% of those calls reach the expensive full-parse block
(expr_to_analyzed_type + isolated_error_analysis), and ~91% of those full
parses fail: pure wasted work.

This PR adds 2 cheap, early-reject filters for the StrExpr-identifier
case, plus a reordering of the existing early-reject checks by decreasing
rejection frequency. Together the two filters eliminate 8% of full
parses (2,548 → 2,343) per self-check.

2 infrastructure commits:

misc/perf_compare.py: Add options/behaviors
TypeForm: Add instrumentation of full parses

followed by 3 key commits, all operating within the str_value.isidentifier()
branch of try_parse_as_type_expression:

Filter A: isinstance(node, Var) + more conditions
Reorder the 3 mutually-exclusive isinstance checks by decreasing rejection frequency.
Filter B: isinstance(node, (FuncDef, OverloadedFuncDef, MypyFile))

Looking at the str_value.isidentifier() branch alone,
84% (244 → 39) of failing full parses are eliminated per self-check.
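
The two percentages above can be cross-checked against the per-filter rejection counts given in the commit messages (157 for Filter A, 48 for Filter B). A quick arithmetic sketch, using only figures stated in this PR:

```python
# Figures taken from the PR description and commit messages.
filter_a_rejections = 157  # identifier strings naming value Vars
filter_b_rejections = 48   # identifier strings naming functions/modules
eliminated = filter_a_rejections + filter_b_rejections

full_parses_before, full_parses_after = 2548, 2343
failing_before, failing_after = 244, 39

# Both before/after pairs should differ by exactly the combined rejections.
assert full_parses_before - full_parses_after == eliminated
assert failing_before - failing_after == eliminated

pct_of_all = 100 * eliminated / full_parses_before
pct_of_failing = 100 * eliminated / failing_before
print(f"{eliminated} eliminated: {pct_of_all:.1f}% of all full parses, "
      f"{pct_of_failing:.1f}% of failing identifier-branch parses")
```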

Performance

misc/perf_compare.py, single worker, paired per-round deltas.

CPU time (canonical, lowest-variance) — master vs branch tip, n=100

python misc/perf_compare.py --warmup-runs 3 --num-runs 100 -j 3 \
    --metric cpu --workers1 <master> <tip>

n=100: −7.6 ms ±5.9 (−0.28%)

Significant (CI excludes 0). This is the net effect of the branch.

Wall-clock — master vs branch tip

python misc/perf_compare.py --warmup-runs 3 --num-runs 400 -j 3 <master> <tip>

n=400: −3.6 ms ±4.3 (−0.26%), CI [−7.9, +0.7].

Borderline: the CI just barely includes 0, so this result is not significant
on its own. It is consistent with a real per-call win partly masked by
multi-worker wall-clock noise.

Correctness

All tests pass.

Note that the var_is_typing_special_form helper needed to be extended to
recognize stringified forms of typing.Self so that Filter A would not
incorrectly reject a stringified 'Self' annotation, regressing the
testSelfRecognizedInOtherSyntacticLocations test.

Open Questions

Should the 2 infrastructure commits be moved to a separate PR?
The full commit messages (after the subject line) on the 3 key commits
are rather verbose. Let me know if you'd like me to trim them down,
perhaps by removing everything after the subject.

Specifically:

* Median is reported in addition to the existing mean+stdev. The median is significantly more resistant to skew by outliers.
* --metric {wall,cpu} (default wall): Enables profiling using CPU time rather than wall-clock time. CPU profiling has roughly half the coefficient of variation of wall-clock profiling at equal run count.
* --workers1: Forces MYPY_NUM_WORKERS=1 (rather than the default 4) to cut CPU scheduling variance. Strongly recommended when using --metric cpu.
* --warmup-runs N (default 1): Configurable number of leading cold runs to discard. Previously was always 1. Higher warmup-run counts decrease outliers that skew the reported mean.
* A new "Paired deltas vs <first commit>" section is added to the report, showing per-round paired differencing against the first commit to cancel round-level common-mode noise, reducing variance. Reported as median +/-95% CI.

Also:

* --cache-binaries (default false): Caches each commit's compiled clone to avoid ~5min recompile whenever comparing the same commit multiple times.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
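
To illustrate why per-round paired differencing reduces variance, here is a minimal sketch on synthetic timings. It is not the code in misc/perf_compare.py: the function names are hypothetical, the data is made up, and the percentile bootstrap is only one plausible way to get a 95% CI around a median (the PR does not state which method the script uses).

```python
import random
import statistics

def paired_deltas(baseline, candidate):
    """Per-round differences; each round measures both commits back to back."""
    if len(baseline) != len(candidate):
        raise ValueError("paired samples must have equal length")
    return [c - b for b, c in zip(baseline, candidate)]

def median_with_bootstrap_ci(deltas, n_resamples=2000, seed=0):
    """Median of deltas plus a 95% percentile-bootstrap interval.

    Assumption: the bootstrap is used for illustration only; the actual
    CI method in misc/perf_compare.py is not stated in the PR.
    """
    rng = random.Random(seed)
    medians = sorted(
        statistics.median(rng.choices(deltas, k=len(deltas)))
        for _ in range(n_resamples)
    )
    lo = medians[int(0.025 * n_resamples)]
    hi = medians[int(0.975 * n_resamples) - 1]
    return statistics.median(deltas), lo, hi

# Synthetic data: a shared per-round drift (common-mode noise) dominates
# the raw timings, while the true difference is a small constant (-8 ms).
rng = random.Random(42)
drift = [rng.gauss(0, 50) for _ in range(100)]
baseline = [2700 + d + rng.gauss(0, 5) for d in drift]
candidate = [2700 + d - 8 + rng.gauss(0, 5) for d in drift]

deltas = paired_deltas(baseline, candidate)
med, lo, hi = median_with_bootstrap_ci(deltas)
print(f"paired median delta: {med:.1f} ms, 95% CI [{lo:.1f}, {hi:.1f}]")
print(f"stdev of raw timings:   {statistics.stdev(baseline):.1f} ms")
print(f"stdev of paired deltas: {statistics.stdev(deltas):.1f} ms")
```

Because the drift term appears in both measurements of a round, it cancels in the subtraction, so the deltas are far less noisy than either raw series.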

…_parse_as_type_expression()

Specifically:

- If you set the MYPY_TYPEFORM_PROFILE_FULL_PARSE environment variable, mypy will output a .tsv to that filepath characterizing the kinds of Expressions that try_parse_as_type_expression() in semanal.py had to fully parse because they were not rejected early.
- A misc/analyze_typeform_full_parse_profile.py script is added which takes those .tsvs and prints an expression-time summary (by total time) plus top-N descriptors per FAIL class.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…s_type_expression()

Add a fast-rejection filter to SemanticAnalyzer.try_parse_as_type_expression(): a string literal that is an identifier naming a Var whose declared type is a concrete Instance (and is not a typing special form) is a value -- a local, parameter, or module-level constant -- never a type expression. Reject it before the expensive full-parse block (expr_to_analyzed_type + isolated_error_analysis).

On the mypy self-check this filter rejects 157 of the 381 identifier-string literals that currently reach the full-parse block (e.g. "__doc__", "__name__", enum/constant members like "ROUND_DOWN", "GEN_CREATED"), all of which were failing full parses -- pure wasted work.

Insertion point chosen empirically. The filter is placed AFTER the existing PlaceholderNode and unbound-tvar checks rather than before them. Its only expensive conjunct (get_proper_type(node.type)) runs solely for Var nodes, and all Var nodes already survive both earlier checks, so position cannot change how often the expensive part runs -- only how often the cheap isinstance(node, Var) conjunct is evaluated. Because 951 of the identifier-strings reaching this block are unbound type variables, evaluating the filter before the tvar check would force an extra isinstance onto ~951 nodes it can never catch. perf_compare.py (--metric cpu --workers1 --num-runs 100) confirms: paired median vs baseline was -15.6ms +/-4.6 here, vs -12.9ms (before placeholder) and -9.3ms (before tvar) -- matching the eval-count model's predicted ordering.

Also add typing.Self / typing_extensions.Self to var_is_typing_special_form(). Self is a _SpecialForm-typed Var, so without this guard the new filter would wrongly reject a stringified "Self" type annotation (regressing testSelfRecognizedInOtherSyntacticLocations). This guard is a correctness prerequisite of the filter and is committed together with it.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
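
The shape of the Var-value filter described in this commit message can be sketched as below. This is a simplified model, not mypy's code: `Instance`, `AnyType`, and `Var` are small stand-in classes, `is_value_var` is a hypothetical name, the `get_proper_type` step is omitted, and the special-form set lists only the two `Self` names the commit mentions (the real helper presumably covers more).

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Instance:
    """Stand-in for a concrete class type such as builtins.str."""
    type_fullname: str

@dataclass
class AnyType:
    """Stand-in for a declared type that is not a concrete Instance."""

@dataclass
class Var:
    """Stand-in for a symbol-table node describing a variable."""
    fullname: str
    type: Optional[object] = None

# Only the names mentioned in the commit message (assumption: the real
# var_is_typing_special_form recognizes other special forms too).
SPECIAL_FORM_FULLNAMES = {"typing.Self", "typing_extensions.Self"}

def var_is_typing_special_form(var: Var) -> bool:
    return var.fullname in SPECIAL_FORM_FULLNAMES

def is_value_var(node: object) -> bool:
    """True if an identifier string naming `node` can be rejected early."""
    return (
        isinstance(node, Var)                     # cheap conjunct first
        and isinstance(node.type, Instance)       # declared type is concrete
        and not var_is_typing_special_form(node)  # keep "Self" parseable
    )

doc = Var("builtins.__doc__", Instance("builtins.str"))
self_form = Var("typing.Self", Instance("typing._SpecialForm"))
untyped = Var("mod.x", AnyType())
print(is_value_var(doc), is_value_var(self_form), is_value_var(untyped))
# prints: True False False
```

The `self_form` case shows why the guard matters: `Self` is itself a Var with a concrete Instance type, so without the special-form check it would be rejected as a plain value.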

…frequency

Move the unbound-type-variable check ahead of the PlaceholderNode check (and ahead of the Var-value filter added in the previous commit) in SemanticAnalyzer.try_parse_as_type_expression(). Final order:

unbound type variable -> value Var -> placeholder

These three checks are mutually exclusive -- a node is at most one of a TypeVarExpr/ParamSpecExpr, a Var, or a PlaceholderNode -- so reordering cannot change which expressions are rejected (verified: check-typeform and testsemanal unchanged). Ordering them by descending rejection frequency, as measured on mypy's self-check (unbound type vars ~951 >> value Vars ~157 > placeholders ~23), lets the commonest rejections exit first and minimizes total check evaluations (~2700 -> ~1750 cheap isinstance calls over the self-check).

The win is below perf_compare's noise floor on its own (~10us), but the reordering is free and behavior-preserving, and it makes the final ordering self-documenting. A rationale comment is added at the head of the block.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
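
The eval-count reasoning can be made concrete with a small model: when checks are mutually exclusive and each node exits at the first check that rejects it, the total number of checks run depends only on the order. The sketch below uses the rejection counts from this commit message; the function name is hypothetical, and because it ignores nodes that survive every check (`survivors=0`), its totals are illustrative and are not expected to match the ~2700 -> ~1750 figures quoted above.

```python
def count_check_evaluations(order, rejections, survivors=0):
    """Total checks run when each node exits at the first rejecting check.

    `rejections` maps check name -> number of nodes that check rejects
    (checks are assumed mutually exclusive); `survivors` is the number of
    nodes that pass every check.
    """
    remaining = sum(rejections.values()) + survivors
    total = 0
    for check in order:
        total += remaining            # every node still in play runs this check
        remaining -= rejections[check]
    return total

rejections = {"tvar": 951, "value_var": 157, "placeholder": 23}
before = count_check_evaluations(["placeholder", "tvar", "value_var"], rejections)
after = count_check_evaluations(["tvar", "value_var", "placeholder"], rejections)
print(before, after)  # prints: 2396 1334
```

Sorting by descending rejection count minimizes this total, since the largest group is removed before any later check has to look at it.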

…ression()

Add a second fast-rejection filter: a string literal that is an identifier naming a FuncDef, OverloadedFuncDef, or MypyFile is a function or module, never a type expression. Reject it before the expensive full-parse block.

On mypy's self-check this rejects 48 of the identifier-string literals that reach the full-parse block (e.g. builtin functions like "classmethod", "staticmethod", "hash"; user functions; module names like "platform"), all of which were failing full parses.

Unlike the Var-value filter, this check is a single isinstance with no expensive follow-on work, and FuncDef/OverloadedFuncDef/MypyFile are mutually exclusive with the other early-reject node kinds, so it is freely positionable and its rejection count is order-independent. It is placed by descending rejection frequency: after the Var-value filter (~157) and before the placeholder check (~23), i.e. the final order is

unbound type variable (~951) -> value Var (~157) -> function/module (~48) -> placeholder (~23)

No companion guard is needed (a function or module name is never a valid type, so nothing valid is rejected; check-typeform and testsemanal unchanged).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
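
The final check order can be pictured as an ordered chain of isinstance tests. The sketch below is an assumption-laden simplification: the classes are empty stand-ins that only borrow mypy's node names, `early_reject_reason` is a hypothetical helper, and the real type-variable and Var checks carry extra conditions (boundness, declared type, special forms) that are left out here.

```python
from typing import Optional

# Empty stand-ins named after mypy node classes (not the real classes).
class TypeVarExpr: ...
class ParamSpecExpr: ...
class Var: ...
class FuncDef: ...
class OverloadedFuncDef: ...
class MypyFile: ...
class PlaceholderNode: ...
class TypeInfo: ...

# Ordered by descending rejection frequency, as in the commit message.
EARLY_REJECT_CHAIN = [
    ("unbound type variable", (TypeVarExpr, ParamSpecExpr)),
    ("value Var", (Var,)),
    ("function/module", (FuncDef, OverloadedFuncDef, MypyFile)),
    ("placeholder", (PlaceholderNode,)),
]

def early_reject_reason(node: object) -> Optional[str]:
    """Return the label of the first matching check, or None to full-parse."""
    for label, node_types in EARLY_REJECT_CHAIN:
        if isinstance(node, node_types):
            return label
    return None

print(early_reject_reason(FuncDef()))   # prints: function/module
print(early_reject_reason(TypeInfo()))  # prints: None
```

A node such as a class (`TypeInfo` here) matches none of the checks and falls through to the full parse, which is the case the filters are meant to preserve.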

github-actions · 2026-06-05T22:56:08Z

According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅

davidfstr and others added 5 commits June 5, 2026 13:16

davidfstr wants to merge 5 commits into python:master from davidfstr:f/typeform_complete--take3.1
