fix: Boyer-Moore bad character shift was dead code in for-loop #14770
Open
anyncfunction wants to merge 1 commit into
The bad_character_heuristic() method used a for-loop with an assignment to the loop variable i, which was immediately overwritten by the next iteration. This caused the algorithm to degrade from O(n/m) to O(n*m) naive search. Changed to a while-loop so the shift actually takes effect. Added max(i+1, shift) guard to prevent backward skips when the mismatched character appears to the right of the mismatch in the pattern. Added edge case doctests.
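The difference between the two loop shapes can be seen in a small standalone sketch. This is not the code from the PR: the function names `rightmost_index`, `first_mismatch`, `search_for_loop`, and `search_while_loop` are hypothetical, and the alignment counter exists only to make the effect visible. It assumes the shift is computed as the mismatch position in the text minus the rightmost position of the mismatched character in the pattern, as the description suggests.

```python
def rightmost_index(pattern: str, char: str) -> int:
    """Rightmost index of char in pattern, or -1 if it does not occur."""
    return pattern.rfind(char)


def first_mismatch(text: str, pattern: str, start: int) -> int:
    """Text index of the first mismatch, scanning the pattern right to left.

    Returns -1 when the pattern matches the text at alignment `start`.
    """
    for j in range(len(pattern) - 1, -1, -1):
        if pattern[j] != text[start + j]:
            return start + j
    return -1


def search_for_loop(text: str, pattern: str) -> tuple[list[int], int]:
    """Buggy shape: the reassignment of i is discarded by the next iteration."""
    positions, alignments = [], 0
    for i in range(len(text) - len(pattern) + 1):
        alignments += 1
        mismatch_index = first_mismatch(text, pattern, i)
        if mismatch_index == -1:
            positions.append(i)
        else:
            match_index = rightmost_index(pattern, text[mismatch_index])
            i = mismatch_index - match_index  # dead: range() supplies the next i
    return positions, alignments


def search_while_loop(text: str, pattern: str) -> tuple[list[int], int]:
    """Fixed shape: the loop owns i, so the computed shift is actually applied."""
    positions, alignments = [], 0
    i = 0
    last = len(text) - len(pattern)
    while i <= last:
        alignments += 1
        mismatch_index = first_mismatch(text, pattern, i)
        if mismatch_index == -1:
            positions.append(i)
            i += 1
        else:
            match_index = rightmost_index(pattern, text[mismatch_index])
            # Guard: never move backward (or stand still) when the mismatched
            # character occurs to the right of the mismatch in the pattern.
            i = max(i + 1, mismatch_index - match_index)
    return positions, alignments


if __name__ == "__main__":
    text, pattern = "ABAAABCDBBABCDDEBCABC", "ABC"
    print(search_for_loop(text, pattern))
    print(search_while_loop(text, pattern))
```

Both functions report the same match positions, but the for-loop version visits every alignment while the while-loop version skips some. An input such as text "AAABAA" with pattern "BAA" exercises the guard: the first mismatch yields a negative shift target, and max(i + 1, ...) keeps the search moving forward.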
Description
The bad_character_heuristic() method used a for-loop with an assignment to the loop variable i, which was immediately overwritten by the next iteration. This caused the algorithm to degrade from O(n/m) to O(n*m) naive search -- the bad character shift was effectively dead code.

Changes
- Changed for i in range(...) to a while loop so the shift actually takes effect
- Added a max(i + 1, mismatch_index - match_index) guard to prevent backward skips when the mismatched character appears to the right of the mismatch in the pattern

Verification
All 12 doctests pass: