fix(folderparser): use csv.reader to handle RFC 4180 quoted fields in annotation CSVs #490
Open
AbdulrahmanXAmer wants to merge 2 commits into
… annotation CSVs
Split-on-comma parsing silently corrupted file_name when filenames or labels contained commas inside quoted fields. Switch to csv.reader. Raw line text preserved for server reconstruction. Adds regression tests for both regular and _classes.csv paths.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Problem
`_parseAnnotationCSV` used `line.split(",")` to extract filenames. This silently corrupts `file_name` when a filename contains commas inside a quoted field, which is valid per RFC 4180. Any dataset with such filenames fails to link images to their annotations.
Fix
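As a rough illustration of the difference between the two parsing approaches, the sketch below runs both on a single row. The sample row and variable names are made up for this example and are not taken from the repository.

```python
import csv

# Hypothetical annotation row: the filename contains a comma inside quotes.
line = '"images/cat, sitting.jpg",120,80,cat'

# Naive split breaks the quoted filename at the embedded comma.
naive_fields = line.split(",")
naive_file_name = naive_fields[0]

# csv.reader honours RFC 4180 quoting and returns the field intact.
parsed_fields = next(csv.reader([line]))
parsed_file_name = parsed_fields[0]

print(naive_file_name)   # "images/cat
print(parsed_file_name)  # images/cat, sitting.jpg
```

The naive result keeps the opening quote and drops everything after the first comma, so it can never match the real image name on disk.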
Replace the manual split with `csv.reader` (stdlib). Raw line text is preserved verbatim for server upload reconstruction. The file is opened in text mode, so `\r\n` normalises to `\n` (backward compatibility with Windows CSVs is unchanged).
Tests
- `test_parse_csv_quoted_filename`: regular CSV with a comma-containing filename
- `test_parse_multilabel_csv_quoted_filename`: `_classes.csv` with a comma-containing filename

🤖 Generated with Claude Code
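For reviewers who want to try the idea in isolation, here is a minimal sketch of a parser that keeps the raw line next to the parsed filename, along with a regression check in the spirit of the tests named above. `parse_annotation_csv`, its return shape, the column layout, and the `_sketch` test name are all assumptions for illustration. They are not the actual `_parseAnnotationCSV` implementation or the tests in this PR.

```python
import csv
import os
import tempfile


def parse_annotation_csv(path):
    """Hypothetical stand-in for the real parser; names and return shape are assumed."""
    rows = []
    # Default text mode uses universal newlines, so "\r\n" is read as "\n".
    with open(path, "r", encoding="utf-8") as f:
        lines = f.read().split("\n")
    header, data_lines = lines[0], lines[1:]
    for raw in data_lines:
        if not raw.strip():
            continue  # skip blank lines, e.g. the one after a trailing newline
        # Parse one line at a time so the raw text can be kept verbatim.
        # Limitation of this sketch: quoted fields spanning several lines are not handled.
        fields = next(csv.reader([raw]))
        rows.append({"file_name": fields[0].strip(), "line": raw})
    return header, rows


def test_parse_csv_quoted_filename_sketch():
    content = b'filename,width,height,class\r\n"cat, sitting.jpg",120,80,cat\r\n'
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "_annotations.csv")
        with open(path, "wb") as f:  # binary write keeps the Windows line endings
            f.write(content)
        header, rows = parse_annotation_csv(path)
    assert header == "filename,width,height,class"
    assert rows == [
        {"file_name": "cat, sitting.jpg", "line": '"cat, sitting.jpg",120,80,cat'}
    ]


test_parse_csv_quoted_filename_sketch()
```

The check writes the fixture with `\r\n` endings on purpose, so it also exercises the text-mode newline normalisation described in the Fix section.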