avoid int32 overflow in BitPackedRunDecoder::GetBatch offset#50089

Open
metsw24-max wants to merge 1 commit into
apache:mainfrom
metsw24-max:rle-bitpacked-offset-overflow
@metsw24-max
Contributor

int32 overflow in the bit-packed run decoder offset
GetBatch computes the byte position as values_read_ * value_bit_width using 32-bit int arithmetic. This decoder handles untrusted Parquet RLE/bit-packed dictionary indices and levels, with a value width of up to 64 bits. For a large bit-packed run the product exceeds INT32_MAX and wraps negative, so bytes_fully_read becomes negative and unread_data points before the start of the buffer, which results in an out-of-bounds read in unpack. The raw_data_size computation just above already widens to int64 before the same multiply, so I matched that here.
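
As a rough illustration of the widening described above, here is a minimal standalone sketch. It is not the Arrow code: `BytesFullyRead` and `ProductFitsInInt32` are hypothetical helpers, and the division by 8 to turn a bit offset into a whole-byte offset is an assumption about how the product is used. The cast happens before the multiply, so the product is formed in 64 bits and never wraps.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper (not Arrow's implementation): number of whole bytes
// consumed after reading `values_read` values of `value_bit_width` bits each.
// One operand is widened to int64_t before the multiply, so the product is
// computed in 64-bit arithmetic.
inline int64_t BytesFullyRead(int32_t values_read, int32_t value_bit_width) {
  assert(values_read >= 0);
  assert(value_bit_width >= 0 && value_bit_width <= 64);
  const int64_t bits_read =
      static_cast<int64_t>(values_read) * value_bit_width;
  return bits_read / 8;
}

// Hypothetical check: would the same product have fit in a 32-bit int?
// The product is computed in 64 bits and compared against INT32_MAX, which
// avoids performing the (undefined) signed 32-bit overflow itself.
inline bool ProductFitsInInt32(int32_t values_read, int32_t value_bit_width) {
  const int64_t bits_read =
      static_cast<int64_t>(values_read) * value_bit_width;
  return bits_read <= std::numeric_limits<int32_t>::max();
}
```

For example, 40,000,000 values at a width of 64 bits is 2,560,000,000 bits, which is above INT32_MAX (2,147,483,647); the widened computation yields an offset of 320,000,000 bytes instead of a wrapped value.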

@github-actions github-actions Bot added the awaiting review Awaiting review label Jun 4, 2026
@github-actions

github-actions Bot commented Jun 4, 2026

Thanks for opening a pull request!

If this is not a minor PR, could you open an issue for this pull request on GitHub? https://github.com/apache/arrow/issues/new/choose

Opening GitHub issues ahead of time contributes to the Openness of the Apache Arrow project.

Then could you also rename the pull request title in the following format?

GH-${GITHUB_ISSUE_ID}: [${COMPONENT}] ${SUMMARY}

or

MINOR: [${COMPONENT}] ${SUMMARY}

See also:
