Skip to content

Conversation

@anonrig
Copy link
Member

@anonrig anonrig commented Dec 1, 2025

Fixes #4712
Fixes #4977

@anonrig anonrig requested review from a team as code owners December 1, 2025 19:00
@anonrig anonrig requested review from jasnell and npaun December 1, 2025 19:15
@npaun
Copy link
Member

npaun commented Dec 1, 2025

This one is technically observable, isn't it? Like what if I had crappy code that expected the BOM there

@codspeed-hq

This comment was marked as outdated.

@anonrig anonrig changed the title Fix Body::text() to handle BOMs Fix Body::text() and Body::json() to handle BOMs Dec 1, 2025
@anonrig anonrig force-pushed the yagiz/fix-wpt-tests-text-json branch from 560fc65 to d475224 Compare December 1, 2025 20:48
@anonrig anonrig requested review from jasnell, kentonv and npaun December 1, 2025 20:48
@anonrig anonrig force-pushed the yagiz/fix-wpt-tests-text-json branch from d475224 to dc9792f Compare December 1, 2025 21:14
@anonrig anonrig requested a review from jasnell December 1, 2025 21:14
@anonrig anonrig requested a review from jasnell December 3, 2025 22:23
@anonrig anonrig enabled auto-merge (squash) December 3, 2025 22:29
auto size = part.size() - skipBytes;
skipBytes = 0;
KJ_DASSERT(size <= out.size() - pos);
out.slice(pos, pos + size).copyFrom(kj::arrayPtr(begin, size));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The part really ought to be using slice also here...

Would need to test but I think something like ...

Suggested change
out.slice(pos, pos + size).copyFrom(kj::arrayPtr(begin, size));
out.slice(pos, pos + size).copyFrom(part.slice(skipBytes));
skipBytes = 0;

And removing the auto begin = ... and auto size = ...

You can also simplify this by trimming out on each iteration.

auto option = ReadAllTextOption::NULL_TERMINATE;
if (FeatureFlags::get(js).getPedanticWpt()) {
option |= ReadAllTextOption::STRIP_BOM;
}
Copy link
Collaborator

@jasnell jasnell Dec 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if bundling this under the pedantic wpt flag directly is correct. I think it's likely better to have it as a separate compat flag that can have a implied-by pedantic wpt.

I'm also not convinced that we should expose the option completely out to this point. Specifically, readAllText() here will always null terminate so exposing the option to do so or not here doesn't make sense and at this layer we should also assume it strips the BOM if necessary. The option was added internally in the impl only because we use the same base impl logic for both readAllText() and readAllBytes(), but at this level it shouldn't be something that is surfaced.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Investigate BOM handling during JSON parsing Investigate our handling of the UTF BOM (byte order mark)

4 participants