Skip to content

Conversation

@fhanau
Copy link
Contributor

@fhanau fhanau commented Dec 3, 2025

This will avoid excessive memory overhead in a few edge cases – usually the tail worker will be able to keep up with reporting events but we still need to put a limit to the queue size.

This will avoid excessive memory overhead in a few edge cases.
@fhanau fhanau force-pushed the felix/112625-stw-load-shed branch from 3a09407 to 7692d3e Compare December 3, 2025 22:01
tracing::SpanOpen(span.spanId, span.operationName.clone()), span.startTime, spanNameSize);
// If a span manages to exceed the size limit, truncate it by not providing span attributes.
if (span.tags.size() && messageSize <= MAX_TRACE_BYTES) {
if (span.tags.size() && spanTagsSize <= MAX_TRACE_BYTES) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As the watchful reviewer will notice, this is a minor functional change: We're no longer including the size of the span name in this check since it is included in a different tail event, so we're slightly relaxing the size limit for emitting span tags here.

event.sequence,
tracing::Log(event.timestamp, LogLevel::WARN,
kj::str(
"Dropped ", active->droppedEvents, "tail events due to excessive queueing")),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Open to a different message here if requested.

// The TailStreamWriterState holds the current client-side state for a collection
// of streaming tail workers that a worker is reporting events to.
struct TailStreamWriterState {
// The maximum size of the queue, in bytes.
Copy link
Contributor Author

@fhanau fhanau Dec 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Open to different names for these constants/defining them somewhere else.

Note that testing that these checks take effect at some point will be provided in a downstream PR (to follow). This will also include a discussion of why they are sufficient for almost all use cases.

@fhanau fhanau marked this pull request as ready for review December 3, 2025 22:08
@fhanau fhanau requested review from a team as code owners December 3, 2025 22:08
@fhanau fhanau requested a review from mar-cf December 3, 2025 22:08
@fhanau
Copy link
Contributor Author

fhanau commented Dec 3, 2025

This is now feature-complete. As noted in a PR comment – tests for this and rationale will be provided in a downstream PR, but I think we can already discuss the merits of the code changes here.

Event event;

// The approximate size of the event, in bytes.
size_t sizeHint;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could probably restructure this to pass sizeHint as a parameter to reportImpl() instead of storing it in TailEvent directly? Might be cleaner.

auto builder = KJ_ASSERT_NONNULL(current->capability).reportRequest();
auto eventsBuilder = builder.initEvents(current->queue.size());
size_t n = 0;
// KJ_LOG(WARNING, "queue: sending", current->queue.size(), "events", current->queueSize);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be removed before merge – still using this while tuning test cases

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant