Skip to content

Conversation

@Gnuxie
Copy link
Contributor

@Gnuxie Gnuxie commented Feb 19, 2024

Rendered

Signed-off-by: Gnuxie [email protected]

@Gnuxie Gnuxie changed the title Auth Lock: Soft-failure-be-gone! MSC4104: Auth Lock: Soft-failure-be-gone! Feb 19, 2024
thanks Andy
We need to say explicitly that all `m.auth_lock` events that have
been created that match the `auth_event` are considered.
It's allowed because servers will still retain the history and an
admin can easily recanonicalise the state so that joining
servers can see the relevant history again.
@turt2live turt2live added requires-room-version An idea which will require a bump in room version proposal A matrix spec change proposal s2s Server-to-Server API (federation) room-spec Something to do with the room version specifications unassigned-room-version Remove this label when things get versioned. needs-implementation This MSC does not have a qualifying implementation for the SCT to review. The MSC cannot enter FCP. kind:feature MSC for not-core and not-maintenance stuff labels Feb 19, 2024
Comment on lines +73 to +74
The receiving server must combine the `extremities` from both events
such that the canonical history becomes C -> B -> A.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if Alice sees Chelsea send B -> A but Bob sees C -> A instead, because Chelsea crafted two extremities from the same event, delivering one to Alice's HS and the other to Bob's HS?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a better example. However, provided Bob receives Alice's m.auth_lock, then Bob would reject any further events sent by Chelsea referencing A or C. Bob could also send their own m.auth_lock referencing Chelsea's prior membership that includes C within extremities. Then Alice will be able to pull in C when Alice gets Bob's m.auth_lock. Alice could then redact C.

Though this presents another problem when Bob is unprivileged and can't send m.auth_lock. The reason why we give bob the capability to send m.auth_lock is because we trust that he will observe and apply Alice's m.auth_lock and not leak more of Chelsea's events to us. And this means he's most likely already a room admin. This is a problem that already exists that soft failure doesn't prevent either element-hq/synapse#9329. It would be nice to find some kind of solution.

Copy link
Contributor Author

@Gnuxie Gnuxie Feb 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe negotiation of extremities could to considered separately, such that servers with unprivileged users could propose them and room admins could preview them and choose to accept or reject them. That does feel a little bit like over engineering for and entrenching a bad state of affairs though. But at least it's something that would work.

Copy link
Contributor Author

@Gnuxie Gnuxie Feb 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I should probably clarify somewhere in the proposal that we don't propose eagerly issuing an m.auth_lock with every ban, we simply do not soft fail anymore. If we then see events being authorized that would have normally been candidates for soft failure, the room admin or their tooling would then issue m.auth_lock. In principle, this should reduce the number of instances where element-hq/synapse#9329 occurs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

negotiation

I'm forgetting that a server can ask for the prev_events, find the diverged extremity provide a preview to an admin and allow them to choose incorporate it, all without manual intervention from another server (other than continuing to participate)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dkasak I have rewritten the potential issues section to try cover this concern and all the solutions that I have discovered so far

Comment on lines 80 to 83
We counteract this issue by requiring that servers maintain all
existing `extremities` and their `prev_events`, regardless of
whether they are missing from the `extremities` field of
an `m.auth_lock` event.
Copy link
Member

@dkasak dkasak Feb 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But what is "existing" at the point of receipt of the m.auth_lock event could differ across different servers participating in the room. Wouldn't this strategy lock them into having diverging views of the room?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, because users on those servers can issue their own m.auth_lock events for the same locked_event_id, which would allow them to converge again. Am I understanding this correctly as the same problem described in the second paragraph of https://github.com/matrix-org/matrix-spec-proposals/pull/4104/files#r1494856367 ?

@@ -0,0 +1,204 @@
# MSC4104: Auth Lock: Soft-failure-be-gone!
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm finding the proposal very hard to understand, and there seems to be some deal breakers in the potential issues section, like being able to lock all auth events and effectively locking the room..?

I suggest adding diagrams for each scenario to explain what is going on.

I am very interested in a solution better than soft failure, so if I can understand this proposal better than maybe this is something we can work with. However, the scenarios presented here make it seem like this is not the final form of this proposal.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, the scenarios presented here make it seem like this is not the final form of this proposal.

It isn't, the MSC is still draft.

I'm finding the proposal very hard to understand

Is there a way to improve this outside of diagrams for the scenarios in the potential issues section?

I am very interested in a solution better than soft failure

Me too but I no longer know if this is the right approach

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Diagrams for normal operations would help too! Diagrams are great.

As of yesterday I think I now have an idea of how we can do away with soft fail, but I need to check various scenarios and think more on the performance implications. The gist is CALM theorem + state resolve literally all events.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CALM theorem

I'm unfamiliar with CALM, is this the best resource to explain what this is?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's basically it.

In the intervening 9 months since writing that comment, I've been almost entirely consumed by DAG related work. There's a host of tradeoffs we'd need to make to get the semantics you're describing in this proposal. However, it's interesting that some of my thoughts are similar enough to what this proposal is doing. I have a few thoughts now on this proposal:

state_key: A sha256 of the locked_event_id concatenated with the mxid of the event sender. We do this so that different room admins can optionally specify their own versions of the extremities that will combine to form a complete set to be considered during authorization.

This makes no sense, as users on the same server will see the same events. Soft-failure is designed to protect against backdating, it isn't a general moderation tool (so multiple admins on the same server should always agree on the same set of extremities in practice).

extremities: A list of extremities that should be shared to other servers representing the canonicalised version of the room history. This is protected from redaction.

This is really no different from the server sending this event just sharing their forward extremities, which is prev_events, so just use that.

For each auth_event all m.auth_lock events with a locked_event_id matching the event_id of the auth_event are found from the room's current state. From each m.auth_lock event relevant to one auth_event, the extremities fields are combined to form a set of all extremities relevant to the auth_event. If the considered auth_event's event_id matches any m.auth_lock's locked_event_id field AND the event_id of the event being authorized is absent from both the extremities set AND the event_id of any connected event referenced via each extremity's prev_events, reject.

This is hard to parse, but I think what you're saying is "look at an event's auth_events, are they in the auth chain from the locked_event_id? If yes, this means they are at-risk of being rejected since they refer to a locked section of the DAG. Are the auth_events in the auth chain from extremities? If yes, then they are in the protected region of the DAG, so it's allowed. If no, then this is a concurrent fork, so reject."

The receiving server must combine the extremities from both events

You need to be more specific here, particularly when the boundary between locked/protected regions disagree. Using an intersection of locked regions ensures we don't accidentally reject events which we genuniely sent concurrently, but this demands servers never go away (as a dead server will never updated the locked region).


Solution A: Dedicated API for showing diverged events to room admins

This isn't user-friendly. My grandma would never be able to understand and use this for her group.

Solution B: Room administrators on other servers issue their own m.auth_lock

I like the idea of coordinating to converge on a consistent set of auth locks, but given auth locks block messages, this may be hard to achieve.

Solution C: Unprivileged users propose changes to the extremities of m.auth_lock

Similarly, I like the idea but it has the same problems as solution B. It's also not clear how unprivileged users would even know of this considering the "Dedicated API" is admin only.

Solution D: Recrafting PDUs

I've been calling this "rebasing" but yeah, it has its own set of problems.

Copy link
Contributor Author

@Gnuxie Gnuxie Jul 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes no sense, as users on the same server will see the same events. Soft-failure is designed to protect against backdating, it isn't a general moderation tool (so multiple admins on the same server should always agree on the same set of extremities in practice).

This is to allow room admins on different servers to also send the event because they might not see the same events. Which is why they might like to adjust the canonical room history to include the events that only they can currently see.

As for whether it's a moderation tool? it isn't, but the only reason this exists is because power levels cannot be revoked and instead Matrix uses soft failure to cover our ass when stale auth events are used, so unfortunately this does mean it may be necessary for us to micromanage soft failure.

This is really no different from the server sending this event just sharing their forward extremities, which is prev_events, so just use that.

ok.

This is hard to parse, but I think what you're saying is "look at an event's auth_events, are they in the auth chain from the locked_event_id? If yes, this means they are at-risk of being rejected since they refer to a locked section of the DAG. Are the auth_events in the auth chain from extremities? If yes, then they are in the protected region of the DAG, so it's allowed. If no, then this is a concurrent fork, so reject."

I think so.

You need to be more specific here, particularly when the boundary between locked/protected regions disagree. Using an intersection of locked regions ensures we don't accidentally reject events which we genuniely sent concurrently, but this demands servers never go away (as a dead server will never updated the locked region).

I don't understand what you mean here?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

kind:feature MSC for not-core and not-maintenance stuff needs-implementation This MSC does not have a qualifying implementation for the SCT to review. The MSC cannot enter FCP. proposal A matrix spec change proposal requires-room-version An idea which will require a bump in room version room-spec Something to do with the room version specifications s2s Server-to-Server API (federation) unassigned-room-version Remove this label when things get versioned.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants