FEP 7888 serving up an OrderedCollection
from julian@community.nodebb.org to swicg-threadiverse-wg@community.nodebb.org on 18 Feb 16:59
https://community.nodebb.org/post/103364

Just wrapped up a call with @pfefferle@mastodon.social and @jesseplusplus@mastodon.social to review their implementations of FEP 7888, specifically in relation to conversational backfill.

:heavy_check_mark: individual objects serve a context property
:heavy_check_mark: that context property is a URL that resolves

One of the concerns raised was related to the OrderedCollection of items served by the context. Specifically, if the items presented in the collection were not in chronological order, NodeBB failed at importing some of the items as the inReplyTo referenced an object that did not exist.

The solution to this was to ensure that the collection items were in chronological order from oldest to newest. Once fixed:

:heavy_check_mark: the context resolved to an OrderedCollection containing objects
:heavy_check_mark: NodeBB was able to pull in the entire conversation

NodeBB used to guard against this by ordering all received items by chronological order, but I realized that while this worked 99%+ of the time, there are some fun (ahem...) individuals who send objects with timestamps way in the future.

Personally I think removing the sorting just to fix one edge case was premature. At the same time, I think specifying that the OrderedCollection be sorted in chronological order should be a requirement.
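
One way to avoid depending on timestamps at all is to order by the reply graph instead. The following is a minimal Python sketch, not NodeBB's actual code: `order_parents_first` is a hypothetical helper, and it assumes each item is a dict with an `id`, an optional `inReplyTo` holding a single id, and an optional `published` string in a uniform ISO 8601 format. Timestamps only decide which branch is visited first, so a bogus future date cannot place a reply ahead of its parent.

```python
def order_parents_first(items):
    """Return items so that every reply comes after the object it replies to.

    Assumes each item is a dict with "id", and optionally "inReplyTo"
    (a single id) and "published" (uniform ISO 8601 string).
    """
    by_id = {item["id"]: item for item in items}
    ordered, seen = [], set()

    # Timestamps are only a hint for the visiting order, never a guarantee.
    for item in sorted(items, key=lambda i: i.get("published", "")):
        chain = []
        current = item
        # Walk up the reply chain until we hit something already emitted
        # (or a parent that is not part of this collection).
        while current is not None and current["id"] not in seen:
            seen.add(current["id"])
            chain.append(current)
            current = by_id.get(current.get("inReplyTo"))
        # Emit ancestors first, then the item itself.
        ordered.extend(reversed(chain))
    return ordered


thread = [
    {"id": "c", "inReplyTo": "b", "published": "2025-02-18T17:10:00Z"},
    {"id": "a", "published": "2099-01-01T00:00:00Z"},  # root with a future timestamp
    {"id": "b", "inReplyTo": "a", "published": "2025-02-18T17:05:00Z"},
]
print([item["id"] for item in order_parents_first(thread)])  # ['a', 'b', 'c']
```

Items whose parent is not in the collection are simply emitted where they are encountered, so the function never fails on a dangling `inReplyTo`.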

cc @trwnh@mastodon.social

#7888 #activitypub #backfill #swicg-threadiverse-wg

trwnh@mastodon.social on 18 Feb 17:08

@julian @pfefferle @jesseplusplus shouldn't the order basically be "whatever the owner gives you"? i can see there being a SHOULD on forward-chron maybe but it's also possible for reverse chron, order of acknowledgement, or anything else. by default Adding to a collection might append a thing to either end of it. probably the simplest thing is the latest inserted item becomes the first, and the previous list becomes the rest. Reordering of collections, sorting of collections, etc is not possible

trwnh@mastodon.social on 18 Feb 17:11

@julian @pfefferle @jesseplusplus but you should be able to handle an object where inReplyTo doesn't resolve (yet or ever). it shouldn't break processing
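
A minimal Python sketch of that kind of tolerance, assuming the importer keeps a simple id-keyed store. `import_items` and the `store` dict are hypothetical names for illustration, not any implementation's real API. Every item is saved regardless of order, and replies whose parent never shows up are reported rather than rejected.

```python
def import_items(items, store):
    """Save every item into `store` (a dict keyed by id).

    Never fails on a missing parent. Returns the sorted ids of items whose
    inReplyTo was still unresolved after the whole batch was processed.
    """
    pending = {}  # parent id -> ids of children waiting for that parent

    for item in items:
        store[item["id"]] = item
        parent_id = item.get("inReplyTo")
        if parent_id and parent_id not in store:
            # Parent not seen (yet or ever): remember it, but keep going.
            pending.setdefault(parent_id, []).append(item["id"])
        # This item's arrival may resolve children that were waiting for it.
        pending.pop(item["id"], None)

    return sorted(cid for children in pending.values() for cid in children)
```

With items delivered newest-first, the call returns an empty list once the root arrives; a reply to an object that is not in the batch or the store is returned as unresolved while still being stored.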

julian@community.nodebb.org on 18 Feb 17:15

@trwnh@mastodon.social yeah that was what I was afraid of. Maybe a SHOULD?

What we discussed on the call was "keep it simple" for implementors, which would be to have implementors order by forward chron.

End of the day since NodeBB is the one having trouble with the order I think it's on me to re-add re-ordering so other implementors can send in whatever order they want.

cc @pfefferle@mastodon.social @jesseplusplus@mastodon.social

trwnh@mastodon.social on 18 Feb 17:20

@julian @pfefferle @jesseplusplus i've got some prior fep work on OrderedCollection.orderType and SortedCollection.sortedBy|sortOrder (1985, 1863 but not submitted yet)

but there are deeper issues when trying to do anything with Collections other than just read them as-is... i've filed several issues on w3c/activitypub etc but tldr idk what the best solution is here. the existing abstraction of as2 collections falls short when you are trying to represent data structures other than a basic set.

harmonicarichard@techhub.social on 18 Feb 17:11

@julian Are the future posts scheduled posts? Do they have a "scheduled" flag or something similar? Is timezone info included by default? Those are the two situations I can think of where "timestamps in the future" would occur.

julian@community.nodebb.org on 18 Feb 17:16

@harmonicarichard@techhub.social if an object gets federated out with a future timestamp but is meant to be scheduled, I think that's a bug.

I wouldn't trust other implementors to respect post scheduling 😬

harmonicarichard@techhub.social on 18 Feb 17:21

@julian I have a classicpress and a wordpress instance where I could test scheduled posts. It might help to check instance logs. I can test scheduled posts from classicpress with ease.

silverpill@mitra.social on 18 Feb 17:28

@julian @pfefferle @jesseplusplus @trwnh +1 for chronological order requirement.
Are those implementations public? I'd like to test my context resolver against them too

julian@community.nodebb.org on 18 Feb 17:36

@silverpill@mitra.social yes, but they serve objects because we're radical implementors who don't do the whole activities thing. Sorry in advance.

We were testing against these URLs from @pfefferle@mastodon.social's personal blog:

  1. Top level Article: https://notiz.blog/2025/02/11/fedidem/
  2. Mid-level reply Note: https://notiz.blog/?c=2045174

@jesseplusplus@mastodon.social had a test URL but NodeBB fell over because it encountered an Object in next instead of a URL, so that's my bad:

  1. https://frequency.app/@frequency/112078982641203605

All top level or mid-level objects should report a resolvable context, resolving to an OrderedCollection (the same one if the objects are in the same conversation) containing URLs to said objects.

jesseplusplus@mastodon.social on 18 Feb 17:50

@julian @silverpill @pfefferle Right now my fork's implementation has just a Collection rather than an OrderedCollection, but I'm going to change that to an OrderedCollection based on today's discussion

julian@community.nodebb.org on 18 Feb 18:44

@jesseplusplus@mastodon.social @pfefferle@mastodon.social I fixed up NodeBB's janky handling of collections and now I am able to import an entire Frequency conversation via its context :tada:

I think @silverpill@mitra.social is right though, your context.first needs an id. I worked around it but it's better to have it so it can be referenced by another page's prev.
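
For implementors hitting the same thing, here is a rough Python sketch of collection traversal that accepts `first` and `next` as either a URL or an embedded page, and copes with pages that have no `id`. It is not NodeBB's actual fix. `iter_collection_items` is a hypothetical helper, and `fetch` is a caller-supplied function (an assumption of this sketch) that takes a URL and returns the parsed JSON as a dict, or `None` when the URL does not resolve. It also assumes `orderedItems`/`items` are lists.

```python
def iter_collection_items(collection, fetch, max_pages=50):
    """Yield items from an (Ordered)Collection, following first/next.

    `fetch(url)` is supplied by the caller and returns a dict or None.
    """
    def resolve(ref):
        # A page reference may be a URL string or an embedded page object.
        if isinstance(ref, str):
            return fetch(ref)
        return ref

    # Unpaged collections carry their items directly.
    yield from collection.get("orderedItems") or collection.get("items") or []

    first = collection.get("first")
    page = resolve(first) if first else None
    visited = set()
    pages = 0
    while page is not None and pages < max_pages:
        page_id = page.get("id")
        if page_id is not None:
            if page_id in visited:
                break  # loop between pages
            visited.add(page_id)
        # Anonymous pages (no id) are still read; max_pages bounds the walk.
        yield from page.get("orderedItems") or page.get("items") or []
        pages += 1
        nxt = page.get("next")
        page = resolve(nxt) if nxt else None
```

A `next` that currently 404s simply ends the iteration, provided `fetch` returns `None` for it.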

silverpill@mitra.social on 18 Feb 18:17

@julian @pfefferle @jesseplusplus @harmonicarichard @trwnh

>yes, but they serve objects because we're radical implementors who don't do the whole activities thing. Sorry in advance.

My server can retrieve both kinds of collections :) I had concerns about diverging / conflicting implementations in the past, but the solution was found...

>We were testing against these URLs from @pfefferle@mastodon.social's personal blog

This context is working 👍

>@jesseplusplus@mastodon.social had a test URL but NodeBB fell over because it encountered an Object in next instead of a URL

I have a problem with this one because the first page doesn't have an id. I can adjust my code but the absence of id is unusual. For example, there is a next page (currently 404), and if we navigate to it, what would prev look like if the first page is anonymous?

>All top level or mid-level objects should report a resolvable context

Do you mean replies made by the context owner specifically? I think remote mid-level replies should not be required to have context (that would prevent non-implementing servers from participating).

julian@community.nodebb.org on 18 Feb 18:29

@silverpill@mitra.social said in FEP 7888 serving up an OrderedCollection:

>Do you mean replies made by the context owner specifically?

Correct, or more specifically, at least for all replies coming from the instance that the context owner belongs to... so other sibling replies by users on that instance should also report that same context.

julian@community.nodebb.org on 18 Feb 19:58

@silverpill@mitra.social you can also test against this instance, though I assume you already tried:

  1. context url: https://community.nodebb.org/topic/18632
  2. Top level post: https://community.nodebb.org/post/103364
  3. Mid-level reply: https://community.nodebb.org/post/103377

silverpill@mitra.social on 18 Feb 20:08

@julian Yes, I tested against NodeBB and other implementations mentioned in FEP-f228. Will add WordPress and Frequency to the list

trwnh@mastodon.social on 18 Feb 18:32

@silverpill @julian @pfefferle @jesseplusplus @harmonicarichard the context SHOULD be copied if you want to participate in the same context, but the owner MAY Add whatever they want to an arbitrary Collection

context-unaware impls aren't prevented from participating but this will lead to a degraded experience if you're not careful. likely you would start with some graph source, filter for context, then optionally crawl replies for anything missing or otherwise optionally reverse query inReplyTo

julian@community.nodebb.org on 18 Feb 18:37

@trwnh@mastodon.social @silverpill@mitra.social I'd argue that the broad baseline implementation is to allow anyone to participate in the same context... mostly because you can't guard against other implementors specifying your context value.

However, there's nothing stopping a future FEP ("context ownership"/"reply controls") from recommending that implementors check against the context first before locally adding a received object that provides context.

trwnh@mastodon.social on 18 Feb 18:43

@julian @silverpill right, it's a bidirectional link that should be verified both ways ideally. so you have an object claiming to be included, but you also ideally have a reverse claim of inclusion

i think you can mostly get away with doing something like:

- if context is present and it has a canonical collection, you can browse it directly. ignore the original object
- if someone declares your context and you become aware of it somehow, you can optionally add it to the canonical collection
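
The two-way check described above could look roughly like this in Python. This is a sketch under assumptions, not part of any FEP: `membership` is a hypothetical helper, `context` is assumed to be a single URL string, and `member_ids` is assumed to be the set of object ids already read from the canonical collection.

```python
def membership(obj, collection_id, member_ids):
    """Classify how an object relates to a context collection.

    "verified"      : the object claims the context AND the collection lists it
    "claimed-only"  : the object claims the context, the collection does not list it
    "included-only" : the collection lists it, the object does not claim the context
    "unrelated"     : neither side makes a claim
    """
    claims = obj.get("context") == collection_id
    included = obj.get("id") in member_ids
    if claims and included:
        return "verified"
    if claims:
        return "claimed-only"
    if included:
        return "included-only"
    return "unrelated"
```

A context owner that becomes aware of a "claimed-only" object can then decide whether to add it to the canonical collection.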

trwnh@mastodon.social on 18 Feb 18:44

@julian @silverpill the third point which is more contentious:

- deleting a context implies all objects included in it are now orphans and can be garbage-collected (deleted, updated, moved, whatever)

julian@community.nodebb.org on 18 Feb 18:46

>deleting a context implies...

@trwnh@mastodon.social that would be yet another FEP :joy:

mario@hub.somaton.com on 18 Feb 19:43

@julian

>yes, but they serve objects because we're radical implementors who don't do the whole activities thing. Sorry in advance.

What if only the activity is signed (as is often the case) and the object is not fetchable due to, let's say, some network error?

julian@community.nodebb.org on 18 Feb 19:56

@mario@hub.somaton.com at least per our working implementation, the context only deals with public objects which should be fetchable by an instance (either anonymously or via signed GET).

@silverpill@mitra.social's 171b does what you suggest, sending the full signed (via proofs) activities, which in that sense is more performant as fewer network requests are required (just the one, really), and more reliable as you don't need to fetch the individual objects. However, requiring object integrity proofs is a burden that seems quite difficult to clear at present.

WordPress, NodeBB, and Mastodon are not built in such a way that activities are saved direct-to-database. The activities are consumed and a local representation is saved, which makes going reverse quite difficult, especially when it comes to content from outside the local instance.

mario@hub.somaton.com on 18 Feb 21:12

@julian i see... In general i think in distributed systems it makes a lot of sense to keep the source intact while consuming the data you need. Other projects might need to consume a different set of fields and the overhead is minimal...

@silverpill

mikedev@fediversity.site on 18 Feb 22:20

We normally do full activities, but after some prodding, I've begun implementing the forthcoming FEP about context vs contextHistory. So now the default context is objects for Create|Update|Delete Note|Article and activities for everything else, and contextHistory is full activities for everything associated with the opening post of the conversation; but what a nightmare...

We do not typically provide URLs as collection members, because you may need a signed activity to access and validate third-party objects which have source access control enabled.

silverpill@mitra.social on 19 Feb 17:25

@mikedev

>So now the default context is objects for Create|Update|Delete Note|Article and activities for everything else

Shouldn't Create|Update|Delete also have activities in context?
My understanding is that context collection is supposed to contain things that have collection ID as their context property.

If entity is an activity, its context is a collection of activities.
If entity is a post, its context is a collection of posts.

@julian @pfefferle @jesseplusplus @trwnh @mario @harmonicarichard @reiver @aslakr @Fitik

trwnh@mastodon.social on 19 Feb 17:43

@mikedev I wonder, what do you think about having conversations represented by a specific Conversation object? Basically, saying a conversation *has* a collection of posts, rather than saying that a conversation *is* a collection of posts.

- Conversation.posts = Collection of Objects
- Conversation.outbox = Collection of Activities
- Post.context = Conversation

where "Post" is loosely an Object that has content
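
As a rough illustration only, the shape proposed above might look like this when written out as Python dicts. `Conversation`, `posts`, and `outbox` on a conversation are names from the proposal, not defined ActivityStreams vocabulary; the `example.test` URLs and the `posts_collection_for` helper are placeholders invented for this sketch.

```python
# Hypothetical Conversation object that *has* collections rather than *being* one.
conversation = {
    "id": "https://example.test/conversations/1",
    "type": "Conversation",  # hypothetical type, not part of AS2
    "posts": "https://example.test/conversations/1/posts",    # Collection of Objects
    "outbox": "https://example.test/conversations/1/outbox",  # Collection of Activities
}

# A post points at the conversation, not directly at a collection.
post = {
    "id": "https://example.test/posts/42",
    "type": "Note",
    "content": "Hello",
    "context": conversation["id"],
}


def posts_collection_for(post, conversations):
    """Follow Post.context to the Conversation, then to its posts collection.

    `conversations` maps conversation ids to conversation dicts.
    Returns None when the post has no known conversation.
    """
    conv = conversations.get(post.get("context"))
    return conv["posts"] if conv else None
```

The extra hop is the trade-off: a consumer resolves the conversation first and only then picks the representation (posts or activities) it can handle.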

mikedev@fediversity.site on 19 Feb 20:40

I think it's all insane. A collection of everything related to a thread is a collection is a collection. Distinguishing between those that can contain an implied Create and/or full activities and/or a URL and those who can't/shouldn't seems like a nightmare of unnecessary complexity.

An item in a collection can be a number of things (URLs, objects, activities), any of which you need to be able to accept and process to be considered ActivityPub/ActivityStreams compliant at a basic level.
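
A consumer that wants to accept all three forms could normalize them up front. The Python below is a minimal sketch under stated assumptions, not a compliance test: `item_object_id` is a hypothetical helper, only `Create` and `Update` are unwrapped, and `type` is assumed to be a single string rather than an array.

```python
WRAPPING_ACTIVITIES = {"Create", "Update"}  # activities whose object is the post itself


def item_object_id(item):
    """Return the id of the post a collection item refers to, or None.

    Accepts a bare URL, an embedded object, or a Create/Update activity
    whose object is either a URL or an embedded object.
    """
    if isinstance(item, str):
        return item  # bare URL
    if not isinstance(item, dict):
        return None
    item_type = item.get("type")
    if isinstance(item_type, str) and item_type in WRAPPING_ACTIVITIES and "object" in item:
        # Unwrap the activity and look at what it carries.
        return item_object_id(item["object"])
    return item.get("id")
```

Whether the id can then be fetched is a separate question; as noted earlier in the thread, some objects are only verifiable through the signed activity that delivered them.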

I don't believe we really need one collection for the posts people and another for the conversation people containing different representations of the same content - but here we are. Perhaps we need yet another representation for the URL-only folks? If you think that's madness, then why isn't this madness?

trwnh@mastodon.social on 19 Feb 21:08

@mikedev i think this is all protocol stuff, and the real madness is claiming we have a single protocol called "activitypub" that everyone can implement the same way

rather, a protocol is the sum total of everything needed to communicate, negotiated between parties. fundamentally this question is about whether you store activities as resources, or just consume them as RPC. but ideally the conversation/thread should be agnostic to this, and it shouldn't be bound to any particular representation.

trwnh@mastodon.social on 19 Feb 21:16

@mikedev more generally, there are a lot of conceptual issues with AS2 Collection in both data model and protocol. it seems especially problematic to say that a conversation/thread *is* a Collection, because a Collection can be anything -- someone's followers, a conversation, a photo album, an audience, a feed, a task list, and so on. imo this is a fundamental error in information modeling. a Collection ought to represent a set, not myriad other entities that might *have* a Collection.

julian@community.nodebb.org on 20 Feb 16:20

@trwnh@mastodon.social just because it's a Collection/OrderedCollection doesn't mean that it need have inferred meaning behind it.

A conversational context (either ours or 171b's Containers) could serve an OrderedCollection with additional signals via properties that specify behaviour and expectations.

trwnh@mastodon.social on 20 Feb 18:33

@julian the taxonomist in me screams, for i do not know what i am looking at (and people refuse to describe it properly)

julian@community.nodebb.org on 20 Feb 19:28

@trwnh@mastodon.social given that one way a context collection is discovered is via reference via a collection member itself, that's one way to define it.

Admittedly, direct access to a context collection is also a legitimate use case (e.g. a NodeBB topic URL itself is the context collection), so something more explicit might be required.

angus@socialhub.activitypub.rocks on 19 Feb 08:15

julian:

>individual objects serve a context property
>that context property is a URL that resolves

>One of the concerns raised was related to the OrderedCollection of items served by the context. Specifically, if the items presented in the collection were not in chronological order, NodeBB failed at importing some of the items as the inReplyTo referenced an object that did not exist.

>The solution to this was to ensure that the collection items were in chronological order from oldest to newest. Once fixed:

>the context resolved to an OrderedCollection containing objects
>NodeBB was able to pull in the entire conversation

Just a note that I endorse @julian's suggested approach here, and this is how the Discourse plugin has implemented the backfill feature.