Gatekeeping is a structural bug

#indienews

On the web we either read or publish. Readers want to discover content; publishers want to be seen. Gatekeepers sit between readers and publishers and provide discoverability and visibility by aggregating data.

Gatekeeping on the web is the result of concentration of control and responsibility.

There are two main reasons for concentration of control on the web: JavaScript and spam.

JavaScript

HTML is dangerous technology. Embedded JavaScript runs with the user's authority. Without restrictions, it could read sensitive data from other sites (your webmail, your bank) by making cross-origin requests.

To prevent unintended data access, the browser's default permission for cross-origin reads is DENY. This is the Same-Origin Policy (SOP).
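A minimal sketch of what this blocks (the origins are made up; run it as a module for the top-level await):

```javascript
// Script running on a page served from https://example.com (hypothetical).
// The other site has not set any CORS headers, so the read is refused.
try {
  const response = await fetch('https://other-site.example/private/messages.json', {
    credentials: 'include', // ask the browser to attach the user's cookies
  });
  console.log(await response.json()); // never reached
} catch (err) {
  // fetch rejects: the Same-Origin Policy will not expose another
  // origin's data to this page's JavaScript.
  console.log('blocked by SOP/CORS:', err);
}
```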

Because browsers cannot freely read across origins, data aggregation happens on servers that are not subject to SOP, and those servers build large indexes.

This creates a dependence on a backend service that can be difficult to escape.

Third-party sites can grant cross-origin read permission by setting an Access-Control-Allow-Origin (ACAO) header.

But requiring every website to grant read permission for its resources is seen as impractical, because ACAO headers are origin-based rather than user-based access control, and allowing all origins (the * wildcard) prevents the browser from using credentials in cross-origin requests.
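As a sketch of the reading side (the site is made up):

```javascript
// If https://alice.example (made up) responds with
//   Access-Control-Allow-Origin: *
// then any page may read it, but only without credentials.
const res = await fetch('https://alice.example/', { credentials: 'omit' });
const html = await res.text(); // readable, because the site opted in

// The wildcard cannot be combined with credentialed requests; a site that
// wants to expose per-user data has to echo a specific origin and send
// Access-Control-Allow-Credentials: true instead.
```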

The complexity and cost of running your own backend service are much higher than those of hosting a static website.

Control over the network therefore concentrates in the backend services.

Search is an example of data aggregation where control concentrates in a few services.

Delegating computing to centralized aggregation backends concentrates control and creates a gatekeeper.

To be maximally useful, the centralized data aggregation service needs to accept submissions from many potentially untrusted sources. Which leads us to spam.

Spam

Spam has many forms. It's unwanted junk and manipulation.

Data aggregation requires moderation to combat spam.

Spam degrades the value of the index and creates moral and legal obligations for the index owner.

Data aggregators must treat their indexes as economic assets to cover infrastructure costs. This creates asymmetric incentives. Publishers use an aggregator to gain visibility, while aggregators restrict access to maximize the value of their operation.

The structural consequence of centralization is that index owners become policy enforcers, bottlenecks, and gatekeepers in their own economic interest and under legal pressure.

Whoever owns the index controls visibility and discoverability. Users depend on it to find and publish information. Access to information is mediated.

Bug fixing

Centralized architecture concentrates computation inside data aggregators: RSS syndicators, search engines, and social media platforms.

An alternative decentralized architecture might move computation away from the data aggregators and outward to the edge of the network.

If crawling happens on the client and discovery happens through the network, then there is no single bottleneck where one person or company decides what you can see.

A practical version of this decentralized approach is to create a website (just a single web page) and give the web read access by setting an ACAO header. Then link to friends who publish the same kind of website, using rel=friend links.
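A sketch of the publish side. Node's built-in http module stands in here for whatever static host you use; the point is only the header and the markup, and the content is made up:

```javascript
// What the published page and its headers could look like.
import http from 'node:http';

const page = `<!doctype html>
<html>
  <head><title>status</title></head>
  <body>
    <p>Repotting the tomatoes today.</p>
    <a rel="friend" href="https://alice.example/">alice</a>
    <a rel="friend" href="https://bob.example/">bob</a>
  </body>
</html>`;

http.createServer((req, res) => {
  res.writeHead(200, {
    'Content-Type': 'text/html; charset=utf-8',
    // Grant the whole web read access, so other people's browsers
    // can crawl this page from their own origin.
    'Access-Control-Allow-Origin': '*',
  });
  res.end(page);
}).listen(8080);
```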

An app that runs in the browser with a restrictive Content-Security-Policy can crawl the network, fetch updates safely, and display them to the user.
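What "restrictive" could mean here, as an assumption rather than the app's actual policy:

```javascript
// One possible Content-Security-Policy for the reader app, served as a
// response header with the app itself. It lets the crawler fetch and show
// remote content without letting that content execute anything.
const csp = [
  "default-src 'none'",   // deny everything not listed below
  "script-src 'self'",    // only the app's own scripts run
  "style-src 'self'",     // only the app's own styles apply
  "img-src *",            // images from crawled sites may be displayed
  "connect-src *",        // the crawler may fetch from any origin
].join('; ');
```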

Moderation happens locally, through three mechanisms. The first is forming the neighborhood by adding rel=friend links to your web page. The second is explicitly adding entrypoints for the crawler in the app. The third is blocking sources in the app, which prevents the crawler from visiting them.

I built a completely client-side crawler for a network like that. To publish, all you need is a static web host that lets you set HTTP headers. To read the rel=friend network, all you need is the URL of a website that has rel=friend links.
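The real crawler does more, but the core loop is roughly this (a sketch; the function and its parameters are made up). The entrypoints and the blocklist are exactly the second and third moderation mechanisms above:

```javascript
// Rough sketch of a client-side crawl of the rel=friend network.
// Entry points and the blocklist are user-supplied.
async function crawl(entrypoints, blocked, maxPages = 100) {
  const queue = [...entrypoints];
  const seen = new Set();
  const pages = [];

  while (queue.length > 0 && pages.length < maxPages) {
    const url = queue.shift();
    if (seen.has(url) || blocked.has(url)) continue;
    seen.add(url);

    let html;
    try {
      // Works only if the site sends an Access-Control-Allow-Origin header.
      const res = await fetch(url, { credentials: 'omit' });
      html = await res.text();
    } catch {
      continue; // unreachable or not opted in; skip it
    }

    const doc = new DOMParser().parseFromString(html, 'text/html');
    pages.push({ url, doc, fetchedAt: Date.now() });

    // Discovery: follow rel=friend links to the rest of the neighborhood.
    for (const a of doc.querySelectorAll('a[rel~="friend"]')) {
      queue.push(new URL(a.getAttribute('href'), url).href);
    }
  }
  return pages;
}
```

Blocking a source is just adding its URL to the blocked set before the crawl starts.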

When the app fetches content, it normalizes and snapshots it, allowing for persistent and expressive posts in users' feeds.

While not a perfect solution for every use case, it works well for slow, status-like content (continuous personal updates). It creates a kind of ambient feed. Updates are always posted to the same URL, with each new update overwriting the previous one. Each update should contain all relevant state, such as all your rel=friend links and other metadata, so the network is preserved.
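Because each update overwrites the previous one, the reader's snapshots are what give the feed its memory. One way that could look (a sketch; the storage format and field names are made up):

```javascript
// Keep a local snapshot per source so posts persist in the feed
// even after the source URL has been overwritten by a newer update.
function snapshot(url, doc) {
  const normalized = {
    url,
    title: doc.querySelector('title')?.textContent ?? '',
    text: doc.body?.textContent.trim() ?? '',
    friends: [...doc.querySelectorAll('a[rel~="friend"]')]
      .map((a) => new URL(a.getAttribute('href'), url).href),
    fetchedAt: new Date().toISOString(),
  };

  // One history of snapshots per URL; append only when the content changed.
  const key = `snapshots:${url}`;
  const history = JSON.parse(localStorage.getItem(key) ?? '[]');
  const latest = history[history.length - 1];
  if (!latest || latest.text !== normalized.text) {
    history.push(normalized);
    localStorage.setItem(key, JSON.stringify(history));
  }
  return normalized;
}
```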

Cost is distributed and spam becomes a social responsibility, but that also means participation is not zero-cost. Free hosting typically does not allow setting HTTP headers, precisely because of spam (of the phishing kind).

When you pay in the distributed model, you can at least freely choose who to pay.

I am currently working on a visualization of the user's network to make moderation easier.

The code is GPL. You can build it and host it yourself. I think it can run on Tor.