LOCAL.HTML
==========

DOCUMENTATION (WIP)

The local.html PWA
------------------

Homepage: https://sr.ht/~espen/local.html

The local.html app is a progressive web app (PWA) that treats ordinary web pages as channel sources in a social link network. The app extracts social links from channel updates to extend its view of the network and discover new channels.

Channel updates are presented in the app as a live stream that continually updates as new content is published.

There's no global view of the network. The update stream is constructed locally, and different users will see different projections of the link network.

The entire system consists of static HTML documents and a PWA that can be hosted on any static web server. There is no central authority and no backend federation.

Link networks
-------------

A link network is a network of web resources connected by rel=friend hyperlinks. A rel=friend hyperlink is an <a> or <link> element with a rel="friend" attribute.

Link networks are a way to create social contexts on the web that are both private and publicly discoverable.

There is not just one link network. Participants define different social contexts through the links they publish.

Links in the link network define the PWA crawler frontier and have no semantic meaning. The use of the rel attribute is a practical abuse.

Channels and links
------------------

A channel is a URL that you control and use to publish updates to the network.

A link network is defined by the rel=friend links in the updates published by its participating channels. Networks can be small and local or large and global, depending on what links the members choose to publish.

A link has a source URL and a destination URL. If the source URL identifies your channel and the destination URL identifies someone else's channel, then the link is an outgoing link from your viewpoint.
Conversely, if the source URL identifies someone else's channel and the destination URL identifies your channel, then the link is an incoming link from your viewpoint.

To become part of the network, other channels need to know the URL of your channel and publish an outgoing link to it. It is conceptually similar to link walls or blogrolls.

How discovery works
-------------------

1. Follow a channel.
2. Reachable URLs are crawled.
3. rel="friend" links are discovered.
4. Newly discovered URLs are added to the crawl queue.
5. Reachable channels become part of the local graph.
6. Updates are detected by normalized-content hashing.

Discovery is opportunistic and asynchronous. Crawlers operate independently and will discover updates at different times. There is no global ordering of updates. If a channel publishes multiple updates between crawl attempts, intermediate updates will be missed.

Moderation
----------

The network has no central authority and no global moderation system. Instead, each local projection is shaped by three forms of local control:

Following: Following a channel anchors its URL in the app and makes it a starting point for crawling and network discovery. The local projection is built by starting at the anchored channel URLs and following rel="friend" links to discover additional channels.

Linking: Publishing a rel=friend link to another channel means adopting that channel into the social context of your channel. Giving someone an incoming link changes their status in the network:

- Makes them discoverable.
- Keeps them from being orphaned.
- Introduces them to other participants.
- Extends the graph through them.

Blocking: Users can block individual channels. Blocked channels are not crawled and cannot contribute new updates to the local projection.

Follow a channel
----------------

Use the Follow button in the app to follow a channel. Following a channel anchors it in the local graph and makes it a starting point for discovery.
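
To make the role of anchors concrete, the discovery steps listed under "How discovery works" can be sketched as a breadth-first walk from the anchored URLs. This is an illustrative Python sketch, not the app's code: FriendLinkParser, discover, the fetch callback and the example.* URLs in PAGES are all hypothetical, and a real crawler works from a persistent queue on a schedule rather than in one pass.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class FriendLinkParser(HTMLParser):
    """Collects hrefs of <a> and <link> elements whose rel contains "friend"."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.friends: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attr = dict(attrs)
        rel_tokens = (attr.get("rel") or "").lower().split()
        href = attr.get("href")
        if "friend" in rel_tokens and href:
            # Resolve relative links against the page they were found on.
            self.friends.append(urljoin(self.base_url, href))


def discover(anchors, fetch, blocked=frozenset()):
    """Walk rel=friend links breadth-first, starting at the anchored URLs.

    fetch(url) is a hypothetical callback returning HTML text or None.
    Returns the crawled URLs in discovery order.
    """
    crawled = []
    queue = deque(anchors)
    queued = set(anchors)
    while queue:
        url = queue.popleft()
        if url in blocked:
            continue  # blocked channels are not crawled
        html = fetch(url)
        if html is None:
            continue  # unreachable URL
        crawled.append(url)
        parser = FriendLinkParser(url)
        parser.feed(html)
        for friend in parser.friends:
            if friend not in queued:
                queued.add(friend)
                queue.append(friend)
    return crawled


# Made-up in-memory "web" used in place of real HTTP requests.
PAGES = {
    "https://a.example/": '<a rel="friend" href="https://b.example/">b</a>',
    "https://b.example/": '<link rel="friend" href="/c">',
    "https://b.example/c": "<p>no friend links here</p>",
}

print(discover(["https://a.example/"], PAGES.get))
```

With the made-up pages above, following only https://a.example/ reaches all three URLs. Passing blocked={"https://b.example/"} stops the walk at the first page, because the only path to the third URL runs through the blocked channel.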

Anchor links can also be added and removed in app settings. Adding anchor links does not create incoming or outgoing friend links in the network. Anchor links are never orphaned locally.

A single instance of the app does not distinguish between members of separate networks. All discovered rel=friend networks merge into a single local graph. To keep separate networks apart, use different instances of the app hosted on different origins.

Become adopted into a network
-----------------------------

To become adopted into a network you need at least one incoming rel=friend link from someone who is already a part of that network.

1. Publish a web page (a channel).
2. Someone already in the network has to adopt you by linking to your web page.

As long as you have incoming links, your channel is discoverable by the network. If nobody links to you, your channel becomes orphaned. Orphaned channels are deleted from the app index after a grace period of 3 days.

Adopting someone into the network
---------------------------------

Include an outgoing rel=friend link in your web page using either an <a> or <link> element that points to the channel you want to adopt into the network. Example:

    <a rel="friend" href="...">my friend</a>

Publishing guidelines
---------------------

Channels are ordinary web pages. Every update to a "channel" web page becomes an item in the streams of following clients.

Channels can be published to any static host that allows setting HTTP headers. You should set at least this header:

    Access-Control-Allow-Origin: *

This will allow the whole web to read your web page cross-origin. You should also set a cache validator like Last-Modified or ETag to avoid excessive data transfers.

There are almost no requirements for the actual content you publish. If your channel web page can be displayed in a browser, it fulfills every requirement. JavaScript will not run. Inline CSS will work. Feed rendering clamps horizontal overflow.

Updates should always include all active outgoing rel=friend links.
If you remove them, they will be deleted on all clients, possibly leading to orphaned channels. Orphaned channels will be dropped from the network after 3 days.

The following elements from channel web pages are used for the stream item header:

- <title>
- <meta name="author">
- <link rel="me">

Your channel web page should be served over an encrypted connection (HTTPS). Unencrypted web pages work, but external resources served over unencrypted connections will not load.

JSON-LD objects embedded in HTML are extracted, parsed and inserted into the search index. The JSON-LD parser will never dereference IRIs that do not use the https: protocol.

Deleting stream items
---------------------

Stream items can be deleted. The channel will still be crawled and checked for updates, but the deleted item will not be re-added to the stream. The next update from that channel will be added to the stream.

Blocking channels
-----------------

While a channel is blocked it will not be crawled. Any URLs previously discovered from the blocked channel are still crawled. Any updates already received from the blocked channel are retained and remain visible in the stream.

To delete all updates from a channel and block it at the same time, use the "put it in a hole" button in the stream item header dots menu.

Controlling crawler behavior
----------------------------

The crawler maintains a priority queue of URLs that it crawls at periodic intervals. Each URL has its own schedule.

URLs are by default crawled according to a schedule roughly like this: If the resource at the URL has not changed, try again after 5 minutes. Do this 3 times, then back off exponentially until a maximum interval of 24 hours is reached. The schedule is reset every time the crawler sees that the resource at that URL has been updated.

The crawler determines if a resource has been updated based on content hashing. The crawler hashes a normalized representation of fetched resources. Some changes may therefore not produce a new stream item.
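
As a rough illustration, the default schedule and its reset rule could look like the following Python sketch. It is not the app's actual code: next_interval and UrlSchedule are hypothetical names, doubling is an assumed reading of "back off exponentially", SHA-256 stands in for whatever hash the crawler really uses, and the input is assumed to be already normalized.

```python
import hashlib

BASE_INTERVAL = 5 * 60        # 5 minutes, as stated in the text
FLAT_RETRIES = 3              # unchanged checks retried at the base interval
MAX_INTERVAL = 24 * 60 * 60   # 24-hour ceiling, as stated in the text


def next_interval(unchanged_checks: int) -> int:
    """Seconds to wait after this many consecutive unchanged fetches."""
    if unchanged_checks <= FLAT_RETRIES:
        return BASE_INTERVAL
    # Assumption: the interval doubles for every further unchanged check.
    doubled = BASE_INTERVAL * 2 ** (unchanged_checks - FLAT_RETRIES)
    return min(doubled, MAX_INTERVAL)


class UrlSchedule:
    """Per-URL state: last seen digest and count of unchanged checks."""

    def __init__(self) -> None:
        self.digest = None
        self.unchanged_checks = 0

    def record_fetch(self, normalized_content: bytes) -> int:
        """Record one fetch and return the delay until the next check."""
        digest = hashlib.sha256(normalized_content).hexdigest()
        if digest != self.digest:
            # New content: remember it and reset the schedule.
            self.digest = digest
            self.unchanged_checks = 0
        else:
            self.unchanged_checks += 1
        return next_interval(self.unchanged_checks)
```

Under these assumptions a URL that never changes is checked after 5, 5, 5, 5, 10, 20, 40 minutes and so on, until the delay reaches 24 hours. Any change to the hashed content brings the delay back to 5 minutes.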

The crawler schedule can be configured from the publisher side for each URL in the network by including a refresh directive in the web page:

    <meta http-equiv="refresh" content="60">

The crawler can also read the Refresh HTTP header for similar effect. The crawler interprets Refresh headers and refresh directives as scheduling hints. They do not trigger navigation.

To make the crawler check a channel for updates at a specific time, you can set an Expires HTTP header. The crawler will read that header and schedule the next update check for that time.

The crawler never tries to crawl more than one URL per second. This rate limit is not configurable. Because update checks are dequeued sequentially and subject to crawler rate limits, checks may occur later than their scheduled time.

On mobile platforms the crawler runs only when the app is in the foreground. On desktop platforms it runs continuously while the app is open in a browser tab.

Update processing
-----------------

Downloaded content is processed before being saved to the database. Resources with a text/* MIME type are processed as HTML. All other resource types are processed as binary blobs.

HTML content is normalized before serialization and content hashing.

Scripts: <script> elements are stripped from the HTML document before hashing and will therefore not affect the final digest.

CSS stylesheets: External stylesheets in the head element are moved to the body. Internal stylesheets are rewritten to display properly inside a web component shadow DOM. Any <style> elements in the <head> element are moved to the body before processing.

Links: <link> elements are removed before serialization. All URLs in <a> elements are resolved to absolute URLs before serialization.

Snapshotting: The following elements are snapshotted and saved in the browser cache: <img>, <video>, <audio>, <source>, <object>.

Metadata: <title>, <meta name="author"> and <link rel="me"> elements are used to generate the stream item header.

Serialization: When serializing the HTML content, only the body element is serialized.

Search
------

The app maintains a search index, but it does not yet provide a search interface.

Background updates
------------------

The app has no background updates, and that is a feature. Reliable background crawling is not currently possible across browsers and mobile platforms, so the app does not implement it.

Stored credentials
------------------

Credentials are stored unencrypted in IndexedDB. Anyone with access to the browser profile may be able to recover them. Use only on devices you trust.

Notifications
-------------

The app will opportunistically attempt to display notifications when the app does not have focus. This will only work reliably on desktop, because Web Workers do not keep running on mobile platforms. Notifications must be enabled in settings.

Self-hosting
------------

The app can be hosted on any static host that lets you set HTTP headers. All application data is stored in IndexedDB and scoped to the app origin.

Download a zip archive from the source code repository:

    https://git.sr.ht/~espen/local.html

Unpack it into any directory under your web root. Make sure you set a Content-Security-Policy like the one in the /web/_headers file.