New site listing workflow and search as a service improvements

This is a quick post summarising the simplified site listing workflow and search as a service improvements.

Why the site listing workflow needed simplifying

The old site listing workflow was surprisingly complicated, with a number of different routes through the process, and the ability to restart and take a different route at a later date. Unfortunately, there were a number of issues, for example:

The new site listing workflow

All submissions in the new workflow start from the same Add Site page, and the listing types have been renamed to “Basic” and “Full” (plus the new “Free Trial”), which is hopefully clearer. The second step for the Full listing asks for “Login and domain ownership validation method”, which again is hopefully clearer than the existing “Domain Control Validation” or “IndieAuth” options.

The new workflow diagram with the new terminology is:

[Diagram: New add site workflow]

The new database schema

This is the second major change to the database schema. The first design had three tables for domains: pending, approved, and rejected. That turned out to be a bit of a pain to maintain, because records had to be moved between tables as they changed state, so I replaced it with a simplified schema in Oct 2021. That new schema had one table. Unfortunately that turned out to be a bit of an oversimplification: with one row per domain, each column could only hold a single value per domain (as per one of the fundamentals of database design), while there were at least two features that would benefit from more than one value per domain:

  • A site should be able to have more than one state, so someone can have it indexing on the Basic tier while they (for example) try out the Free Trial or sort out the Full tier.
  • A site should be able to have more than one paid subscription, i.e. users should be able to renew a subscription while the current one is still active. As it was, users needed to let their subscription expire before they could renew, which was a poor user experience.

So the main schema changes are:

  • A new listing status table to track signup state. This will hopefully reduce the chance of inconsistent states, e.g. if someone tries to change tier mid-way through signup. The primary key is a combination of domain and tier, so there will only be one instance of each tier for each domain.
  • A new subscriptions table. Crucially, this allows for more than one subscription for a site, so you can renew your subscription before the old one expires. It was also pretty useful for adding support for the new Free Trial listing.
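As a rough illustration of the two tables described above, here is a minimal sketch using Python's built-in sqlite3 module. The table names, column names, tier values and dates are all assumptions made for the example, not the actual schema; the only details taken from the description are the combined (domain, tier) primary key and the ability to hold more than one subscription per domain.

```python
import sqlite3

# Hypothetical table and column names, for illustration only.
SCHEMA = """
CREATE TABLE listing_status (
    domain TEXT NOT NULL,
    tier   TEXT NOT NULL CHECK (tier IN ('basic', 'free_trial', 'full')),
    status TEXT NOT NULL,
    PRIMARY KEY (domain, tier)  -- at most one row per tier for each domain
);
CREATE TABLE subscriptions (
    subscription_id INTEGER PRIMARY KEY,
    domain          TEXT NOT NULL,  -- not unique, so a domain can have several rows
    tier            TEXT NOT NULL,
    starts_on       TEXT NOT NULL,
    ends_on         TEXT NOT NULL
);
"""


def build_demo():
    """Create an in-memory database with some made-up example rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    # One domain holding two tiers at once: indexing on Basic while
    # the Full listing signup is still being sorted out.
    conn.executemany(
        "INSERT INTO listing_status (domain, tier, status) VALUES (?, ?, ?)",
        [
            ("example.com", "basic", "active"),
            ("example.com", "full", "awaiting validation"),
        ],
    )
    # A renewal recorded while the current subscription is still running.
    conn.executemany(
        "INSERT INTO subscriptions (domain, tier, starts_on, ends_on) "
        "VALUES (?, ?, ?, ?)",
        [
            ("example.org", "full", "2022-01-01", "2023-01-01"),
            ("example.org", "full", "2023-01-01", "2024-01-01"),
        ],
    )
    conn.commit()
    return conn


def add_duplicate_tier(conn):
    """Try to add a second Basic row for example.com; return True if accepted."""
    try:
        conn.execute(
            "INSERT INTO listing_status (domain, tier, status) VALUES (?, ?, ?)",
            ("example.com", "basic", "active"),
        )
    except sqlite3.IntegrityError:
        # The (domain, tier) primary key rejects the duplicate.
        return False
    return True
```

In this sketch the database itself refuses a second row for the same domain and tier, which is the kind of inconsistent state the combined primary key is meant to rule out, while the subscriptions table happily accepts overlapping or back-to-back rows for one domain.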

This should make the whole site listing workflow much more robust and extensible in the long term (although admittedly the introduction of a lot of new code might lead to a few new bugs in the short term).

What data has been migrated to the new version

Given past experience of odd issues when migrating potentially inconsistent states to a new schema, I’ve decided to only migrate fully approved sites to the new schema, and also only migrate them with their current status. There are 1400 such sites.

This means the following is not migrated:

  • Previous verification details, i.e. if sites were initially Verified Add but lapsed to Quick Add (old terminology), they will need to reverify if moving from Basic back to Full listing (new terminology). This applies to 30 of the 1400 migrated sites.
  • Unlisted sites where the listing is “in progress”. A total of 33 such sites haven’t been migrated (noting that some have been “in progress” for over 2 years).
  • The blocked sites list. There are 591 blocked sites which haven’t been migrated.
  • Sites which have had indexing disabled because indexing has failed twice in a row, e.g. because the site is down, or blocking indexing due to robots.txt or Cloudflare. There are 70 of these.

This should, however, mean starting with a clean slate, and should prevent issues caused by the accumulation of potentially inconsistent data over the years.
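The migration rule above could be sketched roughly as follows. This is only an illustration of the selection logic, not the actual migration script: the OldSite record, its field names and the state values are hypothetical, and the mapping from old to new tier names is inferred from the terminology mentioned in this post.

```python
from dataclasses import dataclass

# Old listing names mapped to the new ones (inferred from the post's terminology).
OLD_TO_NEW_TIER = {"quick_add": "basic", "verified_add": "full"}


@dataclass
class OldSite:
    """Hypothetical shape of a record in the old single-table schema."""
    domain: str
    state: str              # assumed values: "approved", "in_progress", "blocked"
    current_listing: str    # "quick_add" or "verified_add"
    indexing_enabled: bool  # False once indexing has been disabled after repeated failures


def select_for_migration(sites):
    """Return (domain, new_tier) pairs for fully approved, indexable sites only.

    Only the current listing is carried over, so a site that lapsed from
    Verified Add to Quick Add arrives as Basic with no verification history.
    """
    return [
        (site.domain, OLD_TO_NEW_TIER[site.current_listing])
        for site in sites
        if site.state == "approved" and site.indexing_enabled
    ]
```

For example, given one approved and indexable Quick Add site, one in-progress site, one blocked site and one approved site with indexing disabled, only the first would be returned, as a Basic listing.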

Improvements to the search as a service

In addition to the new site listing workflow, the ability to resubscribe, and the introduction of a Free Trial, other improvements to the search as a service include:

So I think that will make it a much more useful and attractive search as a service.

For the next major release, I’ve got a number of improvements I’d like to make to the public search.