Earth Notes: On Website Technicals (2019-04)

Updated 2024-05-10 14:32 GMT.
By Damon Hart-Davis.
Tech updates: moar litererer, bumpy indexing, copyrightYear fix, Schedule, HH:MM and spatial page metadata, notworking, vox pop, tap target size.
20190407 56pc pages indexed from GSC Enhancements view
More (schema.org microdata) metadata this month. Once I went looking for what might be relevant rather than just scraping a minimum to keep Google happy, there's lots!

2019-04-29: Clickable Nav

I have tweaked my top navigation links to be spaced better to make them more easily clickable on touch displays such as mobile.

This was triggered by a warning from the Lighthouse test now available in WebPageTest, and its help page Accessible Styles:

A minimum recommended touch target size is around 48 device independent pixels on a site with a properly set mobile viewport. For example, while an icon may only have a width and height of 24px, you can use additional padding to bring the tap target size up to 48px. The 48x48 pixel area corresponds to around 9mm, which is about the size of a person's finger pad area.

Touch targets should also be spaced about 8 pixels apart, both horizontally and vertically, so that a user's finger pressing on one tap target does not inadvertently touch another tap target.
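The arithmetic in that guidance can be sketched in a few lines of Python. This is a minimal illustration only: the function name and constants are hypothetical, and the 48 and 8 figures are simply the ones quoted above.

```python
# Figures taken from the guidance quoted above (device independent pixels).
MIN_TARGET_DP = 48  # recommended minimum tap target size
MIN_GAP_DP = 8      # recommended spacing between neighbouring targets


def padding_needed(content_size_dp: float,
                   min_target_dp: float = MIN_TARGET_DP) -> float:
    """Padding per side needed to bring a tap target up to the minimum size."""
    return max(0.0, (min_target_dp - content_size_dp) / 2)


# A 24px icon needs 12px of padding on each side to reach 48px.
print(padding_needed(24))
```

Anything already at or above the minimum needs no extra padding, so the function never returns a negative value.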

For some reason the router has been behaving itself today. (Maybe there are fewer external attacks adding to its woes at weekends.) So I added a block of 'popular' links at the foot of most pages. I hope that EOU's bounce rate will drop, but who knows!

2019-04-27: Networking Emergency

The router (and WiFi access point) has been fairly ill for at least a couple of weeks now.

The WiFi is nearly deaf, so connectivity is slow and poor, and only works near the device, even for local traffic.

Plus the device keeps dropping the upstream connection and/or stopping with its power light on red, meaning 'fault'.

I have harboured ambitions to bring up my RPi3B+ as router and WiFi AP to save energy, etc, and I tried to do that during a substantial outage 'emergency' a few days ago.

I could indeed present a WiFi AP, and route to laptops, phones, and other devices behind the RPi3B+. But I could not get routing working to the older RPi2 that is the server for this site amongst other things. Maybe I was close, but I was most definitely not there.

My ISP agreed that my configuration with a block of static public IP addresses is hard to cater for. A single such IP is routed to the (PPPoE) WAN interface of the RPi3B+, but that is within the full block that needs to be forwarded to the RPi2, which is not a situation that I have encountered before. And the WAN PPP interface cannot be bridged to the Ethernet connection to the RPi2.

(Even if I were to instantly move all my servers from the RPi2 to the RPi3B+, I would have conflicts between such systems as dnsmasq and my public DNS primary.)

2019-04-22: Adding spatialCoverage

I have just created a special meta/header tag to label a page as being spatially at 16WW. This injects an appropriate spatialCoverage itemprop in a footer, other than in 'lite' pages.

In another pass I added a way to label a page as being in a specific country (usually UK for this site) then optionally at a latitude and longitude, and then exceptionally at a given elevation.

I have been able to give more than 75% of pages a sensible spatialCoverage between those two.
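As a rough sketch of the second scheme (country, then optional latitude and longitude, then optional elevation), the Python below builds schema.org microdata for a footer. It is an assumption about shape only, not the site's actual build code: the function name is hypothetical, the nesting of Place, PostalAddress and GeoCoordinates is one valid schema.org arrangement among several, and the coordinates in the example are illustrative values.

```python
from html import escape
from typing import Optional


def spatial_coverage_microdata(country: str,
                               latitude: Optional[float] = None,
                               longitude: Optional[float] = None,
                               elevation: Optional[float] = None) -> str:
    """Build a schema.org Place, as microdata, for a page footer."""
    lines = [
        '<span itemprop="spatialCoverage" itemscope itemtype="https://schema.org/Place">',
        '<span itemprop="address" itemscope itemtype="https://schema.org/PostalAddress">',
        f'<meta itemprop="addressCountry" content="{escape(country)}">',
        '</span>',
    ]
    # Coordinates are optional, and elevation is only used alongside them.
    if latitude is not None and longitude is not None:
        lines.append('<span itemprop="geo" itemscope itemtype="https://schema.org/GeoCoordinates">')
        lines.append(f'<meta itemprop="latitude" content="{latitude}">')
        lines.append(f'<meta itemprop="longitude" content="{longitude}">')
        if elevation is not None:
            lines.append(f'<meta itemprop="elevation" content="{elevation}">')
        lines.append('</span>')
    lines.append('</span>')
    return "\n".join(lines)


# Illustrative values only, not a real page's location.
print(spatial_coverage_microdata("GB", latitude=51.5, longitude=-0.1))
```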

I just implemented an item from my to-do list:

  • Postpone next auto-inserted ad for each float/body IMG inserted, to make it less likely that a bunch of floats will pile up ahead of an ad and show up clear:both.
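One way that postponement could work is sketched below in Python. Everything here is hypothetical: the block labels, the countdown approach and the default gap are assumptions for illustration, not the site's real page generator.

```python
from typing import List


def place_ads(blocks: List[str], gap: int = 4) -> List[int]:
    """Return the indices of the blocks after which an ad would be inserted.

    blocks is a sequence of labels, "p" for a paragraph and "img" for a
    float/body image. An ad is due after every `gap` paragraphs, and each
    image pushes the next ad back by one further paragraph.
    """
    positions = []
    countdown = gap
    for index, kind in enumerate(blocks):
        if kind == "img":
            countdown += 1  # postpone the next ad for each image seen
            continue
        countdown -= 1
        if countdown <= 0:
            positions.append(index)
            countdown = gap
    return positions


print(place_ads(["p"] * 6, gap=3))
print(place_ads(["p", "p", "img", "p", "p", "p", "p"], gap=3))
```

In the second call the image delays the first ad by one paragraph, so the ad lands further from the float.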

2019-04-18: More Precise datePublished

All but ~20 pages have now had their datePublished updated, mainly simply to add a trailing time, but sometimes also to correct the date.

There are cases where the SVN repository date is not a good reflection of when the information on the page was first published, eg because a single huge page was split up into many. (Sometimes the creation date of a new page into which information was moved has been accepted, but the temporalCoverage set to reflect a range including the original date.)

Another case is where a page was created a day or three ahead of time to save a rush on the day, but that page was not actually published (ie embargoed) until a more logical time.

In such cases where the SVN timestamp is unhelpful, the manually-selected plain date has been left in place.

When I last ran the cross-checking tool that I created, its output was:

WARNING: datePublished .electricity-storage-whole-household.html is 2010-12-20, svn is 2017-05-20T10:23:44Z...
WARNING: datePublished .index.html is 2007-05-25, svn is 2007-07-18T09:52:33Z...
WARNING: datePublished .note-on-site-technicals-20.html is 2019-01-01, svn is 2018-12-27T12:48:38Z...
WARNING: datePublished .note-on-site-technicals-3.html is 2017-08-01, svn is 2017-07-31T19:22:13Z...
WARNING: datePublished .off-grid-stats-historical-200909.html is 2009-09-11, svn is 2018-10-06T18:48:45Z...
WARNING: datePublished .off-grid-stats-historical.html is 2007-11-08, svn is 2018-10-06T18:09:39Z...
WARNING: datePublished .saving-electricity-2008.html is 2008-01-01, svn is 2017-08-20T14:20:48Z...
WARNING: datePublished .saving-electricity-2009.html is 2009-01-01, svn is 2017-08-20T14:09:14Z...
WARNING: datePublished .saving-electricity-2010.html is 2010-01-01, svn is 2017-08-20T13:51:28Z...
WARNING: datePublished .saving-electricity-2011.html is 2011-01-01, svn is 2017-08-20T13:09:22Z...
WARNING: datePublished .saving-electricity-2012.html is 2012-01-01, svn is 2017-08-20T12:54:01Z...
WARNING: datePublished .saving-electricity-2013.html is 2013-01-01, svn is 2017-08-20T12:13:14Z...
WARNING: datePublished .saving-electricity-2014.html is 2014-01-01, svn is 2017-08-20T11:00:31Z...
WARNING: datePublished .saving-electricity-2015.html is 2015-01-01, svn is 2017-08-20T10:29:47Z...
WARNING: datePublished .saving-electricity-2016.html is 2016-01-01, svn is 2017-08-19T15:23:50Z...
WARNING: datePublished .saving-electricity-2017.html is 2017-01-01, svn is 2017-08-18T18:39:50Z...
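A check of this kind might look like the Python sketch below. The real tool's logic is not shown here, so this rests on assumptions: that a page passes only when its datePublished carries a time in the form YYYY-MM-DDTHH:MMZ matching the SVN timestamp to the minute, and that the SVN timestamp has already been fetched as a string. The function name is hypothetical, and the message simply mimics the lines above.

```python
from typing import Optional


def check_date_published(page: str, date_published: str,
                         svn_timestamp: str) -> Optional[str]:
    """Return a warning line if datePublished disagrees with SVN, else None."""
    # First 16 characters of an ISO 8601 timestamp: YYYY-MM-DDTHH:MM
    expected = svn_timestamp[:16]
    if date_published.rstrip("Z") == expected:
        return None
    return (f"WARNING: datePublished {page} is {date_published}, "
            f"svn is {svn_timestamp}")


print(check_date_published(".index.html", "2007-05-25", "2007-07-18T09:52:33Z"))
```

Under these assumptions a plain date is always flagged, which leaves the manually-selected dates visible for review on each run.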

The whole site appears to have been imported into SVN 2007-07-18T09:52:33Z, at that point consisting of the following files/directories:

.LED-lighting.html
.index.html
.low-power-laptop.html
.saving-electricity.html
.solar-PV-pilot-summer-2007.html
.work
.work/wrap_art.sh
650Wp-1kWhPerDay-sim-tn.gif
650Wp-1kWhPerDay-sim.gif
CFL-12V.jpg
CFL-desk-lamp.jpg
LED-bulb-5W.jpg
LED-light-3W.jpg
battery-and-controller.jpg
battery-voltage-monitor-12V-13V-thresholds-1-full.gif
battery-voltage-monitor-12V-13V-thresholds-1.gif
compost-bin.jpg
laptop-12V-mains-fallback-schema-1-full.gif
laptop-12V-mains-fallback-schema-1.gif
laptop-12V-mains-fallback-schema-2-full.gif
laptop-12V-mains-fallback-schema-2.gif
laptop-12V-mains-fallback-schema-2.ps
makefile
q.jpg
solar-PV-system-June-2007-and-planned-expansion-1-full.gif
solar-PV-system-June-2007-and-planned-expansion-1.gif
solar-cells.jpg
solar-panel-on-wall.jpg
sparks.jpg
xephi_small_logo.png

Timestamps from SVN for anything imported at that point are misleading. Clues in the text, and assuming that a page is at least one day older than the oldest capture in the Wayback Machine, help with these.

This date inference is needed for the home page index.html as it predates the repository.
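The inference rule can be written down as a small Python sketch. The function and parameter names are hypothetical, the dates in the example are illustrative rather than real capture dates, and taking the earlier of the two candidates is an assumption about how the clues are combined.

```python
from datetime import date, timedelta
from typing import Optional


def infer_date_published(oldest_wayback_capture: date,
                         text_clue: Optional[date] = None) -> date:
    """Estimate a publication date for a page that predates the repository."""
    # Assume the page is at least one day older than its oldest capture.
    candidate = oldest_wayback_capture - timedelta(days=1)
    # A dated clue in the page text may point to an earlier date still.
    if text_clue is not None:
        return min(candidate, text_clue)
    return candidate


# Illustrative dates only.
print(infer_date_published(date(2007, 6, 1)))
```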

2019-04-16: More Precise dateModified

I have extended page date metadata to hours and minutes (UTC) for dateModified and sdDatePublished. In particular dateModified is now the repository source file latest commit date and time rather than the file timestamp.
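A minimal Python sketch of the formatting step follows, assuming the commit timestamp has already been obtained from the repository as an ISO 8601 UTC string (with or without fractional seconds). The function name is hypothetical.

```python
from datetime import datetime, timezone


def to_minute_precision(svn_timestamp: str) -> str:
    """Trim a UTC commit timestamp to hours and minutes for page metadata."""
    # Drop the trailing Z and any fractional seconds before parsing.
    text = svn_timestamp.rstrip("Z").split(".")[0]
    moment = datetime.strptime(text, "%Y-%m-%dT%H:%M:%S")
    moment = moment.replace(tzinfo=timezone.utc)
    return moment.strftime("%Y-%m-%dT%H:%MZ")


print(to_minute_precision("2017-05-20T10:23:44Z"))
```

Seconds are dropped rather than rounded, so the result never points later than the commit itself.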

Other places, such as the sitemaps, may still use the file timestamp, as it is quick to get and a reasonable guide for a search engine of when content has changed. And timestamps such as HTTP Last-Modified will come from the file timestamps of the plain or compressed version of the file as requested by the client.

Rather than be free-floating, I have now attached the 'EOU' info as sourceOrganization to the page/Article.

I have also allowed datePublished to include a (UTC) full time, where I am able to provide it, eg from inspection of SVN repository logs.

Finally, to make the (last updated) date easy for a user to find, it is now shown per the Google News guidelines:

Date and time should be positioned between the headline and the article text.

2019-04-15: The Joy of Schedule

As I posted as a new issue on GitHub schemaorg/schemaorg:

In my page:

http://www.earth.org.uk/green-halloween-pumpkin.html

I talk about the joy of planting, growing and eating pumpkins.

Somewhere in there I would like to mark up that this fun is to be had April to September every year. Maybe I could jam one of the ISO 8601 repeating times into temporal or temporalCoverage, maybe for the Article or an embedded Thing or Event representing the growing of pumpkins.

What would be the right thing to do here? So far I really can't see what it would be!

I was pointed to the existence of schema.org/repeatFrequency and Schedule so I shall see how I might make those work!
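As a tentative sketch of what that might look like, the Python below emits JSON-LD for an Event carrying a Schedule that repeats yearly across April to September. The properties used (eventSchedule, repeatFrequency, byMonth) are defined by schema.org, but the choice of Event as the carrier, the event name, and the function name are assumptions for illustration, and JSON-LD is used here only because it is easy to generate.

```python
import json


def growing_season_event(name: str, first_month: int, last_month: int) -> dict:
    """Describe a yearly season as a schema.org Event with a Schedule."""
    return {
        "@context": "https://schema.org",
        "@type": "Event",
        "name": name,
        "eventSchedule": {
            "@type": "Schedule",
            "repeatFrequency": "P1Y",  # ISO 8601 duration: every year
            "byMonth": list(range(first_month, last_month + 1)),
        },
    }


# April (4) to September (9), every year.
print(json.dumps(growing_season_event("Growing pumpkins", 4, 9), indent=2))
```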

2019-04-10: Fixed copyrightYear

Since structured/meta data has been on EOU, I have had the copyrightYear be the whole site's first year, ie 2007. I have now fixed it to be the year that the individual page, ie CreativeWork, was first published, which is more true to the definition, and more granular.

2019-04-07: Indexing Bumpy

56% of AMP pages reported valid/indexed

It's still really unclear to me what the notions of "valid" and "indexed" mean in various places in GSC, such as the AMP and Mobile Usability 'Enhancements' vs coverage by sitemap... GSC seems unwilling to report much more than 50% of my AMP pages as being "Valid" (green) even though it reports no problems, and has 100% of the canonical (desktop) pages as "Valid" in the main sitemap.

(Also, my network connection has been very flaky for about 24h, so I am expecting some complaints from GSC about that in due course...)

2019-04-04: Speak Moar Liter

I am for now stripping out speakable meta-data from lite pages, since no one is going to be using it for a while, and Google, it seems, only cares about markup and content parity between desktop and AMP.

~80 bytes lopped off each 'lite' .htmlgz.

There are other marginal metadata elements that I could strip out for lite (m-dot) also, if I had the urge!
