Earth Notes: On Website Technicals (2018/10)
2018/10/31: Informational Image Appearance and AMP
I'm attempting to allow the generation of AMP-compliant pages. I'm not sure yet whether AMP is in fact a good idea for this site.
One big problem with this is that AMP does not allow use of normal HTML(5)
img tags, and my pages are indeed full of them, hand-crafted.
More than a decade's worth!
I have created a more 'portable' and restricted EOU 'IMG' tag
with essentially only a small set of attributes allowed, including src and class.
(I can use upper case for it since I have all normal HTML tags in lower case.)
src must be a local image under
class must be a single class that is one of a subset of
the site's responsive image types. If these tight conditions are not
met then the image is not inserted and an error is generated, though
the raw tag should in fact always be valid (if not optimal) HTML5.
If the EOU
IMG tag passes muster then the hero image
autogeneration mechanism is used to spit out a suitably scaled image.
It inserts an
img tag with auto-generated attributes such as
alt. These are derived from
the actual image's size and filename.
(For AMP, an
amp-img is used instead.)
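The checks and translation described above might be sketched roughly as follows. This is a minimal illustration rather than the site's actual script: the class names, the img/ directory, the alt-from-filename rule and the function name translate_img are all hypothetical, and the width and height are passed in where the real mechanism would read them from the scaled image.

```python
import html
import re
from pathlib import PurePosixPath

# Hypothetical values: the site's real class list and image directory are not given.
ALLOWED_CLASSES = {"float-left", "float-right", "full-width"}
IMAGE_ROOT = "img/"

_TAG_RE = re.compile(r'<IMG\s+([^>]*)>')
_ATTR_RE = re.compile(r'([a-z]+)="([^"]*)"')


def translate_img(tag, width, height, amp=False):
    """Turn a restricted upper-case IMG tag into an img or amp-img element.

    width and height stand in for the dimensions of the scaled image,
    which a real generator would read from the file it has just created.
    """
    match = _TAG_RE.fullmatch(tag.strip())
    if match is None:
        raise ValueError("not a restricted IMG tag: " + tag)
    attrs = dict(_ATTR_RE.findall(match.group(1)))
    src = attrs.get("src", "")
    css_class = attrs.get("class", "")
    if not src.startswith(IMAGE_ROOT) or ".." in src:
        raise ValueError("src must be a local image under " + IMAGE_ROOT)
    if css_class not in ALLOWED_CLASSES:
        raise ValueError("class must be exactly one allowed class")
    # Assumed rule: derive alt text from the file name when none is supplied.
    alt = attrs.get("alt") or PurePosixPath(src).stem.replace("-", " ")
    element = "amp-img" if amp else "img"
    out = (
        f'<{element} class="{css_class}" src="{html.escape(src)}" '
        f'alt="{html.escape(alt)}" width="{width}" height="{height}">'
    )
    # amp-img is a custom element and needs an explicit closing tag.
    return out + "</amp-img>" if amp else out


print(translate_img('<IMG src="img/koda-house.jpg" class="float-left">', 320, 240))
```

Rejecting the tag with an error, rather than passing it through, matches the behaviour described above: a tag that fails the checks is not inserted.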
The EOU IMG tag is quick and easy to write. It's also
less error-prone than manually sizing a 'thumbnail' image and
constructing the whole browser-friendly HTML5
img tag to use it...
See a "KODA House" simple example, floated left.
These autogenerated images are space-efficient (relatively light-weight).
There is also the possibility in future
to have the browser fetch (say) an even lighter-weight WebP image if
that format is understood by the browser.
A feature already implemented has the
img tag inserted
for the mobile/lite/AMP version refer to an image pre-scaled down
to the maximum size that could responsively be seen on the presumed
640px-or-narrower screen. And the
image version is auto-created for miserly browsers that request it.
Thus saving even more bandwidth!
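The pre-scaling arithmetic can be illustrated with a small sketch. It assumes the cap is applied to width only and that aspect ratio is preserved; the function name scale_to_max_width is hypothetical, and only the 640px figure comes from the text above.

```python
def scale_to_max_width(width, height, max_width=640):
    """Return (width, height) scaled down to fit max_width, keeping aspect ratio.

    Images already narrow enough are returned unchanged: never scale up.
    """
    if width <= max_width:
        return width, height
    scaled_height = max(1, round(height * max_width / width))
    return max_width, scaled_height


print(scale_to_max_width(1280, 960))  # (640, 480)
print(scale_to_max_width(320, 200))   # (320, 200)
```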
One wrinkle is that these images are not purely decorative 'hero' images, but are intended to convey useful information, so the generation script has been tweaked to allow more bits per pixel for such images on mobile. Indeed potentially up to desktop image 'quality' settings. (The cap on image size remains for now lower than desktop even for these.)
This 'nicer' mobile image licence could be extended to the carousel images on the home page, for a better user experience there.
These informational image sizes should match the carousel images to try to allow (at least eg on desktop) the same images to end up being referenced in both cases, improving cache hits for visitors.
2018/11/01: follow-up note: I've segregated 'hero' from 'body' images, eg in separate cache directories, and allowed the latter to retain their aspect ratio.
It's evident from the logs that plenty of browsers narrower than 640px
are arriving at the www/desktop site, so it would be worth extending the
IMG output HTML to use
srcset to have those use the mobile
version of each image in that case for further bandwidth savings.
I have added code to insert a
srcset for both desktop
and mobile image versions where both are present. This allows a
small device arriving at the desktop site to fetch the smaller
mobile image and save some bandwidth.
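A minimal sketch of emitting such a srcset, assuming the generator knows the pixel width of each image version. The function name img_with_srcset and the example paths are hypothetical, and the paths are assumed to contain no spaces or commas, which would need extra handling inside srcset.

```python
import html


def img_with_srcset(desktop_src, desktop_width, alt,
                    mobile_src=None, mobile_width=None):
    """Build an img tag, adding srcset only when a mobile version exists."""
    attrs = [f'src="{html.escape(desktop_src)}"', f'alt="{html.escape(alt)}"']
    if mobile_src is not None and mobile_width is not None:
        # Width descriptors (w) tell the browser each candidate's pixel width.
        srcset = f"{mobile_src} {mobile_width}w, {desktop_src} {desktop_width}w"
        attrs.append(f'srcset="{html.escape(srcset)}"')
    return "<img " + " ".join(attrs) + ">"


print(img_with_srcset("img/desktop/koda.jpg", 800, "KODA House",
                      "img/mobile/koda.jpg", 640))
```

With no sizes attribute the browser assumes the image spans the full viewport width (100vw) when choosing a candidate, so a narrow device should pick the smaller file. The src attribute remains as the fallback for browsers that do not understand srcset.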
2018/10/24: Crawl Efficiency and Split Signals
It is very important for crawling efficiency to reduce duplicate content according to Bing's @CoperniX #smx
Don't have lots of useless pages, 404 pages, etc. secondary files like JS, CSS, etc. m dot URLs are essentially duplicate URLs and impacts crawl budget.
To which I responded:
mm, I can't agree that m. pages are (always) dupes. For my key site [EOU] I have a pre-trimmed (but still responsive) experience for smaller devices on more expensive and higher-latency connections.
Then @CoperniX said:
Generally speaking, having both www. and m. means you split signal and crawlers have to crawl both URLs, which is suboptimal of the URL structures are otherwise the same. We do not recommend it but like everything on the web YMMV and it could still work for you.
I countered with:
... Note that the intention is to improve UX (eg everything key is fetched in first round-trip) for network/device constrained users, but the www. page is in each case marked link rel=canonical, and the m. as rel=alternate media=" ... max-width:640px)"
To which @CoperniX replied:
If you canonicalize m. to www. then you mitigate most of the signal split and a good part of the double crawl.
Phew! I'd rather optimise for the end user than the crawler...
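For illustration, the pair of head links discussed in that exchange could be generated along these lines. This is a sketch only: the function name head_links and the example.org hosts are hypothetical, and the media query is a required parameter because the full query used on the site is not shown above. The value in the usage line is a simple stand-in.

```python
import html


def head_links(path, media,
               www_base="http://www.example.org",
               m_base="http://m.example.org"):
    """Return link tags marking www. as canonical and m. as the alternate."""
    canonical = html.escape(www_base + path)
    alternate = html.escape(m_base + path)
    return [
        f'<link rel="canonical" href="{canonical}">',
        f'<link rel="alternate" media="{html.escape(media)}" href="{alternate}">',
    ]


# The media value here is a stand-in, not the site's actual query.
for line in head_links("/note-on-site-technicals.html", "(max-width: 640px)"):
    print(line)
```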
2018/10/23: RPi3 App Inventory
I am taking the opportunity of the (re)build to construct an application inventory for EOU and other uses. This way apps that aren't actually used get implicitly 'garbage collected'... I also get to discover which scripts and so on fail badly when an expected app is missing. I can then choose to install the app or fix the script not to need it.
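One way to find missing apps before a script fails badly is to check the inventory against the PATH up front. A minimal sketch, in which the function name missing_apps and the app names in the usage line are hypothetical:

```python
import shutil


def missing_apps(required, which=shutil.which):
    """Return the sorted names of required commands not found on the PATH.

    The lookup function is injectable so the check can be tested
    without depending on what happens to be installed.
    """
    return sorted(name for name in set(required) if which(name) is None)


# Hypothetical inventory; substitute whatever the scripts actually call.
print(missing_apps(["convert", "gzip", "some-tool-not-installed"]))
```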
2018/10/13: Preparing for the Raspberry Pi 3 Upgrade
Thinking about the logistics of bringing up the new RPi3 server, it seems to me that it will be tricky enough restoring the current capabilities (eg working out all the packages to reinstall and upgrade) and getting the new networking right (doing without the existing router to save ~8W and some downtime).
So the desired changes that provoked the upgrade (supporting HTTPS and HTTP/2, and probably Brotli) will have to wait until everything existing is back to where it was (though a little faster). It would be good even before then to switch on BBR and tcp_notsent_lowat etc, which is apparently essential to make HTTP/2 work well. (2018/10/22: I have added the BBR settings for TCP.)
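Whether those TCP settings are live on a Linux box can be checked by reading them back from /proc/sys. A minimal sketch, assuming the standard sysctl names net.ipv4.tcp_congestion_control and net.ipv4.tcp_notsent_lowat; the function names are hypothetical, and no particular tcp_notsent_lowat value is implied.

```python
from pathlib import Path

TCP_SETTINGS = (
    "net.ipv4.tcp_congestion_control",  # expect "bbr" once BBR is enabled
    "net.ipv4.tcp_notsent_lowat",
)


def read_sysctl(name, root="/proc/sys"):
    """Read one sysctl value via its /proc/sys path, or None if unavailable."""
    path = Path(root, *name.split("."))
    try:
        return path.read_text().strip()
    except OSError:
        return None


def tcp_tuning_report(root="/proc/sys"):
    """Map each TCP setting of interest to its current value (or None)."""
    return {name: read_sysctl(name, root) for name in TCP_SETTINGS}


for name, value in tcp_tuning_report().items():
    print(name, "=", value)
```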
I have ordered storage for the new RPi3. 256GB of fast non-volatile storage in the size of a fingernail, around £60 retail, and requiring tiny amounts of power. When I tell the kids these days that my entire university's storage in 1986 was 1.5GB they don't believe me!