Earth Notes: On Website Technicals (2023-09)
Updated 2024-02-03 11:55 GMT. By Damon Hart-Davis.
2023-09-28: Raspberry Pi 5 Good Off-Grid?
The RPi5 is nearly on the market. It seems to have an RTC built-in, which means that I might be able to do without HAT(s). But what about its minimum (eg idle) consumption: will it be low enough for my off-grid server when I replace my RPi3B?
According to Jeff Geerling, idle consumption of a Raspberry Pi 5 Model B Rev 1.0 is 1.8W on an 8GB device running Debian GNU/Linux 12 (bookworm) aarch64 with kernel 6.1.32-v8+.
(Subsequent measurements suggest 1.5x to 2.x RPi3B idle consumption, so a little over 2W. This may drop a little with firmware updates, as happened with the RPi4.)
So the RPi5 may still be good for off-grid, but with a lot more oomph on demand (~8W more CPU power).
Indeed I may be able to burn a few watts of excess solar when the battery is full, catching up on expensive stored work!
reutils V1.2.2
The BMRS system seems to be glitching more often. So reutils V1.2.2 avoids tooting (ie posting to Mastodon) if it does not have fresh data. That should eliminate low-value toots of the form:
National Grid CO2 intensity probably average so don't run big appliances now if you can wait.
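A minimal sketch of that kind of freshness guard follows. It is not the reutils code: the function name `should_toot` and the one-hour `MAX_DATA_AGE` threshold are hypothetical, chosen only to illustrate the idea of staying silent when the newest data sample is missing or too old.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical threshold: data older than this is treated as stale.
MAX_DATA_AGE = timedelta(hours=1)

def should_toot(latest_sample_time, now, max_age=MAX_DATA_AGE):
    """Return True only if the newest data sample is recent enough to post about."""
    if latest_sample_time is None:
        # No data at all: nothing useful to say.
        return False
    age = now - latest_sample_time
    # Reject stale data, and also timestamps that claim to be in the future.
    return timedelta(0) <= age <= max_age

if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(should_toot(now - timedelta(minutes=10), now))  # fresh sample
    print(should_toot(now - timedelta(hours=3), now))     # stale sample
```

With a guard like this, a feed outage results in no toot at all rather than a vague "probably average" one.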
2023-09-25: AdSense GDPR Messages
Recently, Google AdSense started insisting that I set up a GDPR cookie message, even though I had turned off ad personalisation and, so far as I can tell, I do not do anything requiring permission myself.
So I set up the Google-provided mechanism ~2023-09-07.
One slight bonus is that now Google is allowing me explicitly to shut out third-party providers making dubious claims to legitimate interest in tracking me across multiple sites, establishing my inside-leg measurement, etc. At this stage excluding everything but non-personalised ads via Google itself seems not to have diminished my tiny revenue. And I have now removed JavaScript from essentially all pages other than the few with ads, so those are the only places a cookie-consent pop-up can even appear.
Google continues to complain that I'm now asking for permission but not doing ad personalisation. Well duh...
Anyhow, the stats so far are interesting even though clouded a bit with noise from me setting things up and testing.
Messages shown | EEA and UK traffic rate | Consent rate |
---|---|---|
259 | 85% | 58% |
The consent rate has been fairly stable and higher than I expected. I have a prominent "I do not consent" button to make it easy to say no. Saying no simply means getting no ads at all (in the UK and EU) I think.
Immutable
I am reinstating the immutable directive for Cache-Control response headers for objects under /img/. Firefox and Safari, and thus ~22% of users according to Can I Use, may benefit.
Once an object exists under /img/ it may be removed. It may also be replaced with a functionally/logically identical smaller object, ie it may get optimised, but should not otherwise change. This seems a good use of immutable.
But I think that I can safely remove the public directive, having re-read specs.
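As a rough illustration of the policy, here is a sketch of choosing a Cache-Control value by path. The function name `cache_control_for` and the one-year default lifetime are assumptions for the example; the real site configuration is not shown in these notes and is probably done in the Web server rather than in code like this.

```python
# Hypothetical default lifetime of one year, in seconds.
ONE_YEAR_S = 365 * 24 * 60 * 60

def cache_control_for(path, max_age_s=ONE_YEAR_S):
    """Build a Cache-Control header value for the given URL path."""
    directives = [f"max-age={max_age_s}"]
    # Objects under /img/ should never change in any way that matters,
    # so clients that understand 'immutable' can skip revalidation.
    if path.startswith("/img/"):
        directives.append("immutable")
    # No 'public' directive: an explicit max-age on an ordinary
    # unauthenticated response already lets shared caches store it.
    return ", ".join(directives)

if __name__ == "__main__":
    print(cache_control_for("/img/tools-800w-JA.jpg"))
    print(cache_control_for("/index.html"))
```

Clients that do not understand immutable simply ignore it and fall back to max-age, so adding it should be harmless for them.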
2023-09-22: copyrightNotice and Other Metadata
I have added a copyrightNotice to every page, wrapping the existing copyright statement.
I have manually added an archivedAt metadata link to the home page, pointing to the offline ZIP archive.
I have selectively added a contentReferenceTime to a couple of pages where it seemed relevant: there may be more. It may also be sensible to automate insertion of this in some cases, maybe even in dated entries such as this, and ins?
2023-09-18: Mastodon vs Twitter and og:image
In the olden days, Twitter strongly encouraged you to add an appropriate meta name="og:image" tag to your HTML page header. This allowed Twitter (and other systems) to show a thumbnail of the specified image in your tweet, which improved engagement.
Twitter would fetch that image once (or a small number of times) when the tweet was initially made, and occasionally thereafter, and scale it to show to clients in the tweet that they saw.
Google's instrumentation, eg for AMP, had/has some strong views on the minimum number of pixels that should be in such images, in part to enable a good experience on high-resolution desktops and mobiles based on some related uses of that image.
So I have tended to pick a decent ~1000px image and let my static site builder extract a letter-box clip for the page hero. That hero is made to weigh in at under 40kB. The original weighing an order of magnitude more does not matter to viewers of the Web page; they may never get to see it.
Under the Mastodon / ActivityPub / Fediverse distributed model, even with only a few hundred followers, posting a link to an EOU page causes tens to hundreds of requests for the page HTML and the og:image, which often dwarfs the page in size, again by an order of magnitude.
So in a couple of significant cases, such as the grid intensity page which is posted a lot, I have made efforts to shrink the og:image (see yesterday) and/or replace the full image with a hand-crafted letterbox image. Significance was judged by turning up in the top-50-ish results from yesterday's bandwidth-hog measurements.
Unless thought about, this risks DDoSing EOU whenever I toot a link to it! (Conversely, it means that distributed stress-testing on demand just became easier for me, when I want it...)
For example, one automated carbon intensity post resulted in 70 fetches of the og:image within 7 minutes, all but 2 within the first 2 minutes. The EOU account on which the toot was posted has 311 followers currently. The image's cache life (Cache-Control max-age) is ~1 year. All of the GET requests were 200 (none 304). The previous toot with the og:image was ~1 hour before, and the image had not changed in that time.
2023-09-17: Bandwidth Hogs
I think that I have done this before, but I cannot find my previous attempt if so...
I have put together a small script that examines site logs for a week and totals up count and total bytes per EOU site object, looking for bandwidth sucks.
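The actual script is not reproduced here, but the general approach can be sketched as below, assuming Common/Combined Log Format access logs. The regular expression, the function name `tally` and the sample log lines are all illustrative assumptions rather than the real thing.

```python
import re
from collections import defaultdict

# Picks out the request path, status and byte count from a
# Common/Combined Log Format line (assumed log layout).
LOG_RE = re.compile(r'"(?:GET|HEAD|POST) (\S+) [^"]*" (\d{3}) (\d+|-)')

def tally(lines):
    """Return (path, [hits, bytes]) pairs, biggest byte total first."""
    totals = defaultdict(lambda: [0, 0])
    for line in lines:
        m = LOG_RE.search(line)
        if not m:
            continue  # Skip malformed or unexpected lines.
        path, _status, size = m.groups()
        entry = totals[path]
        entry[0] += 1
        # A '-' byte count means no body was sent.
        entry[1] += 0 if size == "-" else int(size)
    return sorted(totals.items(), key=lambda kv: kv[1][1], reverse=True)

if __name__ == "__main__":
    sample = [
        '192.0.2.1 - - [17/Sep/2023:10:00:00 +0000] "GET /a.flac HTTP/1.1" 200 1000 "-" "bot"',
        '192.0.2.2 - - [17/Sep/2023:10:00:01 +0000] "GET /index.html HTTP/1.1" 200 300 "-" "browser"',
        '192.0.2.1 - - [17/Sep/2023:10:00:02 +0000] "GET /a.flac HTTP/1.1" 200 1000 "-" "bot"',
    ]
    for path, (hits, nbytes) in tally(sample):
        print(nbytes, hits, path)
```

Sorting by total bytes rather than by hit count is what surfaces a handful of large downloads ahead of frequently polled small files.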
In principle that might help me minimise/optimise objects that are the targets of (say) Mastodon / ActivityPub mass fetches when a page link is posted.
In practice I see that the top entries are mainly FLAC files that are probably being downloaded by greedy and stupid spiders.
The very top item, responsible for ~500MB last week (168 downloads), was a version of the video of me digging compost with toddler commentary.
In at number 6 is ~130MB in three downloads of a rather obscure data archive, again unlikely to have been a human driving!
In at number 9 is ~110MB for ~1500 downloads of the energy series dataset page, which suggests that many of them are being fetched without any compression:
407398 energy-series-dataset.html
23461 energy-series-dataset.htmlbr
31308 energy-series-dataset.htmlgz
(Indeed uncompressed transfers seem mainly to be Akkoma Fediverse bots and some Pleroma...)
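As a back-of-envelope check on that inference, here is a sketch that estimates what share of downloads would need to be uncompressed to produce the observed total. It assumes 1MB is 10^6 bytes, that every download was a full 200 response, and that each was either the raw file or one single compressed variant; none of that is exactly true, and the ~110MB and ~1500 figures are themselves approximate.

```python
def uncompressed_fraction(total_bytes, downloads, raw_size, compressed_size):
    """Estimate the share of downloads served uncompressed.

    Solves mean = f * raw + (1 - f) * compressed for f, where mean is
    the average bytes per download.
    """
    mean = total_bytes / downloads
    return (mean - compressed_size) / (raw_size - compressed_size)

# File sizes in bytes, from the listing above.
RAW, BR, GZ = 407398, 23461, 31308

# Approximate weekly figures from the logs: ~110MB over ~1500 downloads.
vs_br = uncompressed_fraction(110e6, 1500, RAW, BR)
vs_gz = uncompressed_fraction(110e6, 1500, RAW, GZ)

if __name__ == "__main__":
    print(f"if the rest were brotli: {vs_br:.0%}")
    print(f"if the rest were gzip:   {vs_gz:.0%}")
```

Under those assumptions something like 11% to 13% of fetches, roughly one in eight, would be uncompressed. The average of ~73kB per download is well above either compressed size, which is what points to uncompressed transfers in the first place.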
In at number 16 is ~60MB for ~5000 polls of the podcast RSS file.
In at number 22 is ~45MB for ~3200 downloads of the banner image that goes in automated grid-intensity posts to Mastodon. So I have made as tight a .webp version as I can, a bit lossy, and smaller than the previous 8006 bytes. This should save bandwidth for browsers, but possibly not for the Fediverse bots fetching the original.
cwebp -v -m 6 -near_lossless 1 img/grid-demand-curves/gCO2perkWh-1.png -o img/grid-demand-curves/gCO2perkWh-1.png.webp
8845 img/grid-demand-curves/gCO2perkWh-1.png
4522 img/grid-demand-curves/gCO2perkWh-1.png.webp
I did another round of byte squeezing, the .png again with TinyPNG and zopflipng -m -m -m, and re-created the .webp:
7535 img/grid-demand-curves/gCO2perkWh-1.png
4478 img/grid-demand-curves/gCO2perkWh-1.png.webp
In other news
While messing around I found that I had left some bytes on the table elsewhere:
% ls -al img/tools-800w-JA.jpg*
31421 img/tools-800w-JA.jpg
11321 img/tools-800w-JA.jpg.avif
1708 img/tools-800w-JA.jpg.avifL
20021 img/tools-800w-JA.jpg.jxl
5104 img/tools-800w-JA.jpg.jxlL
10791 img/tools-800w-JA.jpgL
145 img/tools-800w-JA.jpg.txt
24608 img/tools-800w-JA.jpg.webp
3844 img/tools-800w-JA.jpg.webpL
% script/lossless_JPEG_compress img/tools-800w-JA.jpg
INFO: file 31421 shrunk to 31415 (non-progressive) img/tools-800w-JA.jpg
INFO: file 31421 shrunk to 30661 (semi-progressive) img/tools-800w-JA.jpg
A ~2.5% lossless saving on a common hero image... (The .jpgL had a little bit of fat in it also.)
IPTables to the rescue
While trying to do some of that optimisation, my logs were being sullied by several bogus repeated requests per second from a UAE data centre IP, which I blocked entirely in iptables.
2023-09-02: Crashy Server
For some reason sencha is now crashing every few days.
One crash while capturing monthly data archives, just after revision 53008, lost one of the less important monthly data sets (CPU temperature: data/RPi/cputemp/202308.log), and got the SVN client side in a severe tangle that took some hours to undo.
We will never talk about revisions 53009 to 53012 again, OK?
Things were largely sorted by revision 53019, though some clean-up is still needed of log files damaged during the crash.
(Corruption clean-up in aisle 53038...)
Maybe time for a server upgrade to a much newer kernel?