Earth Notes: On Website Technicals (2022-09)

Updated 2022-10-09 20:16 GMT.
By Damon Hart-Davis.
Tech updates: TX down, spelingg, bad words, structured data.
Out of holiday slumber and into an energy-expensive winter for Europe, and visitor stats perking up.

2022-09-23: Structure, Schema and Search

It is interesting to see several separate items of structured data, from HTML itself to schema.org microdata, feed into a Google search result (for ZWF01483W).

This is for the page On the Zanussi ZWF01483W front-loader washing machine: Review.

Screenshot 20220923 ZWF01483W EOU result in Google search with schema structured data features
Screenshot from Google search showing enhancements based on various types of structured data in the page, including the thumbnail, star rating and price, (trimmed) title, bullet list converted to a snippet phrase, etc.

2022-09-18: Fewer Spellings

I am slowly working through hundreds of pages to add spelling exceptions. The number of spellings in the EOU global 'personal' dictionary has passed 500. The spell-checker has to load and parse that dictionary for every page, which costs time, so I'm starting to trim entries from it that can be better dealt with by per-page exceptions, or by use of code or q wrappers. The entries that I am removing are command and package names, ie text that belongs in 'code'.

I have considered automating this process to simply accept the current list of 'bad' words for each page. One complication is that the Linux and macOS dictionaries do not quite fully overlap.

There are reasons to take this slowly and look at each page on its merits:

  • There are still some typos and more subtle spelling errors lurking, which I get a chance to deal with well.
  • I'm learning some new spellings!
  • I'm working out how to best use code and q wrappers.
  • There are more subtle things going on that I will otherwise miss.
  • There is a chance to review and refresh and improve each page as I go.

2022-09-14: More Spelling

I have enhanced the spell check to ignore within-line code and q (quote) blocks, and blockquote also. I can safely quote eg US English without being dinged for US spelling.
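As a rough sketch of how such filtering might be done ahead of hunspell (not necessarily how the EOU script does it): the function name strip_inline is hypothetical, and it assumes that each code or q element opens and closes on one line without nesting, and that blockquote tags start at the beginning of a line.

```shell
#!/bin/sh
# Hypothetical helper: filter HTML on stdin so that quoted or code text
# is never seen by the spell-checker.
# Assumptions: <blockquote> and </blockquote> start a line; each
# <code>...</code> and <q>...</q> sits on one line and is not nested.
strip_inline() {
    sed -e '/^<blockquote[> ]/,/^<\/blockquote>/d' \
        -e 's/<code[^>]*>[^<]*<\/code>//g' \
        -e 's/<q[^>]*>[^<]*<\/q>//g'
}

# Example: the US spelling inside <q> is dropped before checking.
printf '%s\n' 'He wrote <q>color</q> in <code>advdef</code> notes.' | strip_inline
```

The output of this could then be piped into hunspell in the same way as the pre-stripping sed in script/spellb.sh.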

I have now added a mechanism to generate a build warning if a page has remaining 'bad' words that do not contain a digit or uppercase letter, and are not in an exceptions list for the page.

For popular pages such a warning will prevent an updated version of the page from being built and published until the warnings are cleared.
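The selection rule for warnings can be sketched as a small filter. This is an illustration only: the function name warn_words and the one-word-per-line exceptions file format are assumptions, not the actual EOU mechanism.

```shell
#!/bin/sh
# Hypothetical filter: read candidate 'bad' words on stdin, one per line,
# and print only those that would justify a build warning.
# $1 is a per-page exceptions file, one word per line (assumed format).
warn_words() {
    # Drop words containing a digit or uppercase letter (eg 4GHz, EOU),
    # then drop words listed exactly in the exceptions file.
    grep -v '[[:digit:][:upper:]]' | grep -v -x -F -f "$1"
}
```

Given an exceptions file listing stdout and advdef, feeding in the words 4GHz, EOU, macOS, spelingg, stdout and advdef should leave only spelingg to warn about.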

Dic Size

The hunspell en-GB dictionary installed on my Mac appears to be a little larger than that on my RPi:

3090   en_GB-large.aff
881592 en_GB-large.dic
en_GB.aff -> en_GB-large.aff
en_GB.dic -> en_GB-large.dic
 28123 en_GB.aff
685158 en_GB.dic

That means a few spellings are accepted on the Mac but rejected on the RPi. (Though there are a few differences the other way too.) That makes for an extra round-trip to fully resolve exceptions, etc.
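One way to see the mismatch directly might be to save the flagged-word list from each machine and compare them. A minimal sketch, assuming each list is a plain file with one word per line; the function name only_one_side is hypothetical.

```shell
#!/bin/sh
# Hypothetical comparison of two flagged-word lists (one word per line),
# eg one generated on the Mac and one on the RPi.
# Prints each word that is flagged in only one of the two files.
only_one_side() {
    awk 'FILENAME == ARGV[1] { a[$0] = 1; next }
         { b[$0] = 1 }
         END {
             for (w in a) if (!(w in b)) print "only in first: " w
             for (w in b) if (!(w in a)) print "only in second: " w
         }' "$1" "$2" | sort
}
```

Words reported on only one side are the ones that need either a fix or an exception before both machines agree.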

2022-09-07: Spelling

I now have a command-line tool that I am happy with. It does en-GB spelling, runs on both Raspbian and macOS, and is tweaked for EOU, in script/spellb.sh:

#!/bin/sh
# Apply British spellcheck to HTML file on stdin.
# Removes all top-level <pre> blocks.

sed '/^<pre[> ]/,/^<\/pre>/d' |
    hunspell -d en_GB -H -l | \
    sort -u | fmt -60 | awk '{print "INFO: spell: "$0}'
exit 0

I'm happy with stripping out the pre (usually code) blocks!

I folded this into the makefile for when building each desktop top-level page, so it prints on stdout at the end of the page build and gets into the .info log file.

% make note-on-site-technicals-64.html
Rebuilding note-on-site-technicals-64.html
INFO: Fake lockfile -r 0 -l 7211 note-on-site-technicals-64.html.lock
INFO: using .work/archive/poparts/20220517-populararts.txt in place of missing/empty .work/tmp/populararts.txt
INFO: ***** changes not committed to SVN for .note-on-site-technicals-64.html! *****
INFO: dateModified is file modification time not repo commit time for .note-on-site-technicals-64.html.
INFO: .note-on-site-technicals-64.html description       35, less than suggested 70.
INFO: .note-on-site-technicals-64.html primary hero 800w less than recommended (AMP minimum) 1200w: img/tools-800w-JA.jpg.
INFO: .note-on-site-technicals-64.html primary hero 160000 pixels area less than 300000 and recommended 800000: img/tools-800w-JA.jpg.
INFO: ADS ALLOWED: .note-on-site-technicals-64.html not popular but recently updated.
INFO: .note-on-site-technicals-64.html share42 hash tags= #tech
INFO: note-on-site-technicals-64.html readability score 63; min 42.
INFO: spell: 300mW 4GHz 5GHz AdvanceCOMP EOU Fi Morningstar OptiPNG PNG
INFO: spell: PNGOUT Raspbian Technicals Wi advdef macOS makefile spelingg
INFO: spell: spellb stdout ve zopflipng

Then a bit of magic/ugly lets me look at all potential misspellings site-wide:

% cat .build/* | egrep '^INFO: spell: ' | sed -e 's/^INFO: spell: //' | fmt -1 | sort -u | fmt

Or more directly and comprehensively (and still reasonably quickly):

% cat .*.html | sh script/spellb.sh

So I fixed a bunch of tpxing errers across ~25 files last night, and gave myself a sore back because I was so into it!

I'm still getting a buzz several days on, incrementally cleaning up spelling errors that have lurked for a long time.

I have also been adding 'specialist' words to an exceptions dictionary (eg FUELINST, DNO, kW, etc) to reduce the noise level a bit.
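To pick candidates for such an exceptions dictionary, it may help to count how many pages each flagged word appears on. A rough sketch, assuming one build log per page containing 'INFO: spell: ' lines in the format shown above; the function name spell_word_counts is hypothetical.

```shell
#!/bin/sh
# Hypothetical: for the build logs given as arguments, print each flagged
# word with the number of logs (pages) it appears in, most common first.
spell_word_counts() {
    for f in "$@"; do
        # One word per line, each word counted at most once per log.
        sed -n 's/^INFO: spell: //p' "$f" | tr -s ' ' '\n' | sort -u
    done | sort | uniq -c | awk '{print $1, $2}' |
        LC_ALL=C sort -k1,1nr -k2,2
}
```

Words flagged on many pages are plausible entries for a shared dictionary, while words seen on a single page may be better handled on that page alone.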

2022-09-04: TX Power

In an attempt to save a tiny bit more energy, I have turned down the TX power of the 5GHz channel on the WiFi router (to 60% for now), and I have also set the channel to go off at night; most of our devices and computers connect on 2.4GHz anyway.

It is really difficult to measure if this is making any difference by observing reported power draw via the Morningstar while the router is a dump load, partly because demand fluctuates a lot anyway, but savings are maybe ~300mW from powering down 5GHz entirely.
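To put ~300mW in context, the arithmetic for energy per year is simple. A small sketch, where the function name annual_kwh and the hours-per-day parameter are illustrative assumptions rather than measured values:

```shell
#!/bin/sh
# Hypothetical helper: energy in kWh/year for a constant saving of
# $1 watts sustained for $2 hours each day.
annual_kwh() {
    awk -v w="$1" -v h="$2" 'BEGIN { printf "%.2f\n", w * h * 365 / 1000 }'
}

# 300mW saved around the clock: 0.3 * 24 * 365 / 1000 kWh/year.
annual_kwh 0.3 24
```

Saved around the clock that would come to about 2.63kWh/year, and proportionally less if 5GHz is only off for part of each night.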

Nothing is standing out on the off-grid chart in terms of savings. Maybe I'm just cleaning up the airwaves a bit a few hours per night!

2022-09-07: Lower, Bruce!

Given no obvious problems so far I'm going to knock 5GHz TX power down to 30%, and 2.4GHz down to 80%...

...

And at least from my Mac things are still working. I will keep ears open for complaints!

Flaky

2022-09-13: the 'temporary' WiFi connection to the Thermino RPi (pekoe) seems to be more flaky than usual, so I have pushed 2.4GHz TX power back up to 100% to see if that helps.
