Posted by: Stan Brown (the_stan_bro…@fastmail.fm)
Date: Thu, 15 Oct 2020
I'm trying, and failing, to write the proper charset in my meta tag.
I have this line in the <head> of my Web pages:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
But perfectly decent characters like é, ×, ² show up as a question
mark in a lozenge. I figured out that that's because my HTML files
are all plain text, 8 bits per character, which is not UTF-8 when I
use characters above 127.
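A minimal Python sketch of that mismatch, assuming the files on disk are ISO-8859-1 as described. Python and the sample string are only for illustration; neither comes from the toolchain described in this post.

```python
# Sample text only; any character in the 160-255 range behaves the same way.
sample = "café × m²"

latin1_bytes = sample.encode("latin-1")  # one byte per character
utf8_bytes = sample.encode("utf-8")      # two bytes each for é, × and ²

# Roughly what happens when a page declares charset=utf-8 but the bytes
# are Latin-1: each stray high byte is not a valid UTF-8 sequence, so it
# is shown as U+FFFD, the replacement character.
misread = latin1_bytes.decode("utf-8", errors="replace")

print(len(latin1_bytes), len(utf8_bytes))
print(ascii(misread))  # ascii() keeps the output printable on any console
```

The byte counts differ because UTF-8 needs two bytes for each character in the 128-255 range, while Latin-1 always uses one.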
So I changed the charset to latin-1, and then to iso-8859-1. With
each of them, characters 160-255 display correctly, but the W3C's
validator gives this error message:
Bad value “text/html; charset=iso-8859-1” for attribute
“content” on element “meta”: “charset=” must be followed by “utf-8”
So what charset should I use to represent a file where every
character is 8 bits, and those 8 bits match the iso-8859-1 or latin-1
character set?
To make things even more murky, at
I found this gem: "If the attribute is present, its value must be an
ASCII case-insensitive match for the string "utf-8", because UTF-8 is
the only valid encoding for HTML5 documents."
If that's true, it sounds very much like I can't generate my web
pages unless I code every 160-255 character as a six-byte &#nnn;
string, which is not only a pain but makes editing harder.
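For reference, the &#nnn; form can be generated mechanically rather than typed by hand. This is a rough Python sketch, assuming the input bytes are ISO-8859-1; the function name to_char_refs is made up for illustration.

```python
def to_char_refs(latin1_bytes: bytes) -> bytes:
    """Return pure-ASCII bytes with every character above 127 written as &#nnn;."""
    # Decoding as Latin-1 never fails: every byte maps to one character.
    text = latin1_bytes.decode("latin-1")
    # "xmlcharrefreplace" is a standard Python codec error handler that
    # substitutes a decimal numeric character reference for each
    # character the target encoding cannot represent.
    return text.encode("ascii", errors="xmlcharrefreplace")


encoded = to_char_refs(b"caf\xe9 \xd7 m\xb2")
print(encoded)  # b'caf&#233; &#215; m&#178;'
```

The result is plain ASCII, which is valid UTF-8 as well, so it would satisfy a charset=utf-8 declaration. It does not help with the editing problem, since the source would still need to be kept in readable form somewhere.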
(I tried looking at character encodings in Vim, and indeed it does
have a utf-8 option, but after I do my editing I run all my pages
through a very complicated awk script, and it looks like awk can't
handle UTF-8, at least not in Windows.)
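One possible workaround, sketched here as an assumption and not something tested against this setup: keep editing and running awk in ISO-8859-1, then transcode the finished pages to UTF-8 as a last step. The function names and the file names in the usage note are hypothetical.

```python
from pathlib import Path


def latin1_to_utf8(data: bytes) -> bytes:
    """Reinterpret ISO-8859-1 bytes as text and re-encode that text as UTF-8."""
    return data.decode("latin-1").encode("utf-8")


def convert_file(src: str, dst: str) -> None:
    """Transcode one file; src and dst are caller-supplied paths."""
    # Bytes are read and written directly so that line endings pass
    # through unchanged, which matters on Windows.
    Path(dst).write_bytes(latin1_to_utf8(Path(src).read_bytes()))
```

Usage would look like convert_file("page-latin1.htm", "page.htm"), run on each page after the awk step. Running it twice on the same file would garble the text, because the second pass would treat the UTF-8 bytes as Latin-1.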
Stan Brown, Tehachapi, California, USA
HTML 4.01 spec: http://www.w3.org/TR/html401/
CSS 2.1 spec: http://www.w3.org/TR/CSS21/
Why We Won't Help You: http://preview.tinyurl.com/WhyWont