
The 256 or 512 most important Unicode glyphs for console fonts



On Thu, 8 Nov 2001, David Starner wrote:
> Give me a list of characters (< 256) that
> en_*.UTF-8 should support, and a consensus, then we can start
> claiming they're broken and fixing them.

I have some to offer:

$ uniset + CP1252.TXT + 8859-2.TXT - 00a0-00a0 - 00ad-00ad - 0200-02ff \
- 2013-2014 - 2026-203a compact nr

# Plane 00
# Rows  Positions (Cells)

  00    20-7E A1-AC AE-FF
  01    02-07 0C-11 18-1B 39-3A 3D-3E 41-44 47-48 50-55 58-5B 5E-65 6E-71
  01    78-7E 92
  20    18-1A 1C-1E 20-22 AC
  21    22

# Number of characters in above table: 256
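
For anyone who wants to check or reuse such a table without uniset, here
is a minimal Python sketch that expands the compact row/cell notation
into a set of code points. The function name expand_table and the
assumption that all rows belong to plane 00 are mine, not part of
uniset.

```python
def expand_table(text):
    """Expand a compact row/cell table into a set of code points."""
    code_points = set()
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and '#' comment lines.
        if not fields or fields[0].startswith("#"):
            continue
        row = int(fields[0], 16)  # high byte of the code point (plane 00 assumed)
        for cell in fields[1:]:
            first, _, last = cell.partition("-")
            lo = int(first, 16)
            hi = int(last, 16) if last else lo
            code_points.update((row << 8) | c for c in range(lo, hi + 1))
    return code_points


TABLE = """
  00    20-7E A1-AC AE-FF
  01    02-07 0C-11 18-1B 39-3A 3D-3E 41-44 47-48 50-55 58-5B 5E-65 6E-71
  01    78-7E 92
  20    18-1A 1C-1E 20-22 AC
  21    22
"""

glyphs = expand_table(TABLE)
print(len(glyphs))        # 256
print(0x20AC in glyphs)   # True: EURO SIGN is in the set
print(0x00A0 in glyphs)   # False: NBSP was subtracted
```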

Note that this is a glyph collection and that other important Unicode
characters (such as HYPHEN, MINUS, NBSP, SOFT HYPHEN, EN DASH, etc.) all
have homoglyphs in the above sets that are not worth distinguishing in a
monospaced console font.
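
One way a console could exploit such homoglyphs is a small fallback
table that is consulted before a glyph is looked up. The sketch below is
only an illustration: the particular pairings and the names HOMOGLYPHS,
glyph_for and to_console are my assumptions, not an existing console
API.

```python
# Hypothetical fallback table: characters left out of the glyph set,
# each mapped to a look-alike that is in the set.
HOMOGLYPHS = {
    0x00A0: 0x0020,  # NO-BREAK SPACE -> SPACE
    0x00AD: 0x002D,  # SOFT HYPHEN    -> HYPHEN-MINUS
    0x2010: 0x002D,  # HYPHEN         -> HYPHEN-MINUS
    0x2013: 0x002D,  # EN DASH        -> HYPHEN-MINUS
    0x2212: 0x002D,  # MINUS SIGN     -> HYPHEN-MINUS
}


def glyph_for(code_point):
    """Return the code point whose glyph should be drawn."""
    return HOMOGLYPHS.get(code_point, code_point)


def to_console(text):
    """Replace every character that has a homoglyph by that homoglyph."""
    return "".join(chr(glyph_for(ord(ch))) for ch in text)


print(to_console("a\u2212b\u00a0\u2013\u00a0c"))  # a-b - c
```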

If you prefer to support MS-DOS (sensible idea for a console environment
with block graphics tools!), Welsh and Turkish (both languages that
almost fitted into Latin-1, but not quite), how about:

$ uniset + CP1252.TXT + CP850.TXT + 8859-9.TXT - 00a0-00a0 - 00ad-00ad \
+ 0174-0178 compact nr

# Plane 00
# Rows  Positions (Cells)

  00    20-7E A1-AC AE-FF
  01    1E-1F 30-31 52-53 5E-61 74-78 7D-7E 92
  02    C6 DC
  20    13-14 17-1A 1C-1E 20-22 26 30 39-3A AC
  21    22
  25    00 02 0C 10 14 18 1C 24 2C 34 3C 50-51 54 57 5A 5D 60 63 66 69
  25    6C 80 84 88 91-93 A0

# Number of characters in above table: 256
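
The set arithmetic behind the second command can be approximated in
Python, using the interpreter's built-in codecs in place of the mapping
files. This is a rough sketch, not a description of how uniset works: it
assumes that Python's cp1252, cp850 and iso8859_9 codecs agree with
CP1252.TXT, CP850.TXT and 8859-9.TXT, and that control characters are to
be left out. The helper name repertoire is hypothetical.

```python
import unicodedata


def repertoire(codec):
    """Printable code points reachable from single bytes in a Python codec."""
    chars = set()
    for byte in range(256):
        # Bytes that the codec leaves undefined decode to '' here.
        ch = bytes([byte]).decode(codec, errors="ignore")
        # Skip undefined bytes and control characters (category Cc).
        if ch and unicodedata.category(ch) != "Cc":
            chars.add(ord(ch))
    return chars


# + CP1252.TXT + CP850.TXT + 8859-9.TXT
glyphs = repertoire("cp1252") | repertoire("cp850") | repertoire("iso8859_9")
# - 00a0-00a0 - 00ad-00ad
glyphs -= {0x00A0, 0x00AD}
# + 0174-0178 (the Welsh letters)
glyphs |= set(range(0x0174, 0x0178 + 1))

print(len(glyphs))  # 256
```

Under these assumptions the result has the same size as the table
above.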

The uniset tool for exploring such options quickly is available at

  http://www.cl.cam.ac.uk/~mgk25/download/uniset.tar.gz

I also think that just providing the ISO 8859 sets as console fonts
without filling in the remaining slots with CP1252 quotes and block
graphics is rather disappointing.

If you have 512 glyphs, you can do a lot:

$ uniset + CP1252.TXT + CP850.TXT + 8859-2.TXT + 8859-5.TXT + 8859-7.TXT \
+ 8859-9.TXT + 8859-10.TXT + 8859-13.TXT + 0174-0178 + 2190-2193 + fffd-fffd \
- 00a0-00a0 - 00ad-00ad compact nr

# Plane 00
# Rows  Positions (Cells)

  00    20-7E A1-AC AE-FF
  01    00-07 0C-13 16-1B 1E-1F 22-23 28-2B 2E-31 36-3E 41-48 4A-4D 50-5B
  01    5E-6B 6E-7E 92
  02    C6-C7 D8-D9 DB-DD
  03    84-86 88-8A 8C 8E-A1 A3-CE
  04    01-0C 0E-4F 51-5C 5E-5F
  20    13-15 17-1A 1C-1E 20-22 26 30 39-3A AC
  21    16 22 90-93
  25    00 02 0C 10 14 18 1C 24 2C 34 3C 50-51 54 57 5A 5D 60 63 66 69
  25    6C 80 84 88 91-93 A0
  FF    FD

# Number of characters in above table: 512
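
Since this set includes U+FFFD (REPLACEMENT CHARACTER), a renderer could
show that glyph for anything the font lacks. The following sketch is
only an assumption about how that might look; it copies just the ASCII,
Greek (row 03) and FFFD entries from the table above, and the names
RANGES, in_font and displayable are hypothetical.

```python
REPLACEMENT = 0xFFFD  # REPLACEMENT CHARACTER, the last entry in the table

# A few rows copied from the table above as (first, last) ranges.
# A real check would use the whole 512-glyph repertoire.
RANGES = [
    (0x0020, 0x007E),                    # row 00: printable ASCII
    (0x0384, 0x0386), (0x0388, 0x038A),  # row 03: Greek
    (0x038C, 0x038C), (0x038E, 0x03A1),
    (0x03A3, 0x03CE),
    (0xFFFD, 0xFFFD),                    # row FF
]


def in_font(code_point):
    """True if the code point falls into one of the listed ranges."""
    return any(lo <= code_point <= hi for lo, hi in RANGES)


def displayable(text):
    """Substitute U+FFFD for every character the font cannot show."""
    return "".join(
        ch if in_font(ord(ch)) else chr(REPLACEMENT) for ch in text
    )


print(displayable("\u03b1\u03b2\u03b3 \u6f22") == "\u03b1\u03b2\u03b3 \ufffd")  # True
```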

Just some ideas ...

Markus

-- 
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org,  WWW: <http://www.cl.cam.ac.uk/~mgk25/>

--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/