
Re: Unicode console font



"Stanislav V. Voronyi" wrote on 1999-09-13 09:30 UTC:
> 	PSF come from kbd package and has following format
> header which define size of font, 256 or 512 glyphs font and presence of SFM table
> 256 or 512 glyphs
> SFM table
> 
> 	SFM table it is very simple table which translate Unicode to 
> font position. For more information look sources of kbd package.

The limitation to 512 characters is a limitation of the VGA text mode
hardware. It can only display 512 different glyphs at the same time (or
256 glyphs, if you want to have 16 instead of just 8 foreground colors).
A glyph is selected by the 8 bits of the character byte and 1 bit of the
attribute byte of the screen buffer.
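
As a rough illustration of that selection, here is a minimal Python
sketch. It assumes the usual VGA arrangement in which bit 3 of the
attribute byte (otherwise the foreground intensity bit) picks between
the two 256-glyph banks. The function name glyph_index and its
font_512 parameter are made up for this example.

```python
def glyph_index(char_byte: int, attr_byte: int, font_512: bool = True) -> int:
    """Return the glyph slot selected by one text-mode screen cell."""
    if not 0 <= char_byte <= 0xFF or not 0 <= attr_byte <= 0xFF:
        raise ValueError("both bytes must be in the range 0..255")
    if not font_512:
        # 256-glyph mode: attribute bit 3 stays a colour bit,
        # so 16 foreground colours remain available.
        return char_byte
    # 512-glyph mode: attribute bit 3 (assumed layout) becomes the ninth
    # bit of the glyph index, leaving only 8 foreground colours.
    bank = (attr_byte >> 3) & 1
    return (bank << 8) | char_byte
```

With this layout, the cell (0x41, 0x0F) selects slot 0x141 in a
512-glyph font, but slot 0x41 in a 256-glyph font.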

The PSF format is obviously highly specialized towards the needs of
the old VGA text mode. Since the Linux console now runs in graphics
mode, maybe we should get rid of PSF and define a more powerful
format.

Features:

  - space for up to 1 million glyphs
  - efficient access path to these glyphs (i.e., something
    better than linear search, that can be accessed from a
    memory mapped file)
  - support for glyph variant options (i.e., the user can activate
    or deactivate some style options, which will influence the
    character/glyph mapping)
  - glyph variations could include bold/italic/wide/etc., they could also
    include CJK style variations, as well as more mundane things
    such as whether you want to have a slash through a zero or
    whether you want to have visible codes for the many different
    Unicode space characters (for debugging)
  - support for ligature substitution (for languages that depend
    on spacing combining characters); this means that a sequence of
    several Unicode characters can be replaced by a single wide glyph.
  - support for glyph variations such as smaller uppercase characters,
    which leave enough space on top to fit a combining character over
    them.
  - support for combining characters, not only by simple overstriking,
    but also by allowing some offset. In other words, I want the
    diaeresis over the "a" to be 2 pixels lower than over the "A",
    therefore there should be a way to add some "combining_shift(0,-2)"
    attribute to the "a" character to make this happen.
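
To make the offset idea concrete, here is a minimal Python sketch of
overstriking with a shift. Everything in it is an assumption for
illustration: the glyphs are toy 5x6 bitmaps written as strings, the
function name overstrike is invented, and the shift is read as (dx, dy)
with positive dy pointing up, so that (0,-2) lowers the mark by two
pixel rows as in the example above.

```python
def overstrike(base, mark, shift=(0, 0)):
    """OR a combining mark onto a base glyph, moved by shift=(dx, dy).

    Glyphs are lists of equal-length strings with '#' for a set pixel.
    Row 0 is the top row, so a negative dy moves the mark downwards.
    """
    dx, dy = shift
    height, width = len(base), len(base[0])
    out = [list(row) for row in base]
    for y, row in enumerate(mark):
        for x, pixel in enumerate(row):
            if pixel != "#":
                continue
            ty, tx = y - dy, x + dx
            # Pixels shifted outside the cell are silently clipped.
            if 0 <= ty < height and 0 <= tx < width:
                out[ty][tx] = "#"
    return ["".join(row) for row in out]


# Toy bitmaps, invented for this example.
DIAERESIS = [".#.#.", ".....", ".....", ".....", ".....", "....."]
SMALL_A = [".....", ".....", ".....", ".###.", "#..#.", ".####"]

# The hypothetical combining_shift(0,-2) attribute of the "a" glyph:
a_diaeresis = overstrike(SMALL_A, DIAERESIS, shift=(0, -2))
```

Without the shift the dots land in the top row of the cell, which suits
an uppercase letter. With (0,-2) they land in the third row, directly
above the body of the small letter.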

So an NCF (Next generation Console Font) file would contain
a set of glyphs, each of which comes with some attributes.
These attributes could be

  - the sets of Unicode characters that this glyph can represent
    under the condition that some style variant has been activated
  - the set of Unicode sequences that this glyph can
    represent (ligature substitution), again perhaps conditioned
    by style variants
  - shift offsets for various types of combining characters
  - etc.
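
One possible in-memory shape for such a glyph record is sketched below
in Python. This is not a proposed file layout. The class name Glyph,
its field names, the style option "slashed-zero" and the mark class
"above" are all invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Glyph:
    name: str
    width_cells: int = 1
    # Each entry is (required style options, Unicode sequence).
    # A single character is a sequence of length 1, a longer sequence
    # is a ligature substitution.
    mappings: list = field(default_factory=list)
    # Class of combining mark -> (dx, dy) pixel offset for that mark.
    combining_shift: dict = field(default_factory=dict)

    def can_represent(self, text: str, active: frozenset) -> bool:
        """True if this glyph may stand for text under the active options."""
        return any(seq == text and required <= active
                   for required, seq in self.mappings)


# Invented example records.
PLAIN_ZERO = Glyph("zero", mappings=[(frozenset(), "0")])
SLASHED_ZERO = Glyph("zero.slashed",
                     mappings=[(frozenset({"slashed-zero"}), "0")])
SMALL_A = Glyph("a", mappings=[(frozenset(), "a")],
                combining_shift={"above": (0, -2)})
```

Here the slashed zero only becomes a candidate for "0" once the user
has activated the corresponding style option, while the plain zero is
always a candidate.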

Such a format would allow very powerful Unicode Level 3 support,
even in a charcell environment like the console. The available
functionality would be way beyond what for instance BDF provides
us with. The downside would be that it would become tricky to
edit such fonts with tools like Xmbdfed, which are very much
designed with BDF's character=glyph equivalence in mind (what
we currently use in xterm and other X11 applications).

It is crucial to understand that style variants are better handled
within one single font file as opposed to having several different
font files. This is because some style options apply only to certain
parts of Unicode. For instance, the upright/italic option is of no
concern to block graphics and Han characters, while the simplified vs.
traditional Chinese style option is of no concern to the
Latin/Greek/Cyrillic characters. Style variants that are valid across
all glyphs, and that therefore should be handled by having
different fonts, include

  - glyph sizes and glyph proportions (total size, x-height, cap-size,
    descender-height, etc. must be consistent across scripts within a
    font)

  - major style variations (serif vs. sans serif)

  - different designers

A first draft for possible style options would be

  - Han style: kanji/traditional/simplified
  - weight: normal/bold
  - slant: upright/italic
  - height: normal/reduced-for-combining-accent-above
  - squeezed: normal/reduced-for-combining-accent-left-or-right
  - width: 1-charcell/2-charcells
  - appearance: normal/debugging (debugging makes sure that glyphs
    that would normally look identical look very different, e.g. the
    spaces and hyphens)

So the font file would essentially contain a large matrix (which
need not really be stored as a matrix) that lists on one axis Unicode
characters (or sequences of Unicode characters for ligature support)
and on the other axis style combinations, and then assigns to each of
these a glyph.
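
A minimal Python sketch of such a sparse matrix follows. It is only one
way to read the idea, under stated assumptions: the table contents, the
glyph names, the "missing" fallback and the two rules used here (prefer
the longest matching sequence, then the entry whose style set is the
largest subset of the active options) are all invented for
illustration.

```python
# (Unicode sequence, style combination) -> glyph name. Invented entries.
MATRIX = {
    ("0", frozenset()): "zero",
    ("0", frozenset({"slashed-zero"})): "zero.slashed",
    ("f", frozenset()): "f",
    ("i", frozenset()): "i",
    ("fi", frozenset()): "f_i.wide",
}

# Group the sparse entries by sequence so a lookup is one hash access
# plus a scan over the few style variants of that sequence.
INDEX = {}
for (seq, styles), glyph in MATRIX.items():
    INDEX.setdefault(seq, []).append((styles, glyph))
MAX_SEQ = max(len(seq) for seq in INDEX)


def lookup(seq, active):
    """Return the glyph for seq that best matches the active options."""
    candidates = [(len(styles), glyph)
                  for styles, glyph in INDEX.get(seq, [])
                  if styles <= active]
    return max(candidates)[1] if candidates else None


def shape(text, active=frozenset()):
    """Map text to glyph names, trying the longest sequence first."""
    out, i = [], 0
    while i < len(text):
        for n in range(min(MAX_SEQ, len(text) - i), 0, -1):
            glyph = lookup(text[i:i + n], active)
            if glyph is not None:
                out.append(glyph)
                i += n
                break
        else:
            out.append("missing")
            i += 1
    return out
```

For example, shape("fi0") gives the wide ligature glyph followed by the
plain zero, and activating "slashed-zero" changes only the second
glyph. A real font file would need an on-disk index (a sorted table or
a trie, say) in place of the Python dictionaries.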

How would we edit something like that? Can we still use Xmbdfed as part
of the tool chain to maintain such fonts?

Markus

-- 
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org,  WWW: <http://www.cl.cam.ac.uk/~mgk25/>

-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/lists/