Re: Forcing vim 6.0 to stay in UTF-8 mode in a UTF-8 locale
Markus Kuhn wrote:
> I just noticed that when I work in a UTF-8 locale (LC_CTYPE=en_GB.UTF-8),
> that vim 6.0 normally opens a UTF-8 file such as
Please use Vim 6.1 for this kind of testing, with the released patches
if possible (using CVS is easiest). Vim 6.0 is quite old now.
> http://www.cl.cam.ac.uk/~mgk25/ucs/examples/lyrics-ipa.txt
>
> properly in UTF-8 mode, but it deactivates UTF-8 mode when you load
> instead a file that contains malformed sequences, such as
>
> http://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-test.txt
Since this file contains byte sequences that are illegal in UTF-8, it is
converted to UTF-8 as if it were a latin1 file. The converted text can
be edited normally. When writing the file the conversion is done in
reverse, thus a read command followed by a write command produces an
identical file.
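The round trip described above works because latin1 assigns a character
to every one of the 256 byte values, so no byte can fail to convert. A
minimal sketch of the idea in Python (not Vim's actual code; the function
names are made up for illustration):

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw decodes as strict UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def roundtrip_latin1(raw: bytes) -> bytes:
    """Read bytes as latin1 text, then write the text back as latin1."""
    text = raw.decode("latin-1")   # never fails: each byte maps to one character
    return text.encode("latin-1")  # exact inverse of the step above


# A byte string that is not valid UTF-8 still survives the latin1 round trip.
sample = b"caf\xe9 \xff\xfe"
print(is_valid_utf8(sample))               # False
print(roundtrip_latin1(sample) == sample)  # True
```

The non-ASCII characters shown in the editor are then the latin1
interpretation of the bytes, not what a UTF-8 reader would expect, but
nothing is lost when the file is written back.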
If you want to edit the file as if it were UTF-8 you should first filter
out the illegal byte sequences. To manually override the detection of
the encoding, use this command:
:edit ++enc=utf-8 UTF-8-test.txt
This is unsafe though, because you edit the file with the illegal byte
sequences.
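One possible way to do the filtering outside the editor is sketched
below in Python. This is only an illustration under the assumption that
Python's strict UTF-8 decoder is an acceptable definition of "illegal";
the helper names are hypothetical. The first function reports where the
bad bytes are, the second drops them.

```python
from typing import List


def find_invalid_utf8(raw: bytes) -> List[int]:
    """Return the byte offsets where strict UTF-8 decoding fails."""
    offsets = []
    pos = 0
    while pos < len(raw):
        try:
            raw[pos:].decode("utf-8")
            break  # the rest of the data is valid
        except UnicodeDecodeError as err:
            # err.start and err.end are relative to the slice raw[pos:]
            offsets.append(pos + err.start)
            pos += err.end  # skip the offending bytes and continue
    return offsets


def strip_invalid_utf8(raw: bytes) -> bytes:
    """Drop every byte that is not part of a valid UTF-8 sequence."""
    return raw.decode("utf-8", errors="ignore").encode("utf-8")


data = b"ok\xffok"
print(find_invalid_utf8(data))   # [2]
print(strip_invalid_utf8(data))  # b'okok'
```

Dropping bytes changes the file, so it is worth looking at the reported
offsets before writing the filtered result anywhere.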
> Even worse, it also deactivates UTF-8 mode when you load a file that
> contains new Unicode 3.2 characters, such as
>
> http://www.cl.cam.ac.uk/~mgk25/UTF-8-demo.txt
That should be:
http://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-demo.txt
I can load this file without trouble with Vim 6.1.
> I live now on a planet where any other encoding than UTF-8 does not exist
> when I am in LC_CTYPE=en_GB.UTF-8. How do I tell vim 6.0 (and also
> emacs) to pick the encoding *strictly* based on the locale and look at
> absolutely nothing else? Falling back to ISO 8859-1 is not an option,
> because ISO 8859-1 is completely unknown on my planet.
If you only have UTF-8 files you don't need to do anything. If you
communicate with other planets (and this message indicates you do :-)
you will have to be able to edit ISO-8859-1 files as well.
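For reference, picking an encoding strictly from the locale, as the
question asks, amounts to reading the codeset part of the locale name.
A rough sketch in Python, assuming the usual POSIX precedence of LC_ALL
over LC_CTYPE over LANG and the language[_territory][.codeset][@modifier]
name format; codeset_from_env is a made-up helper, not something Vim or
Emacs provides:

```python
from typing import Mapping, Optional


def codeset_from_env(env: Mapping[str, str]) -> Optional[str]:
    """Return the codeset named by the locale variables, or None."""
    # The first variable that is set and non-empty wins.
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:
            break
    else:
        return None  # no locale variable is set

    value = value.split("@", 1)[0]  # drop a modifier such as "@euro"
    if "." not in value:
        return None                 # e.g. "C" or "en_GB" names no codeset
    return value.split(".", 1)[1]


print(codeset_from_env({"LC_CTYPE": "en_GB.UTF-8"}))  # UTF-8
```

Such a lookup says nothing about what a given file actually contains,
which is why a file from elsewhere can still need another encoding.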
> Trying to escape the horrors and pain of automatic encoding detection in
> a pure UTF-8 environment ...
I haven't seen this planet yet. And as soon as I see it, I'll send a
Latin1 file to it :-). Conclusion: this UTF-8 only planet does not exist.
--
hundred-and-one symptoms of being an internet addict:
244. You use more than 20 passwords.
/// Bram Moolenaar -- Bram@xxxxxxxxxxxxx -- http://www.moolenaar.net \\\
/// Creator of Vim -- http://vim.sf.net -- ftp://ftp.vim.org/pub/vim \\\
\\\ Project leader for A-A-P -- http://www.a-a-p.org ///
\\\ Lord Of The Rings helps Uganda - http://iccf-holland.org/lotr.html ///
--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/