NetBSD Problem Report #51470
From www@NetBSD.org Sun Sep 11 17:47:15 2016
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (verified OK))
by mollari.NetBSD.org (Postfix) with ESMTPS id 14DA47A289
for <gnats-bugs@gnats.NetBSD.org>; Sun, 11 Sep 2016 17:47:15 +0000 (UTC)
Message-Id: <20160911174713.A4DAC7A2AE@mollari.NetBSD.org>
Date: Sun, 11 Sep 2016 17:47:13 +0000 (UTC)
From: saab99@gmx.com
Reply-To: saab99@gmx.com
To: gnats-bugs@NetBSD.org
Subject: UTF-8 not support Russian
X-Send-Pr-Version: www-1.0
>Number: 51470
>Category: misc
>Synopsis: UTF-8 not support Russian
>Confidential: no
>Severity: serious
>Priority: high
>Responsible: misc-bug-people
>State: open
>Class: support
>Submitter-Id: net
>Arrival-Date: Sun Sep 11 17:50:00 +0000 2016
>Last-Modified: Fri Nov 11 19:30:01 +0000 2016
>Originator: Michael
>Release: NetBSD 7.0
>Organization:
Home
>Environment:
NetBSD 7.0.1 NetBSD 7.0.1 (GENERIC.201605221355Z) amd64
>Description:
My servers hold many files with Cyrillic names. With a large user base
it is hard to keep files in legacy encodings while the outside world
already lives in UTF-8. Storing file names in anything other than UTF-8
causes some problems with Samba and fatal problems with Linux and NFS,
which do no conversion at all.
So I feel it is time to move to UTF-8 on NetBSD too. NetBSD 6 seems to
have ru_RU.UTF-8 support, but it is still incomplete.
A freshly installed 6.1.4 can store files with UTF-8 names and can share
them via SMB or NFS, but I can't make it work in the shell.
As far as I can see, only LC_CTYPE and LC_MESSAGES are supported, via
locale.alias; there is no native ru_RU.UTF-8 locale.
My Linux rxvt-unicode terminal (which works as expected locally),
connected by ssh to the NetBSD box, shows:
[***@gloria ~]$ locale
LANG="ru_RU.UTF-8"
LC_CTYPE="ru_RU.UTF-8"
LC_COLLATE="C"
LC_TIME="C"
LC_NUMERIC="C"
LC_MONETARY="C"
LC_MESSAGES="ru_RU.UTF-8"
With this, Cyrillic filenames are displayed correctly, but I cannot access
them, because the shell prints escape codes (e.g. \:\262\321\320) instead
of letters. Bash is 4.3.0(1) out of the box. (By the way,
https://wiki.netbsd.org/unicode/ says it will work out of the box.)
Two questions on that:
Am I right that aliasing ru_RU.UTF-8 to en_US.UTF-8 is what causes this?
If so, what should I do to complete the ru_RU.UTF-8 locale so that I can
type Cyrillic filenames without problems?
>How-To-Repeat:
>Fix:
>Audit-Trail:
From: Michael van Elst <mlelstv@serpens.de>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: misc/51470 UTF-8 not support Russian
Date: Fri, 11 Nov 2016 20:27:44 +0100
There are deficiencies in UTF-8 support, but Russian is no special
case.
The system itself is pretty agnostic and treats filenames just as
a byte sequence with special meaning to the values 47 (slash) and
zero. Interpreting that byte sequence as some codepage or as UTF-8
is a matter of convention.
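
To illustrate the point about filenames being plain byte sequences, here
is a minimal Python sketch. The filename and the helper names
(is_valid_component, decode_or_none) are made up for this example and do
not come from NetBSD. It encodes one name in UTF-8 and in KOI8-R, checks
that both avoid the two reserved byte values, and shows that only the
chosen convention decides how the bytes read back.

```python
# Hypothetical filename, used only for illustration.
name = "файл.txt"

utf8_bytes = name.encode("utf-8")
koi8_bytes = name.encode("koi8-r")


def is_valid_component(raw):
    """A path component may hold any bytes except 47 ('/') and 0 (NUL)."""
    return len(raw) > 0 and 47 not in raw and 0 not in raw


def decode_or_none(raw, encoding):
    """Return the decoded text, or None if the bytes do not fit the encoding."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


# Both byte sequences are acceptable as filenames.
print(is_valid_component(utf8_bytes), is_valid_component(koi8_bytes))

# Read back under the same convention, the name survives.
print(decode_or_none(utf8_bytes, "utf-8"))
print(decode_or_none(koi8_bytes, "koi8-r"))

# Read back under the wrong convention, the KOI8-R bytes are not valid UTF-8.
print(decode_or_none(koi8_bytes, "utf-8"))
```

The two encodings give different bytes on disk for the same visible name,
which is why client and server have to agree on one convention.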
The Bourne shell (/bin/sh) didn't handle input bytes with bit 7 set
because that bit was used internally by the parser. The C shell can
handle 8-bit filenames but didn't understand a UTF-8 environment
in NetBSD-6; NetBSD-7 is fine. Other shells, including the current
bash (4.3.0) from pkgsrc, don't have that problem, and the native
/bin/sh has been fixed in NetBSD/-current.
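
A small Python sketch of why bit 7 matters here: every byte of a
non-ASCII character in UTF-8 has bit 7 set, so a parser that reserves
that bit for itself cannot pass such names through. The filenames and the
helper high_bit_bytes are hypothetical; this only inspects bytes and does
not model the actual /bin/sh parser.

```python
def high_bit_bytes(raw):
    """Return the byte values in raw that have bit 7 (0x80) set."""
    return [b for b in raw if b & 0x80]


# Hypothetical filenames, used only for illustration.
ascii_name = "file.txt".encode("utf-8")
cyrillic_name = "файл.txt".encode("utf-8")

# Pure ASCII never sets bit 7.
print(len(high_bit_bytes(ascii_name)))

# Each Cyrillic letter becomes two bytes, both with bit 7 set.
print(len(high_bit_bytes(cyrillic_name)))
print([hex(b) for b in high_bit_bytes(cyrillic_name)])
```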
VFAT stores long filenames in 16-bit Unicode. NetBSD would ignore
that and use only the lower byte of each character. This allowed
arbitrary byte sequences in filenames but is incompatible with Windows.
NetBSD/-current can translate between the 16-bit Unicode data on
disk and UTF-8.
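
The difference between the two VFAT behaviours can be sketched in Python.
This is an illustration of the idea only, not the NetBSD msdosfs code;
the filename and variable names are made up for the example. It takes a
name as 16-bit little-endian code units, shows what is left when only the
lower byte of each unit is kept, and then shows a real translation to
UTF-8.

```python
# Hypothetical long filename, used only for illustration.
name = "файл"

# 16-bit code units as they would appear in little-endian byte order.
utf16_bytes = name.encode("utf-16-le")
units = [
    int.from_bytes(utf16_bytes[i:i + 2], "little")
    for i in range(0, len(utf16_bytes), 2)
]
print([hex(u) for u in units])

# Keeping only the lower byte of each unit drops the high byte (0x04 for
# these Cyrillic letters), so the result no longer identifies the letters.
low_bytes = bytes(u & 0xFF for u in units)
print(low_bytes)

# Translating the 16-bit data to UTF-8 keeps the full name.
utf8_bytes = utf16_bytes.decode("utf-16-le").encode("utf-8")
print(utf8_bytes.decode("utf-8"))
```

In this example the four Cyrillic letters collapse to the ASCII bytes
D09; when only the lower bytes are kept, while the UTF-8 translation
round-trips.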
The vi editor gained wide character support in NetBSD/-current
and you can now edit UTF-8 text files with it.
NetBSD locale support is limited, but LC_CTYPE shouldn't differ
between the various languages when encoding is UTF-8. Using ru_RU.UTF-8
is fine.
Greetings,
--
Michael van Elst
Internet: mlelstv@serpens.de
"A potential Snark may lurk in every tree."
Copyright © 1994-2014
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.